# EPA ECHO Facility Compliance Scraper (`automation-lab/epa-echo-facility-compliance-scraper`) Actor

Export EPA ECHO Clean Water Act facility compliance records by state, county, city, or ZIP using the official public EPA API.

- **URL**: https://apify.com/automation-lab/epa-echo-facility-compliance-scraper.md
- **Developed by:** [Stas Persiianenko](https://apify.com/automation-lab) (community)
- **Categories:** Business
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor uses pay-per-event pricing. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

When integrating Actors into a project, adapt to the project's stack and aim for integrations that are safe, well-documented, and production-ready.
The recommended integration paths are as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## EPA ECHO Facility Compliance Scraper

Export EPA ECHO Clean Water Act facility compliance records from the official public EPA API.

Use this Apify Actor to collect facility names, permit/source IDs, addresses, EPA program details, coordinates, demographic indicators, design-flow values, effective dates, and query provenance by state, county, city, ZIP code, and active-status filter.

### What does EPA ECHO Facility Compliance Scraper do?

EPA ECHO Facility Compliance Scraper queries EPA's Enforcement and Compliance History Online (ECHO) Clean Water Act REST service and saves normalized facility rows to an Apify dataset.

It is built for repeatable compliance, ESG, due-diligence, market research, insurance, and industrial site-screening workflows where analysts need structured facility lists instead of manual downloads.

### Who is it for?

- 🏭 Environmental compliance teams screening regulated facilities.
- 🌎 ESG analysts monitoring Clean Water Act exposure.
- 🏢 Industrial real-estate and site-selection researchers.
- 🧾 Due-diligence consultants collecting facility evidence for reports.
- 🛡️ Insurers and risk teams reviewing regulated-site geography.
- 📊 Data teams that need scheduled EPA ECHO exports in a warehouse.

### Why use this Actor?

- Uses official EPA ECHO public endpoints; no browser or login is required.
- Produces clean JSON rows that are easier to join, filter, and export than raw CSV.
- Captures EPA query metadata so every record can be traced back to a query.
- Runs on Apify, so you can schedule, export, call by API, or connect it to automations.

### Data source

The Actor uses EPA ECHO CWA REST services:

- `cwa_rest_services.get_facilities` for query metadata.
- `cwa_rest_services.get_download` for CSV rows.

Version 1 is deliberately CWA-first for reliability. Air, RCRA, SDW, and deeper violation detail expansion can be added later after the stable CWA path passes QA.

### How much does it cost to scrape EPA ECHO facility records?

This Actor uses pay-per-event pricing:

- Start event: a small one-time run fee.
- Facility saved: a tiered per-record fee for every dataset item produced.

Exact prices are shown on the Apify Store pricing panel. Set `maxItems` to a small number for test runs and increase it for production exports.
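
The two events above add up in a simple way, so a rough budget check can be scripted before a large export. The sketch below is only illustrative: `estimate_run_cost` is a hypothetical helper, and the fee values are placeholders that you would replace with the figures for your plan from the Apify Store pricing panel.

```python
def estimate_run_cost(records, start_fee_usd, per_record_fee_usd):
    """Estimate the cost of one run: one start event plus one event per saved record."""
    if records < 0:
        raise ValueError("records must be non-negative")
    return start_fee_usd + records * per_record_fee_usd


# Placeholder fees for illustration only; they are NOT this Actor's real prices.
sample = estimate_run_cost(records=100, start_fee_usd=0.01, per_record_fee_usd=0.002)
print(f"Estimated cost: ${sample:.2f}")
```

Because the per-record fee is tiered, pass the fee that applies to your own subscription plan.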

### Input options

| Field | Type | Description |
| --- | --- | --- |
| `state` | string | Two-letter US state code, e.g. `CA`, `TX`, `NY`. |
| `county` | string | Optional county filter. |
| `city` | string | Optional city filter. |
| `zip` | string | Optional ZIP code filter. |
| `activeOnly` | boolean | Request active facilities only. |
| `maxItems` | integer | Maximum facility records to save. |
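
If you build inputs in code, it may help to validate them before starting a paid run. The following is a minimal sketch, assuming the limits published in the OpenAPI specification at the end of this document (a two-character `state` and `maxItems` between 1 and 10000). The `build_input` helper is hypothetical, and uppercasing the state and restricting it to letters are assumptions that go slightly beyond the schema.

```python
import re


def build_input(state, county=None, city=None, zip_code=None,
                active_only=True, max_items=20):
    """Build an Actor input dict, enforcing the limits from the input schema."""
    state = state.strip().upper()
    # The schema requires exactly two characters; letters-only is an assumption.
    if not re.fullmatch(r"[A-Z]{2}", state):
        raise ValueError(f"state must be a two-letter code, got {state!r}")
    # The schema bounds maxItems to the range 1..10000.
    if not 1 <= max_items <= 10000:
        raise ValueError("maxItems must be between 1 and 10000")
    run_input = {"state": state, "activeOnly": active_only, "maxItems": max_items}
    # Optional filters are included only when set, so blank values are never sent.
    for key, value in (("county", county), ("city", city), ("zip", zip_code)):
        if value:
            run_input[key] = value.strip()
    return run_input
```

The returned dict can be passed as the run input to any of the API clients shown later in this README.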

### Example input

```json
{
  "state": "CA",
  "activeOnly": true,
  "maxItems": 100
}
````

### Example city workflow

```json
{
  "state": "TX",
  "city": "AUSTIN",
  "activeOnly": true,
  "maxItems": 250
}
```

### Output fields

| Field | Description |
| --- | --- |
| `facilityName` | Facility name from EPA ECHO. |
| `sourceId` | EPA/source system identifier. |
| `program` | Program emitted by this Actor; currently `CWA`. |
| `statute` | Statute value from ECHO, usually `CWA`. |
| `street`, `city`, `state`, `county` | Facility location fields. |
| `stateDistrict` | State district when provided. |
| `federalAgencyName` | Federal agency name when provided. |
| `longitude` | Facility longitude from ECHO. |
| `totalDesignFlow` | CWA design-flow number where available. |
| `percentPeopleOfColor` | ACS/EJ demographic indicator. |
| `acsPopulationDensity` | ACS population density. |
| `indianCountryFlag` | EPA Indian Country flag. |
| `indianSpatialFlag` | EPA spatial flag. |
| `effectiveDate` | Permit/effective date from the export. |
| `queryId`, `queryRows` | EPA query metadata. |
| `pageNumber` | Download page used by the Actor. |
| `scrapedAt` | ISO timestamp for the Actor run. |

### Example output item

```json
{
  "facilityName": "150 EL CAMINO DRIVE OFFICE BUILDING",
  "sourceId": "CAC320379",
  "program": "CWA",
  "statute": "CWA",
  "street": "150 EL CAMINO",
  "city": "BEVERLY HILLS",
  "state": "CA",
  "county": "LOS ANGELES COUNTY",
  "longitude": -118.39986,
  "percentPeopleOfColor": 42.08,
  "effectiveDate": "11/05/2024"
}
```
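
The `effectiveDate` value in the sample is a slash-separated string rather than an ISO date, so warehouse loads may want to normalize it. This is a minimal sketch, assuming the export uses US-style month/day/year dates; that format is an assumption and worth checking against your own data. `effective_date_to_iso` is a hypothetical helper name.

```python
from datetime import datetime


def effective_date_to_iso(value):
    """Convert a date string such as "11/05/2024" to ISO 8601, or return None."""
    if not value:
        return None
    try:
        # Assumption: month/day/year ordering, as is common in US agency exports.
        return datetime.strptime(value.strip(), "%m/%d/%Y").date().isoformat()
    except ValueError:
        # Leave unparseable values as None rather than guessing.
        return None


item = {"sourceId": "CAC320379", "effectiveDate": "11/05/2024"}
print(effective_date_to_iso(item["effectiveDate"]))
```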

### How to run

1. Open the Actor on Apify.
2. Enter a state code and optional geography filters.
3. Choose `maxItems` based on your budget and export needs.
4. Start the run.
5. Download the dataset as JSON, CSV, Excel, XML, or RSS.

### Tips for best results

- Start with a state-only run to verify the volume returned by EPA ECHO.
- Use `maxItems` for quick samples before scheduling large exports.
- Keep `activeOnly` enabled for current facility monitoring.
- Save `queryId` with your downstream data for reproducibility.

### Integrations

- 📥 Send scheduled results to Google Sheets or Airtable.
- 🏗️ Load CSV/JSON exports into a data warehouse.
- 🔔 Trigger alerts when a state or region export changes.
- 🧩 Join `sourceId` with internal facility or permit systems.
- 📝 Feed rows into due-diligence report generation workflows.
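
As an illustration of the `sourceId` join mentioned in the list above, the sketch below attaches a matching internal record to each dataset item. The internal record layout and the `permit_id` key are hypothetical; substitute whatever identifier your own system stores.

```python
def join_on_source_id(echo_items, internal_records, internal_key="permit_id"):
    """Attach the matching internal record (or None) to each ECHO dataset item."""
    # Index internal records by their identifier for constant-time lookups.
    index = {rec[internal_key]: rec for rec in internal_records if rec.get(internal_key)}
    joined = []
    for item in echo_items:
        match = index.get(item.get("sourceId"))
        # Keep every ECHO row; unmatched rows get "internal": None (a left join).
        joined.append({**item, "internal": match})
    return joined
```

Keeping unmatched rows makes it easy to spot facilities that appear in ECHO but not in your internal system.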

### API usage: Node.js

```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('automation-lab/epa-echo-facility-compliance-scraper').call({
  state: 'CA',
  activeOnly: true,
  maxItems: 100,
});
console.log(run.defaultDatasetId);
```

### API usage: Python

```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')
run = client.actor('automation-lab/epa-echo-facility-compliance-scraper').call(run_input={
    'state': 'CA',
    'activeOnly': True,
    'maxItems': 100,
})
print(run['defaultDatasetId'])
```

### API usage: cURL

```bash
curl -X POST 'https://api.apify.com/v2/acts/automation-lab~epa-echo-facility-compliance-scraper/runs?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"state":"CA","activeOnly":true,"maxItems":100}'
```
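
The cURL call above starts a run and returns run metadata. If you would rather receive the dataset items in a single response, the OpenAPI specification later in this document also lists a `run-sync-get-dataset-items` endpoint. The following standard-library sketch builds that request; the helper names and the client-side timeout are assumptions, and it presumes the endpoint returns the items as a JSON array.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "automation-lab~epa-echo-facility-compliance-scraper"


def build_sync_request(token, run_input):
    """Build a POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


def fetch_items(token, run_input, timeout=300):
    """Send the request and decode the response body as JSON."""
    request = build_sync_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_items(token, {"state": "CA", "activeOnly": True, "maxItems": 100})` performs a billable run, so try it with a small `maxItems` first.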

### MCP usage

Use Apify MCP to call this Actor from Claude Desktop or Claude Code.

MCP URL:

```text
https://mcp.apify.com/?tools=automation-lab/epa-echo-facility-compliance-scraper
```

Claude Code CLI setup:

```bash
claude mcp add apify-epa-echo "https://mcp.apify.com/?tools=automation-lab/epa-echo-facility-compliance-scraper"
```

Claude Desktop JSON config:

```json
{
  "mcpServers": {
    "apify-epa-echo": {
      "url": "https://mcp.apify.com/?tools=automation-lab/epa-echo-facility-compliance-scraper"
    }
  }
}
```

Example prompts:

- "Run the EPA ECHO Facility Compliance Scraper for active CWA facilities in California and summarize the counties represented."
- "Export 100 EPA ECHO CWA facilities for Texas and identify facilities with missing county values."
- "Create a due-diligence checklist from these EPA ECHO facility records."

### Scheduling

Schedule the Actor weekly or monthly to maintain repeatable facility exports for a state or region. Store run IDs and query IDs so your compliance team can compare historical snapshots.
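
One way to compare two snapshots is to key each run's items on `sourceId` and report what was added, removed, or changed. This is a sketch under the assumption that `sourceId` is unique within a snapshot; `diff_snapshots` is a hypothetical helper, not part of the Actor.

```python
# Fields that describe the run rather than the facility, so they differ on every run.
RUN_METADATA = {"queryId", "queryRows", "pageNumber", "scrapedAt"}


def _facility_fields(item):
    """Return the item without run-specific metadata."""
    return {k: v for k, v in item.items() if k not in RUN_METADATA}


def diff_snapshots(previous, current, key="sourceId"):
    """Compare two lists of dataset items and return added/removed/changed IDs."""
    prev = {item[key]: item for item in previous if item.get(key)}
    curr = {item[key]: item for item in current if item.get(key)}
    shared = prev.keys() & curr.keys()
    return {
        "added": sorted(curr.keys() - prev.keys()),
        "removed": sorted(prev.keys() - curr.keys()),
        "changed": sorted(
            k for k in shared if _facility_fields(prev[k]) != _facility_fields(curr[k])
        ),
    }
```

Ignoring the run metadata keeps a facility from being flagged as changed merely because it was scraped at a different time.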

### Data quality notes

EPA ECHO fields can be blank for some facilities. Null values in the dataset usually mean the official export did not provide that field for that record.
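
To see how sparse a given export is, you can count missing values per field before relying on a column. A minimal sketch follows; treating empty strings the same as nulls is an assumption, and `null_counts` is a hypothetical helper.

```python
from collections import Counter


def null_counts(items):
    """Count, per field, how many dataset items have a null or empty value."""
    counts = Counter()
    for item in items:
        for field, value in item.items():
            # Assumption: an empty string is treated as missing, like None.
            if value is None or value == "":
                counts[field] += 1
    return dict(counts)
```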

### Limitations

- Version 1 focuses on CWA facility rows.
- Coordinates are currently limited to `longitude`, as exposed by the CWA CSV export.
- Some optional filters depend on EPA ECHO parameter support and may return fewer records than broad state runs.

### Legality

EPA ECHO is a public government data source. You are responsible for using exported data in line with applicable laws, regulations, and your organization's compliance policies.

### FAQ

#### Can I scrape all EPA ECHO programs?

Version 1 covers Clean Water Act facility exports only. Use it when CWA facility coverage is the priority; if your workflow needs Air, RCRA, SDW, or detailed penalty data, request that coverage from the developer.

#### Is this official EPA data?

Yes. The Actor reads from public EPA ECHO REST and CSV endpoints and normalizes the returned rows into Apify dataset records.

### Troubleshooting

#### Why did my run return zero records?

Try a broader state-only query first. Then add city, county, or ZIP filters one at a time. EPA ECHO's supported filter vocabulary may differ from common display names.
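
The broad-to-narrow approach can be scripted by generating one input per step and running them in order. In the sketch below, each step adds one more filter to the previous one, so the first step that returns zero records points at the filter value to double-check. The `filter_ladder` helper and its county, city, ZIP ordering are assumptions.

```python
def filter_ladder(state, county=None, city=None, zip_code=None, max_items=20):
    """Return a list of Actor inputs, from state-only to fully filtered."""
    current = {"state": state, "activeOnly": True, "maxItems": max_items}
    steps = [dict(current)]
    for key, value in (("county", county), ("city", city), ("zip", zip_code)):
        if value:
            # Each step keeps the earlier filters and adds exactly one more.
            current = {**current, key: value}
            steps.append(current)
    return steps
```

Keeping `max_items` small while stepping through the ladder limits the per-record charges for these diagnostic runs.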

#### Why are some output fields null?

The official EPA export does not populate every field for every facility. Null values preserve that distinction instead of inventing data.

### Related scrapers

Explore related automation-lab Actors for compliance, public records, business enrichment, and government data workflows on Apify.

### Changelog

- v0.1: Initial CWA facility export using EPA ECHO public API.

# Actor input Schema

## `state` (type: `string`):

Two-letter US state code, for example CA, TX, NY, or FL.

## `county` (type: `string`):

Optional county name as used by ECHO, for example LOS ANGELES. Leave blank to search the whole state.

## `city` (type: `string`):

Optional facility city filter.

## `zip` (type: `string`):

Optional ZIP code filter.

## `activeOnly` (type: `boolean`):

When enabled, requests only active Clean Water Act facilities from EPA ECHO.

## `maxItems` (type: `integer`):

Maximum number of facility rows to save to the dataset.

## Actor input object example

```json
{
  "state": "CA",
  "activeOnly": true,
  "maxItems": 20
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "state": "CA",
    "activeOnly": true,
    "maxItems": 20
};

// Run the Actor and wait for it to finish
const run = await client.actor("automation-lab/epa-echo-facility-compliance-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "state": "CA",
    "activeOnly": True,
    "maxItems": 20,
}

# Run the Actor and wait for it to finish
run = client.actor("automation-lab/epa-echo-facility-compliance-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "state": "CA",
  "activeOnly": true,
  "maxItems": 20
}' |
apify call automation-lab/epa-echo-facility-compliance-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=automation-lab/epa-echo-facility-compliance-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "EPA ECHO Facility Compliance Scraper",
        "description": "Export EPA ECHO Clean Water Act facility compliance records by state, county, city, or ZIP using the official public EPA API.",
        "version": "0.1",
        "x-build-id": "g8GGaA75Lc5xFhKc4"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/automation-lab~epa-echo-facility-compliance-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-automation-lab-epa-echo-facility-compliance-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/automation-lab~epa-echo-facility-compliance-scraper/runs": {
            "post": {
                "operationId": "runs-sync-automation-lab-epa-echo-facility-compliance-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/automation-lab~epa-echo-facility-compliance-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-automation-lab-epa-echo-facility-compliance-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "state": {
                        "title": "State",
                        "minLength": 2,
                        "maxLength": 2,
                        "type": "string",
                        "description": "Two-letter US state code, for example CA, TX, NY, or FL.",
                        "default": "CA"
                    },
                    "county": {
                        "title": "County",
                        "type": "string",
                        "description": "Optional county name as used by ECHO, for example LOS ANGELES. Leave blank to search the whole state."
                    },
                    "city": {
                        "title": "City",
                        "type": "string",
                        "description": "Optional facility city filter."
                    },
                    "zip": {
                        "title": "ZIP code",
                        "type": "string",
                        "description": "Optional ZIP code filter."
                    },
                    "activeOnly": {
                        "title": "Active facilities only",
                        "type": "boolean",
                        "description": "When enabled, requests only active Clean Water Act facilities from EPA ECHO.",
                        "default": true
                    },
                    "maxItems": {
                        "title": "Maximum facilities",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum number of facility rows to save to the dataset.",
                        "default": 20
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
