EPA ECHO Facility Compliance Scraper

Pricing

Pay per event

Export EPA ECHO Clean Water Act facility compliance records by state, county, city, or ZIP using the official public EPA API.

Developer

Stas Persiianenko

Maintained by Community

Export EPA ECHO Clean Water Act facility compliance records from the official public EPA API.

Use this Apify Actor to collect facility names, permit/source IDs, addresses, EPA program details, coordinates, demographic indicators, design-flow values, effective dates, and query provenance by state, county, city, ZIP code, and active-status filter.

What does EPA ECHO Facility Compliance Scraper do?

EPA ECHO Facility Compliance Scraper queries EPA's Enforcement and Compliance History Online (ECHO) Clean Water Act REST service and saves normalized facility rows to an Apify dataset.

It is built for repeatable compliance, ESG, due-diligence, market research, insurance, and industrial site-screening workflows where analysts need structured facility lists instead of manual downloads.

Who is it for?

  • 🏭 Environmental compliance teams screening regulated facilities.
  • 🌎 ESG analysts monitoring Clean Water Act exposure.
  • 🏢 Industrial real-estate and site-selection researchers.
  • 🧾 Due-diligence consultants collecting facility evidence for reports.
  • 🛡️ Insurers and risk teams reviewing regulated-site geography.
  • 📊 Data teams that need scheduled EPA ECHO exports in a warehouse.

Why use this actor?

  • Uses official EPA ECHO public endpoints; no browser or login is required.
  • Produces clean JSON rows that are easier to join, filter, and export than raw CSV.
  • Captures EPA query metadata so every record can be traced back to a query.
  • Runs on Apify, so you can schedule, export, call by API, or connect it to automations.

Data source

The actor uses EPA ECHO CWA REST services:

  • cwa_rest_services.get_facilities for query metadata.
  • cwa_rest_services.get_download for CSV rows.
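
The two-step flow above (query metadata first, then the CSV download) can be sketched as URL construction only, with no network call. The base URL and the parameter names `p_st`, `p_co`, `p_ct`, `p_zip`, `p_act`, `qid`, and `output` are assumptions based on EPA's public ECHO web-service conventions, not on this actor's source code, so check them against the EPA documentation before relying on them. The helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL for EPA ECHO REST services; confirm against EPA's docs.
ECHO_BASE = "https://echodata.epa.gov/echo"

# Hypothetical mapping from this actor's input fields to ECHO query parameters.
PARAM_MAP = {"state": "p_st", "county": "p_co", "city": "p_ct", "zip": "p_zip"}


def facilities_url(actor_input: dict) -> str:
    """Build the get_facilities URL that would return query metadata."""
    params = {"output": "JSON"}
    for field, echo_param in PARAM_MAP.items():
        value = actor_input.get(field)
        if value:
            params[echo_param] = value
    if actor_input.get("activeOnly"):
        params["p_act"] = "Y"  # assumed flag value for "active facilities only"
    return f"{ECHO_BASE}/cwa_rest_services.get_facilities?{urlencode(params)}"


def download_url(query_id: str) -> str:
    """Build the get_download URL for the CSV rows of an existing query."""
    params = {"qid": query_id, "output": "CSV"}
    return f"{ECHO_BASE}/cwa_rest_services.get_download?{urlencode(params)}"


print(facilities_url({"state": "CA", "activeOnly": True}))
print(download_url("123"))
```

Keeping the query ID from the first call is what lets each saved row be traced back to the query that produced it.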

Version 1 is deliberately CWA-first for reliability. Air, RCRA, SDW, and deeper violation detail expansion can be added later after the stable CWA path passes QA.

How much does it cost to scrape EPA ECHO facility records?

This actor uses pay-per-event pricing:

  • Start event: a small one-time run fee.
  • Facility saved: a tiered per-record fee for every dataset item produced.

Exact prices are shown on the Apify Store pricing panel. Set maxItems to a small number for test runs and increase it for production exports.
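
For budgeting, a back-of-the-envelope estimate might look like the sketch below. It assumes a single flat per-record rate, which simplifies the tiered pricing described above, and the rates passed in are placeholders, not the actor's real prices. The function name is hypothetical.

```python
def estimate_run_cost(items: int, start_fee: float, per_item_fee: float) -> float:
    """Estimate one run's cost: start event plus a flat fee per saved facility."""
    if items < 0:
        raise ValueError("items must be non-negative")
    return round(start_fee + items * per_item_fee, 6)


# Placeholder rates only; read the real ones from the Apify Store pricing panel.
print(estimate_run_cost(items=100, start_fee=0.01, per_item_fee=0.002))
```

Since `maxItems` caps the number of saved records, it also caps the per-record part of this estimate.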

Input options

| Field | Type | Description |
| --- | --- | --- |
| state | string | Two-letter US state code, e.g. CA, TX, NY. |
| county | string | Optional county filter. |
| city | string | Optional city filter. |
| zip | string | Optional ZIP code filter. |
| activeOnly | boolean | Request active facilities only. |
| maxItems | integer | Maximum facility records to save. |
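
If you build inputs programmatically, a small pre-flight check can catch malformed values before a run is started. The sketch below is an assumption-laden illustration: the actor's own validation rules are not documented here, the five-digit ZIP rule and the upper-casing of city and county names (mirroring the `AUSTIN` example) are guesses, and `normalize_input` is a hypothetical helper.

```python
import re


def normalize_input(raw: dict) -> dict:
    """Return a cleaned copy of the actor input, or raise ValueError."""
    state = str(raw.get("state") or "").strip().upper()
    if not re.fullmatch(r"[A-Z]{2}", state):
        raise ValueError(f"state must be a two-letter code, got {raw.get('state')!r}")
    cleaned = {"state": state}

    # Optional text filters; upper-casing is an assumption, not a documented rule.
    for field in ("county", "city"):
        value = str(raw.get(field) or "").strip()
        if value:
            cleaned[field] = value.upper()

    zip_code = str(raw.get("zip") or "").strip()
    if zip_code:
        if not re.fullmatch(r"\d{5}", zip_code):
            raise ValueError(f"zip must be five digits, got {zip_code!r}")
        cleaned["zip"] = zip_code

    if "activeOnly" in raw:
        cleaned["activeOnly"] = bool(raw["activeOnly"])

    max_items = raw.get("maxItems")
    if max_items is not None:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 1:
            raise ValueError("maxItems must be a positive integer")
        cleaned["maxItems"] = max_items
    return cleaned


print(normalize_input({"state": " ca ", "city": "Austin", "maxItems": 100}))
```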

Example input

{
  "state": "CA",
  "activeOnly": true,
  "maxItems": 100
}

Example city workflow

{
  "state": "TX",
  "city": "AUSTIN",
  "activeOnly": true,
  "maxItems": 250
}

Output fields

| Field | Description |
| --- | --- |
| facilityName | Facility name from EPA ECHO. |
| sourceId | EPA/source system identifier. |
| program | Program emitted by this actor; currently CWA. |
| statute | Statute value from ECHO, usually CWA. |
| street, city, state, county | Facility location fields. |
| stateDistrict | State district when provided. |
| federalAgencyName | Federal agency name when provided. |
| longitude | Facility longitude from ECHO. |
| totalDesignFlow | CWA design-flow number where available. |
| percentPeopleOfColor | ACS/EJ demographic indicator. |
| acsPopulationDensity | ACS population density. |
| indianCountryFlag | EPA Indian Country flag. |
| indianSpatialFlag | EPA spatial flag. |
| effectiveDate | Permit/effective date from the export. |
| queryId, queryRows | EPA query metadata. |
| pageNumber | Download page used by the actor. |
| scrapedAt | ISO timestamp for the actor run. |

Example output item

{
  "facilityName": "150 EL CAMINO DRIVE OFFICE BUILDING",
  "sourceId": "CAC320379",
  "program": "CWA",
  "statute": "CWA",
  "street": "150 EL CAMINO",
  "city": "BEVERLY HILLS",
  "state": "CA",
  "county": "LOS ANGELES COUNTY",
  "longitude": -118.39986,
  "percentPeopleOfColor": 42.08,
  "effectiveDate": "11/05/2024"
}
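
Downstream code often wants typed values rather than raw strings. The sketch below turns one dataset item into a small dataclass. It assumes `effectiveDate` uses the US-style `MM/DD/YYYY` format, which the sample value `11/05/2024` suggests but does not prove. The `Facility` class and `parse_facility` helper are hypothetical names, and only a subset of the output fields is modeled.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass(frozen=True)
class Facility:
    source_id: str
    name: str
    state: Optional[str]
    county: Optional[str]
    longitude: Optional[float]
    effective_date: Optional[date]


def parse_facility(item: dict) -> Facility:
    """Convert one dataset item into a typed Facility; nulls stay None."""
    raw_date = item.get("effectiveDate")
    # Assumption: MM/DD/YYYY, as in the sample output item.
    effective = datetime.strptime(raw_date, "%m/%d/%Y").date() if raw_date else None
    lon = item.get("longitude")
    return Facility(
        source_id=item["sourceId"],
        name=item["facilityName"],
        state=item.get("state"),
        county=item.get("county"),
        longitude=float(lon) if lon is not None else None,
        effective_date=effective,
    )


sample = {
    "facilityName": "150 EL CAMINO DRIVE OFFICE BUILDING",
    "sourceId": "CAC320379",
    "state": "CA",
    "county": "LOS ANGELES COUNTY",
    "longitude": -118.39986,
    "effectiveDate": "11/05/2024",
}
print(parse_facility(sample))
```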

How to run

  1. Open the actor on Apify.
  2. Enter a state code and optional geography filters.
  3. Choose maxItems based on your budget and export needs.
  4. Start the run.
  5. Download the dataset as JSON, CSV, Excel, XML, or RSS.

Tips for best results

  • Start with a state-only run to verify the volume returned by EPA ECHO.
  • Use maxItems for quick samples before scheduling large exports.
  • Keep activeOnly enabled for current facility monitoring.
  • Save queryId with your downstream data for reproducibility.

Integrations

  • 📥 Send scheduled results to Google Sheets or Airtable.
  • 🏗️ Load CSV/JSON exports into a data warehouse.
  • 🔔 Trigger alerts when a state or region export changes.
  • 🧩 Join sourceId with internal facility or permit systems.
  • 📝 Feed rows into due-diligence report generation workflows.

API usage: Node.js

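import { ApifyClient } from 'apify-client';

// Read the API token from the environment instead of hard-coding it.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/epa-echo-facility-compliance-scraper').call({
    state: 'CA',
    activeOnly: true,
    maxItems: 100,
});

// The facility rows are stored in the run's default dataset.
console.log(run.defaultDatasetId);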

API usage: Python

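from apify_client import ApifyClient

# Replace the placeholder with your Apify API token.
client = ApifyClient('YOUR_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/epa-echo-facility-compliance-scraper').call(run_input={
    'state': 'CA',
    'activeOnly': True,
    'maxItems': 100,
})

# The facility rows are stored in the run's default dataset.
print(run['defaultDatasetId'])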
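
Once a run has finished, the rows can be read from its default dataset. The sketch below is one possible way to do that. It relies on `client.dataset(...).iterate_items()` from the `apify-client` Python package, while `load_facilities` itself is a hypothetical helper. The client is passed in as an argument so the function can be exercised with a stub and without a token.

```python
from itertools import islice
from typing import Any, Optional


def load_facilities(client: Any, dataset_id: str, limit: Optional[int] = None) -> list:
    """Collect facility rows from a dataset, optionally stopping after `limit` rows."""
    items = client.dataset(dataset_id).iterate_items()
    return list(islice(items, limit))


# Usage with a real client (needs `pip install apify-client` and a valid token):
#   from apify_client import ApifyClient
#   client = ApifyClient('YOUR_APIFY_TOKEN')
#   rows = load_facilities(client, run['defaultDatasetId'], limit=10)
#   for row in rows:
#       print(row['sourceId'], row['facilityName'])
```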

API usage: cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~epa-echo-facility-compliance-scraper/runs?token=YOUR_APIFY_TOKEN' \
-H 'Content-Type: application/json' \
-d '{"state":"CA","activeOnly":true,"maxItems":100}'

MCP usage

Use Apify MCP to call this actor from Claude Desktop or Claude Code.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/epa-echo-facility-compliance-scraper

Claude Code CLI setup:

claude mcp add apify-epa-echo "https://mcp.apify.com/?tools=automation-lab/epa-echo-facility-compliance-scraper"

Claude Desktop JSON config:

{
  "mcpServers": {
    "apify-epa-echo": {
      "url": "https://mcp.apify.com/?tools=automation-lab/epa-echo-facility-compliance-scraper"
    }
  }
}

Example prompts:

  • "Run the EPA ECHO Facility Compliance Scraper for active CWA facilities in California and summarize the counties represented."
  • "Export 100 EPA ECHO CWA facilities for Texas and identify facilities with missing county values."
  • "Create a due-diligence checklist from these EPA ECHO facility records."

Scheduling

Schedule the actor weekly or monthly to maintain repeatable facility exports for a state or region. Store run IDs and query IDs so your compliance team can compare historical snapshots.
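
Comparing two scheduled snapshots could be done along the lines of the sketch below. It assumes `sourceId` uniquely identifies a facility within a snapshot, and it ignores the run-specific fields (`scrapedAt`, `queryId`, `queryRows`, `pageNumber`) so that only real facility changes are reported. `diff_snapshots` is a hypothetical helper, not part of the actor.

```python
# Fields that differ between runs even when the facility itself is unchanged.
RUN_FIELDS = {"scrapedAt", "queryId", "queryRows", "pageNumber"}


def _stable(row: dict) -> dict:
    """Drop run-specific fields so rows from different runs are comparable."""
    return {k: v for k, v in row.items() if k not in RUN_FIELDS}


def diff_snapshots(previous: list, current: list, key: str = "sourceId") -> dict:
    """Report which facility IDs were added, removed, or changed between runs."""
    prev = {row[key]: row for row in previous}
    curr = {row[key]: row for row in current}
    return {
        "added": sorted(curr.keys() - prev.keys()),
        "removed": sorted(prev.keys() - curr.keys()),
        "changed": sorted(
            k for k in curr.keys() & prev.keys() if _stable(curr[k]) != _stable(prev[k])
        ),
    }


last_month = [
    {"sourceId": "A", "city": "X", "scrapedAt": "t1"},
    {"sourceId": "B", "city": "Y", "scrapedAt": "t1"},
]
this_month = [
    {"sourceId": "A", "city": "X", "scrapedAt": "t2"},
    {"sourceId": "B", "city": "Z", "scrapedAt": "t2"},
    {"sourceId": "C", "city": "W", "scrapedAt": "t2"},
]
print(diff_snapshots(last_month, this_month))
```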

Data quality notes

EPA ECHO fields can be blank for some facilities. Null values in the dataset usually mean the official export did not provide that field for that record.
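
To see how sparse a given export is, a per-field count of missing values can help. This is a minimal sketch: it treats `None`, absent keys, and whitespace-only strings as missing, which is an assumption about how blanks appear in the dataset, and `missing_field_counts` is a hypothetical helper.

```python
def missing_field_counts(items: list, fields: list) -> dict:
    """Count, per field, how many rows have no usable value."""
    counts = {field: 0 for field in fields}
    for item in items:
        for field in fields:
            value = item.get(field)
            if value is None or (isinstance(value, str) and not value.strip()):
                counts[field] += 1
    return counts


rows = [
    {"sourceId": "A", "county": "LOS ANGELES COUNTY", "totalDesignFlow": None},
    {"sourceId": "B", "county": "", "totalDesignFlow": 1.5},
    {"sourceId": "C", "totalDesignFlow": None},
]
print(missing_field_counts(rows, ["county", "totalDesignFlow"]))
```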

Limitations

  • Version 1 focuses on CWA facility rows.
  • Coordinates currently include longitude only, as exposed by the CWA CSV export.
  • Some optional filters depend on EPA ECHO parameter support and may return fewer records than broad state runs.

Legality

EPA ECHO is a public government data source. You are responsible for using exported data in line with applicable laws, regulations, and your organization's compliance policies.

FAQ

Can I scrape all EPA ECHO programs?

Version 1 covers Clean Water Act facility exports only. Use it when CWA facility coverage is the priority; Air, RCRA, SDW, and detailed penalty data are not included yet, so request them if your workflow requires those programs.

Is this official EPA data?

The data comes from EPA's public ECHO REST and CSV endpoints. The actor only normalizes the returned rows into Apify dataset records.

Troubleshooting

Why did my run return zero records?

Try a broader state-only query first. Then add city, county, or ZIP filters one at a time. EPA ECHO's supported filter vocabulary may differ from common display names.
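
The broaden-then-narrow approach above can be scripted by generating a series of trial inputs. In this sketch each optional filter is applied on its own on top of the state-only baseline, so the filter that empties the result is easy to spot. That ordering is one interpretation of "one at a time", and `widening_inputs` is a hypothetical helper.

```python
FILTERS = ("county", "city", "zip")


def widening_inputs(full_input: dict) -> list:
    """Return a state-only baseline followed by one trial input per optional filter."""
    base = {k: v for k, v in full_input.items() if k not in FILTERS}
    trials = [dict(base)]
    for name in FILTERS:
        if full_input.get(name):
            trials.append({**base, name: full_input[name]})
    return trials


failing = {"state": "TX", "county": "TRAVIS", "city": "AUSTIN", "maxItems": 25}
for trial in widening_inputs(failing):
    print(trial)
```

Running each trial with a small `maxItems` keeps the cost of this kind of diagnosis low.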

Why are some output fields null?

The official EPA export does not populate every field for every facility. Null values preserve that distinction instead of inventing data.

Explore related automation-lab actors for compliance, public records, business enrichment, and government data workflows on Apify.

Changelog

  • v0.1: Initial CWA facility export using EPA ECHO public API.