# EPA Facility Scraper — Violations & Penalties | $9.9/1K (`bovi/epa-data-scraper`) Actor

Search US EPA regulated facilities by state. Returns compliance status, violation history, enforcement actions, and penalties across Clean Water Act, Clean Air Act, RCRA, and Drinking Water programs.

- **URL**: https://apify.com/bovi/epa-data-scraper.md
- **Developed by:** [Vitalii Bondarev](https://apify.com/bovi) (community)
- **Categories:** Lead generation, Business, MCP servers
- **Stats:** 1 total users, 1 monthly users, 0.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $9.60 / 1,000 facility records

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
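As a rough illustration of the per-event arithmetic, the sketch below multiplies a record count by a price per 1,000 records. The function name `estimate_cost` is hypothetical, the price is passed in explicitly because it depends on your plan, and any separate Actor start fee is ignored.

```python
from decimal import Decimal, ROUND_HALF_UP


def estimate_cost(records: int, price_per_1k: Decimal) -> Decimal:
    """Estimate the charge in USD for a number of returned facility records.

    Assumption: billing is linear in the record count and no other
    events (such as an Actor start fee) are charged.
    """
    if records < 0:
        raise ValueError("records must be non-negative")
    cost = Decimal(records) * price_per_1k / Decimal(1000)
    # Round to whole cents for display purposes.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    # Price taken from the "from $9.60 / 1,000 facility records" line above.
    print(estimate_cost(2500, Decimal("9.60")))
```

`Decimal` is used instead of `float` so that small per-record prices do not pick up binary rounding noise.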

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### EPA Facility Scraper — Violations, Enforcement & Penalties

Search US EPA regulated facilities by state. Returns compliance status, violation history, enforcement actions, and penalties across Clean Water Act, Clean Air Act, RCRA, and Drinking Water programs.

#### What it does
This Actor extracts structured public data via official public endpoints, with no login and no API key required from you. Results are returned as clean JSON dataset items, one row per record, with a `parse_confidence` field for quality.

#### Use cases
- Lead generation, market research, and competitive intelligence.
- Feeding pipelines, dashboards, and AI agents with fresh structured data.
- Monitoring changes over time on a schedule.

#### Input
Provide the search parameters (see the input schema). Bound the run with `maxResults`.

#### Fields returned
Each record includes the facility name and address, registry ID, current compliance status, counts of recent violations and formal enforcement actions, total assessed penalties, the regulatory programs in scope (Clean Water Act, Clean Air Act, RCRA hazardous waste, Safe Drinking Water Act), and a `parse_confidence` score so you can filter on data quality.

#### Data source
Data comes from the EPA's official public compliance and enforcement records — the same authoritative source regulators publish — so results are accurate and require no credentials or scraping of locked pages.

#### Output
One dataset item per record with the key fields and `parse_confidence`. Export to JSON/CSV/Excel or pull via the Apify API/MCP. Schedule the Actor to track compliance changes over time and feed alerts, dashboards, or AI agents with fresh structured environmental data.


### 💰 Pricing & how we compare

**Pay-per-result (PPE): $9.90 / 1K facilities ($0.0099 each).** You are billed per `facility-record` actually returned — plus the tiny
`apify-actor-start` fee Apify waives for short runs. No subscription, no API key, no proxy fee on top.

**Our edge:** Official EPA public compliance data · violations + enforcement + penalties + programs · `parse_confidence`.

**Pricing examples** (pay only for what you get, minus Apify's 20%):

| Volume | Cost |
|---|---|
| 100 facilities | $0.99 |
| 1,000 facilities | $9.90 |
| 10,000 facilities | $99.00 |

#### How rivals price the same job (live Apify Store, checked 2026-06-09)

| Actor | Their price | What they lack vs us |
|---|---|---|
| `fortuitous_pirate/epa-echo-facilities` | $0.01 / facility | we charm-undercut to $0.0099 + add parse_confidence |
| `ryanclinton/epa-echo-search` | $0.03 / result | 3× our price |
| `compute-edge/epa-echo-scraper` | $0.004 / result | cheaper but 0 monthly users / unproven |

_Prices above are competitors' live Store prices at the time of writing; ours is set to sit just
below the strongest comparable while returning richer, quality-scored data._

### 🤖 Use with AI agents (MCP)

This Actor is agent-ready (category **MCP_SERVERS**). Point any MCP client (Claude Desktop, Cursor,
n8n AI, LangGraph) at it:

```json
{
  "mcpServers": {
    "apify": {
      "url": "https://mcp.apify.com/?actors=bovi/epa-data-scraper",
      "headers": { "Authorization": "Bearer <YOUR_APIFY_TOKEN>" }
    }
  }
}
````

# Actor input Schema

## `state` (type: `string`):

US state abbreviation to filter facilities (e.g. TX, CA, NY). Leave blank to search nationwide (slower).

## `program` (type: `string`):

Which EPA enforcement program to query. CWA = Clean Water Act (NPDES permits). AIR = Clean Air Act. RCRA = Hazardous Waste. ALL = union of all four programs (slower, more comprehensive).

## `violationsOnly` (type: `boolean`):

If true, return only facilities currently in violation or with recent enforcement actions. If false, return all regulated facilities (compliant + non-compliant).

## `significantOnly` (type: `boolean`):

If true, return only facilities with Significant Non-Compliance (SNC) status — the highest-priority violators. Useful for focused outreach. Requires violationsOnly=true.

## `maxResults` (type: `integer`):

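Maximum number of facility records to return. The published input schema allows values from 1 to 50000 (default 100). The schema description also mentions 0 = unlimited, but 0 is below the schema minimum of 1 and may be rejected.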

## Actor input object example

```json
{
  "state": "TX",
  "program": "CWA",
  "violationsOnly": true,
  "significantOnly": false,
  "maxResults": 100
}
```
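Before starting a paid run, it can help to check the input locally. The following is a minimal sketch, assuming the constraints listed in the input schema on this page (the `program` enum, the 1 to 50000 range for `maxResults`, and `significantOnly` requiring `violationsOnly`). The helper `validate_input` is hypothetical and not part of the Actor, and the uppercase two-letter check on `state` is an assumption based on the examples TX, CA, NY.

```python
ALLOWED_PROGRAMS = {"CWA", "AIR", "RCRA", "ALL"}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems found in an Actor input object (empty if none)."""
    errors = []

    # Blank state means a nationwide search, so only non-empty values are checked.
    state = run_input.get("state", "")
    if state and not (len(state) == 2 and state.isalpha() and state.isupper()):
        errors.append("state must be a 2-letter uppercase abbreviation or blank")

    program = run_input.get("program", "CWA")
    if program not in ALLOWED_PROGRAMS:
        errors.append(f"program must be one of {sorted(ALLOWED_PROGRAMS)}")

    # bool is a subclass of int in Python, so exclude it explicitly.
    max_results = run_input.get("maxResults", 100)
    if (
        isinstance(max_results, bool)
        or not isinstance(max_results, int)
        or not 1 <= max_results <= 50000
    ):
        errors.append("maxResults must be an integer from 1 to 50000")

    if run_input.get("significantOnly", False) and not run_input.get("violationsOnly", True):
        errors.append("significantOnly requires violationsOnly=true")

    return errors


if __name__ == "__main__":
    example = {
        "state": "TX",
        "program": "CWA",
        "violationsOnly": True,
        "significantOnly": False,
        "maxResults": 100,
    }
    print(validate_input(example))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.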

# Actor output Schema

## `results` (type: `string`):

Dataset containing EPA Data Scraper records (`facility_name`, `program`, `city`, `state`, `violation_status`, `quarters_with_violations`, `formal_enforcement_count`, `total_penalties_usd`, `date_last_inspection`, `permit_status`, `epa_report_url`, `parse_confidence`).
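One way to use these fields downstream is to drop low-confidence rows and then aggregate penalties. The sketch below assumes `parse_confidence` is a number between 0 and 1 and that `total_penalties_usd` is numeric or missing; neither type is stated on this page. The sample records and the 0.8 threshold are invented for illustration.

```python
def high_confidence(records, threshold=0.8):
    """Keep records whose parse_confidence is at least `threshold`.

    Assumption: parse_confidence is a number in [0, 1]; missing values count as 0.
    """
    return [r for r in records if (r.get("parse_confidence") or 0) >= threshold]


def penalties_by_state(records):
    """Sum total_penalties_usd per state, treating missing penalties as 0."""
    totals = {}
    for record in records:
        state = record.get("state") or "UNKNOWN"
        totals[state] = totals.get(state, 0) + (record.get("total_penalties_usd") or 0)
    return totals


if __name__ == "__main__":
    # Hypothetical records using field names from the output schema.
    sample = [
        {"facility_name": "Example Plant A", "state": "TX",
         "total_penalties_usd": 12000, "parse_confidence": 0.95},
        {"facility_name": "Example Plant B", "state": "TX",
         "total_penalties_usd": 500, "parse_confidence": 0.40},
        {"facility_name": "Example Plant C", "state": "CA",
         "total_penalties_usd": None, "parse_confidence": 0.90},
    ]
    print(penalties_by_state(high_confidence(sample)))
```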

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("bovi/epa-data-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("bovi/epa-data-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
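The example above sends an empty input, so the Actor falls back to its defaults. A possible variation, sketched below, passes the documented input fields explicitly and collects the items into a list. The function `fetch_facilities` is hypothetical; it expects an `ApifyClient` instance created as in the example above and uses only the `actor(...).call(...)` and `dataset(...).iterate_items()` calls already shown there.

```python
def fetch_facilities(client, state="TX", program="CWA", max_results=100):
    """Run the Actor with explicit input and return its dataset items as a list.

    `client` is assumed to be an apify_client.ApifyClient instance.
    """
    run_input = {
        "state": state,
        "program": program,
        "violationsOnly": True,
        "significantOnly": False,
        "maxResults": max_results,
    }
    # Start the run and wait for it to finish.
    run = client.actor("bovi/epa-data-scraper").call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run information")
    # Read every item from the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call might look like `fetch_facilities(ApifyClient("<YOUR_API_TOKEN>"), state="CA", max_results=50)`. Keeping `max_results` small while testing limits how many billable records a run returns.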

## CLI example

```bash
echo '{}' |
apify call bovi/epa-data-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=bovi/epa-data-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "EPA Facility Scraper — Violations & Penalties | $9.9/1K",
        "description": "Search US EPA regulated facilities by state. Returns compliance status, violation history, enforcement actions, and penalties across Clean Water Act, Clean Air Act, RCRA, and Drinking Water programs.",
        "version": "0.1",
        "x-build-id": "HfZSsQU8NUIH6jj3E"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/bovi~epa-data-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-bovi-epa-data-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/bovi~epa-data-scraper/runs": {
            "post": {
                "operationId": "runs-sync-bovi-epa-data-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/bovi~epa-data-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-bovi-epa-data-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "state": {
                        "title": "State (2-letter code)",
                        "type": "string",
                        "description": "US state abbreviation to filter facilities (e.g. TX, CA, NY). Leave blank to search nationwide (slower).",
                        "default": "TX"
                    },
                    "program": {
                        "title": "EPA Program",
                        "enum": [
                            "CWA",
                            "AIR",
                            "RCRA",
                            "ALL"
                        ],
                        "type": "string",
                        "description": "Which EPA enforcement program to query. CWA = Clean Water Act (NPDES permits). AIR = Clean Air Act. RCRA = Hazardous Waste. ALL = union of all four programs (slower, more comprehensive).",
                        "default": "CWA"
                    },
                    "violationsOnly": {
                        "title": "Violations only",
                        "type": "boolean",
                        "description": "If true, return only facilities currently in violation or with recent enforcement actions. If false, return all regulated facilities (compliant + non-compliant).",
                        "default": true
                    },
                    "significantOnly": {
                        "title": "Significant Non-Compliance (SNC) only",
                        "type": "boolean",
                        "description": "If true, return only facilities with Significant Non-Compliance (SNC) status — the highest-priority violators. Useful for focused outreach. Requires violationsOnly=true.",
                        "default": false
                    },
                    "maxResults": {
                        "title": "Max results",
                        "minimum": 1,
                        "maximum": 50000,
                        "type": "integer",
                        "description": "Maximum number of facility records to return. 0 = unlimited (may be large).",
                        "default": 100
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
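For projects without a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following is a minimal sketch using only the Python standard library. The helper names `build_request` and `fetch_items` are hypothetical, and the sketch assumes the endpoint responds with the dataset items as a JSON array.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/bovi~epa-data-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the POST request described by the OpenAPI specification."""
    # The specification passes the token as a required query parameter.
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_items(token: str, run_input: dict, timeout: float = 300):
    """Run the Actor synchronously and return the parsed response body.

    Assumption: the response body is JSON. The call blocks until the run ends.
    """
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    items = fetch_items("<YOUR_API_TOKEN>", {"state": "TX", "maxResults": 10})
    print(len(items))
```

Splitting request construction from the network call keeps the URL and body easy to inspect or log before any billable run is started.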
