# EPA Safe Drinking Water Scraper (`compute-edge/epa-drinking-water-scraper`) Actor

Scrapes EPA's Safe Drinking Water Information System (SDWIS) via the Envirofacts API. Get water system data, violations, and enforcement actions by state.

- **URL**: https://apify.com/compute-edge/epa-drinking-water-scraper.md
- **Developed by:** [Compute Edge](https://apify.com/compute-edge) (community)
- **Categories:** Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event. You are not charged for the Apify platform usage, but only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## EPA Safe Drinking Water (SDWIS) Scraper

Extract **drinking water system data** from the EPA's **Safe Drinking Water Information System (SDWIS)** — the most comprehensive public database of drinking water systems and their regulatory compliance in the United States. This Actor wraps the official EPA Envirofacts REST API to deliver structured data on over **433,000 public water systems**, their violations, and enforcement actions.

SDWIS is the authoritative source for **water quality compliance monitoring**, **environmental due diligence**, **public health research**, and **infrastructure investment analysis**. Every public water system in the US is tracked by the EPA under the Safe Drinking Water Act. This Actor makes that data instantly accessible in JSON, CSV, or Excel format.

### Key Features

| Feature | Description |
|---------|-------------|
| **Three datasets** | Water Systems (registry), Violations (compliance issues), Enforcement Actions |
| **State filtering** | Filter by any US state abbreviation (NY, CA, FL, etc.) |
| **433,000+ water systems** | Every public water system in the United States |
| **Violation tracking** | Health-based and monitoring violations with contaminant details |
| **No authentication** | Uses the free public EPA Envirofacts API — no API key needed |
| **Automatic pagination** | Handles large result sets across multiple API pages |
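
The automatic pagination listed above can be pictured with a small sketch. This is not the Actor's actual implementation: `paginate` and the `fetch_page(first_row, last_row)` callable are hypothetical names, and the row-range paging style is an assumption made for illustration.

```python
from typing import Callable, Iterator, List


def paginate(fetch_page: Callable[[int, int], List[dict]],
             page_size: int = 1000,
             max_results: int = 0) -> Iterator[dict]:
    """Yield records page by page until a short page or max_results is hit.

    fetch_page(first_row, last_row) is a hypothetical callable returning the
    records in that inclusive row range. max_results == 0 means "no limit",
    mirroring the Actor's maxResults input.
    """
    yielded = 0
    first_row = 0
    while True:
        page = fetch_page(first_row, first_row + page_size - 1)
        for record in page:
            yield record
            yielded += 1
            if max_results and yielded >= max_results:
                return  # stop as soon as the requested cap is reached
        if len(page) < page_size:
            return  # a short or empty page means nothing is left
        first_row += page_size
```

Passing the fetch function in as a parameter keeps the loop independent of any particular HTTP client, so it can be exercised with an in-memory list before pointing it at a real API.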

### What Data Can You Extract?

#### Water Systems Dataset

| Field | Description |
|-------|-------------|
| `pwsid` | Public Water System ID |
| `pwsName` | System name |
| `stateCode` | US state abbreviation |
| `cityName` | City served |
| `zipCode` | ZIP code |
| `populationServedCount` | Population served by the system |
| `serviceConnectionsCount` | Number of service connections |
| `ownerTypeCode` | Owner type (Federal, State, Local, Private) |
| `gwSwCode` | Water source type (Ground Water, Surface Water) |
| `addressLine1` | System address |

#### Violations Dataset

| Field | Description |
|-------|-------------|
| `pwsid` | Public Water System ID |
| `violationCode` | Violation type code |
| `violationCategoryCode` | MCL, Monitoring, Treatment Technique, etc. |
| `isHealthBasedInd` | Whether the violation is health-based |
| `contaminantCode` | Contaminant involved |
| `complPerBeginDate` | Compliance period start |
| `complPerEndDate` | Compliance period end |
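
As a rough illustration of working with these fields, the sketch below keeps only health-based violations and counts them per water system. It assumes `isHealthBasedInd` is encoded as `"Y"`/`"N"`, which this README does not state, and the sample records are made-up placeholders rather than real SDWIS data.

```python
from collections import Counter
from typing import Iterable, List


def health_based(violations: Iterable[dict]) -> List[dict]:
    """Keep violations flagged as health-based ("Y" is an assumed encoding)."""
    return [
        v for v in violations
        if str(v.get("isHealthBasedInd", "")).strip().upper() == "Y"
    ]


def count_by_system(violations: Iterable[dict]) -> Counter:
    """Count violations per Public Water System ID."""
    return Counter(v["pwsid"] for v in violations)


# Hypothetical placeholder records, for illustration only
sample = [
    {"pwsid": "XX0000001", "isHealthBasedInd": "Y"},
    {"pwsid": "XX0000001", "isHealthBasedInd": "N"},
    {"pwsid": "XX0000002", "isHealthBasedInd": "Y"},
]

for pwsid, total in count_by_system(health_based(sample)).most_common():
    print(pwsid, total)
```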

#### Enforcement Actions Dataset

| Field | Description |
|-------|-------------|
| `pwsid` | Public Water System ID |
| `enforcementId` | Enforcement action identifier |
| `enforcementDate` | Date of enforcement |
| `enforcementActionTypeCode` | Type of enforcement action |

### How to Scrape EPA Drinking Water Data

1. **Go to this Actor's page** on the Apify Store
2. **Click "Start"** to open the input form
3. **Set your filters:**
   - Select a **Data Type**: Water Systems, Violations, or Enforcement Actions
   - Enter a **State Code** (e.g., `NY` for New York) — this field is required
   - Set **Max Results** (default: 500; set to 0 for all records)
4. **Click "Start"** to run the Actor
5. **Download your data** in JSON, CSV, or Excel format from the Dataset tab

### Input Example

```json
{
    "dataType": "waterSystems",
    "state": "CA",
    "maxResults": 1000
}
````

### Output Example

```json
{
    "pwsid": "CA0103002",
    "pwsName": "EAST BAY MUNICIPAL UTILITY DISTRICT",
    "stateCode": "CA",
    "cityName": "OAKLAND",
    "populationServedCount": 1400000,
    "serviceConnectionsCount": 395000,
    "ownerTypeCode": "L",
    "gwSwCode": "SW",
    "addressLine1": "375 11TH STREET"
}
```
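
The coded fields in a record like the one above can be turned into readable labels. The following is a minimal sketch: the lookup tables cover only the owner and source types named in the field table, and the letter-to-label mapping (for example `L` for Local, `SW` for Surface Water) is an assumption inferred from the example, not a complete SDWIS code list.

```python
# Assumed code-to-label mappings, limited to the types this README names
OWNER_TYPES = {"F": "Federal", "S": "State", "L": "Local", "P": "Private"}
SOURCE_TYPES = {"GW": "Ground Water", "SW": "Surface Water"}


def describe(system: dict) -> str:
    """Build a one-line summary of a water system record."""
    owner = OWNER_TYPES.get(system.get("ownerTypeCode"), "Unknown")
    source = SOURCE_TYPES.get(system.get("gwSwCode"), "Unknown")
    population = system.get("populationServedCount") or 0
    return (f"{system['pwsName']} ({system['pwsid']}): "
            f"{owner} owner, {source}, serves {population:,}")


# The record from the output example above
example = {
    "pwsid": "CA0103002",
    "pwsName": "EAST BAY MUNICIPAL UTILITY DISTRICT",
    "stateCode": "CA",
    "cityName": "OAKLAND",
    "populationServedCount": 1400000,
    "serviceConnectionsCount": 395000,
    "ownerTypeCode": "L",
    "gwSwCode": "SW",
    "addressLine1": "375 11TH STREET",
}

print(describe(example))
```

Codes missing from the lookup tables fall back to `Unknown` instead of raising, which is safer when the real data contains values this sketch does not anticipate.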

### Pricing

This Actor uses **pay-per-result** pricing:

| Event | Price |
|-------|-------|
| Actor start | $0.00005 |
| Per result | $0.002 |

The EPA API is free and public. You only pay for Apify compute resources plus the per-result fee above. A typical run of 1,000 records costs approximately $2.00 in Actor fees plus minimal compute costs.
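
The Actor-fee arithmetic can be checked with a short sketch. It uses the two prices from the table in this section and leaves compute costs out, since their amount is not specified here; `estimate_actor_fee` is a hypothetical helper name.

```python
from decimal import Decimal

# Prices taken from the pricing table in this section
ACTOR_START_FEE = Decimal("0.00005")
PER_RESULT_FEE = Decimal("0.002")


def estimate_actor_fee(results: int, runs: int = 1) -> Decimal:
    """Estimate Actor fees in USD for a number of results across runs."""
    if results < 0 or runs < 0:
        raise ValueError("results and runs must be non-negative")
    return ACTOR_START_FEE * runs + PER_RESULT_FEE * results


print(estimate_actor_fee(1000))
```

`Decimal` is used instead of floats so that very small per-event prices add up without rounding noise.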

### Use Cases

- **Environmental Consulting**: Identify water systems with violations for compliance advisory services
- **Real Estate Due Diligence**: Check drinking water quality data for properties and communities
- **Water Treatment Sales**: Find water systems by size, source type, and location for targeted outreach
- **Public Health Research**: Analyze violation patterns, contaminant exposure, and enforcement trends
- **Infrastructure Investment**: Evaluate water system capacity and compliance for infrastructure funding decisions
- **Journalism & Investigations**: Data-driven reporting on drinking water safety (like the Flint water crisis)

### Integrations

Connect this Actor to your existing workflows:

- Export to **Google Sheets** for collaborative analysis
- Send results to **Slack** or **email** for automated alerts
- Feed into **Zapier**, **Make**, or **n8n** for custom automation
- Use the Apify API to integrate directly with your application

### FAQ

#### Is it legal to scrape EPA drinking water data?

Yes. EPA SDWIS data is publicly available through the official EPA Envirofacts API, a free public service provided by the U.S. Environmental Protection Agency. The data is in the public domain and freely available for any use.

#### How much does it cost to scrape EPA drinking water data?

The Actor charges $0.002 per result plus a $0.00005 Actor start fee. A typical run of 1,000 records costs approximately $2.00 in Actor fees plus minimal compute costs. See the pricing table above for details.

#### Can I export EPA drinking water data to Excel or CSV?

Yes. Apify supports exporting data in JSON, CSV, Excel, XML, HTML, and RSS formats. After the Actor run completes, go to the Dataset tab and choose your preferred export format.
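
Exports can also be fetched programmatically through the Apify dataset items endpoint, which accepts a `format` query parameter. The sketch below only builds the download URL; `dataset_export_url` is a hypothetical helper name, and whether a token is needed depends on how the dataset is shared.

```python
from typing import Optional
from urllib.parse import urlencode

# Values accepted by the `format` parameter (xlsx is the Excel export)
EXPORT_FORMATS = {"json", "jsonl", "csv", "xlsx", "xml", "html", "rss"}


def dataset_export_url(dataset_id: str, fmt: str = "csv",
                       token: Optional[str] = None) -> str:
    """Build a download URL for a dataset in the requested format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    params = {"format": fmt}
    if token:
        params["token"] = token
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"


# "<DATASET_ID>" is a placeholder for a run's defaultDatasetId
print(dataset_export_url("<DATASET_ID>", "xlsx"))
```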

#### How often is the EPA drinking water data updated?

You can schedule this Actor to run at any interval — daily, weekly, or monthly. The EPA updates the SDWIS database on a quarterly basis, reflecting the latest reporting from state primacy agencies.

#### What types of water systems are included?

All public water systems regulated under the Safe Drinking Water Act are included: community water systems (serving residential populations year-round), non-transient non-community systems (such as schools and offices), and transient non-community systems (such as campgrounds and gas stations). Over 433,000 water systems are tracked.

### Other Scrapers by SeatSignal

- [EPA ECHO Environmental Compliance Scraper](https://apify.com/seatsignal/epa-echo-scraper) — Extract compliance data for 800K+ EPA-regulated facilities
- [EPA Toxics Release Inventory (TRI) Scraper](https://apify.com/seatsignal/epa-tri-scraper) — Extract toxic chemical release data from EPA TRI
- [USBR Water Data Scraper](https://apify.com/seatsignal/usbr-water-data-scraper) — Extract U.S. Bureau of Reclamation water data
- [Energy Star Certified Products Scraper](https://apify.com/seatsignal/energystar-scraper) — Search 290K+ EPA Energy Star certified products
- [FEMA Disaster Declarations Scraper](https://apify.com/seatsignal/fema-disasters-scraper) — Extract 60K+ FEMA disaster declaration records

### Legal Disclaimer

This Actor accesses publicly available data from the EPA Envirofacts API, a free public service provided by the US Environmental Protection Agency. The data is in the public domain and freely available for any use.

This Actor does not bypass any authentication, does not violate any terms of service, and respects rate limits on the EPA API. The Actor is provided as-is without warranty. Users are responsible for ensuring their use of the data complies with applicable laws and regulations.

For questions or support, please open an issue on this Actor's page.

# Actor input Schema

## `dataType` (type: `string`):

Which SDWIS dataset to query

## `state` (type: `string`):

US state abbreviation (e.g., NY, CA, FL). Required.

## `maxResults` (type: `integer`):

Maximum number of records to return (0 = all)

## Actor input object example

```json
{
  "dataType": "waterSystems",
  "state": "NY",
  "maxResults": 500
}
```
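
Checking an input object locally before starting a run can catch mistakes early. The sketch below mirrors the three fields and defaults described above; `build_input` is a hypothetical helper, and the two-letter check on `state` is an assumption, since the schema only requires a string.

```python
# Allowed dataType values, as listed in the OpenAPI input schema
DATA_TYPES = ("waterSystems", "violations", "enforcementActions")


def build_input(state: str, data_type: str = "waterSystems",
                max_results: int = 500) -> dict:
    """Return a validated Actor input object."""
    state = state.strip().upper()
    # Assumption: state codes are two-letter abbreviations such as NY or CA
    if len(state) != 2 or not state.isalpha():
        raise ValueError(f"state must be a two-letter abbreviation, got {state!r}")
    if data_type not in DATA_TYPES:
        raise ValueError(f"dataType must be one of {DATA_TYPES}")
    if (isinstance(max_results, bool) or not isinstance(max_results, int)
            or max_results < 0):
        raise ValueError("maxResults must be an integer >= 0 (0 = all)")
    return {"dataType": data_type, "state": state, "maxResults": max_results}


print(build_input("ny"))
```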

# Actor output Schema

## `dataset` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input (`state` is required by the input schema)
const input = {
    dataType: 'waterSystems',
    state: 'NY',
    maxResults: 500,
};

// Run the Actor and wait for it to finish
const run = await client.actor("compute-edge/epa-drinking-water-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

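# Prepare the Actor input (`state` is required by the input schema)
run_input = {
    "dataType": "waterSystems",
    "state": "NY",
    "maxResults": 500,
}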

# Run the Actor and wait for it to finish
run = client.actor("compute-edge/epa-drinking-water-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{"dataType": "waterSystems", "state": "NY", "maxResults": 500}' |
apify call compute-edge/epa-drinking-water-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=compute-edge/epa-drinking-water-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "EPA Safe Drinking Water Scraper",
        "description": "Scrapes EPA's Safe Drinking Water Information System (SDWIS) via the Envirofacts API. Get water system data, violations, and enforcement actions by state.",
        "version": "0.1",
        "x-build-id": "kk5ueZdtgBpEUf4zX"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/compute-edge~epa-drinking-water-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-compute-edge-epa-drinking-water-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/compute-edge~epa-drinking-water-scraper/runs": {
            "post": {
                "operationId": "runs-sync-compute-edge-epa-drinking-water-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/compute-edge~epa-drinking-water-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-compute-edge-epa-drinking-water-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "state"
                ],
                "properties": {
                    "dataType": {
                        "title": "Data Type",
                        "enum": [
                            "waterSystems",
                            "violations",
                            "enforcementActions"
                        ],
                        "type": "string",
                        "description": "Which SDWIS dataset to query",
                        "default": "waterSystems"
                    },
                    "state": {
                        "title": "State Code",
                        "type": "string",
                        "description": "US state abbreviation (e.g., NY, CA, FL). Required.",
                        "default": "NY"
                    },
                    "maxResults": {
                        "title": "Max Results",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of records to return (0 = all)",
                        "default": 500
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
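
For projects that call the REST API directly, the `run-sync-get-dataset-items` path in the specification above can be targeted with the Python standard library alone. The sketch below only constructs the request, following the spec's `token` query parameter and JSON request body; `build_run_request` is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Server URL and path copied from the OpenAPI specification above
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/compute-edge~epa-drinking-water-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> Request:
    """Build a POST request that runs the Actor and returns dataset items."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


request = build_run_request(
    "<YOUR_API_TOKEN>",
    {"dataType": "violations", "state": "NY", "maxResults": 100},
)
print(request.get_method(), request.full_url.split("?")[0])
```

To send it, pass the request to `urllib.request.urlopen` and parse the response body as JSON. Because the call blocks until the run finishes, a large `maxResults` may need a generous timeout.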
