# MSHA Mine Safety Scraper (`compute-edge/msha-mine-safety-scraper`) Actor

Downloads and parses MSHA (Mine Safety and Health Administration) bulk CSV data files covering mine information, violations, and inspections from the US Department of Labor.

- **URL**: https://apify.com/compute-edge/msha-mine-safety-scraper.md
- **Developed by:** [Compute Edge](https://apify.com/compute-edge) (community)
- **Categories:** Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## MSHA Mine Safety & Health Data Scraper

Extract **mine safety and health data** from the U.S. Department of Labor's **Mine Safety and Health Administration (MSHA)** database. This Actor downloads and parses the official MSHA bulk data files, covering over **91,000 mines**, millions of violations, and hundreds of thousands of inspections reported under federal mining safety law.

MSHA data is a critical resource for **mining industry compliance monitoring**, **workplace safety analysis**, **insurance risk assessment**, and **industrial supply chain due diligence**. Every mine operating in the United States — coal, metal, and non-metal — is required by law to be registered and inspected by MSHA. This Actor makes that data instantly accessible in JSON, CSV, or Excel format.

### Key Features

| Feature | Description |
|---------|-------------|
| **Three datasets** | Mines (registry), Violations (citations & penalties), Inspections (inspection events) |
| **State filtering** | Filter mines by US state abbreviation (WV, KY, PA, etc.) |
| **91,000+ mines** | Complete registry of all US mines under MSHA jurisdiction |
| **Penalty data** | Proposed penalties, current penalties, and amounts paid for violations |
| **No authentication** | Uses freely available MSHA public data — no API key needed |
| **Automatic parsing** | Downloads ZIP archives, extracts pipe-delimited CSV, and outputs clean JSON |
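
The "Automatic parsing" step can be pictured with a short Python sketch. This is not the Actor's source code; it only illustrates reading a pipe-delimited file out of a ZIP archive with the standard library. The archive member name, the column names, and the `latin-1` encoding are assumptions made for illustration.

```python
import csv
import io
import zipfile


def parse_pipe_delimited_zip(zip_bytes: bytes, member: str) -> list[dict]:
    """Read one pipe-delimited text file from a ZIP archive into a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member) as raw:
            # Assumption: latin-1 decodes any byte sequence; the real files may use another encoding.
            text = io.TextIOWrapper(raw, encoding="latin-1", newline="")
            reader = csv.DictReader(text, delimiter="|")
            return [dict(row) for row in reader]


# Build a tiny in-memory archive so the sketch runs without network access.
# The file name and column names below are hypothetical.
sample = "MINE_ID|MINE_NAME|STATE\n0000001|EXAMPLE MINE A|WV\n0000002|EXAMPLE MINE B|KY\n"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("Sample.txt", sample)

rows = parse_pipe_delimited_zip(buffer.getvalue(), "Sample.txt")
wv_rows = [row for row in rows if row["STATE"] == "WV"]  # state filter
print(wv_rows)
```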

### What Data Can You Extract?

#### Mines Dataset

| Field | Description |
|-------|-------------|
| `mineId` | Unique MSHA mine identifier |
| `mineName` | Name of the mine |
| `mineType` | Surface or Underground |
| `mineStatus` | Active, Abandoned, Intermittent, etc. |
| `operatorName` | Current mine operator |
| `controllerName` | Controlling entity |
| `state` | US state abbreviation |
| `county` | County name |
| `commodity` | Primary commodity (Coal, Sand & Gravel, etc.) |
| `latitude`, `longitude` | GPS coordinates |
| `avgMineEmployment` | Average number of employees |

#### Violations Dataset

| Field | Description |
|-------|-------------|
| `violationNo` | Unique violation number |
| `mineId` | Associated mine ID |
| `violationIssueDt` | Date violation was issued |
| `proposedPenalty` | Initial penalty amount in dollars |
| `currentPenalty` | Current penalty after adjustments |
| `negligence` | Negligence classification |
| `gravity` | Gravity of violation |
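
As a rough illustration of working with these fields, the sketch below sums penalties per mine. It assumes penalty values may arrive as strings (as the numeric fields do in the Output Example) and that some may be empty; the sample records are invented.

```python
from collections import defaultdict
from decimal import Decimal, InvalidOperation


def total_penalties_by_mine(violations: list[dict], field: str = "currentPenalty") -> dict:
    """Sum one penalty field per mineId, skipping values that are not numeric."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for record in violations:
        try:
            amount = Decimal(str(record.get(field) or "0"))
        except InvalidOperation:
            continue  # not a number, so leave it out of the total
        totals[record["mineId"]] += amount
    return dict(totals)


# Hypothetical records shaped like the Violations dataset fields above.
sample = [
    {"violationNo": "V1", "mineId": "0000001", "proposedPenalty": "500", "currentPenalty": "350.50"},
    {"violationNo": "V2", "mineId": "0000001", "proposedPenalty": "100", "currentPenalty": "100"},
    {"violationNo": "V3", "mineId": "0000002", "proposedPenalty": "250", "currentPenalty": ""},
]
totals = total_penalties_by_mine(sample)
print(totals)
```

Pass `field="proposedPenalty"` to total the initial amounts instead.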

#### Inspections Dataset

| Field | Description |
|-------|-------------|
| `eventNo` | Inspection event number |
| `mineId` | Associated mine ID |
| `inspectionBeginDt` | Inspection start date |
| `inspectionEndDt` | Inspection end date |
| `inspType` | Inspection type code |
| `violationsIssued` | Number of violations issued |
| `totalViolations` | Total violation count |
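
The two date fields can be combined to estimate how long an inspection lasted. The sketch below assumes dates are formatted as `MM/DD/YYYY`; this README does not state the date format, so check a real record and adjust `DATE_FORMAT` if needed. The sample records are invented.

```python
from datetime import datetime
from typing import Optional

DATE_FORMAT = "%m/%d/%Y"  # assumption: change this to match the dates in your dataset


def inspection_days(record: dict, date_format: str = DATE_FORMAT) -> Optional[int]:
    """Return the number of days between inspection start and end, or None if a date is missing."""
    begin = record.get("inspectionBeginDt")
    end = record.get("inspectionEndDt")
    if not begin or not end:
        return None
    delta = datetime.strptime(end, date_format) - datetime.strptime(begin, date_format)
    return delta.days


# Hypothetical records shaped like the Inspections dataset fields above.
closed = {"eventNo": "E1", "mineId": "0000001", "inspectionBeginDt": "01/06/2025", "inspectionEndDt": "01/09/2025"}
still_open = {"eventNo": "E2", "mineId": "0000001", "inspectionBeginDt": "01/10/2025", "inspectionEndDt": ""}

print(inspection_days(closed))
print(inspection_days(still_open))
```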

### How to Scrape MSHA Mine Safety Data

1. **Go to this Actor's page** on the Apify Store
2. **Click "Start"** to open the input form
3. **Set your filters:**
   - Select a **Data Type**: Mines, Violations, or Inspections
   - Enter a **State** abbreviation (e.g., `WV` for West Virginia) — or leave blank for all states
   - Set **Max Results** (default: 1,000; set to 0 for all records)
4. **Click "Start"** to run the Actor
5. **Download your data** in JSON, CSV, or Excel format from the Dataset tab

### Input Example

```json
{
    "dataType": "mines",
    "state": "WV",
    "maxResults": 500
}
````

### Output Example

```json
{
    "mineId": "4601432",
    "mineName": "BIRCH RIVER OPERATION",
    "mineType": "Surface",
    "mineStatus": "Active",
    "operatorName": "LEXINGTON COAL CO LLC",
    "controllerName": "FORESIGHT ENERGY LP",
    "state": "WV",
    "county": "Nicholas",
    "commodity": "Coal (Bituminous)",
    "latitude": "38.48861",
    "longitude": "-80.71083",
    "avgMineEmployment": "156"
}
```
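
Note that `latitude`, `longitude`, and `avgMineEmployment` are strings in the example above. A minimal sketch for converting them to numbers before mapping or aggregating, assuming empty or malformed values should become `None`:

```python
def normalize_mine(record: dict) -> dict:
    """Return a copy of a mine record with numeric fields converted from strings."""

    def to_float(value):
        try:
            return float(value)
        except (TypeError, ValueError):
            return None

    def to_int(value):
        try:
            return int(value)
        except (TypeError, ValueError):
            return None

    return {
        **record,
        "latitude": to_float(record.get("latitude")),
        "longitude": to_float(record.get("longitude")),
        "avgMineEmployment": to_int(record.get("avgMineEmployment")),
    }


# Values copied from the Output Example above.
mine = normalize_mine({
    "mineId": "4601432",
    "latitude": "38.48861",
    "longitude": "-80.71083",
    "avgMineEmployment": "156",
})
print(mine)
```

`mineId` is left as a string on purpose so that any leading zeros survive.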

### Pricing

This Actor uses **pay-per-result** pricing:

| Event | Price |
|-------|-------|
| Actor start | $0.00005 |
| Per result | $0.002 |

The MSHA data is free and public. You only pay for Apify compute resources plus the per-result fee above. A typical run of 1,000 mine records costs approximately $2.00 in Actor fees plus minimal compute costs.
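
The arithmetic behind that estimate can be sketched as below. The two fee constants come from the table in this section; the listing header quotes a different starting price, so check the Store page for current rates. Compute costs are not modeled.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.00005")  # per run, from the pricing table above
PER_RESULT_FEE = Decimal("0.002")     # per result, from the pricing table above


def estimate_actor_fees(result_count: int, runs: int = 1) -> Decimal:
    """Estimate Actor fees in USD for a total number of results spread over some runs."""
    if result_count < 0 or runs < 0:
        raise ValueError("result_count and runs must be non-negative")
    return ACTOR_START_FEE * runs + PER_RESULT_FEE * result_count


print(estimate_actor_fees(1000))  # one run returning 1,000 records
```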

### Use Cases

- **Mining Industry Compliance**: Monitor mine safety records, violations, and penalties for regulatory compliance
- **Insurance & Risk Assessment**: Evaluate mine safety history for underwriting and premium calculations
- **Equipment & Supply Sales**: Identify active mines by commodity, state, and size for targeted B2B outreach
- **Workplace Safety Research**: Analyze violation trends, penalty amounts, and inspection patterns across the mining industry
- **Environmental Analysis**: Map mine locations and operations for environmental impact studies
- **Investigative Journalism**: Data-driven reporting on mining safety enforcement

### Integrations

Connect this Actor to your existing workflows:

- Export to **Google Sheets** for collaborative analysis
- Send results to **Slack** or **email** for automated alerts
- Feed into **Zapier**, **Make**, or **n8n** for custom automation
- Use the Apify API to integrate directly with your application

### FAQ

#### Is it legal to scrape MSHA mine safety data?

Yes. MSHA data is publicly available through the MSHA Open Government Data portal, a free public service provided by the U.S. Department of Labor. The data is in the public domain and freely available for any use.

#### How much does it cost to scrape MSHA?

The Actor charges $0.002 per result plus a $0.00005 Actor start fee. A typical run of 1,000 mine records costs approximately $2.00 in Actor fees plus minimal compute costs. See the pricing table above for details.

#### Can I export MSHA data to Excel or CSV?

Yes. Apify supports exporting data in JSON, CSV, Excel, XML, HTML, and RSS formats. After the Actor run completes, go to the Dataset tab and choose your preferred export format.
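
Exports can also be fetched over HTTP from the Apify API's dataset items endpoint. The helper below is a small sketch that only builds the URL; `dataset_export_url` is a hypothetical name, and for private datasets you would still need to authenticate the request with your API token.

```python
from urllib.parse import urlencode

# Format identifiers for the export types listed above (Excel is "xlsx").
EXPORT_FORMATS = {"json", "csv", "xlsx", "xml", "html", "rss"}


def dataset_export_url(dataset_id: str, fmt: str = "csv") -> str:
    """Build the dataset items URL for a given export format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


# "abc123" is a placeholder; use the defaultDatasetId of your own run.
print(dataset_export_url("abc123", "xlsx"))
```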

#### How often is the MSHA data updated?

MSHA updates the bulk data files regularly, and the mines registry reflects the current status of all registered mines. To keep your copy current, schedule this Actor to run at any interval: daily, weekly, or monthly.

#### What types of mines are covered?

All mines under MSHA jurisdiction are included: coal mines, metal mines, non-metal mines (sand, gravel, limestone, etc.), and stone quarries. The mines dataset contains 91,000+ records, while violations and inspections contain millions of records.

### Other Scrapers by SeatSignal

- [OSHA Inspections Scraper](https://apify.com/seatsignal/osha-inspections-scraper) — Extract workplace safety inspection and violation data
- [EPA ECHO Environmental Compliance Scraper](https://apify.com/seatsignal/epa-echo-scraper) — Extract compliance data for 800K+ EPA-regulated facilities
- [EPA Toxics Release Inventory (TRI) Scraper](https://apify.com/seatsignal/epa-tri-scraper) — Extract toxic chemical release data from EPA TRI
- [FRA Railroad Accidents Scraper](https://apify.com/seatsignal/fra-railroad-accidents-scraper) — Extract railroad accident and incident data
- [IQS Directory Scraper](https://apify.com/seatsignal/iqsdirectory-scraper) — Scrape industrial B2B supplier listings

### Legal Disclaimer

This Actor accesses publicly available data from the MSHA Open Government Data portal, a free public service provided by the US Department of Labor. The data is in the public domain and freely available for any use.

This Actor does not bypass any authentication, does not violate any terms of service, and respects server resources. The Actor is provided as-is without warranty. Users are responsible for ensuring their use of the data complies with applicable laws and regulations.

For questions or support, please open an issue on this Actor's page.

# Actor input Schema

## `dataType` (type: `string`):

Which MSHA dataset to download

## `state` (type: `string`):

Filter by US state abbreviation (e.g., WV, KY, PA). Leave empty for all states.

## `maxResults` (type: `integer`):

Maximum number of records to return (0 = all)

## Actor input object example

```json
{
  "dataType": "mines",
  "state": "",
  "maxResults": 1000
}
```
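
Before starting a paid run, it can help to validate the input on the client side. The sketch below mirrors the schema above (allowed `dataType` values, non-negative `maxResults`). `build_input` is a hypothetical helper, and upper-casing the state code is an assumption about what the Actor expects.

```python
import re

DATA_TYPES = ("mines", "violations", "inspections")


def build_input(data_type: str = "mines", state: str = "", max_results: int = 1000) -> dict:
    """Validate values against the input schema and return the Actor input object."""
    if data_type not in DATA_TYPES:
        raise ValueError(f"dataType must be one of {DATA_TYPES}")
    state = state.strip().upper()  # assumption: the Actor expects upper-case abbreviations
    if state and not re.fullmatch(r"[A-Z]{2}", state):
        raise ValueError("state must be a two-letter abbreviation or empty")
    if not isinstance(max_results, int) or max_results < 0:
        raise ValueError("maxResults must be a non-negative integer (0 = all)")
    return {"dataType": data_type, "state": state, "maxResults": max_results}


print(build_input("violations", " wv ", 500))
```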

# Actor output Schema

## `dataset` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input. An empty object uses the schema defaults:
// dataType "mines", all states, maxResults 1000.
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("compute-edge/msha-mine-safety-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input. An empty dict uses the schema defaults:
# dataType "mines", all states, maxResults 1000.
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("compute-edge/msha-mine-safety-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{}' |
apify call compute-edge/msha-mine-safety-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=compute-edge/msha-mine-safety-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "MSHA Mine Safety Scraper",
        "description": "Downloads and parses MSHA (Mine Safety and Health Administration) bulk CSV data files covering mine information, violations, inspections, and accidents from the US Department of Labor.",
        "version": "0.1",
        "x-build-id": "A1h0L9sl8E3HAd01c"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/compute-edge~msha-mine-safety-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-compute-edge-msha-mine-safety-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/compute-edge~msha-mine-safety-scraper/runs": {
            "post": {
                "operationId": "runs-sync-compute-edge-msha-mine-safety-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/compute-edge~msha-mine-safety-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-compute-edge-msha-mine-safety-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "dataType": {
                        "title": "Data Type",
                        "enum": [
                            "mines",
                            "violations",
                            "inspections"
                        ],
                        "type": "string",
                        "description": "Which MSHA dataset to download",
                        "default": "mines"
                    },
                    "state": {
                        "title": "State",
                        "type": "string",
                        "description": "Filter by US state abbreviation (e.g., WV, KY, PA). Leave empty for all states.",
                        "default": ""
                    },
                    "maxResults": {
                        "title": "Max Results",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of records to return (0 = all)",
                        "default": 1000
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
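
If you prefer not to install a client library, the `run-sync-get-dataset-items` path from the specification above can be called with Python's standard library. The sketch below only builds the request; `build_sync_request` is a hypothetical helper name, and sending the request starts a billable run, so that part is left commented out.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/compute-edge~msha-mine-safety-scraper/run-sync-get-dataset-items"


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urlencode({"token": token})  # the spec passes the token as a query parameter
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_sync_request("<YOUR_API_TOKEN>", {"dataType": "mines", "state": "WV", "maxResults": 10})
print(request.get_method(), request.full_url)

# Sending the request starts a paid run. Uncomment once the token is set:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
#     print(len(items))
```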
