# BLS Labor Statistics Scraper (`crawlerbros/osha-inspection-violation-scraper`) Actor

Scrape US Bureau of Labor Statistics (BLS) employment, unemployment, and wage data by industry sector and state. Free government API, no authentication required.

- **URL**: https://apify.com/crawlerbros/osha-inspection-violation-scraper.md
- **Developed by:** [Crawler Bros](https://apify.com/crawlerbros) (community)
- **Categories:** Jobs, Automation, Integrations
- **Stats:** 2 total users, 1 monthly user, 100.0% of runs succeeded, bookmark count unavailable
- **User rating**: No ratings yet

## Pricing

From $3.00 / 1,000 results

This Actor is billed per event plus usage: you pay a fixed price for specific events and, separately, for Apify platform usage.
Because this Actor supports Apify Store discounts, the price decreases as your subscription plan tier increases.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
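As a rough illustration of the per-result arithmetic, the sketch below assumes a flat rate of $3.00 per 1,000 results. The listing quotes this as a "from" price, so the real charge depends on your plan, and platform usage is billed on top. The function name `estimate_result_cost` is hypothetical.

```python
# Assumption: flat rate taken from the "From $3.00 / 1,000 results" line above.
PRICE_PER_1000_RESULTS_USD = 3.00


def estimate_result_cost(result_count: int) -> float:
    """Estimate the per-result charge in USD, excluding platform usage."""
    return round(result_count / 1000 * PRICE_PER_1000_RESULTS_USD, 2)


print(estimate_result_cost(50))    # default maxItems
print(estimate_result_cost(2000))  # maximum maxItems per run
```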

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
The word "Actor" is always written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## OSHA Inspection & Violation Scraper

Scrape OSHA (Occupational Safety and Health Administration) inspection and violation records from the U.S. Department of Labor Enforcement Data API. Search inspections by company name, state, or SIC industry code; search violations by company or state. No authentication required.

### What this Actor does

This Actor queries the DOL Enforcement Data API (`enforcedata.dol.gov`) to retrieve:

- **Inspections**: Search OSHA inspection records by establishment name, US state, or Standard Industrial Classification (SIC) code
- **Violations**: Search OSHA violation/citation records by establishment name or state

### Input

| Field | Type | Description |
|-------|------|-------------|
| `mode` | select | `searchInspections` or `searchViolations` |
| `establishmentName` | string | Company/establishment name (partial match) |
| `state` | select | Two-letter US state code (e.g. `CA`, `TX`, `NY`) |
| `sicCode` | select | SIC industry code (e.g. `3310` for Steel Works) |
| `inspectionType` | select | Filter by inspection type (e.g. `B` = Complaint, `E` = Accident) |
| `dateFrom` | string | Filter records from date (YYYY-MM-DD) |
| `dateTo` | string | Filter records to date (YYYY-MM-DD) |
| `maxItems` | integer | Maximum records to return (1–2000, default: 50) |

#### Example Input

```json
{
  "mode": "searchInspections",
  "state": "CA",
  "maxItems": 10
}
````
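Before starting a run, it can help to check an input against the constraints in the Input table. The following is a minimal sketch, assuming the Actor accepts the fields exactly as the table lists them. The helper `check_input` is hypothetical and not part of the Actor or the Apify client.

```python
from datetime import date

# Values taken from the Input table; whether the Actor enforces them is an assumption.
ALLOWED_MODES = {"searchInspections", "searchViolations"}


def check_input(actor_input: dict) -> list[str]:
    """Return a list of problems found in a README-style input (empty if none)."""
    problems = []
    if actor_input.get("mode") not in ALLOWED_MODES:
        problems.append("mode must be searchInspections or searchViolations")
    max_items = actor_input.get("maxItems", 50)  # documented default: 50
    if not isinstance(max_items, int) or not 1 <= max_items <= 2000:
        problems.append("maxItems must be an integer from 1 to 2000")
    for key in ("dateFrom", "dateTo"):
        value = actor_input.get(key)
        if value is None:
            continue
        try:
            date.fromisoformat(value)  # parses YYYY-MM-DD strings
        except (TypeError, ValueError):
            problems.append(f"{key} must use the YYYY-MM-DD format")
    return problems


print(check_input({"mode": "searchInspections", "state": "CA", "maxItems": 10}))
```

An empty list means the input passed these local checks; it does not guarantee that the run itself will succeed.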

### Output

Each record contains:

| Field | Description |
|-------|-------------|
| `inspectionId` | OSHA inspection ID (activity number) |
| `establishmentName` | Name of the inspected business |
| `state` | US state abbreviation |
| `city` | City |
| `zipCode` | ZIP code |
| `naicsCode` | NAICS industry code |
| `industry` | Industry description |
| `inspectionDate` | Date inspection was opened (YYYY-MM-DD) |
| `closeCaseDate` | Date case was closed (YYYY-MM-DD) |
| `inspectionType` | Type of inspection (e.g. Complaint, Planned, Accident) |
| `scope` | Inspection scope (Complete, Partial, Records Only) |
| `unionStatus` | Union/non-union status |
| `employerSize` | Number of employees at establishment |
| `fatalities` | Number of fatalities |
| `injuries` | Number of injuries |
| `penaltyAmount` | Total current penalty amount (USD) |
| `totalViolationsCount` | Total violations cited |
| `seriousViolations` | Number of serious violations |
| `willfulViolations` | Number of willful violations |
| `repeatViolations` | Number of repeat violations |
| `citationId` | Citation ID (violations mode) |
| `violationType` | Violation type (Serious, Willful, Repeat, Other) |
| `standard` | OSHA standard cited |
| `description` | Violation description |
| `reportUrl` | Direct OSHA inspection report URL |
| `recordType` | Record type (e.g. `inspection`, as in the example output) |
| `scrapedAt` | ISO timestamp of when record was scraped |

#### Example Output

```json
{
  "inspectionId": "1234567",
  "establishmentName": "ACME Manufacturing Inc",
  "state": "CA",
  "city": "Los Angeles",
  "naicsCode": "3310",
  "industry": "Steel Works",
  "inspectionDate": "2023-06-15",
  "inspectionType": "Complaint",
  "penaltyAmount": 12500.0,
  "totalViolationsCount": 3,
  "seriousViolations": 2,
  "reportUrl": "https://www.osha.gov/pls/imis/establishment.inspection_detail?id=1234567",
  "recordType": "inspection",
  "scrapedAt": "2024-01-15T10:30:00+00:00"
}
```
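Records shaped like the example above are plain JSON objects, so they can be post-processed with ordinary dictionary code. The sketch below sums `penaltyAmount` per `establishmentName`. It assumes that both fields appear as in the output table and that a missing penalty is represented as `null` or omitted. The sample records are made up for illustration.

```python
from collections import defaultdict


def penalties_by_establishment(records):
    """Sum penaltyAmount per establishmentName, skipping records without a penalty."""
    totals = defaultdict(float)
    for record in records:
        amount = record.get("penaltyAmount")
        if amount is None:
            continue
        totals[record.get("establishmentName", "unknown")] += float(amount)
    return dict(totals)


# Hypothetical records shaped like the example output above.
sample = [
    {"establishmentName": "ACME Manufacturing Inc", "penaltyAmount": 12500.0},
    {"establishmentName": "ACME Manufacturing Inc", "penaltyAmount": 2500.0},
    {"establishmentName": "Other Co", "penaltyAmount": None},
]
print(penalties_by_establishment(sample))
```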

### Data Source

Data is sourced from the **DOL Enforcement Data API** (`enforcedata.dol.gov`), which is operated by the U.S. Department of Labor and provides public, unauthenticated access to OSHA inspection and violation records.

### SIC Industry Codes (select examples)

| Code | Industry |
|------|----------|
| 2011 | Meat Packing Plants |
| 1731 | Electrical Work |
| 3310 | Steel Works & Blast Furnaces |
| 5411 | Grocery Stores |
| 8051 | Skilled Nursing Care Facilities |
| 1600 | Heavy Construction |

See [OSHA SIC codes](https://www.osha.gov/sic-manual) for the full list.

### Inspection Types

| Code | Description |
|------|-------------|
| A | Planned |
| B | Complaint |
| C | Referral |
| D | Follow-up |
| E | Accident |
| F | Fatality / Catastrophe |
| V | National Emphasis Program |
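For client-side code that needs to translate between codes and labels, the table above can be mirrored as a lookup. This is a small sketch with a hypothetical helper name, `describe_inspection_type`. The mapping is copied from the table and is not fetched from the API.

```python
# Mapping copied from the Inspection Types table above.
INSPECTION_TYPES = {
    "A": "Planned",
    "B": "Complaint",
    "C": "Referral",
    "D": "Follow-up",
    "E": "Accident",
    "F": "Fatality / Catastrophe",
    "V": "National Emphasis Program",
}


def describe_inspection_type(code: str) -> str:
    """Return the description for an inspection type code, or 'Unknown'."""
    return INSPECTION_TYPES.get(code.strip().upper(), "Unknown")


print(describe_inspection_type("b"))
```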

### Frequently Asked Questions

**Is authentication required?**
No. The DOL Enforcement Data API is public and does not require any API key or credentials.

**How many records can I retrieve?**
You can retrieve up to 2,000 records per run using the `maxItems` parameter. For larger datasets, run multiple times with different filters.

**How current is the data?**
OSHA updates its enforcement database regularly. Records reflect the most recent data available in the DOL Enforcement system.

**Can I filter by both company and state?**
Yes, you can combine `establishmentName` and `state` filters to narrow results.

**What is the difference between searchInspections and searchViolations?**
`searchInspections` returns high-level inspection summary records. `searchViolations` returns individual citation/violation records with specific OSHA standard citations and penalty amounts.
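One way to follow the advice above about running multiple times with different filters is to split a long date range into consecutive windows and start one run per window. The sketch below only builds the input objects. It assumes the `dateFrom` and `dateTo` filters from the Input table are inclusive, which the README does not state. The helper names are hypothetical.

```python
from datetime import date, timedelta


def date_windows(start: date, end: date, days: int):
    """Yield (dateFrom, dateTo) strings for windows of at most `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    current = start
    while current <= end:
        window_end = min(current + timedelta(days=days - 1), end)
        yield current.isoformat(), window_end.isoformat()
        current = window_end + timedelta(days=1)


def build_inputs(state: str, start: date, end: date, days: int = 30, max_items: int = 2000):
    """Build one README-style input per date window (inclusive bounds assumed)."""
    return [
        {
            "mode": "searchInspections",
            "state": state,
            "dateFrom": date_from,
            "dateTo": date_to,
            "maxItems": max_items,
        }
        for date_from, date_to in date_windows(start, end, days)
    ]


for actor_input in build_inputs("CA", date(2023, 1, 1), date(2023, 3, 1)):
    print(actor_input["dateFrom"], actor_input["dateTo"])
```

If a single window still hits the `maxItems` cap, a smaller `days` value or an additional filter would be needed.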

# Actor input Schema

## `mode` (type: `string`):

What BLS data to fetch.

## `state` (type: `string`):

Two-letter US state abbreviation. Only applies to unemploymentByState mode. Leave empty for all states.

## `occupation` (type: `string`):

Filter series by keyword (e.g. 'manufacturing', 'retail', 'construction', 'healthcare'). Leave empty for all sectors.

## `startYear` (type: `integer`):

First year of data to retrieve.

## `endYear` (type: `integer`):

Last year of data to retrieve.

## `maxItems` (type: `integer`):

Hard cap on emitted records.

## Actor input object example

```json
{
  "mode": "employmentByOccupation",
  "startYear": 2022,
  "endYear": 2024,
  "maxItems": 50
}
```
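The input schema fields can be filled in and range-checked locally before calling the Actor. The sketch below uses the defaults and bounds listed for this schema in the OpenAPI specification in this document. The `prepare_input` helper is hypothetical, and the check that `startYear` does not exceed `endYear` is an added assumption that the schema does not state.

```python
# Defaults and bounds copied from the input schema in the OpenAPI specification.
DEFAULTS = {"mode": "employmentByOccupation", "startYear": 2022, "endYear": 2024, "maxItems": 50}
MODES = ("employmentByOccupation", "unemploymentByState", "wagesByOccupation")


def prepare_input(overrides: dict) -> dict:
    """Merge overrides into the schema defaults; raise ValueError on out-of-range values."""
    merged = {**DEFAULTS, **overrides}
    if merged["mode"] not in MODES:
        raise ValueError(f"unknown mode: {merged['mode']}")
    for key in ("startYear", "endYear"):
        if not 2010 <= merged[key] <= 2025:
            raise ValueError(f"{key} must be between 2010 and 2025")
    if not 1 <= merged["maxItems"] <= 2000:
        raise ValueError("maxItems must be between 1 and 2000")
    # Assumption: the schema does not say so, but a reversed range is rejected here.
    if merged["startYear"] > merged["endYear"]:
        raise ValueError("startYear must not be after endYear")
    return merged


print(prepare_input({"mode": "unemploymentByState", "state": "TX"}))
```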

# Actor output Schema

## `records` (type: `string`):

Dataset containing all scraped BLS employment and wage records.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "mode": "employmentByOccupation",
    "state": "",
    "occupation": "",
    "startYear": 2022,
    "endYear": 2024,
    "maxItems": 50
};

// Run the Actor and wait for it to finish
const run = await client.actor("crawlerbros/osha-inspection-violation-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "mode": "employmentByOccupation",
    "state": "",
    "occupation": "",
    "startYear": 2022,
    "endYear": 2024,
    "maxItems": 50,
}

# Run the Actor and wait for it to finish
run = client.actor("crawlerbros/osha-inspection-violation-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "mode": "employmentByOccupation",
  "state": "",
  "occupation": "",
  "startYear": 2022,
  "endYear": 2024,
  "maxItems": 50
}' |
apify call crawlerbros/osha-inspection-violation-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=crawlerbros/osha-inspection-violation-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "BLS Labor Statistics Scraper",
        "description": "Scrape US Bureau of Labor Statistics (BLS) employment, unemployment, and wage data by industry sector and state. Free government API, no authentication required.",
        "version": "2.0",
        "x-build-id": "o28mZ426QBEyU8iU0"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/crawlerbros~osha-inspection-violation-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-crawlerbros-osha-inspection-violation-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/crawlerbros~osha-inspection-violation-scraper/runs": {
            "post": {
                "operationId": "runs-sync-crawlerbros-osha-inspection-violation-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/crawlerbros~osha-inspection-violation-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-crawlerbros-osha-inspection-violation-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "employmentByOccupation",
                            "unemploymentByState",
                            "wagesByOccupation"
                        ],
                        "type": "string",
                        "description": "What BLS data to fetch.",
                        "default": "employmentByOccupation"
                    },
                    "state": {
                        "title": "State (for unemployment mode)",
                        "enum": [
                            "",
                            "AL",
                            "AK",
                            "AZ",
                            "AR",
                            "CA",
                            "CO",
                            "CT",
                            "DC",
                            "DE",
                            "FL",
                            "GA",
                            "HI",
                            "ID",
                            "IL",
                            "IN",
                            "IA",
                            "KS",
                            "KY",
                            "LA",
                            "ME",
                            "MD",
                            "MA",
                            "MI",
                            "MN",
                            "MS",
                            "MO",
                            "MT",
                            "NE",
                            "NV",
                            "NH",
                            "NJ",
                            "NM",
                            "NY",
                            "NC",
                            "ND",
                            "OH",
                            "OK",
                            "OR",
                            "PA",
                            "RI",
                            "SC",
                            "SD",
                            "TN",
                            "TX",
                            "UT",
                            "VT",
                            "VA",
                            "WA",
                            "WV",
                            "WI",
                            "WY"
                        ],
                        "type": "string",
                        "description": "Two-letter US state abbreviation. Only applies to unemploymentByState mode. Leave empty for all states."
                    },
                    "occupation": {
                        "title": "Occupation / Sector keyword",
                        "type": "string",
                        "description": "Filter series by keyword (e.g. 'manufacturing', 'retail', 'construction', 'healthcare'). Leave empty for all sectors."
                    },
                    "startYear": {
                        "title": "Start year",
                        "minimum": 2010,
                        "maximum": 2025,
                        "type": "integer",
                        "description": "First year of data to retrieve.",
                        "default": 2022
                    },
                    "endYear": {
                        "title": "End year",
                        "minimum": 2010,
                        "maximum": 2025,
                        "type": "integer",
                        "description": "Last year of data to retrieve.",
                        "default": 2024
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 2000,
                        "type": "integer",
                        "description": "Hard cap on emitted records.",
                        "default": 50
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
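For projects that call the REST API directly, the `run-sync-get-dataset-items` path in the specification above can be used without a client library. The following is a minimal sketch using only the Python standard library. It builds the request but leaves sending it commented out, since that needs a valid token and network access. The function name `build_run_request` is hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Server URL and path taken from the OpenAPI specification above.
API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "crawlerbros~osha-inspection-violation-scraper"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request("<YOUR_API_TOKEN>", {"mode": "employmentByOccupation", "maxItems": 50})
print(request.get_method(), request.full_url)

# To send the request (requires a real token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

The specification declares the token as a query parameter, so the sketch passes it that way.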
