# EPA GHGRP FLIGHT Facility GHG Emissions Scraper (`jungle_synthesizer/epa-ghgrp-flight-facility-emissions-scraper`) Actor

Scrape mandatory US GHG emissions from EPA GHGRP / FLIGHT. ~8k large industrial facilities x 41 subparts x 15 reporting years. Per-facility x sector x gas x year with NAICS, parent company, lat/lon. Power plants, refineries, cement, steel, landfills, oil & gas.

- **URL**: https://apify.com/jungle\_synthesizer/epa-ghgrp-flight-facility-emissions-scraper.md
- **Developed by:** [BowTiedRaccoon](https://apify.com/jungle_synthesizer) (community)
- **Categories:** Business, Other
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) and as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## EPA GHGRP FLIGHT Facility GHG Emissions Scraper

Extract mandatory US greenhouse gas emissions data from the **EPA Greenhouse Gas Reporting Program (GHGRP)**, which EPA publishes through FLIGHT (Facility Level Information on GreenHouse gases Tool). This Actor delivers one clean, joined record per facility x year x sector x subsector x gas combination, ready for direct analysis or integration into carbon accounting workflows.

### What This Scraper Collects

The EPA GHGRP is the authoritative source for large-emitter GHG data in the United States. Under 40 CFR Part 98, approximately 8,000 industrial facilities are required to report their greenhouse gas emissions annually. This Actor surfaces that data with full dimensional context:

- **Facility identity**: name, parent company, address, city, state, ZIP, county, county FIPS, latitude, longitude
- **Industry classification**: NAICS code, reported GHGRP subpart(s) (lettered subparts of 40 CFR Part 98, e.g. `C`, `W`, `HH`)
- **Emissions breakdown**: per sector, subsector, and gas type with CO2-equivalent tonnage
- **Sector coverage**: Power Plants, Refineries, Minerals, Chemicals, Metals, Petroleum & Natural Gas, Waste, and 9 more GHGRP sectors
- **Gas types**: CO2, CH4, N2O, HFCs, PFCs, SF6, NF3, and biogenic CO2

**Data coverage**: 2010-2024 reporting years (EPA publishes approximately October of the following year).

### Input Parameters

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `reportingYears` | array | Calendar years to include (e.g. `["2022", "2023"]`) | `["2023"]` |
| `states` | array | USPS two-letter state codes (e.g. `["CA", "TX"]`). Leave empty for all states. | All states |
| `sectors` | array | GHGRP sector names (e.g. `["Power Plants", "Refineries"]`). Leave empty for all. | All sectors |
| `naicsPrefix` | string | NAICS code prefix filter (e.g. `"2211"` for electric power). | All NAICS |
| `gases` | array | Gas codes (e.g. `["CO2", "CH4"]`). Leave empty for all gases. | All gases |
| `includeSubpartDetail` | boolean | Pull PUB_FACTS_SUBP_GHG_EMISSION rows (per-Subpart detail). | `false` |
| `maxItems` | integer | Maximum records to return. Use `0` for unlimited. | `10` |

### Output Fields

Each record corresponds to one facility x reporting year x sector x subsector x gas combination:

```json
{
  "facility_id": 1000001,
  "facility_name": "PSE Ferndale Generating Station",
  "parent_company": "Empeco IV, LLC (74.33%); Diamond Generating Corporation (14%)",
  "reporting_year": 2023,
  "address1": "5105 LAKE TERRELL ROAD",
  "city": "FERNDALE",
  "state": "WA",
  "zip": "98248",
  "county": "WHATCOM COUNTY",
  "county_fips": "53073",
  "lat": 48.828707,
  "lon": -122.685533,
  "naics_code": "221112",
  "naics_label": null,
  "subpart": "C",
  "sector_id": 3,
  "sector_name": "Power Plants",
  "subsector_id": 1,
  "subsector_name": "Power Plants",
  "gas_id": 1,
  "gas_name": "Carbon Dioxide (CO2)",
  "co2e_emission_t": 714523.1,
  "gwp": null,
  "emission_classification": "CU_ONLY",
  "bamm_used": null,
  "facility_url": "https://ghgdata.epa.gov/ghgp/service/facilityDetail/1000001?year=2023"
}
````
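
The record grain described above can be checked on downloaded data. The following is a minimal sketch, assuming the five ID fields in the sample record jointly identify a row; that is an inference from this README, not a documented guarantee, and rows pulled with `includeSubpartDetail` may follow a different grain. The helper names `record_key` and `duplicate_keys` are hypothetical.

```python
from collections import Counter
from typing import Any, Iterable

# Assumption: these fields together identify one output record.
KEY_FIELDS = ("facility_id", "reporting_year", "sector_id", "subsector_id", "gas_id")


def record_key(record: dict[str, Any]) -> tuple:
    """Return the facility x year x sector x subsector x gas key of a record."""
    return tuple(record.get(field) for field in KEY_FIELDS)


def duplicate_keys(records: Iterable[dict[str, Any]]) -> list[tuple]:
    """Return every key that occurs more than once, in first-seen order."""
    counts = Counter(record_key(r) for r in records)
    return [key for key, n in counts.items() if n > 1]


# Subset of the sample record shown above.
sample = {
    "facility_id": 1000001,
    "reporting_year": 2023,
    "sector_id": 3,
    "subsector_id": 1,
    "gas_id": 1,
    "co2e_emission_t": 714523.1,
}

print(record_key(sample))
print(duplicate_keys([sample, dict(sample)]))
```

An empty result from `duplicate_keys` suggests the dataset can be joined or pivoted on these fields without double counting.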

### Example Use Cases

**Scope-3 / Supply-chain carbon accounting**: Join against supplier NAICS codes to build upstream emission factors for Scope-3 Category 1 calculations per GHG Protocol.

**ESG fund screening**: Filter to specific sectors (Power Plants, Refineries) to identify high-emission assets for TCFD or SFDR disclosure requirements.

**SEC Climate Disclosure compliance**: Pull facility-level data for facilities operated by a public company (filter by parent company substring) to populate physical risk and Scope-1 disclosures.

**Environmental justice research**: Filter by county FIPS + sector to identify co-location of industrial emitters with disadvantaged communities.

**NAICS-level emission benchmarking**: Aggregate CO2e by NAICS code x year to build sector-average emission intensity baselines.
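
Two of the use cases above (parent company filtering and NAICS-level aggregation) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the three records are made-up placeholders rather than real GHGRP figures, and the result is a plain CO2e total, since a true intensity baseline would also need an activity denominator that this dataset does not provide.

```python
from collections import defaultdict


def filter_by_parent(records, needle):
    """Keep records whose parent_company contains `needle` (case-insensitive)."""
    needle = needle.lower()
    return [r for r in records if needle in (r.get("parent_company") or "").lower()]


def co2e_by_naics_year(records):
    """Sum co2e_emission_t per (naics_code, reporting_year), skipping null values."""
    totals = defaultdict(float)
    for r in records:
        value = r.get("co2e_emission_t")
        if value is None:
            continue
        totals[(r.get("naics_code"), r.get("reporting_year"))] += float(value)
    return dict(totals)


# Placeholder records for illustration; not real emissions data.
records = [
    {"parent_company": "Example Power LLC (100%)", "naics_code": "221112",
     "reporting_year": 2023, "co2e_emission_t": 100.0},
    {"parent_company": "Other Holdings Inc (100%)", "naics_code": "221112",
     "reporting_year": 2023, "co2e_emission_t": 50.5},
    {"parent_company": None, "naics_code": "324110",
     "reporting_year": 2023, "co2e_emission_t": None},
]

print(co2e_by_naics_year(records))
print(len(filter_by_parent(records, "example power")))
```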

### Technical Notes

- **Data source**: EPA Envirofacts REST API (`https://enviro.epa.gov/enviro/efservice/`) - public, unauthenticated, no API key required
- **Tables joined**: `PUB_FACTS_SECTOR_GHG_EMISSION`, `PUB_DIM_FACILITY`, `PUB_DIM_SECTOR`, `PUB_DIM_SUBSECTOR`, `PUB_DIM_GHG`
- **No proxy required**: EPA Envirofacts has no IP-based access controls
- **Rate limit**: 2 requests/second (500ms inter-page delay applied)
- **Memory**: 512 MB default (sufficient for most runs; increase to 1024 MB for full multi-year, all-state runs)
- **Timeout**: 2 hours default; full 15-year, all-state, all-sector run takes approximately 60-90 minutes
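
The 500 ms inter-page delay can be pictured as a throttled pagination loop. The sketch below is not the Actor's actual code: `fetch_page` is a hypothetical callable that returns the rows for an inclusive row range, and the page size of 1000 is an assumption. The `sleep` parameter is injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Iterator


def iter_pages(
    fetch_page: Callable[[int, int], list],
    page_size: int = 1000,   # assumed page size, not taken from the Actor
    delay_s: float = 0.5,    # 500 ms between pages, i.e. at most 2 requests/second
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[list]:
    """Yield pages until a short or empty page arrives, pausing between requests."""
    first = 0
    while True:
        rows = fetch_page(first, first + page_size - 1)
        if rows:
            yield rows
        if len(rows) < page_size:
            return  # last page reached
        sleep(delay_s)
        first += page_size


# Usage with an in-memory stand-in for the remote table.
data = list(range(25))
pauses = []
pages = list(
    iter_pages(lambda first, last: data[first:last + 1], page_size=10, sleep=pauses.append)
)
print(len(pages), pauses)
```

Because the delay is applied only between pages, a run with N pages spends at least (N - 1) x 0.5 seconds waiting, which is one reason full multi-year runs take as long as noted above.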

### GHGRP vs Related EPA Datasets

| Dataset | What it covers |
|---------|----------------|
| **GHGRP / FLIGHT** (this Actor) | Large-facility GHG emissions, mandatory reporting, ~8k facilities/yr |
| EPA TRI (`jungle_synthesizer/epa-tri-crawler`) | Toxic chemical releases - NOT GHG |
| Climate TRACE | Satellite-modelled global emissions - NOT self-reported |

The three datasets are complementary: GHGRP gives legally-binding reported figures for the largest US emitters; TRI covers different chemical families; Climate TRACE fills in smaller sources and non-US geographies.

# Actor input Schema

## `sp_intended_usage` (type: `string`):

Please describe how you plan to use the data extracted by this crawler.

## `sp_improvement_suggestions` (type: `string`):

Provide any feedback or suggestions for improvements.

## `sp_contact` (type: `string`):

Provide your email address so we can get in touch with you.

## `reportingYears` (type: `array`):

Years to include (e.g. \["2022", "2023"]). EPA publishes approximately October of year N+1.

## `states` (type: `array`):

Two-letter USPS state codes (e.g. \["CA", "TX"]). Leave empty for all states.

## `sectors` (type: `array`):

Sector names from PUB\_DIM\_SECTOR (Power Plants, Refineries, Minerals, Chemicals, etc.). Leave empty for all sectors.

## `naicsPrefix` (type: `string`):

Filter by NAICS code prefix (e.g. 2211 for electric power generation). Leave empty for all NAICS codes.

## `gases` (type: `array`):

Gas filter (e.g. \["CO2", "CH4"]). Leave empty for all gases.

## `includeSubpartDetail` (type: `boolean`):

Pull PUB\_FACTS\_SUBP\_GHG\_EMISSION rows (per-Subpart breakdown with units and activity factors).

## `maxItems` (type: `integer`):

Maximum number of emission records to return. Use 0 for unlimited.

## Actor input object example

```json
{
  "sp_intended_usage": "Describe your intended use...",
  "sp_improvement_suggestions": "Share your suggestions here...",
  "sp_contact": "Share your email here...",
  "reportingYears": [
    "2023"
  ],
  "states": [],
  "sectors": [],
  "gases": [],
  "maxItems": 10
}
```
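
An input object like the one above can also be assembled and sanity-checked in code before a run is started. The following is a minimal sketch with a hypothetical helper, `build_input`. The year bounds come from the data coverage stated in the README, the two required text fields come from the OpenAPI specification further down, and the remaining checks (two-letter states, numeric NAICS prefix) are assumptions about what the Actor accepts.

```python
# Documented coverage per the README; adjust if EPA has published newer years.
FIRST_YEAR, LAST_YEAR = 2010, 2024


def build_input(usage, suggestions, *, years=("2023",), states=(), sectors=(),
                gases=(), naics_prefix="", include_subpart_detail=False, max_items=10):
    """Return an Actor input dict, raising ValueError on obviously invalid values."""
    if not usage or not suggestions:
        raise ValueError("sp_intended_usage and sp_improvement_suggestions must be non-empty")
    years = [str(y) for y in years]
    for y in years:
        if not (y.isdigit() and FIRST_YEAR <= int(y) <= LAST_YEAR):
            raise ValueError(f"reporting year outside documented coverage: {y}")
    states = [s.upper() for s in states]
    for s in states:
        if not (len(s) == 2 and s.isalpha()):
            raise ValueError(f"not a two-letter state code: {s}")
    if naics_prefix and not naics_prefix.isdigit():
        raise ValueError(f"NAICS prefix must be numeric: {naics_prefix}")
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or positive")
    return {
        "sp_intended_usage": usage,
        "sp_improvement_suggestions": suggestions,
        "reportingYears": years,
        "states": states,
        "sectors": list(sectors),
        "naicsPrefix": naics_prefix,
        "gases": list(gases),
        "includeSubpartDetail": include_subpart_detail,
        "maxItems": max_items,
    }


actor_input = build_input("Research", "None so far", years=[2022, "2023"], states=["ca", "TX"])
print(actor_input["reportingYears"], actor_input["states"])
```

The returned dict can be passed as `run_input` to the Python client or serialized to JSON for the CLI and REST examples in the API section.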

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "sp_intended_usage": "Describe your intended use...",
    "sp_improvement_suggestions": "Share your suggestions here...",
    "sp_contact": "Share your email here...",
    "reportingYears": [
        "2023"
    ],
    "states": [],
    "sectors": [],
    "naicsPrefix": "",
    "gases": [],
    "includeSubpartDetail": false,
    "maxItems": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("jungle_synthesizer/epa-ghgrp-flight-facility-emissions-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "sp_intended_usage": "Describe your intended use...",
    "sp_improvement_suggestions": "Share your suggestions here...",
    "sp_contact": "Share your email here...",
    "reportingYears": ["2023"],
    "states": [],
    "sectors": [],
    "naicsPrefix": "",
    "gases": [],
    "includeSubpartDetail": False,
    "maxItems": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("jungle_synthesizer/epa-ghgrp-flight-facility-emissions-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "sp_intended_usage": "Describe your intended use...",
  "sp_improvement_suggestions": "Share your suggestions here...",
  "sp_contact": "Share your email here...",
  "reportingYears": [
    "2023"
  ],
  "states": [],
  "sectors": [],
  "naicsPrefix": "",
  "gases": [],
  "includeSubpartDetail": false,
  "maxItems": 10
}' |
apify call jungle_synthesizer/epa-ghgrp-flight-facility-emissions-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=jungle_synthesizer/epa-ghgrp-flight-facility-emissions-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "EPA GHGRP FLIGHT Facility GHG Emissions Scraper",
        "description": "Scrape mandatory US GHG emissions from EPA GHGRP / FLIGHT. ~8k large industrial facilities x 41 subparts x 15 reporting years. Per-facility x sector x gas x year with NAICS, parent company, lat/lon. Power plants, refineries, cement, steel, landfills, oil & gas.",
        "version": "0.1",
        "x-build-id": "Fvhnd5iuA0OVvIoFq"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/jungle_synthesizer~epa-ghgrp-flight-facility-emissions-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-jungle_synthesizer-epa-ghgrp-flight-facility-emissions-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/jungle_synthesizer~epa-ghgrp-flight-facility-emissions-scraper/runs": {
            "post": {
                "operationId": "runs-sync-jungle_synthesizer-epa-ghgrp-flight-facility-emissions-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/jungle_synthesizer~epa-ghgrp-flight-facility-emissions-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-jungle_synthesizer-epa-ghgrp-flight-facility-emissions-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "sp_intended_usage",
                    "sp_improvement_suggestions"
                ],
                "properties": {
                    "sp_intended_usage": {
                        "title": "What is the intended usage of this data?",
                        "minLength": 1,
                        "type": "string",
                        "description": "Please describe how you plan to use the data extracted by this crawler."
                    },
                    "sp_improvement_suggestions": {
                        "title": "How can we improve this crawler for you?",
                        "minLength": 1,
                        "type": "string",
                        "description": "Provide any feedback or suggestions for improvements."
                    },
                    "sp_contact": {
                        "title": "Contact Email",
                        "minLength": 1,
                        "type": "string",
                        "description": "Provide your email address so we can get in touch with you."
                    },
                    "reportingYears": {
                        "title": "Reporting Years",
                        "type": "array",
                        "description": "Years to include (e.g. [\"2022\", \"2023\"]). EPA publishes approximately October of year N+1.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "states": {
                        "title": "State Filter (USPS)",
                        "type": "array",
                        "description": "Two-letter USPS state codes (e.g. [\"CA\", \"TX\"]). Leave empty for all states.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "sectors": {
                        "title": "GHGRP Sector Names",
                        "type": "array",
                        "description": "Sector names from PUB_DIM_SECTOR (Power Plants, Refineries, Minerals, Chemicals, etc.). Leave empty for all sectors.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "naicsPrefix": {
                        "title": "NAICS Prefix",
                        "type": "string",
                        "description": "Filter by NAICS code prefix (e.g. 2211 for electric power generation). Leave empty for all NAICS codes."
                    },
                    "gases": {
                        "title": "Gases",
                        "type": "array",
                        "description": "Gas filter (e.g. [\"CO2\", \"CH4\"]). Leave empty for all gases.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "includeSubpartDetail": {
                        "title": "Include Subpart Detail",
                        "type": "boolean",
                        "description": "Pull PUB_FACTS_SUBP_GHG_EMISSION rows (per-Subpart breakdown with units and activity factors)."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "type": "integer",
                        "description": "Maximum number of emission records to return. Use 0 for unlimited.",
                        "default": 10
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
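
For projects without an Apify client library, the `run-sync-get-dataset-items` path in the specification above can be called directly. The sketch below uses only the Python standard library. The function names are hypothetical, and the response is assumed to be a JSON array of dataset items, which the specification itself does not spell out.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/jungle_synthesizer~epa-ghgrp-flight-facility-emissions-scraper"


def build_sync_request(token, actor_input):
    """Build the POST request; the token is sent as a query parameter, as in the spec."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{BASE_URL}{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": "application/json"}
    )


def run_sync(token, actor_input, timeout_s=300):
    """Send the request and decode the response (assumed to be JSON)."""
    request = build_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; run_sync() does.
request = build_sync_request("<YOUR_API_TOKEN>", {
    "sp_intended_usage": "Research",
    "sp_improvement_suggestions": "None so far",
    "reportingYears": ["2023"],
    "maxItems": 10,
})
print(request.get_method(), request.full_url.split("?")[0])
```

A synchronous call keeps the HTTP connection open for the whole run, so for long multi-year pulls the asynchronous `/runs` path followed by a dataset download is likely the safer choice.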
