EPA GHGRP FLIGHT Facility GHG Emissions Scraper
Pricing
Pay per event
Scrape mandatory US GHG emissions from EPA GHGRP / FLIGHT. ~8k large industrial facilities x 41 subparts x 15 reporting years. Per-facility x sector x gas x year with NAICS, parent company, lat/lon. Power plants, refineries, cement, steel, landfills, oil & gas.
Developer
BowTiedRaccoon
Maintained by Community
Extract mandatory US greenhouse gas emissions data from the EPA Greenhouse Gas Reporting Program (GHGRP), which EPA publishes through FLIGHT (Facility Level Information on Greenhouse gases Tool). This actor delivers one clean, joined record per facility x year x sector x subsector x gas combination, ready for direct analysis or integration into carbon accounting workflows.
What This Scraper Collects
The EPA GHGRP is the authoritative source for large-emitter GHG data in the United States. Under 40 CFR Part 98, approximately 8,000 industrial facilities are required to report their greenhouse gas emissions annually. This actor surfaces that data with full dimensional context:
- Facility identity: name, parent company, address, city, state, ZIP, county, county FIPS, latitude, longitude
- Industry classification: NAICS code, reported GHGRP subpart(s) (A through UU)
- Emissions breakdown: per sector, subsector, and gas type with CO2-equivalent tonnage
- Sector coverage: Power Plants, Refineries, Minerals, Chemicals, Metals, Petroleum & Natural Gas, Waste, and 9 more GHGRP sectors
- Gas types: CO2, CH4, N2O, HFCs, PFCs, SF6, NF3, and biogenic CO2
Data coverage: reporting years 2010-2024. EPA typically publishes each reporting year's data around October of the following year.
Input Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| reportingYears | array | Calendar years to include (e.g. ["2022", "2023"]) | ["2023"] |
| states | array | USPS two-letter state codes (e.g. ["CA", "TX"]). Leave empty for all states. | All states |
| sectors | array | GHGRP sector names (e.g. ["Power Plants", "Refineries"]). Leave empty for all. | All sectors |
| naicsPrefix | string | NAICS code prefix filter (e.g. "2211" for electric power). | All NAICS |
| gases | array | Gas codes (e.g. ["CO2", "CH4"]). Leave empty for all gases. | All gases |
| includeSubpartDetail | boolean | Pull PUB_FACTS_SUBP_GHG_EMISSION rows (per-Subpart detail). | false |
| maxItems | integer | Maximum records to return. Use 0 for unlimited. | 200 |
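As a minimal sketch of how these parameters fit together, the helper below assembles an input payload and checks a few obvious constraints before a run. The function name `build_actor_input` and its validation rules are illustrative assumptions, not part of the actor; only the parameter names and defaults come from the table above. Omitting `naicsPrefix` when it is empty is also an assumption about how the "All NAICS" default is expressed.

```python
import json
from typing import Any


def build_actor_input(
    reporting_years=("2023",),
    states=(),
    sectors=(),
    naics_prefix="",
    gases=(),
    include_subpart_detail=False,
    max_items=200,
) -> dict[str, Any]:
    """Assemble an input payload (hypothetical helper, not shipped with the actor)."""
    for year in reporting_years:
        text = str(year)
        if not (text.isdigit() and len(text) == 4):
            raise ValueError(f"reporting year must be a four-digit year: {year!r}")
    for state in states:
        if not (len(state) == 2 and state.isalpha() and state.isupper()):
            raise ValueError(f"state must be a two-letter USPS code: {state!r}")
    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")

    payload: dict[str, Any] = {
        "reportingYears": [str(y) for y in reporting_years],
        "states": list(states),      # empty list = all states
        "sectors": list(sectors),    # empty list = all sectors
        "gases": list(gases),        # empty list = all gases
        "includeSubpartDetail": bool(include_subpart_detail),
        "maxItems": int(max_items),
    }
    # Assumption: leaving the key out is equivalent to "All NAICS".
    if naics_prefix:
        payload["naicsPrefix"] = str(naics_prefix)
    return payload


example_input = build_actor_input(
    reporting_years=["2022", "2023"],
    states=["CA", "TX"],
    sectors=["Power Plants"],
    naics_prefix="2211",
    max_items=0,
)
print(json.dumps(example_input, indent=2))
```

The printed JSON can be pasted into the actor's input editor or passed to whichever client is used to start the run.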
Output Fields
Each record corresponds to one facility x reporting year x sector x subsector x gas combination:
{
  "facility_id": 1000001,
  "facility_name": "PSE Ferndale Generating Station",
  "parent_company": "Empeco IV, LLC (74.33%); Diamond Generating Corporation (14%)",
  "reporting_year": 2023,
  "address1": "5105 LAKE TERRELL ROAD",
  "city": "FERNDALE",
  "state": "WA",
  "zip": "98248",
  "county": "WHATCOM COUNTY",
  "county_fips": "53073",
  "lat": 48.828707,
  "lon": -122.685533,
  "naics_code": "221112",
  "naics_label": null,
  "subpart": "C",
  "sector_id": 3,
  "sector_name": "Power Plants",
  "subsector_id": 1,
  "subsector_name": "Power Plants",
  "gas_id": 1,
  "gas_name": "Carbon Dioxide (CO2)",
  "co2e_emission_t": 714523.1,
  "gwp": null,
  "emission_classification": "CU_ONLY",
  "bamm_used": null,
  "facility_url": "https://ghgdata.epa.gov/ghgp/service/facilityDetail/1000001?year=2023"
}
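The `parent_company` field in the sample packs several owners and their ownership shares into one string. The sketch below splits it into (name, share) pairs. It assumes every record follows the `Name (NN.NN%); Name (NN%)` pattern seen in this one sample, which may not hold across the whole dataset; `parse_parent_company` is a hypothetical helper name.

```python
import re

# Matches "Some Company, LLC (74.33%)": a name followed by a percentage in parentheses.
_OWNER_PATTERN = re.compile(r"(.+?)\s*\((\d+(?:\.\d+)?)%\)")


def parse_parent_company(raw):
    """Split a parent_company string into (name, share_percent) tuples.

    The share is None when a segment carries no trailing "(NN%)".
    """
    owners = []
    for part in (raw or "").split(";"):
        part = part.strip()
        if not part:
            continue
        match = _OWNER_PATTERN.fullmatch(part)
        if match:
            owners.append((match.group(1), float(match.group(2))))
        else:
            owners.append((part, None))
    return owners


sample = "Empeco IV, LLC (74.33%); Diamond Generating Corporation (14%)"
for name, share in parse_parent_company(sample):
    print(name, share)
```

Note that the shares in the sample do not add up to 100%, so downstream code should not assume they do.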
Example Use Cases
Scope-3 / Supply-chain carbon accounting: Join against supplier NAICS codes to build upstream emission factors for Scope-3 Category 1 calculations per GHG Protocol.
ESG fund screening: Filter to specific sectors (Power Plants, Refineries) to identify high-emission assets for TCFD or SFDR disclosure requirements.
SEC Climate Disclosure compliance: Pull facility-level data for facilities operated by a public company to populate physical risk and Scope-1 disclosures. The actor has no parent-company input, so filter the output on a parent_company substring.
Environmental justice research: Filter by county FIPS + sector to identify co-location of industrial emitters with disadvantaged communities.
NAICS-level emission benchmarking: Aggregate CO2e by NAICS code x year to build sector-average emission intensity baselines.
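Two of the use cases above (parent-company filtering and NAICS-level aggregation) come down to simple post-processing of the output records. A minimal sketch, using the field names from the sample record; the helper names are hypothetical and the rows below are made-up illustrative values, not real EPA figures:

```python
from collections import defaultdict


def filter_by_parent(records, needle):
    """Keep records whose parent_company contains `needle` (case-insensitive)."""
    needle = needle.casefold()
    return [
        r for r in records
        if needle in (r.get("parent_company") or "").casefold()
    ]


def total_co2e_by_naics_year(records):
    """Sum co2e_emission_t per (naics_code, reporting_year), skipping null values."""
    totals = defaultdict(float)
    for r in records:
        value = r.get("co2e_emission_t")
        if value is None:
            continue
        totals[(r.get("naics_code"), r.get("reporting_year"))] += value
    return dict(totals)


# Illustrative rows only; the numbers are invented for the example.
records = [
    {"parent_company": "Example Power Co (100%)", "naics_code": "221112",
     "reporting_year": 2023, "co2e_emission_t": 100.0},
    {"parent_company": "Example Power Co (100%)", "naics_code": "221112",
     "reporting_year": 2023, "co2e_emission_t": 50.5},
    {"parent_company": "Other Refining LLC (100%)", "naics_code": "324110",
     "reporting_year": 2023, "co2e_emission_t": 20.0},
    {"parent_company": None, "naics_code": "324110",
     "reporting_year": 2022, "co2e_emission_t": None},
]

print(filter_by_parent(records, "example power"))
print(total_co2e_by_naics_year(records))
```

Turning these totals into an emission intensity would additionally need a denominator such as output or revenue, which this dataset does not provide.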
Technical Notes
- Data source: EPA Envirofacts REST API (https://enviro.epa.gov/enviro/efservice/), public, unauthenticated, no API key required
- Tables joined: PUB_FACTS_SECTOR_GHG_EMISSION, PUB_DIM_FACILITY, PUB_DIM_SECTOR, PUB_DIM_SUBSECTOR, PUB_DIM_GHG
- No proxy required: EPA Envirofacts has no IP-based access controls
- Rate limit: 2 requests/second (500ms inter-page delay applied)
- Memory: 512 MB default (sufficient for most runs; increase to 1024 MB for full multi-year, all-state runs)
- Timeout: 2 hours default; full 15-year, all-state, all-sector run takes approximately 60-90 minutes
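To illustrate the paging and delay described in these notes, here is a rough sketch of how page URLs might be generated and spaced 500 ms apart. The URL layout (`<table>/<column>/<value>/rows/<first>:<last>/JSON`), the page size of 1000, and the `YEAR` column name are assumptions about the Envirofacts service, not details confirmed by this README, and the helper names are hypothetical. The code builds URLs only and makes no network requests.

```python
import time

BASE_URL = "https://enviro.epa.gov/enviro/efservice/"  # from the notes above
MIN_INTERVAL_S = 0.5  # 2 requests/second


def page_url(table, first_row, page_size, column=None, value=None):
    """Build one page URL; the path layout is an assumed Envirofacts convention."""
    parts = [table]
    if column is not None and value is not None:
        parts += [column, str(value)]
    last_row = first_row + page_size - 1
    parts += ["rows", f"{first_row}:{last_row}", "JSON"]
    return BASE_URL + "/".join(parts)


def iter_page_urls(table, total_rows, page_size=1000, column=None, value=None):
    """Yield consecutive page URLs covering `total_rows` rows."""
    for first_row in range(0, total_rows, page_size):
        yield page_url(table, first_row, page_size, column, value)


def throttled(items, interval_s=MIN_INTERVAL_S, sleep=time.sleep):
    """Yield items, pausing `interval_s` seconds between consecutive items."""
    for index, item in enumerate(items):
        if index:
            sleep(interval_s)
        yield item


# Three pages for a hypothetical 2,500-row result, filtered on an assumed YEAR column.
for url in iter_page_urls("PUB_DIM_FACILITY", 2500, column="YEAR", value=2023):
    print(url)
```

In a real client, each URL from `throttled(iter_page_urls(...))` would be fetched in turn, so the delay alone puts a floor of roughly half a second per page on total runtime.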
GHGRP vs Related EPA Datasets
| Dataset | What it covers |
|---|---|
| GHGRP / FLIGHT (this actor) | Large-facility GHG emissions, mandatory reporting, ~8k facilities/yr |
| EPA TRI (jungle_synthesizer/epa-tri-crawler) | Toxic chemical releases - NOT GHG |
| Climate TRACE | Satellite-modelled global emissions - NOT self-reported |
The three datasets are complementary: GHGRP gives legally mandated, self-reported figures for the largest US emitters; TRI covers toxic chemical releases rather than greenhouse gases; Climate TRACE fills in smaller sources and non-US geographies.