EPA GHGRP FLIGHT Facility GHG Emissions Scraper

Pricing: Pay per event

Scrape mandatory US GHG emissions reports from EPA GHGRP / FLIGHT: roughly 8,000 large industrial facilities, 41 subparts, and 15 reporting years. Records are broken down by facility, sector, gas, and year, with NAICS code, parent company, and latitude/longitude. Covers power plants, refineries, cement, steel, landfills, and oil & gas.

Developer: BowTiedRaccoon

Maintained by Community

Extract mandatory US greenhouse gas emissions data from the EPA Greenhouse Gas Reporting Program (GHGRP), which EPA publishes through FLIGHT (the Facility Level Information on GreenHouse gases Tool). This actor delivers one clean, joined record per facility x year x sector x subsector x gas combination, ready for direct analysis or integration into carbon accounting workflows.

What This Scraper Collects

The EPA GHGRP is the authoritative source for large-emitter GHG data in the United States. Under 40 CFR Part 98, approximately 8,000 industrial facilities are required to report their greenhouse gas emissions annually. This actor surfaces that data with full dimensional context:

  • Facility identity: name, parent company, address, city, state, ZIP, county, county FIPS, latitude, longitude
  • Industry classification: NAICS code, reported GHGRP subpart letter(s) under 40 CFR Part 98 (e.g. "C")
  • Emissions breakdown: per sector, subsector, and gas type with CO2-equivalent tonnage
  • Sector coverage: Power Plants, Refineries, Minerals, Chemicals, Metals, Petroleum & Natural Gas, Waste, and 9 more GHGRP sectors
  • Gas types: CO2, CH4, N2O, HFCs, PFCs, SF6, NF3, and biogenic CO2

Data coverage: reporting years 2010-2024 (EPA typically publishes each reporting year's data around October of the following year).

Input Parameters

| Parameter | Type | Description | Default |
| --- | --- | --- | --- |
| reportingYears | array | Calendar years to include (e.g. ["2022", "2023"]) | ["2023"] |
| states | array | USPS two-letter state codes (e.g. ["CA", "TX"]). Leave empty for all states. | All states |
| sectors | array | GHGRP sector names (e.g. ["Power Plants", "Refineries"]). Leave empty for all. | All sectors |
| naicsPrefix | string | NAICS code prefix filter (e.g. "2211" for electric power). | All NAICS |
| gases | array | Gas codes (e.g. ["CO2", "CH4"]). Leave empty for all gases. | All gases |
| includeSubpartDetail | boolean | Pull PUB_FACTS_SUBP_GHG_EMISSION rows (per-Subpart detail). | false |
| maxItems | integer | Maximum records to return. Use 0 for unlimited. | 200 |
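
As an illustration of the parameters above, a run input can be assembled as a plain dictionary. This is a minimal sketch, not the actor's own code: `build_run_input` is a hypothetical helper, and its validation rules (the 2010-2024 year range, two-letter state codes) are assumptions taken from this page rather than from the actor's actual input schema.

```python
from __future__ import annotations

from typing import Any

# Assumption: coverage years as stated on this page (2010-2024).
FIRST_YEAR, LAST_YEAR = 2010, 2024


def build_run_input(
    reporting_years: list[str],
    states: list[str] | None = None,
    sectors: list[str] | None = None,
    naics_prefix: str | None = None,
    gases: list[str] | None = None,
    include_subpart_detail: bool = False,
    max_items: int = 200,
) -> dict[str, Any]:
    """Hypothetical helper: build an input dict keyed by the documented parameter names."""
    for year in reporting_years:
        if not (year.isdigit() and FIRST_YEAR <= int(year) <= LAST_YEAR):
            raise ValueError(f"reporting year out of range: {year!r}")

    # Normalise state codes to upper case and check their shape only.
    state_codes = [s.upper() for s in (states or [])]
    for code in state_codes:
        if len(code) != 2 or not code.isalpha():
            raise ValueError(f"not a two-letter state code: {code!r}")

    if max_items < 0:
        raise ValueError("maxItems must be 0 (unlimited) or a positive integer")

    run_input: dict[str, Any] = {
        "reportingYears": list(reporting_years),
        "states": state_codes,          # empty list = all states
        "sectors": list(sectors or []),  # empty list = all sectors
        "gases": list(gases or []),      # empty list = all gases
        "includeSubpartDetail": include_subpart_detail,
        "maxItems": max_items,
    }
    if naics_prefix:
        run_input["naicsPrefix"] = naics_prefix
    return run_input


# Example: California and Texas electric power records for two years.
example_input = build_run_input(
    ["2022", "2023"], states=["ca", "TX"], naics_prefix="2211"
)
```

The resulting dictionary could then be passed as the run input through whichever Apify client or console form is used to start the actor.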

Output Fields

Each record corresponds to one facility x reporting year x sector x subsector x gas combination:

{
  "facility_id": 1000001,
  "facility_name": "PSE Ferndale Generating Station",
  "parent_company": "Empeco IV, LLC (74.33%); Diamond Generating Corporation (14%)",
  "reporting_year": 2023,
  "address1": "5105 LAKE TERRELL ROAD",
  "city": "FERNDALE",
  "state": "WA",
  "zip": "98248",
  "county": "WHATCOM COUNTY",
  "county_fips": "53073",
  "lat": 48.828707,
  "lon": -122.685533,
  "naics_code": "221112",
  "naics_label": null,
  "subpart": "C",
  "sector_id": 3,
  "sector_name": "Power Plants",
  "subsector_id": 1,
  "subsector_name": "Power Plants",
  "gas_id": 1,
  "gas_name": "Carbon Dioxide (CO2)",
  "co2e_emission_t": 714523.1,
  "gwp": null,
  "emission_classification": "CU_ONLY",
  "bamm_used": null,
  "facility_url": "https://ghgdata.epa.gov/ghgp/service/facilityDetail/1000001?year=2023"
}
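
Because each record is described as one facility x reporting year x sector x subsector x gas combination, a quick sanity check on a downloaded dataset is to confirm that this combination is unique. The sketch below assumes the five ID fields shown above identify that grain; `duplicate_grain_keys` is a hypothetical helper, and runs with includeSubpartDetail enabled may need `subpart` added to the key.

```python
from collections import Counter

# Assumption: these five fields together identify one output record.
GRAIN = ("facility_id", "reporting_year", "sector_id", "subsector_id", "gas_id")


def duplicate_grain_keys(records):
    """Return every grain key that appears on more than one record."""
    counts = Counter(tuple(rec[field] for field in GRAIN) for rec in records)
    return [key for key, n in counts.items() if n > 1]


# Tiny illustrative dataset: the first two rows share the same key.
sample = [
    {"facility_id": 1000001, "reporting_year": 2023, "sector_id": 3, "subsector_id": 1, "gas_id": 1},
    {"facility_id": 1000001, "reporting_year": 2023, "sector_id": 3, "subsector_id": 1, "gas_id": 1},
    {"facility_id": 1000001, "reporting_year": 2023, "sector_id": 3, "subsector_id": 1, "gas_id": 2},
]
duplicates = duplicate_grain_keys(sample)
```

An empty result suggests the dataset can be summed or joined at this grain without double counting.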

Example Use Cases

Scope-3 / Supply-chain carbon accounting: Join against supplier NAICS codes to build upstream emission factors for Scope-3 Category 1 calculations per GHG Protocol.

ESG fund screening: Filter to specific sectors (Power Plants, Refineries) to identify high-emission assets for TCFD or SFDR disclosure requirements.

SEC Climate Disclosure compliance: Pull facility-level data for facilities operated by a public company (filter the output by parent_company substring) to populate physical risk and Scope-1 disclosures.
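
For parent-company filtering, the `parent_company` field in the sample record packs several owners and ownership percentages into one string. The following sketch splits that string and matches an owner by substring. The `Name (NN%); Name (NN%)` layout is inferred from the single sample record above, so treat it as an assumption; `parse_parent_company` and `has_parent` are hypothetical helpers.

```python
import re

# Assumed layout of one owner entry: "<name> (<percent>%)".
_OWNER = re.compile(r"^\s*(?P<name>.+?)\s*\((?P<pct>\d+(?:\.\d+)?)%\)\s*$")


def parse_parent_company(value):
    """Split a parent_company string into (name, percent or None) tuples."""
    owners = []
    for part in (value or "").split(";"):
        if not part.strip():
            continue
        match = _OWNER.match(part)
        if match:
            owners.append((match.group("name"), float(match.group("pct"))))
        else:
            # No percentage found: keep the raw name rather than guessing.
            owners.append((part.strip(), None))
    return owners


def has_parent(record, needle):
    """Case-insensitive substring match against any parsed owner name."""
    needle = needle.lower()
    return any(
        needle in name.lower()
        for name, _ in parse_parent_company(record.get("parent_company"))
    )


record = {"parent_company": "Empeco IV, LLC (74.33%); Diamond Generating Corporation (14%)"}
owners = parse_parent_company(record["parent_company"])
```

Keeping the percentage alongside the name makes it possible to apportion a facility's emissions by ownership share, if the reporting approach in use calls for that.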

Environmental justice research: Filter by county FIPS + sector to identify co-location of industrial emitters with disadvantaged communities.

NAICS-level emission benchmarking: Aggregate CO2e by NAICS code x year to build sector-average emission intensity baselines.
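
The NAICS-level aggregation can be sketched with the standard library alone. This example totals `co2e_emission_t` per NAICS prefix and reporting year and reports a per-facility mean. It assumes all rows are additive in CO2e (biogenic CO2 may warrant separate treatment), and `co2e_by_naics_year` is a hypothetical helper. A true emission intensity would additionally need an activity denominator, such as output or revenue, which this dataset does not provide.

```python
from collections import defaultdict


def co2e_by_naics_year(records, digits=6):
    """Total CO2e and distinct facility count per (NAICS prefix, year)."""
    totals = defaultdict(float)
    facilities = defaultdict(set)
    for rec in records:
        naics = rec.get("naics_code")
        value = rec.get("co2e_emission_t")
        if not naics or value is None:
            continue  # skip rows that cannot be classified or have no value
        key = (str(naics)[:digits], rec["reporting_year"])
        totals[key] += float(value)
        facilities[key].add(rec["facility_id"])
    return {
        key: {
            "co2e_t": total,
            "facilities": len(facilities[key]),
            "mean_t_per_facility": total / len(facilities[key]),
        }
        for key, total in totals.items()
    }


# Illustrative rows only; the values are made up for the example.
rows = [
    {"facility_id": 1, "reporting_year": 2023, "naics_code": "221112", "co2e_emission_t": 100.0},
    {"facility_id": 1, "reporting_year": 2023, "naics_code": "221112", "co2e_emission_t": 50.0},
    {"facility_id": 2, "reporting_year": 2023, "naics_code": "221113", "co2e_emission_t": 25.0},
]
by_four_digit = co2e_by_naics_year(rows, digits=4)
```

Passing a shorter `digits` value rolls facilities up to a broader NAICS level, which helps when a six-digit code has too few reporters to form a meaningful baseline.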

Technical Notes

  • Data source: EPA Envirofacts REST API (https://enviro.epa.gov/enviro/efservice/) - public, unauthenticated, no API key required
  • Tables joined: PUB_FACTS_SECTOR_GHG_EMISSION, PUB_DIM_FACILITY, PUB_DIM_SECTOR, PUB_DIM_SUBSECTOR, PUB_DIM_GHG
  • No proxy required: EPA Envirofacts has no IP-based access controls
  • Rate limit: 2 requests/second (500ms inter-page delay applied)
  • Memory: 512 MB default (sufficient for most runs; increase to 1024 MB for full multi-year, all-state runs)
  • Timeout: 2 hours default; full 15-year, all-state, all-sector run takes approximately 60-90 minutes
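
The paging and pacing described in the notes above can be sketched as follows. This is not the actor's implementation: the `table/column/value/rows/first:last/JSON` path layout is an assumption about the Envirofacts URL scheme, the `YEAR` column name is hypothetical, and `page_url` and `paged_urls` are illustrative helpers. The example only builds URLs and performs no network requests.

```python
import time
from urllib.parse import quote

# Base URL as given in the Technical Notes.
BASE_URL = "https://enviro.epa.gov/enviro/efservice"


def page_url(table, first, last, filters=None, fmt="JSON"):
    """Build one page URL (assumed path layout, see lead-in)."""
    parts = [BASE_URL, table]
    for column, value in (filters or {}).items():
        parts += [column, quote(str(value), safe="")]
    parts += ["rows", f"{first}:{last}", fmt]
    return "/".join(parts)


def paged_urls(table, page_size=1000, max_pages=3, delay_s=0.5,
               filters=None, sleep=time.sleep):
    """Yield page URLs, pausing delay_s between pages (about 2 requests/second)."""
    for page in range(max_pages):
        if page:
            sleep(delay_s)  # 500 ms inter-page delay, skipped before the first page
        first = page * page_size
        yield page_url(table, first, first + page_size - 1, filters)


# "YEAR" is a hypothetical column name used only for illustration.
first_page = page_url("PUB_DIM_FACILITY", 0, 99, {"YEAR": 2023})
```

Injecting the `sleep` function keeps the pacing logic testable without real delays, and the fixed delay is the simplest way to stay at or under the stated 2 requests/second.
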

| Dataset | What it covers |
| --- | --- |
| GHGRP / FLIGHT (this actor) | Large-facility GHG emissions, mandatory reporting, ~8k facilities/yr |
| EPA TRI (jungle_synthesizer/epa-tri-crawler) | Toxic chemical releases - NOT GHG |
| Climate TRACE | Satellite-modelled global emissions - NOT self-reported |

The three datasets are complementary: GHGRP gives legally-binding reported figures for the largest US emitters; TRI covers different chemical families; Climate TRACE fills in smaller sources and non-US geographies.