EPA Toxic Release Inventory (TRI) Crawler

Developer: BowTiedRaccoon

Crawl toxic chemical release data from the EPA Toxics Release Inventory via the Envirofacts REST API. Extract facility details, chemical names, release quantities by environmental medium (air, water, land), off-site transfers, geographic coordinates, and carcinogen flags. Filter by state, chemical, reporting year, and facility name.

What does the EPA TRI Crawler do?

The EPA TRI Crawler queries the EPA Envirofacts Data Service API to extract facility-level toxic chemical release data from the Toxics Release Inventory (TRI) program. The TRI tracks releases of over 860 chemicals from ~25,000 reporting facilities across the United States, with data going back to 1987. The crawler joins data from four EPA tables -- TRI_REPORTING_FORM, TRI_FACILITY, TRI_RELEASE_QTY, and TRI_CHEM_INFO -- to produce a single flat record per facility-chemical-year combination with release quantities broken down by air emissions, water discharges, land disposal, underground injection, and off-site transfers.
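As a rough illustration of the join described above, the sketch below merges one reporting-form row with its facility, chemical, and release rows into a flat record. The key names (`doc_ctrl_num`, `tri_facility_id`, `tri_chem_id`), the row layouts, and the sample values are assumptions made for this example; the crawler's actual join logic may differ.

```python
from collections import defaultdict


def build_flat_record(form, facilities, chemicals, releases_by_form):
    """Merge one reporting-form row with its facility, chemical, and release rows."""
    facility = facilities.get(form["tri_facility_id"], {})
    chemical = chemicals.get(form["tri_chem_id"], {})
    record = {
        "trifid": form["tri_facility_id"],
        "facility_name": facility.get("facility_name"),
        "state": facility.get("state_abbr"),
        "chemical_name": chemical.get("chem_name"),
        "carcinogen": chemical.get("carcinogen", False),
        "reporting_year": form["reporting_year"],
    }
    # Assumed layout: one release row per medium, summed per reporting form.
    quantities = defaultdict(float)
    for row in releases_by_form.get(form["doc_ctrl_num"], []):
        quantities[row["medium"]] += row["quantity"]
    record.update(quantities)
    return record


# Hypothetical sample rows standing in for the four EPA tables.
form = {"doc_ctrl_num": "D1", "tri_facility_id": "F1",
        "tri_chem_id": "C1", "reporting_year": 2023}
facilities = {"F1": {"facility_name": "EXAMPLE PLANT", "state_abbr": "AZ"}}
chemicals = {"C1": {"chem_name": "LEAD", "carcinogen": False}}
releases_by_form = {"D1": [{"medium": "stack_air", "quantity": 350.5},
                           {"medium": "land", "quantity": 780.0}]}

record = build_flat_record(form, facilities, chemicals, releases_by_form)
```

Indexing the facility, chemical, and release rows by key before the merge keeps each lookup constant-time, which matters when many reporting forms share a facility or chemical.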

EPA TRI Crawler Features

  • Extracts data from roughly 25,000 reporting facilities across all US states and territories
  • Joins four EPA tables into a single denormalized record per facility-chemical-year
  • Breaks down release quantities by medium -- fugitive air, stack air, surface water, underground injection, land disposal, and off-site transfers
  • Filters by state -- all 50 states plus DC, Puerto Rico, and US Virgin Islands. Filtering happens in memory because the EPA API silently ignores state filters on the reporting form table
  • Filters by chemical name -- partial matching (e.g., "lead", "benzene", "mercury")
  • Filters by reporting year -- any year from 1987 to present
  • Filters by facility name -- partial matching to find specific industrial sites
  • Filters for carcinogens only -- restricts results to releases of known carcinogens
  • Includes chemical metadata -- CAS registry number, carcinogen flag, Clean Air Act classification, and chemical category (Metal, Dioxin, PBT, PFAS)
  • Converts coordinates -- DMS (degrees-minutes-seconds) to decimal degrees for mapping
  • No proxy required -- EPA Envirofacts is a public government API with no authentication
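The coordinate conversion listed above is simple arithmetic: decimal degrees = degrees + minutes/60 + seconds/3600, negated for western or southern hemispheres. The sketch below shows it in Python. The packed `DDMMSS` / `DDDMMSS` digit-string layout handled by `packed_dms_to_decimal` is an assumption about how the source values are stored, and both function names are hypothetical.

```python
def dms_to_decimal(degrees: int, minutes: int, seconds: float,
                   negative: bool = False) -> float:
    """Convert degrees-minutes-seconds to decimal degrees (6 decimal places)."""
    value = abs(degrees) + minutes / 60 + seconds / 3600
    return round(-value if negative else value, 6)


def packed_dms_to_decimal(packed: str, negative: bool = False) -> float:
    """Parse a packed digit string: last 2 digits seconds, previous 2 minutes.

    The packed layout is an assumption for illustration.
    """
    digits = packed.strip()
    degrees = int(digits[:-4])
    minutes = int(digits[-4:-2])
    seconds = int(digits[-2:])
    return dms_to_decimal(degrees, minutes, seconds, negative)


latitude = packed_dms_to_decimal("332804")                   # 33 deg 28' 04" N
longitude = packed_dms_to_decimal("1120711", negative=True)  # 112 deg 07' 11" W
```

Longitudes in the continental United States are west of the prime meridian, so they come out negative in decimal form, as in the sample output's `longitude` field.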

EPA TRI Crawler Output Fields

| Field | Type | Description |
|---|---|---|
| trifid | string | TRI facility ID |
| facility_name | string | Facility name |
| street_address | string | Street address |
| city | string | City |
| county | string | County |
| state | string | State abbreviation |
| zip_code | string | ZIP code |
| latitude | number | Latitude coordinate (decimal degrees) |
| longitude | number | Longitude coordinate (decimal degrees) |
| parent_company | string | Parent company name |
| chemical_name | string | Chemical name |
| cas_number | string | CAS (Chemical Abstracts Service) registry number |
| carcinogen | boolean | Whether the chemical is a known carcinogen |
| clean_air_act_chemical | boolean | Whether the chemical is regulated under the Clean Air Act |
| classification | string | Chemical classification (Dioxin, Metal, PBT, PFAS, etc.) |
| unit_of_measure | string | Unit of measurement for release quantities (Pounds or Grams) |
| total_releases | number | Total releases including on-site and off-site (in reported units) |
| fugitive_air | number | Fugitive air emissions |
| stack_air | number | Stack/point air emissions |
| water | number | Surface water discharges |
| underground | number | Underground injection |
| land | number | On-site land disposal (landfills, surface impoundments, land treatment) |
| off_site | number | Off-site transfers for disposal/release |
| reporting_year | number | Reporting year |
| federal_facility | boolean | Whether the facility is federally owned |

Who Uses EPA TRI Data?

  • Environmental researchers: Analyze pollution trends by chemical, facility, and region to study environmental health impacts and regulatory effectiveness
  • ESG analysts: Evaluate corporate environmental performance by tracking toxic releases from specific companies and their subsidiaries
  • Community organizers: Identify major pollution sources near residential areas and track whether releases are increasing or decreasing over time
  • Journalists: Investigate industrial pollution patterns, compare facility-level release data, and hold polluters accountable with public records
  • Government agencies: Monitor compliance with environmental regulations and identify facilities that may need additional oversight
  • Public health researchers: Correlate chemical release data with health outcomes using geographic coordinates and carcinogen flags

How to Use the EPA TRI Crawler

Input Parameters

| Parameter | Required | Default | Description |
|---|---|---|---|
| state | No | All | US state abbreviation (e.g., CA, TX, OH) |
| chemical | No | All | Chemical name, partial matching (e.g., "lead", "benzene") |
| reportingYear | No | (none) | Reporting year (e.g., 2023) |
| facilityName | No | All | Facility name, partial matching |
| carcinogenOnly | No | false | If true, only return releases of known carcinogens |
| maxItems | No | 100 | Maximum release records to return. Set to 0 for unlimited (requires at least one filter) |

Note: When maxItems is set to 0 (unlimited), at least one search filter is required to prevent accidentally crawling the entire TRI database (~4M+ records).
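The rule in the note above could be checked before a run with something like the following sketch. The function name `validate_input` is hypothetical, and treating `carcinogenOnly` as a qualifying filter is an assumption; the crawler may count only state, chemical, year, and facility name.

```python
FILTER_KEYS = ("state", "chemical", "reportingYear", "facilityName")


def validate_input(actor_input: dict) -> dict:
    """Apply the documented maxItems default and reject unfiltered unlimited runs."""
    max_items = actor_input.get("maxItems", 100)  # documented default
    if not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer")

    # Assumption: carcinogenOnly=true also counts as a search filter.
    has_filter = any(actor_input.get(key) for key in FILTER_KEYS) or bool(
        actor_input.get("carcinogenOnly")
    )
    if max_items == 0 and not has_filter:
        raise ValueError("maxItems=0 (unlimited) requires at least one filter")

    return {**actor_input, "maxItems": max_items}


checked = validate_input({"state": "AZ", "maxItems": 0})
```

Failing fast on an unfiltered unlimited run avoids starting a crawl over the full database by mistake.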

Example Configurations

Get mercury releases in Arizona for 2023:

{
  "state": "AZ",
  "chemical": "mercury",
  "reportingYear": 2023,
  "maxItems": 100
}

Get carcinogen releases in Ohio for 2023 (up to 100 records):

{
  "state": "OH",
  "carcinogenOnly": true,
  "reportingYear": 2023,
  "maxItems": 100
}

Search for a specific facility:

{
  "facilityName": "Freeport",
  "reportingYear": 2023,
  "maxItems": 50
}
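To make the filter semantics in these configurations concrete, here is a minimal sketch of how an input like the ones above might be matched against one output record. The function name `matches` is hypothetical, and case-insensitive substring matching is an assumption; the documentation only states that chemical and facility name use partial matching.

```python
def matches(record: dict, filters: dict) -> bool:
    """Return True if an output record satisfies every filter that is set."""
    state = filters.get("state")
    if state and record.get("state", "").upper() != state.upper():
        return False

    # Assumption: partial matching means case-insensitive substring search.
    chemical = filters.get("chemical")
    if chemical and chemical.lower() not in record.get("chemical_name", "").lower():
        return False

    facility = filters.get("facilityName")
    if facility and facility.lower() not in record.get("facility_name", "").lower():
        return False

    year = filters.get("reportingYear")
    if year is not None and record.get("reporting_year") != year:
        return False

    if filters.get("carcinogenOnly") and not record.get("carcinogen"):
        return False
    return True


record = {"state": "AZ", "chemical_name": "LEAD COMPOUNDS",
          "facility_name": "PHOENIX METALS CO", "reporting_year": 2023,
          "carcinogen": False}
is_match = matches(record, {"state": "az", "chemical": "lead", "reportingYear": 2023})
```

Filters left unset are skipped, which mirrors the "All" defaults in the input parameters.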

Sample Output

{
  "trifid": "85003PHNXM2827N",
  "facility_name": "PHOENIX METALS CO",
  "street_address": "2827 N 29TH AVE",
  "city": "PHOENIX",
  "county": "MARICOPA",
  "state": "AZ",
  "zip_code": "85009",
  "latitude": 33.467842,
  "longitude": -112.119637,
  "parent_company": "PHOENIX METALS COMPANY LLC",
  "chemical_name": "LEAD",
  "cas_number": "7439921",
  "carcinogen": false,
  "clean_air_act_chemical": true,
  "classification": "Metal",
  "unit_of_measure": "Pounds",
  "total_releases": 1250.5,
  "fugitive_air": 120.0,
  "stack_air": 350.5,
  "water": 0,
  "underground": 0,
  "land": 780.0,
  "off_site": 0,
  "reporting_year": 2023,
  "federal_facility": false
}

EPA TRI Data FAQ

How do I get toxic release data from the EPA? Use the EPA TRI Crawler to query the Envirofacts REST API. Set your filters (state, chemical, year, facility name) and the crawler returns structured JSON records with facility details, release quantities broken down by environmental medium, chemical metadata, and geographic coordinates.

What is the Toxics Release Inventory (TRI)? The TRI is an EPA program that requires certain industrial facilities to report annually on the quantities of toxic chemicals they release to the environment or transfer off-site. It covers over 860 chemicals and ~25,000 facilities, with data available from 1987 to present.

How are release quantities broken down? Each record includes quantities for six release pathways: fugitive air emissions, stack/point air emissions, surface water discharges, underground injection, on-site land disposal, and off-site transfers. The total_releases field is the sum of all pathways. Quantities are reported in Pounds or Grams depending on the chemical.
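Because the total is described as the sum of the six pathways, a record can be sanity-checked after download, and records reported in grams can be converted to pounds for comparison. The sketch below does both; the function names are hypothetical, and the conversion uses the standard definition of 453.59237 grams per pound.

```python
PATHWAYS = ("fugitive_air", "stack_air", "water",
            "underground", "land", "off_site")
GRAMS_PER_POUND = 453.59237


def sum_pathways(record: dict) -> float:
    """Add up the six release pathways, treating missing values as zero."""
    return sum(float(record.get(key) or 0) for key in PATHWAYS)


def total_in_pounds(record: dict) -> float:
    """Return the pathway total in pounds, converting from grams if needed."""
    total = sum_pathways(record)
    if record.get("unit_of_measure") == "Grams":
        return total / GRAMS_PER_POUND
    return total


# Quantities taken from the sample output record.
sample = {"unit_of_measure": "Pounds", "total_releases": 1250.5,
          "fugitive_air": 120.0, "stack_air": 350.5, "water": 0,
          "underground": 0, "land": 780.0, "off_site": 0}
is_consistent = abs(sum_pathways(sample) - sample["total_releases"]) < 1e-6
```

Normalizing units before aggregating avoids adding gram-denominated records to pound-denominated ones.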

Does this crawler require proxies or authentication? No. The EPA Envirofacts API is a public government service. No authentication, API keys, or proxies are required.

How long does a typical run take? The EPA API responds in 3-7 seconds per request, and each record requires multiple API calls to join facility, release, and transfer data. Expect approximately 30 seconds for 10 records and 5 minutes for 100 records.

Why does state filtering use in-memory matching? The EPA Envirofacts API silently ignores the STATE_ABBR filter on the TRI_REPORTING_FORM table (the column exists on TRI_FACILITY, not TRI_REPORTING_FORM). The crawler works around this by pre-loading matching facility IDs from the TRI_FACILITY table and filtering reporting form records in memory.
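The workaround described above amounts to a set-membership filter. The following sketch shows the idea on in-memory rows, with no network calls. The `tri_facility_id` key and both function names are assumptions for illustration; `state_abbr` follows the STATE_ABBR column named above.

```python
def facility_ids_for_state(facility_rows: list, state: str) -> set:
    """Collect IDs of facilities in the given state from TRI_FACILITY-style rows."""
    wanted = state.upper()
    return {
        row["tri_facility_id"]
        for row in facility_rows
        if row.get("state_abbr", "").upper() == wanted
    }


def filter_forms_by_state(form_rows: list, facility_ids: set) -> list:
    """Keep only reporting-form rows whose facility ID was pre-loaded."""
    return [row for row in form_rows if row["tri_facility_id"] in facility_ids]


# Hypothetical rows standing in for TRI_FACILITY and TRI_REPORTING_FORM results.
facility_rows = [
    {"tri_facility_id": "F1", "state_abbr": "AZ"},
    {"tri_facility_id": "F2", "state_abbr": "OH"},
]
form_rows = [
    {"doc_ctrl_num": "D1", "tri_facility_id": "F1"},
    {"doc_ctrl_num": "D2", "tri_facility_id": "F2"},
    {"doc_ctrl_num": "D3", "tri_facility_id": "F1"},
]

arizona_ids = facility_ids_for_state(facility_rows, "az")
arizona_forms = filter_forms_by_state(form_rows, arizona_ids)
```

Using a set for the pre-loaded IDs keeps each membership check constant-time, so the filter scales with the number of reporting-form rows rather than with the number of facilities.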

What chemicals are tracked? The TRI tracks over 860 chemicals including metals (lead, mercury, chromium), volatile organic compounds (benzene, toluene), dioxins, persistent bioaccumulative toxins (PBTs), and PFAS compounds. Each chemical record includes the CAS registry number, carcinogen flag, and classification.

Need a Custom Feature?

If you need additional data fields, custom aggregations, or integration with your environmental monitoring pipeline, file an issue or get in touch.