Deprecated

Pricing

from $5.00 / 1,000 results

ESG Report Scraper — Sustainability Data

Automatically collect corporate sustainability reports and ESG data. Extract GHG emissions, energy metrics, and CDP disclosures for investment research.

Developer

Vhub Systems

Last modified: 4 months ago

ESG Report Scraper: Corporate Sustainability Data Aggregator

Automatically collect and extract ESG metrics, sustainability reports, and corporate climate disclosures from multiple sources in one unified workflow.

What is ESG Report Scraper?

ESG Report Scraper is an automated data collection tool for ESG analysts, investors, and sustainability professionals who need to monitor corporate environmental disclosures at scale. Instead of manually downloading dozens of PDF reports and searching through news sites, you can use this actor to aggregate sustainability data from corporate websites, the CDP database, and specialized ESG news sources. It extracts structured metrics such as GHG emissions, energy consumption, and waste generation.

The actor searches for sustainability reports, annual reports, and CSRD-compliant disclosures using DuckDuckGo, then processes each result to extract key environmental metrics. Metrics are read from HTML pages; for PDF reports the actor returns the link rather than parsing the file (see the FAQ below). It detects the report type, identifies the reporting year, and structures the data into a standardized JSON format ready for analysis, dashboards, or integration with ESG rating systems.

Whether you are tracking climate commitments across a portfolio of companies, conducting competitive ESG benchmarking, or researching corporate sustainability trends for academic purposes, this scraper eliminates manual data collection and delivers consistent, structured ESG data in minutes instead of hours.

Output Data Fields

Field	Type	Description
companyName	string	Name of the company or keyword searched
reportTitle	string	Full title of the sustainability report or webpage
reportUrl	string	Direct URL to the report (PDF or webpage)
reportType	string	Type of report: "sustainability", "CSRD", or "annual"
year	integer	Reporting year extracted from document (e.g., 2024)
keyMetrics	object	Extracted environmental metrics (emissions, energy, waste)
keyMetrics.emissions	string	GHG emissions data (e.g., "16.2 million metric tons CO2e")
keyMetrics.energy	string	Energy consumption data (e.g., "2,950 GWh")
keyMetrics.waste	string	Waste generation data (e.g., "38,000 tons")
summary	string	Short summary or meta description from the source page
sourceUrl	string	Original source URL where the data was found
scrapedAt	string	ISO timestamp when the data was collected
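For readers consuming the dataset in Python, the fields in this table map onto a small typed record. The sketch below is one possible model, not something the actor ships: the names `EsgRecord`, `KeyMetrics`, and `parse_record` are hypothetical, and the metric fields are treated as optional on the assumption that a record may lack some of them.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class KeyMetrics:
    # Each metric is free text as found in the source and may be missing.
    emissions: Optional[str] = None
    energy: Optional[str] = None
    waste: Optional[str] = None


@dataclass
class EsgRecord:
    companyName: str
    reportTitle: str
    reportUrl: str
    reportType: str
    year: Optional[int]
    keyMetrics: KeyMetrics
    summary: str
    sourceUrl: str
    scrapedAt: datetime


def parse_record(raw: dict) -> EsgRecord:
    """Build an EsgRecord from one dataset item (a plain dict)."""
    metrics = raw.get("keyMetrics") or {}
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so rewrite it as an explicit UTC offset first.
    scraped = datetime.fromisoformat(raw["scrapedAt"].replace("Z", "+00:00"))
    return EsgRecord(
        companyName=raw["companyName"],
        reportTitle=raw["reportTitle"],
        reportUrl=raw["reportUrl"],
        reportType=raw["reportType"],
        year=raw.get("year"),
        keyMetrics=KeyMetrics(
            emissions=metrics.get("emissions"),
            energy=metrics.get("energy"),
            waste=metrics.get("waste"),
        ),
        summary=raw.get("summary", ""),
        sourceUrl=raw["sourceUrl"],
        scrapedAt=scraped,
    )
```

Keeping the metrics as strings here is deliberate: the values are extracted text, so any numeric conversion is better done as a separate, explicit step.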

Tutorial: How to Extract ESG Data in 7 Steps

1. Open the Actor: Navigate to the ESG Report Scraper actor page in Apify Console and click "Try for free".

2. Prepare Your Keywords: Enter the company names or ESG topics you want to track. Examples: "Tesla", "Unilever", "renewable energy", "carbon neutrality".

3. Configure Data Sources: Choose which sources to scrape: corporate websites (direct reports), the CDP database (climate disclosures), or ESG news sites (latest updates).

4. Set Result Limits: Specify the maximum number of reports to process. For initial testing, start with 20-30 results. For comprehensive data collection, increase to 50-100.

5. Run the Actor: Click "Start" and the actor will begin searching for sustainability reports across all selected sources. Processing typically takes 3-10 minutes depending on the number of keywords.

6. Review Extracted Metrics: Once the run completes, view the dataset in JSON or Excel format. Each record includes structured metrics such as GHG emissions and energy consumption, plus direct links to source documents.

7. Export or Integrate: Download the data as CSV, JSON, or Excel for manual analysis, or connect the dataset to your BI tools, ESG platforms, or investment research systems via the Apify API.
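The same run can in principle be started programmatically instead of from the Console. The sketch below prepares a request for the Apify API's run-actor endpoint using only the standard library. The actor ID is a placeholder (this page does not state it), the helper names are hypothetical, and since the actor is marked deprecated the call may no longer succeed.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST that starts an actor run."""
    # Assumption: actor_id uses the API's "username~actor-name" form.
    url = f"{API_BASE}/acts/{actor_id}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def start_run(request: urllib.request.Request) -> dict:
    """Send the request and return the run object from the response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["data"]


if __name__ == "__main__":
    req = build_run_request(
        "<username>~<actor-name>",  # placeholder, replace with the real actor ID
        "<YOUR_API_TOKEN>",         # placeholder
        {"keywords": ["Tesla", "Unilever"], "maxResults": 20},
    )
    print(req.get_method(), req.full_url)
```

The run is started asynchronously here because the tutorial quotes run times of 3-10 minutes; the returned run object includes a `defaultDatasetId` that identifies where the results are stored.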

Input Parameters

Parameter	Type	Required	Default	Description
keywords	array of strings	Yes	-	List of company names or ESG topics to search for (e.g., ["Apple", "Microsoft", "Tesla"])
maxResults	integer	No	20	Maximum number of reports or pages to process per run (range: 1-200)
sources	array of strings	No	["corporate", "cdp", "news"]	Data sources to scrape: "corporate" (company websites), "cdp" (CDP database), "news" (ESG news sites)
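The constraints in this table can be checked before a run is started. The following is a minimal sketch of such a check, assuming the documented defaults and the 1-200 range; `normalize_input` and `ALLOWED_SOURCES` are hypothetical names and not part of the actor.

```python
ALLOWED_SOURCES = ("corporate", "cdp", "news")


def normalize_input(raw: dict) -> dict:
    """Validate an input object and fill in the documented defaults."""
    keywords = raw.get("keywords")
    if not isinstance(keywords, list) or not keywords:
        raise ValueError("keywords is required and must be a non-empty list")
    if not all(isinstance(k, str) and k.strip() for k in keywords):
        raise ValueError("every keyword must be a non-empty string")

    max_results = raw.get("maxResults", 20)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("maxResults must be an integer")
    if not 1 <= max_results <= 200:
        raise ValueError("maxResults must be between 1 and 200")

    sources = raw.get("sources", list(ALLOWED_SOURCES))
    if not isinstance(sources, list):
        raise ValueError("sources must be a list")
    unknown = [s for s in sources if s not in ALLOWED_SOURCES]
    if unknown:
        raise ValueError(f"unknown sources: {unknown}")

    return {
        "keywords": [k.strip() for k in keywords],
        "maxResults": max_results,
        "sources": list(sources),
    }
```

Calling `normalize_input({"keywords": ["Apple"]})` returns the input with `maxResults` set to 20 and all three sources enabled, matching the defaults listed above.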

Example Input

{
  "keywords": [
    "Apple",
    "Microsoft",
    "Unilever",
    "BP",
    "Tesla"
  ],
  "maxResults": 50,
  "sources": [
    "corporate",
    "cdp",
    "news"
  ]
}

Example Output

{
  "companyName": "Apple",
  "reportTitle": "Apple Environmental Progress Report 2024",
  "reportUrl": "https://www.apple.com/environment/pdf/Apple_Environmental_Progress_Report_2024.pdf",
  "reportType": "sustainability",
  "year": 2024,
  "keyMetrics": {
    "emissions": "16.2 million metric tons CO2e",
    "energy": "2,950 GWh",
    "waste": "38,000 tons"
  },
  "summary": "Apple's comprehensive environmental report covering carbon neutrality goals, renewable energy progress, circular economy initiatives, and supply chain emissions reduction strategies for 2024.",
  "sourceUrl": "https://www.apple.com/environment/",
  "scrapedAt": "2024-02-15T14:22:30.000Z"
}

{
  "companyName": "Microsoft",
  "reportTitle": "Microsoft 2024 Environmental Sustainability Report - CSRD Disclosure",
  "reportUrl": "https://query.prod.cms.rt.microsoft.com/cms/api/am/binary/RW1lMjE",
  "reportType": "CSRD",
  "year": 2024,
  "keyMetrics": {
    "emissions": "13.5 million metric tons CO2e",
    "energy": "5,100 MWh"
  },
  "summary": "Microsoft's progress towards carbon negative commitment by 2030, including Scope 1, 2, and 3 emissions data aligned with CSRD requirements, renewable energy procurement, and water stewardship initiatives.",
  "sourceUrl": "https://www.microsoft.com/en-us/sustainability",
  "scrapedAt": "2024-02-15T14:25:18.000Z"
}

{
  "companyName": "Unilever",
  "reportTitle": "Unilever Sustainability Report 2024: Climate Action and Regenerative Agriculture",
  "reportUrl": "https://www.unilever.com/files/sustainability/2024-sustainability-report.pdf",
  "reportType": "sustainability",
  "year": 2024,
  "keyMetrics": {
    "emissions": "8.7 million metric tons CO2e",
    "energy": "3,200 GWh",
    "waste": "125,000 tons"
  },
  "summary": "Unilever's annual sustainability disclosure covering climate commitments, regenerative agriculture programs, packaging waste reduction, and social impact initiatives across global operations.",
  "sourceUrl": "https://www.unilever.com/sustainability/",
  "scrapedAt": "2024-02-15T14:28:45.000Z"
}

{
  "companyName": "Tesla",
  "reportTitle": "Tesla Impact Report 2024",
  "reportUrl": "https://www.tesla.com/ns_videos/2024-tesla-impact-report.pdf",
  "reportType": "sustainability",
  "year": 2024,
  "keyMetrics": {
    "emissions": "2.1 million metric tons CO2e avoided",
    "energy": "6,800 GWh renewable"
  },
  "summary": "Tesla's 2024 impact report detailing environmental benefits of electric vehicles, Gigafactory renewable energy integration, battery recycling programs, and global emissions avoidance from vehicle fleet.",
  "sourceUrl": "https://www.tesla.com/impact",
  "scrapedAt": "2024-02-15T14:31:12.000Z"
}
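Because `keyMetrics` values arrive as text, comparing companies usually requires turning them into numbers first. The sketch below shows one way to split a metric string into a numeric value and a unit. `parse_metric` is a hypothetical helper, and the handling of scale words is an assumption based only on the formats shown in these examples.

```python
import re
from typing import Optional, Tuple

_SCALE = {"thousand": 1e3, "million": 1e6, "billion": 1e9}

# Leading number (with optional thousands separators and decimals),
# an optional scale word, then whatever remains as the unit.
_METRIC = re.compile(
    r"^\s*(\d[\d,]*(?:\.\d+)?)\s*(thousand|million|billion)?\s*(.*)$",
    re.IGNORECASE,
)


def parse_metric(text: str) -> Optional[Tuple[float, str]]:
    """Return (value, unit) for a metric string, or None if it has no leading number."""
    match = _METRIC.match(text)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    scale = _SCALE.get((match.group(2) or "").lower(), 1.0)
    return number * scale, match.group(3).strip()


if __name__ == "__main__":
    for sample in ("2,950 GWh", "38,000 tons", "2.1 million metric tons CO2e avoided"):
        print(sample, "->", parse_metric(sample))
```

Qualifiers such as "avoided" or "renewable" stay in the unit string, so values with different qualifiers are not silently treated as comparable. Units are also not converted: a figure in MWh and one in GWh need an explicit conversion before they are compared.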

Legal and Compliance

This actor collects publicly available ESG data from corporate websites, CDP disclosures, and news sites that are intended for public consumption. All scraped data consists of sustainability reports and environmental disclosures voluntarily published by companies for investor relations, regulatory compliance, and stakeholder transparency. Users are responsible for ensuring their use of scraped data complies with applicable laws, including copyright, data protection regulations, and the terms of service of source websites.

When using this actor, you should verify that your data collection activities comply with relevant regulations such as GDPR in the European Union or CCPA in California if processing personal data. The actor is designed for business intelligence, investment research, and academic purposes. It is not intended for unauthorized data harvesting, competitive harm, or violation of website terms of service. Users should implement appropriate rate limiting and respect robots.txt directives when deploying this tool at scale.
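For readers building their own collection jobs around this data, the robots.txt and rate-limiting advice above can be sketched with the standard library. This is an illustration only and says nothing about how the actor itself behaves; `allowed_by_robots` and `MinIntervalLimiter` are hypothetical names, and the robots.txt content in the usage example is made up.

```python
import time
import urllib.robotparser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt content that was already downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class MinIntervalLimiter:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval_s: float, clock=time.monotonic, sleep=time.sleep):
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self) -> None:
        """Block until at least min_interval_s has passed since the last call."""
        if self._last is not None:
            remaining = self._min_interval_s - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    limiter = MinIntervalLimiter(2.0)
    for url in ("https://example.com/environment/", "https://example.com/private/report.pdf"):
        if allowed_by_robots(robots, "my-research-bot", url):
            limiter.wait()
            print("would fetch", url)
        else:
            print("skipping", url)
```

The clock and sleep functions are injectable so the limiter can be tested without real delays.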

Pricing

This actor uses Apify platform resources based on compute time and proxy usage. Typical costs range from $0.10 to $0.50 per run depending on the number of keywords, maximum results configured, and data sources selected. Runs with 5-10 keywords and 20-50 max results usually complete in 3-10 minutes and consume 0.02-0.10 compute units.
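The figures quoted on this page can be sanity-checked with simple arithmetic. The sketch below assumes Apify's definition of a compute unit as 1 GB of memory used for 1 hour; the memory allocation is a guess, since the page does not state it. It also works through the per-result price shown in the listing header ("from $5.00 / 1,000 results"), without claiming which of the two models applied to a given run.

```python
def compute_units(memory_mb: float, minutes: float) -> float:
    """Compute units for a run: memory in GB multiplied by duration in hours."""
    return (memory_mb / 1024) * (minutes / 60)


def per_result_cost(results: int, usd_per_1000: float = 5.00) -> float:
    """Cost in USD under a flat per-result price."""
    return results * usd_per_1000 / 1000


if __name__ == "__main__":
    # Assumed memory allocations; the actor's real setting is not documented here.
    for memory_mb, minutes in ((512, 3), (512, 10), (1024, 6)):
        print(f"{memory_mb} MB for {minutes} min ->",
              round(compute_units(memory_mb, minutes), 3), "CU")
    print("50 results at $5.00 per 1,000 -> $", per_result_cost(50))
```

With 512 MB of memory, runs of 3 and 10 minutes come to roughly 0.025 and 0.083 compute units, which is in line with the 0.02-0.10 range quoted above.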

Pricing is pay-per-use with no subscription required. You only pay for actual compute time consumed during actor runs. For high-volume or scheduled data collection, consider upgrading to Apify paid plans for better rates on compute units and increased concurrency limits. Detailed pricing information is available on the Pricing tab of this actor's page.

Frequently Asked Questions

Q: Can this actor extract data from password-protected sustainability reports? A: No, the actor only scrapes publicly accessible reports and webpages. It cannot bypass authentication or paywalls. Ensure the reports you are targeting are publicly available or published on corporate websites without login requirements.

Q: How accurate is the metric extraction for emissions and energy data? A: The actor uses regex patterns to identify and extract common ESG metrics from page text. Accuracy depends on how consistently companies format their disclosures: metrics written in standard forms (e.g., "16.2 million metric tons CO2e") are extracted with high accuracy, while non-standard formats may require manual review. PDF text extraction is not included, so metrics are extracted only from HTML pages; for PDF reports, the link is provided for manual processing.
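To make the accuracy caveat concrete, here is a minimal sketch of regex-based metric extraction from page text. The patterns are illustrative assumptions, not the actor's actual ones, and `extract_metrics` is a hypothetical name.

```python
import re

# Illustrative patterns only; a production extractor would need many more variants.
PATTERNS = {
    "emissions": re.compile(
        r"\d[\d,]*(?:\.\d+)?\s*(?:thousand|million|billion)?\s*"
        r"(?:metric\s+)?(?:tons|tonnes)\s+(?:of\s+)?CO2e?",
        re.IGNORECASE,
    ),
    "energy": re.compile(
        r"\d[\d,]*(?:\.\d+)?\s*(?:MWh|GWh|TWh)\b",
        re.IGNORECASE,
    ),
}


def extract_metrics(text: str) -> dict:
    """Return the first match per metric type found in the text."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(0)
    return found


if __name__ == "__main__":
    standard = ("In 2024 total emissions were 16.2 million metric tons CO2e, "
                "while facilities used 2,950 GWh of electricity.")
    non_standard = "Emissions fell to 16.2 Mt."
    print(extract_metrics(standard))
    print(extract_metrics(non_standard))
```

The second input uses the abbreviation "Mt", which neither pattern covers, so nothing is extracted. This is the kind of non-standard format that would need manual review.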

Q: Can I schedule this actor to run automatically every month to track new reports? A: Yes, you can use Apify Schedules to run this actor automatically at specified intervals (daily, weekly, monthly). This is ideal for monitoring new sustainability report publications or tracking ESG news updates. Configure a schedule from the Schedules tab in the Apify Console.

Q: What is the difference between corporate, CDP, and news data sources? A: The "corporate" source searches company websites for sustainability reports and annual reports using DuckDuckGo. The "cdp" source targets the CDP (formerly the Carbon Disclosure Project) database for climate-related disclosures submitted by companies. The "news" source scrapes ESG news sites such as ESG Today for the latest sustainability news and announcements. You can enable all three sources for comprehensive coverage or select specific sources based on your needs.

Q: How do I export the scraped data to Excel or integrate it with my ESG platform? A: After a run completes, you can download the dataset in multiple formats including JSON, CSV, Excel, and HTML from the dataset view in Apify Console. For integration with ESG platforms, BI tools, or custom applications, use the Apify API to programmatically fetch the dataset. The API provides endpoints to retrieve data in JSON format for seamless integration with Tableau, Power BI, or custom ESG rating systems.
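As a rough illustration of the API route, the sketch below builds the URL for the Apify dataset items endpoint and flattens records into CSV locally. The dataset ID is a placeholder, the helper names are hypothetical, and the flattening of `keyMetrics` into dotted column names is a design assumption rather than the platform's own export format.

```python
import csv
import io
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """URL for downloading dataset items in the requested format."""
    return f"{API_BASE}/datasets/{dataset_id}/items?" + urlencode({"format": fmt})


def flatten(record: dict) -> dict:
    """Move nested keyMetrics values into top-level dotted columns."""
    flat = {k: v for k, v in record.items() if k != "keyMetrics"}
    for name, value in (record.get("keyMetrics") or {}).items():
        flat[f"keyMetrics.{name}"] = value
    return flat


def to_csv(records: list) -> str:
    """Serialize records to CSV, leaving missing metrics as empty cells."""
    rows = [flatten(r) for r in records]
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(dataset_items_url("<DATASET_ID>", "csv"))  # placeholder ID
    print(to_csv([
        {"companyName": "Apple", "keyMetrics": {"energy": "2,950 GWh", "waste": "38,000 tons"}},
        {"companyName": "Microsoft", "keyMetrics": {"energy": "5,100 MWh"}},
    ]))
```

Private datasets also require an API token with the request. Flattening locally keeps the column layout under your control when loading the data into tools such as Tableau or Power BI.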

Explore other data collection actors by lanky_quantifier for comprehensive web scraping solutions:

Reddit Thread Scraper - Extract Reddit discussions, sentiment, and community insights for brand monitoring and market research
Google Maps Scraper - Collect business listings, reviews, and location data from Google Maps for competitive intelligence
Contact Info Scraper - Automatically extract email addresses, phone numbers, and social media profiles from websites for lead generation
Amazon Product Scraper - Scrape Amazon product listings, prices, reviews, and seller information for e-commerce market analysis
LinkedIn Company Scraper - Extract company profiles, employee counts, and industry data from LinkedIn for B2B prospecting

Built by lanky_quantifier | Automating ESG data collection for investors and sustainability teams worldwide.
