Website Enrichment Scraper

Pricing

from $6.00 / 1,000 results

Website Enrichment Scraper extracts structured business intelligence from any website, including business name, category, and verified email addresses. It is designed for lead enrichment, sales intelligence, and data validation workflows at scale.

Rating

0.0

(0)

Developer

Gyanendra Thakur

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 3
  • Monthly active users: 2
  • Last modified: 13 days ago

Website Enrichment Scraper enriches business profiles from website content and returns records in a fixed business-info format.

Overview

This actor accepts business records (or legacy URL lists), crawls each website with strict same-domain limits, and fills missing fields from structured and unstructured website signals (JSON-LD, metadata, visible text, and key links).
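
To illustrate what a strict same-domain limit with prioritized pages might look like, here is a minimal Python sketch. It is not the actor's code (the actor is built on Crawlee + Cheerio). The function name, the keyword list, and the treatment of "www." as the same domain are all assumptions made for the example.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical keywords used to rank internal pages; the actor's real list is not documented.
PRIORITY_KEYWORDS = ("contact", "about")


def _host(url: str) -> str:
    """Return the lower-cased hostname without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def select_internal_links(base_url: str, hrefs: list[str], max_pages: int) -> list[str]:
    """Return up to max_pages same-domain URLs, with priority pages first."""
    seen: set[str] = set()
    candidates: list[str] = []
    for href in hrefs:
        # Resolve relative links and drop the fragment.
        absolute = urljoin(base_url, href).split("#", 1)[0]
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skip mailto:, tel:, javascript:, and similar
        if _host(absolute) != _host(base_url):
            continue  # same-domain rule: never follow external links
        if absolute not in seen:
            seen.add(absolute)
            candidates.append(absolute)
    # Stable sort: URLs whose path mentions a priority keyword come first.
    candidates.sort(
        key=lambda u: not any(k in urlparse(u).path.lower() for k in PRIORITY_KEYWORDS)
    )
    return candidates[:max_pages]


links = select_internal_links(
    "https://www.example.com/",
    ["/blog", "/contact", "https://other.com/about", "mailto:a@example.com", "/about-us#team", "/blog"],
    max_pages=3,
)
```

With this input, the external link and the mailto link are dropped, the duplicate is collapsed, and the contact and about pages are ordered ahead of the blog page. Whether the homepage itself counts toward the page cap is not stated in the description, so the sketch leaves it out.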

Extracted Data

For each record, the actor outputs:

  • businessName
  • category
  • rating
  • reviewsCount
  • address
  • phone
  • website
  • email
  • businessDescription
  • googleMapsUrl
  • location
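
For readers consuming the dataset in code, the field list above could be modelled roughly as follows. The field names come from the list. The types are assumptions, since the description does not state them (in particular, the shape of location is unknown).

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class BusinessRecord:
    """One enriched record. Field names follow the list above; types are assumed."""

    businessName: Optional[str] = None
    category: Optional[str] = None
    rating: Optional[float] = None       # assumed numeric
    reviewsCount: Optional[int] = None   # assumed integer
    address: Optional[str] = None
    phone: Optional[str] = None
    website: Optional[str] = None
    email: Optional[str] = None
    businessDescription: Optional[str] = None
    googleMapsUrl: Optional[str] = None
    location: Any = None                 # shape not documented (string or object)


# A partial record still serializes with every field present.
partial = asdict(BusinessRecord(businessName="Acme Bakery", website="https://example.com"))
```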

Use Cases

  • Lead enrichment and verification
  • Sales prospecting workflows
  • Agency outreach campaigns
  • Market research
  • Contact database building

How It Works

The actor:

  1. Normalizes each input record to the target business schema.
  2. Scans the homepage and prioritized internal pages (e.g., contact/about).
  3. Extracts business signals from JSON-LD, metadata, and body text.
  4. Preserves existing non-empty input fields.
  5. Fills only missing fields and returns one enriched record per input record.
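
Steps 4 and 5 amount to a fill-only-if-missing merge. A minimal Python sketch of that rule is shown here. The helper names and the exact definition of "empty" (None, blank string, empty list or dict) are assumptions for illustration, not the actor's implementation.

```python
def is_empty(value) -> bool:
    """Treat None, blank strings, and empty lists/dicts as missing (an assumption)."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, dict)):
        return len(value) == 0
    return False


def enrich(record: dict, extracted: dict) -> dict:
    """Return a copy of record with missing fields filled from extracted values."""
    merged = dict(record)
    for key, value in extracted.items():
        # Existing non-empty input always wins; only gaps are filled.
        if is_empty(merged.get(key)) and not is_empty(value):
            merged[key] = value
    return merged


result = enrich(
    {"businessName": "Acme", "email": "", "phone": None},
    {"businessName": "ACME Inc", "email": "hi@acme.test", "phone": "+1 555 0100", "category": None},
)
```

In this example the input businessName is preserved, while the empty email and phone are filled from the website signals. A numeric 0 is not treated as empty, so an input rating of 0 would be kept.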

Input

  • records (recommended): array of partial business objects to enrich.
  • startUrls (legacy): array of URLs if records are not provided.
  • maxPagesPerSite: cap on the number of pages crawled per website.
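
As a rough illustration, an input payload using these three fields could be assembled like this. The helper name, the default of 5 pages, and the choice to prefer records when both are supplied are assumptions. The exact element shape expected for startUrls (plain strings or objects) should be checked against the actor's input schema.

```python
import json


def build_input(records=None, start_urls=None, max_pages_per_site=5):
    """Build an input payload, preferring records over legacy startUrls."""
    if records:
        payload = {"records": list(records)}
    elif start_urls:
        payload = {"startUrls": list(start_urls)}  # legacy mode
    else:
        raise ValueError("Provide either records or start_urls")
    payload["maxPagesPerSite"] = max_pages_per_site  # 5 is an arbitrary example value
    return payload


run_input = build_input(
    records=[{"businessName": "Acme Bakery", "website": "https://example.com"}],
    max_pages_per_site=3,
)
print(json.dumps(run_input, indent=2))
```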

Output Structure

Each dataset record contains every field listed under Extracted Data.

Performance & Architecture

  • Built with Crawlee + Cheerio for lightweight crawling
  • Controlled concurrency, retry handling, and same-domain link strategy
  • Structured extraction priority (JSON-LD first, then metadata/text fallback)
  • Optimized for scalable batch enrichment
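
The "JSON-LD first, then metadata/text fallback" priority can be sketched with Python's standard library alone. This is an illustration of the idea, not the actor's Cheerio-based code. The class and function names are hypothetical, and the mapping from schema.org properties (name, description, telephone, email) to output fields is an assumption.

```python
import json
from html.parser import HTMLParser


class _SignalParser(HTMLParser):
    """Collects JSON-LD objects, the <title> text, and the meta description."""

    def __init__(self) -> None:
        super().__init__()
        self.json_ld: list[dict] = []
        self.title = ""
        self.meta_description = ""
        self._in_json_ld = False
        self._in_title = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "script" and (attr.get("type") or "").lower() == "application/ld+json":
            self._in_json_ld, self._buffer = True, []
        elif tag == "title":
            self._in_title = True
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.meta_description = (attr.get("content") or "").strip()

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)
        elif self._in_title:
            self.title += data

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # ignore malformed JSON-LD and rely on the fallbacks
            items = parsed if isinstance(parsed, list) else [parsed]
            self.json_ld.extend(i for i in items if isinstance(i, dict))
        elif tag == "title":
            self._in_title = False


def extract_signals(html: str) -> dict:
    """Prefer JSON-LD values; fall back to <title> and the meta description."""
    parser = _SignalParser()
    parser.feed(html)
    parser.close()
    ld = next((item for item in parser.json_ld if item.get("name")), {})
    return {
        "businessName": ld.get("name") or parser.title.strip() or None,
        "businessDescription": ld.get("description") or parser.meta_description or None,
        "phone": ld.get("telephone"),
        "email": ld.get("email"),
    }
```

For a page that carries a LocalBusiness JSON-LD block, the name and telephone come from that block. For a page without one, the name falls back to the title tag and the description to the meta description, while phone and email stay empty in this simplified version.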

Notes

This actor extracts only publicly available information from websites. Data availability depends on the structure and transparency of each site.