Social / Tech / Mail - Website Scraper

Pricing: $25.00/month + usage

Crawl multiple websites at once and extract their tech stack (Hotjar, Klaviyo, Google Tag Manager, ...), their social accounts (LinkedIn, Facebook, Instagram), and any emails found.

Rating: 0.0 (0)

Developer: SASWAVE (maintained by Community)

Actor stats: 9 bookmarked, 206 total users, 2 monthly active users, last modified 4 days ago

Social & Tech Stack Website Scraper (Apify Actor)

Crawl multiple websites in parallel and extract each site's technology footprint, public social profiles, and publicly available emails, all returned as clean, structured JSON for marketing intelligence, lead generation, competitive research, or enrichment pipelines.

✅ What it does

This actor crawls one or many target domains and collects:

Detected tech stack (CDNs, analytics, tag managers, marketing tools, widgets; for example Hotjar, Klaviyo, Google Tag Manager, Amplitude)

Public social links (LinkedIn, Facebook, Instagram, Twitter, YouTube, etc.)

All email addresses found on pages (publicly visible text / mailto links)

Optionally: sitemaps, robots.txt, and crawl status (errors, HTTP codes)

Crawls run concurrently across many domains, and the results are returned as normalized JSON.

🔍 Key features

Bulk input support (single URL, list, or CSV)

Parallel crawling with configurable concurrency and depth

JavaScript-enabled page rendering (captures client-side injected tools)

Extracts tech artifacts from scripts, resource hostnames, and known patterns

Finds social accounts from link tags, meta tags, page content, and structured data (JSON-LD)

Collects mailto links and emails in visible text (with basic deduplication)

Rate limiting, politeness delay, and user-agent customization

Output normalization for easy import to CRMs, BI tools or data lakes
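The extraction features listed above can be illustrated with a minimal sketch that uses only the Python standard library. It is not the actor's implementation: the `SignalParser` class, the `SOCIAL_HOSTS` table, and the email pattern are assumptions made for illustration, and the sketch works on static HTML only (no JavaScript rendering).

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical lookup table; the actor's real detection rules are not documented here.
SOCIAL_HOSTS = {
    "linkedin.com": "linkedin",
    "facebook.com": "facebook",
    "instagram.com": "instagram",
}
# Deliberately simple email pattern (an assumption, not a full RFC 5322 matcher).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class SignalParser(HTMLParser):
    """Collects third-party script hosts, social links, and emails from one page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_host = urlparse(page_url).hostname or ""
        self.tech_hosts = []  # third-party script hostnames, first-seen order
        self.social = {}      # network name -> first profile URL seen
        self.emails = []      # lowercased, deduplicated
        self._hidden = False  # True while inside <script> or <style>

    def _add_email(self, value):
        email = value.lower()
        if email not in self.emails:
            self.emails.append(email)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._hidden = True
        if tag == "script" and attrs.get("src"):
            host = urlparse(attrs["src"]).hostname
            # Relative URLs have no hostname and are skipped as first-party.
            if host and host != self.page_host and host not in self.tech_hosts:
                self.tech_hosts.append(host)
        elif tag == "a" and attrs.get("href"):
            href = attrs["href"]
            if href.lower().startswith("mailto:"):
                address = href[7:].split("?", 1)[0]
                if EMAIL_RE.fullmatch(address):
                    self._add_email(address)
                return
            host = urlparse(href).hostname or ""
            if host.startswith("www."):
                host = host[4:]
            name = SOCIAL_HOSTS.get(host)
            if name and name not in self.social:
                self.social[name] = href

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._hidden = False

    def handle_data(self, data):
        if self._hidden:
            return  # ignore text inside inline scripts and styles
        for match in EMAIL_RE.findall(data):
            self._add_email(match)
```

To try it, create `SignalParser(page_url)`, call `feed(html)`, and read `tech_hosts`, `social`, and `emails`. Sites that inject tools client-side would need a rendered DOM, which this sketch does not provide.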

📦 Output

Object output example

{
  "url": "https://www.glady.com/",
  "tech_stack": [
    "connect.facebook.net",
    "pi.pardot.com",
    "bat.bing.com",
    "cdn.jsdelivr.net",
    "www.googletagmanager.com",
    "sdk.privacy-center.org",
    "widget.botmind.io",
    "appvizer.one",
    "www.clarity.ms",
    "cdn.amplitude.com",
    "snap.licdn.com",
    "go.glady.com",
    "stonly.com",
    "cdn.dreamdata.cloud"
  ],
  "linkedin": "https://www.linkedin.com/company/gladyoff",
  "instagram": "",
  "facebook": "https://www.facebook.com/gladyoff",
  "emails": ""
}
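
The example shows hostnames in `tech_stack` and empty strings for missing values. Before loading records into a CRM or database, a small post-processing step may help. The sketch below is an assumption-laden illustration: `KNOWN_TOOLS` is a hand-written mapping that is not part of the actor's output, and because the shape of a populated `emails` value is not shown above, the code accepts either a list or a comma-separated string.

```python
# Illustrative, hand-written mapping from hostname to tool name (not from the actor).
KNOWN_TOOLS = {
    "www.googletagmanager.com": "Google Tag Manager",
    "cdn.amplitude.com": "Amplitude",
    "www.clarity.ms": "Microsoft Clarity",
    "snap.licdn.com": "LinkedIn Insight Tag",
    "pi.pardot.com": "Pardot",
}
SOCIAL_KEYS = ("linkedin", "instagram", "facebook")


def normalize_record(record):
    """Return a cleaned copy of one output object (hypothetical helper)."""
    emails = record.get("emails") or []
    if isinstance(emails, str):
        # Assumption: a populated string would be comma-separated.
        emails = [e.strip() for e in emails.split(",") if e.strip()]
    hosts = record.get("tech_stack") or []
    return {
        "url": record["url"],
        # Named tools, deduplicated and sorted alphabetically.
        "tools": sorted({KNOWN_TOOLS[h] for h in hosts if h in KNOWN_TOOLS}),
        # Hostnames with no entry in the mapping, kept in original order.
        "unmapped_hosts": [h for h in hosts if h not in KNOWN_TOOLS],
        # Only social profiles that are actually present.
        "socials": {k: record[k] for k in SOCIAL_KEYS if record.get(k)},
        "emails": list(emails),
    }
```

Keeping `unmapped_hosts` separate makes it easy to spot first-party subdomains (such as `go.glady.com` in the example) and hostnames the mapping does not cover yet.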

🧠 Use cases

Marketing & growth: identify tools used by prospects and target similar stack users

Sales intelligence: enrich lead profiles with social links and contact emails

Competitive intelligence: compare tech footprints across competitors

M&A / due diligence: quick surface-level tech & contact signals for targets

Data engineering: feed tech tags and emails into enrichment pipelines

🚀 Benefits

Fast parallel crawling that scales to thousands of domains

JavaScript rendering captures modern client-side tooling

Normalized output for direct import into CRMs, Sheets, or databases

Configurable to respect robots.txt rules and rate limits for safe scraping

Works well combined with enrichment APIs (Clearbit, Hunter, BuiltWith)
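
As a rough illustration of what respecting robots.txt rules and rate limits can mean in practice, the sketch below uses Python's standard `urllib.robotparser`. It is not how the actor is implemented: `make_policy`, the user agent string, and the default delay are hypothetical names and values chosen for this example.

```python
from urllib.robotparser import RobotFileParser


def make_policy(robots_txt, user_agent, default_delay=1.0):
    """Build a URL filter and a politeness delay from robots.txt text.

    Returns (allowed, delay): `allowed(url)` says whether the user agent may
    fetch the URL, and `delay` is the Crawl-delay in seconds when the file
    declares one, otherwise `default_delay`.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    declared = parser.crawl_delay(user_agent)

    def allowed(url):
        return parser.can_fetch(user_agent, url)

    delay = float(declared) if declared is not None else default_delay
    return allowed, delay
```

A crawler loop would call `allowed(url)` before each request and sleep for `delay` seconds between requests to the same host.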

⚙️ Best practices & tips

Enable renderJs to detect tag managers and client-side SDKs (but expect slower runs).

Use shallow maxDepth for quick tech detection (homepage + key pages).

Provide a CSV with company + domain for bulk enrichment workflows.

Combine tech_stack results with BuiltWith/SimilarTech for completeness.

Respect robots.txt settings and set polite concurrency values to avoid IP blocking.
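
The CSV tip can be sketched as a small helper that turns a company + domain file into a run input. Only `renderJs` and `maxDepth` are named in the tips above; the `startUrls` field, the `company` and `domain` column names, and the `build_run_input` function are assumptions, so check the actor's input schema before relying on them.

```python
import csv
import io


def build_run_input(csv_text, render_js=False, max_depth=1):
    """Convert CSV text with a 'domain' column into a hypothetical run input."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    urls = []
    for row in reader:
        domain = (row.get("domain") or "").strip().lower()
        if not domain:
            continue  # skip rows without a domain
        if not domain.startswith(("http://", "https://")):
            domain = "https://" + domain
        if domain not in seen:  # drop duplicate domains, keep first-seen order
            seen.add(domain)
            urls.append(domain)
    return {
        "startUrls": urls,      # hypothetical field name
        "renderJs": render_js,  # option named in the tips above
        "maxDepth": max_depth,  # option named in the tips above
    }
```

Deduplicating domains before the run avoids paying for the same crawl twice. A shallow `max_depth` with `render_js=False` is the quick setting, and rendering can be switched on when client-side tools matter.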

🛟 Support

Share your runs with the developer team and open an issue when you hit an error; this helps us improve the actor's quality.

You may discover edge cases we have not tested yet.

We are available at any time.