Website Contact Extractor (Browser)

Pricing

Pay per usage

Extract team contacts from JavaScript-rendered company websites (React, Vue, Angular) using AI + Playwright. Companion to the HTTP-only Website Contact Extractor. Handles the ~28% of sites that need a real browser. Same output format, same quality, same LLM fallback chain.

Developer

Alessandro Santamaria

Maintained by Community

Extract contacts from JavaScript-rendered company websites (React, Vue, Angular SPAs) using AI + Playwright.

This is the browser-based companion to the Website Contact Extractor (HTTP-only). Use it for companies that the HTTP version marks with the js_rendering_suspected status.

When to use this actor

  • Team/contact/about pages built with React, Vue, Angular, or other JS frameworks
  • Pages that return empty/skeleton HTML without JavaScript execution
  • Companies flagged by the HTTP actor's JS-rendering detection
  • Auto-chained via enablePlaywrightFallback on the HTTP actor
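
To make the "empty/skeleton HTML" case concrete, here is a minimal Python sketch of one possible heuristic: measure how much visible text a page contains once scripts and styles are ignored. This is not the HTTP actor's actual detection logic, which is not documented here; the function name `looks_js_rendered` and the 200-character threshold are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text found outside <script>, <style>, and <noscript> elements."""

    _SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def looks_js_rendered(html, min_visible_chars=200):
    """Hypothetical heuristic: very little visible text suggests an SPA shell."""
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    visible = " ".join(parser.chunks)
    return len(visible) < min_visible_chars
```

A page that consists of a title, a script bundle, and an empty root element would be reported as likely JS-rendered, while a server-rendered team page with ordinary paragraphs would not. The threshold would need tuning against real sites.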

How it works

  1. Link discovery from homepage (HTTP, fast)
  2. Phase 1: Playwright renders main pages (team, contact, about, impressum)
  3. Phase 2: Playwright renders team member detail pages
  4. LLM extraction using Gemini Flash / Groq / OpenRouter
  5. Anti-hallucination validation against source HTML
  6. Cross-company dedup to catch LLM-invented names

Same extraction pipeline as the HTTP actor — same output format, same quality.
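
Steps 5 and 6 can be illustrated with a short Python sketch. It is only an approximation of the idea, not the actor's implementation: it assumes a contact is "grounded" when its name (and email, if present) appears in the rendered page text, and it treats a name that shows up under several unrelated companies as suspicious. The function names and the `min_companies` threshold are hypothetical.

```python
import re
from collections import defaultdict


def _normalize(text):
    """Lowercase and collapse whitespace so matching tolerates layout differences."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_grounded(contact, page_text):
    """Keep a contact only if its name (and email, when given) occurs in the page text."""
    haystack = _normalize(page_text)
    name = _normalize(contact.get("name", ""))
    if not name or name not in haystack:
        return False
    email = _normalize(contact.get("email") or "")
    return not email or email in haystack


def names_shared_across_companies(results, min_companies=3):
    """Return normalized names that appear under at least `min_companies` companies."""
    seen = defaultdict(set)
    for result in results:
        for contact in result.get("contacts", []):
            seen[_normalize(contact.get("name", ""))].add(result["company_id"])
    return {name for name, ids in seen.items() if name and len(ids) >= min_companies}
```

A plain substring check like this would miss names split across HTML tags or obfuscated email addresses, so a production validator would likely need to be more tolerant.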

Input

Same input format as the HTTP actor:

{
  "companies": [
    {
      "company_id": "abc-123",
      "company_name": "TechCorp AG",
      "website_url": "https://techcorp.ch"
    }
  ],
  "llmProvider": "gemini",
  "geminiApiKey": "YOUR_KEY"
}
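
The input above could be assembled and submitted from Python roughly as follows. This is a sketch under assumptions: it treats all three company fields shown as required (the README does not say so explicitly), `actor_id` is a placeholder you would copy from the Apify Store page, and results are assumed to land in the run's default dataset. The `run_actor` helper uses the third-party `apify-client` package and is imported lazily so that `build_run_input` works without it.

```python
from urllib.parse import urlparse

# Assumption: every company entry needs the three fields shown in the example input.
REQUIRED_COMPANY_FIELDS = ("company_id", "company_name", "website_url")


def build_run_input(companies, gemini_api_key):
    """Validate company entries and return an input dict in the documented shape."""
    for company in companies:
        missing = [f for f in REQUIRED_COMPANY_FIELDS if not company.get(f)]
        if missing:
            raise ValueError(f"company {company!r} is missing {missing}")
        parsed = urlparse(company["website_url"])
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {company['website_url']!r}")
    return {
        "companies": companies,
        "llmProvider": "gemini",
        "geminiApiKey": gemini_api_key,
    }


def run_actor(run_input, apify_token, actor_id):
    """Start the actor, wait for it to finish, and return its dataset items."""
    from apify_client import ApifyClient  # third-party: pip install apify-client

    client = ApifyClient(apify_token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Validating URLs before the run avoids paying for browser time on entries that could never be fetched.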

Output

Each company result includes browser_extraction: true:

{
  "company_id": "abc-123",
  "company_name": "TechCorp AG",
  "contacts": [
    {
      "name": "Max Mustermann",
      "position": "CTO",
      "email": "max@techcorp.ch",
      "confidence": 0.92
    }
  ],
  "status": "success",
  "browser_extraction": true
}
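
Downstream code will often want only the contacts it can trust. The following Python sketch filters a result like the one above by its `confidence` field; the helper name and the 0.8 default threshold are assumptions, not part of the actor.

```python
def confident_contacts(result, min_confidence=0.8):
    """Return contacts at or above the threshold, or nothing for non-success results."""
    if result.get("status") != "success":
        return []
    return [
        contact
        for contact in result.get("contacts", [])
        if contact.get("confidence", 0.0) >= min_confidence
    ]
```

Contacts without a `confidence` value are treated as 0.0 here and therefore dropped, which is the conservative choice.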

Memory requirements

  • Minimum: 1024 MB (Playwright + Chrome)
  • Recommended: 2048 MB for 5+ companies
  • Maximum: 4096 MB
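
The guidance above can be turned into a small helper that picks a memory setting per run. This is a sketch: the function name is hypothetical, and the rule that an explicit request is simply clamped to the documented range is an assumption.

```python
from typing import Optional

MIN_MEMORY_MB = 1024    # documented minimum (Playwright + Chrome)
BATCH_MEMORY_MB = 2048  # documented recommendation for 5+ companies
MAX_MEMORY_MB = 4096    # documented maximum


def pick_memory_mb(company_count: int, requested_mb: Optional[int] = None) -> int:
    """Choose a memory size from the batch size, or clamp an explicit request."""
    if company_count < 1:
        raise ValueError("company_count must be at least 1")
    if requested_mb is not None:
        # An explicit request wins, but is kept inside the documented range.
        return max(MIN_MEMORY_MB, min(requested_mb, MAX_MEMORY_MB))
    return BATCH_MEMORY_MB if company_count >= 5 else MIN_MEMORY_MB
```

If you start runs with the Python `apify-client` package, the resulting value can be passed as the `memory_mbytes` argument of the actor's `call` method.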

Pricing

Browser-based extraction costs ~2x the HTTP actor:

| Event | Cost |
|---|---|
| browser-company-enriched | $0.02/company |
| browser-contact-result | $0.010/contact |
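
As a worked example of these rates, the Python sketch below estimates the event charges for a run. It assumes one `browser-company-enriched` event per enriched company and one `browser-contact-result` event per returned contact, and it ignores any platform usage costs that may be billed separately.

```python
from decimal import Decimal

COMPANY_FEE = Decimal("0.02")   # browser-company-enriched, per company
CONTACT_FEE = Decimal("0.010")  # browser-contact-result, per contact


def estimate_cost(enriched_companies: int, contacts: int) -> Decimal:
    """Estimated event charges in USD for one run."""
    return COMPANY_FEE * enriched_companies + CONTACT_FEE * contacts
```

For 10 enriched companies yielding 35 contacts in total, this gives 10 × $0.02 + 35 × $0.010 = $0.55.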

Auto-chaining

The HTTP actor can automatically trigger this browser actor:

  1. Run the HTTP actor with enablePlaywrightFallback: true
  2. Companies with js_rendering_suspected status are collected
  3. A browser actor run starts automatically (fire-and-forget)
  4. The browser run ID is saved in the key-value store
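
If you prefer to chain the two actors yourself instead of using enablePlaywrightFallback, the selection in step 2 can be reproduced by hand. The Python sketch below assumes the HTTP actor's results carry `company_id` and `status` fields in the same shape as the output documented above; the function name is hypothetical.

```python
def select_for_browser(companies, http_results):
    """Return the original company entries that the HTTP run flagged as JS-rendered."""
    flagged_ids = {
        result["company_id"]
        for result in http_results
        if result.get("status") == "js_rendering_suspected"
    }
    return [company for company in companies if company["company_id"] in flagged_ids]
```

The returned list can be used directly as the `companies` array of this actor's input, since both actors share the same input format.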

LLM fallback chain

Like the HTTP actor, this actor supports automatic provider fallback. Just provide API keys for the providers you want to use:

{
  "geminiApiKey": "YOUR_GEMINI_KEY",
  "llmApiKey": "YOUR_GROQ_KEY",
  "openrouterApiKey": "YOUR_OPENROUTER_KEY"
}

The actor detects which providers have an API key configured and builds a fallback chain from them (e.g. Gemini → Groq → OpenRouter). If one provider's quota runs out, the actor falls back to the next provider in the chain.
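
The fallback behaviour can be sketched in Python as follows. This is an illustration of the pattern, not the actor's code: the `QuotaExceeded` exception, the `call_provider` callback, and the fixed Gemini, Groq, OpenRouter ordering are assumptions; only the three input key names come from the example above.

```python
class QuotaExceeded(Exception):
    """Hypothetical error a provider call raises when its quota is used up."""


# Input key for each provider, in the assumed fallback order.
PROVIDER_KEYS = (
    ("gemini", "geminiApiKey"),
    ("groq", "llmApiKey"),
    ("openrouter", "openrouterApiKey"),
)


def build_chain(run_input):
    """List the providers that have a non-empty API key in the input."""
    return [name for name, key in PROVIDER_KEYS if run_input.get(key)]


def extract_with_fallback(chain, call_provider, prompt):
    """Try each provider in order; move on only when one reports an exhausted quota."""
    last_error = None
    for provider in chain:
        try:
            return provider, call_provider(provider, prompt)
        except QuotaExceeded as exc:
            last_error = exc
    raise RuntimeError("all configured providers are exhausted") from last_error
```

Only quota errors trigger a fallback in this sketch; other failures propagate to the caller so that genuine bugs are not hidden behind a silent provider switch.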

End-to-end pipeline

This actor is part of a 5-actor enrichment suite:

| Actor | Purpose | Memory | Link |
|---|---|---|---|
| Google Maps Scraper | Find companies by location | ~80MB | View |
| Website Contact Extractor | Extract contacts (HTTP) | ~256MB | View |
| Website Contact Extractor (Browser) | Extract contacts from JS pages | ~1-4GB | This actor |
| Website Job Extractor | Extract jobs (HTTP) | ~128MB | View |
| Website Job Extractor (Browser) | Extract jobs from JS pages | ~1-4GB | View |

Limitations

  • Higher memory usage (~1GB vs ~256MB for HTTP)
  • Slower execution (page rendering + wait times)
  • Higher cost per result (2x HTTP rates)
  • Use the HTTP actor first — only fall back to browser when needed