Website Contact Extractor (Browser)
Pricing
Pay per usage
Extract team contacts from JavaScript-rendered company websites (React, Vue, Angular) using AI + Playwright. Companion to the HTTP-only Website Contact Extractor. Handles the ~28% of sites that need a real browser. Same output format, same quality, same LLM fallback chain.
Developer
Alessandro Santamaria
Extract contacts from JavaScript-rendered company websites (React, Vue, Angular SPAs) using AI + Playwright.
This is the browser-based companion to the Website Contact Extractor (HTTP-only). Use this actor when the HTTP version flags companies with js_rendering_suspected status.
When to use this actor
- Team/contact/about pages built with React, Vue, Angular, or other JS frameworks
- Pages that return empty/skeleton HTML without JavaScript execution
- Companies flagged by the HTTP actor's JS-rendering detection
- Auto-chained via enablePlaywrightFallback on the HTTP actor
How it works
- Link discovery from homepage (HTTP, fast)
- Phase 1: Playwright renders main pages (team, contact, about, impressum)
- Phase 2: Playwright renders team member detail pages
- LLM extraction using Gemini Flash / Groq / OpenRouter
- Anti-hallucination validation against source HTML
- Cross-company dedup to catch LLM-invented names
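The link-discovery step can be pictured with a small standard-library sketch. This is not the actor's code: the keyword list is simply taken from the page types named in Phase 1, and `discover_candidate_pages` is a hypothetical helper name.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: keywords mirror the page types listed in Phase 1.
PAGE_KEYWORDS = ("team", "contact", "about", "impressum")


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def discover_candidate_pages(homepage_url: str, homepage_html: str) -> list:
    """Return same-site links whose path mentions a Phase 1 keyword."""
    collector = _LinkCollector()
    collector.feed(homepage_html)

    site = urlparse(homepage_url).netloc
    found = []
    for href in collector.hrefs:
        absolute = urljoin(homepage_url, href)
        parsed = urlparse(absolute)
        if parsed.netloc != site:
            continue  # skip external links and mailto:/tel: targets
        path = parsed.path.lower()
        if any(k in path for k in PAGE_KEYWORDS) and absolute not in found:
            found.append(absolute)
    return found
```

The returned URLs would then be handed to the browser for rendering. Matching on the URL path alone is a simplification, and a real implementation might also look at the link text.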
Same extraction pipeline as the HTTP actor — same output format, same quality.
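To illustrate the two safety checks listed above, here is a minimal sketch under stated assumptions. It treats "validation against source HTML" as a literal name lookup in the page text, and "cross-company dedup" as flagging names that appear under several company IDs. The actor's real rules are not documented here, and both function names are hypothetical.

```python
import re
from collections import defaultdict
from html import unescape


def _visible_text(html: str) -> str:
    """Crude tag stripper; good enough for a substring check."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", unescape(text)).lower()


def validate_against_source(contacts: list, source_html: str) -> list:
    """Keep only contacts whose name literally appears in the page text."""
    text = _visible_text(source_html)
    kept = []
    for contact in contacts:
        name = " ".join(contact["name"].lower().split())
        if name in text:
            kept.append(contact)
    return kept


def names_shared_across_companies(results: list, min_companies: int = 2) -> set:
    """Return lower-cased names that occur under several different companies."""
    seen = defaultdict(set)
    for result in results:
        for contact in result.get("contacts", []):
            seen[contact["name"].lower()].add(result["company_id"])
    return {name for name, ids in seen.items() if len(ids) >= min_companies}
```

A name that an LLM invents tends to fail the first check because it is absent from the page, and the second check catches placeholder names that recur across unrelated companies.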
Input
Same input format as the HTTP actor:
{
  "companies": [
    {
      "company_id": "abc-123",
      "company_name": "TechCorp AG",
      "website_url": "https://techcorp.ch"
    }
  ],
  "llmProvider": "gemini",
  "geminiApiKey": "YOUR_KEY"
}
Output
Each company result includes browser_extraction: true:
{
  "company_id": "abc-123",
  "company_name": "TechCorp AG",
  "contacts": [
    {
      "name": "Max Mustermann",
      "position": "CTO",
      "email": "max@techcorp.ch",
      "confidence": 0.92
    }
  ],
  "status": "success",
  "browser_extraction": true
}
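A consumer of this output will often want only the more reliable contacts. The sketch below filters on the confidence field. The 0.8 threshold and the `confident_contacts` helper are illustrative assumptions, as is the reading of confidence as a score between 0 and 1.

```python
import json


def confident_contacts(result: dict, min_confidence: float = 0.8) -> list:
    """Return contacts at or above min_confidence from a successful result."""
    if result.get("status") != "success":
        return []
    return [
        c for c in result.get("contacts", [])
        if c.get("confidence", 0.0) >= min_confidence
    ]


# The example result shown above, parsed from JSON.
sample = json.loads("""
{"company_id": "abc-123", "company_name": "TechCorp AG",
 "contacts": [{"name": "Max Mustermann", "position": "CTO",
               "email": "max@techcorp.ch", "confidence": 0.92}],
 "status": "success", "browser_extraction": true}
""")

emails = [c["email"] for c in confident_contacts(sample)]
```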
Memory requirements
- Minimum: 1024 MB (Playwright + Chrome)
- Recommended: 2048 MB for 5+ companies
- Maximum: 4096 MB
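The guidance above can be turned into a small helper when runs are started programmatically. This is only a sketch: `pick_memory_mb` is a hypothetical function, and it simply encodes the three figures listed here.

```python
from typing import Optional

MIN_MEMORY_MB = 1024          # minimum (Playwright + Chrome)
RECOMMENDED_MEMORY_MB = 2048  # recommended for 5+ companies
MAX_MEMORY_MB = 4096          # maximum


def pick_memory_mb(company_count: int, requested_mb: Optional[int] = None) -> int:
    """Choose a memory size within the documented bounds."""
    if company_count < 1:
        raise ValueError("company_count must be at least 1")
    baseline = RECOMMENDED_MEMORY_MB if company_count >= 5 else MIN_MEMORY_MB
    wanted = max(baseline, requested_mb or 0)
    return min(wanted, MAX_MEMORY_MB)
```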
Pricing
Browser-based extraction costs ~2x the HTTP actor:
| Event | Cost |
|---|---|
| browser-company-enriched | $0.02/company |
| browser-contact-result | $0.010/contact |
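As a worked example of the rates above, the sketch below estimates the event charges for a run. It assumes one browser-company-enriched event per company and one browser-contact-result event per returned contact. Any other charges are not modeled, and `estimate_cost_usd` is a hypothetical helper.

```python
from decimal import Decimal

COST_PER_COMPANY = Decimal("0.02")   # browser-company-enriched
COST_PER_CONTACT = Decimal("0.010")  # browser-contact-result


def estimate_cost_usd(companies: int, contacts: int) -> Decimal:
    """Sum the per-event charges for one run."""
    return companies * COST_PER_COMPANY + contacts * COST_PER_CONTACT


# 50 companies and 120 contacts: 50 * 0.02 + 120 * 0.010 = 2.20 USD
total = estimate_cost_usd(50, 120)
```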
Auto-chaining
The HTTP actor can automatically trigger this browser actor:
- Run the HTTP actor with enablePlaywrightFallback: true
- Companies with js_rendering_suspected status are collected
- A browser actor run starts automatically (fire-and-forget)
- The browser run ID is saved in the key-value store
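If auto-chaining is not enabled, the same hand-off can be done manually. The sketch below builds a browser-actor input from the companies the HTTP run flagged. It assumes each HTTP result carries company_id and status, and that the original input list is still available. `build_browser_input` is a hypothetical helper.

```python
def build_browser_input(input_companies: list, http_results: list,
                        llm_settings: dict) -> dict:
    """Select flagged companies and wrap them in the shared input format."""
    flagged_ids = {
        r["company_id"]
        for r in http_results
        if r.get("status") == "js_rendering_suspected"
    }
    companies = [c for c in input_companies if c["company_id"] in flagged_ids]
    # llm_settings holds keys such as llmProvider and geminiApiKey.
    return {"companies": companies, **llm_settings}
```

The resulting dictionary has the same shape as the input shown in the Input section, so it can be passed to the browser actor unchanged.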
LLM fallback chain
Like the HTTP actor, this actor supports automatic provider fallback. Just provide API keys for the providers you want to use:
{
  "geminiApiKey": "YOUR_GEMINI_KEY",
  "llmApiKey": "YOUR_GROQ_KEY",
  "openrouterApiKey": "YOUR_OPENROUTER_KEY"
}
The system auto-discovers available providers and builds a fallback chain (e.g. Gemini → Groq → OpenRouter). If one provider's quota runs out, the actor falls back to the next provider in the chain.
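The fallback behavior can be sketched as follows. This is an illustration, not the actor's implementation: the key-to-provider mapping follows the example above, while `QuotaExceeded`, `build_chain`, `extract_with_fallback`, and the `call_provider` callback are hypothetical names standing in for real provider clients.

```python
from typing import Callable, List, Tuple

# Assumption: chain order follows the example (Gemini, then Groq, then OpenRouter).
KEY_TO_PROVIDER = [
    ("geminiApiKey", "gemini"),
    ("llmApiKey", "groq"),
    ("openrouterApiKey", "openrouter"),
]


class QuotaExceeded(Exception):
    """Raised by a provider call when its quota is used up."""


def build_chain(run_input: dict) -> List[str]:
    """List the providers for which an API key was supplied."""
    return [provider for key, provider in KEY_TO_PROVIDER if run_input.get(key)]


def extract_with_fallback(prompt: str, chain: List[str],
                          call_provider: Callable[[str, str], str]) -> Tuple[str, str]:
    """Try each provider in order; return (provider, response) from the first success."""
    last_error = None
    for provider in chain:
        try:
            return provider, call_provider(provider, prompt)
        except QuotaExceeded as exc:
            last_error = exc  # move on to the next provider
    raise RuntimeError("all providers exhausted") from last_error
```

Only quota errors trigger a fallback in this sketch. Other exceptions propagate to the caller, which is one possible design choice among several.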
End-to-end pipeline
This actor is part of a 5-actor enrichment suite:
| Actor | Purpose | Memory | Link |
|---|---|---|---|
| Google Maps Scraper | Find companies by location | ~80MB | View |
| Website Contact Extractor | Extract contacts (HTTP) | ~256MB | View |
| Website Contact Extractor (Browser) | Extract contacts from JS pages | ~1-4GB | This actor |
| Website Job Extractor | Extract jobs (HTTP) | ~128MB | View |
| Website Job Extractor (Browser) | Extract jobs from JS pages | ~1-4GB | View |
Limitations
- Higher memory usage (~1GB vs ~256MB for HTTP)
- Slower execution (page rendering + wait times)
- Higher cost per result (2x HTTP rates)
- Use the HTTP actor first — only fall back to browser when needed