Website Job Extractor (Browser)
Pricing
Pay per usage
Extract job listings from JavaScript-rendered career pages (React, Vue, Angular) using AI + Playwright. Companion to the HTTP-only Website Job Extractor. Use it for the ~28% of company sites that need a real browser. Same output format, same quality, same LLM fallback chain.
Developer

Alessandro Santamaria
Extract job listings from JavaScript-rendered career pages (React, Vue, Angular SPAs) using AI + Playwright.
This is the browser-based companion to the Website Job Extractor (HTTP-only). Use this actor when the HTTP version flags companies with js_rendering_suspected: true.
When to use this actor
- Career pages built with React, Vue, Angular, or other JS frameworks
- Pages that return empty/skeleton HTML without JavaScript execution
- Companies flagged by the HTTP actor's JS-rendering detection
- Auto-chained via enablePlaywrightFallback on the HTTP actor
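The "empty/skeleton HTML" case can be illustrated with a small heuristic. This is only a sketch under stated assumptions: the function name `looks_js_rendered` and the 200-character threshold are hypothetical, and the HTTP actor's actual JS-rendering detection is not documented here and may work differently.

```python
from html.parser import HTMLParser


class _VisibleText(HTMLParser):
    """Collects text found outside <script>, <style>, and <noscript> tags."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped tag.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def looks_js_rendered(html: str, min_chars: int = 200) -> bool:
    """Hypothetical heuristic: very little visible text suggests a JS-only shell.

    The threshold is an assumption chosen for illustration, not a value
    taken from the actor.
    """
    parser = _VisibleText()
    parser.feed(html)
    parser.close()
    visible = " ".join(parser.chunks)
    return len(visible) < min_chars
```

A page that ships only a `<div id="root"></div>` plus a script bundle would be flagged, while a server-rendered listing with a few paragraphs of text would not.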
How it works
- Playwright renders the full page (waits for network idle + text content)
- Career page discovery from homepage navigation (same as HTTP actor)
- ATS detection for 19 systems (Personio, Greenhouse, Softgarden, etc.)
- LLM extraction using Gemini Flash / Groq / OpenRouter
- Validation with confidence scoring and deduplication
- Pagination follow-up for multi-page listings
Same extraction pipeline as the HTTP actor — same output format, same quality.
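To make the validation and deduplication step more concrete, here is a minimal sketch. It assumes each extracted job carries the `title`, `location`, and `confidence` fields shown in the output format; the 0.5 cutoff, the normalization rule, and the function name `dedupe_jobs` are assumptions for illustration, not the actor's actual logic.

```python
def dedupe_jobs(jobs: list, min_confidence: float = 0.5) -> list:
    """Drop low-confidence items, then keep the best item per (title, location).

    Keys are compared case-insensitively with whitespace collapsed.
    """
    best = {}
    for job in jobs:
        confidence = job.get("confidence", 0.0)
        if confidence < min_confidence:
            continue  # assumed cutoff; the real threshold is not documented
        key = (
            " ".join(job.get("title", "").lower().split()),
            " ".join(job.get("location", "").lower().split()),
        )
        # Keep the highest-confidence item seen for this key.
        if key not in best or confidence > best[key].get("confidence", 0.0):
            best[key] = job
    return list(best.values())
```

Keeping the highest-confidence duplicate is one reasonable policy; keeping the first occurrence would be equally defensible when page order matters.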
Input
Same input format as the HTTP actor. Typically auto-chained:
{
  "companies": [
    {
      "company_id": "abc-123",
      "company_name": "TechCorp AG",
      "website_url": "https://techcorp.ch"
    }
  ],
  "llmProvider": "gemini",
  "geminiApiKey": "YOUR_KEY"
}
Output
Each job is a dataset item with browser_extraction: true:
{
  "company_id": "abc-123",
  "company_name": "TechCorp AG",
  "title": "Senior Frontend Developer (m/w/d)",
  "location": "Zürich",
  "employment_type": "Vollzeit",
  "department": "Engineering",
  "application_url": "https://techcorp.ch/jobs/apply/123",
  "confidence": 0.85,
  "browser_extraction": true,
  "extracted_at": "2026-03-09T10:00:00.000Z"
}
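Downstream code may want to load such an item into a typed record. The sketch below is one way to do that; the `JobItem` class and `parse_job_item` helper are hypothetical, and treating `employment_type` and `department` as optional is an assumption, since the source does not say which fields are always present.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class JobItem:
    company_id: str
    company_name: str
    title: str
    location: str
    application_url: str
    confidence: float
    browser_extraction: bool
    extracted_at: datetime
    employment_type: Optional[str] = None  # assumed optional
    department: Optional[str] = None       # assumed optional


def parse_job_item(raw: dict) -> JobItem:
    """Convert one dataset item (a plain dict) into a JobItem."""
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    timestamp = datetime.fromisoformat(raw["extracted_at"].replace("Z", "+00:00"))
    return JobItem(
        company_id=raw["company_id"],
        company_name=raw["company_name"],
        title=raw["title"],
        location=raw["location"],
        application_url=raw["application_url"],
        confidence=float(raw["confidence"]),
        browser_extraction=bool(raw["browser_extraction"]),
        extracted_at=timestamp,
        employment_type=raw.get("employment_type"),
        department=raw.get("department"),
    )
```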
Memory requirements
- Minimum: 1024 MB (Playwright + Chrome)
- Recommended: 2048 MB for 5+ companies
- Maximum: 4096 MB
Pricing
Browser-based extraction costs ~2x the HTTP actor due to Chrome overhead:
| Event | Cost |
|---|---|
| browser-company-enriched | $0.02/company |
| browser-job-result | $0.008/job |
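As a rough budgeting aid, the two event prices can be combined into a simple estimate. This sketch assumes one `browser-company-enriched` event per processed company and one `browser-job-result` event per extracted job; it does not cover LLM provider fees or any other platform charges, and the function name is hypothetical.

```python
from decimal import Decimal

COMPANY_RATE = Decimal("0.02")  # browser-company-enriched, USD per company
JOB_RATE = Decimal("0.008")     # browser-job-result, USD per job


def estimate_cost_usd(companies: int, jobs: int) -> Decimal:
    """Estimate event charges for a run; Decimal avoids float rounding drift."""
    if companies < 0 or jobs < 0:
        raise ValueError("counts must be non-negative")
    return companies * COMPANY_RATE + jobs * JOB_RATE
```

For example, 50 companies yielding 300 jobs comes to 50 × $0.02 + 300 × $0.008 = $3.40 in event charges.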
Auto-chaining
The HTTP actor can automatically trigger this browser actor for JS-flagged companies:
- Run the HTTP actor with enablePlaywrightFallback: true
- Companies with js_rendering_suspected are collected
- A browser actor run starts automatically (fire-and-forget)
- The browser run ID is saved in the key-value store as BROWSER_FALLBACK_RUN_ID
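If you prefer to chain the two actors yourself, the collection step can be sketched as below. It assumes the HTTP actor's results are available as a list of dicts that each carry `company_id`, `company_name`, `website_url`, and the `js_rendering_suspected` flag; that record shape and the function name `build_browser_input` are assumptions, not a documented contract.

```python
from typing import Optional


def build_browser_input(http_results: list, llm_settings: dict) -> Optional[dict]:
    """Collect JS-flagged companies into an input payload for the browser actor.

    Returns None when nothing was flagged, so no browser run needs to start.
    """
    flagged = [
        {
            "company_id": company["company_id"],
            "company_name": company["company_name"],
            "website_url": company["website_url"],
        }
        for company in http_results
        if company.get("js_rendering_suspected") is True
    ]
    if not flagged:
        return None
    # Reuse the same LLM settings (provider, API keys) as the HTTP run.
    return {"companies": flagged, **llm_settings}
```

Returning `None` for an empty list keeps the caller from paying for a browser run that has nothing to process.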
LLM fallback chain
Like the HTTP actor, this actor supports automatic provider fallback. Just provide API keys for the providers you want to use:
{
  "geminiApiKey": "YOUR_GEMINI_KEY",
  "llmApiKey": "YOUR_GROQ_KEY",
  "openrouterApiKey": "YOUR_OPENROUTER_KEY"
}
The system auto-discovers available providers and builds a fallback chain (e.g. Gemini → Groq → OpenRouter). If one provider's quota runs out, it instantly falls back to the next.
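The fallback behaviour can be pictured with a short sketch. The input key names and the Gemini, Groq, OpenRouter order come from the example above; everything else (the `QuotaExceeded` exception, the `call` signature, and the function names) is hypothetical and stands in for whatever the actor does internally.

```python
from typing import Callable

# (provider name, input key) pairs, in the order of the example chain.
PROVIDER_KEYS = [
    ("gemini", "geminiApiKey"),
    ("groq", "llmApiKey"),
    ("openrouter", "openrouterApiKey"),
]


class QuotaExceeded(Exception):
    """Hypothetical error raised when a provider's quota is exhausted."""


def build_chain(actor_input: dict) -> list:
    """Return the providers that have a non-empty API key, in fallback order."""
    return [name for name, key in PROVIDER_KEYS if actor_input.get(key)]


def extract_with_fallback(chain: list, call: Callable[[str, str], str], prompt: str):
    """Try each provider in turn; move on only when its quota is exhausted."""
    for provider in chain:
        try:
            return provider, call(provider, prompt)
        except QuotaExceeded:
            continue  # fall through to the next provider in the chain
    raise RuntimeError("all configured providers are exhausted")
```

Only quota errors trigger a fallback in this sketch; other failures propagate to the caller, which keeps genuine bugs from being masked by a silent provider switch.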
End-to-end pipeline
This actor is part of a 5-actor enrichment suite:
| Actor | Purpose | Memory | Link |
|---|---|---|---|
| Google Maps Scraper | Find companies by location | ~80MB | View |
| Website Job Extractor | Extract jobs (HTTP) | ~128MB | View |
| Website Job Extractor (Browser) | Extract jobs from JS pages | ~1-4GB | This actor |
| Website Contact Extractor | Extract contacts (HTTP) | ~256MB | View |
| Website Contact Extractor (Browser) | Extract contacts from JS pages | ~1-4GB | View |
Limitations
- Higher memory usage (~1GB vs ~128MB for HTTP)
- Slower execution (page rendering + wait times)
- Higher cost per result (2x HTTP rates)
- Use the HTTP actor first — only fall back to browser when needed