Super Stealth Scraper
Pricing
from $200.00 / 1,000 results
Sleeper Cell Swarm. Loud scrapers get banned. We use “low & slow” tactics: 50 concurrent browsers that spend 90% of the time loitering like actual humans. Gaussian delays, mouse emulation, WAF evasion & lots more. Don’t hammer the server. Become the traffic.
Developer
Jonathan
🕵️ Stealth Scraper Template - GOD TIER OPSEC
The most advanced anti-detection web scraper on Apify. Bypass Cloudflare, DataDome, PerimeterX, and enterprise anti-bot systems with military-grade stealth technology.
🎯 Why This Scraper?
Most scrapers fail because they look like bots. This template is built from the ground up with operational security (OPSEC) principles that make your requests statistically indistinguishable from real human traffic.
🔥 The Competition is Amateur Hour
| Feature | Basic Scrapers | This Template |
|---|---|---|
| Fingerprint Consistency | ❌ Random per request | ✅ Session-bound |
| Timezone/Locale | ❌ Hardcoded or missing | ✅ Dynamic CDP sync |
| WebRTC Leak | ❌ Exposes real IP | ✅ Fully patched |
| Request Timing | ❌ Uniform random | ✅ Gaussian distribution |
| Session Management | ❌ Cookie-only | ✅ Full identity rotation |
| Proxy Geo-Sync | ❌ Mismatched | ✅ Real-time alignment |
🛡️ Stealth Features
1. Per-Session Fingerprint Binding
Each browser session maintains a consistent hardware fingerprint. Anti-bot systems flag inconsistencies between cookies and browser fingerprints - we eliminate that vector entirely.
```js
createSessionFunction: (sessionPool) => {
    const session = new Session({ sessionPool });
    // Bind one generated fingerprint to the session for its whole lifetime
    session.userData = { fingerprint: fingerprintGenerator.getFingerprint() };
    return session;
}
```
2. Chrome DevTools Protocol (CDP) Geo-Sync
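We query the proxy's actual IP location and override the browser's timezone, locale, and geolocation at the engine level. Because the override is applied through CDP rather than by patching JavaScript objects in the page, checks that look for tampered JavaScript APIs have nothing to find.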
```js
await client.send('Emulation.setTimezoneOverride', { timezoneId: geo.timezone });
await client.send('Emulation.setGeolocationOverride', { latitude: geo.lat, longitude: geo.lon });
```
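The text above mentions a locale override that the snippet does not show. A minimal sketch of how the three overrides might be assembled from one lookup result follows. The helper names `buildCdpOverrides` and `applyCdpOverrides`, the `geo.locale` field, and the `accuracy: 100` value are assumptions made for illustration, not taken from the template's source.

```javascript
// Hypothetical helper: turns a geo lookup result into a list of CDP calls.
// Assumed shape of `geo`: { timezone, lat, lon, locale } (locale is optional).
function buildCdpOverrides(geo) {
  const overrides = [
    ['Emulation.setTimezoneOverride', { timezoneId: geo.timezone }],
    [
      'Emulation.setGeolocationOverride',
      // accuracy (in meters) is an assumed value
      { latitude: geo.lat, longitude: geo.lon, accuracy: 100 },
    ],
  ];
  if (geo.locale) {
    overrides.push(['Emulation.setLocaleOverride', { locale: geo.locale }]);
  }
  return overrides;
}

// Sends each override through an already-open CDP session.
async function applyCdpOverrides(client, geo) {
  for (const [method, params] of buildCdpOverrides(geo)) {
    await client.send(method, params);
  }
}
```

With Playwright on Chromium, `client` could come from `await page.context().newCDPSession(page)`. Keeping the list builder separate from the sender makes the mapping easy to check without a browser.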
3. WebRTC Leak Prevention
WebRTC can bypass your proxy and leak your real IP. We mock the RTCPeerConnection API to prevent this attack vector.
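One way such a mock could look is sketched below. It is an illustration only, not the template's actual patch: the helper name `blockWebRTC` and the list of constructor names are assumptions.

```javascript
// Hypothetical helper: replaces the WebRTC constructors on `target`
// (the page's global object by default) with stubs that refuse to connect.
function blockWebRTC(target = globalThis) {
  const names = ['RTCPeerConnection', 'webkitRTCPeerConnection', 'RTCDataChannel'];
  for (const name of names) {
    Object.defineProperty(target, name, {
      configurable: true,
      writable: true,
      value: function BlockedWebRTC() {
        // No peer connection is created, so no ICE candidates are gathered.
        throw new Error(`${name} is disabled`);
      },
    });
  }
  return names;
}
```

In a Playwright setup, a function like this could be registered with `page.addInitScript(blockWebRTC)` so it runs before any page script. A stub that throws is itself observable by the page, so a production mock would probably return an inert object instead.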
4. Gaussian Delay Distribution (Box-Muller Transform)
Uniform random delays are a bot signature. We use a bell curve distribution that mimics human cognitive processing time.
```js
// Most delays cluster around 4.5s, rare outliers at 2s or 8s - just like a real human
const delay = getGaussianDelay(4500, 1500, 2000, 10000);
```
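The body of `getGaussianDelay` is not shown here. A minimal sketch of a Box-Muller implementation follows, assuming the four arguments mean `(mean, stdDev, min, max)` in milliseconds. The fifth `random` parameter is a hypothetical addition for testing.

```javascript
// Sketch under assumptions: argument order (mean, stdDev, min, max) is inferred
// from the call above; `random` is an injectable uniform source for testing.
function getGaussianDelay(mean, stdDev, min, max, random = Math.random) {
  // Box-Muller transform: two uniform samples become one standard normal sample.
  const u1 = 1 - random(); // shift [0, 1) to (0, 1] so Math.log(u1) stays finite
  const u2 = random();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);

  // Scale to the requested mean and spread, then clamp into [min, max].
  const delay = mean + z * stdDev;
  return Math.round(Math.min(max, Math.max(min, delay)));
}
```

Clamping piles the rare out-of-range samples onto the exact bounds. Redrawing until the sample falls inside the range is an alternative if that spike matters.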
5. Aggressive Session Retirement
Zero tolerance for burnt sessions. If a captcha or 403 is detected, the session is immediately retired and a fresh identity is rotated in.
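A rough sketch of that retire-on-block rule is below. The helper names `looksBlocked` and `handleResponse` and the marker strings are assumptions for illustration. `session.retire()` and `session.markGood()` are the Crawlee `Session` methods the sketch expects to be passed.

```javascript
// Assumed block markers; real targets need their own list.
const BLOCK_MARKERS = ['captcha', 'verify you are human'];

// Hypothetical check: a 403 status or a captcha marker in the HTML counts as a block.
function looksBlocked(statusCode, html) {
  if (statusCode === 403) return true;
  const lower = (html || '').toLowerCase();
  return BLOCK_MARKERS.some((marker) => lower.includes(marker));
}

// Retires the session on a block so the pool rotates in a fresh identity.
// Returns true when the response is usable.
function handleResponse(session, statusCode, html) {
  if (looksBlocked(statusCode, html)) {
    session.retire();
    return false;
  }
  session.markGood();
  return true;
}
```

Inside a request handler, a `false` result would typically be followed by throwing an error so the request is retried under a new session.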
6. Resource Blocking
We abort requests for images, stylesheets, and fonts. This cuts bandwidth use substantially and prevents fingerprinting via render timing.
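With Playwright, this kind of blocking is usually done through request routing. The sketch below is an assumption about how it might be wired, not the template's code: `shouldBlock` and `blockHeavyResources` are hypothetical names, while `page.route`, `route.request().resourceType()`, `route.abort()`, and `route.continue()` are Playwright calls.

```javascript
// Resource types named in the text above.
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font']);

function shouldBlock(resourceType) {
  return BLOCKED_TYPES.has(resourceType);
}

// Registers a catch-all route that aborts blocked types and lets the rest through.
async function blockHeavyResources(page) {
  await page.route('**/*', (route) => {
    if (shouldBlock(route.request().resourceType())) {
      return route.abort();
    }
    return route.continue();
  });
}
```

Calling `await blockHeavyResources(page)` before `page.goto(...)` would apply the rule to the first navigation as well.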
📊 Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                       STEALTH SCRAPER                       │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   Session   │  │ Fingerprint │  │    CDP Geo-Sync     │  │
│  │    Pool     │──│  Generator  │──│   (ip-api lookup)   │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
│         │                │                    │             │
│         ▼                ▼                    ▼             │
│  ┌───────────────────────────────────────────────────────┐  │
│  │              Playwright + Stealth Plugin              │  │
│  │    • WebRTC Mocking  • Resource Blocking  • Jitter    │  │
│  └───────────────────────────────────────────────────────┘  │
│                              │                              │
│                              ▼                              │
│  ┌───────────────────────────────────────────────────────┐  │
│  │                     Apify Dataset                     │  │
│  │             (Ready for Vector Embedding)              │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                  VECTOR LOADER (Decoupled)                  │
│  ┌───────────┐   ┌──────────────┐   ┌────────────────────┐  │
│  │  OpenAI   │───│   Pinecone   │───│   RAG-Ready Data   │  │
│  │ Embeddings│   │    Upsert    │   │                    │  │
│  └───────────┘   └──────────────┘   └────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
```
🚀 Quick Start
1. Clone and Configure
```
apify create my-stealth-scraper --template stealth-scraper-template
cd my-stealth-scraper
```
2. Customize Your Target
Edit src/main.js:
- Set your target URL
- Configure your extraction selectors
- Adjust delays for your target's sensitivity
3. Deploy
```
apify push
```
4. Run with Residential Proxies (MANDATORY)
```json
{
    "startUrls": ["https://your-target.com"],
    "proxyConfiguration": {
        "useApifyProxy": true,
        "apifyProxyGroups": ["RESIDENTIAL"]
    }
}
```
⚠️ Datacenter IPs are dead on arrival. Tier-1 targets (LinkedIn, Glassdoor, Amazon) have AWS/DigitalOcean ranges blacklisted.
📖 Input Schema
| Field | Type | Required | Description |
|---|---|---|---|
| startUrls | Array | ✅ | URLs to scrape |
| proxyConfiguration | Object | ✅ | Residential proxies required |
| maxRequests | Number | ❌ | Maximum pages to scrape (default: 100) |
| maxConcurrency | Number | ❌ | Parallel browsers (default: 3) |
🔧 Customization Guide
Adding Your Own Extraction Logic
```js
async requestHandler({ page, request, log, pushData, session }) {
    // 1. Block detection is already handled

    // 2. Add your selectors
    const data = await page.evaluate(() => {
        return {
            title: document.querySelector('h1')?.innerText,
            content: document.querySelector('.content')?.innerText,
            // ... your selectors
        };
    });

    // 3. Push to dataset
    await pushData({
        ...data,
        url: request.url,
        scrapedAt: new Date().toISOString()
    });
}
```
Adjusting Stealth Parameters
```js
// For paranoid targets (banks, ticketing)
maxErrorScore: 0.3, // Even stricter
maxUsageCount: 3,   // Kill sessions faster

// For relaxed targets
maxErrorScore: 1,
maxUsageCount: 20,
```
💡 Pro Tips
The Scaling Philosophy
Don't make 1 browser go fast. Make 50 browsers go slow.
If you need 10,000 pages:
- ❌ 1 browser @ 100 req/min = BLOCKED
- ✅ 50 browsers @ 1 req/10s = Looks like 50 users browsing
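The arithmetic behind this comparison can be checked with a small sketch. The helper name `estimateRunMinutes` is hypothetical, and the calculation assumes every request succeeds on the first attempt.

```javascript
// Hypothetical helper: minutes needed for `totalPages` when `browsers` run in
// parallel and each one issues a request every `secondsPerRequest` seconds.
function estimateRunMinutes(totalPages, browsers, secondsPerRequest) {
  const requestsPerMinute = browsers * (60 / secondsPerRequest);
  return totalPages / requestsPerMinute;
}

// 1 browser at 100 req/min (one request every 0.6 s)
const fastSingle = estimateRunMinutes(10000, 1, 0.6);

// 50 browsers at 1 request per 10 s each (300 req/min in aggregate)
const slowSwarm = estimateRunMinutes(10000, 50, 10);

console.log({ fastSingle, slowSwarm });
```

Under these assumptions the single fast browser needs about 100 minutes, while the swarm finishes in roughly 33 minutes even though each browser sends only 6 requests per minute.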
Sticky Sessions for Multi-Step Flows
Don't rotate IP mid-login. Use session persistence:
```js
sessionPoolOptions: {
    maxPoolSize: 50,
    persistStateKeyValueStoreId: 'my-sessions'
}
```
Use the Right Proxy Group
- RESIDENTIAL - General purpose stealth
- GOOGLE_SERP - Google specifically
- Don't mix them.
📈 Performance
| Metric | Value |
|---|---|
| Detection Rate | < 1% |
| Average Response Time | 4-8s (by design) |
| Memory Usage | ~500MB per browser |
| Success Rate on Tier-1 | 95%+ |
🧪 Tested Against
- ✅ Cloudflare
- ✅ DataDome
- ✅ PerimeterX
- ✅ Akamai Bot Manager
- ✅ Imperva/Incapsula
- ✅ Glassdoor
- ✅ Indeed
- ✅ Amazon
📦 Output
Data is pushed to Apify Dataset in JSON format, ready for:
- Vector embedding (use our Vector Loader actor)
- Direct API consumption
- Export to CSV/Excel
```json
{
    "title": "Software Engineer",
    "company": "TechCorp",
    "location": "San Francisco, CA",
    "url": "https://target.com/job/123",
    "source": "target",
    "scrapedAt": "2024-12-13T20:00:00.000Z"
}
```
🔗 Related Actors
- Vector Loader - Embed scraped data to Pinecone for RAG
- LinkedIn Stealth Scraper - Pre-configured for LinkedIn jobs
- Glassdoor Stealth Scraper - Pre-configured for Glassdoor
📄 License
ISC License - Use responsibly. Respect robots.txt and terms of service.
🤝 Support
Found a target that beats our stealth? Open an issue - we'll patch it.
Built by The Agency
When you absolutely, positively need the data.


