
Apify Scrappey
Pricing
Pay per usage

A template for scraping data from web pages using the Scrappey.com API service integrated with an Apify Actor. The Actor is designed for complex scraping scenarios, including sites behind anti-bot protection such as Cloudflare, Datadome, PerimeterX, and similar systems.
Apify Scrappey Actor
A powerful web scraping solution that combines Apify's actor infrastructure with Scrappey's advanced anti-detection capabilities. This actor helps you scrape any website while bypassing common anti-bot protections like Cloudflare, Datadome, and PerimeterX.
🚀 Key Features
- Advanced Protection Bypass - Handles Cloudflare, Datadome, PerimeterX, and other anti-bot systems
- Session Management - Maintains persistent browser sessions for efficient scraping
- Smart Proxy Rotation - Automatic proxy management with country-specific options
- Browser Fingerprint Randomization - Prevents detection through browser fingerprinting
- Comprehensive Data Extraction - Captures HTML, cookies, headers, and more
- Error Handling - Robust error handling with detailed error codes and messages
📋 Input Options
    {
      "scrappeyApiKey": "your-api-key",
      "url": "https://example.com",
      "requestType": "browser",   // "browser" or "request"
      "customHeaders": {},        // Custom HTTP headers
      "browserActions": [],       // Automated browser actions
      "session": null,            // Session ID for persistent browsing
      "proxyCountry": null,       // Specific country for proxy
      "cookiejar": null,          // Pre-set cookies
      "includeImages": false,     // Include image URLs in response
      "includeLinks": false       // Include link URLs in response
    }
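The options above can be assembled and sanity-checked before a run. The following is a minimal sketch that uses only the field names and the two `requestType` values listed above. `build_input` is a hypothetical helper, not part of the actor or of Scrappey's API, and it assumes that the API key and an absolute URL are required.

```python
from typing import Any, Optional

ALLOWED_REQUEST_TYPES = {"browser", "request"}  # the two values listed above


def build_input(
    api_key: str,
    url: str,
    request_type: str = "browser",
    session: Optional[str] = None,
    proxy_country: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build an input object with the documented field names."""
    # Assumption: the key and an absolute URL are mandatory.
    if not api_key:
        raise ValueError("scrappeyApiKey is required")
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"url must be absolute: {url!r}")
    if request_type not in ALLOWED_REQUEST_TYPES:
        raise ValueError(f"requestType must be one of {sorted(ALLOWED_REQUEST_TYPES)}")
    return {
        "scrappeyApiKey": api_key,
        "url": url,
        "requestType": request_type,
        "customHeaders": {},
        "browserActions": [],
        "session": session,
        "proxyCountry": proxy_country,
        "cookiejar": None,
        "includeImages": False,
        "includeLinks": False,
    }
```

Serializing the result with `json.dumps` gives a document that can be pasted into the Apify console or saved as a local input file.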
📦 Output Data Structure
The actor stores the following data in the Apify dataset:
    {
      "url": "scraped-url",
      "verified": true,                  // Request verification status (true or false)
      "cookieString": "cookie-string",   // Formatted cookie string
      "responseHeaders": {},             // Response HTTP headers
      "requestHeaders": {},              // Request HTTP headers
      "html": "page-html",               // Raw HTML content
      "innerText": "page-text",          // Page text content
      "cookies": [],                     // Array of cookies
      "ipInfo": {},                      // IP information
      "status": 200,                     // HTTP status code
      "timeElapsed": "1.2s",             // Request duration
      "session": "session-id",           // Session identifier
      "localStorage": {},                // Browser localStorage data
      "timestamp": "ISO-date"            // Timestamp of scrape
    }
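The `cookies` array can be turned back into a `Cookie` header value for follow-up requests. The sketch below assumes each cookie is an object with `name` and `value` keys, as in the `cookiejar` input example in this README. The exact format of the actor's own `cookieString` field is not documented here, so treat this as an illustration rather than a reimplementation.

```python
from typing import Any, Iterable, Mapping


def cookie_header(cookies: Iterable[Mapping[str, Any]]) -> str:
    """Join cookie objects into a single 'name=value; name=value' string."""
    pairs = []
    for cookie in cookies:
        # Assumption: every cookie carries "name" and "value" keys.
        name, value = cookie.get("name"), cookie.get("value")
        if name is None or value is None:
            continue  # skip malformed entries instead of failing the whole item
        pairs.append(f"{name}={value}")
    return "; ".join(pairs)


# Sample dataset item (made-up values in the documented shape).
item = {
    "status": 200,
    "cookies": [
        {"name": "sessionId", "value": "abc123", "domain": "example.com", "path": "/"},
        {"name": "theme", "value": "dark"},
    ],
}
print(cookie_header(item["cookies"]))  # sessionId=abc123; theme=dark
```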
🛠️ Common Use Cases
- E-commerce Scraping
  - Product details from protected stores
  - Price monitoring
  - Inventory tracking
- Login-Protected Content
  - Session management for authenticated scraping
  - Cookie handling for maintaining login state
- Anti-Bot Protected Sites
  - Cloudflare challenge bypass
  - Datadome protection handling
  - PerimeterX mitigation
💡 Usage Examples
Basic Scraping
    {
      "scrappeyApiKey": "your-api-key",
      "url": "https://example.com",
      "requestType": "browser"
    }
Session-Based Scraping
    {
      "scrappeyApiKey": "your-api-key",
      "url": "https://example.com",
      "requestType": "browser",
      "session": "my-session-id",
      "cookiejar": [
        {
          "name": "sessionId",
          "value": "abc123",
          "domain": "example.com",
          "path": "/"
        }
      ]
    }
Geo-Targeted Scraping
    {
      "scrappeyApiKey": "your-api-key",
      "url": "https://example.com",
      "proxyCountry": "UnitedStates",
      "includeImages": true,
      "includeLinks": true
    }
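Inputs like these can also be submitted programmatically. The sketch below is written against the interface of the `apify-client` Python package (`client.actor(...).call(run_input=...)` and `client.dataset(...).iterate_items()`). The function name `scrape`, the Actor ID, and the token shown in the usage comment are placeholders and assumptions, not values confirmed by this README.

```python
from typing import Any


def scrape(client: Any, actor_id: str, run_input: dict) -> list:
    """Run the Actor once and return the items stored in its default dataset.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client`; token and Actor ID are placeholders):
#   from apify_client import ApifyClient
#   items = scrape(
#       ApifyClient("your-apify-token"),
#       "your-username/apify-scrappey",
#       {"scrappeyApiKey": "your-api-key", "url": "https://example.com"},
#   )
```

Passing the client in as an argument keeps the function easy to exercise with a stub, which is useful when testing without spending platform credits.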
⚠️ Error Handling
The actor handles common error scenarios:
| Code | Description | Solution |
|---|---|---|
| CODE-0001 | Server overload | Retry with backoff |
| CODE-0002 | Cloudflare blocked | Try different proxy |
| CODE-0010 | Datadome blocked | Change proxy country |
| CODE-0029 | Too many sessions | Wait for session cleanup |
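The table above maps naturally onto a lookup that a caller can use to decide what to do next. This is a minimal sketch, assuming the error code appears as a string such as `CODE-0010` somewhere in an error message. Where the actor actually reports the code is not specified here, and `suggest_action` is a hypothetical helper.

```python
import re
from typing import Optional

# Codes and remedies copied from the table above.
REMEDIES = {
    "CODE-0001": "Retry with backoff",
    "CODE-0002": "Try different proxy",
    "CODE-0010": "Change proxy country",
    "CODE-0029": "Wait for session cleanup",
}

CODE_PATTERN = re.compile(r"CODE-\d{4}")


def suggest_action(error_message: str) -> Optional[str]:
    """Return the documented remedy for the first known code in the message."""
    for code in CODE_PATTERN.findall(error_message):
        if code in REMEDIES:
            return REMEDIES[code]
    return None  # unknown or missing code: the caller decides what to do


print(suggest_action("Request failed: CODE-0010 Datadome blocked"))  # Change proxy country
```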
🚦 Best Practices
- Session Management
  - Use persistent sessions for related requests
  - Clean up sessions when done using `sessions.destroy`
- Proxy Usage
  - Rotate proxies for high-volume scraping
  - Use country-specific proxies for geo-restricted content
- Error Handling
  - Implement exponential backoff for retries
  - Monitor error rates by URL
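The exponential backoff recommended above can be sketched as a small wrapper. This is an illustration under assumptions: `retry_with_backoff` and its parameters are hypothetical names, and the `operation` callable stands in for whatever function starts a run and raises an exception on failure.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter (50-100% of the delay) spreads out simultaneous retries.
            sleep(delay * random.uniform(0.5, 1.0))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the waiting can be replaced in tests; in normal use the default `time.sleep` applies. A real caller would likely retry only on errors that are worth retrying, such as server overload, rather than on every exception.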
📚 Getting Started
- Setup

      git clone https://github.com/yourusername/apify-scrappey
      cd apify-scrappey
      npm install

- Configuration
  - Get your Scrappey API key from scrappey.com
  - Set up your input.json in the Apify console or locally
- Running Locally

      apify run

- Deployment

      apify login
      apify push
🆘 Support
- Technical issues: Create GitHub Issue
- Scrappey API: Scrappey Support
- Apify Platform: Apify Discord
📄 License
ISC License - Feel free to use this actor for your scraping needs!
Pricing
Pricing model
Pay per usage
This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage.