Xcrawl Search Scrape Actor
Pricing
Pay per usage
Under maintenance
Developer
Charles
XCrawl Web Search & Scrape — Apify Actor
Search the web and scrape any URL using XCrawl's residential proxy network. Bypass anti-bot systems with automatic JS rendering fallback and global IP rotation.
Actor: yanxvdong123/xcrawl-search-scrape | Runtime: Node.js 22 | License: MIT
🚀 Quick Start
- Open the Actor Console
- Set `XCRAWL_API_KEY` in Environment Variables (get a free key at dash.xcrawl.com)
- Choose Search or Scrape mode, fill in the inputs
- Hit Run
No credit card needed — XCrawl gives free trial credits on signup.
📋 Input Parameters
Search Mode (action: "search")
| Parameter | Type | Default | Description |
|---|---|---|---|
| `query` | string | required | Web search query (max 200 chars) |
| `limit` | integer | 10 | Number of results (1–50) |
| `location` | string | "US" | Geo-location code (US, UK, CN, JP, DE, etc.) |
| `language` | string | "en" | Search language (en, zh, ja, fr, etc.) |
| `withContent` | boolean | true | Fetch full page content for each result |
| `render` | boolean | false | JS rendering for anti-bot bypass |
| `formats` | string | "markdown,summary" | Output formats: comma-separated (markdown, summary, html) |
| `screenshot` | boolean | false | Capture page screenshot (requires render=true) |
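The defaults and limits in the table can be checked before a run is started. The following is a minimal sketch, assuming the Actor accepts these fields as a flat JSON input object; `buildSearchInput` and `SEARCH_DEFAULTS` are hypothetical names used for illustration and are not part of the Actor.

```javascript
// Hypothetical helper: applies the documented defaults and limits for search mode.
// Assumption: the Actor takes these fields as a flat JSON input object.
const SEARCH_DEFAULTS = {
  limit: 10,
  location: 'US',
  language: 'en',
  withContent: true,
  render: false,
  formats: 'markdown,summary',
  screenshot: false,
};

function buildSearchInput(options = {}) {
  // Caller options override the defaults; the action is always forced to "search".
  const input = { ...SEARCH_DEFAULTS, ...options, action: 'search' };

  if (typeof input.query !== 'string' || input.query.trim() === '') {
    throw new Error('query is required');
  }
  if (input.query.length > 200) {
    throw new Error('query must be at most 200 characters');
  }
  if (!Number.isInteger(input.limit) || input.limit < 1 || input.limit > 50) {
    throw new Error('limit must be an integer between 1 and 50');
  }
  if (input.screenshot && !input.render) {
    throw new Error('screenshot requires render=true');
  }
  return input;
}

console.log(buildSearchInput({ query: 'residential proxy providers', limit: 5 }));
```

Validating locally like this avoids spending a run on an input the Actor would reject.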
Scrape Mode (action: "scrape")
| Parameter | Type | Default | Description |
|---|---|---|---|
| `url` | string | required | Single URL to scrape (max 2000 chars) |
| `render` | boolean | false | JS rendering for anti-bot bypass |
| `formats` | string | "markdown,summary" | Output formats |
| `screenshot` | boolean | false | Capture screenshot (requires render=true) |
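Two constraints in scrape mode are easy to get wrong: `formats` is a comma-separated string, and `screenshot` only works together with `render`. The sketch below checks both. It assumes scrape mode accepts the same format names listed for search mode (markdown, summary, html), which the table above does not state explicitly; `parseFormats` and `buildScrapeInput` are hypothetical helpers.

```javascript
// Hypothetical helpers for scrape mode; not part of the Actor's code.
// Assumption: scrape mode supports the same formats as search mode.
const KNOWN_FORMATS = ['markdown', 'summary', 'html'];

// Splits the comma-separated formats string and rejects unknown entries.
function parseFormats(formats = 'markdown,summary') {
  const list = formats
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean);
  const unknown = list.filter((name) => !KNOWN_FORMATS.includes(name));
  if (list.length === 0 || unknown.length > 0) {
    throw new Error(`unsupported formats: ${unknown.join(', ') || '(none given)'}`);
  }
  return [...new Set(list)]; // drop duplicates, keep order
}

function buildScrapeInput({ url, render = false, formats, screenshot = false } = {}) {
  if (typeof url !== 'string' || url.length === 0 || url.length > 2000) {
    throw new Error('url is required and must be at most 2000 characters');
  }
  new URL(url); // throws a TypeError if the string is not an absolute URL
  if (screenshot && !render) {
    throw new Error('screenshot requires render=true');
  }
  return {
    action: 'scrape',
    url,
    render,
    formats: parseFormats(formats).join(','),
    screenshot,
  };
}

console.log(buildScrapeInput({ url: 'https://example.com', render: true, screenshot: true }));
```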
🧠 Intelligent Anti-Block System
This actor is built to handle modern anti-bot systems out of the box:
- Automatic block detection: Heuristically checks for Cloudflare, DataDome, and other challenge pages (looks for captcha forms, browser verification, access denied messages)
- Smart retry: If a page appears blocked, automatically retries with headless browser rendering (Chromium via XCrawl's `jsRender`)
- Concurrent crawling: Uses `p-limit` to run up to 5 parallel scrapes (balanced for speed + reliability)
- Global proxy pool: Requests route through XCrawl's residential proxy network with configurable geo-location
- Per-URL resilience: Each URL gets at least 2 attempts; if both fail, the error is recorded per-entry without stopping the batch
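The detect-then-retry behaviour described in the list can be sketched roughly as follows. This is an illustration, not the Actor's source: the marker patterns are assumptions, `scrapeFn` is a hypothetical stand-in for a call to the XCrawl Scrape API, and the `"failed"` status value is invented for the example (the README only documents `"completed"`).

```javascript
// Illustrative only: the marker list is an assumption, not the Actor's real rule set.
const BLOCK_MARKERS = [
  /captcha/i,
  /verify(ing)? (that )?you are (a )?human/i,
  /checking your browser/i,
  /access denied/i,
];

// Returns true when the page text matches any known challenge-page marker.
function looksBlocked(html) {
  return BLOCK_MARKERS.some((pattern) => pattern.test(String(html)));
}

// scrapeFn is a hypothetical async function (url, { render }) => html string.
// First attempt is a plain fetch, second attempt enables JS rendering.
async function scrapeWithFallback(url, scrapeFn) {
  const attempts = [{ render: false }, { render: true }];
  let lastError = null;
  for (const options of attempts) {
    try {
      const html = await scrapeFn(url, options);
      if (!looksBlocked(html)) {
        return { url, html, rendered: options.render, scrapeStatus: 'completed', scrapeError: null };
      }
      lastError = 'page appeared blocked';
    } catch (err) {
      lastError = String(err?.message ?? err);
    }
  }
  // Both attempts failed: record the error on the entry instead of throwing,
  // so one bad URL does not stop the rest of the batch.
  return { url, html: null, rendered: true, scrapeStatus: 'failed', scrapeError: lastError };
}
```

Returning an error entry rather than throwing is what lets a batch continue past a URL that stays blocked.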
When to enable render
✅ Turn ON for: News sites with paywalls (Reuters, WSJ), sites behind Cloudflare/DataDome, JavaScript-heavy SPAs
❌ Keep OFF for: Simple HTML pages, blogs, documentation (faster and cheaper without rendering)
📦 Output Format
Each result is pushed to the Apify dataset:
    {
      "title": "Page Title",
      "url": "https://example.com",
      "snippet": "Search result description",
      "markdown": "Full page content converted to markdown...",
      "summary": "AI-generated summary from XCrawl...",
      "scrapeStatus": "completed",
      "screenshot": "base64-encoded PNG (if enabled)",
      "credits": "0.5",
      "scrapeError": null
    }
Search mode returns an array of enriched results.
Scrape mode returns a single result object.
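Once the items are read back from the dataset, a small post-processing step can separate usable results from failed ones and total the credits. The sketch below assumes every item follows the shape shown above, with `credits` as a numeric string; `summarizeResults` is a hypothetical helper, and the `"failed"` status in the sample data is invented for illustration.

```javascript
// Sketch: splits dataset items into successes and failures and totals the credits.
// Assumption: items follow the documented shape, with credits as a numeric string.
function summarizeResults(items) {
  const isOk = (item) => item.scrapeStatus === 'completed' && !item.scrapeError;
  const completed = items.filter(isOk);
  const failed = items.filter((item) => !isOk(item));
  const credits = items.reduce(
    (sum, item) => sum + (Number.parseFloat(item.credits) || 0),
    0,
  );
  return { completed, failed, credits };
}

// Sample data for illustration only.
const sampleItems = [
  { url: 'https://example.com', scrapeStatus: 'completed', credits: '0.5', scrapeError: null },
  { url: 'https://example.org', scrapeStatus: 'failed', credits: '0', scrapeError: 'timeout' },
];
const summary = summarizeResults(sampleItems);
console.log(summary.completed.length, summary.failed.length, summary.credits);
// prints: 1 1 0.5
```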
💰 Usage & Pricing
| Mode | XCrawl Credits Consumed |
|---|---|
| Search (1 query) | ~1 credit |
| Scrape (no render) | ~1–3 credits |
| Scrape (with render) | ~3–8 credits |
| Free trial | ✅ Included with XCrawl signup |
The actor itself is free to run on Apify — you only pay for XCrawl API credits consumed.
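For budgeting, the approximate ranges in the table can be turned into a rough lower and upper bound. This is a back-of-the-envelope sketch: the counts are supplied by the caller, and how a search with `withContent` enabled is billed per fetched result is not stated here, so treat the output as an estimate only. `estimateCredits` is a hypothetical helper.

```javascript
// Approximate per-operation credit ranges [min, max], copied from the table above.
const CREDIT_RANGES = {
  search: [1, 1],
  scrape: [1, 3],
  renderedScrape: [3, 8],
};

// Returns a { min, max } estimate; any count left out defaults to 0.
function estimateCredits({ searches = 0, scrapes = 0, renderedScrapes = 0 } = {}) {
  const counts = { search: searches, scrape: scrapes, renderedScrape: renderedScrapes };
  let min = 0;
  let max = 0;
  for (const [kind, count] of Object.entries(counts)) {
    min += count * CREDIT_RANGES[kind][0];
    max += count * CREDIT_RANGES[kind][1];
  }
  return { min, max };
}

console.log(estimateCredits({ searches: 1, scrapes: 8, renderedScrapes: 2 }));
// prints: { min: 15, max: 41 }
```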
🔧 Environment Variables
| Variable | Required | Description |
|---|---|---|
| `XCRAWL_API_KEY` | ✅ Yes | Your API key from dash.xcrawl.com. Sign up → Dashboard → API Keys |
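A missing key is easiest to diagnose when the run fails immediately with a clear message. The following is a minimal sketch of such a check, assuming the key is read from `process.env`; `getApiKey` is a hypothetical helper and the Actor's own startup code may differ.

```javascript
// Sketch: read the key from an environment object and fail fast when it is missing.
// getApiKey is a hypothetical helper; env defaults to the real process environment.
function getApiKey(env = process.env) {
  const key = (env.XCRAWL_API_KEY ?? '').trim();
  if (key === '') {
    throw new Error(
      'XCRAWL_API_KEY is not set. Create a key at dash.xcrawl.com and add it to the Actor environment variables.',
    );
  }
  return key;
}
```

Passing the environment as a parameter keeps the check easy to exercise in tests without touching real variables.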
🎯 Use Cases
- Content research — Collect articles, blog posts, and documentation on any topic
- Market intelligence — Scrape competitor pricing, product listings, and reviews
- SEO / SERP monitoring — Track search rankings across different geo-locations
- RAG / LLM pipelines — Feed clean markdown content into vector databases or AI agents
- E-commerce — Monitor product catalogs with location-specific searches
- News aggregation — Gather articles from multiple sources with automatic paywall bypass
🏗 Architecture
    Apify Run
    └─ src/main.js (entry point)
       ├─ XCrawl Search API → Get top results
       ├─ XCrawl Scrape API → Extract page content
       │  └─ p-limit (concurrency = 5)
       │     ├─ Normal scrape (fast)
       │     └─ Retry with JS render (anti-bot fallback)
       └─ Apify Dataset ← Push all results
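The flow in the diagram (search, then bounded-concurrency scraping, then one push to the dataset) can be sketched as below. This is not the contents of `src/main.js`: `search`, `scrape`, and `pushData` are hypothetical async functions supplied by the caller, and `createLimiter` is a small hand-written stand-in for `p-limit` so that the example runs without dependencies.

```javascript
// Minimal stand-in for p-limit: runs at most `concurrency` tasks at a time.
function createLimiter(concurrency) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= concurrency || queue.length === 0) return;
    active += 1;
    const { task, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        next(); // start the next queued task, if any
      });
  };
  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
}

// search, scrape and pushData are hypothetical; they stand in for the
// XCrawl Search API, the XCrawl Scrape API and the Apify dataset.
async function runSearchPipeline(query, { search, scrape, pushData, concurrency = 5 }) {
  const limit = createLimiter(concurrency);
  const hits = await search(query);
  const results = await Promise.all(
    hits.map((hit) =>
      limit(async () => {
        try {
          return { ...hit, ...(await scrape(hit.url)), scrapeError: null };
        } catch (err) {
          // A failed URL is recorded on its own entry and does not stop the batch.
          return { ...hit, scrapeError: String(err?.message ?? err) };
        }
      }),
    ),
  );
  await pushData(results);
  return results;
}
```

Catching errors inside each limited task means `Promise.all` never rejects because of a single URL, which matches the per-entry error recording described in this README.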
📄 Links
- Source code: GitHub
- XCrawl Dashboard: dash.xcrawl.com
- XCrawl API Docs: docs.xcrawl.com
- Report issues: GitHub Issues