Webpage Content to Markdown Super Cost Effective
Pricing
Pay per event
Cost-focused conversion of any webpage into LLM-ready Markdown for RAG. Uses a smart hybrid four-tier engine: Apify for crawling plus Cloudflare Browser Rendering for clean extraction. Automatically saves costs by detecting native Markdown support.
Developer: Søren Riisager
Last modified: 2 days ago
🚀 The Ultimate Web-to-Markdown Converter (Cloudflare + Apify)
Turn any website into clean, LLM-ready Markdown while saving 90% on scraping costs.
This Actor uses a smart Quadruple-Tier Architecture to intelligently switch between free extraction, Cloudflare Browser Rendering ($0.005/page), and Apify's powerful Anti-Detect Browsers.
It is designed for RAG Pipelines, AI Agents, and Dataset Creation where quality, speed, and cost efficiency are paramount.
💡 Why This Actor?
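Most scrapers are either too simple (they fail on JavaScript) or too expensive (they always use heavy browsers). We solve this with a "Cost-First, Robustness-Last" strategy: the cheapest methods run first, and the most robust (and most expensive) browser is held back as a last resort.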
1. 💰 Smart Cost Optimization
We don't just blindly launch a browser. We try the cheapest methods first:
- Tier 1 (Free): Checks for native Markdown headers.
- Tier 2 (Free): Uses a local Readability engine (no browser overhead).
- Tier 3 ($0.005): Uses Cloudflare Browser Rendering for fast, cheap JS rendering.
- Tier 4 (~$0.10): Uses Apify Anti-Detect Browser only as a last resort.
- Note: Defaults to Datacenter Proxies to keep costs low.
Result: You pay pennies for easy sites, and only use "Heavy Artillery" when absolutely necessary.
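The cheapest-first ordering above can be pictured as a simple cascade. The following is a minimal sketch, not the Actor's actual code: the `Tier` class, `run_cascade`, and the stand-in extractors are hypothetical, and every tier name except `cloudflare_browser` (which appears in the sample output) is an assumed label.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tier:
    name: str       # hypothetical label for the tier
    enabled: bool   # mirrors the per-tier toggles in the configuration
    extract: Callable[[str], Optional[str]]  # returns Markdown, or None on failure


def run_cascade(url: str, tiers: list[Tier]) -> tuple[str, str]:
    """Try each enabled tier in order (cheapest first); return (tier_name, markdown)."""
    for tier in tiers:
        if not tier.enabled:
            continue
        markdown = tier.extract(url)
        if markdown:  # the first tier that yields content wins
            return tier.name, markdown
    raise RuntimeError(f"All enabled tiers failed for {url}")


# Stand-in extractors for illustration only; real ones would do network work.
tiers = [
    Tier("native_markdown", True, lambda url: None),
    Tier("local_readability", True, lambda url: None),
    Tier("cloudflare_browser", True, lambda url: "# Rendered page"),
    Tier("apify_browser", True, lambda url: "# Heavy browser page"),
]

winner, content = run_cascade("https://example.com", tiers)
```

In this sketch the two free tiers return nothing, so the third tier handles the page and the expensive fourth tier is never invoked.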
2. 💸 Tiered Pricing (New!)
This Actor uses Pay-per-event pricing to ensure you only pay for what you use:
- Standard Result ($0.10 / 1k): Charged for Tiers 1, 2, and 3 (Native, Local, Cloudflare).
- Premium Result ($2.00 / 1k): Charged for Tier 4 (Apify Browser + Proxy).
3. 🛡️ Anti-Block Handling (New!)
If a website blocks our cheap requests (returning 403 Forbidden or 429 Too Many Requests), the Actor automatically fights back:
- Detects the Block.
- Retries with Tier 3 (Cloudflare) to see if a simple browser pass works.
- Escalates to Tier 4 (Apify Residential Proxy) to bypass even the toughest WAFs (Cloudflare/Akamai/etc.).
Result: Near 100% Success Rate.
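The escalation steps above could be expressed as a small decision function. This is a sketch under assumptions: the function names and the integer tier numbering are illustrative, and the real Actor may apply additional block-detection heuristics beyond the two status codes mentioned.

```python
from typing import Optional

# Status codes the text names as signs of a block.
BLOCK_STATUSES = frozenset({403, 429})


def is_blocked(status_code: int) -> bool:
    return status_code in BLOCK_STATUSES


def next_tier(current_tier: int, status_code: int, enabled: set[int]) -> Optional[int]:
    """Return the tier to retry with after a block, or None if no retry applies."""
    if not is_blocked(status_code):
        return None
    if current_tier < 3:
        candidates = (3, 4)  # cheap request blocked: try Cloudflare, then Apify
    elif current_tier == 3:
        candidates = (4,)    # Cloudflare blocked: escalate to the Apify browser
    else:
        candidates = ()      # already at the last resort
    for tier in candidates:
        if tier in enabled:
            return tier
    return None
```

With all tiers enabled, a 403 on a Tier 2 request yields Tier 3; if Tier 3 is switched off, the same block escalates directly to Tier 4.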
🏗️ The Quadruple-Tier Architecture
| Tier | Method | Cost Type | Speed | Best For |
|---|---|---|---|---|
| 1 | Native Markdown | Standard | ⚡ Instant | Sites that serve raw markdown (e.g., GitHub, Docs). |
| 2 | Local Readability | Standard | 🚀 Very Fast | Blogs, News, Static HTML sites. |
| 3 | Cloudflare Browser | Standard | 🚄 Fast | SPAs (React/Vue), JS-heavy sites. |
| 4 | Apify Browser | Premium | 🐢 Slow | Stubborn Sites, Anti-Bot Protection, Deep Complex Apps. |
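To see how the Cost Type column translates into a bill, here is a rough estimator based on the pay-per-event prices quoted in this README ($0.10 per 1k Standard results, $2.00 per 1k Premium results). The function and constant names are hypothetical, and the estimate covers only the Actor's result events, not a Cloudflare plan or any other platform charges.

```python
# Prices per 1,000 results, as quoted in this README.
PRICE_PER_1K = {"standard": 0.10, "premium": 2.00}

# Cost type per tier, taken from the architecture table.
TIER_COST_TYPE = {1: "standard", 2: "standard", 3: "standard", 4: "premium"}


def estimate_charge(results_per_tier: dict[int, int]) -> float:
    """Estimate the pay-per-event charge in USD for a mix of results by tier."""
    total = 0.0
    for tier, count in results_per_tier.items():
        total += count * PRICE_PER_1K[TIER_COST_TYPE[tier]] / 1000
    return round(total, 4)


# Example mix: 10,000 pages, of which 1,000 needed the Tier 4 browser.
example = estimate_charge({1: 2000, 2: 5000, 3: 2000, 4: 1000})
```

For this example mix, the 9,000 Standard results come to $0.90 and the 1,000 Premium results to $2.00, so a small share of stubborn pages dominates the total.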
⚙️ Configuration
You have full control. Toggle tiers on/off to fit your budget and needs.
| Field | Description |
|---|---|
| Start URLs | List of URLs to scrape. |
| Cloudflare Settings | Account ID & API Token (Required for Tier 3). |
| Enable Tier 1-4 | Toggle specific tiers on/off (Default: All Enabled). |
| Proxy Configuration | Choose proxies. Default: Datacenter (Low Cost). |
| Max Concurrency | Number of pages processed in parallel. Note: Tier 4 is memory-hungry, so keep this low (1-2) if you rely on it heavily. |
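As an illustration of how these fields fit together, the sketch below assembles an input object and enforces the one hard dependency stated in the table: Tier 3 requires Cloudflare credentials. All key names (`startUrls`, `enableTier1`, `cloudflareAccountId`, and so on) are assumptions made for this example; check the Actor's input schema for the real ones.

```python
from typing import Optional, Sequence


def build_run_input(
    urls: Sequence[str],
    enabled_tiers: Sequence[int] = (1, 2, 3, 4),
    cf_account_id: Optional[str] = None,
    cf_api_token: Optional[str] = None,
    max_concurrency: int = 2,
) -> dict:
    """Build a run input dict; key names are hypothetical, not the Actor's schema."""
    if 3 in enabled_tiers and not (cf_account_id and cf_api_token):
        raise ValueError(
            "Tier 3 needs a Cloudflare Account ID and API Token; "
            "supply both or disable Tier 3."
        )
    run_input = {
        "startUrls": [{"url": u} for u in urls],
        "maxConcurrency": max_concurrency,
    }
    for tier in (1, 2, 3, 4):
        run_input[f"enableTier{tier}"] = tier in enabled_tiers
    if 3 in enabled_tiers:
        run_input["cloudflareAccountId"] = cf_account_id
        run_input["cloudflareApiToken"] = cf_api_token
    return run_input


# Budget setup without Cloudflare: free tiers plus Tier 4 as the fallback.
no_cf_input = build_run_input(["https://example.com"], enabled_tiers=(1, 2, 4))
```

Failing early on missing credentials avoids discovering the problem only when the first JavaScript-heavy page reaches Tier 3.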
🔑 Getting Cloudflare Credentials (Required for Tier 3)
To use the Cost-Saving Tier 3, you need a Cloudflare Workers Paid Plan ($5/mo).
- Account ID: Found in your Cloudflare Dashboard URL.
- API Token: Create a token with Account > Browser Rendering > Edit permissions.
Note: You can disable Tier 3 if you don't have Cloudflare, but you lose the "Cheap Browser" advantage.
📊 Output Format
We provide clean JSON ready for your Vector Database or LLM:
    {
      "url": "https://example.com/blog/ai-revolution",
      "meta": {
        "title": "The AI Revolution",
        "description": "How AI is changing the web...",
        "keywords": "AI, LLM, RAG"
      },
      "content": {
        "markdown": "# The AI Revolution\n\nFull article content...",
        "title": "The AI Revolution",
        "source": "cloudflare_browser", // Tells you which tier succeeded
        "estimatedTokens": 540
      },
      "scrapedAt": "2023-10-27T10:00:00.000Z"
    }
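A record in this shape can be flattened into a document for a vector database with a few lines of code. The sketch below is one possible mapping; the `to_vector_doc` helper and the target layout (`id`, `text`, `metadata`) are assumptions for illustration, not something the Actor provides.

```python
def to_vector_doc(record: dict) -> dict:
    """Flatten one output record into an id/text/metadata document."""
    content = record["content"]
    return {
        "id": record["url"],
        "text": content["markdown"],
        "metadata": {
            "title": content["title"],
            "source_tier": content["source"],  # which tier produced the content
            "estimated_tokens": content["estimatedTokens"],
            "scraped_at": record["scrapedAt"],
        },
    }


# Sample record copied from the output format above.
record = {
    "url": "https://example.com/blog/ai-revolution",
    "meta": {
        "title": "The AI Revolution",
        "description": "How AI is changing the web...",
        "keywords": "AI, LLM, RAG",
    },
    "content": {
        "markdown": "# The AI Revolution\n\nFull article content...",
        "title": "The AI Revolution",
        "source": "cloudflare_browser",
        "estimatedTokens": 540,
    },
    "scrapedAt": "2023-10-27T10:00:00.000Z",
}

doc = to_vector_doc(record)
```

Keeping the source tier in the metadata makes it possible to audit later how many documents required the more expensive tiers.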
Built with ❤️ by Tulabot.com - Powering the next generation of AI Agents.