Universal Web to Markdown (Bulk & AI-Ready)
Pricing
Pay per usage
Bulk convert any website URLs to clean Markdown for AI & LLMs. Universal scraper that removes ads, scripts, and clutter. Optimized for RAG, ChatGPT, Claude, and LangChain. Fast, async, and API-ready.
Developer

kalthireddy Abhishek
🚀 Universal Web to Markdown (AI-Ready)
Turn any website into clean, noise-free Markdown. The perfect data feeder for LLMs, RAG, and AI Agents.
🤖 Why use this Actor?
Large Language Models (like ChatGPT, Claude, and Gemini) struggle with raw HTML. The markup consumes too many tokens, and the scripts, styles, and ads around the content distract the model.
This Actor solves that. It visits URLs, strips away the junk (ads, navbars, footers), and converts the core content into clean Markdown.
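The strip-and-convert step described above can be sketched with nothing but the Python standard library. This is a minimal illustration, not the Actor's actual implementation: the `SKIP_TAGS` list, the `MarkdownExtractor` class, and the `html_to_markdown` function are all hypothetical names, and only headings, paragraphs, and list items are handled.

```python
from html.parser import HTMLParser

# Assumed noise tags: everything inside them is dropped (illustrative list only).
SKIP_TAGS = {"script", "style", "nav", "footer", "aside", "noscript"}
# Block-level tags kept in the output, mapped to their Markdown prefix.
BLOCK_PREFIX = {"h1": "# ", "h2": "## ", "h3": "### ", "p": "", "li": "- "}


class MarkdownExtractor(HTMLParser):
    """Collects text from content blocks and ignores anything inside SKIP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0   # > 0 while inside a noise element
        self.blocks = []      # finished Markdown blocks
        self.current = None   # text pieces of the block being read
        self.prefix = ""

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in BLOCK_PREFIX:
            self.flush()
            self.prefix = BLOCK_PREFIX[tag]
            self.current = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in BLOCK_PREFIX:
            self.flush()

    def handle_data(self, data):
        if self.skip_depth == 0 and self.current is not None:
            self.current.append(data)

    def flush(self):
        """Close the open block, collapsing runs of whitespace."""
        if self.current is not None:
            text = " ".join("".join(self.current).split())
            if text:
                self.blocks.append(self.prefix + text)
        self.current = None
        self.prefix = ""


def html_to_markdown(html: str) -> str:
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()
    return "\n\n".join(parser.blocks)


sample = (
    "<html><head><script>x=1</script></head><body>"
    "<nav><p>Home</p></nav><h1>Title</h1>"
    "<p>Hello <b>world</b>.</p><footer>(c)</footer></body></html>"
)
print(html_to_markdown(sample))
```

A production converter would also need links, tables, code blocks, and malformed markup handling, which this sketch leaves out.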
✨ Features
- ⚡ Fast & Async: Built on `httpx` for high-speed non-blocking extraction.
- 📦 Bulk Processing: Add 1 or 100 URLs at once, and the Actor handles the queue for you.
- 🧹 Smart Cleaning: Automatically removes ads, scripts, sidebars, and popups.
- 🧠 AI Optimized: Output is formatted specifically for RAG (Retrieval-Augmented Generation) pipelines.
- 🛡️ Anti-Bot Bypass: Sends browser-like request headers so it can read sites that block requests from basic bots.
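To make the async queue and browser-header ideas concrete, here is a rough sketch of bounded-concurrency fetching with `asyncio`. It is an assumption about how such a queue could work, not the Actor's source: `BROWSER_HEADERS`, `fetch_all`, and `fake_fetch` are hypothetical names, the header values are examples, and the network call is replaced by a stub so the snippet runs offline.

```python
import asyncio
from typing import Awaitable, Callable

# Assumed browser-like headers; the values the Actor really sends are not documented.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}

# A fetcher takes (url, headers) and returns the page HTML.
Fetcher = Callable[[str, dict], Awaitable[str]]


async def fetch_all(urls, fetch: Fetcher, max_concurrency: int = 5):
    """Fetch every URL, with at most max_concurrency requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(url):
        async with semaphore:
            try:
                html = await fetch(url, BROWSER_HEADERS)
                return {"url": url, "html": html, "error": None}
            except Exception as exc:
                # One failing URL should not abort the whole batch.
                return {"url": url, "html": None, "error": str(exc)}

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(fetch_one(u) for u in urls))


async def fake_fetch(url: str, headers: dict) -> str:
    """Stand-in for a real HTTP call so the example needs no network."""
    await asyncio.sleep(0)
    if "bad" in url:
        raise ValueError("simulated failure")
    return f"<h1>{url}</h1>"


results = asyncio.run(
    fetch_all(["https://www.example.com", "https://bad.example"], fake_fetch)
)
for result in results:
    print(result["url"], "->", result["error"] or "ok")
```

With `httpx`, the stub could be swapped for a coroutine that awaits `client.get(url, headers=headers)` on a shared `httpx.AsyncClient` and returns the response text.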
📥 Input
You can provide a single URL or a list of URLs to scrape.
Example Input (JSON):
{
  "startUrls": [
    { "url": "https://en.wikipedia.org/wiki/Artificial_intelligence" },
    { "url": "https://www.example.com" }
  ]
}
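If your URLs start out as a plain list of strings, a small helper can build the `startUrls` payload shown above. This is only a sketch: `build_run_input` is a hypothetical name, and the validation and de-duplication rules are assumptions rather than anything the Actor requires.

```python
from urllib.parse import urlparse


def build_run_input(urls):
    """Turn a list of URL strings into the {"startUrls": [...]} input shape."""
    seen = set()
    start_urls = []
    for raw in urls:
        url = raw.strip()
        parsed = urlparse(url)
        # Reject anything that is not an absolute http(s) URL.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not an absolute http(s) URL: {raw!r}")
        # Skip exact duplicates while keeping the original order.
        if url not in seen:
            seen.add(url)
            start_urls.append({"url": url})
    return {"startUrls": start_urls}


run_input = build_run_input([
    "https://en.wikipedia.org/wiki/Artificial_intelligence",
    " https://www.example.com ",
    "https://www.example.com",
])
print(run_input)
```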
📤 Output
The Actor stores results in the default dataset. You can download it in JSON, CSV, Excel, or XML.
Sample JSON Output:
[
  {
    "url": "https://en.wikipedia.org/wiki/Artificial_intelligence",
    "title": "Artificial intelligence - Wikipedia",
    "markdown": "# Artificial intelligence\n\nArtificial intelligence (AI) is intelligence demonstrated by machines..."
  },
  {
    "url": "https://www.example.com",
    "title": "Example Domain",
    "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples in documents..."
  }
]
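For a RAG pipeline, each dataset item usually needs to be split into smaller chunks before embedding. One possible approach, sketched below, splits the `markdown` field at headings and caps each chunk's length. The `chunk_markdown` function and its `max_chars` parameter are hypothetical and not part of the Actor; only the `url`, `title`, and `markdown` fields come from the sample output above.

```python
import re

# Zero-width match before any line that starts with 1-6 '#' and a space.
HEADING_BOUNDARY = re.compile(r"^(?=#{1,6} )", re.MULTILINE)


def chunk_markdown(item, max_chars=1000):
    """Split one dataset item into heading-delimited chunks with source metadata."""
    chunks = []
    for section in HEADING_BOUNDARY.split(item["markdown"]):
        section = section.strip()
        if not section:
            continue
        # Long sections are cut into fixed-size pieces of at most max_chars.
        for start in range(0, len(section), max_chars):
            chunks.append({
                "url": item["url"],
                "title": item["title"],
                "text": section[start:start + max_chars],
            })
    return chunks


item = {
    "url": "https://www.example.com",
    "title": "Example Domain",
    "markdown": "# Example Domain\n\nIntro text.\n\n## Details\n\nMore text.",
}
for chunk in chunk_markdown(item):
    print(repr(chunk["text"]))
```

Note that this naive split also breaks on `#` lines inside code blocks, so real content may need a Markdown-aware splitter.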
🔌 API Example (Python)
Easily integrate this into your own AI agent:
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Run the Actor with multiple URLs
run = client.actor("YOUR_USERNAME/web-to-markdown-converter").call(
    run_input={
        "startUrls": [
            {"url": "https://en.wikipedia.org/wiki/Artificial_intelligence"},
            {"url": "https://www.example.com"},
        ]
    }
)

# Get results
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["title"])
    print(item["markdown"][:100])  # Print first 100 chars