Pricing

from $0.50 / 1,000 results

universal-web-to-markdown

High-performance tool for AI & RAG pipelines. Converts web pages to clean Markdown by removing noise and fixing relative URLs. Built with Cheerio for extreme speed and low cost ($0.50/1k pages). Perfect for feeding clean data to LLMs.

Rating: 0.0 (0)
Developer: JI JUN
Last modified: 3 months ago

🚀 Universal Web-to-Markdown (RAG-Ready API)

Universal Web-to-Markdown is a high-performance, cost-efficient tool designed specifically for AI developers and RAG (Retrieval-Augmented Generation) pipelines. It transforms any web page into clean, structured Markdown.

✨ Key Features

Pure Content Extraction: Removes ads, navbars, and footers automatically.
RAG-Optimized: Converts all relative links and images into absolute URLs for seamless LLM integration.
Low-Cost Engine: Built with Cheerio, which parses HTML without launching a browser, for high speed and low compute usage.
Branded Metadata: Includes source tracking for data lineage.
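The link handling described above can be illustrated with a minimal Python sketch. This is not the Actor's own code (the Actor is built with Cheerio); the function name `absolutize_links` and the regular expression are assumptions made for illustration. It only shows the general idea of resolving relative Markdown link and image targets against the page URL.

```python
import re
from urllib.parse import urljoin

# Matches Markdown links and images: [text](target) or ![alt](target).
# Simplified pattern (an assumption): targets with spaces or titles are not handled.
LINK_RE = re.compile(r"(!?\[[^\]]*\]\()([^)\s]+)(\))")


def absolutize_links(markdown: str, base_url: str) -> str:
    """Resolve every relative link or image target against base_url."""

    def _fix(match: re.Match) -> str:
        # urljoin leaves already-absolute URLs unchanged.
        return match.group(1) + urljoin(base_url, match.group(2)) + match.group(3)

    return LINK_RE.sub(_fix, markdown)


sample = "![AI diagram](/images/ai.png)\nRead more on our [blog](blog/ai-future)."
print(absolutize_links(sample, "https://example.com/"))
```

Absolute targets matter for RAG because a chunk is usually stored without the page it came from, so a relative path can no longer be resolved.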

🛠 How to Use

Enter the Start URLs you want to crawl.
(Optional) Set Max Depth to follow links.
Run the Actor and retrieve the results as JSON. Each item carries the page's Markdown in its markdown field.
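The steps above correspond to an input object like the one built below. The `startUrls` field matches the Quick Start snippets on this page. The key name `maxDepth` is an assumption for the Max Depth option, so check the Actor's input schema for the real name before relying on it.

```python
import json


def build_run_input(urls, max_depth=None):
    """Build an Actor input dict from a list of start URLs."""
    run_input = {"startUrls": [{"url": u} for u in urls]}
    if max_depth is not None:
        # ASSUMPTION: "maxDepth" is a hypothetical key for the Max Depth option.
        run_input["maxDepth"] = max_depth
    return run_input


print(json.dumps(build_run_input(["https://example.com/"], max_depth=1), indent=2))
```

The resulting dict can be passed as `run_input` to the Python client call shown in the Quick Start section.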

💰 Pricing

Actor Start: $0.01 (One-time event)
Per Result: $0.50 per 1,000 pages (Only $0.0005 per page!)
Platform Usage: Free (Included)
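As a rough budgeting aid, the listed prices can be turned into a small estimator. This sketch assumes the Actor Start fee is charged once per run and that every page is billed at the listed $0.50 per 1,000 rate. The page header says "from $0.50", so other pricing tiers may apply and are not modeled here.

```python
from decimal import Decimal

ACTOR_START_FEE = Decimal("0.01")        # assumed to be charged once per run
PRICE_PER_1000_PAGES = Decimal("0.50")   # listed per-result rate


def estimate_cost(pages: int, runs: int = 1) -> Decimal:
    """Estimate the total charge in USD for a number of pages and runs."""
    per_page = PRICE_PER_1000_PAGES / 1000  # $0.0005 per page
    return ACTOR_START_FEE * runs + per_page * pages


# 10,000 pages in a single run: $5.00 for results plus the $0.01 start fee.
print(estimate_cost(10_000))
```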

Developed by hachi-dev

🔍 Before & After (Why it's perfect for RAG)

Stop feeding your LLM noisy HTML. The output is clean Markdown with every link and image resolved to an absolute URL:

✅ After (Clean Markdown by hachi-dev):

## What is AI?
![AI diagram](https://example.com/images/ai.png)

Read more on our [blog](https://example.com/blog/ai-future).
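Clean Markdown like the sample above is easy to split into retrieval chunks. The sketch below is one possible approach and not part of the Actor: the function name `chunk_by_heading` is hypothetical, and it simply starts a new chunk at every ATX heading (`#` to `######`). It does not special-case headings that appear inside fenced code blocks.

```python
import re

# ATX heading: one to six '#' characters followed by whitespace and the title.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_heading(markdown: str):
    """Split Markdown into chunks, one per heading, keeping any text before the first heading."""
    chunks = []
    heading, lines = None, []

    def flush():
        text = "\n".join(lines).strip()
        # Skip an empty leading section that has no heading and no text.
        if heading is not None or text:
            chunks.append({"heading": heading, "text": text})

    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            flush()
            heading, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


doc = "## What is AI?\n![AI diagram](https://example.com/images/ai.png)\n\nRead more on our [blog](https://example.com/blog/ai-future)."
for chunk in chunk_by_heading(doc):
    print(chunk["heading"], "->", len(chunk["text"]), "characters")
```

Each chunk keeps its heading as metadata, which can be stored alongside the embedding to give the retriever some context.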

💻 Quick Start Code Snippets

Copy and paste this into your project to start extracting data immediately.

Python (apify-client)

from apify_client import ApifyClient

# Initialize the ApifyClient with your API token
client = ApifyClient("YOUR_API_TOKEN")

# Start the Actor and wait for it to finish
run = client.actor("hachi-dev/universal-web-to-markdown").call(
    run_input={ "startUrls": [{ "url": "https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/article" }] }
)

# Fetch and print the Markdown results
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item.get("markdown"))
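When the results feed a pipeline rather than the console, it can help to drop items that carry no Markdown. The helper below is a small sketch with a hypothetical name, `collect_markdown`. It relies only on the `markdown` field used in the snippet above and works on any iterable of dict items.

```python
def collect_markdown(items):
    """Return the non-empty Markdown strings from an iterable of dataset items."""
    docs = []
    for item in items:
        # A missing key, None, or whitespace-only text is treated as empty.
        text = (item.get("markdown") or "").strip()
        if text:
            docs.append(text)
    return docs


# Example with in-memory items; with the client, pass
# client.dataset(run["defaultDatasetId"]).iterate_items() instead.
print(collect_markdown([{"markdown": "## Title\nBody"}, {"markdown": ""}, {}]))
```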

Node.js / JavaScript

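// Run as an ES module: this snippet uses top-level await
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your API token
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the Actor and wait for it to finish
const run = await client.actor('hachi-dev/universal-web-to-markdown').call({
    startUrls: [{ url: 'https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/article' }],
});

// Fetch the results and print the Markdown of each page
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(item.markdown);
}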

Web Scraper For Llms

abotapi/web-scraper-for-llms

Stealth web scraping engine built for LLMs. Converts any web page to clean Markdown or HTML.

AbotAPI

Universal Markdown Scraper for LLMs

botflowtech/universal-markdown-scraper-for-llms

Universal Markdown Scraper for LLMs

BotFlowTech

Web to Markdown for LLMs

george.the.developer/web-to-markdown-llm

Convert any URL to clean LLM-ready markdown. 60-70% fewer tokens than raw HTML. Built for AI agents and RAG pipelines.

George Kioko

Web-to-Markdown Generator for AI & RAG Pipelines

profitstack/web-to-markdown-generator-for-ai-rag-pipelines

Convert any website into clean, LLM-ready Markdown with heading-based chunking for RAG and AI agents.

Manas Mantri

Universal Web to Markdown (Bulk & AI-Ready)

lentic_october/web-to-markdown-converter

Bulk convert any website URLs to clean Markdown for AI & LLMs. Universal scraper that removes ads, scripts, and clutter. Optimized for RAG, ChatGPT, Claude, and LangChain. Fast, async, and API-ready.

kalthireddy Abhishek

LLM Markdown Crawler

sleek_waveform/llm-markdown-crawler

Crawl any website and extract clean, boilerplate-free Markdown optimized for LLMs, RAG pipelines, and AI training datasets. Uses Mozilla Readability to strip navigation and ads, then converts to clean Markdown. No browser required — fast and cheap.

Daniel Dimitrov

rag-docs-scraper

marbled_jury/my-actor

Extract clean, RAG-optimized Markdown from any technical documentation. Built for LLMs and AI agents. No noise, just high-fidelity data.

Hastin S.

Universal RAG Web Scraper

express_kingfisher/rag-web-scraper

Turn any website into clean, LLM-ready Markdown. Automatically strips ads, navigation, and noise using Mozilla Readability. Perfect for feeding data to ChatGPT, Claude, or Vector Databases (RAG).

Prince Raj

AI Markdown Maker

onescales/bulk-ai-markdown-maker

Convert any web page into clean, AI ready markdown format in seconds. This markdown generator is perfect for content for AI models, creating documentation, or archiving web content. It intelligently parses web content, removing ads, navigation, and other clutter. Generate Markdown Today!

One Scales

122

5.0

Website to Clean Markdown (AI & RAG Ready)

ahmed_jasarevic/website-to-clean-markdown-ai-rag-ready

Convert any website into clean, noise-free Markdown. Perfect for training LLMs, building Custom GPTs, and RAG pipelines. Save 80% on OpenAI tokens by stripping HTML junk.