universal-web-to-markdown
Pricing
from $0.50 / 1,000 results
High-performance tool for AI & RAG pipelines. Converts web pages to clean Markdown by removing noise and fixing relative URLs. Built with Cheerio for extreme speed and low cost ($0.50/1k pages). Perfect for feeding clean data to LLMs.
Developer

JI JUN
🚀 Universal Web-to-Markdown (RAG-Ready API)
Universal Web-to-Markdown is a high-performance, cost-efficient tool designed specifically for AI developers and RAG (Retrieval-Augmented Generation) pipelines. It transforms any web page into clean, structured Markdown.
✨ Key Features
- Pure Content Extraction: Removes ads, navbars, and footers automatically.
- RAG-Optimized: Converts all relative links and images into absolute URLs for seamless LLM integration.
- Low-Cost Engine: Built with Cheerio, which parses static HTML without launching a browser, for high speed and low compute usage.
- Branded Metadata: Includes source tracking for data lineage.
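To illustrate the relative-to-absolute link conversion mentioned in the feature list, here is a minimal Python sketch of the general technique using the standard library's `urllib.parse.urljoin`. It is not the Actor's actual implementation (which is built on Cheerio); the `absolutize_markdown_links` helper, the simplified regular expression, and the sample URLs are all hypothetical.

```python
import re
from urllib.parse import urljoin

# Matches inline Markdown links and images: [text](target) or ![alt](target).
# Simplified on purpose: it does not handle link titles or nested parentheses.
LINK_PATTERN = re.compile(r"(!?\[[^\]]*\])\(([^)\s]+)\)")


def absolutize_markdown_links(markdown: str, page_url: str) -> str:
    """Resolve every link target against the URL of the page it came from."""

    def resolve(match: re.Match) -> str:
        label, target = match.group(1), match.group(2)
        # urljoin leaves already-absolute targets unchanged.
        return f"{label}({urljoin(page_url, target)})"

    return LINK_PATTERN.sub(resolve, markdown)


if __name__ == "__main__":
    sample = "Read more on our [blog](/blog/ai-future)."
    print(absolutize_markdown_links(sample, "https://example.com/docs/intro"))
```

Absolute URLs matter for RAG because a chunk of Markdown is usually stored and retrieved without its original page context, so a relative path such as `/blog/ai-future` could no longer be resolved.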
🛠 How to Use
- Enter the Start URLs you want to crawl.
- (Optional) Set Max Depth to follow links.
- Run the Actor and retrieve the results as JSON items, each carrying the page content in a `markdown` field.
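The steps above can be sketched as a small helper that assembles the Actor input in Python. The `startUrls` key is the one used in the Quick Start snippets of this README; the `maxDepth` key is only an assumed name for the Max Depth option, and `build_run_input` itself is a hypothetical helper, so check the Actor's input schema before relying on either.

```python
from typing import Any, Dict, List, Optional


def build_run_input(urls: List[str], max_depth: Optional[int] = None) -> Dict[str, Any]:
    """Build the Actor input: a list of start URLs plus an optional crawl depth."""
    if not urls:
        raise ValueError("At least one start URL is required")

    # "startUrls" is a list of {"url": ...} objects, as in the Quick Start snippets.
    run_input: Dict[str, Any] = {"startUrls": [{"url": u} for u in urls]}

    if max_depth is not None:
        if max_depth < 0:
            raise ValueError("max_depth must be non-negative")
        # ASSUMPTION: "maxDepth" is a guessed key name, not confirmed by this README.
        run_input["maxDepth"] = max_depth

    return run_input


if __name__ == "__main__":
    print(build_run_input(["https://example.com"], max_depth=1))
```

The resulting dictionary can be passed as `run_input` to the `call` method of the Python client.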
💰 Pricing
- Actor Start: $0.01 (One-time event)
- Per Result: $0.50 per 1,000 pages (Only $0.0005 per page!)
- Platform Usage: Free (Included)
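As a rough worked example of the pricing above, the following sketch estimates the cost of a single run. It assumes the $0.01 start fee is charged once per run and that every crawled page counts as one result; `estimate_run_cost` is a hypothetical helper, not part of any Apify API.

```python
ACTOR_START_USD = 0.01         # one-time charge per run
PER_RESULT_USD = 0.50 / 1000   # $0.50 per 1,000 pages


def estimate_run_cost(pages: int) -> float:
    """Estimate the cost in USD of one run that returns `pages` results."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    # Platform usage is listed as free, so only two terms contribute.
    return round(ACTOR_START_USD + pages * PER_RESULT_USD, 4)


if __name__ == "__main__":
    print(estimate_run_cost(1000))   # 0.51
    print(estimate_run_cost(10000))  # 5.01
```

For small runs the start fee dominates: a 10-page run costs about $0.015, of which $0.01 is the start event.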
Developed by hachi-dev
🔍 Before & After (Why it's perfect for RAG)
Stop feeding your LLM noisy HTML. See the difference:
✅ After (Clean Markdown by hachi-dev):
```markdown
## What is AI?

Read more on our [blog](https://example.com/blog/ai-future).
```
💻 Quick Start Code Snippets
Copy and paste this into your project to start extracting data immediately.
Python (apify-client)
```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your API token
client = ApifyClient("YOUR_API_TOKEN")

# Start the Actor and wait for it to finish
run = client.actor("hachi-dev/universal-web-to-markdown").call(
    run_input={
        "startUrls": [
            {"url": "https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/article"}
        ]
    }
)

# Fetch and print the Markdown results
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item.get("markdown"))
```
Node.js / JavaScript
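```javascript
// Top-level await requires an ES module (e.g. a .mjs file or "type": "module").
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your API token
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the Actor and wait for it to finish
const run = await client.actor('hachi-dev/universal-web-to-markdown').call({
    startUrls: [
        { url: 'https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/article' },
    ],
});

// Fetch the results and print the Markdown of the first item
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0].markdown);
```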