Pricing

from $1.00 / 1,000 pages converted

Webpage to Markdown Converter for LLMs

Convert any URL into clean Markdown text. Remove ads and navbars to perfectly format web content for AI and RAG ingestion.

Developer

Andok

Last modified

4 months ago

Web Page to Markdown Converter for LLMs

Convert any webpage into clean, structured Markdown optimized for LLMs and RAG pipelines. Stop wasting tokens on HTML boilerplate — get only the core content with metadata, ready for AI ingestion. Process hundreds of URLs in a single run with configurable concurrency.

Features

Readability cleaning — strips ads, navigation, sidebars, and footers using Mozilla Readability
Markdown formatting — converts article HTML to well-structured Markdown with ATX headings and fenced code blocks
Bulk processing — convert hundreds of URLs in a single run
Metadata extraction — captures page title, author byline, and excerpt alongside the Markdown content
Redirect handling — follows HTTP redirects and reports the final URL
Configurable concurrency — control parallel processing from 1 to 50 simultaneous requests
Pay-per-event pricing — pay only for pages successfully converted

Input

Field	Type	Required	Default	Description
`urls`	`array`	Yes	—	List of webpage URLs to convert to Markdown
`timeoutSeconds`	`integer`	No	`15`	Maximum seconds to wait for each URL response
`concurrency`	`integer`	No	`10`	Number of URLs to process in parallel (1-50)

Input Example

{
  "urls": [
    "https://crawlee.dev",
    "https://docs.apify.com/academy/web-scraping-for-beginners"
  ],
  "timeoutSeconds": 15,
  "concurrency": 10
}
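
Before starting a run it can help to validate the input locally. The sketch below builds the input object with the three fields from the table above and enforces the documented 1-50 concurrency range. The helper name `build_input`, the http(s) scheme check, the positive-timeout check, and the de-duplication of URLs are assumptions made for illustration, not behaviour the actor documents.

```python
import json


def build_input(urls, timeout_seconds=15, concurrency=10):
    """Return an input dict using the field names from the Input table.

    `build_input` is a hypothetical helper, not part of the actor.
    """
    cleaned = [u.strip() for u in urls if u and u.strip()]
    if not cleaned:
        raise ValueError("urls must contain at least one URL")
    for url in cleaned:
        # Assumption: only http(s) URLs are useful to a webpage converter.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an http(s) URL: {url!r}")
    # Documented range for `concurrency` is 1-50.
    if not 1 <= concurrency <= 50:
        raise ValueError("concurrency must be between 1 and 50")
    # Assumption: a zero or negative timeout is never intended.
    if timeout_seconds <= 0:
        raise ValueError("timeoutSeconds must be positive")
    return {
        # dict.fromkeys keeps the first occurrence of each URL, in order.
        "urls": list(dict.fromkeys(cleaned)),
        "timeoutSeconds": int(timeout_seconds),
        "concurrency": int(concurrency),
    }


if __name__ == "__main__":
    payload = build_input([
        "https://crawlee.dev",
        "https://docs.apify.com/academy/web-scraping-for-beginners",
    ])
    print(json.dumps(payload, indent=2))
```

Dropping duplicate URLs up front is a design choice here: each converted page is a billable event, so sending the same URL twice would likely be charged twice.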

Output

Each URL produces one dataset item containing the converted Markdown and extracted metadata.

Key output fields:

inputUrl (string) — the original URL provided
finalUrl (string) — the URL after following redirects
status (number) — HTTP status code
pageTitle (string) — extracted article title
markdown (string) — the full article content converted to Markdown
excerpt (string) — short summary or description of the article
byline (string) — author name if available
error (string) — error message if conversion failed, otherwise null

Output Example

{
  "inputUrl": "https://crawlee.dev",
  "finalUrl": "https://crawlee.dev/",
  "status": 200,
  "pageTitle": "Crawlee - Build reliable crawlers. Fast.",
  "markdown": "# Crawlee\n\nBuild reliable crawlers. Fast.\n\nCrawlee is a web scraping and browser automation library...",
  "excerpt": "Crawlee is a web scraping and browser automation library for Node.js.",
  "byline": null,
  "error": null
}
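
Once the dataset items are downloaded, a common first step is to separate converted pages from failures. The sketch below relies only on the documented fields (`error`, `markdown`, `inputUrl`, `finalUrl`). The function names are hypothetical, and two things are assumptions: that an item with empty `markdown` should count as a failure, and that a failed item may carry a missing or null `finalUrl`.

```python
def split_results(items):
    """Separate converted pages from failures using the `error` field."""
    converted, failed = [], []
    for item in items:
        # Assumption: an item without Markdown is treated as a failure
        # even when `error` is null.
        if item.get("error") is None and item.get("markdown"):
            converted.append(item)
        else:
            failed.append(item)
    return converted, failed


def was_redirected(item):
    """True when finalUrl differs from inputUrl beyond a trailing slash."""
    final_url = item.get("finalUrl")
    if not final_url:
        # Assumption: failed items may have no final URL at all.
        return False
    return item["inputUrl"].rstrip("/") != final_url.rstrip("/")


if __name__ == "__main__":
    items = [
        {
            "inputUrl": "https://crawlee.dev",
            "finalUrl": "https://crawlee.dev/",
            "status": 200,
            "markdown": "# Crawlee\n\nBuild reliable crawlers. Fast.",
            "error": None,
        },
    ]
    converted, failed = split_results(items)
    print(len(converted), "converted,", len(failed), "failed")
    print("redirected:", [i["inputUrl"] for i in converted if was_redirected(i)])
```

With the output example above, `https://crawlee.dev` and `https://crawlee.dev/` differ only by the trailing slash, so the item is not reported as redirected.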

Pricing

Event	Cost
Page Converted	Pay-per-event, from $1.00 per 1,000 pages converted (see actor pricing page)

The actor respects the per-run max charge limit. Processing stops automatically when the spending cap is reached.
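
To size a run against a spending cap, the arithmetic can be sketched as follows. The rate of $1.00 per 1,000 pages is the listing's "from" price and is treated here as an assumption, since the actual rate may differ by plan; the function names are hypothetical. `Decimal` is used so that fractions of a cent are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Assumption: the listing's "from $1.00 / 1,000" rate; check the pricing page.
PRICE_PER_1000 = Decimal("1.00")


def estimate_cost(pages, price_per_1000=PRICE_PER_1000):
    """Estimated charge in USD for `pages` successfully converted pages."""
    return (Decimal(pages) * price_per_1000 / 1000).quantize(Decimal("0.001"))


def pages_within_cap(max_charge_usd, price_per_1000=PRICE_PER_1000):
    """Largest number of converted pages that fits under a spending cap."""
    per_page = price_per_1000 / 1000
    return int(Decimal(str(max_charge_usd)) // per_page)


if __name__ == "__main__":
    print(estimate_cost(250))       # charge for 250 converted pages
    print(pages_within_cap("0.50"))  # pages that fit under a $0.50 cap
```

Because only successfully converted pages are billed, the estimate is an upper bound for a run in which some URLs fail.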

Use Cases

RAG pipeline ingestion — convert documentation sites and knowledge bases into Markdown for vector database indexing
LLM context preparation — clean web content for ChatGPT, Claude, or other LLM context windows without HTML noise
Documentation migration — bulk-convert web pages to Markdown files for static site generators
Content archiving — save readable article snapshots in a portable, version-control-friendly format
AI training data — prepare clean text corpora from web sources for fine-tuning or evaluation
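
For the RAG use case, the `markdown` field usually has to be chunked before indexing. Since the output uses ATX headings, one simple approach is to split at heading lines. The sketch below is an illustration of that idea, not something the actor does itself; `split_by_headings` is a hypothetical helper, and its fence handling is deliberately simple (it toggles on any line that opens with three or more backticks or tildes).

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title text.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
# Fence line: three or more backticks or tildes at the start of the line.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")


def split_by_headings(markdown):
    """Split Markdown into (heading, body) chunks at ATX headings.

    Text before the first heading is returned with heading None.
    Lines inside fenced code blocks are never treated as headings.
    """
    chunks = []
    title, body = None, []
    in_fence = False

    def flush():
        # Skip an empty leading chunk (no heading and only blank lines).
        if title is not None or any(line.strip() for line in body):
            chunks.append((title, "\n".join(body).strip()))

    for line in markdown.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            title, body = match.group(2), []
        else:
            body.append(line)
    flush()
    return chunks


if __name__ == "__main__":
    sample = "# Crawlee\n\nBuild reliable crawlers. Fast.\n\n## Install\n\nSee the docs."
    for heading, text in split_by_headings(sample):
        print(repr(heading), "->", repr(text))
```

Each chunk keeps its heading, which can be stored next to `finalUrl` and `pageTitle` as retrieval metadata.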

Actor	What it adds
Article Text Extractor for TTS & AI	Plain text output optimized for text-to-speech and summarization
PDF to Text Converter for AI & RAG	Extend your pipeline to extract text from PDF documents
YouTube Transcript Scraper for AI & RAG	Add video transcript extraction to your content pipeline
