Dynamic Markdown Scraper

louisdeconinck/dynamic-markdown-scraper

2 hours trial then $19.00/month - No credit card required now

Effortlessly feed LLMs with clean Markdown using our advanced web scraper. Seamlessly scrape dynamic, JavaScript-rendered websites while preserving original formatting. Ideal for AI training, documentation, and content migration.

A powerful web scraper that converts difficult-to-scrape web pages into clean, well-formatted Markdown content. This scraper crawls websites and automatically transforms their HTML content into Markdown while maintaining the original structure and formatting. It handles dynamic content and JavaScript-rendered pages with ease.
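The actor's own conversion pipeline is not shown here; as a rough illustration of the idea, a minimal (and far less capable) HTML-to-Markdown pass might look like the sketch below. This is purely illustrative, not the actor's implementation, which handles full DOM parsing and JavaScript rendering.

```typescript
// Minimal illustration of HTML-to-Markdown conversion.
// NOT the actor's implementation -- just a sketch of the idea using
// naive regex replacements; real converters parse the DOM instead.
function htmlToMarkdown(html: string): string {
  return html
    .replace(/<h1[^>]*>(.*?)<\/h1>/gs, "# $1\n\n")
    .replace(/<h2[^>]*>(.*?)<\/h2>/gs, "## $1\n\n")
    .replace(/<a[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/gs, "[$2]($1)")
    .replace(/<li[^>]*>(.*?)<\/li>/gs, "- $1\n")
    .replace(/<\/?(p|ul|ol|div)[^>]*>/g, "\n") // block tags become breaks
    .replace(/<[^>]+>/g, "")                   // strip any remaining tags
    .replace(/\n{3,}/g, "\n\n")                // collapse extra blank lines
    .trim();
}

const sample =
  '<h1>Apify Storage</h1><p>See <a href="https://docs.apify.com">the docs</a>.</p>';
console.log(htmlToMarkdown(sample));
// → "# Apify Storage\n\nSee [the docs](https://docs.apify.com)."
```

A production converter would also handle nested lists, tables, code blocks, and malformed HTML, which regex substitution cannot do reliably.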

Features

  • Crawls websites and converts content to Markdown format
  • Maintains proper heading structure, lists, and code blocks
  • Handles dynamic content and JavaScript-rendered pages
  • Handles images and links correctly
  • Restricts crawling to the same domain as the start URL
  • Filters out unwanted content (navigation, footers, etc.)
  • Configurable maximum crawl limits
  • Smart content extraction focusing on main article content
  • Built with TypeScript for better maintainability
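The same-domain restriction above can be expressed with the standard `URL` API. A minimal sketch of such a filter (a hypothetical helper, not the actor's actual code):

```typescript
// Sketch of a same-domain filter for discovered links.
// Hypothetical helper, not part of the actor's public API.
function isSameDomain(link: string, baseUrl: string): boolean {
  try {
    // Resolves relative links against the base URL before comparing hosts.
    return new URL(link, baseUrl).hostname === new URL(baseUrl).hostname;
  } catch {
    return false; // unparseable links are skipped
  }
}

console.log(isSameDomain("https://apify.com/storage", "https://apify.com")); // true
console.log(isSameDomain("https://example.com/page", "https://apify.com")); // false
console.log(isSameDomain("/pricing", "https://apify.com"));                 // true (relative link)
```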

Use Cases

  • Feed website content to LLM AI for further processing
  • Extract content from websites for documentation, blog posts, or technical writing
  • Scrape and convert web pages for use in static sites, blogs, or other projects
  • Automate content migration from legacy systems to modern platforms

Input Configuration

The scraper accepts the following input parameters:

  • startUrls: Array of URLs where the crawler should begin (required)
  • maxRequestsPerCrawl: Maximum number of pages to crawl (optional, defaults to unlimited)

Example input:

{
    "startUrls": [
        { "url": "https://apify.com" }
    ],
    "maxRequestsPerCrawl": 100
}
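When constructing this input programmatically, it can help to validate it before starting a run. A small sketch follows; the `ScraperInput` name and `buildInput` helper are our own, and only the field names come from the input schema above.

```typescript
// Shape of the actor's input (field names from the docs above;
// the interface name itself is just for this sketch).
interface ScraperInput {
  startUrls: { url: string }[];
  maxRequestsPerCrawl?: number; // optional; unlimited when omitted
}

// Hypothetical helper that validates URLs and assembles the input object.
function buildInput(urls: string[], maxRequests?: number): ScraperInput {
  if (urls.length === 0) {
    throw new Error("startUrls is required and must not be empty");
  }
  const input: ScraperInput = {
    // new URL() throws on malformed URLs, catching typos early.
    startUrls: urls.map((url) => ({ url: new URL(url).toString() })),
  };
  if (maxRequests !== undefined) input.maxRequestsPerCrawl = maxRequests;
  return input;
}

console.log(JSON.stringify(buildInput(["https://apify.com"], 100), null, 2));
```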

Output Format

The scraper saves the following data for each processed page:

  • url: The URL of the scraped page
  • title: Page title
  • markdown: Converted Markdown content
  • capturedAt: Timestamp of when the page was scraped

Example output:

{
	"url": "https://apify.com/storage",
	"title": "Storage optimized for scraping · Apify",
	"markdown": "# Apify Storage\n\nScalable and reliable cloud data storage designed for web scraping and automation workloads.\n\n[View documentation](https://docs.apify.com/platform/storage)\n\nBenefits\n\n## Specialized storage from Apify[](https://apify.com/storage#specialized-storage-from-apify)\n\n![Enterprise_grade_reliability_performance_and_scalability_9890860f85.svg](https://cdn-cms.apify.com/Enterprise_grade_reliability_performance_and_scalability_9890860f85.svg)\n\n### Enterprise-grade reliability, performance, and scalability[](https://apify.com/storage#enterprise-grade-reliability-performance-and-scalability)\n\nStore a few records or a few hundred million, with the same low latency and high reliability. We use Amazon Web Services for the underlying data storage, giving you high availability and peace of mind.\n\n### Low-cost storage for web scraping and crawling[](https://apify.com/storage#low-cost-storage-for-web-scraping-and-crawling)\n\nApify provides low-cost storage carefully designed for the large workloads typical of web scraping and crawling operations.\n\n![Low_cost_storage_for_web_scraping_and_crawling_b313f7d95e.svg](https://cdn-cms.apify.com/Low_cost_storage_for_web_scraping_and_crawling_b313f7d95e.svg)\n\n![Easy_to_use_634e40ae76.svg](https://cdn-cms.apify.com/Easy_to_use_634e40ae76.svg)\n\n### Easy to use[](https://apify.com/storage#easy-to-use)\n\nData can be viewed on the web, giving you a quick way to review and share it with other people. The Apify [API](https://docs.apify.com/api/v2) and [SDK](https://docs.apify.com/sdk/js/) makes it easy to integrate our storage into your apps.\n\nFeatures\n\n## We’ve got you covered[](https://apify.com/storage#weve-got-you-covered)\n\n[![Dataset_78dfe4e3a4.svg](https://cdn-cms.apify.com/Dataset_78dfe4e3a4.svg)\n\n**Dataset**  \nStore results from your web scraping, crawling or data processing jobs into Apify datasets and export them to various formats like JSON, CSV, XML, RSS, Excel or HTML.\n\n\n\n\n\n](https://docs.apify.com/platform/storage/dataset)[![Request_queue_9e9602319e.svg](https://cdn-cms.apify.com/Request_queue_9e9602319e.svg)\n\n**Request queue**  \nMaintain a queue of URLs of web pages in order to recursively crawl websites, starting from initial URLs and adding new links as they are found while skipping duplicates.\n\n\n\n\n\n](https://docs.apify.com/platform/storage/request-queue)[![Key_value_store_bc65220b7d.svg](https://cdn-cms.apify.com/Key_value_store_bc65220b7d.svg)\n\n**Key-value store**  \nStore arbitrary data records along with their MIME content type. The records are accessible under a unique name and can be written and read at a rapid rate.\n\n\n\n\n\n](https://docs.apify.com/platform/storage/key-value-store)\n\n## Ready to build your first Actor?[](https://apify.com/storage#ready-to-build-your-first-actor)\n\n[Start developing](https://apify.com/templates)",
	"capturedAt": "2025-01-23T14:01:21.956Z"
}
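Downstream code can type each record and post-process the dataset, for example keeping only pages captured after a cutoff date. In the sketch below, the `ScrapedPage` name and `capturedSince` helper are our own; only the four field names come from the actor's output format.

```typescript
// Shape of one output record (field names from the docs above).
interface ScrapedPage {
  url: string;
  title: string;
  markdown: string;
  capturedAt: string; // ISO-8601 timestamp
}

// Keep only records captured on or after a given cutoff.
function capturedSince(pages: ScrapedPage[], cutoff: Date): ScrapedPage[] {
  return pages.filter((p) => new Date(p.capturedAt) >= cutoff);
}

const pages: ScrapedPage[] = [
  { url: "https://apify.com/storage", title: "Storage", markdown: "# Apify Storage", capturedAt: "2025-01-23T14:01:21.956Z" },
  { url: "https://apify.com/pricing", title: "Pricing", markdown: "# Pricing", capturedAt: "2024-12-01T09:00:00.000Z" },
];
console.log(capturedSince(pages, new Date("2025-01-01")).map((p) => p.url));
// → [ 'https://apify.com/storage' ]
```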
Developer

Maintained by Community

Actor Metrics

  • 1 monthly user
  • 1 star
  • Created in Jan 2025
  • Modified 11 hours ago