Get Site to Markdown
A low-cost asynchronous crawler that mirrors a website into a single organized Markdown file, preserving images and directory structure. Well suited to building context for AI agents.
Website to Markdown Crawler
An asynchronous web crawler that mirrors websites into a single organized markdown file, with special handling for images and proper directory structure preservation. Built with Python, asyncio, and httpx.
Author: Jordan Haisley (jordan@b-w.pro)
Features
- 🚀 Fast asynchronous crawling using httpx and asyncio
- 📁 Preserves site structure - can be limited to specific subdirectories
- 🖼️ Smart image handling - preserves both alt text and filenames
- 📝 Clean Markdown output with proper sectioning
- 🔍 Depth-controlled crawling
- 🔒 Domain-restricted recursive crawling for safety
- 🤫 Quiet mode for silent operation
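The depth-controlled, domain-restricted crawl described above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: the names `crawl`, `in_scope`, and `LinkExtractor` are hypothetical, the `fetch` callable is injected so the sketch runs without network access (in a real crawler it would presumably wrap an httpx request), and it assumes that depth 0 means the start page only.

```python
import asyncio
from html.parser import HTMLParser
from typing import Awaitable, Callable
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def in_scope(url: str, start_url: str) -> bool:
    """True if url is on the start URL's host and under its directory."""
    u, s = urlparse(url), urlparse(start_url)
    # Assumption: the crawl is limited to the directory of the start URL.
    prefix = s.path.rsplit("/", 1)[0] + "/"
    return (
        u.scheme in ("http", "https")
        and u.netloc == s.netloc
        and (u.path or "/").startswith(prefix)
    )


async def crawl(
    start_url: str,
    fetch: Callable[[str], Awaitable[str]],
    max_depth: int = 1,
) -> dict[str, str]:
    """Breadth-first crawl; returns {url: html} for every page fetched."""
    seen = {start_url}
    pages: dict[str, str] = {}
    frontier = [start_url]
    for _depth in range(max_depth + 1):
        if not frontier:
            break
        # Fetch the whole level concurrently.
        bodies = await asyncio.gather(*(fetch(u) for u in frontier))
        next_frontier = []
        for url, html in zip(frontier, bodies):
            pages[url] = html
            parser = LinkExtractor()
            parser.feed(html)
            for href in parser.links:
                # Resolve relative links and drop #fragments.
                link = urldefrag(urljoin(url, href)).url
                if link not in seen and in_scope(link, start_url):
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return pages
```

Fetching one level at a time keeps the depth limit easy to reason about, and the `seen` set prevents a page from being fetched twice when several pages link to it.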
As an Apify Actor
Actor input schema:
    {
      "start_urls": [{"url": "https://example.com"}],
      "max_depth": 1
    }
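An input of this shape can be assembled programmatically. The helper below is a sketch only: `build_actor_input` is a hypothetical name, and its validation rules (http/https URLs only, non-negative depth) are assumptions rather than documented constraints of the Actor.

```python
import json
from urllib.parse import urlparse


def build_actor_input(urls: list[str], max_depth: int = 1) -> dict:
    """Build an input dict with the start_urls / max_depth fields shown above."""
    if max_depth < 0:
        raise ValueError("max_depth must be zero or greater")
    if not urls:
        raise ValueError("at least one start URL is required")
    start_urls = []
    for url in urls:
        parts = urlparse(url)
        # Assumption: only absolute http(s) URLs make sense as start URLs.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
        start_urls.append({"url": url})
    return {"start_urls": start_urls, "max_depth": max_depth}


if __name__ == "__main__":
    print(json.dumps(build_actor_input(["https://example.com"]), indent=2))
```

The resulting dict can be serialized with `json.dumps` and supplied as the Actor's run input.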
Output Format
The generated markdown file contains:
- A section for each page
- Page title as heading
- Original URL reference
- Page content in Markdown format
- Image references with both alt text and filenames
Example output:
    # Page Title
    *URL: https://example.com/page*



    Page content in markdown...

    ----------------
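To make the section layout concrete, here is a small sketch that renders pages in roughly the format shown above. The function names are hypothetical, and the image syntax `![alt](filename)` is an assumption about how alt text and filenames are combined; the Actor's actual output may differ.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

SEPARATOR = "-" * 16  # horizontal rule between page sections


def image_ref(alt: str, src: str) -> str:
    """Markdown image reference keeping the alt text and the bare filename."""
    # Assumption: only the last path segment of the image URL is kept.
    filename = PurePosixPath(urlparse(src).path).name
    return f"![{alt}]({filename})"


def render_page(title: str, url: str, content: str) -> str:
    """One section: title heading, URL reference, body, separator."""
    return f"# {title}\n*URL: {url}*\n\n{content.strip()}\n\n{SEPARATOR}\n"


def render_site(pages: list[tuple[str, str, str]]) -> str:
    """Concatenate (title, url, content) sections into a single document."""
    return "\n".join(render_page(*page) for page in pages)


if __name__ == "__main__":
    body = "Intro text.\n\n" + image_ref("Logo", "https://example.com/img/logo.png")
    print(render_site([("Page Title", "https://example.com/page", body)]))
```

Keeping the per-page renderer separate from the site-level join makes it straightforward to write all sections into one file.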
Pricing
Pricing model: Pay per usage
The Actor itself is free to use; you pay only for the Apify platform usage it consumes.