Webpage Text Extractor (Readability)
Pricing
$30.00 / 1,000 pages extracted
Extract the clean main article content of any URL as plain text and markdown, with nav, ads, and footers stripped. The reader API that agents need for RAG.
Rating
0.0 (0)
Developer
Anthony Snider
Maintained by Community
Actor stats
Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 2 days ago
Turn any article URL into clean main-content text and markdown — nav, ads, sidebars, and footers stripped. The reader your AI agent needs for RAG, summarization, and content pipelines. No API key, pay per page.
▶ Live on the Apify Store — run it instantly, or call it as an agent tool via Apify MCP.
Why
LLM agents waste tokens on boilerplate. This returns just the readable article — as portable markdown (absolute links/images) and plain text — plus word count, reading time, and a readability score.
What you get (per page)
markdown: clean GitHub-flavored markdown of the main content
text: plain readable text
title, byline, publishedAt, lang, excerpt
wordCount, readingTimeMin, fleschReadingEase
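For intuition about the three numeric fields, here is a minimal sketch of how such metrics are commonly derived from plain text. It uses the standard Flesch Reading Ease formula; the syllable counter, the tokenization, and the 200 words-per-minute reading speed are assumptions for illustration, and the Actor's own calculation may differ.

```python
import math
import re

# Assumed reading speed; the value the Actor uses is not documented here.
WORDS_PER_MINUTE = 200


def count_syllables(word: str) -> int:
    """Rough estimate: count vowel groups and drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_metrics(text: str) -> dict:
    """Compute word count, reading time in minutes, and Flesch Reading Ease."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)
    if n_words == 0:
        return {"wordCount": 0, "readingTimeMin": 0, "fleschReadingEase": None}
    n_sentences = max(len(sentences), 1)
    syllables = sum(count_syllables(w) for w in words)
    # Flesch Reading Ease: higher scores mean easier text.
    flesch = 206.835 - 1.015 * (n_words / n_sentences) - 84.6 * (syllables / n_words)
    return {
        "wordCount": n_words,
        "readingTimeMin": math.ceil(n_words / WORDS_PER_MINUTE),
        "fleschReadingEase": round(flesch, 1),
    }
```

Short sentences with short words score high; long sentences with many polysyllabic words score low.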
Input
{ "url": "https://example.com/some-article", "outputFormat": "both" }
or bulk:
{ "urls": ["https://a.com/post", "https://b.com/post"], "maxUrls": 25 }
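As a rough sketch, either input can be sent through the Apify REST API's synchronous run endpoint using only the Python standard library. The Actor ID below is a hypothetical placeholder (the real ID is not given in this listing), and the sketch assumes you have an Apify API token for your account.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the Actor's real ID from its Store page.
ACTOR_ID = "username~webpage-text-extractor"
API_BASE = "https://api.apify.com/v2"


def build_run_request(run_input: dict, token: str,
                      actor_id: str = ACTOR_ID) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def extract_pages(run_input: dict, token: str) -> list:
    """Send the request and decode the JSON array of extracted pages."""
    request = build_run_request(run_input, token)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


# The two input shapes shown above.
single = {"url": "https://example.com/some-article", "outputFormat": "both"}
bulk = {"urls": ["https://a.com/post", "https://b.com/post"], "maxUrls": 25}
```

Calling `extract_pages(bulk, token)` would then return one record per extracted page, assuming the placeholder ID has been replaced.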
Output
{
  "url": "https://example.com/some-article",
  "title": "How web scraping works",
  "byline": "Jane Doe",
  "lang": "en",
  "markdown": "# How web scraping works\n\nWeb scraping is ...",
  "text": "How web scraping works. Web scraping is ...",
  "wordCount": 1240,
  "readingTimeMin": 6,
  "fleschReadingEase": 58.2
}
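For a RAG pipeline, a record like this usually needs to be split into chunks before embedding. The following is one possible sketch that splits the `markdown` field at headings and caps each chunk by word count. The function name `chunk_markdown` and the 200-word default are illustrative assumptions, not part of the Actor.

```python
import re


def chunk_markdown(record: dict, max_words: int = 200) -> list:
    """Split a record's markdown into heading-scoped chunks of at most max_words words."""
    chunks = []
    heading = record.get("title", "")
    buffer = []

    def flush():
        # Emit the buffered words as one chunk under the current heading.
        if buffer:
            chunks.append({
                "url": record.get("url"),
                "heading": heading,
                "text": " ".join(buffer),
            })
            buffer.clear()

    for line in record.get("markdown", "").splitlines():
        line = line.strip()
        match = re.match(r"#{1,6}\s+(.+)", line)
        if match:
            # A new heading closes the current chunk and starts a new scope.
            flush()
            heading = match.group(1).strip()
            continue
        for word in line.split():
            buffer.append(word)
            if len(buffer) >= max_words:
                flush()
    flush()
    return chunks
```

Keeping the source `url` and nearest heading on every chunk lets an agent cite where a retrieved passage came from.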
Notes
Uses a readability heuristic (semantic containers + text-density scoring) — works on most articles and blogs without a headless browser, so it's fast and cheap. Returns only the public content of the URL you provide.
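To illustrate what a heuristic of this kind can look like, here is a simplified sketch built on Python's standard `html.parser`. It is not the Actor's implementation: the candidate tags, the bonus weights, and the halving of credit per nesting level are all assumptions chosen for the example.

```python
from html.parser import HTMLParser

CANDIDATES = {"article", "main", "section", "div"}
SEMANTIC_BONUS = {"article": 2.0, "main": 1.5}  # assumed weights
SKIPPED = {"script", "style", "nav", "footer", "aside", "header", "form"}


class DensityScorer(HTMLParser):
    """Collect text per container and score it by non-link text density."""

    def __init__(self):
        super().__init__()
        self.stack = []      # open candidate containers, innermost last
        self.finished = []   # closed containers with their totals
        self.skip_depth = 0  # > 0 while inside boilerplate tags
        self.link_depth = 0  # > 0 while inside <a>

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1
        elif tag == "a":
            self.link_depth += 1
        elif tag in CANDIDATES:
            self.stack.append(
                {"tag": tag, "text": [], "chars": 0, "link_chars": 0, "score": 0.0}
            )

    def handle_endtag(self, tag):
        if tag in SKIPPED:
            self.skip_depth = max(self.skip_depth - 1, 0)
        elif tag == "a":
            self.link_depth = max(self.link_depth - 1, 0)
        elif tag in CANDIDATES and self.stack and self.stack[-1]["tag"] == tag:
            self.finished.append(self.stack.pop())

    def handle_data(self, data):
        text = " ".join(data.split())
        if not text or self.skip_depth:
            return
        # Every enclosing container keeps the text; score credit halves per level.
        for depth, node in enumerate(reversed(self.stack)):
            node["text"].append(text)
            node["chars"] += len(text)
            if self.link_depth:
                node["link_chars"] += len(text)
            else:
                node["score"] += len(text) / (2 ** depth)


def final_score(node: dict) -> float:
    """Weight a container's score by its tag bonus and penalize link-heavy blocks."""
    if node["chars"] == 0:
        return 0.0
    link_density = node["link_chars"] / node["chars"]
    return node["score"] * SEMANTIC_BONUS.get(node["tag"], 1.0) * (1 - link_density)


def extract_main_text(html: str) -> str:
    """Return the text of the highest-scoring container, or '' if none is found."""
    scorer = DensityScorer()
    scorer.feed(html)
    scorer.close()
    candidates = scorer.finished + scorer.stack  # include unclosed containers
    best = max(candidates, key=final_score, default=None)
    return " ".join(best["text"]) if best else ""
```

Because this works on the raw HTML response, a page that renders its content with client-side JavaScript would yield little or no text under this sketch.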