Article to Text Extractor (for TTS/LLMs)

Pricing

from $1.00 / 1,000 dataset items

Extract the core readable text of any article or blog post, stripping out boilerplate. Perfect for Text-to-Speech or AI summaries.

Rating

0.0

(0)

Developer

Andok

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 8 hours ago

Text-to-Speech Page Reader (Bulk)

Extracts the core readable article text from a list of URLs, stripping out navigation, ads, and sidebars, preparing the content for TTS (Text-to-Speech) pipelines.

What it does

For each input URL, the actor downloads the HTML and uses Mozilla's Readability engine to:

  • Extract the main article text (plain text).
  • Extract the main title.
  • Extract byline/author and excerpt.
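To make the idea of stripping boilerplate concrete, here is a minimal sketch using only the Python standard library. It is not Mozilla's Readability and not the actor's code: Readability scores content heuristically, while this stand-in simply ignores text inside a fixed set of tags. The names `SKIP_TAGS`, `ArticleTextParser`, and `extract_article` are hypothetical.

```python
from html.parser import HTMLParser

# Tags whose contents count as boilerplate in this simplified sketch.
# (Assumption: Readability decides this heuristically, not from a fixed list.)
SKIP_TAGS = {"script", "style", "nav", "aside", "header", "footer", "form", "noscript"}


class ArticleTextParser(HTMLParser):
    """Collects the <title> text and the visible text outside SKIP_TAGS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0       # > 0 while inside a boilerplate element
        self.in_title = False
        self.title_parts = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)
        elif self.skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract_article(html):
    """Return the page title and the text found outside boilerplate tags."""
    parser = ArticleTextParser()
    parser.feed(html)
    parser.close()
    return {
        "title": "".join(parser.title_parts).strip(),
        "text": " ".join(parser.text_parts),
    }


sample_html = (
    "<html><head><title>Hello</title></head><body>"
    "<nav>Home | About</nav>"
    "<article><h1>Hello</h1><p>First paragraph.</p><p>Second.</p></article>"
    "<aside>Ads</aside><script>var x = 1;</script>"
    "</body></html>"
)
article = extract_article(sample_html)
```

On the sample page, the navigation, sidebar, and script contents are dropped and only the heading and paragraphs remain in `article["text"]`. Byline and excerpt detection are left out here because they depend on Readability's own heuristics.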

Typical uses

  • Podcast generation: turn blog posts and articles into clean text payloads for TTS APIs (like ElevenLabs or OpenAI TTS).
  • Summarization: feed the clean text into an LLM without wasting tokens on HTML boilerplate.

Input

  • urls (required): list of article or blog post URLs to extract.
  • timeoutSeconds (default 15): timeout in seconds.
  • concurrency (default 10): how many URLs are processed in parallel.
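As an illustration of the input shape, the sketch below parses an example input and fills in the two documented defaults. Only the field names and default values come from the list above; the validation rules, the helper name `normalize_input`, and the example.com URLs are assumptions made for the example.

```python
import json

# Defaults taken from the input list above; everything else is assumed.
DEFAULTS = {"timeoutSeconds": 15, "concurrency": 10}


def normalize_input(raw):
    """Validate a raw input dict and fill in the documented defaults."""
    urls = raw.get("urls")
    if not isinstance(urls, list) or not urls:
        raise ValueError("'urls' is required and must be a non-empty list")

    # Trim whitespace and drop empty entries (assumed cleanup, not documented).
    cleaned = [u.strip() for u in urls if isinstance(u, str) and u.strip()]
    if not cleaned:
        raise ValueError("'urls' contains no usable entries")

    return {
        "urls": cleaned,
        "timeoutSeconds": int(raw.get("timeoutSeconds", DEFAULTS["timeoutSeconds"])),
        "concurrency": int(raw.get("concurrency", DEFAULTS["concurrency"])),
    }


# Hypothetical input: two placeholder URLs and an overridden timeout.
example_input = json.loads("""
{
  "urls": ["https://example.com/blog/post-1", "https://example.com/blog/post-2"],
  "timeoutSeconds": 20
}
""")

run_input = normalize_input(example_input)
```

Here `concurrency` is omitted from the input, so it falls back to the documented default of 10, while `timeoutSeconds` keeps the supplied value.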

Output

Writes one dataset item per input URL, containing the clean article text and its metadata (title, byline/author, and excerpt).
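The following sketch shows one way a bulk run could honor a concurrency limit and a timeout while still yielding exactly one item per input URL. It is an assumption-laden illustration, not the actor's implementation: the item fields (`url`, `ok`, `htmlLength`, `error`), the choice to emit an error item for failed URLs, and the injected `fetch` coroutine are all hypothetical.

```python
import asyncio


async def process_urls(urls, fetch, concurrency=10, timeout_seconds=15):
    """Fetch every URL with bounded parallelism; return one item per URL, in input order."""
    semaphore = asyncio.Semaphore(concurrency)

    async def process_one(url):
        async with semaphore:  # at most `concurrency` fetches run at once
            try:
                html = await asyncio.wait_for(fetch(url), timeout=timeout_seconds)
                return {"url": url, "ok": True, "htmlLength": len(html)}
            except Exception as exc:
                # Assumed behavior: a failed URL still produces an item.
                return {"url": url, "ok": False, "error": type(exc).__name__}

    return await asyncio.gather(*(process_one(url) for url in urls))


async def fake_fetch(url):
    """Stand-in for a real HTTP download so the sketch runs offline."""
    await asyncio.sleep(1.0 if "slow" in url else 0.01)
    return "<html>" + url + "</html>"


items = asyncio.run(
    process_urls(["page-a", "slow-page", "page-b"], fake_fetch,
                 concurrency=2, timeout_seconds=0.1)
)
```

Because `asyncio.gather` preserves argument order, `items` lines up with the input list even though the slow URL times out while the others succeed.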

Monetization + safety

This actor is designed for Pay-Per-Event (dataset item = 1 unit of work) and respects the per-run max charge limit.
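As a rough illustration of what respecting a max charge limit means when each dataset item is one billable unit, the arithmetic can be sketched as below. The helper `items_within_budget` is hypothetical and does not call the Apify SDK, and the $0.001 per-item price is only an assumption derived from the listed "from $1.00 / 1,000 dataset items" rate.

```python
from decimal import Decimal


def items_within_budget(requested_items, max_total_charge_usd, price_per_item_usd):
    """Return how many of the requested items fit under the run's charge cap."""
    # Decimal avoids float rounding surprises such as 0.05 / 0.001 != 50.
    budget = Decimal(str(max_total_charge_usd))
    price = Decimal(str(price_per_item_usd))
    affordable = int(budget // price)
    return min(requested_items, affordable)


# 500 URLs requested, but a $0.05 cap at $0.001 per item covers only 50 of them.
allowed = items_within_budget(500, 0.05, 0.001)
```

Under these assumptions, a run would stop producing items once the affordable count is reached rather than exceed the cap.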