
Jordan Wagner

jalicia

ACTOR STATS

1 public Actor

2 total users

>99% runs succeeded

RAG Post-Processor

Clean and chunk raw scraped text for LLM/RAG pipelines.

This Actor takes messy output from any scraper and returns clean, overlapping chunks ready for embeddings and vector databases.

Features

  • Smart HTML/text cleaning
  • Semantic-aware chunking (1000 chars with overlap)
  • Metadata preservation
  • High-volume ready: pay-per-event (PPE) pricing billed in small increments
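
To make the "1000 chars with overlap" idea concrete, here is a minimal sketch of overlapping character-window chunking. It is not the Actor's actual implementation: the function name `chunk_text` and the 200-character overlap are assumptions (the listing does not state the overlap size), and this version splits purely by character count, without any of the semantic awareness the Actor advertises.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`.

    chunk_size=1000 matches the listing; overlap=200 is an assumed value.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so no trailing chunk is made up only of repeated overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(2500))
    for i, chunk in enumerate(chunk_text(sample)):
        print(i, len(chunk))
```

Each chunk repeats the tail of the previous one, so a sentence cut at a chunk boundary still appears whole in at least one chunk, as long as it is shorter than the overlap.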

Usage

Connect it after any scraper using Actor-to-Actor webhooks or dataset input.

Input example:

{
  "data": [
    {"text": "Your raw scraped content here..."}
  ]
}
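
When the upstream scraper returns items with extra fields, they need to be reshaped into the input format shown above. The following is a rough sketch of that step, using only the `data` and `text` keys from the example; the helper name `build_input`, the `text_field` parameter, and the sample scraper items are hypothetical.

```python
import json


def build_input(items: list[dict], text_field: str = "text") -> dict:
    """Reshape scraper items into the {"data": [{"text": ...}]} input format.

    `text_field` names the key holding raw text in the scraper's output;
    this is an assumption, so adjust it to match the scraper you use.
    """
    records = []
    for item in items:
        raw = item.get(text_field)
        # Skip items with missing or empty text so they produce no chunks.
        if isinstance(raw, str) and raw.strip():
            records.append({"text": raw})
    return {"data": records}


# Hypothetical scraper output, for illustration only.
scraped = [
    {"url": "https://example.com/a", "text": "Your raw scraped content here..."},
    {"url": "https://example.com/b", "text": ""},
]
payload = build_input(scraped)

if __name__ == "__main__":
    print(json.dumps(payload, indent=2))
```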

Output: one item per chunk, with the chunk's text in chunk_text.

Pricing

$0.25 per 1,000 chunks processed (very affordable for high volume)
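
For budgeting, the listed rate can be turned into a quick estimate. This is only a sketch: it assumes cost scales linearly with the number of chunks (the listing does not say whether billing is rounded to whole blocks of 1,000), and the function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Rate taken from the listing: $0.25 per 1,000 chunks processed.
PRICE_PER_1000_CHUNKS = Decimal("0.25")


def estimate_cost(num_chunks: int) -> Decimal:
    """Estimate the charge in USD, assuming strictly linear per-chunk pricing."""
    if num_chunks < 0:
        raise ValueError("num_chunks must be non-negative")
    return PRICE_PER_1000_CHUNKS * num_chunks / 1000


if __name__ == "__main__":
    for n in (1_000, 40_000, 1_000_000):
        print(f"{n:>9,} chunks: ${estimate_cost(n):.2f}")
```

Decimal is used instead of float so that small per-chunk amounts add up without binary rounding error.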

Perfect for AI agents, RAG systems, and automation pipelines.

Made with ❤️ for the AI gold rush.
