AI RAG Feeder V2

Pricing

$1.00 / 1,000 pages

Turn any website into AI-ready Markdown. Scrapes entire domains, removes ads/clutter, and formats text specifically for RAG pipelines and LLM training data.

Developer

Mickey Moore

Maintained by Community

Actor stats

Bookmarked: 0

Total users: 4

Monthly active users: 2

Last modified: a month ago

AI RAG Feeder V2 is a specialized scraper designed to feed data into LLM (Large Language Model) and RAG (Retrieval-Augmented Generation) pipelines. It navigates websites and converts the HTML content into clean, token-efficient Markdown.

✨ Features

  • Clean Markdown Extraction: Automatically removes ads, navbars, and footers to save tokens.
  • Recursive Crawling: Can follow links to scrape entire documentation sites.
  • Smart Formatting: Preserves headers, code blocks, and tables for better embedding quality.
  • Proxy Support: Built-in proxy rotation to reduce the risk of IP blocking.
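
To make the "Clean Markdown Extraction" idea concrete, here is a minimal sketch of stripping clutter elements and emitting Markdown headings. It is not the actor's implementation, which is not documented on this page. The tag list `SKIP_TAGS`, the class `MarkdownSketch`, and the function `html_to_markdown` are hypothetical names chosen for illustration, and only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser

# Assumption: these tags count as clutter. The actor's real rules may differ.
SKIP_TAGS = {"nav", "footer", "aside", "script", "style"}
HEADING_PREFIX = {"h1": "# ", "h2": "## ", "h3": "### "}


class MarkdownSketch(HTMLParser):
    """Collects text outside clutter tags and prefixes headings with '#'."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a clutter element
        self.prefix = ""     # Markdown prefix for the next text run
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in HEADING_PREFIX:
            self.prefix = HEADING_PREFIX[tag]

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in HEADING_PREFIX:
            self.prefix = ""

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.blocks.append(self.prefix + text)
            self.prefix = ""


def html_to_markdown(html: str) -> str:
    """Return the kept text blocks joined by blank lines."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks)


page = (
    "<nav><a href='/'>Home</a></nav>"
    "<h1>Docs</h1><p>Clean text.</p>"
    "<footer>Copyright notice</footer>"
)
print(html_to_markdown(page))
```

The sketch treats every text run as its own block, so inline markup inside a paragraph would split it. A production converter would also need to handle lists, tables, and code blocks.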

🚀 How to use

  1. Start URLs: Enter the list of URLs you want to scrape.
  2. Max Depth: Set how many link levels the crawler follows from each start URL (0 scrapes only the start pages, 1 also scrapes the pages they link to directly).
  3. Run: Start the actor. It outputs a JSON dataset ready for loading into vector databases.
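
The Max Depth setting described above can be illustrated with a small depth-limited, breadth-first traversal. This is a sketch of the general technique under the assumption that depth 0 means "start pages only", as stated in step 2. The function `crawl_order` and the in-memory `links` map are hypothetical stand-ins for real page fetching, not part of the actor.

```python
from collections import deque


def crawl_order(start_urls, links, max_depth):
    """Return (url, depth) pairs in the order a depth-limited crawl visits them.

    `links` maps each URL to the URLs found on that page (hypothetical data).
    """
    start_urls = list(dict.fromkeys(start_urls))  # drop duplicate start URLs
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return visited


links = {
    "/docs": ["/docs/a", "/docs/b"],
    "/docs/a": ["/docs/a/deep"],
}
print(crawl_order(["/docs"], links, 0))
print(crawl_order(["/docs"], links, 1))
```

With this link map, depth 0 visits only `/docs`, and depth 1 adds `/docs/a` and `/docs/b` but not `/docs/a/deep`.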

📦 Output

The results are stored in the default Apify dataset. Each item contains:

{
  "url": "https://example.com/docs",
  "title": "Documentation",
  "markdown": "# Documentation\n\nThis is the clean text...",
  "metadata": { "depth": 1 }
}
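
As one possible next step toward a vector database, the sketch below splits an item of the shape shown above into smaller chunks before embedding. The function `chunk_item`, the `max_chars` limit, and the chunk ID format are assumptions made for illustration. The actor itself does not prescribe a chunking strategy.

```python
def chunk_item(item, max_chars=500):
    """Split one dataset item into chunks of up to roughly max_chars.

    Breaks only on blank lines, so a single long paragraph stays whole.
    """
    chunks, current = [], ""
    for para in item["markdown"].split("\n\n"):
        para = para.strip()
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the current chunk, start a new one
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return [
        {
            "id": f"{item['url']}#chunk-{i}",  # hypothetical ID scheme
            "text": text,
            "url": item["url"],
            "title": item["title"],
            "depth": item["metadata"]["depth"],
        }
        for i, text in enumerate(chunks)
    ]


item = {
    "url": "https://example.com/docs",
    "title": "Documentation",
    "markdown": "# Documentation\n\nThis is the clean text...",
    "metadata": {"depth": 1},
}
for chunk in chunk_item(item, max_chars=20):
    print(chunk["id"], repr(chunk["text"]))
```

Each returned record keeps the source URL, title, and depth next to the text, so retrieved chunks can be traced back to the page they came from.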