AI-Powered RSS Aggregator & Summarizer

Pricing

from $0.50 / 1,000 results

Enterprise-grade RSS aggregator with AI-powered summarization. Collects, filters, and processes feeds from any source. Ideal for content analysis, news monitoring, and AI training. Features keyword filtering, metadata extraction, and structured output in JSON/CSV. Built with Hugging Face.

Rating

5.0

(1)

Developer

PrimeParse

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 4 days ago

🌐 RSS Aggregator: AI-Powered RSS Aggregator & Summarizer

Enterprise-grade RSS aggregator with AI-powered summarization. Collects, filters, and processes feeds from any source. Ideal for content analysis, news monitoring, and AI training. Features keyword filtering, metadata extraction, and structured output in JSON/CSV. Built with Hugging Face for advanced summaries.

High-quality RSS Feed Aggregator & Processor for Content Teams, Researchers, and AI Engineers

Automatically aggregates RSS feeds, filters by keywords, extracts summaries, and optionally generates AI-powered summaries. The output is clean, structured, and ready for analysis or AI.

Built for:

  • Content aggregators & news monitoring teams
  • Researchers tracking academic papers and publications
  • AI/ML engineers building content datasets
  • Marketing teams monitoring industry trends
  • Data analysts processing feed data

✅ Smart keyword filtering
✅ AI-powered summarization (Hugging Face transformers)
✅ Multiple feed support (1-5 feeds recommended)
✅ Rich metadata extraction (date, author, tags, description)
✅ Rate limiting & respectful crawling
✅ AI-ready structured output

👉 Runs on Apify • No code required

🚀 Why This Aggregator

✔ Purpose-Built for RSS Processing
Intelligently aggregates and processes RSS feeds from any source: news sites, academic journals, blogs, corporate feeds.

✔ AI Summarization Ready
Optional integration with Hugging Face transformers (BART, Pegasus) for advanced AI-powered summarization of feed entries.

✔ Clean & Structured Output
Extracts only meaningful content (title, link, summary, author, tags, publication date), ready for analysis.

✔ Smart Keyword Filtering
Filter entries by custom keywords (case-insensitive) across title, summary, and tags for relevance.

✔ AI & ML Ready
Structured JSON/CSV output perfect for RAG systems, LLM fine-tuning, or training datasets.

✔ Fast & Efficient
Powered by feedparser, a parser well suited to RSS/Atom feeds. Lightweight and fast processing.

✔ Safe & Controlled Processing
Configurable rate limiting, entry limits per feed, and graceful error handling.
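
As a rough illustration of the controlled processing described above, the sketch below fetches feeds one at a time, pauses between them, caps entries per feed, and skips a feed that fails instead of aborting the run. It is not the Actor's actual code: the function name collect_entries and its parameters are hypothetical, and the parser is passed in as a callable (for example feedparser.parse) so the loop itself needs only the standard library.

```python
import time
from typing import Any, Callable, Iterable


def collect_entries(
    feed_urls: Iterable[str],
    parse: Callable[[str], Any],
    max_entries_per_feed: int = 10,
    delay_between_feeds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Fetch feeds sequentially, capping entries and skipping failed feeds."""
    results: list[dict] = []
    for index, url in enumerate(feed_urls):
        if index > 0:
            # Pause between feeds (not before the first one).
            sleep(delay_between_feeds)
        try:
            parsed = parse(url)
        except Exception as exc:
            # Assumption: one bad feed is logged and skipped.
            print(f"Skipping {url}: {exc}")
            continue
        entries = parsed.entries
        if max_entries_per_feed > 0:  # 0 is treated as "unlimited"
            entries = entries[:max_entries_per_feed]
        for entry in entries:
            results.append(
                {
                    "title": entry.get("title", ""),
                    "link": entry.get("link", ""),
                    "feedTitle": parsed.feed.get("title", ""),
                    "feedUrl": url,
                }
            )
    return results
```

With feedparser installed, a call could look like collect_entries(["https://techcrunch.com/feed/"], feedparser.parse). Injecting the parser and the sleep function keeps the loop easy to test without network access.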

💼 Use Cases

  • News monitoring: Track industry news and trends from multiple sources
  • Academic research: Aggregate papers from arXiv, PubMed, and other academic feeds
  • Content curation: Collect and filter relevant content for newsletters or blogs
  • AI training data: Generate clean datasets for LLM fine-tuning or RAG systems
  • Competitive intelligence: Monitor competitor blogs and news feeds
  • Market research: Track product announcements and industry updates

📊 Supported Sources

  • News feeds: TechCrunch, Reuters, BBC, Guardian, etc.
  • Academic feeds: arXiv, PubMed, academic journals
  • Blog feeds: Medium, WordPress, custom blog RSS
  • Corporate feeds: Company blogs, press releases, announcements
  • Any RSS/Atom feed: Standard-compliant feeds

⚙️ How It Works

  1. Provide RSS feed URLs (1-5 feeds recommended)
  2. Set custom keywords and processing options
  3. Optionally enable AI summarization
  4. Run the Actor
  5. Download clean, structured RSS datasets

🧩 Input Configuration

Example JSON Input

{
  "rssFeeds": [
    "https://arxiv.org/rss/cs.AI",
    "https://techcrunch.com/feed/"
  ],
  "maxEntriesPerFeed": 10,
  "keywords": [
    "AI",
    "machine learning",
    "artificial intelligence"
  ],
  "enableSummarization": true,
  "enableAISummarization": true,
  "aiModelName": "facebook/bart-large-cnn",
  "aiMaxLength": 1024,
  "aiMinLength": 50,
  "aiMaxSummaryLength": 150,
  "delayBetweenFeeds": 1.0
}

Key Options

  • rssFeeds: List of RSS feed URLs to aggregate (required, 1-5 recommended)
  • maxEntriesPerFeed: Maximum entries per feed (0 = unlimited, default: 10)
  • keywords: Custom keywords for filtering entries (case-insensitive, empty = all entries)
  • enableSummarization: Extract summary/description from feeds (default: true)
  • enableAISummarization: Use Hugging Face AI for advanced summarization (default: false)
  • aiModelName: Hugging Face model identifier (default: "facebook/bart-large-cnn")
  • aiMaxLength: Maximum input length for AI model (default: 1024 tokens)
  • aiMinLength: Minimum summary length (default: 50 tokens)
  • aiMaxSummaryLength: Maximum summary length (default: 150 tokens)
  • delayBetweenFeeds: Delay in seconds between feeds for rate limiting (default: 1.0)
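
To make the keywords option more concrete, here is a minimal sketch of case-insensitive matching across title, summary, and tags, where an empty keyword list keeps every entry. The helper name matches_keywords is hypothetical, and plain substring matching is an assumption: the Actor may match whole words or use different rules.

```python
def matches_keywords(entry: dict, keywords: list[str]) -> bool:
    """Return True if any keyword occurs in the entry's title, summary, or tags."""
    if not keywords:
        # An empty keyword list means "keep all entries".
        return True
    # Join the searchable fields into one lowercase string.
    haystack = " ".join(
        [entry.get("title", ""), entry.get("summary", ""), *entry.get("tags", [])]
    ).casefold()
    # Assumption: substring match, so "AI" would also match "maintain".
    return any(keyword.casefold() in haystack for keyword in keywords)
```

Filtering a list of entries is then a one-liner such as [e for e in entries if matches_keywords(e, ["AI", "machine learning"])]. Short keywords like "AI" can over-match with substring search, so whole-word matching may be preferable in practice.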

📂 Output Dataset

All entries are stored in the default Apify dataset with the following structure:

Example Output Record

{
  "title": "Adobe hit with proposed class-action, accused of misusing authors' work in AI training",
  "link": "https://techcrunch.com/2025/12/17/adobe-hit-with-proposed-class-action-accused-of-misusing-authors-work-in-ai-training/",
  "published": "2025-12-18T00:44:55",
  "summary": "The lawsuit is just the latest in a string of copyright-related legal complaints aimed at the AI industry.",
  "feedTitle": "TechCrunch",
  "feedUrl": "https://techcrunch.com/feed/",
  "author": "Lucas Ropek",
  "tags": [
    "AI",
    "Adobe",
    "Anthropic",
    "artificial intelligence"
  ]
}
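
For downstream analysis, records shaped like the example above can be flattened to CSV and their published field parsed into a datetime. The sketch below is only one way to do this: the helper names records_to_csv and published_at are hypothetical, joining tags with "; " is an arbitrary choice, and the timestamp is treated as timezone-naive because the example carries no offset.

```python
import csv
import io
from datetime import datetime

# Column order follows the fields shown in the example record.
FIELDS = ["title", "link", "published", "summary",
          "feedTitle", "feedUrl", "author", "tags"]


def records_to_csv(records: list[dict]) -> str:
    """Serialize output records to CSV text, joining tags into one cell."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        row = {field: record.get(field, "") for field in FIELDS}
        row["tags"] = "; ".join(record.get("tags", []))
        writer.writerow(row)
    return buffer.getvalue()


def published_at(record: dict) -> datetime:
    """Parse the ISO 8601 'published' string into a (naive) datetime."""
    return datetime.fromisoformat(record["published"])
```

Sorting by recency then becomes sorted(records, key=published_at, reverse=True), assuming every record has a published value in this format.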

With AI Summarization

When enableAISummarization is set to true, the summary field contains an AI-generated summary:

{
  "title": "Breakthrough in Quantum Computing",
  "link": "https://example.com/quantum-breakthrough",
  "published": "2025-12-15T10:30:00",
  "summary": "Researchers achieve significant milestone in quantum error correction, bringing practical quantum computing closer to reality. The new method reduces error rates by 50%...",
  "feedTitle": "Science News",
  "feedUrl": "https://example.com/feed.xml",
  "author": "Dr. Jane Smith",
  "tags": ["quantum computing", "research", "technology"]
}

🤖 AI Summarization Models

Supported Hugging Face models for summarization:

  • facebook/bart-large-cnn (default): Best for news articles and general content
  • google/pegasus-xsum: Fine-tuned on the XSum news dataset; tends to produce very short, single-sentence summaries
  • Any summarization model: Compatible with Hugging Face transformers

The Actor automatically falls back to basic extraction if AI summarization fails or is unavailable.
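
The fallback behaviour described above might look roughly like the following sketch. It assumes a transformers release that provides the "summarization" pipeline task and that "basic extraction" means returning the feed's own description unchanged; both are assumptions, and the function names load_summarizer and summarize are hypothetical.

```python
from typing import Callable, Optional


def load_summarizer(model_name: str = "facebook/bart-large-cnn") -> Optional[Callable]:
    """Try to build a Hugging Face summarization pipeline; None if unavailable."""
    try:
        # Imported lazily so the rest of the code works without transformers.
        from transformers import pipeline
        return pipeline("summarization", model=model_name)
    except Exception as exc:
        print(f"AI summarization unavailable: {exc}")
        return None


def summarize(
    description: str,
    summarizer: Optional[Callable],
    min_length: int = 50,
    max_summary_length: int = 150,
) -> str:
    """Return an AI summary when possible, else the original description."""
    if summarizer is not None and description.strip():
        try:
            result = summarizer(
                description,
                min_length=min_length,
                max_length=max_summary_length,
                truncation=True,  # clip inputs longer than the model accepts
            )
            return result[0]["summary_text"].strip()
        except Exception as exc:
            print(f"Falling back to basic summary: {exc}")
    # Assumed fallback: the description taken from the feed itself.
    return description.strip()
```

Loading the model once with load_summarizer() and reusing it for every entry avoids paying the model start-up cost per item.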

🏁 Getting Started

Quick Start on Apify

  1. Click "Try for free" on Apify
  2. Paste RSS feed URLs (e.g., https://techcrunch.com/feed/)
  3. Customize keywords and options
  4. Optionally enable AI summarization
  5. Run and download your dataset
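
The same steps can also be scripted instead of using the web console. The sketch below assumes the apify-client Python package and uses a placeholder Actor ID, since the real identifier is not given here; run_aggregator is a hypothetical helper that accepts any client object exposing actor(...).call(...) and dataset(...).iterate_items().

```python
from typing import Any

# Placeholder only: replace with the Actor's real ID from its Apify Store page.
ACTOR_ID = "<username>/<actor-name>"


def run_aggregator(client: Any, actor_id: str, run_input: dict) -> list[dict]:
    """Start the Actor, wait for it to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    dataset_id = run["defaultDatasetId"]
    return list(client.dataset(dataset_id).iterate_items())


# Example usage (requires `pip install apify-client` and an API token):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   items = run_aggregator(client, ACTOR_ID, {
#       "rssFeeds": ["https://techcrunch.com/feed/"],
#       "keywords": ["AI"],
#       "maxEntriesPerFeed": 10,
#   })
```

Passing the client in as an argument keeps the helper independent of how the token is stored and makes it simple to substitute a stub in tests.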

📈 Performance

  • Processing Speed: ~1-2 seconds per feed (depending on entries)
  • Rate Limiting: Configurable delay between feeds (default: 1s)
  • Memory Efficient: Processes feeds sequentially
  • Scalability: Handles 1-5 feeds optimally (can process more)

🔧 Advanced Configuration

Custom AI Models

You can use any Hugging Face summarization model:

{
  "enableAISummarization": true,
  "aiModelName": "google/pegasus-xsum",
  "aiMaxLength": 2048,
  "aiMinLength": 100,
  "aiMaxSummaryLength": 200
}

📧 Support

Tags: RSS, feed aggregator, content processing, AI summarization, Hugging Face, news aggregation, feed parser, content analysis, RAG, LLM training, data extraction


Built with ❤️ on Apify