📡 RSS & Atom Feed Extractor
Pricing
Pay per event
Aggregate public RSS feeds into structured JSON datasets to discover fresh website URLs from blogs and newsrooms for downstream web scrapers.
Developer
太郎 山田
📡 RSS Feed Aggregator
Aggregate and filter trusted RSS or Atom feeds into a clean discovery dataset. This actor is a feeder/discovery surface in the Content Intelligence Pack: use it when you already know which publishers, company blogs, or newsroom feeds you want to monitor, then hand the resulting links to the right extractor.
Store Quickstart
- Start with Quickstart (2 publisher feeds) for a reliable first run.
- Use Multi-Source Monitoring to watch several feeds with keyword filters.
- Use RSS → Article Cleanup when the next step is article extraction.
Where this actor fits
| Surface | Best for |
|---|---|
| RSS Feed Aggregator | Discover fresh URLs from known publishers and blogs |
| Google News Scraper | Discover fresh URLs from query-based Google News searches |
| Article Content Extractor | Clean discovered article/news/blog pages |
| Website Content Extractor | Clean discovered docs, pricing, policy, or product pages |
Key Features
- 📡 Feed discovery — Aggregate multiple public RSS/Atom feeds in one run
- 🔍 Keyword filtering — Keep only the rows that match the themes you care about
- 🏷️ Match visibility — Returns `matchedKeywords` for filtered rows
- 🔄 Deduplication — Remove duplicate links across feeds
- ⚡ Low-friction first run — Great for recurring monitoring of known sources
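To illustrate the deduplication feature listed above, here is a minimal Python sketch of dropping duplicate links across feeds. It assumes rows are dicts with a `link` field (as in the output schema) and that the first occurrence wins. The helper name `dedupe_by_link` and the example URLs are hypothetical, and the actor's real normalization rules may differ.

```python
def dedupe_by_link(rows):
    """Keep the first row seen for each link; later duplicates are dropped."""
    seen = set()
    unique = []
    for row in rows:
        link = row.get("link")
        if link in seen:
            continue  # same link already emitted by an earlier feed
        seen.add(link)
        unique.append(row)
    return unique


# Hypothetical rows: two feeds carry the same post.
rows = [
    {"source": "https://example.com/feed-a.xml", "link": "https://example.com/post-1"},
    {"source": "https://example.com/feed-b.xml", "link": "https://example.com/post-1"},
    {"source": "https://example.com/feed-b.xml", "link": "https://example.com/post-2"},
]
unique_rows = dedupe_by_link(rows)
for row in unique_rows:
    print(row["link"], "from", row["source"])
```

This sketch compares links as exact strings, so two URLs that differ only by a tracking parameter or trailing slash would both be kept.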
Use Cases
| Who | Why |
|---|---|
| PR / comms teams | Track publisher and company newsroom feeds |
| Competitive intelligence | Watch competitor blogs and product update feeds |
| Content ops | Build filtered story queues from trusted sources |
| AI / RAG teams | Maintain a fresh URL stream before deeper extraction |
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `feedUrls` | string[] | required | Public RSS/Atom URLs (max 50) |
| `keywords` | string[] | `[]` | Optional include-list filter |
| `maxItemsPerFeed` | integer | `25` | Max items to keep from each feed |
| `deduplicate` | boolean | `true` | Remove duplicate links across feeds |
| `timeoutMs` | integer | `15000` | Request timeout in milliseconds |
| `delivery` | string | `dataset` | `dataset` or `webhook` |
| `webhookUrl` | string | — | Webhook target when `delivery=webhook` |
| `dryRun` | boolean | `false` | Run without saving |
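The constraints in the input table can be checked before a run is started. The following Python sketch builds an input object with the documented defaults and rejects obviously invalid combinations. The function name `build_input` is hypothetical, and treating `webhookUrl` as mandatory when `delivery` is `webhook` is an assumption drawn from the table wording, not confirmed actor behavior.

```python
import json


def build_input(feed_urls, keywords=None, max_items_per_feed=25,
                deduplicate=True, timeout_ms=15000,
                delivery="dataset", webhook_url=None, dry_run=False):
    """Assemble an actor input dict using the defaults from the input table."""
    if not feed_urls:
        raise ValueError("feedUrls is required")
    if len(feed_urls) > 50:
        raise ValueError("feedUrls accepts at most 50 URLs")
    if delivery not in ("dataset", "webhook"):
        raise ValueError("delivery must be 'dataset' or 'webhook'")
    if delivery == "webhook" and not webhook_url:
        # Assumption: a webhook target is needed for webhook delivery.
        raise ValueError("webhookUrl is needed when delivery is 'webhook'")

    run_input = {
        "feedUrls": list(feed_urls),
        "keywords": list(keywords or []),
        "maxItemsPerFeed": max_items_per_feed,
        "deduplicate": deduplicate,
        "timeoutMs": timeout_ms,
        "delivery": delivery,
        "dryRun": dry_run,
    }
    if webhook_url:
        run_input["webhookUrl"] = webhook_url
    return run_input


run_input = build_input(
    ["https://blog.google/rss/", "https://openai.com/news/rss.xml"],
    keywords=["AI", "agents"],
    max_items_per_feed=10,
)
print(json.dumps(run_input, indent=2))
```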
Input Example
    {
      "feedUrls": [
        "https://blog.google/rss/",
        "https://openai.com/news/rss.xml"
      ],
      "keywords": ["AI", "agents"],
      "maxItemsPerFeed": 10,
      "deduplicate": true
    }
Output
| Field | Type | Description |
|---|---|---|
| `source` | string | Feed URL that produced the row |
| `title` | string | Feed item title |
| `link` | string | Item URL for downstream extraction |
| `pubDate` | string | Original feed date |
| `pubDateISO` | string | ISO timestamp version of `pubDate` |
| `description` | string | Summary text from the feed |
| `content` | string | Encoded content when available |
| `categories` | array | Categories / tags from the feed |
| `matchedKeywords` | array | Keywords that matched the row |
Output Example
    {
      "source": "https://openai.com/news/rss.xml",
      "title": "The next evolution of the Agents SDK",
      "link": "https://openai.com/index/the-next-evolution-of-the-agents-sdk",
      "pubDate": "Wed, 15 Apr 2026 10:00:00 GMT",
      "pubDateISO": "2026-04-15T10:00:00.000Z",
      "description": "OpenAI updates the Agents SDK with native sandbox execution...",
      "matchedKeywords": ["ai", "agents"]
    }
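The relationship between `pubDate` and `pubDateISO` in the example can be reproduced with the Python standard library. This is a sketch of one way to do the conversion, not the actor's own code. The helper name `to_iso_utc` is hypothetical, and it assumes RSS-style RFC 822 dates; Atom feeds use RFC 3339 timestamps, which this function does not handle.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def to_iso_utc(pub_date):
    """Parse an RFC 822 style date and format it as UTC with milliseconds."""
    dt = parsedate_to_datetime(pub_date)
    if dt.tzinfo is None:
        # Assumption: treat dates without a usable offset as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    dt = dt.astimezone(timezone.utc)
    millis = dt.microsecond // 1000
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


iso = to_iso_utc("Wed, 15 Apr 2026 10:00:00 GMT")
print(iso)  # 2026-04-15T10:00:00.000Z
```

Sorting or windowing rows on `pubDateISO` rather than `pubDate` avoids comparing dates that were published with different offsets.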
First-run buyer experience
- Run Quickstart (2 publisher feeds).
- Confirm the actor returns recent item URLs plus `matchedKeywords`.
- Send article/news/blog links to Article Content Extractor.
- Send docs/product/policy links to Website Content Extractor.
Tips & Limitations
- Start with a small set of high-trust feeds.
- Keyword filtering is OR-based; any matched keyword keeps the item.
- This actor is a feed discovery layer, not a full-content extractor.
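The OR-based keyword rule from the tips above can be sketched in Python as follows. This is an illustration under stated assumptions, not the actor's implementation: it assumes case-insensitive substring matching against `title` and `description`, and lowercased values in `matchedKeywords` (consistent with the output example, where the input keyword `AI` appears as `ai`). The helper names are hypothetical.

```python
def match_keywords(row, keywords):
    """Return the lowercased keywords found in the row's title or description."""
    haystack = " ".join(
        [row.get("title") or "", row.get("description") or ""]
    ).lower()
    return [kw.lower() for kw in keywords if kw.lower() in haystack]


def filter_rows(rows, keywords):
    """Keep rows that match at least one keyword (OR semantics)."""
    if not keywords:
        return list(rows)  # an empty keyword list means no filtering
    kept = []
    for row in rows:
        matched = match_keywords(row, keywords)
        if matched:
            kept.append({**row, "matchedKeywords": matched})
    return kept


rows = [
    {
        "title": "The next evolution of the Agents SDK",
        "description": "OpenAI updates the Agents SDK with native sandbox execution...",
    },
    {"title": "Quarterly earnings call", "description": "Revenue results"},
]
kept = filter_rows(rows, ["AI", "agents"])
for row in kept:
    print(row["title"], row["matchedKeywords"])
```

Under the substring assumption, a short keyword such as `AI` also matches inside longer words (for example "maintain"), so longer or more specific keywords give cleaner results.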
FAQ
How is this different from Google News Scraper?
Use RSS Feed Aggregator when you already know the publishers you trust. Use Google News Scraper when you want broader query-based discovery.
Can I see why an item matched?
Yes — filtered rows include a matchedKeywords array.
Can I get full article text here?
No. Use Article Content Extractor or Website Content Extractor on the returned links.
Related Actors
Content Intelligence Pack handoffs:
- 📰 Article Content Extractor — clean discovered article/news/blog pages
- 📄 Website Content Extractor — clean discovered non-article pages
- 📰 Google News Scraper — query-based discovery when you do not have feed URLs yet
Cost
Pay Per Event:
- `actor-start`: $0.01
- `dataset-item`: $0.002 per output item
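As a worked example of the pricing above, the sketch below estimates the cost of a single run from the number of output items. It assumes `actor-start` is charged once per run, which the listing does not state explicitly. The function name `estimate_run_cost` is hypothetical.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.01")    # assumed to be charged once per run
DATASET_ITEM_USD = Decimal("0.002")  # charged per output item


def estimate_run_cost(output_items):
    """Estimate the pay-per-event cost in USD for one run."""
    return ACTOR_START_USD + DATASET_ITEM_USD * output_items


# Upper bound for 2 feeds with maxItemsPerFeed = 10: at most 20 output items.
worst_case = estimate_run_cost(2 * 10)
print(f"${worst_case:.3f}")  # $0.050
```

Keyword filtering and deduplication can only reduce the number of output items, so the estimate for `feeds × maxItemsPerFeed` is an upper bound under these assumptions.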
⭐ Was this helpful?
If this actor saved you time, please leave a ★ rating on Apify Store.