RSS / Atom Feed to Dataset
Convert any RSS 2.0, Atom 1.0, or RDF feed into a clean structured dataset. Extracts title, link, pubDate, author, summary, content, categories, enclosures. Works with podcasts, news, blogs, GitHub releases. No API keys.
Developer: Mohieldin Mohamed
Last modified: 5 days ago
Convert any RSS, Atom, or RDF feed into a clean structured Apify dataset in seconds.
Point this actor at a feed URL — Hacker News, GitHub releases, Reddit, your favorite blog, an iTunes podcast, the New York Times — and it returns every item as a normalized JSON row you can download as JSON, CSV, HTML, or Excel.
What does RSS Feed to Dataset do?
Web feeds are a goldmine for content monitoring, news aggregation, and competitive intelligence — but every parser library you'd write is slightly different and most break on edge cases (CDATA, namespaced tags, empty fields, weird date formats). This actor handles all that for you and gives you a single normalized output schema regardless of whether the feed is RSS 2.0, Atom 1.0, or RDF 1.0.
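Under the hood, telling the three formats apart comes down to the feed's root element. A rough sketch of that detection in Python (illustrative only, not the actor's actual source):

```python
# Minimal sketch of feed-type detection: RSS 2.0 uses a plain <rss> root,
# Atom 1.0 a namespaced <feed>, RDF 1.0 a namespaced <RDF>.
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"
RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

def detect_feed_type(xml_text: str) -> str:
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        return "rss"
    if root.tag == ATOM_NS + "feed":
        return "atom"
    if root.tag == RDF_NS + "RDF":
        return "rdf"
    raise ValueError(f"Unknown feed root element: {root.tag}")

RSS_SAMPLE = """<rss version="2.0"><channel><title>Demo</title>
<item><title>Hello</title><link>https://example.com/1</link></item>
</channel></rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"><title>Demo</title>
<entry><title>Hello</title></entry></feed>"""

print(detect_feed_type(RSS_SAMPLE))   # rss
print(detect_feed_type(ATOM_SAMPLE))  # atom
```

Once the type is known, each format's fields can be mapped onto the single output schema described below.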
Try it: the default input is https://news.ycombinator.com/rss — press Start and you'll get back the current top 30 HN stories in seconds.
Apify platform advantages: scheduled runs (poll a feed every hour), API access (pull dataset directly into Zapier/n8n), integrations (push items to Google Sheets, Slack, Airtable), and proxy rotation if a feed blocks server IPs.
Why use RSS Feed to Dataset?
- Content monitoring — track every new post on a competitor's blog
- News aggregation — pull headlines from 50 news sources into one CSV
- Backup — archive your own feed regularly so you don't lose old posts
- LLM training data — feed structured news content into an embeddings model
- Podcast catalog — extract iTunes feeds into a dataset of episodes with audio URLs
- Release notification — watch GitHub release feeds for libraries you depend on
- Custom Slack bot — bridge any RSS feed into Slack via Apify webhooks
How to use RSS Feed to Dataset
- Click Try for free (or Start if you're already logged in)
- Paste one or more feed URLs into Feed URLs (e.g. https://news.ycombinator.com/rss)
- Optionally cap Max items per feed (default 100)
- Click Start
- Download the dataset in JSON, CSV, HTML, or Excel — or hit the API endpoint
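Each run writes its items to an Apify dataset, and the public dataset-items endpoint serves that dataset in any of the listed formats. A minimal sketch of building the URL (the dataset ID abc123 is a placeholder; private datasets also need a &token=... query parameter):

```python
# Build the Apify dataset-items URL for a finished run.
# "abc123" is a placeholder dataset ID, not a real one.
def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    # Supported format values include json, csv, html, and xlsx.
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?format={fmt}"

print(dataset_items_url("abc123", fmt="csv"))
# https://api.apify.com/v2/datasets/abc123/items?format=csv
```

The same URL works in curl, Zapier, n8n, or a browser, which is what makes the scheduled-run plus API workflow practical.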
Input
- Feed URLs — one or more RSS/Atom/RDF feed URLs
- Max items per feed — cap on items per feed (default 100, use 0 for unlimited)
- Include full content — attach the <content:encoded> body to each item (default: yes)
- Include raw XML — debug mode: attach the original raw item XML (default: no)
- Proxy configuration — optional Apify Proxy for paid/protected feeds
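Put together, a full input object might look like the sketch below. includeContent and includeRawXml appear elsewhere in this README; the other key names (feedUrls, maxItemsPerFeed, proxyConfiguration) are illustrative guesses at the schema, so check the actor's Input tab for the exact names:

```json
{
  "feedUrls": [
    "https://news.ycombinator.com/rss"
  ],
  "maxItemsPerFeed": 100,
  "includeContent": true,
  "includeRawXml": false,
  "proxyConfiguration": { "useApifyProxy": false }
}
```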
Output
```json
{
  "title": "Show HN: Atlas — 6 MCP servers for Claude",
  "link": "https://news.ycombinator.com/item?id=12345678",
  "guid": "12345678",
  "pubDate": "Tue, 15 Apr 2026 15:42:00 +0000",
  "author": "mohye24k",
  "summary": "Atlas is a suite of 6 MCP servers...",
  "content": "<p>Full HTML content here...</p>",
  "categories": ["AI", "Open Source"],
  "enclosureUrl": null,
  "enclosureType": null,
  "enclosureLength": null,
  "feedTitle": "Hacker News",
  "feedDescription": "Links for the intellectually curious",
  "feedLink": "https://news.ycombinator.com/",
  "feedUrl": "https://news.ycombinator.com/rss",
  "feedType": "rss",
  "extractedAt": "2026-04-15T17:00:00.000Z"
}
```
Data table
| Field | Type | Description |
|---|---|---|
| title | string | Item title |
| link | string | Permalink to the item |
| guid | string | Unique identifier (RSS) or <id> (Atom) |
| pubDate | string | Publication date as found in the feed |
| author | string | Author name (<author>, <dc:creator>, or <author>/<name>) |
| summary | string | Short description |
| content | string | Full body (<content:encoded> for RSS, <content> for Atom) |
| categories | array | List of category tags |
| enclosureUrl | string | Attached file URL (podcasts, attachments) |
| enclosureType | string | MIME type of enclosure |
| enclosureLength | number | File size in bytes |
| feedTitle | string | Feed channel title |
| feedDescription | string | Feed channel description |
| feedLink | string | Feed website URL |
| feedUrl | string | The feed URL you provided |
| feedType | string | rss, atom, or rdf |
| extractedAt | string | ISO timestamp of extraction |
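Because pubDate is passed through verbatim, RSS feeds give you RFC 822 dates while Atom feeds give ISO 8601. A small downstream normalizer (illustrative, not part of the actor):

```python
# Normalize both date styles found in feeds to an aware UTC datetime.
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def normalize_date(raw: str) -> datetime:
    """Parse an RFC 822 (RSS) or ISO 8601 (Atom) date string to UTC."""
    try:
        dt = parsedate_to_datetime(raw)                          # RSS style
    except (TypeError, ValueError):
        dt = datetime.fromisoformat(raw.replace("Z", "+00:00"))  # Atom style
    return dt.astimezone(timezone.utc)

print(normalize_date("Tue, 15 Apr 2026 15:42:00 +0000").isoformat())
# 2026-04-15T15:42:00+00:00
```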
Pricing
This actor uses Apify's pay-per-event pricing:
- Actor start: $0.01 per run
- Per item extracted: $0.002 per item
Example costs:
- Hacker News RSS (30 items) → ~$0.07
- A blog feed with 50 items → ~$0.11
- 10 feeds × 50 items each → ~$1.01
Free Apify tier members get $5/month in platform credits, which covers ~2,000 items per month.
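The example costs follow directly from the two event prices; a tiny cost helper makes the arithmetic explicit:

```python
# Cost model for the pay-per-event pricing above.
START_FEE = 0.01   # per actor start
ITEM_FEE = 0.002   # per extracted item

def run_cost(items: int, runs: int = 1) -> float:
    return round(runs * START_FEE + items * ITEM_FEE, 4)

print(run_cost(30))       # 0.07  (Hacker News RSS)
print(run_cost(50))       # 0.11  (50-item blog feed)
print(run_cost(10 * 50))  # 1.01  (10 feeds combined into a single run)
```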
Tips and advanced options
- Schedule daily runs to catch new items as they appear, then deduplicate by guid downstream
- Combine multiple feeds in one run to save on the actor-start fee — pass an array of URLs
- Disable includeContent for faster runs and smaller datasets when you only need headlines
- Enable Apify Proxy for feeds behind Cloudflare or rate limiting
- Pipe into Slack / Discord / Email via Apify integrations
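The dedupe tip can be as simple as persisting the guids you have already processed between scheduled runs. A sketch (the seen_guids.json state file is hypothetical):

```python
# Keep a set of previously seen guids on disk so each scheduled run
# only processes items it hasn't handled before.
import json
from pathlib import Path

SEEN_FILE = Path("seen_guids.json")  # hypothetical state file

def new_items(items: list[dict], seen_file: Path = SEEN_FILE) -> list[dict]:
    seen = set(json.loads(seen_file.read_text())) if seen_file.exists() else set()
    fresh = [it for it in items if it["guid"] not in seen]
    seen_file.write_text(json.dumps(sorted(seen | {it["guid"] for it in fresh})))
    return fresh

SEEN_FILE.unlink(missing_ok=True)  # start fresh for this demo
batch = [{"guid": "1", "title": "a"}, {"guid": "2", "title": "b"}]
print(len(new_items(batch)))  # 2 new items on the first run
print(len(new_items(batch)))  # 0 on the second run
```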
FAQ and support
Does it support podcast feeds? Yes. Episode enclosureUrl is included in the output.
Does it support paywall / authenticated feeds? Yes — pass a feed URL with ?token=... or use the proxy configuration to route through your own IP.
What about feeds with custom namespaces? The actor extracts dc:creator, content:encoded, and other common namespaces. For exotic namespaces, enable includeRawXml and parse downstream.
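For reference, extracting those namespaced fields downstream from the raw item XML (e.g. what includeRawXml attaches) can be done with the Python standard library; this is an illustrative sketch, not the actor's implementation:

```python
# Pull namespaced fields (dc:creator, content:encoded) out of an RSS item.
import xml.etree.ElementTree as ET

NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "content": "http://purl.org/rss/1.0/modules/content/",
}

ITEM = """<item xmlns:dc="http://purl.org/dc/elements/1.1/"
               xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <title>Hello</title>
  <dc:creator>mohye24k</dc:creator>
  <content:encoded><![CDATA[<p>Full HTML content here...</p>]]></content:encoded>
</item>"""

item = ET.fromstring(ITEM)
author = item.findtext("dc:creator", namespaces=NS)
body = item.findtext("content:encoded", namespaces=NS)
print(author)  # mohye24k
print(body)    # <p>Full HTML content here...</p>
```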
Found a bug? Open an issue on the Issues tab.