RSS / Atom Feed to Dataset

Pricing

Pay per event

Convert any RSS 2.0, Atom 1.0, or RDF feed into a clean structured dataset. Extracts title, link, pubDate, author, summary, content, categories, enclosures. Works with podcasts, news, blogs, GitHub releases. No API keys.

Developer

Mohieldin Mohamed

Maintained by Community

Convert any RSS, Atom, or RDF feed into a clean structured Apify dataset in seconds.

Point this actor at a feed URL — Hacker News, GitHub releases, Reddit, your favorite blog, an iTunes podcast, the New York Times — and it returns every item as a normalized JSON row you can download as JSON, CSV, HTML, or Excel.

What does RSS Feed to Dataset do?

Web feeds are a goldmine for content monitoring, news aggregation, and competitive intelligence, but hand-written parsers all behave slightly differently and most break on edge cases (CDATA sections, namespaced tags, empty fields, non-standard date formats). This actor handles those cases and returns a single normalized output schema whether the feed is RSS 2.0, Atom 1.0, or RSS 1.0 (RDF).

Try it: the default input is https://news.ycombinator.com/rss — press Start and you'll get back the current top 30 HN stories in seconds.

Apify platform advantages: scheduled runs (poll a feed every hour), API access (pull the dataset directly into Zapier or n8n), integrations (push items to Google Sheets, Slack, or Airtable), and proxy rotation if a feed blocks server IPs.

Why use RSS Feed to Dataset?

  • Content monitoring — track every new post on a competitor's blog
  • News aggregation — pull headlines from 50 news sources into one CSV
  • Backup — archive your own feed regularly so you don't lose old posts
  • LLM training data — feed structured news content into an embeddings model
  • Podcast catalog — extract iTunes feeds into a dataset of episodes with audio URLs
  • Release notification — watch GitHub release feeds for libraries you depend on
  • Custom Slack bot — bridge any RSS feed into Slack via Apify webhooks

How to use RSS Feed to Dataset

  1. Click Try for free (or Start if you're already logged in)
  2. Paste one or more feed URLs into Feed URLs (e.g. https://news.ycombinator.com/rss)
  3. Optionally cap Max items per feed (default 100)
  4. Click Start
  5. Download the dataset in JSON, CSV, HTML, or Excel — or hit the API endpoint

Input

  • Feed URLs — one or more RSS/Atom/RDF feed URLs
  • Max items per feed — cap on items per feed (default 100, use 0 for unlimited)
  • Include full content — attach <content:encoded> body to each item (default: yes)
  • Include raw XML — debug mode: attach the original raw item XML (default: no)
  • Proxy configuration — optional Apify Proxy for paid/protected feeds
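
When the actor is started programmatically, the options above become fields of a JSON input object. The sketch below builds such an object in Python. Only `includeContent` and `includeRawXml` are names that appear on this page; `feedUrls`, `maxItemsPerFeed`, and the helper `build_run_input` are assumed for illustration and should be checked against the actor's input schema before use.

```python
from __future__ import annotations

from typing import Any


def build_run_input(
    feed_urls: list[str],
    max_items_per_feed: int = 100,
    include_content: bool = True,
    include_raw_xml: bool = False,
) -> dict[str, Any]:
    """Build an input object mirroring the options listed above.

    NOTE: "feedUrls" and "maxItemsPerFeed" are assumed key names,
    not confirmed by this page.
    """
    if not feed_urls:
        raise ValueError("at least one feed URL is required")
    if max_items_per_feed < 0:
        raise ValueError("max_items_per_feed must be 0 (unlimited) or positive")
    return {
        "feedUrls": list(feed_urls),            # assumed key name
        "maxItemsPerFeed": max_items_per_feed,  # assumed key name; 0 means unlimited
        "includeContent": include_content,
        "includeRawXml": include_raw_xml,
    }


run_input = build_run_input(["https://news.ycombinator.com/rss"], max_items_per_feed=30)
```

The resulting dictionary can be serialized with `json.dumps` and supplied as the run input. The proxy option is left out because its structure is not described here.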

Output

{
  "title": "Show HN: Atlas — 6 MCP servers for Claude",
  "link": "https://news.ycombinator.com/item?id=12345678",
  "guid": "12345678",
  "pubDate": "Wed, 15 Apr 2026 15:42:00 +0000",
  "author": "mohye24k",
  "summary": "Atlas is a suite of 6 MCP servers...",
  "content": "<p>Full HTML content here...</p>",
  "categories": ["AI", "Open Source"],
  "enclosureUrl": null,
  "enclosureType": null,
  "enclosureLength": null,
  "feedTitle": "Hacker News",
  "feedDescription": "Links for the intellectually curious",
  "feedLink": "https://news.ycombinator.com/",
  "feedUrl": "https://news.ycombinator.com/rss",
  "feedType": "rss",
  "extractedAt": "2026-04-15T17:00:00.000Z"
}

Data table

| Field | Type | Description |
| --- | --- | --- |
| title | string | Item title |
| link | string | Permalink to the item |
| guid | string | Unique identifier (RSS) or <id> (Atom) |
| pubDate | string | Publication date as found in the feed |
| author | string | Author name (<author>, <dc:creator>, or <author>/<name>) |
| summary | string | Short description |
| content | string | Full body (<content:encoded> for RSS, <content> for Atom) |
| categories | array | List of category tags |
| enclosureUrl | string | Attached file URL (podcasts, attachments) |
| enclosureType | string | MIME type of enclosure |
| enclosureLength | number | File size in bytes |
| feedTitle | string | Feed channel title |
| feedDescription | string | Feed channel description |
| feedLink | string | Feed website URL |
| feedUrl | string | The feed URL you provided |
| feedType | string | rss, atom, or rdf |
| extractedAt | string | ISO timestamp of extraction |
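
Because pubDate is passed through as found in the feed, RSS items typically carry an RFC 822 style date while Atom items carry an ISO 8601 timestamp. The sketch below is one way to normalize both into a timezone-aware UTC datetime downstream. The helper name `parse_pub_date` is hypothetical, and treating offset-less dates as UTC is an assumption made here, not something the actor does.

```python
from __future__ import annotations

from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_pub_date(value: str | None) -> datetime | None:
    """Parse a feed date string into a timezone-aware UTC datetime.

    Tries the RFC 822 style used by RSS first, then the ISO 8601 style
    used by Atom. Returns None when neither format matches.
    """
    if not value:
        return None
    text = value.strip()
    try:
        parsed = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        # Older Python versions reject a trailing "Z" in fromisoformat.
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        try:
            parsed = datetime.fromisoformat(text)
        except ValueError:
            return None
    if parsed.tzinfo is None:
        # Assumption: dates without an offset are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

Feeds in the wild use many other date formats, so a None result should be expected and handled, for example by falling back to extractedAt.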

Pricing

This actor uses Apify's pay-per-event pricing:

  • Actor start: $0.01 per run
  • Per item extracted: $0.002 per item

Example costs:

  • Hacker News RSS (30 items) → ~$0.07
  • A blog feed with 50 items → ~$0.11
  • 10 feeds × 50 items each → ~$1.01

Free Apify tier members get $5/month in platform credits, which covers about 2,500 items per month at $0.002 each, minus $0.01 for every run started.
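
The example costs follow directly from the two fees listed above. A small sketch of that arithmetic, using `Decimal` to avoid floating-point rounding, is shown here; the function names are illustrative and the fees are simply the ones quoted on this page.

```python
from decimal import Decimal

START_FEE = Decimal("0.01")   # charged once per run
ITEM_FEE = Decimal("0.002")   # charged per extracted item


def estimate_cost(items: int, runs: int = 1) -> Decimal:
    """Total cost in USD for `items` extracted across `runs` actor starts."""
    return START_FEE * runs + ITEM_FEE * items


def items_within_budget(budget: Decimal, runs: int = 1) -> int:
    """Number of items a budget covers after paying the start fee for each run."""
    remaining = budget - START_FEE * runs
    if remaining <= 0:
        return 0
    return int(remaining // ITEM_FEE)
```

For instance, `estimate_cost(500)` corresponds to the "10 feeds × 50 items" case in a single run. Calling `items_within_budget(Decimal("5"))` shows how far $5 goes in one run, and raising `runs` shows how start fees reduce that figure when the actor is scheduled frequently.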

Tips and advanced options

  • Schedule daily runs to catch new items as they appear, then deduplicate by guid downstream
  • Combine multiple feeds in one run to save on the actor-start fee — pass an array
  • Disable includeContent for faster runs and smaller datasets when you only need headlines
  • Enable Apify Proxy for feeds that sit behind Cloudflare or are rate-limited
  • Pipe into Slack / Discord / Email via Apify integrations
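
The first tip, deduplicating by guid downstream, could look roughly like the sketch below. It assumes the items are the dataset rows described under Output, loaded as Python dictionaries. The fallback to link when guid is missing is an assumption added for illustration, and the helper name `new_items` is hypothetical.

```python
from __future__ import annotations

from typing import Iterable


def new_items(items: Iterable[dict], seen: set[str]) -> list[dict]:
    """Return items not seen before and record their keys in `seen`.

    `seen` is mutated in place so it can be reused across scheduled runs.
    """
    fresh = []
    for item in items:
        # Assumption: fall back to the link when a feed omits the guid.
        key = item.get("guid") or item.get("link")
        if key is None:
            # Nothing to deduplicate on, so keep the item.
            fresh.append(item)
            continue
        if key in seen:
            continue
        seen.add(key)
        fresh.append(item)
    return fresh
```

Persisting `seen` between runs (a JSON file, a key-value store, a database column) is left to the caller; the function only handles the comparison.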

FAQ and support

Does it support podcast feeds? Yes. Each episode's audio file URL is returned in enclosureUrl, with its MIME type and size in enclosureType and enclosureLength.
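
As a rough illustration, the enclosure fields can be used to pull an episode list out of a podcast feed's dataset rows. The sketch assumes audio enclosures report a MIME type starting with `audio/`; the helper name and the output keys `audioUrl` and `sizeBytes` are made up for this example.

```python
from __future__ import annotations

from typing import Iterable


def podcast_episodes(items: Iterable[dict]) -> list[dict]:
    """Keep only items that carry an audio enclosure."""
    episodes = []
    for item in items:
        url = item.get("enclosureUrl")
        mime = item.get("enclosureType") or ""
        # Assumption: audio enclosures use an "audio/..." MIME type.
        if url and mime.startswith("audio/"):
            episodes.append(
                {
                    "title": item.get("title"),
                    "audioUrl": url,
                    "sizeBytes": item.get("enclosureLength"),
                }
            )
    return episodes
```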

Does it support paywalled or authenticated feeds? Yes. Pass a feed URL with ?token=... or use the proxy configuration to route through your own IP.

What about feeds with custom namespaces? The actor extracts dc:creator, content:encoded, and other common namespaced elements. For exotic namespaces, enable includeRawXml and parse the raw item XML downstream.
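
A minimal sketch of that downstream parsing, using Python's standard `xml.etree.ElementTree`, is given below. It assumes the raw XML is a single item fragment whose namespace prefixes were declared on the feed's root element, so the prefixes are re-declared on a wrapper before parsing. The name of the output field that holds the raw XML is not stated on this page, so the function takes the XML string directly; `find_namespaced_text` is a hypothetical helper.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# The iTunes podcast namespace, used here as an example of a custom namespace.
ITUNES_NS = {"itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd"}


def find_namespaced_text(
    raw_item_xml: str, path: str, namespaces: dict[str, str]
) -> str | None:
    """Return the text of the first element matching `path`, or None."""
    # Re-declare the prefixes on a wrapper element, because an item cut out
    # of a feed usually relies on declarations made on the feed's root.
    declarations = " ".join(
        f'xmlns:{prefix}="{uri}"' for prefix, uri in namespaces.items()
    )
    root = ET.fromstring(f"<wrapper {declarations}>{raw_item_xml}</wrapper>")
    node = root.find(path, namespaces)
    if node is None or node.text is None:
        return None
    return node.text.strip()
```

Every prefix that appears in the fragment has to be present in `namespaces`; an undeclared prefix makes `ET.fromstring` raise `ParseError`. A call such as `find_namespaced_text(raw, ".//itunes:duration", ITUNES_NS)` would then read an episode's duration element if the feed provides one.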

Found a bug? Open an issue on the Issues tab.