RSS Feed Scraper — News Scraper & Article Extractor

Pricing

$6.99/month + usage

Scrape any RSS or Atom news feed. Get article title, URL, description, author, date, category, and image. 20+ built-in presets: BBC, Reuters, TechCrunch, CNN, NYT, Wired & more. Optional full article text. No login. $6.99/month. 2-hour free trial.

Rating: 0.0 (0)

Developer: Scrape Pilot (maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 2 days ago

📰 RSS Feed Scraper — News Scraper & Article Extractor

RSS Feed Scraper extracts articles from any RSS or Atom feed. Choose from 20+ built-in presets (BBC, Reuters, TechCrunch, CNN, NYT, Wired, The Verge, Reddit, and more) or paste any custom feed URL. Each article comes back with title, URL, description, author, publish date, category, and image. Full article text extraction is available as an option. No login or API key is required, and the output is structured data.



🔍 What Is This Actor?

RSS Feed Scraper is a production-ready Apify actor that extracts structured article data from any RSS or Atom news feed — using 20+ built-in presets for the world's top publications or any custom feed URL you provide.

Select a preset like bbc_news, techcrunch, or reuters_world — or paste your own RSS feed URL — and receive back a clean dataset of news articles: title, article URL, description, author, publish date, category, and thumbnail image. Enable the optional full article extraction mode to also retrieve the complete article body text from each linked page.

This news scraper works with any publication that provides an RSS or Atom feed — from major global outlets to niche industry blogs — making it the most versatile news article scraper available on Apify.


🚀 Why Use This RSS Feed Scraper?

| Feature | This Actor | Manual Reading | Google Alerts | Other Scrapers |
| --- | --- | --- | --- | --- |
| RSS feed scraper (any feed URL) | | | | ⚠️ Limited |
| 20+ built-in news presets | | | | |
| Full article text extraction | ✅ Optional | ✅ Slow | | ⚠️ |
| Multiple feeds in one run | | | | |
| Author, category, image | ✅ All fields | | | ⚠️ |
| RSS + Atom both supported | | N/A | N/A | ⚠️ |
| No login or API key | | | | ❌ Required |
| Structured JSON output | | | | ⚠️ |
| Export to CSV / Excel | ✅ Via Apify | | | |
| Scheduled runs | | | ✅ Email only | |

Bottom line: This RSS feed scraper combines 20+ built-in news presets, custom feed URL support, multi-feed batch runs, and optional full article body extraction in one tool, with no credentials needed.


📡 Built-in News Feed Presets

Use any preset name directly in the preset_feed input — no URL needed:

🌍 News & World

| Preset Key | Source |
| --- | --- |
| bbc_news | BBC News: Top Stories |
| bbc_tech | BBC News: Technology |
| reuters_world | Reuters: World News |
| reuters_tech | Reuters: Technology |
| cnn_top | CNN: Top Stories |
| nyt_home | New York Times: Homepage |
| nyt_tech | New York Times: Technology |
| wsj_world | Wall Street Journal: World |
| google_news | Google News: Top Stories |

💻 Tech & Science

| Preset Key | Source |
| --- | --- |
| techcrunch | TechCrunch: All Stories |
| techcrunch_ai | TechCrunch: AI Category |
| wired | Wired Magazine |
| the_verge | The Verge |
| arstechnica | Ars Technica |
| engadget | Engadget |
| mit_tech_review | MIT Technology Review |
| hn_frontpage | Hacker News: Front Page |

🗣️ Community

| Preset Key | Source |
| --- | --- |
| reddit_world | Reddit: r/worldnews |
| reddit_tech | Reddit: r/technology |
| reddit_ai | Reddit: r/artificial |

Don't see your feed? Paste any RSS or Atom feed URL directly into feed_urls — the news article scraper handles any valid feed automatically.


🎯 Use Cases

📰 News Monitoring & Media Intelligence

  • Use this news scraper to monitor multiple publications simultaneously for breaking stories on any topic
  • Build automated news briefing pipelines by scheduling daily RSS feed scraper runs
  • Track how different outlets cover the same story by scraping multiple feeds in one run

🤖 AI & NLP Training Datasets

  • Build large news article datasets for text classification, summarization, or language model training
  • Collect article titles and descriptions from diverse news sources for headline generation research
  • Use the full article extraction mode to build rich training corpora from any news publication

📊 Content Research & Competitive Analysis

  • Monitor competitor publications by scraping their RSS feeds for topic and publishing frequency analysis
  • Track technology trend coverage across multiple tech publications simultaneously
  • Collect article metadata for content gap analysis and editorial planning

🛠️ Developer & Content Pipeline Integrations

  • Feed news article data into Slack bots, newsletters, dashboards, or CMS platforms automatically
  • Build a multi-source news aggregator using structured data from this RSS feed scraper
  • Integrate real-time news data into AI applications, chatbots, or research tools

🎓 Academic & Journalism Research

  • Collect news article datasets for media bias research, framing analysis, or agenda-setting studies
  • Archive news coverage of specific events across multiple outlets for longitudinal research
  • Build structured datasets of news articles for computational journalism or fact-checking tools

🏢 Brand & Topic Monitoring

  • Monitor brand mentions across major publications using keyword-relevant RSS feeds
  • Track industry news from trade publications by adding their feed URLs to a scheduled run
  • Build a real-time news alert system by combining this news scraper with Apify's scheduling

⚙️ Input Parameters

{
  "preset_feed": "techcrunch",
  "feed_urls": [],
  "max_results": 20,
  "fetch_full_articles": false,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| preset_feed | string | "" | Built-in preset name, e.g. "bbc_news", "reuters_tech", "techcrunch_ai". See the full preset list above |
| feed_urls | array or string | [] | Custom RSS or Atom feed URLs. Paste any valid feed; a newline-separated string is also accepted. Multiple feeds are processed in one run |
| max_results | integer | 20 | Maximum total articles to return across all feeds |
| fetch_full_articles | boolean | false | When true, visits each article URL and extracts the full article body text. Adds ~5–10 seconds per article |
| proxyConfiguration | object | Off | Optional proxy config; not required for most RSS feeds |

Tip: You can combine preset_feed and feed_urls in the same run. The preset feed is processed first, then your custom URLs in order. All feeds are handled within the single run, and their results are merged into one dataset, capped at max_results articles in total.
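
The input rules above can be sketched as a small helper that assembles a valid input object before a run. This is only an illustration: `build_run_input` and `KNOWN_PRESETS` are hypothetical names that are not part of the actor, and the validation rules are assumptions drawn from the parameter table, not the actor's own checks.

```python
from __future__ import annotations

from typing import Any

# Preset keys copied from the preset tables in this README.
KNOWN_PRESETS = {
    "bbc_news", "bbc_tech", "reuters_world", "reuters_tech", "cnn_top",
    "nyt_home", "nyt_tech", "wsj_world", "google_news",
    "techcrunch", "techcrunch_ai", "wired", "the_verge", "arstechnica",
    "engadget", "mit_tech_review", "hn_frontpage",
    "reddit_world", "reddit_tech", "reddit_ai",
}


def build_run_input(
    preset_feed: str = "",
    feed_urls: list[str] | str | None = None,
    max_results: int = 20,
    fetch_full_articles: bool = False,
) -> dict[str, Any]:
    """Hypothetical helper: assemble the input object documented above."""
    if preset_feed and preset_feed not in KNOWN_PRESETS:
        raise ValueError(f"unknown preset: {preset_feed!r}")

    # feed_urls may be a list or a newline-separated string; normalise both
    # to a list of non-empty, stripped URLs.
    if feed_urls is None:
        urls: list[str] = []
    elif isinstance(feed_urls, str):
        urls = [u.strip() for u in feed_urls.splitlines() if u.strip()]
    else:
        urls = [u.strip() for u in feed_urls if u.strip()]

    # Assumption: a run needs at least one feed from either source.
    if not preset_feed and not urls:
        raise ValueError("provide preset_feed, feed_urls, or both")
    if max_results < 1:
        raise ValueError("max_results must be a positive integer")

    return {
        "preset_feed": preset_feed,
        "feed_urls": urls,
        "max_results": max_results,
        "fetch_full_articles": fetch_full_articles,
        "proxyConfiguration": {"useApifyProxy": False},
    }
```

The resulting dictionary has the same shape as the JSON input shown above, so it could be pasted into the Apify Console or passed as the run input when starting the actor through an API client.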


📋 Output Fields

Every record from this news article scraper includes:

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| title | string | Article headline (max 300 chars) | "OpenAI releases GPT-5 with major reasoning improvements" |
| url | string | Full article URL | "https://techcrunch.com/2024/03/15/..." |
| description | string | Article summary or excerpt (max 1000 chars) | "OpenAI has announced the release of..." |
| published | string | Publication date and time | "Fri, 15 Mar 2024 09:30:00 GMT" |
| author | string | Article author name | "Sarah Perez" |
| category | string | Article categories (up to 3, comma-separated) | "AI, Technology, Startups" |
| image | string | Article thumbnail or featured image URL | "https://techcrunch.com/wp-content/..." |
| source | string | Feed source domain | "techcrunch.com" |
| type | string | Feed format detected | "rss", "atom" |
| full_text | string | Full article body text, only when fetch_full_articles: true (max 5000 chars) | "OpenAI has today announced..." |
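
Because `published` and `category` arrive as plain strings, downstream code usually needs to parse them. The sketch below shows one way to do that with the Python standard library. `Article`, `parse_published`, and `to_article` are hypothetical names. The ISO 8601 fallback is an assumption about what Atom feeds may yield, since the only example date in this README is in the RFC 822 style.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from email.utils import parsedate_to_datetime


@dataclass
class Article:
    """Hypothetical typed view of one dataset record."""
    title: str
    url: str
    published: datetime | None
    categories: list[str]
    source: str


def parse_published(value: str | None) -> datetime | None:
    """Parse the `published` string; return None if it cannot be parsed."""
    if not value:
        return None
    try:
        # RFC 822 style, e.g. "Fri, 15 Mar 2024 09:30:00 GMT".
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        pass
    try:
        # Assumed fallback for ISO 8601 dates such as "2024-03-15T09:30:00Z".
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None


def to_article(record: dict) -> Article:
    """Convert one raw record (a dict from the dataset) into an Article."""
    raw_categories = record.get("category") or ""
    return Article(
        title=record.get("title") or "",
        url=record.get("url") or "",
        published=parse_published(record.get("published")),
        # `category` is documented as a comma-separated string.
        categories=[c.strip() for c in raw_categories.split(",") if c.strip()],
        source=record.get("source") or "",
    )
```

Unparseable dates become `None` rather than raising, so one odd feed does not break a whole batch.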

📦 Example Input & Output

Input — preset feed:

{
  "preset_feed": "techcrunch_ai",
  "max_results": 5
}

Input — custom feed URLs:

{
  "feed_urls": [
    "https://www.wired.com/feed/rss",
    "https://www.theverge.com/rss/index.xml"
  ],
  "max_results": 10,
  "fetch_full_articles": false
}

Output (one record):

{
  "title": "OpenAI releases GPT-5 with major reasoning improvements",
  "url": "https://techcrunch.com/2024/03/15/openai-gpt5/",
  "description": "OpenAI has announced the release of GPT-5, featuring significant improvements in multi-step reasoning and code generation tasks.",
  "published": "Fri, 15 Mar 2024 09:30:00 GMT",
  "author": "Sarah Perez",
  "category": "Artificial Intelligence, Technology",
  "image": "https://techcrunch.com/wp-content/uploads/2024/03/gpt5.jpg",
  "source": "techcrunch.com",
  "type": "rss",
  "full_text": null
}
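
When several feeds are merged into one dataset, the `source` field is what separates them again. As a minimal sketch, assuming the records have been exported as a list of dictionaries shaped like the one above, they can be grouped per outlet as follows (`group_by_source` is a hypothetical helper name):

```python
from __future__ import annotations

from collections import defaultdict


def group_by_source(records: list[dict]) -> dict[str, list[dict]]:
    """Group dataset records by their `source` domain, preserving order."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for record in records:
        # Records without a source are collected under "unknown".
        grouped[record.get("source") or "unknown"].append(record)
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        {"title": "Story A", "source": "techcrunch.com"},
        {"title": "Story B", "source": "wired.com"},
        {"title": "Story C", "source": "techcrunch.com"},
    ]
    for source, items in group_by_source(sample).items():
        print(source, [item["title"] for item in items])
```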

💰 Pricing & Free Trial

| Plan | Price | Includes |
| --- | --- | --- |
| Free Trial | $0 | 2 hours full access, no credit card required |
| Monthly | $6.99 / month | Unlimited runs, all presets, custom feeds, full article extraction |

Everything included in every plan:

  • ✅ 20+ built-in news feed presets — BBC, Reuters, TechCrunch, CNN, NYT, and more
  • ✅ Custom RSS and Atom feed URL support — any publication
  • ✅ Multi-feed batch — process multiple feeds in one run
  • ✅ Full article body text extraction (optional)
  • ✅ Author, category, image, and publish date per article
  • ✅ No login or API key required
  • ✅ JSON + CSV + Excel export from Apify dataset
  • ✅ Scheduled runs for automated news monitoring

Start your 2-hour free trial now — no credit card needed. Click Try for free at the top of this page.


⚡ Performance & Limits

| Mode | Articles | Estimated Time |
| --- | --- | --- |
| Single preset feed | 20 | ~10–20 seconds |
| Multiple feeds | 50 | ~30–60 seconds |
| With full article extraction | 20 | ~3–5 minutes |
| With full article extraction | 50 | ~8–15 minutes |

  • Results pushed to the Apify dataset in real time as each feed is processed
  • Full article extraction adds approximately 5–10 seconds per article — disable for faster runs
  • No proxy required for most major RSS feeds
  • Multiple feeds are processed in sequence with automatic rate limiting
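
The sequential behaviour described above (one feed at a time, a pause between feeds, a failed feed skipped, and a total cap of `max_results`) can be sketched as a simple loop. This is not the actor's source code. `collect_articles`, the injected `fetch_feed` callable, and the fixed `delay_seconds` pause are all assumptions made for illustration.

```python
from __future__ import annotations

import logging
import time
from typing import Callable

logger = logging.getLogger("rss_sketch")


def collect_articles(
    feeds: list[str],
    fetch_feed: Callable[[str], list[dict]],
    max_results: int = 20,
    delay_seconds: float = 0.0,
) -> list[dict]:
    """Process feeds one after another and return at most max_results items."""
    results: list[dict] = []
    for index, feed in enumerate(feeds):
        if len(results) >= max_results:
            break
        # Assumed rate limiting: a fixed pause between consecutive feeds.
        if index > 0 and delay_seconds > 0:
            time.sleep(delay_seconds)
        try:
            items = fetch_feed(feed)
        except Exception as exc:  # one bad feed must not stop the run
            logger.warning("skipping %s: %s", feed, exc)
            continue
        if not items:
            logger.warning("no articles in %s", feed)
            continue
        # Only take as many items as the remaining budget allows.
        results.extend(items[: max_results - len(results)])
    return results
```

Passing the fetch function in as a parameter keeps the loop testable without network access: a stub that returns canned lists or raises an exception is enough to exercise the skip-and-continue path.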

❓ FAQ

Q: Can I use this news scraper with any RSS feed, not just the presets?
A: Yes. Paste any valid RSS or Atom feed URL into the feed_urls field. The actor handles both RSS 2.0 and Atom feed formats automatically. If a publication offers an RSS feed, this RSS feed scraper can extract it.

Q: Can I process multiple RSS feeds in one run?
A: Yes. Add multiple URLs to the feed_urls array, or combine a preset_feed with custom URLs, and all feeds are processed in a single run. Results are merged into one output dataset.

Q: What does fetch_full_articles do?
A: When enabled, the actor visits each article's URL after parsing the feed and extracts the full article body text from the page. This gives you the complete article content, not just the RSS excerpt. It adds processing time, so only enable it when you need the full text.

Q: Does this work with Atom feeds as well as RSS?
A: Yes. Both RSS 2.0 and Atom feed formats are fully supported. The actor auto-detects the format and parses accordingly.

Q: Do I need a proxy for major news sites?
A: No. Most major news RSS feeds are publicly accessible without any proxy. Proxy is optional and can be enabled for feeds that restrict access by geography or IP.

Q: Can I schedule this to run daily for automated news monitoring?
A: Yes. Set up an Apify scheduled task with your chosen preset or feed URLs to automatically collect fresh articles every day, or at any interval you choose.

Q: What if a feed URL returns no articles?
A: The actor logs a warning and skips that feed, then continues processing all remaining feeds. One failed feed never stops the rest of the run.

Q: Can I export results to Excel or CSV?
A: Yes. All results are pushed to the Apify dataset, which can be exported to JSON, CSV, Excel, and more directly from the Apify Console after each run.
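
The RSS-versus-Atom auto-detection mentioned in the answers above can be illustrated by inspecting the root element of the feed document. The sketch below uses only the standard library and is not the actor's implementation; `detect_feed_type` and `count_entries` are hypothetical names. An RSS 2.0 document has an `rss` root with `channel/item` children, while an Atom document has a `feed` root in the `http://www.w3.org/2005/Atom` namespace with `entry` children.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def detect_feed_type(xml_text: str | bytes) -> str:
    """Return "rss" or "atom" based on the document's root element."""
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{ATOM_NS}}}feed":
        return "atom"
    if root.tag == "rss":
        return "rss"
    raise ValueError(f"unrecognised feed root element: {root.tag}")


def count_entries(xml_text: str | bytes) -> int:
    """Count articles: <item> under <channel> for RSS, <entry> for Atom."""
    root = ET.fromstring(xml_text)
    if detect_feed_type(xml_text) == "atom":
        return len(root.findall(f"{{{ATOM_NS}}}entry"))
    return len(root.findall("channel/item"))


if __name__ == "__main__":
    rss = "<rss version='2.0'><channel><item><title>A</title></item></channel></rss>"
    atom = f"<feed xmlns='{ATOM_NS}'><entry><title>B</title></entry></feed>"
    print(detect_feed_type(rss), count_entries(rss))
    print(detect_feed_type(atom), count_entries(atom))
```

The standard-library XML parser is not hardened against maliciously crafted documents, so a hardened parser is a safer choice for untrusted feeds.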


📜 Changelog

v1.0.0 (Current)

  • ✅ RSS 2.0 and Atom feed parsing
  • ✅ 20+ built-in news feed presets
  • ✅ Custom RSS feed URL support — any publication
  • ✅ Multi-feed batch processing in one run
  • ✅ Full article body text extraction (optional)
  • ✅ Article fields: title, URL, description, author, date, category, image, source
  • ✅ Automatic RSS vs Atom format detection
  • ✅ No proxy required for major news feeds
  • ✅ Real-time dataset push as each feed is processed

🏷️ Tags

rss feed scraper, news scraper, news article scraper, rss scraper, atom feed scraper, news feed extractor, bbc news scraper, techcrunch scraper, reuters scraper, news data extractor, media monitoring, news aggregator


This actor accesses publicly available RSS and Atom feed data published by news outlets and content creators for the purpose of content distribution.

Please note:

  • RSS feeds are intentionally published by content creators for public consumption and aggregation
  • Use extracted news article data only for lawful purposes — research, monitoring, NLP datasets, aggregation, and academic study are common legitimate uses
  • Article content is copyright of the original publisher — do not republish full article text without authorization
  • Respect individual publication terms of service when using full article extraction
  • The actor developer is not responsible for how extracted data is used

🤝 Support & Feedback

  • Bug report? Contact us via the Apify actor page
  • Feature request? Post in the Apify Community forum
  • Loving it? Please leave a ⭐ review — it helps other users find this actor!

Built with ❤️ on Apify
The most complete RSS Feed Scraper — 20+ news presets, custom feeds, full article extraction

💰 $6.99/month · 🆓 2-hour free trial · No credit card required