Pricing

$29.00/month + usage

Fast News Scraper

Developed by

Tim Green

Extract full article text and metadata from popular news sites like The New York Times, AP News, Reuters, CNBC, NPR, and Wired. Scrape thousands of articles in just a few minutes.

5.0 (2)

Issues response

1.3 days

Last modified

2 months ago

News

Fast News Scraper extracts full article text from select news and content websites with a focus on speed. It uses private APIs where available and only makes plain HTTP requests. This won't work for every website, but with a little ingenuity, it can work in a surprising number of cases. Thousands of full articles can be pulled in just minutes.

In addition to the full article text, Fast News Scraper also retrieves various pieces of metadata for each article. The full output is detailed below.

What news websites are supported?

Fast News Scraper currently supports scraping articles from the following websites:

The New York Times (nytimes.com)
The Washington Post (washingtonpost.com)
CNN (cnn.com)
Reuters (reuters.com)
Wired (wired.com)
CNBC (cnbc.com)
Associated Press (apnews.com)
NBC News (nbcnews.com)
NPR (npr.com)
People (people.com)
India Times (indiatimes.com)
CBC (cbc.ca)

Additional websites will be added over time. If there's a website you'd like to see supported, go to the Issues tab and create a new issue.

Why scrape full news articles?

There are a variety of reasons why scraping full news articles is useful:

Media monitoring: Scrape news articles to track mentions of your company, competitors, or industry-related keywords, allowing you to stay on top of your online reputation and market trends.
Research and analysis: Collect and analyze news articles to identify patterns, trends, and insights on various topics, such as politics, economics, or social issues.
Sentiment analysis: Analyze news articles to determine the sentiment around a particular topic, company, or individual, helping you understand public opinion and make informed decisions.
Event detection: Scrape news articles to detect and track events, such as natural disasters, protests, or product launches, allowing you to respond quickly and effectively.
Topic modeling: Use scraped news articles to identify underlying topics and themes, enabling you to understand the broader context and relationships between different news stories.
Entity extraction: Extract specific entities, such as people, organizations, and locations, from news articles to build databases, create profiles, or track relationships.
News recommendation: Scrape news articles to build personalized news recommendation systems, suggesting relevant content to users based on their interests and preferences.
Fake news detection: Analyze news articles to identify potential fake news stories, helping to combat misinformation and promote fact-based journalism.
Historical research: Scrape news articles to create archives of historical events, allowing researchers and scholars to study and analyze past events and trends.
Business intelligence: Collect and analyze news articles to gather competitive intelligence, track market trends, and identify business opportunities.
Content generation: Use scraped news articles as inspiration or input for generating new content, such as summaries, abstracts, or even entire articles.
Academic research: Collect and analyze news articles to support academic research in fields like journalism, communication, sociology, and political science.
Data journalism: Scrape news articles to create interactive visualizations, dashboards, and stories that help journalists and researchers.
AI training: AI models require large quantities of training data. News articles can provide a rich source of such data.

Input configuration

Fast News Scraper works by using the built-in search functionality on the website you want to scrape. You can search by query and return results based on relevance or date where supported.

Here are all the supported input fields. For more details, see the Input tab.

Field	Type	Description	Default value
site	string	The site to scrape. Must be one of the supported sites.	`reuters.com`
query	string	The query term used to search the selected site. Not all sites support queries, and only some sites allow an empty query.	`artificial intelligence`
sort	string	The order in which articles are returned. Must be either date or relevance. Not all websites support both.	`date`
maxItems	number	The approximate maximum number of items that will be returned by a run. The actual number returned may be slightly higher or lower.	`500`
datasetName	string	If this field is present, a named dataset will be used. This is useful for appending results from multiple runs.	`null`
requestQueueName	string	If this field is present, a named request queue will be used. This allows you to avoid scraping the same content across multiple runs.	`null`
proxy	object	The proxy configuration to use. This field is required.	`{ "useApifyProxy": true }`
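As a rough illustration of the table above, the sketch below assembles a run input in Python from the listed defaults. The helper name `build_input` is hypothetical and not part of the Actor. Dropping fields left at null is a choice made here for tidiness, not a documented requirement.

```python
import json

# Defaults copied from the input table above.
DEFAULT_INPUT = {
    "site": "reuters.com",
    "query": "artificial intelligence",
    "sort": "date",
    "maxItems": 500,
    "datasetName": None,
    "requestQueueName": None,
    "proxy": {"useApifyProxy": True},
}


def build_input(**overrides):
    """Return a run input dict: table defaults plus overrides (hypothetical helper)."""
    unknown = set(overrides) - set(DEFAULT_INPUT)
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    run_input = {**DEFAULT_INPUT, **overrides}
    if run_input["sort"] not in ("date", "relevance"):
        raise ValueError("sort must be 'date' or 'relevance'")
    # Drop optional fields left at null so they are simply absent from the input.
    return {key: value for key, value in run_input.items() if value is not None}


run_input = build_input(site="wired.com", query="robotics", maxItems=200)
print(json.dumps(run_input, indent=2))
```

The resulting JSON can be pasted into the Input tab or supplied as the run input when starting the Actor programmatically.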

Output example

The scraped articles will be shown as a dataset which you can find in the Output tab. Note that the output will first be organized as a table for viewing convenience.

You can preview all the fields and choose in which format to download the data you’ve extracted: JSON, CSV, Excel, HTML table, or XML. Below is a sample dataset in JSON format:

{
	"query": "Nvidia",
	"label": "apnews.com.article",
	"site": "apnews.com",
	"url": "https://apnews.com/article/nvidia-gtc-jensen-huang-ai-457e9260aa2a34c1bbcc07c98b7a0555",
	"title": "Nvidia CEO Jensen Huang unveils new Rubin AI chips at GTC 2025",
	"tags": [
		"Business"
	],
	"description": "Nvidia founder Jensen Huang kicked off the company’s artificial intelligence developer conference, on Tuesday by telling a crowd of thousands that AI is going through “an inflection point.”",
	"image": "https://dims.apnews.com/dims4/default/1110a1f/2147483647/strip/true/crop/4659x2621+0+243/resize/1440x810!/quality/90/?url=https%3A%2F%2Fassets.apnews.com%2F1c%2Fa7%2Fbb9db252004b299235ec619feb7b%2F227eb1f572f14664b6ea05d276e07359",
	"author": "SARAH PARVINI",
	"published": "2025-03-18T18:35:20",
	"updated": "2025-03-18T18:35:20",
	"content": "Nvidia founder Jensen Huang kicked off the company’s artificial intelligence developer conference on Tuesday by telling a crowd of thousands that AI is going through “an inflection point.”\n\nAt GTC 2025 — dubbed the “Super Bowl of AI” — Huang focused his keynote on the company’s advancements in AI and his predictions for how the industry will move over the next few years. Demand for GPUs from the top four cloud service providers is surging, he said, adding that... (truncated)"
}

Note: Some fields will be blank, empty, or null depending on the website and article.
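Because fields can be blank, empty, or null, downstream code may want to normalize each record before using it. The sketch below is one possible approach. The helper `normalize_article` is hypothetical, and it assumes timestamps follow the ISO 8601 shape shown in the sample, which may not hold for every site.

```python
from datetime import datetime


def normalize_article(item):
    """Fill in missing fields so later code can rely on consistent types."""

    def text(key):
        value = item.get(key)
        return value.strip() if isinstance(value, str) else ""

    def timestamp(key):
        value = item.get(key)
        try:
            return datetime.fromisoformat(value) if value else None
        except (TypeError, ValueError):
            # Unparseable or non-string timestamps are treated as missing.
            return None

    return {
        "site": text("site"),
        "url": text("url"),
        "title": text("title"),
        "author": text("author"),
        "tags": list(item.get("tags") or []),
        "published": timestamp("published"),
        "updated": timestamp("updated"),
        "content": text("content"),
    }


# A trimmed record with some fields null, blank, or absent.
sample = {
    "site": "apnews.com",
    "title": "Nvidia CEO Jensen Huang unveils new Rubin AI chips at GTC 2025",
    "tags": ["Business"],
    "author": None,
    "published": "2025-03-18T18:35:20",
    "updated": "",
}
article = normalize_article(sample)
print(article["published"], repr(article["author"]), article["updated"])
```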

How long does it take to scrape news articles?

The article extraction rate for each supported website differs. Using the default settings, here's a rough idea of how quickly you can scrape full articles using Fast News Scraper based on some test runs:

Note: All runs listed below used Datacenter proxies unless otherwise noted.

Site	Articles	Time	Rate	Notes
reuters.com	1,999	4m 17s	467 articles/minute
cnn.com	1,586	2m 03s	774 articles/minute
wired.com	1,882	4m 25s	426 articles/minute
nytimes.com	924	4m 47s	193 articles/minute	Residential proxies
washingtonpost.com	290	4m 54s	59 articles/minute
cnbc.com	645	3m 19s	195 articles/minute
apnews.com	621	3m 01s	206 articles/minute
nbcnews.com	965	2m 26s	397 articles/minute
npr.com	980	1m 56s	507 articles/minute
people.com	1,003	2m 16s	442 articles/minute
indiatimes.com	734	4m 02s	182 articles/minute
cbc.ca	462	1m 53s	245 articles/minute
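The Rate column is the article count divided by the run time in minutes. The sketch below reproduces that arithmetic and uses it for a rough time estimate. The function names are illustrative, and the estimate assumes the rate stays constant, which real runs may not do (search result limits and blocking can cut a run short).

```python
import re


def parse_duration(text):
    """Convert a duration such as '4m 17s' to seconds."""
    match = re.fullmatch(r"(\d+)m\s+(\d+)s", text.strip())
    if match is None:
        raise ValueError(f"Unrecognized duration: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


def articles_per_minute(articles, duration):
    """Average extraction rate for a finished run."""
    return articles / (parse_duration(duration) / 60)


def estimated_minutes(target_articles, rate):
    """Rough run time for a target count, assuming a constant rate."""
    return target_articles / rate


rate = articles_per_minute(1999, "4m 17s")  # the reuters.com row
print(round(rate))  # 467
print(round(estimated_minutes(5000, rate), 1))
```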

How to scrape articles from The New York Times (nytimes.com)

The New York Times is a daily newspaper based in New York City that is widely regarded as one of the most respected and authoritative sources of news and information in the world. Founded in 1851, The Times has a long history of journalistic excellence, having won 127 Pulitzer Prizes, more than any other newspaper. Known for its in-depth reporting and thoughtful analysis, The Times covers a wide range of topics, including national and international news, politics, business, culture, and more.

The New York Times locks articles behind a paywall, only allowing free users to access a limited number of articles per month. Fast News Scraper gets around this, providing access to the full text of New York Times articles.

To extract New York Times articles:

Set site to nytimes.com.
If query is omitted or left empty, all New York Times content will be returned. A non-empty query will use the website's search functionality.
Set sort to either date or relevance. If sort is omitted, articles will be returned by date.

Each New York Times search will only return a maximum of ~1,000 articles.

Note: Using Residential proxies is recommended for scraping New York Times articles to avoid getting blocked and to ensure that articles are not dropped.

Note: Some types of content, including live news content and any articles that live on a subdomain, are skipped.
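Putting the steps and notes above together, an input for The New York Times might look like the sketch below. The `apifyProxyGroups` key is an assumption: it is the usual way to select Residential proxies in an Apify proxy configuration, but check the Input tab to confirm what this Actor accepts.

```python
import json

nytimes_input = {
    "site": "nytimes.com",
    "query": "",  # empty query: all New York Times content
    "sort": "date",
    "maxItems": 1000,  # a single search returns at most roughly 1,000 articles
    "proxy": {
        "useApifyProxy": True,
        # Assumption: standard Apify proxy group selection for Residential proxies.
        "apifyProxyGroups": ["RESIDENTIAL"],
    },
}
print(json.dumps(nytimes_input, indent=2))
```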

How to scrape articles from The Washington Post (washingtonpost.com)

The Washington Post is a major American daily newspaper published in Washington, D.C. Founded in 1877, it is one of the oldest and most respected newspapers in the United States. Known for its in-depth coverage of national politics, The Post has won numerous Pulitzer Prizes for its investigative reporting, including its coverage of the Watergate scandal in the 1970s. Today, The Washington Post is a leading source of news and opinion on politics, business, sports, and culture, with a print and online circulation of millions.

The Washington Post locks articles behind a paywall, only allowing free users to access a limited number of articles per month. Fast News Scraper gets around this, providing access to the full text of Washington Post articles.

To extract Washington Post articles:

Set site to washingtonpost.com.
Set query to a non-empty string. The Washington Post does not allow empty queries.
Articles will always be returned sorted by relevance. The sort field will be ignored.

The default Datacenter proxies work just fine with The Washington Post.

Note: Each query generally only returns a few hundred articles.

How to scrape articles from Reuters (reuters.com)

Reuters is a leading international news agency that provides comprehensive and unbiased coverage of global news, including politics, business, finance, technology, and more. Founded in 1851, Reuters is one of the oldest and most respected news agencies in the world, with a reputation for accuracy, speed, and independence. Reuters.com offers real-time news coverage, in-depth analysis, and commentary on global events, as well as video and photography from around the world.

Reuters requires registration to view unlimited articles, only allowing unregistered users to access a limited number of articles per month. Fast News Scraper gets around this, providing access to the full text of Reuters articles.

To extract Reuters articles:

Set site to reuters.com.
Set query to a non-empty string. Reuters does not allow empty queries.
Set sort to either date or relevance. If sort is omitted, articles will be returned by date.

The default Datacenter proxies work just fine with Reuters.

How to scrape articles from CNN (cnn.com)

CNN (Cable News Network) is a 24-hour cable news channel that provides continuous coverage of global news, politics, business, entertainment, and more. Founded in 1980, CNN is one of the most recognized and respected news brands in the world, known for its breaking news coverage, in-depth reporting, and live coverage of major events. CNN.com offers a wide range of news content, including video, articles, and blogs, as well as live streaming of CNN TV programming.

CNN doesn't require registration to view news articles, so scraping the website is relatively straightforward.

To extract CNN articles:

Set site to cnn.com.
If query is omitted or left empty, all CNN content will be returned. A non-empty query will use the website's search functionality.
Set sort to either date or relevance. If sort is omitted, articles will be returned by date.

The default Datacenter proxies work just fine with CNN.

Note: Some types of content, including video, live news, CNN Underscored, and gallery content, will be skipped. Any articles that are in a special format (interactive, etc.) will likely fail to be extracted.

How to scrape articles from Wired (wired.com)

Wired is a technology-focused news site that provides in-depth coverage of the latest developments in tech, science, and innovation. Founded in 1993, Wired is known for its cutting-edge reporting on emerging trends, gadgets, and ideas that are shaping the future of business, culture, and society. Wired.com features news, analysis, and commentary on topics such as artificial intelligence, cybersecurity, robotics, and more, as well as profiles of innovators and entrepreneurs who are changing the world.

Wired locks articles behind a paywall, only allowing free users to access a limited number of articles per month. Fast News Scraper gets around this, providing access to the full text of Wired articles.

To extract Wired articles:

Set site to wired.com.
If query is omitted or left empty, all Wired content will be returned. A non-empty query will use the website's search functionality.
Set sort to either date or relevance. If sort is omitted, articles will be returned by date.

The default Datacenter proxies work just fine with Wired.

Note: Sponsored content is skipped.

How to scrape articles from CNBC (cnbc.com)

CNBC, or Consumer News and Business Channel, is a 24-hour cable television network that provides business news and financial information to a global audience. Founded in 1989, CNBC is a leading source of business and market news, offering live coverage of stock markets, economic indicators, and corporate news. The network's programming includes popular shows such as "Squawk Box," "Fast Money," and "Mad Money with Jim Cramer," featuring expert analysis and commentary from experienced journalists and financial experts. CNBC also provides online content, including articles, videos, and podcasts, making it a one-stop shop for investors, business leaders, and anyone interested in staying informed about the world of finance.

CNBC locks its PRO articles behind a paywall. Fast News Scraper gets around this, providing access to the full text of CNBC articles, regardless of whether they're standard articles or PRO articles.

To extract CNBC articles:

Set site to cnbc.com.
Set query to a non-empty string. CNBC does not allow empty queries.
Set sort to either date or relevance. If sort is omitted, articles will be returned by date.

The default Datacenter proxies work just fine with CNBC.

Note: Video content is skipped, and some "live update" content will fail to be scraped.

How to scrape articles from Associated Press (apnews.com)

The Associated Press (AP) is a non-profit news cooperative that has been a leading source of factual reporting for over 175 years. Founded in 1846, the AP is one of the largest and most respected news organizations in the world, providing comprehensive coverage of national and international news to thousands of newspapers, television and radio stations, and online media outlets. The AP's website, apnews.com, offers a wealth of news, photos, and videos on a wide range of topics, including politics, business, sports, and entertainment. Visitors to the site can access breaking news, in-depth analysis, and feature stories, as well as watch live video and access a vast archive of AP content. With its commitment to fact-based reporting and impartiality, apnews.com is a trusted source of news and information for people around the world.

To extract AP articles:

Set site to apnews.com.
Set query to a non-empty string. AP does not allow empty queries.
Set sort to either date or relevance. If sort is omitted, articles will be returned by date.

The default Datacenter proxies work just fine with AP, although if some articles are getting blocked, you can try Residential proxies.

How to scrape articles from NBC News (nbcnews.com)

NBCNews.com is the online news website of NBC News, a leading American news organization that provides comprehensive coverage of national and international news. The site offers a wide range of news, analysis, and features on topics such as politics, business, health, technology, and entertainment. Visitors to the site can access breaking news, in-depth reporting, and investigative journalism, as well as watch live video and access a vast array of multimedia content. NBCNews.com is known for its coverage of major news events, including elections, natural disasters, and global conflicts, and features reporting from NBC News correspondents and anchors, including Rachel Maddow, Lester Holt, and Katy Tur. The site also offers specialized sections, such as NBC News' investigative unit, "Investigations," and "Health," which provides the latest news and information on health and wellness topics.

To extract NBC News articles:

Set site to nbcnews.com.
Set query to a non-empty string. NBC News does not allow empty queries.

Note: NBC News does not support sorting by date. Articles will be returned sorted by relevance and the sort parameter will be ignored.

The default Datacenter proxies work just fine with NBC News, although if some articles are getting blocked, you can try Residential proxies.

How to scrape articles from NPR (npr.com)

NPR (National Public Radio) is a non-profit media organization that produces and distributes news, information, and cultural programming to a wide audience through its network of public radio stations and online platforms. NPR's flagship website, npr.org, offers a wealth of news, analysis, and features on a wide range of topics, including politics, science, arts, and culture. Visitors to the site can access NPR's signature news programs, such as "Morning Edition" and "All Things Considered," as well as original reporting and storytelling from NPR's correspondents and producers. The site also features a diverse array of podcasts, including "How I Built This," "TED Radio Hour," and "Planet Money," which offer in-depth explorations of topics such as business, technology, and global issues. With its commitment to in-depth reporting and nuanced storytelling, npr.org is a trusted source of news and information for millions of Americans.

To extract NPR articles:

Set site to npr.com.
Set query to a non-empty string. NPR does not allow empty queries.
Set sort to either date or relevance. If sort is omitted, articles will be returned by date.

The default Datacenter proxies work just fine with NPR, although if some articles are getting blocked, you can try Residential proxies.

How to scrape articles from People (people.com)

People is a weekly American magazine that features news, celebrity gossip, and human-interest stories, with a focus on entertainment, lifestyle, and current events. Founded in 1974 by Time Inc., the magazine has become one of the most widely read publications in the world, with a circulation of over 3.5 million copies per issue. The magazine's website, People.com, is a leading online destination for celebrity news, covering the latest movies, TV shows, music, and celebrity culture, along with exclusive interviews, photos, and videos.

To extract People articles:

Set site to people.com.
Set query to a non-empty string. People does not allow empty queries.

Note: People does not support sorting by date. Articles will be returned sorted by relevance and the sort parameter will be ignored.

The default Datacenter proxies work just fine with People, although if some articles are getting blocked, you can try Residential proxies.

How to scrape articles from India Times (indiatimes.com)

Indiatimes.com is a popular Indian online portal that is part of Times Internet, the digital arm of the Times Group. It offers content across categories including news, entertainment, lifestyle, and technology. The website aggregates content from various Times Group properties, including the Times of India, and features sections such as entertainment news and gossip on timesnownews.com and entertainment.timesofindia.indiatimes.com, lifestyle and wellness content on femina.in and indiatimes.com/life-style, and technology news on indiatimes.com/tech.

To extract India Times articles:

Set site to indiatimes.com.
Set query to a non-empty string. India Times does not allow empty queries.

Note: India Times does not support sorting by relevance. Articles will be returned sorted by date and the sort parameter will be ignored.

The default Datacenter proxies work just fine with India Times, although if some articles are getting blocked, you can try Residential proxies.

How to scrape articles from CBC (cbc.ca)

The Canadian Broadcasting Corporation (CBC), established in 1936, is Canada’s public broadcaster, offering a wide range of news, entertainment, and cultural content across television, radio, and digital platforms. Its flagship website, cbc.ca, serves as a comprehensive hub for real-time news updates, in-depth analysis, and multimedia content, including live streams of CBC’s TV and radio channels, podcasts, and on-demand programming. Known for its commitment to journalistic integrity and public service, CBC provides balanced coverage of national and international events with a distinct Canadian perspective, alongside regional news tailored to diverse communities across the country. The site features bilingual content in English and French, reflecting Canada’s linguistic duality, and offers 24/7 access to breaking news, sports, arts, and investigative reports. Renowned for its trusted reporting and role in informing the public, CBC remains a cornerstone of Canadian media, fostering civic engagement and connecting audiences through accessible, inclusive storytelling.

To extract CBC articles:

Set site to cbc.ca.
Set query to a non-empty string. CBC does not allow empty queries.

Note: Only news articles are returned. Some types of content, including interactive features and video content, are skipped.

Note: Each query appears limited to returning a few hundred articles.

The default Datacenter proxies work just fine with CBC, although if some articles are getting blocked, you can try Residential proxies.
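The per-site rules in the sections above can be collected into a lookup table and checked before starting a run. The sketch below is a local pre-flight check, not something the Actor provides. `SITE_RULES` and `check_input` are hypothetical names, and the CBC entry assumes the sort field is honored because the CBC section does not say otherwise.

```python
# site: (empty query allowed, sort order forced by the site or None)
SITE_RULES = {
    "nytimes.com": (True, None),
    "washingtonpost.com": (False, "relevance"),
    "reuters.com": (False, None),
    "cnn.com": (True, None),
    "wired.com": (True, None),
    "cnbc.com": (False, None),
    "apnews.com": (False, None),
    "nbcnews.com": (False, "relevance"),
    "npr.com": (False, None),
    "people.com": (False, "relevance"),
    "indiatimes.com": (False, "date"),
    "cbc.ca": (False, None),  # sort behavior is not documented above
}


def check_input(site, query="", sort="date"):
    """Return (effective_sort, warnings) for an input, or raise ValueError."""
    if site not in SITE_RULES:
        raise ValueError(f"Unsupported site: {site}")
    allows_empty_query, forced_sort = SITE_RULES[site]
    if not query.strip() and not allows_empty_query:
        raise ValueError(f"{site} requires a non-empty query")
    warnings = []
    if forced_sort is not None and sort != forced_sort:
        warnings.append(f"{site} ignores sort; results come back by {forced_sort}")
    return forced_sort or sort, warnings


print(check_input("people.com", "royal family", sort="date"))
print(check_input("cnn.com"))
```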

How to pull only new articles

Let's say you want to schedule Fast News Scraper to run once a week and pull any new articles from wired.com that you haven't already extracted. The key is to use a named dataset and request queue, which can be done using the datasetName and requestQueueName input fields. Each time you run Fast News Scraper, only articles that have not yet been scraped will be processed, and the scraper will automatically stop once it's reached a point where the only articles it's finding are articles you've already scraped. This way you avoid wasting time and money repeatedly scraping the same content.

Learn more about scheduling on the Apify platform.

Note: If you use a named dataset, data will be pushed to the named dataset and an unnamed dataset linked to the run. This is a limitation of the Apify platform. You can view the full dataset and request queue by navigating to the Storage page in the Apify console.
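A weekly wired.com run could reuse an input like the one below. The storage names `wired-articles` and `wired-requests` are placeholders. The `new_urls` function is only a conceptual illustration of the stop-when-nothing-is-new behavior described above, assuming results arrive newest first in pages. It is not the Actor's actual implementation.

```python
weekly_input = {
    "site": "wired.com",
    "query": "",
    "sort": "date",
    "datasetName": "wired-articles",  # placeholder name
    "requestQueueName": "wired-requests",  # placeholder name
    "proxy": {"useApifyProxy": True},
}


def new_urls(pages, seen):
    """Yield unseen URLs page by page; stop at the first page with nothing new."""
    for page in pages:
        fresh = [url for url in page if url not in seen]
        if not fresh:
            break
        for url in fresh:
            seen.add(url)
            yield url


# URLs scraped by earlier runs (stands in for the named request queue).
seen = {"https://example.com/a", "https://example.com/b"}
pages = [
    ["https://example.com/d", "https://example.com/c"],
    ["https://example.com/b", "https://example.com/a"],
    ["https://example.com/z"],  # never reached: the previous page had nothing new
]
result = list(new_urls(pages, seen))
print(result)
```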

Copyright and publishing

Extracting articles is legal, as you are scraping publicly available content. Please be aware that most articles are protected by copyright laws. Before you publish extracted articles anywhere, check the terms of use of the scraped website. In other words: Don't be a jerk.

News icons created by Freepik - Flaticon

Other Scrapers

Unlimited Airbnb Scraper - Scrape all Airbnb listings in a location. Get thousands of results in just a few minutes.
