News Article Scraper

Pricing

Pay per usage

Developer

Donny Nguyen

Maintained by Community

What does News Article Scraper do?

News Article Scraper extracts full article content from news websites such as TechCrunch, CNN, and BBC, as well as other online publications. For each article it collects the headline, full body text, author name, publication date, image URL, and source URL from the news sites you point it at. Typical uses are media monitoring, content aggregation, and building news datasets for research.

Why use News Article Scraper?

  • Full article extraction -- Goes beyond headlines to capture the complete body text of each article.
  • Multi-site support -- Works across a wide range of news sites and blogs thanks to intelligent content detection.
  • Configurable crawl depth -- Follow links from a homepage to discover and scrape articles across multiple pages.
  • Proxy-powered reliability -- Uses Apify Proxy to handle rate limiting and access regional content.
  • API integration -- Trigger scraping runs and retrieve results programmatically via the Apify API.
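The API integration point above can be sketched with the Python standard library alone. This is a minimal sketch, not the actor's documented client code: it assumes the Apify API v2 `run-sync-get-dataset-items` endpoint and Bearer-token authentication, and the actor ID `consummate_mandala~news-article-scraper` is a hypothetical placeholder; check the actor's API tab in the Apify Console for the real value.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts a synchronous run."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and return the dataset items as parsed JSON."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical actor ID and token; replace both before running.
    articles = run_actor(
        "consummate_mandala~news-article-scraper",
        "YOUR_APIFY_TOKEN",
        {"urls": ["https://techcrunch.com"], "maxArticles": 10, "crawlDepth": 1},
    )
    for article in articles:
        print(article.get("headline"))
```

Building the request separately from sending it keeps the payload easy to inspect before any usage is charged.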

How to use News Article Scraper

  1. Visit the Apify Store and find News Article Scraper.
  2. Click Try for free to open the actor in the Apify Console.
  3. Enter one or more news site URLs or direct article links in the News Site URLs field.
  4. Set the Max Articles limit to control the total number of articles collected.
  5. Adjust the Crawl Depth if you want the actor to follow links deeper into the site (default is 1 level).
  6. Click Start and download the results from the Dataset tab when the run completes.
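The Crawl Depth and Max Articles settings in the steps above can be pictured as a depth-limited, breadth-first traversal. The sketch below illustrates that idea only and is not the actor's actual implementation: the `links` mapping stands in for real page fetching, every visited page is counted as an article, and treating the start URL as depth 0 is an assumption.

```python
from collections import deque


def discover(start_urls, links, max_depth=1, max_articles=50):
    """Return pages in breadth-first order, following links up to max_depth levels.

    `links` maps a URL to the URLs found on that page (a stand-in for fetching).
    """
    start = list(dict.fromkeys(start_urls))  # de-duplicate, keep order
    seen = set(start)
    queue = deque((url, 0) for url in start)
    visited = []
    while queue and len(visited) < max_articles:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the configured depth
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return visited


if __name__ == "__main__":
    site = {"home": ["a", "b"], "a": ["c"]}
    print(discover(["home"], site, max_depth=1))
    print(discover(["home"], site, max_depth=2))
```

With a depth of 1 only pages linked directly from the start URL are reached; raising it to 2 also reaches pages linked from those.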

Input configuration

Field        Type              Description                                Default
urls         Array of strings  URLs of news sites or articles to scrape   ["https://techcrunch.com"]
maxArticles  Integer           Maximum number of articles to scrape       50
crawlDepth   Integer           How many levels deep to follow links       1
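A small helper can assemble the input object from the three fields above before it is pasted into the Console or sent to the API. This is a sketch: the field names and defaults come from the table, while the validation rules (at least one http(s) URL, `maxArticles` of at least 1, non-negative `crawlDepth`) are assumptions rather than documented limits of the actor.

```python
import json

DEFAULT_URLS = ["https://techcrunch.com"]


def build_input(urls=None, max_articles=50, crawl_depth=1):
    """Return the actor input as a dict, falling back to the table's defaults."""
    urls = list(urls) if urls is not None else list(DEFAULT_URLS)
    # Assumed sanity checks; the actor may accept or reject other values.
    if not urls or not all(
        isinstance(u, str) and u.startswith(("http://", "https://")) for u in urls
    ):
        raise ValueError("urls must be a non-empty list of http(s) URLs")
    if not isinstance(max_articles, int) or max_articles < 1:
        raise ValueError("maxArticles must be a positive integer")
    if not isinstance(crawl_depth, int) or crawl_depth < 0:
        raise ValueError("crawlDepth must be a non-negative integer")
    return {"urls": urls, "maxArticles": max_articles, "crawlDepth": crawl_depth}


if __name__ == "__main__":
    run_input = build_input(["https://techcrunch.com", "https://www.bbc.com/news"], 100, 2)
    print(json.dumps(run_input, indent=2))
```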

Output data

Each article is stored as a separate record in the dataset. Below is an example output:

{
  "headline": "OpenAI Announces New Partnership with Major Cloud Provider",
  "bodyText": "OpenAI revealed today that it has entered into a strategic partnership with a leading cloud infrastructure provider. The deal, reportedly valued at over $2 billion, will expand access to...",
  "author": "Jane Doe",
  "publishDate": "2025-11-20T14:30:00Z",
  "imageUrl": "https://techcrunch.com/wp-content/uploads/2025/11/openai-partnership.jpg",
  "sourceUrl": "https://techcrunch.com/2025/11/20/openai-cloud-partnership/"
}
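Downstream code often wants typed records rather than raw dictionaries. The sketch below maps one record of the shape shown above onto a dataclass and parses `publishDate` into a timezone-aware `datetime`. The `Article` class and `parse_article` helper are hypothetical names, and treating every field except `sourceUrl` as possibly missing is an assumption, since the listing does not say which fields are guaranteed.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Article:
    headline: Optional[str]
    body_text: Optional[str]
    author: Optional[str]
    publish_date: Optional[datetime]
    image_url: Optional[str]
    source_url: str


def parse_article(record: dict) -> Article:
    """Convert one dataset record into an Article."""
    raw_date = record.get("publishDate")
    # Swap a trailing "Z" for "+00:00" so fromisoformat also works before Python 3.11.
    published = datetime.fromisoformat(raw_date.replace("Z", "+00:00")) if raw_date else None
    return Article(
        headline=record.get("headline"),
        body_text=record.get("bodyText"),
        author=record.get("author"),
        publish_date=published,
        image_url=record.get("imageUrl"),
        source_url=record["sourceUrl"],
    )


if __name__ == "__main__":
    sample = {
        "headline": "OpenAI Announces New Partnership with Major Cloud Provider",
        "author": "Jane Doe",
        "publishDate": "2025-11-20T14:30:00Z",
        "sourceUrl": "https://techcrunch.com/2025/11/20/openai-cloud-partnership/",
    }
    print(parse_article(sample))
```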

Cost of usage

News Article Scraper uses Pay-Per-Event (PPE) pricing at the Mid tier:

Tier  Cost per 1,000 events  Free tier (approx.)
Mid   $0.75                  ~6,600 events/month

One event corresponds to one article scraped, so collecting 50 articles from a single news site would cost approximately 50 × $0.75 / 1,000 = $0.0375, or about $0.038. The free tier allocation covers roughly 6,600 articles per month at no charge.
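The per-article arithmetic can be checked with a few lines of code. This sketch uses only the $0.75 per 1,000 events figure from the table and ignores the free tier; `estimate_cost` is a hypothetical helper name, and `Decimal` is used to avoid floating-point rounding in currency values.

```python
from decimal import Decimal

PRICE_PER_1000_EVENTS = Decimal("0.75")  # Mid tier price from the table above


def estimate_cost(articles: int, price_per_1000: Decimal = PRICE_PER_1000_EVENTS) -> Decimal:
    """Return the estimated cost in USD, assuming one event per scraped article."""
    if articles < 0:
        raise ValueError("articles must be non-negative")
    return price_per_1000 * articles / 1000


if __name__ == "__main__":
    for count in (50, 1000, 6600):
        print(count, "articles:", estimate_cost(count), "USD")
```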

Tips and advanced usage

  • Media monitoring -- Set up scheduled runs to scrape your target publications daily and track coverage on specific topics.
  • Build training datasets -- Collect thousands of articles for NLP and machine learning projects such as text classification or summarization.
  • Track multiple sources -- Pass several news site URLs in a single run to aggregate content across publications.
  • Increase crawl depth for archives -- Set crawlDepth to 2 or 3 to discover articles linked from category or archive pages.
  • Combine with keyword filtering -- Export the dataset and filter by headline or body text to isolate articles on specific topics.
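The keyword filtering tip above can be sketched as a post-processing step on an exported dataset. This is an illustrative sketch, not a feature of the actor: `filter_articles` is a hypothetical helper, it assumes records are dictionaries with the `headline` and `bodyText` fields shown under Output data, and it does plain case-insensitive substring matching.

```python
def filter_articles(records, keywords):
    """Keep records whose headline or body text contains any keyword (case-insensitive)."""
    needles = [k.lower() for k in keywords if k]

    def matches(record):
        text = f"{record.get('headline') or ''} {record.get('bodyText') or ''}".lower()
        return any(needle in text for needle in needles)

    return [record for record in records if matches(record)]


if __name__ == "__main__":
    exported = [
        {"headline": "OpenAI Announces New Partnership", "bodyText": "Cloud deal details"},
        {"headline": "Local election results", "bodyText": "Turnout was high"},
    ]
    for article in filter_articles(exported, ["openai", "cloud"]):
        print(article["headline"])
```

Substring matching will also hit partial words (for example "cloud" inside "cloudy"); switch to word-boundary regular expressions if that matters for your topic.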

Built with Crawlee and Apify SDK. See more scrapers by consummate_mandala on Apify Store.