News Article Scraper
Pricing
Pay per usage
Developer: Donny Nguyen
What does News Article Scraper do?
News Article Scraper extracts full article content from news websites such as TechCrunch, CNN, BBC, and other online publications. For each article it collects the headline, full body text, author name, publication date, image URL, and source URL from the news sites you point it at. Typical uses are media monitoring, content aggregation, and building news datasets for research.
Why use News Article Scraper?
- Full article extraction -- Goes beyond headlines to capture the complete body text of each article.
- Multi-site support -- Works across a wide range of news sites and blogs thanks to intelligent content detection.
- Configurable crawl depth -- Follow links from a homepage to discover and scrape articles across multiple pages.
- Proxy-powered reliability -- Uses Apify Proxy to handle rate limiting and access regional content.
- API integration -- Trigger scraping runs and retrieve results programmatically via the Apify API.
How to use News Article Scraper
- Visit the Apify Store and find News Article Scraper.
- Click Try for free to open the actor in the Apify Console.
- Enter one or more news site URLs or direct article links in the News Site URLs field.
- Set the Max Articles limit to control the total number of articles collected.
- Adjust the Crawl Depth if you want the actor to follow links deeper into the site (default is 1 level).
- Click Start and download the results from the Dataset tab when the run completes.
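The Console steps above can also be performed through the Apify API. The following is a minimal sketch using only the Python standard library; it is not an official client. The actor ID `consummate_mandala~news-article-scraper` is an assumption made for illustration (copy the real ID from the actor's page), and `build_run_request` and `run_and_fetch` are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Assumed actor ID, for illustration only; copy the real one from the actor's page.
ACTOR_ID = "consummate_mandala~news-article-scraper"


def build_run_request(token, run_input):
    """Prepare (without sending) a POST that starts a run and returns its dataset items."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_and_fetch(token, run_input):
    """Send the request and decode the JSON array of scraped articles."""
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    example_input = {"urls": ["https://techcrunch.com"], "maxArticles": 50, "crawlDepth": 1}
    prepared = build_run_request("YOUR_APIFY_TOKEN", example_input)
    print(prepared.get_method(), prepared.full_url)
```

Calling `run_and_fetch` performs a real, billable run, so the example only prints the prepared request. The synchronous endpoint waits for the run to finish, which suits small runs; for larger crawls, starting the run and reading the dataset afterwards is likely the safer pattern.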
Input configuration
| Field | Type | Description | Default |
|---|---|---|---|
| `urls` | Array of strings | URLs of news sites or articles to scrape | `["https://techcrunch.com"]` |
| `maxArticles` | Integer | Maximum number of articles to scrape | `50` |
| `crawlDepth` | Integer | How many levels deep to follow links | `1` |
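As an illustration of the table above, the sketch below builds an input object from the documented defaults and applies a few sanity checks before a run. The helper name `make_input` is hypothetical, and the bounds it enforces (`maxArticles` at least 1, `crawlDepth` at least 0, http(s) URLs only) are assumptions rather than limits documented for the actor.

```python
import json

# Field names and defaults as listed in the input table.
DEFAULT_INPUT = {"urls": ["https://techcrunch.com"], "maxArticles": 50, "crawlDepth": 1}


def make_input(**overrides):
    """Merge overrides into the defaults and validate the result."""
    unknown = set(overrides) - set(DEFAULT_INPUT)
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")

    config = {**DEFAULT_INPUT, **overrides}

    urls = config["urls"]
    if not isinstance(urls, list) or not urls:
        raise ValueError("urls must be a non-empty list of strings")
    for url in urls:
        if not isinstance(url, str) or not url.startswith(("http://", "https://")):
            raise ValueError(f"Not an http(s) URL: {url!r}")
    config["urls"] = list(urls)  # copy so callers cannot mutate the defaults

    # Assumed bounds; the actor's real limits are not stated in this document.
    if not isinstance(config["maxArticles"], int) or config["maxArticles"] < 1:
        raise ValueError("maxArticles must be an integer >= 1")
    if not isinstance(config["crawlDepth"], int) or config["crawlDepth"] < 0:
        raise ValueError("crawlDepth must be an integer >= 0")
    return config


if __name__ == "__main__":
    run_input = make_input(urls=["https://techcrunch.com", "https://www.bbc.com/news"], crawlDepth=2)
    print(json.dumps(run_input, indent=2))
```

The printed JSON can be pasted into the actor's input editor or sent as the body of an API call.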
Output data
Each article is stored as a separate record in the dataset. Below is an example output:
    {
      "headline": "OpenAI Announces New Partnership with Major Cloud Provider",
      "bodyText": "OpenAI revealed today that it has entered into a strategic partnership with a leading cloud infrastructure provider. The deal, reportedly valued at over $2 billion, will expand access to...",
      "author": "Jane Doe",
      "publishDate": "2025-11-20T14:30:00Z",
      "imageUrl": "https://techcrunch.com/wp-content/uploads/2025/11/openai-partnership.jpg",
      "sourceUrl": "https://techcrunch.com/2025/11/20/openai-cloud-partnership/"
    }
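A record like the example above might be loaded as follows. This is a sketch that assumes every record carries the six fields shown and that `publishDate` is always an ISO 8601 UTC timestamp ending in `Z`; real runs may return missing or differently formatted values. `parse_article` is a hypothetical helper name, and the sample `bodyText` is shortened here.

```python
import json
from datetime import datetime

# Sample record taken from the example output, with bodyText shortened.
SAMPLE = """{
  "headline": "OpenAI Announces New Partnership with Major Cloud Provider",
  "bodyText": "OpenAI revealed today that it has entered into a strategic partnership...",
  "author": "Jane Doe",
  "publishDate": "2025-11-20T14:30:00Z",
  "imageUrl": "https://techcrunch.com/wp-content/uploads/2025/11/openai-partnership.jpg",
  "sourceUrl": "https://techcrunch.com/2025/11/20/openai-cloud-partnership/"
}"""


def parse_article(raw_json):
    """Decode one record and convert publishDate to a timezone-aware datetime."""
    record = json.loads(raw_json)
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    timestamp = record["publishDate"].replace("Z", "+00:00")
    record["publishDate"] = datetime.fromisoformat(timestamp)
    return record


if __name__ == "__main__":
    article = parse_article(SAMPLE)
    print(article["headline"])
    print(article["publishDate"].isoformat())
```

Converting the date once at load time makes later sorting or filtering by publication time straightforward.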
Cost of usage
News Article Scraper uses Pay-Per-Event (PPE) pricing at the Mid tier:
| Tier | Cost per 1,000 events | Free tier (approx.) |
|---|---|---|
| Mid | $0.75 | ~6,600 events/month |
One event corresponds to one article scraped. Collecting 50 articles from a single news site would therefore cost approximately $0.038 (50 × $0.75 / 1,000 = $0.0375). The free tier allocation covers roughly 6,600 articles per month at no charge.
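The pricing arithmetic can be sketched as a small estimator. It uses the Mid tier price and the approximate free allocation quoted above, and it assumes the free allocation behaves like a monthly pool of free events, which is a simplification of how billing may actually work. The function names are hypothetical.

```python
PRICE_PER_1000_EVENTS = 0.75   # USD, Mid tier, from the pricing table
FREE_EVENTS_PER_MONTH = 6600   # approximate figure quoted in this document


def run_cost(articles):
    """Cost in USD of scraping `articles` articles, ignoring any free allocation."""
    return articles * PRICE_PER_1000_EVENTS / 1000


def monthly_charge(articles_this_month):
    """Estimated charge after subtracting the (assumed) free monthly events."""
    billable = max(0, articles_this_month - FREE_EVENTS_PER_MONTH)
    return run_cost(billable)


if __name__ == "__main__":
    print(f"50 articles: ${run_cost(50):.4f}")
    print(f"10,000 articles in a month: ${monthly_charge(10_000):.2f}")
```

Under these assumptions a single 50-article run costs $0.0375, and charges only start once monthly volume passes the free allocation.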
Tips and advanced usage
- Media monitoring -- Set up scheduled runs to scrape your target publications daily and track coverage on specific topics.
- Build training datasets -- Collect thousands of articles for NLP and machine learning projects such as text classification or summarization.
- Track multiple sources -- Pass several news site URLs in a single run to aggregate content across publications.
- Increase crawl depth for archives -- Set `crawlDepth` to 2 or 3 to discover articles linked from category or archive pages.
- Combine with keyword filtering -- Export the dataset and filter by headline or body text to isolate articles on specific topics.
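The keyword filtering tip could look like the sketch below, applied to dataset records after export. It assumes records use the `headline` and `bodyText` fields from the example output and performs a plain case-insensitive substring match; `filter_articles` is a hypothetical helper, not part of the actor.

```python
def filter_articles(articles, keywords):
    """Return the articles whose headline or body text mentions any keyword."""
    lowered = [keyword.lower() for keyword in keywords]
    matches = []
    for article in articles:
        # Missing or null fields are treated as empty text.
        headline = article.get("headline") or ""
        body = article.get("bodyText") or ""
        text = f"{headline} {body}".lower()
        if any(keyword in text for keyword in lowered):
            matches.append(article)
    return matches


if __name__ == "__main__":
    dataset = [
        {"headline": "OpenAI Announces New Partnership", "bodyText": "A cloud deal..."},
        {"headline": "Local Team Wins Final", "bodyText": None},
    ]
    for article in filter_articles(dataset, ["openai", "cloud"]):
        print(article["headline"])
```

Substring matching is deliberately simple; it will also match keywords inside longer words, so a word-boundary regular expression may be preferable for short terms.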
Built with Crawlee and Apify SDK. See more scrapers by consummate_mandala on Apify Store.