Pricing

$5.00/month + usage

In-Depth News Scraper

Developed by

Alam

Extract full-length articles from top news sources and collect the latest updates on any subject. The key feature is retrieval of complete article content, not just headlines. Output can be customised from concise summaries to complete articles.

Runs succeeded

97%

Last modified

5 months ago

News

Integrations

Automation

In-Depth News Scraper

The In-Depth News Scraper is an Apify actor designed to revolutionise how you gather and process news data. It stands apart from conventional scrapers by delivering complete article content rather than just headlines, enabling comprehensive analysis across diverse news categories.

Key Advantages

• Thorough content extraction, not just headlines
• Support for major news categories and outlets
• Flexible search and filtering capabilities
• Structured, analysis-ready output

Features

• Category-Based Filtering: Focus your news gathering by targeting specific categories such as World, Business, or Technology.
• Complete Article Extraction: Access full article content directly, surpassing the limitations of basic news aggregators.
• Customisable Content Length: Control output size by specifying word count or retrieving complete articles.
• Intelligent Filtering: Exclude irrelevant content using customisable keyword filters.
• Time-Range Selection: Gather current news or research historical content with flexible time frame options.
• Structured Data Output: Receive consistently formatted data including titles, URLs, dates, and sources.
• Optional Image Support: Choose whether to include article images based on your requirements.

Input Parameters

The actor accepts the following configuration options:

Parameter	Type	Description
newsCategory	String	Required: Category filter (e.g., "World", "Technology")
additionalKeywords	String	Optional: Refine search within selected category
numberOfItems	Number	Number of articles to retrieve (default: 10, max: 100)
filterBadKeywords	Array	Optional: Keywords to exclude from results
contentLength	String	Content extraction mode: "Full" or "Summary" (default: Full)
timeRange	String	Time period for article selection (e.g., "Past week")
retrieveImage	Boolean	Include image URLs in output (default: false)

Example configuration:

{
    "newsCategory": "Technology",
    "additionalKeywords": "artificial intelligence",
    "numberOfItems": 20,
    "filterBadKeywords": ["sponsored", "advertisement"],
    "contentLength": "Full",
    "timeRange": "Past week",
    "retrieveImage": false
}

Supported Categories

The actor provides coverage across these primary news categories:

World
Business
Technology
Entertainment
Health
Science
Sports
Politics
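
The parameter table and category list above can be checked on the client side before a run is started. The following is a minimal sketch, not the actor's own validation: the helper name `build_input` is hypothetical, the lower bound of 1 for `numberOfItems` is an assumption, and it assumes that only the eight categories listed here are accepted.

```python
from typing import Any, Dict, List, Optional

# Categories listed in this README; whether the actor accepts others is not stated.
SUPPORTED_CATEGORIES = (
    "World", "Business", "Technology", "Entertainment",
    "Health", "Science", "Sports", "Politics",
)
CONTENT_LENGTHS = ("Full", "Summary")


def build_input(
    news_category: str,
    additional_keywords: Optional[str] = None,
    number_of_items: int = 10,
    filter_bad_keywords: Optional[List[str]] = None,
    content_length: str = "Full",
    time_range: Optional[str] = None,
    retrieve_image: bool = False,
) -> Dict[str, Any]:
    """Return an input dict keyed by the parameter names from the table."""
    if news_category not in SUPPORTED_CATEGORIES:
        raise ValueError(f"Unsupported newsCategory: {news_category!r}")
    # The table gives a maximum of 100; the minimum of 1 is an assumption.
    if not 1 <= number_of_items <= 100:
        raise ValueError("numberOfItems must be between 1 and 100")
    if content_length not in CONTENT_LENGTHS:
        raise ValueError(f"contentLength must be one of {CONTENT_LENGTHS}")

    config: Dict[str, Any] = {
        "newsCategory": news_category,
        "numberOfItems": number_of_items,
        "contentLength": content_length,
        "retrieveImage": retrieve_image,
    }
    # Optional fields are only included when a value was supplied.
    if additional_keywords:
        config["additionalKeywords"] = additional_keywords
    if filter_bad_keywords:
        config["filterBadKeywords"] = list(filter_bad_keywords)
    if time_range:
        config["timeRange"] = time_range
    return config
```

Calling `build_input("Technology", additional_keywords="artificial intelligence", number_of_items=20)` yields a dict that can be serialised to JSON and used as the actor input.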

Output Structure

Each article in the dataset contains the following fields:

{
    "title": "Article headline",
    "link": "Article URL",
    "pubDate": "2025-02-05T10:00:00.000Z",
    "source": "Publishing outlet name",
    "summary": "Brief article overview",
    "content": "Full article text (length based on contentLength parameter)",
    "imageUrl": "Main image URL (if retrieveImage is true)"
}
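
For downstream analysis it can help to load each dataset item into a typed record. The sketch below assumes every item follows the sample above, in particular that `pubDate` is always an ISO 8601 UTC timestamp with milliseconds and that `imageUrl` may be absent when `retrieveImage` is false. The `Article` class and `parse_article` function are illustrative names, not part of the actor.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass
class Article:
    title: str
    link: str
    pub_date: datetime
    source: str
    summary: str
    content: str
    image_url: Optional[str] = None


def parse_article(item: Dict[str, Any]) -> Article:
    """Convert one dataset item (a dict) into an Article."""
    # The sample pubDate ends in "Z", so the parsed value is tagged as UTC.
    pub_date = datetime.strptime(
        item["pubDate"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    return Article(
        title=item["title"],
        link=item["link"],
        pub_date=pub_date,
        source=item["source"],
        summary=item["summary"],
        content=item["content"],
        # Assumed to be missing when images were not requested.
        image_url=item.get("imageUrl"),
    )
```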

Implementation Guide

Choose your target news category
Add any specific keywords to refine results
Set additional parameters as needed
Execute the actor
Access your structured dataset
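
The steps above can also be carried out programmatically through the Apify API. The sketch below only builds the HTTP request for the synchronous "run and return dataset items" endpoint using the Python standard library. The actor ID `example-user/example-actor` and the token are placeholders, since this page does not state the actor's real ID, and `build_run_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

API_BASE = "https://api.apify.com/v2"


def build_run_request(
    actor_id: str, token: str, run_input: Dict[str, Any]
) -> urllib.request.Request:
    """Build (but do not send) a run-sync-get-dataset-items request."""
    # Actor IDs of the form "username/actor-name" use "~" in API paths.
    path_id = urllib.parse.quote(actor_id.replace("/", "~"), safe="~")
    url = f"{API_BASE}/acts/{path_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder actor ID and token; replace both before sending.
request = build_run_request(
    "example-user/example-actor",
    "YOUR_APIFY_TOKEN",
    {"newsCategory": "Technology", "numberOfItems": 20},
)
# To execute the run and read the articles:
# with urllib.request.urlopen(request) as response:
#     articles = json.load(response)
```

Synchronous endpoints have a time limit, so for large batches it may be safer to start the run asynchronously and read the dataset once it finishes.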

Performance Considerations

Performance varies based on several factors:

Processing Duration: Typically 5-10 seconds per article for full extraction
Volume Handling: Efficiently processes up to 100 articles per run
Request Management: Sequential processing with appropriate intervals

For optimal results:

Limit requests to 50 items for faster completion
Use precise keywords to target relevant content
Consider using word limits unless full text is required
Disable image retrieval when not essential

Note: Network conditions and source website responsiveness may affect performance.
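
The per-article figure above gives a rough way to budget run time. This sketch assumes that time scales linearly with the number of articles (consistent with sequential processing) and ignores start-up overhead and retries. The function name is illustrative.

```python
from typing import Tuple


def estimate_run_seconds(
    number_of_items: int,
    per_article: Tuple[float, float] = (5.0, 10.0),
) -> Tuple[float, float]:
    """Return a (low, high) estimate in seconds for full extraction."""
    if number_of_items < 0:
        raise ValueError("number_of_items must not be negative")
    low, high = per_article
    return number_of_items * low, number_of_items * high


low, high = estimate_run_seconds(50)
print(f"Estimated duration: {low / 60:.1f} to {high / 60:.1f} minutes")
```

Under these assumptions a 50-item run takes between 250 and 500 seconds, and a 100-item run twice that.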

Error Handling and Troubleshooting

The actor implements comprehensive error handling:

Connection Issues: Automatic retry (up to 3 attempts) for failed connections
Rate Management: Dynamic delays between requests to prevent rate limiting
Content Fallback: Defaults to article summary if full content extraction fails
Input Validation: Clear error messages for invalid configurations
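
The retry and fallback behaviour described above can be illustrated with a short sketch. This is not the actor's source code: the exponential delay schedule is an assumption standing in for "dynamic delays", and `extract_content` and its parameters are hypothetical names.

```python
import time
from typing import Callable


def extract_content(
    fetch: Callable[[], str],
    summary: str,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try to fetch full content; fall back to the summary if all attempts fail."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except ConnectionError:
            # Wait longer after each failure (assumed schedule: 1s, 2s, ...),
            # but do not wait after the final attempt.
            if attempt < max_attempts:
                sleep(base_delay * 2 ** (attempt - 1))
    # Content fallback: every attempt failed, so return the summary instead.
    return summary
```

Passing `sleep` as a parameter keeps the function easy to test, because a test can record the delays instead of actually waiting.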

Troubleshooting Common Issues

Timeout Errors: Consider reducing batch size or increasing time between requests
Missing Content: Check if the source website requires authentication
Rate Limiting: The actor will automatically pause and retry; no action needed
Error Logs: Available in the actor's run details for debugging

For detailed error information, consult the actor's run log in the Apify Console.

Technical Support

For implementation assistance or to report issues:

Check the actor's run log for specific error messages
Review the troubleshooting section above
Contact support with the actor run ID for detailed investigation

The actor continuously logs its progress and any errors encountered, facilitating quick problem resolution.
