Awesome Google News Scraper

Pricing

$5.00/month + usage

This tool scrapes content from Google News, streamlining the collection of the latest information on any topic. Its key feature is the ability to extract full-length articles, not just headlines. You can customize results, from brief summaries to complete content.

Rating: 0.0 (0)

Developer: Alam

Maintained by Community

Actor stats

  • Bookmarked: 9
  • Total users: 100
  • Monthly active users: 2
  • Last modified: 8 days ago

Unlock the power of comprehensive news analysis with this cutting-edge Apify actor! Designed to revolutionize how you gather and process information, this tool doesn't just scrape headlines – it delivers entire articles right to your fingertips. By leveraging Google News as its source, our actor offers an unparalleled ability to extract, filter, and aggregate full-length news content on any topic you choose.

Features

  • Full Article Extraction: Unlike standard RSS feeds or basic scrapers, this actor can retrieve the complete text of articles, giving you access to in-depth content without leaving the platform.
  • Customizable Content Length: Whether you need a quick summary or the entire story, you're in control. Choose between a specific word count or opt for the full article.
  • Smart Filtering: Easily exclude unwanted content with customizable keyword filters.
  • Flexible Time Ranges: Stay current or research past events with adjustable time frame options.
  • Streamlined Data Structure: Receive well-organized output including titles, URLs, publication dates, sources, and more.
  • Optional Image Retrieval: Choose whether to fetch image URLs for articles, balancing between comprehensive data and faster performance.

Transform your news gathering process and gain deeper insights with our actor's unique ability to provide complete article content. Say goodbye to surface-level summaries and hello to comprehensive news analysis at your fingertips!

Who Uses This Scraper?

Content Creators & Newsletter Writers

"I use this scraper to track crypto news daily. The full article extraction lets me read and summarize without visiting 50 different sites." — Newsletter operator, 200+ subscribers

AI & Machine Learning Teams

"Perfect for building news sentiment analysis models. The structured output makes it easy to feed directly into our NLP pipeline." — ML Engineer at fintech startup

Market Researchers

"I monitor competitor mentions across 50+ news sources. Keyword filtering cuts out irrelevant noise and saves hours of manual work." — Market research analyst

Journalists & Researchers

"The time-range filtering is incredible for historical research. I pulled 200 articles from 2023 to write a trend analysis piece." — Freelance journalist

Example Workflows

1. Crypto Trading Signals

{ "keyword": "Bitcoin", "numberOfItems": 50, "filterBadKeywords": ["scam", "fraud", "giveaway"], "contentLength": "full", "timeRange": "Past 24 hours", "retrieveImage": false }

Use case: Get full articles from the last 24 hours for sentiment analysis and trading decisions.

2. AI Industry News Monitoring

{ "keyword": "AI regulation", "numberOfItems": 30, "filterBadKeywords": ["clickbait"], "contentLength": 500, "timeRange": "Past week", "retrieveImage": true }

Use case: Weekly digest of AI regulation news with 500-word summaries and images.

3. Brand Reputation Tracking

{ "keyword": "OpenAI", "numberOfItems": 100, "filterBadKeywords": ["ads", "sponsored"], "contentLength": "full", "timeRange": "Past month", "retrieveImage": false }

Use case: Comprehensive analysis of OpenAI coverage over the last month.

Input

The actor accepts the following input parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| keyword | String | The search term for news (e.g., "BRICS", "Politics") |
| numberOfItems | Number | The number of news items to fetch (default: 10, maximum: 100) |
| filterBadKeywords | Array | Optional array of keywords to filter out unwanted news items |
| contentLength | String/Number | Number of words to extract from the article, or 'full' for the entire content |
| timeRange | String | Time range for news articles (e.g., "Past hour", "Past 24 hours", "Past week", "Past year") |
| retrieveImage | Boolean | Whether to retrieve image URLs for articles (default: false) |

Example input:

{ "keyword": "Bitcoin", "numberOfItems": 20, "filterBadKeywords": ["scam", "fraud"], "contentLength": "200", "timeRange": "Past week", "retrieveImage": false }
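Before starting a run, it can help to check an input locally against the parameter table. The following is a minimal sketch, not the actor's own validation: the function name `normalize_input` is hypothetical, the lower bound of 1 for `numberOfItems` is an assumption, and so is treating an omitted `contentLength` as "full" (the table states no default for it).

```python
from typing import Any

DEFAULT_ITEMS = 10  # default from the parameter table
MAX_ITEMS = 100     # maximum from the parameter table


def normalize_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Check a run input against the parameter table and fill in defaults."""
    keyword = raw.get("keyword")
    if not isinstance(keyword, str) or not keyword.strip():
        raise ValueError("keyword must be a non-empty string")

    count = raw.get("numberOfItems", DEFAULT_ITEMS)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(count, bool) or not isinstance(count, int):
        raise ValueError("numberOfItems must be an integer")
    if not 1 <= count <= MAX_ITEMS:  # lower bound of 1 is an assumption
        raise ValueError(f"numberOfItems must be between 1 and {MAX_ITEMS}")

    bad_keywords = raw.get("filterBadKeywords", [])
    if not isinstance(bad_keywords, list) or not all(
        isinstance(k, str) for k in bad_keywords
    ):
        raise ValueError("filterBadKeywords must be a list of strings")

    # Assumption: "full" when omitted; the table gives no default here.
    length = raw.get("contentLength", "full")
    if isinstance(length, str) and length.strip().isdecimal():
        length = int(length)  # accept numeric strings such as "200"
    if length != "full":
        if isinstance(length, bool) or not isinstance(length, int) or length <= 0:
            raise ValueError("contentLength must be 'full' or a positive word count")

    retrieve_image = raw.get("retrieveImage", False)
    if not isinstance(retrieve_image, bool):
        raise ValueError("retrieveImage must be a boolean")

    normalized: dict[str, Any] = {
        "keyword": keyword.strip(),
        "numberOfItems": count,
        "filterBadKeywords": bad_keywords,
        "contentLength": length,
        "retrieveImage": retrieve_image,
    }
    # timeRange is passed through unchanged; the full list of accepted
    # values is not given in the table, so it is not checked here.
    if "timeRange" in raw:
        if not isinstance(raw["timeRange"], str):
            raise ValueError("timeRange must be a string")
        normalized["timeRange"] = raw["timeRange"]
    return normalized
```

Applied to the example input, this sketch keeps every field as given except that the string "200" becomes the number 200.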

Output

The actor outputs a dataset with the following structure for each news article:

  • title: The title of the news article
  • link: The resolved URL of the article
  • pubDate: The publication date of the article
  • source: The source (news outlet) of the article
  • imageUrl: The URL of the article's main image (if retrieveImage is set to true)
  • summary: A brief summary of the article
  • content: The extracted content of the article (based on contentLength parameter)
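For downstream processing, each dataset record can be mapped onto a small typed structure. This is a sketch under assumptions: the class `NewsItem` and its method names are hypothetical, `pubDate` is kept as a plain string because its format is not specified here, and missing fields are treated as empty.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class NewsItem:
    title: str
    link: str
    pub_date: str  # kept as text; the date format is not specified
    source: str
    summary: str
    content: str
    image_url: Optional[str] = None  # only present when retrieveImage is true

    @classmethod
    def from_record(cls, record: dict[str, Any]) -> "NewsItem":
        """Build an item from one dataset record, tolerating missing keys."""
        return cls(
            title=record.get("title", ""),
            link=record.get("link", ""),
            pub_date=record.get("pubDate", ""),
            source=record.get("source", ""),
            summary=record.get("summary", ""),
            content=record.get("content", ""),
            image_url=record.get("imageUrl"),
        )

    def best_text(self) -> str:
        """Prefer the extracted content; use the summary if it is empty."""
        return self.content.strip() or self.summary.strip()
```

`best_text()` is a convenience for pipelines that want one text field per article regardless of whether full extraction succeeded.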

Pricing

  • Free trial: 24 hours (1,440 minutes) of usage
  • Monthly subscription: $5/month
  • Usage-based: Additional compute time charged per Apify's standard rates

What You Get with $5/Month

  • Access to full article extraction
  • Up to 100 articles per run
  • Smart keyword filtering
  • Adjustable time ranges
  • Parallel processing (up to 5 concurrent requests)
  • No hidden fees

Cost per Use Example

  • 50 Bitcoin articles = ~$0.10-$0.20 per run
  • 100 AI news articles = ~$0.15-$0.30 per run
  • Daily news monitoring = ~$3-$5/month (depending on volume)

💡 Note: Apify charges for compute time. Larger runs, full content extraction, and concurrent processing use more resources. Start with small batches to estimate costs for your use case.
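To turn the per-run figures above into a rough monthly budget, multiply the per-run range by the number of runs and add the subscription. The sketch below is only arithmetic on the listed numbers; the helper names are hypothetical, amounts are kept in whole cents to avoid floating-point rounding, and actual charges depend on compute time.

```python
SUBSCRIPTION_CENTS = 500  # $5/month subscription from the pricing list


def monthly_estimate(
    runs_per_month: int, per_run_cents: tuple[int, int]
) -> dict[str, tuple[int, int]]:
    """Return (low, high) usage and total cost in cents for one month."""
    low, high = per_run_cents
    usage = (runs_per_month * low, runs_per_month * high)
    total = (usage[0] + SUBSCRIPTION_CENTS, usage[1] + SUBSCRIPTION_CENTS)
    return {"usage": usage, "total": total}


def dollars(cents: int) -> str:
    """Format whole cents as a dollar amount."""
    return f"${cents // 100}.{cents % 100:02d}"


# One 50-article run per day for 30 days at the listed $0.10 to $0.20 per run.
estimate = monthly_estimate(30, (10, 20))
print("usage:", " to ".join(dollars(c) for c in estimate["usage"]))
print("total:", " to ".join(dollars(c) for c in estimate["total"]))
# usage: $3.00 to $6.00
# total: $8.00 to $11.00
```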

Usage

  1. Configure your desired input parameters
  2. Run the actor
  3. Retrieve the results from the dataset
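The three steps can also be scripted. The sketch below assumes the official `apify-client` Python package and uses its `actor(...).call(run_input=...)` and `dataset(...).iterate_items()` methods; the function name `fetch_news` is hypothetical, and the token and actor ID shown in the comments are placeholders to replace with your own values.

```python
from typing import Any


def fetch_news(
    client: Any, actor_id: str, run_input: dict[str, Any]
) -> list[dict[str, Any]]:
    """Run the actor, wait for it to finish, and return the dataset items."""
    # Steps 1 and 2: pass the configured input and start the run.
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the actor run did not return a result")
    # Step 3: read the results from the run's default dataset.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Typical use with the real client (requires `pip install apify-client`):
#
#   from apify_client import ApifyClient
#
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")        # placeholder token
#   items = fetch_news(
#       client,
#       "<username>/<actor-name>",                    # placeholder actor ID
#       {"keyword": "Bitcoin", "numberOfItems": 20},
#   )
#   for item in items:
#       print(item["title"], item["link"])
```

Passing the client in as an argument keeps the function easy to exercise with a stub object before spending compute on a real run.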

Performance

The performance of this actor can vary based on the number of items requested and the complexity of the articles being scraped. Here are some general guidelines:

  • Processing Time: On average, the actor takes about 5-10 seconds per article for full content extraction.
  • Scalability: The actor is designed to handle up to 100 items per run efficiently.
  • Concurrent Requests: To balance performance and politeness to source websites, the actor processes up to 5 articles concurrently.

For optimal performance, we recommend:

  • Limiting requests to 50 items or fewer for quicker results.
  • Using more specific keywords to target relevant articles and reduce processing time.
  • Setting a reasonable contentLength if you don't need the full article text.
  • Keeping retrieveImage set to false unless image URLs are necessary, as this can significantly speed up the scraping process.

Note: Performance can be affected by factors such as network latency and the responsiveness of source websites.
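The figures above allow a back-of-the-envelope estimate of run time. The sketch below assumes articles are processed in batches of 5 and that each batch takes as long as one article (5 to 10 seconds); it ignores startup time and the initial Google News fetch, so treat the result as a rough lower bound. The function name is hypothetical.

```python
import math


def estimate_run_seconds(
    items: int,
    concurrency: int = 5,
    per_article: tuple[float, float] = (5.0, 10.0),
) -> tuple[float, float]:
    """Return a (low, high) estimate in seconds for full-content extraction."""
    if items < 0 or concurrency < 1:
        raise ValueError("items must be >= 0 and concurrency >= 1")
    # Simplification: articles run in fixed batches of `concurrency`.
    batches = math.ceil(items / concurrency)
    return batches * per_article[0], batches * per_article[1]


print(estimate_run_seconds(50))   # (50.0, 100.0)
print(estimate_run_seconds(100))  # (100.0, 200.0)
```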

Error Handling

This actor is designed with robust error handling to ensure smooth operation:

  • Network Issues: If a connection to Google News fails, the actor will retry up to 3 times before moving on to the next item.
  • Rate Limiting: The actor implements a delay between requests to avoid triggering Google's rate limits. If rate limiting is detected, the actor will pause for 60 seconds before retrying.
  • Article Extraction: If the full text of an article cannot be extracted, the actor will fall back to providing the summary from the RSS feed.
  • Invalid Inputs: The actor validates all inputs and will provide meaningful error messages for any invalid parameters.

In case of any unrecoverable errors, the actor will log the error details and continue processing the remaining items where possible.
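The retry and rate-limit behavior described above can be illustrated with a short sketch. This is not the actor's source code: `RateLimited` and `fetch_with_retry` are hypothetical names, "retry up to 3 times" is read as up to 3 retries after the first attempt, and network failures are retried without a delay because none is specified.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical signal that the remote side asked us to slow down."""


def fetch_with_retry(
    fetch: Callable[[], T],
    max_retries: int = 3,
    rate_limit_pause: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call `fetch`, retrying on failure; return None to skip the item."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimited:
            # Pause before the next attempt, but not after the last one.
            if attempt < max_retries:
                sleep(rate_limit_pause)
        except ConnectionError:
            continue  # network issue: try again
    return None  # give up so the caller can move on to the next item
```

A caller that gets `None` back for the article body could then use the RSS summary in its place, mirroring the fallback described in the list above.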