Google News Scraper

Pricing

$20.00 / 1,000 results

Google News Search Scraper - Real-time news aggregation from Google News. Features smart article enrichment with full content extraction. Perfect for market research, trend analysis, and content monitoring.

Rating

5.0 (2)

Developer

Futurize Rush (maintained by community)

Actor stats

Bookmarked: 3
Total users: 67
Monthly active users: 25
Last modified: 2 days ago


Google News Scraper - Real-time News Data Extraction

Extract news articles from Google News RSS feeds with multi-language support, date filtering, and browser enrichment for actual article URLs and content.

What This Actor Does

Google News Scraper searches Google News RSS feeds by keyword, extracts article metadata, then uses a browser to follow Google News redirect links to resolve actual article URLs and extract content from the source websites.

Key Features

  • Keyword Search - Search for any topic across Google News
  • 40+ Languages - Support for English, Chinese (Simplified/Traditional), Japanese, Korean, Arabic, and more
  • 60+ Regions - Target news from specific countries
  • Date Filtering - Filter by past hour, 24 hours, week, or month
  • Browser Enrichment - Resolves actual article URLs, extracts images and article text
  • Resume on Migration - Saves progress and resumes if the Actor is migrated

Use Cases

  • Market Intelligence - Track industry trends and competitor news
  • Brand Monitoring - Monitor brand mentions across news sources worldwide
  • Content Aggregation - Build news dashboards and content platforms
  • Research - Collect news data for sentiment analysis and academic research
  • Media Monitoring - Track coverage of specific topics or events

Input Configuration

Parameter | Type | Default | Description
searchQueries (required) | array | - | Keywords to search for (1-50 queries, max 200 characters each)
region | string | "us" | Country code for news results (e.g., "us", "uk", "tw", "jp", "de")
language | string | "en" | Language code (e.g., "en", "zh-TW", "zh-CN", "ja", "ko", "es")
dateFilter | string | "1d" | Time period: "1h" (past hour), "1d" (past 24 hours), "1w" (past week), "1m" (past month), "" (all time)
maxResults | integer | 20 | Maximum articles per query (10-200)
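The limits in the table can be checked before starting a run. The following is a minimal sketch, not part of the Actor: `validateInput` is a hypothetical helper, and the rule that each query must be a non-empty string is an assumption that goes beyond the documented 200-character limit.

```javascript
// Hypothetical helper, not part of the Actor: checks an input object against
// the documented limits and returns a list of problems (empty if none).
const DATE_FILTERS = ['1h', '1d', '1w', '1m', ''];

function validateInput(input) {
  const errors = [];
  const queries = input.searchQueries;

  if (!Array.isArray(queries) || queries.length < 1 || queries.length > 50) {
    errors.push('searchQueries must be an array of 1-50 keywords');
  } else if (
    // Assumption: an empty string is not a useful query, so it is rejected too.
    queries.some((q) => typeof q !== 'string' || q.length === 0 || q.length > 200)
  ) {
    errors.push('each search query must be a non-empty string of at most 200 characters');
  }

  // Optional fields fall back to the documented defaults.
  const dateFilter = input.dateFilter ?? '1d';
  if (!DATE_FILTERS.includes(dateFilter)) {
    errors.push('dateFilter must be one of "1h", "1d", "1w", "1m", or ""');
  }

  const maxResults = input.maxResults ?? 20;
  if (!Number.isInteger(maxResults) || maxResults < 10 || maxResults > 200) {
    errors.push('maxResults must be an integer between 10 and 200');
  }

  return errors;
}
```

Calling `validateInput(input)` and refusing to start a run when the returned array is non-empty catches mistakes such as `maxResults: 5` locally, before any charge is incurred.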

Output Data

Each article in the dataset contains:

Field | Type | Description
title | string | Article headline
googleNewsUrl | string | Original Google News URL
articleUrl | string or null | Actual article URL (resolved via browser enrichment)
source | string | Publisher name (e.g., "TechCrunch")
websiteName | string or null | Website name from meta tags
websiteUrl | string or null | Publisher's homepage URL
imageUrl | string or null | Article image URL
pubDate | string | Publication date from RSS feed
timestamp | string | ISO 8601 timestamp
description | string | Article summary from RSS
excerpt | string | Extended description (from enrichment or RSS)
articleContent | object or null | Extracted article text with character and token counts
enrichmentTime | number or null | Time taken for browser enrichment in milliseconds
guid | string | Unique article identifier
searchQuery | string | The keyword that found this article
region | string | Region used for this search
language | string | Language used for this search
scrapedAt | string | ISO 8601 timestamp when the article was scraped

Note: Fields marked "or null" may be null when browser enrichment cannot access the target website (e.g., paywalled sites, geographic restrictions, or sites that block automated access).

When available, articleContent contains:

  • content - Article text (max 2,000 characters)
  • characterCount - Total character count of the content
  • tokenCount - Token count: each CJK character counts as 1 token; non-CJK text is split on spaces and each resulting word counts as 1 token
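The token-counting rule can be illustrated with a short sketch. This is an approximation under stated assumptions, not the Actor's code: `countTokens` is a hypothetical name, and treating the Han, Hiragana, Katakana, and Hangul scripts as "CJK" is a guess at what the Actor counts.

```javascript
// Assumption: "CJK" means characters in these four Unicode scripts.
const CJK_CHAR = /[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Hangul}]/gu;

// Hypothetical helper: one token per CJK character, plus one token per
// whitespace-separated word in the remaining text.
function countTokens(text) {
  const cjkCount = (text.match(CJK_CHAR) || []).length;
  // Replace CJK characters with spaces so neighbouring words stay separate.
  const words = text.replace(CJK_CHAR, ' ').split(/\s+/).filter(Boolean);
  return cjkCount + words.length;
}
```

Under these assumptions, a mixed string such as "台積電 stock rises" counts three CJK characters plus two words. Punctuation and symbols may be handled differently by the Actor, so values from this sketch should not be expected to match `tokenCount` exactly.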

Example Output

{
  "title": "Major AI Breakthrough Announced by Research Team",
  "googleNewsUrl": "https://news.google.com/rss/articles/...",
  "articleUrl": "https://techcrunch.com/2026/02/27/ai-breakthrough",
  "source": "TechCrunch",
  "websiteName": "TechCrunch",
  "websiteUrl": "https://techcrunch.com",
  "imageUrl": "https://techcrunch.com/wp-content/uploads/...",
  "pubDate": "Fri, 27 Feb 2026 10:30:00 GMT",
  "timestamp": "2026-02-27T10:30:00.000Z",
  "description": "Researchers announce breakthrough in artificial intelligence...",
  "excerpt": "A team of researchers has announced a significant breakthrough...",
  "articleContent": {
    "content": "Full article text extracted from the website...",
    "characterCount": 1850,
    "tokenCount": 312
  },
  "enrichmentTime": 4520,
  "guid": "CBMi...",
  "searchQuery": "artificial intelligence",
  "region": "us",
  "language": "en",
  "scrapedAt": "2026-02-27T10:35:00.000Z"
}
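Because the enrichment fields can be null, code that consumes the dataset should fall back to the RSS fields. The sketch below shows one way to do that; `normalizeArticle` and the shape of its return value are hypothetical and only illustrate the fallback order.

```javascript
// Hypothetical helper: reduces one dataset item to a few commonly used fields,
// falling back to RSS data when browser enrichment returned null.
function normalizeArticle(item) {
  return {
    title: item.title,
    // Prefer the resolved publisher URL; otherwise keep the Google News link.
    url: item.articleUrl ?? item.googleNewsUrl,
    // Prefer extracted text, then the excerpt, then the RSS description.
    text: item.articleContent?.content ?? item.excerpt ?? item.description,
    // True only when enrichment resolved the real article URL.
    enriched: item.articleUrl != null,
    publishedAt: item.timestamp,
  };
}
```

Mapping the dataset with `items.map(normalizeArticle)` gives every record a usable `url` and `text`, and the `enriched` flag makes it easy to see how many articles could not be resolved.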

Quick Start Examples

Monitor Technology News

{
  "searchQueries": ["artificial intelligence", "ChatGPT", "technology"],
  "language": "en",
  "region": "us",
  "dateFilter": "1d",
  "maxResults": 50
}

Track Business News in Taiwan

{
  "searchQueries": ["台積電", "科技業", "股市"],
  "language": "zh-TW",
  "region": "tw",
  "dateFilter": "1w",
  "maxResults": 100
}

Real-time Breaking News

{
  "searchQueries": ["breaking news"],
  "language": "en",
  "region": "us",
  "dateFilter": "1h",
  "maxResults": 20
}

Japanese Tech News

{
  "searchQueries": ["AI", "テクノロジー"],
  "language": "ja",
  "region": "jp",
  "dateFilter": "1d",
  "maxResults": 30
}

How to Use

On Apify Platform

  1. Click Start on this Actor's page
  2. Enter your search keywords in the input form
  3. Select language, region, and time filter
  4. Click Run to start scraping
  5. View and export results in JSON, CSV, or Excel

Via API

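const { ApifyClient } = require('apify-client');

// Replace the placeholders with your API token and this Actor's ID.
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// await is not allowed at the top level of a CommonJS file, so wrap the calls.
async function main() {
  // Start the Actor and wait for the run to finish.
  const run = await client.actor('YOUR_ACTOR_ID').call({
    searchQueries: ['your keywords'],
    language: 'en',
    region: 'us',
    dateFilter: '1d',
    maxResults: 20,
  });
  // Read the scraped articles from the run's default dataset.
  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  console.log(items);
}

main().catch(console.error);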
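When a run contains several related search queries, the same article may appear in `items` more than once. The sketch below removes repeats by `guid`. `dedupeByGuid` is a hypothetical helper, and it assumes the same article carries the same `guid` across queries, which the listing does not state.

```javascript
// Hypothetical helper: keeps the first item seen for each guid.
// Assumption: an article found by two queries has the same guid both times.
function dedupeByGuid(items) {
  const seen = new Map();
  for (const item of items) {
    if (!seen.has(item.guid)) {
      seen.set(item.guid, item);
    }
  }
  // Map preserves insertion order, so the original ordering is kept.
  return [...seen.values()];
}
```

The first occurrence wins, so each remaining item's `searchQuery` field names only the first query that matched it.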

Cost and Performance

  • Pricing: Pay-per-event. You are charged per Actor run and per result; results are listed at $20.00 per 1,000
  • Speed: RSS extraction is fast; browser enrichment adds ~5-10 seconds per article
  • Memory: Optimized for low memory usage with periodic browser restarts
  • Rate Limiting: Built-in rate limiting with exponential backoff to prevent blocking
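For readers unfamiliar with the term, exponential backoff means doubling the wait between retries up to a ceiling. The sketch below illustrates the general technique only. The Actor's actual delays, retry counts, and function names are not documented here, so `backoffDelay`, `withBackoff`, and every default value are assumptions.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
// The 1 s base and 30 s cap are illustrative values, not the Actor's settings.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `task` until it succeeds or `maxAttempts` calls have failed,
// waiting longer after each failure.
async function withBackoff(task, { maxAttempts = 5, baseMs = 1000, maxMs = 30000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelay(attempt, baseMs, maxMs));
    }
  }
}
```

With these defaults the waits grow as 1 s, 2 s, 4 s, 8 s, which quickly reduces request pressure on a server that has started rejecting calls.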

Tips to Control Costs

  • Start with a low maxResults (10-20) to test your keywords
  • Use specific keywords for more relevant results
  • Use dateFilter: "1d" to limit results to the past 24 hours
  • Use fewer search queries per run to lower the cost
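An upper bound on a run's cost and enrichment time follows from the listed figures: $20.00 per 1,000 results and roughly 5-10 seconds of browser enrichment per article. The sketch below does that arithmetic. `estimateRun` is a hypothetical helper, it leaves out the per-run charge (the amount is not stated here), and it assumes articles are enriched one after another.

```javascript
const PRICE_USD_PER_1000_RESULTS = 20; // from the listed $20.00 / 1,000 results

// Hypothetical helper: worst-case figures if every query returns maxResults articles.
function estimateRun({ queryCount, maxResults }) {
  const maxArticles = queryCount * maxResults;
  return {
    maxArticles,
    // Result charges only; the per-run charge is not included.
    maxResultCostUsd: (maxArticles * PRICE_USD_PER_1000_RESULTS) / 1000,
    // Assumption: enrichment is sequential at 5-10 seconds per article.
    enrichmentSeconds: { low: maxArticles * 5, high: maxArticles * 10 },
  };
}
```

For example, 3 queries with `maxResults: 50` give at most 150 articles, at most $3.00 in result charges, and 750-1,500 seconds of enrichment under the sequential assumption. Actual figures will be lower when the RSS feed returns fewer articles than requested.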

FAQ

Why are some article URLs null?
Some news websites block automated access, use paywalls, or have geographic restrictions. When the browser cannot follow the Google News redirect, the article URL and enriched content will be null. The basic article data (title, source, description) from the RSS feed is always available.

How do I choose the right language and region?
Match the language to your target audience's reading language, and the region to the country whose news you want. For example, use language: "zh-TW" and region: "tw" for Traditional Chinese news from Taiwan.

Can I schedule this Actor to run automatically?
Yes. On the Apify platform, you can set up schedules to run this Actor at regular intervals (e.g., every hour for breaking news monitoring).

What is the maximum number of articles I can get?
Up to 200 articles per query, with up to 50 queries per run. Google News RSS feeds typically return 20-100 articles depending on the topic and time filter, so a query can return fewer articles than maxResults.

Does this Actor use proxies?
No. This Actor accesses Google News RSS feeds directly, which are publicly available. No proxy or residential IP is needed.