Google News Scraper — Articles, Topics & Full Text

Under maintenance

Pricing

Pay per usage

Scrape Google News articles by keyword, topic, or RSS feed. Extract titles, sources, dates, snippets, full text, images, and related articles. 40+ languages, 70+ countries, four strategies with auto-fallback. For media monitoring, PR, and market research.

Developer

Ricardo Akiyoshi

Maintained by Community

Google News Scraper

Scrape Google News articles by keyword, topic, or RSS feed. Extract article titles, publication sources, dates, snippets, full article text, images, and related articles. Supports 40+ languages and 70+ countries.

What does Google News Scraper do?

This actor scrapes Google News using four complementary strategies to deliver comprehensive news coverage:

  1. Google News Search Pages — Searches Google News for your keywords and extracts article cards with titles, sources, timestamps, and snippets
  2. Google News RSS Feeds — Parses structured RSS/XML feeds for reliable, well-formatted article data with related article clusters
  3. Google News Topic Pages — Scrapes curated topic sections (headlines, business, technology, science, health, sports, entertainment, world news)
  4. Full Article Extraction — Follows article links to extract the complete article body text using readability heuristics, plus high-resolution images

All four strategies run in parallel and results are deduplicated automatically. The scraper handles Google's anti-bot measures with proxy rotation, user-agent cycling, and intelligent rate limiting.
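The deduplication step can be pictured with a minimal Python sketch. The actor's actual matching rule is not documented here, so the URL normalization below (lowercased host, trailing slash and `utm_` parameters dropped) and the function names are assumptions for illustration only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url):
    """Reduce a URL to a comparison key (assumed rule, not the actor's own)."""
    parts = urlsplit(url)
    # Drop utm_* tracking parameters, keep everything else in order.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        urlencode(query),
        "",  # ignore fragments
    ))


def deduplicate(articles):
    """Keep the first article seen for each normalized URL."""
    seen = set()
    unique = []
    for article in articles:
        key = normalize_url(article["url"])
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique
```

Keeping the first occurrence means that the order in which strategy results are merged decides which copy of a duplicate survives.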

Use Cases

  • Media Monitoring — Track brand mentions, competitor coverage, and industry news in real time
  • PR & Communications — Monitor press coverage, measure media reach, and identify journalist sources
  • Market Research — Analyze news sentiment and trends across industries and regions
  • Competitive Intelligence — Track competitor announcements, product launches, and executive changes
  • Academic Research — Collect news datasets for NLP, sentiment analysis, and media studies
  • Financial Analysis — Monitor news for trading signals, earnings coverage, and market-moving events
  • Content Curation — Build automated news feeds and newsletters from multiple topics
  • Crisis Monitoring — Real-time alerts for breaking news about your brand or industry

Input Configuration

| Field | Type | Default | Description |
|---|---|---|---|
| searchTerms | Array of strings | [] | Keywords or phrases to search for. Each term produces a separate search. |
| topic | Enum | headlines | Topic section: headlines, world, business, technology, science, health, sports, entertainment |
| language | String | en | Language code (ISO 639-1): en, es, fr, de, pt, ja, zh, etc. |
| country | String | US | Country code (ISO 3166-1 alpha-2): US, GB, DE, FR, BR, JP, IN, AU, etc. |
| maxAge | Enum | past_day | Article age filter: past_hour, past_day, past_week, past_month, past_year |
| maxResults | Integer | 200 | Maximum articles to scrape (1-10,000) |
| proxy | Object | | Apify proxy configuration. Residential proxies recommended for high volume. |

Example Input

{
  "searchTerms": ["artificial intelligence", "machine learning startups"],
  "topic": "technology",
  "language": "en",
  "country": "US",
  "maxAge": "past_week",
  "maxResults": 500
}
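Before starting a run, it can help to check an input object against the documented fields, defaults, and ranges. The following Python sketch does that on the client side; `build_input` is a hypothetical helper, not part of the actor, and the actor may validate input differently.

```python
ALLOWED_TOPICS = {
    "headlines", "world", "business", "technology",
    "science", "health", "sports", "entertainment",
}
ALLOWED_MAX_AGE = {"past_hour", "past_day", "past_week", "past_month", "past_year"}
ALLOWED_FIELDS = {
    "searchTerms", "topic", "language", "country",
    "maxAge", "maxResults", "proxy",
}


def build_input(**overrides):
    """Merge overrides into the documented defaults and validate the result."""
    unknown = set(overrides) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    # Defaults as listed in the Input Configuration table.
    run_input = {
        "searchTerms": [],
        "topic": "headlines",
        "language": "en",
        "country": "US",
        "maxAge": "past_day",
        "maxResults": 200,
    }
    run_input.update(overrides)
    if run_input["topic"] not in ALLOWED_TOPICS:
        raise ValueError(f"Unsupported topic: {run_input['topic']!r}")
    if run_input["maxAge"] not in ALLOWED_MAX_AGE:
        raise ValueError(f"Unsupported maxAge: {run_input['maxAge']!r}")
    if not 1 <= run_input["maxResults"] <= 10_000:
        raise ValueError("maxResults must be between 1 and 10,000")
    return run_input
```

Calling `build_input(searchTerms=["artificial intelligence"], topic="technology", maxAge="past_week", maxResults=500)` yields the same settings as the example input above, minus the second search term.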

Output

Each article is saved as a dataset item with the following fields:

| Field | Type | Description |
|---|---|---|
| title | String | Article headline |
| source | String | Publisher name (CNN, Reuters, BBC, etc.) |
| url | String | Direct link to the article |
| publishedAt | String | Publication date in ISO 8601 format |
| snippet | String | Article summary or preview text |
| fullText | String | Full article body text (when extraction succeeds) |
| topic | String | Category or topic label |
| images | Array | Images found in the article, each with a url and alt text |
| relatedArticles | Array | Related articles from the same story cluster, each with a title and url |
| searchTerm | String | The search term that found this article |
| scrapedAt | String | Timestamp when the article was scraped, in ISO 8601 format |
| strategy | String | Which scraping strategy found the article (for example, rss_feed) |

Example Output

{
  "title": "OpenAI Announces GPT-5 with Breakthrough Reasoning Capabilities",
  "source": "Reuters",
  "url": "https://www.reuters.com/technology/openai-gpt5-2026-03-01/",
  "publishedAt": "2026-03-01T14:30:00.000Z",
  "snippet": "OpenAI unveiled its latest language model on Friday, claiming significant improvements in mathematical reasoning and code generation...",
  "fullText": "OpenAI unveiled its latest language model on Friday, claiming significant improvements in mathematical reasoning and code generation. The new model, dubbed GPT-5, was trained on...",
  "topic": "technology",
  "images": [
    {
      "url": "https://www.reuters.com/images/openai-gpt5.jpg",
      "alt": "OpenAI CEO presenting GPT-5"
    }
  ],
  "relatedArticles": [
    {
      "title": "Google DeepMind Responds to GPT-5 Launch",
      "url": "https://www.theverge.com/2026/3/1/deepmind-response-gpt5"
    }
  ],
  "searchTerm": "artificial intelligence",
  "scrapedAt": "2026-03-01T15:00:00.000Z",
  "strategy": "rss_feed"
}
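As an illustration of working with these fields downstream, the Python sketch below parses the two timestamps of a dataset item and reports how long after publication the article was scraped. The `summarize` helper and its output keys are hypothetical; only the input field names come from the table above.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 string such as 2026-03-01T14:30:00.000Z."""
    # Replace the trailing "Z" so older Python versions accept the string.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(item):
    """Build a small per-article summary from one dataset item."""
    published = parse_timestamp(item["publishedAt"])
    scraped = parse_timestamp(item["scrapedAt"])
    return {
        "source": item["source"],
        "lagMinutes": (scraped - published).total_seconds() / 60,
        # fullText is only present when extraction succeeds.
        "hasFullText": bool(item.get("fullText")),
        "relatedCount": len(item.get("relatedArticles", [])),
    }
```

For the example item above, the lag works out to 30 minutes, since the article was published at 14:30 UTC and scraped at 15:00 UTC.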

Pricing

This actor uses a pay-per-event pricing model:

  • $0.004 per article scraped and added to the dataset
  • Full article text extraction is included at no extra charge
  • Related articles are included at no extra charge

Cost Examples

| Articles | Cost |
|---|---|
| 100 | $0.40 |
| 500 | $2.00 |
| 1,000 | $4.00 |
| 5,000 | $20.00 |
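The figures in the table follow directly from the per-article price. A small Python sketch of the arithmetic, assuming the $0.004 rate is the only charge (platform or proxy costs, if any, are not covered here):

```python
from decimal import Decimal

PRICE_PER_ARTICLE = Decimal("0.004")  # USD, from the pricing list above


def estimate_cost(articles):
    """Return the estimated cost in USD for a number of scraped articles."""
    if articles < 0:
        raise ValueError("articles must be non-negative")
    return PRICE_PER_ARTICLE * articles


if __name__ == "__main__":
    for count in (100, 500, 1000, 5000):
        print(f"{count:>5} articles: ${estimate_cost(count):.2f}")
```

`Decimal` is used instead of `float` so that fractions of a cent multiply exactly.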

Scraping Strategies

The scraper uses four strategies simultaneously for maximum coverage:

Strategy 1: Google News Search Pages

Queries google.com/search?tbm=nws with your keywords. Parses article cards from search results. Supports pagination for up to 500 results per keyword.
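To make the pagination concrete, here is a hedged Python sketch of how paginated news-search URLs could be built. Only `tbm=nws` and the 500-result cap come from the description above; the `start`, `tbs=qdr:`, `hl`, and `gl` query parameters are standard Google search parameters, but whether the actor uses them this way, and the page size of 10, are assumptions.

```python
from urllib.parse import urlencode

# Assumed mapping from the actor's maxAge values to Google's qdr time filter.
MAX_AGE_TO_QDR = {
    "past_hour": "h",
    "past_day": "d",
    "past_week": "w",
    "past_month": "m",
    "past_year": "y",
}
MAX_RESULTS_PER_KEYWORD = 500  # pagination limit stated in this README


def build_search_urls(keyword, max_age="past_day", language="en",
                      country="US", max_results=500, page_size=10):
    """Return one URL per result page, capped at 500 results per keyword."""
    capped = min(max_results, MAX_RESULTS_PER_KEYWORD)
    urls = []
    for start in range(0, capped, page_size):
        params = {
            "q": keyword,
            "tbm": "nws",  # restrict results to the News tab
            "tbs": f"qdr:{MAX_AGE_TO_QDR[max_age]}",
            "hl": language,
            "gl": country,
            "start": start,  # zero-based offset of the first result
        }
        urls.append("https://www.google.com/search?" + urlencode(params))
    return urls
```

Under these assumptions, a request for 25 results produces three page URLs with offsets 0, 10, and 20.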

Strategy 2: Google News RSS Feeds

Fetches news.google.com/rss/search?q=keyword for structured XML article data. RSS feeds provide clean, well-formatted data with publication dates and source information. Also extracts related article clusters from RSS descriptions.
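The RSS strategy can be sketched in Python with the standard library alone. The `hl`, `gl`, and `ceid` parameters are the ones Google News RSS URLs commonly carry, but how the actor maps `language` and `country` onto them is an assumption, as are the helper names; related-article clusters are left out of this sketch.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from urllib.parse import urlencode


def build_rss_url(keyword, language="en", country="US"):
    """Build a Google News RSS search URL (parameter mapping is assumed)."""
    params = {
        "q": keyword,
        "hl": f"{language}-{country}",
        "gl": country,
        "ceid": f"{country}:{language}",
    }
    return "https://news.google.com/rss/search?" + urlencode(params)


def parse_rss(xml_text):
    """Turn RSS <item> elements into dicts shaped like the Output fields."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        source = item.find("source")
        pub_date = item.findtext("pubDate")
        articles.append({
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
            "source": source.text if source is not None else None,
            # RSS dates use RFC 822 format; convert them to ISO 8601.
            "publishedAt": (
                parsedate_to_datetime(pub_date).isoformat() if pub_date else None
            ),
            "strategy": "rss_feed",
        })
    return articles
```

Fetching the feed itself (for example with `urllib.request.urlopen`) is omitted so that the parsing logic can be tried on a saved feed without network access.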

Strategy 3: Google News Topic Pages

Scrapes news.google.com/topics/... for curated editorial sections. Provides the same articles you see on the Google News homepage under each topic category.

Strategy 4: Full Article Extraction

Follows article links to the original publisher's website and extracts the full article body text using readability heuristics. Removes navigation, ads, sidebars, cookie banners, and other non-content elements. Skips paywalled domains automatically (WSJ, FT, NYT, etc.).
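A much-simplified stand-in for this step is sketched below in Python: it skips a paywalled domain, then collects paragraph text while ignoring common non-content containers. This is not the actor's readability algorithm. The domain list (inferred from the publisher names above), the set of skipped tags, and all names are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed domains for the publishers named in this README.
PAYWALLED_DOMAINS = {"wsj.com", "ft.com", "nytimes.com", "bloomberg.com"}
# Containers treated as non-content in this sketch.
SKIPPED_TAGS = {"nav", "aside", "header", "footer", "script", "style", "form"}


def is_paywalled(url):
    """True if the URL's host is a listed domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in PAYWALLED_DOMAINS)


class ParagraphExtractor(HTMLParser):
    """Collect text of <p> elements that sit outside skipped containers."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.in_paragraph = False
        self.paragraphs = []
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        elif tag == "p" and self.skip_depth == 0:
            self.in_paragraph = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "p" and self.in_paragraph:
            # Collapse runs of whitespace inside the paragraph.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph and self.skip_depth == 0:
            self._buffer.append(data)


def extract_full_text(url, html):
    """Return paragraph text joined by blank lines, or None if paywalled."""
    if is_paywalled(url):
        return None
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.paragraphs)
```

Real readability heuristics typically also score candidate containers by text density and link ratio; this sketch only filters by tag name.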

Tips for Best Results

  1. Use specific search terms — "Tesla quarterly earnings Q1 2026" works better than just "Tesla"
  2. Combine search terms with topics — Search terms and topic pages run in parallel, maximizing coverage
  3. Use residential proxies for large scrapes (1,000+ articles) to avoid Google rate limits
  4. Set appropriate maxAge — Use past_hour for breaking news, past_week for research
  5. Monitor the run log for blocked-request warnings. If you see many blocks, add a proxy configuration
  6. Full text extraction works best on major news sites (Reuters, AP, BBC, CNN). Some sites may block or return partial content

Limitations

  • Without a proxy configuration, Google may rate-limit requests
  • Paywalled sites (WSJ, FT, NYT, Bloomberg, etc.) are skipped for full-text extraction
  • Some Google News redirect URLs may not decode properly for newer article formats
  • Maximum ~500 search results per keyword (Google's pagination limit)
  • Full article text extraction depends on the publisher's page structure

Integrations

Connect Google News Scraper to other tools and services:

  • Webhooks — Get notified when the run completes
  • API — Start runs and download results programmatically
  • Integrations — Connect to Slack, Google Sheets, Zapier, Make, and more
  • Scheduling — Run hourly, daily, or weekly for continuous monitoring
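
As a sketch of the API route, the Python snippet below builds a request against the Apify API endpoint that runs an actor synchronously and returns its dataset items. The actor ID is not stated in this README, so it is left as a caller-supplied argument; the helper names are hypothetical, and the synchronous endpoint is assumed to suit only short runs.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Build a POST request that runs the actor and returns dataset items.

    actor_id is the actor's ID or "username~actor-name"; it is not given
    in this README and must be supplied by the caller.
    """
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_and_fetch(actor_id, token, run_input, timeout=300):
    """Send the request and decode the JSON array of scraped articles."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For large scrapes, starting the run asynchronously and reading the dataset once it finishes is likely the safer pattern, since synchronous calls are time-limited.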