Google News Scraper - Articles, Sources & Monitoring
Pricing
from $4.00 / 1,000 articles scraped
Scrape Google News by keyword, topic, top headlines or city. Get real publisher URLs (decoded, not redirect links), source, date, snippet, related coverage and full article text, author & image. Monitor mode returns only new articles. Export JSON, CSV, Excel.
Developer
Scrape Sage
Maintained by Community
Google News Scraper — Articles, Real URLs & Monitoring (Source, Date, Full Text)
Extract complete Google News data by keyword, topic, top headlines, or city — including the field other scrapers get wrong: the real publisher URL. This actor decodes Google News' redirect links into the actual article URL (https://www.reuters.com/...), then optionally opens each article for the full text, author, publish date, lead image, section, and keywords. Turn on monitor mode to return only articles you haven't seen before — perfect for brand, competitor, and topic tracking.
No login, no cookies, no browser, no API key — fast RSS + JSON extraction with 99%+ reliability.
Why this Google News scraper?
Most Google News scrapers hand back the encoded news.google.com/rss/articles/CBMi… redirect link — useless for outreach, de-duplication, or content extraction — and stop at the headline. This actor ships the richest dataset in the category:
| Data | Typical scrapers | This actor |
|---|---|---|
| Real publisher URL (decoded, not a redirect) | ❌ Google redirect link | ✅ https://publisher.com/article |
| Source / publisher name | partial | ✅ |
| Publish date (ISO) | partial | ✅ |
| Snippet / description | partial | ✅ |
| Related coverage cluster (other outlets on the same story) | ❌ | ✅ |
| Full article text | ❌ | ✅ opt-in |
| Author(s), publish & modified dates | ❌ | ✅ opt-in |
| Lead image, section, keywords, word count | ❌ | ✅ opt-in |
| Search / topic / local / headlines feeds in one actor | partial | ✅ |
| Monitor mode — only new articles since last run | ❌ | ✅ |
| Any language & country edition | partial | ✅ |
Use cases
- Media & brand monitoring — track every mention of your brand, product, or executives across thousands of publishers. Run on a Schedule with monitor mode to get only the new articles each time.
- Competitor & market intelligence — watch competitors, categories, and industry topics; feed alerts into Slack or a CRM the moment something breaks.
- PR & reputation tracking — measure coverage volume, see which outlets pick up a story (via the related-coverage cluster), and capture author bylines for outreach.
- AI, RAG & LLM pipelines — clean, LLM-ready JSON with full article text is ideal for summarization, sentiment, embeddings, and retrieval-augmented generation.
- News aggregation & newsletters — power apps, digests, and dashboards with structured, deduplicated news for any keyword, topic, or city.
- Finance & trading signals — pull the latest news for tickers, companies, and sectors with precise timestamps and real source URLs.
How to use
- Sign up for Apify — the free plan is enough to try this actor.
- Open the Google News Scraper, enter search terms (and/or topics, locations, or paste Google News URLs), and click Start.
- Watch results stream into the dataset table.
- Export as JSON, CSV, Excel, XML, or RSS — or pull results programmatically via the Apify API.
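For the programmatic route in the last step, the Apify API exposes a dataset items endpoint that takes a `format` query parameter. The sketch below only builds the request URL and makes no network call. `buildDatasetItemsUrl` and the dataset ID in the usage note are hypothetical names chosen for illustration; they are not part of this actor.

```javascript
// Builds a URL for the Apify API "get dataset items" endpoint.
// buildDatasetItemsUrl is a hypothetical helper name used for illustration.
const EXPORT_FORMATS = ['json', 'jsonl', 'csv', 'xlsx', 'xml', 'rss', 'html'];

function buildDatasetItemsUrl(datasetId, { format = 'json', limit } = {}) {
  if (!EXPORT_FORMATS.includes(format)) {
    throw new Error(`Unsupported format: ${format}`);
  }
  const url = new URL(
    `https://api.apify.com/v2/datasets/${encodeURIComponent(datasetId)}/items`,
  );
  url.searchParams.set('format', format);
  url.searchParams.set('clean', 'true'); // skip empty items and hidden fields
  if (limit !== undefined) url.searchParams.set('limit', String(limit));
  return url.toString();
}
```

Calling `buildDatasetItemsUrl('abc123', { format: 'csv', limit: 50 })` yields a URL that can be fetched with any HTTP client. For a private dataset, the API token would typically be sent in an `Authorization: Bearer …` header rather than placed in the URL.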
Input
```json
{
  "searchTerms": ["electric vehicles", "\"interest rates\""],
  "topics": ["TECHNOLOGY", "BUSINESS"],
  "locations": ["San Francisco"],
  "includeTopHeadlines": false,
  "language": "en-US",
  "country": "US",
  "timeWindow": "7d",
  "resolveArticleUrls": true,
  "includeArticleContent": true,
  "maxItems": 200,
  "monitorMode": true,
  "monitorKey": "ev-watch"
}
```
- searchTerms: keywords/phrases. Google News operators work: `"exact phrase"`, `OR`, `-exclude`, `site:reuters.com`, `intitle:`, `when:7d`.
- topics: `WORLD`, `NATION`, `BUSINESS`, `TECHNOLOGY`, `ENTERTAINMENT`, `SPORTS`, `SCIENCE`, `HEALTH`, or a raw `CAAq…` topic token.
- locations: cities/regions for local news (`New York`, `California`, `London`).
- includeTopHeadlines: also pull the top-stories feed for your edition.
- startUrls: paste Google News search/topic/RSS/article URLs (or any publisher article URL) directly.
- language / country: the Google News edition (`hl` / `gl`), e.g. `en-GB` + `GB`, `de` + `DE`, `pt-BR` + `BR`.
- timeWindow: restrict search-term feeds to recent articles (`1h`, `12h`, `1d`, `7d`, `1y`).
- resolveArticleUrls (default true): decode every Google News link into the real publisher URL.
- includeArticleContent (default false): open each article for full text, author, dates, image, section, keywords, word count.
- includeRelatedArticles (default true): attach the related-coverage cluster for each story.
- maxItems / maxItemsPerFeed: output caps to control cost.
- sinceDate: keep only articles since an ISO date or relative window (`24h`, `3d`, `2w`).
- includeKeywords / excludeKeywords / includeSources / excludeSources: client-side filters.
- dedupeByResolvedUrl (default true): drop the same story repeated across feeds.
- monitorMode / monitorKey (default false / `default`): return only articles not seen in previous runs.
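The search operators listed for `searchTerms` can be combined in a single entry. A minimal sketch of composing one, assuming a hypothetical helper `buildSearchTerm` with made-up option names (`phrase`, `anyOf`, `exclude`, `site`, `when`); only the operators themselves and the input field names come from this README:

```javascript
// Composes one searchTerms entry from the operators listed above.
// buildSearchTerm and its option names are hypothetical, for illustration only.
function buildSearchTerm({ phrase, anyOf = [], exclude = [], site, when } = {}) {
  const parts = [];
  if (phrase) parts.push(`"${phrase}"`);               // exact phrase
  if (anyOf.length) parts.push(anyOf.join(' OR '));    // alternatives
  for (const word of exclude) parts.push(`-${word}`);  // excluded words
  if (site) parts.push(`site:${site}`);                // restrict to one publisher
  if (when) parts.push(`when:${when}`);                // recency window, e.g. 7d
  return parts.join(' ');
}

// Actor input using the composed term.
const input = {
  searchTerms: [
    buildSearchTerm({
      phrase: 'interest rates',
      exclude: ['crypto'],
      site: 'reuters.com',
      when: '7d',
    }),
  ],
  language: 'en-US',
  country: 'US',
};
```

Here `input.searchTerms[0]` is the string `"interest rates" -crypto site:reuters.com when:7d`. Building terms in code keeps the quoting consistent when a watch-list grows to many brands or tickers.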
Output
One record per article (type: "article"):
```json
{
  "type": "article",
  "title": "The Cybercab is the lightest, most efficient Tesla ever made",
  "source": "The Verge",
  "url": "https://www.theverge.com/transportation/950596/tesla-cybercab-efficient-weight-range-epa",
  "urlResolved": true,
  "googleNewsUrl": "https://news.google.com/rss/articles/CBMiqwFBVV95cUx…",
  "publishedAt": "2026-06-16T15:02:39.000Z",
  "snippet": "Against all odds, the Tesla Cybercab is in production…",
  "imageUrl": "https://platform.theverge.com/wp-content/uploads/…/cybercab.jpg",
  "feedType": "search",
  "query": "tesla",
  "topic": null,
  "locationQuery": null,
  "language": "en-US",
  "country": "US",
  "relatedArticles": [
    {
      "title": "Tesla starts Cybercab production",
      "source": "Electrek",
      "googleNewsUrl": "https://news.google.com/rss/articles/CBMi…"
    }
  ],
  "relatedCount": 2,
  "contentExtracted": true,
  "author": "Andrew J. Hawkins",
  "authors": ["Andrew J. Hawkins"],
  "articlePublishedAt": "2026-06-16T15:02:39.000Z",
  "articleModifiedAt": "2026-06-16T15:18:02.000Z",
  "section": "Transportation",
  "keywords": ["Autonomous Cars", "Electric Cars", "Tesla", "Transportation"],
  "wordCount": 496,
  "canonicalUrl": "https://www.theverge.com/transportation/950596/tesla-cybercab-efficient-weight-range-epa",
  "fullText": "Against all odds, the Tesla Cybercab is in production. And while…",
  "scrapedAt": "2026-06-16T16:49:35.000Z"
}
```
What to expect (field coverage)
| Field group | Always present | Present when enabled / published |
|---|---|---|
| Core | title, source, googleNewsUrl, publishedAt, feedType, discovery context | snippet (search feeds & content) |
| Real URL | — | url + urlResolved with resolveArticleUrls (~95–100% resolve) |
| Related coverage | relatedCount | relatedArticles on clustered stories (topic/headlines/local) |
| Full content | — | fullText, author(s), dates, imageUrl, section, keywords, wordCount with includeArticleContent |
A blank field means the publisher didn't expose it (e.g. some sites omit author/section, and paywalled articles return only a summary) — never because the scraper skipped it. Nothing is dropped, so you always get the richest record available.
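Because optional fields can be blank, downstream code is safer when it reads records defensively. The following is a minimal sketch, assuming records shaped like the sample output; `toDigestEntry` and the `'Unknown author'` placeholder are hypothetical choices, not something the actor provides.

```javascript
// Reduces a record to the fields a downstream step needs, tolerating blanks.
// toDigestEntry is a hypothetical helper; field names come from the sample output.
function toDigestEntry(record) {
  const authors = Array.isArray(record.authors) ? record.authors : [];
  return {
    title: record.title,
    source: record.source,
    // Prefer the decoded publisher URL; fall back to the Google News link.
    link: record.urlResolved && record.url ? record.url : record.googleNewsUrl,
    // fullText exists only with includeArticleContent; the snippet may be blank too.
    text: record.fullText || record.snippet || '',
    byline: authors.length ? authors.join(', ') : 'Unknown author',
  };
}
```

A record with no `fullText`, no `authors`, and an unresolved URL still maps to a usable entry: the link falls back to `googleNewsUrl`, the text to the snippet or an empty string, and the byline to the placeholder.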
Automate & schedule
Run this actor on autopilot and pull results into your own stack:
- Apify API — start runs, fetch datasets, and manage schedules over REST.
- apify-client for JavaScript and apify-client for Python — official SDKs.
- Schedules — run it hourly/daily to monitor a brand, competitor, topic, or city. Pair with monitor mode so each run returns only the new articles.
- Webhooks — trigger downstream actions (CRM import, Slack alert, summarizer) the moment a run finishes.
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'MY_APIFY_TOKEN' });

// Start a run and wait for it to finish.
const run = await client.actor('scrapesage/google-news-scraper').call({
    searchTerms: ['"my brand"'],
    language: 'en-US',
    country: 'US',
    resolveArticleUrls: true,
    includeArticleContent: true,
    monitorMode: true,
    monitorKey: 'my-brand',
});

// Read the articles from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Got ${items.length} new articles`);
```
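For the webhook option, the receiving service usually needs the dataset ID of the finished run. The sketch below assumes Apify's default webhook payload template, where `eventType` names the event and `resource` holds the run object; a customized payload template would need different field access. `datasetIdFromWebhook` is a hypothetical helper name.

```javascript
// Extracts the dataset ID from a webhook payload, or returns null for
// events that should be ignored. Assumes Apify's default payload template.
function datasetIdFromWebhook(payload) {
  if (!payload || payload.eventType !== 'ACTOR.RUN.SUCCEEDED') return null;
  const resource = payload.resource || {};
  return resource.defaultDatasetId || null;
}
```

A handler can call this on the parsed request body and, when the result is not `null`, fetch that dataset's items and forward them to Slack, a CRM, or a summarizer.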
Integrate with any app
Connect the dataset to 5,000+ apps — no code required:
- Make — multi-step automation scenarios.
- Zapier — push new articles straight into Slack, Sheets, or your CRM.
- Slack — get notified when a monitored search finds news.
- Google Drive / Sheets — auto-export every run to a spreadsheet.
- Airbyte — pipe results into your data warehouse.
- GitHub — trigger runs from commits or releases.
Use with AI assistants (MCP)
The output is clean, LLM-ready JSON with full article text. Call this actor from Claude, ChatGPT, or any agent framework through the Apify MCP server — ask your assistant to "monitor Google News for my company and summarize today's coverage" and let it run this scraper for you.
More scrapers from scrapesage
Build a complete media monitoring & market-intelligence stack:
- Telegram Scraper — channels, messages, media, and search.
- Substack Scraper — newsletters, posts, and creator leads.
- YouTube Scraper — channels, videos, and creator leads.
- Bluesky Scraper — profiles, posts, followers, and leads.
- Google Ads Transparency Scraper — who's advertising what on Google.
- Facebook Ad Library Scraper — competitor ad intelligence on Meta & Instagram.
- Product Hunt Scraper — launches, makers, and leads.
- GitHub Scraper — repos, developers, and contact leads.
Tips
- Real URLs are the value: keep `resolveArticleUrls` on. It decodes the Google News redirect into the actual publisher link so you can de-duplicate, click through, and extract content.
- Full text: turn on `includeArticleContent` for summarization, sentiment, and embeddings. It adds one fast request per article; paywalled sites return only a summary.
- Monitoring: combine Schedules with `monitorMode` + a unique `monitorKey` per brand/topic to receive only new articles each run. Monitor mode is independent of the schedule: the schedule starts the run, monitor mode deduplicates against prior runs.
- Coverage depth: Google News returns up to ~100 articles per feed. To go wider, split into several `searchTerms`, add `topics` / `locations`, or narrow with `timeWindow`.
- Editions: set `language` + `country` to target a specific edition (e.g. `en-GB` + `GB`, `de` + `DE`). Local news uses `locations`.
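To make the monitoring tip concrete, here is an in-memory illustration of the "only new articles per key" idea. It is not the actor's implementation: the `seenByKey` map stands in for whatever persistent storage the actor uses between runs, and `onlyNewArticles` is a hypothetical name.

```javascript
// In-memory illustration of "return only unseen articles" per monitor key.
// NOT the actor's implementation; seenByKey stands in for persistent storage.
const seenByKey = new Map();

function onlyNewArticles(monitorKey, articles) {
  const seen = seenByKey.get(monitorKey) ?? new Set();
  const fresh = [];
  for (const article of articles) {
    // Use the decoded publisher URL as identity, else the Google News link.
    const id = article.url || article.googleNewsUrl;
    if (!id || seen.has(id)) continue;
    seen.add(id);
    fresh.push(article);
  }
  seenByKey.set(monitorKey, seen);
  return fresh;
}
```

Each key keeps its own history, which is why a unique `monitorKey` per brand or topic matters: two watch-lists sharing one key would hide from each other any article that both of them match.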
FAQ
How do I get the real article URL instead of the Google News link? It's automatic — resolveArticleUrls is on by default. Every record includes both url (the decoded publisher URL) and googleNewsUrl (the original Google News link).
Does it need the Google News API or a key? No. Google News has no public API for this; the actor reads the public RSS feeds and resolves URLs the same way a browser does — no key or login needed.
Can I get the full article text? Yes — enable includeArticleContent. The actor opens each article and extracts the body, author, dates, image, section, and keywords from the page's structured data, with a readability fallback.
How do I monitor news automatically? Create a Schedule (e.g. hourly), turn on monitorMode, and give each watch-list a unique monitorKey. Each run returns only articles not seen before. Add a webhook or Zapier zap to push them into Slack or your CRM.
Does monitor mode conflict with Apify Schedules? No — they're complementary. The schedule decides when the actor runs; monitor mode decides what's new by comparing against a named key-value store from previous runs.
Can I scrape news in other languages and countries? Yes. Set language (e.g. fr, de, pt-BR) and country (e.g. FR, DE, BR) to any Google News edition.
Can I export to Google Sheets, CSV, or Excel? Yes — one click in the dataset view, or automatically on every run via the Google Drive integration.
A field is empty — why? Some publishers don't expose an author, section, or keywords, and paywalled articles return only a summary. Fields are blank only when the data isn't published — never because the scraper skipped them.
Is scraping Google News legal? This actor collects publicly available data only. You're responsible for using the data in compliance with applicable laws (e.g. GDPR/CCPA for personal data), Google's terms, and each publisher's terms — including copyright when storing full article text.
Need help?
Open an issue on the actor's Issues tab, or visit the Apify help center. Feature requests are welcome — this actor is actively maintained.