Google News Scraper - Articles, Sources, Real URLs, No API Key
Pricing
Pay per event
Developer
Renzo Madueno
Maintained by Community
Scrape Google News into clean structured data - with the real publisher URL, not the Google redirect
This Google News scraper turns any keyword search, topic section, top-stories feed, or location feed into structured, export-ready data. For every article you get the headline, the real resolved publisher URL (not the useless news.google.com/rss/articles/... redirect link), the source/publisher name, the publish date in ISO 8601, a clean text snippet, and the image when available.
No Google News API key. No login. No browser automation. Just fast, cheap HTTP requests against the public Google News RSS feeds, with the redirect links decoded into the actual article URLs you can click, crawl, or store.
If you have used other Google News tools and ended up with thousands of news.google.com redirect links that you could not open programmatically, this actor solves exactly that problem. Resolving the real article URL is the single most important feature when you actually want to use the data downstream.
Why this scraper is different
Most Google News scrapers hand back the Google redirect link (https://news.google.com/rss/articles/CBMi...). That link is encoded and cannot be opened, crawled, or de-duplicated reliably - it changes per session and is bound to Google. This actor decodes that token through Google's own resolution endpoint and returns the genuine publisher URL, for example https://www.nytimes.com/2026/06/10/business/economy/back-office-workers-ai.html. Each resolved URL is correctly paired back to its own article, so titles, sources, and URLs always line up.
You also get four input modes in one actor - keyword search, top stories, topic sections, and local/geo headlines - plus full control over language and country.
What data can you extract?
| Field | Type | Description |
|---|---|---|
| title | string | Article headline as shown on Google News |
| url | string | Real publisher article URL (resolved from the Google redirect) |
| urlResolved | boolean | true if the real URL was decoded, false if only the redirect is available |
| googleNewsUrl | string | The original Google News redirect link (kept for reference) |
| source | string | Publisher / source name (e.g., "The New York Times") |
| sourceUrl | string | Publisher homepage URL |
| pubDate | string | Publish date in ISO 8601 (e.g., 2026-06-10T09:02:42.000Z) |
| description | string | Clean text snippet / summary of the article |
| image | string | Article image URL when available |
| language | string | Language code used for the request (e.g., en-US) |
| country | string | Country code used for the request (e.g., US) |
| query | string | The search query (search mode) |
| topic | string | The topic section (topic mode) |
| mode | string | Which mode produced the record |
| scrapedAt | string | ISO 8601 timestamp of when the record was collected |
Use cases
- Media monitoring & PR - Track every time a brand, person, product, or competitor is mentioned in the news, with direct links to the original coverage.
- Brand & reputation tracking - Watch keyword feeds and store the real article URLs for clipping reports and sentiment analysis.
- Market & competitive intelligence - Follow an industry topic (BUSINESS, TECHNOLOGY) and build a dated dataset of headlines and sources.
- News aggregation apps - Power a newsfeed product with clean titles, snippets, images, and clickable publisher links.
- SEO & content research - See which publishers rank in Google News for a query and what angles they cover.
- AI / LLM pipelines & RAG - Feed an LLM real, fetchable article URLs (not dead redirects) so it can read and summarize the source.
- Academic & data science - Collect structured, time-stamped news data for research on coverage, framing, and media trends.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| mode | string | Yes | search | search, topStories, topic, or location |
| query | string | When mode=search | artificial intelligence | Keyword(s). Supports Google News operators like "openai" OR "anthropic" |
| topic | string | When mode=topic | BUSINESS | One of TOP, WORLD, NATION, BUSINESS, TECHNOLOGY, ENTERTAINMENT, SPORTS, SCIENCE, HEALTH |
| location | string | When mode=location | - | City / place name for local headlines (e.g., "New York") |
| language | string | No | en-US | Interface language code (e.g., en-US, es-419, fr, de) |
| country | string | No | US | Two-letter country code (e.g., US, GB, ES, IN) |
| maxItems | integer | No | 50 | Maximum number of articles to return (1-100) |
| resolveUrls | boolean | No | true | Decode Google redirect links into the real publisher URL |
Example input
{
  "mode": "search",
  "query": "artificial intelligence",
  "language": "en-US",
  "country": "US",
  "maxItems": 25,
  "resolveUrls": true
}
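If you assemble the input in code, it can help to check it against the parameter table before starting a run. The following is a minimal sketch, not part of the actor: `build_input` and its keyword arguments are hypothetical names, and the checks only mirror the constraints documented above (allowed modes, allowed topics, and the 1-100 range for maxItems).

```python
from typing import Any, Optional

# Allowed values copied from the parameter table above.
MODES = {"search", "topStories", "topic", "location"}
TOPICS = {"TOP", "WORLD", "NATION", "BUSINESS", "TECHNOLOGY",
          "ENTERTAINMENT", "SPORTS", "SCIENCE", "HEALTH"}


def build_input(mode: str = "search", *,
                query: Optional[str] = None,
                topic: Optional[str] = None,
                location: Optional[str] = None,
                language: str = "en-US",
                country: str = "US",
                max_items: int = 50,
                resolve_urls: bool = True) -> dict[str, Any]:
    """Hypothetical helper: build an actor input dict and validate it locally."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    if not 1 <= max_items <= 100:
        raise ValueError("maxItems must be between 1 and 100")

    run_input: dict[str, Any] = {
        "mode": mode,
        "language": language,
        "country": country,
        "maxItems": max_items,
        "resolveUrls": resolve_urls,
    }
    # Each mode requires its own extra field, as listed in the table.
    if mode == "search":
        if not query:
            raise ValueError("query is required when mode=search")
        run_input["query"] = query
    elif mode == "topic":
        if topic not in TOPICS:
            raise ValueError(f"topic must be one of {sorted(TOPICS)}")
        run_input["topic"] = topic
    elif mode == "location":
        if not location:
            raise ValueError("location is required when mode=location")
        run_input["location"] = location
    return run_input
```

Calling `build_input("search", query="artificial intelligence", max_items=25)` yields the same fields as the example input above.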
Example output
{
  "title": "The Hidden Workers Most Threatened by A.I. - The New York Times",
  "url": "https://www.nytimes.com/2026/06/10/business/economy/back-office-workers-ai.html",
  "urlResolved": true,
  "googleNewsUrl": "https://news.google.com/rss/articles/CBMihgFBVV95cUx...?oc=5",
  "source": "The New York Times",
  "sourceUrl": "https://www.nytimes.com",
  "pubDate": "2026-06-10T09:02:42.000Z",
  "description": "The Hidden Workers Most Threatened by A.I.",
  "image": null,
  "language": "en-US",
  "country": "US",
  "query": "artificial intelligence",
  "mode": "search",
  "scrapedAt": "2026-06-12T10:31:11.000Z"
}
How it works
The actor fetches the public Google News RSS feed for your chosen mode, parses each item (title, link, source, date, description, image), de-duplicates the results, and then resolves each Google redirect link into the real publisher URL by calling Google's own article-resolution endpoint in efficient batches. The whole pipeline is HTTP-only - no headless browser - which keeps runs fast and inexpensive. If resolution ever fails for an item, you still keep the headline, source, date, snippet, and the original Google link, and urlResolved is set to false.
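The fetch, parse, and de-duplicate steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the actor's source code: `feed_url`, `to_iso`, and `parse_feed` are hypothetical names, the feed paths and the `hl`/`gl`/`ceid` query parameters reflect the commonly seen layout of the public Google News RSS URLs rather than anything this README specifies, and the URL-resolution step is left out because the endpoint is not documented here.

```python
import xml.etree.ElementTree as ET
from datetime import timezone
from email.utils import parsedate_to_datetime
from urllib.parse import quote, urlencode

BASE = "https://news.google.com/rss"  # assumed base of the public RSS feeds


def feed_url(mode, language="en-US", country="US",
             query=None, topic=None, location=None):
    """Build a feed URL for one mode (URL layout is an assumption)."""
    # Assumption: ceid is "<country>:<base language>", e.g. "US:en".
    params = {"hl": language, "gl": country,
              "ceid": f"{country}:{language.split('-')[0]}"}
    if mode == "search":
        if not query:
            raise ValueError("query is required for search mode")
        params = {"q": query, **params}
        path = "/search"
    elif mode == "topic":
        path = f"/headlines/section/topic/{quote(topic)}"
    elif mode == "location":
        path = f"/headlines/section/geo/{quote(location)}"
    else:  # topStories
        path = ""
    return f"{BASE}{path}?{urlencode(params)}"


def to_iso(rss_date):
    """Convert an RFC 822 RSS date to ISO 8601 in UTC."""
    dt = parsedate_to_datetime(rss_date)
    if dt.tzinfo is None:  # treat zone-less dates as UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")


def parse_feed(xml_text):
    """Parse RSS items into records, dropping repeated links."""
    seen, records = set(), []
    for item in ET.fromstring(xml_text).iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link or link in seen:
            continue
        seen.add(link)
        source = item.find("source")
        pub = item.findtext("pubDate")
        records.append({
            "title": (item.findtext("title") or "").strip(),
            "googleNewsUrl": link,
            "source": source.text if source is not None else None,
            "sourceUrl": source.get("url") if source is not None else None,
            "pubDate": to_iso(pub) if pub else None,
            "urlResolved": False,  # resolution happens in a later step
        })
    return records
```

Parsing first and resolving afterwards means a failed resolution never costs you the headline, source, or date: the record already exists and only `urlResolved` stays false.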
Pricing
This actor uses pay-per-event pricing: a small charge per run start plus a small charge per article returned. You only pay for the articles you actually get. There are no monthly platform fees beyond your Apify plan.
Frequently asked questions
Do I need a Google News API key?
No. Google News does not offer a public articles API. This actor uses the public RSS feeds plus URL resolution, so there is nothing to sign up for and no key to manage.
Does it return the real article URL or the Google redirect?
The real publisher URL. With resolveUrls enabled (default), each Google News redirect link is decoded into the actual article URL you can open, crawl, or store. The original redirect is also kept in googleNewsUrl, and urlResolved is false for any item whose link could not be decoded.
Can I scrape Google News in other languages or countries?
Yes. Set language (e.g., es-419, fr, de) and country (e.g., ES, GB, IN) to get localized results, just like changing your Google News region.
What modes are supported?
Keyword search, top stories, topic sections (Business, Technology, Sports, Science, Health, and more), and local/geo headlines for a city or place.
How many articles can I get per run?
Up to 100 articles per run via maxItems. Run the actor multiple times with different queries, topics, or regions to build a larger dataset.
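When you combine several runs, the same article often appears in more than one result set. A minimal sketch of merging runs, assuming each run's items are already loaded as a list of dicts shaped like the example output; `merge_runs` is a hypothetical helper name:

```python
def merge_runs(*runs):
    """Merge item lists from several runs, keeping the first copy of each article."""
    merged = {}
    for run in runs:
        for record in run:
            # Prefer the publisher URL as the identity of an article and
            # fall back to the Google link when no url is present.
            key = record.get("url") or record.get("googleNewsUrl")
            if key is None:
                continue  # nothing to identify the record by
            merged.setdefault(key, record)
    return list(merged.values())
```

Keying on the resolved publisher URL is what makes this work across queries and regions, since the same story keeps the same publisher link.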
Is scraping Google News legal?
This actor only reads publicly available Google News RSS feeds. You are responsible for complying with Google's terms and applicable laws, and for how you use the collected data. Review the feed's usage notes and consult your legal counsel for commercial use.
What if an article URL cannot be resolved?
You still get the headline, source, publish date, snippet, and the original Google link. The urlResolved field tells you which records have the real URL so you can filter if needed.
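Filtering on urlResolved could look like the sketch below, which assumes the records are dicts shaped like the example output; `split_by_resolution` is a hypothetical helper name.

```python
def split_by_resolution(records):
    """Return (resolved, unresolved) lists based on the urlResolved flag."""
    resolved = [r for r in records if r.get("urlResolved") is True]
    # Records with the flag missing are treated as unresolved to stay on the safe side.
    unresolved = [r for r in records if r.get("urlResolved") is not True]
    return resolved, unresolved
```

Send the first list to a crawler or LLM pipeline, and keep the second for a retry or for headline-only reporting.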
Can I use the output with other tools?
Yes. Export to JSON, CSV, or Excel from the Apify dataset, or pull it via the Apify API. The resolved URLs make the data immediately usable in crawlers, dashboards, spreadsheets, and LLM pipelines.
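If you pull the items as JSON and want a CSV with a fixed column order, a standard-library sketch such as the one below may be enough. It assumes the records are dicts using the field names from the output table; `records_to_csv` and `FIELDS` are hypothetical names, not part of the actor or of the Apify API.

```python
import csv
import io

# Column order follows the output field table.
FIELDS = ["title", "url", "urlResolved", "googleNewsUrl", "source",
          "sourceUrl", "pubDate", "description", "image", "language",
          "country", "query", "topic", "mode", "scrapedAt"]


def records_to_csv(records):
    """Serialize records to CSV text; missing or null fields become empty cells."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                            restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

A fixed field list keeps columns aligned even though some fields, such as query and topic, only appear in certain modes.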