Google News Scraper - Real URLs & Archives
Pricing
$0.70 / 1,000 results
$0.70/1K results. Fast Google News scraper for keywords, topics, publications, and URLs. Get titles, dates, publishers, domains, thumbnails, Google News IDs, optional real publisher URLs, locale presets, filters, dedupe, ticker/entity signals, and archives of up to 50K results.
Developer
VortexData
Maintained by Community
Google News Scraper
Scrape Google News search results, topics, publications, and Google News URLs into clean, structured datasets.
Use this actor for media monitoring, brand tracking, market intelligence, SEO research, financial news signals, and AI/RAG news pipelines. It returns article titles, RSS descriptions, publishers, source domains, publication dates, thumbnails, stable Google News IDs, Google News URLs, optional real publisher URLs, locale metadata, and ticker/entity signals.
No browser automation, no Google account, no Google API key, and no proxy required for normal runs.
Why Use It
| Need | What this actor gives you |
|---|---|
| Monitor brands, people, products, or topics | Search one or many keywords across Google News. |
| Get real article URLs | Resolve Google News redirects into direct publisher URLs when decodeUrls is enabled. |
| Scrape more than 100 results | Large keyword requests are split by date and deduplicated by Google News article ID. |
| Work across countries and languages | Choose from 49 country/language presets such as US:en, DE:de, FR:fr, JP:ja, and BR:pt-419. |
| Export clean data | Dataset rows are flat and CSV-friendly, with fields like title, sourceDomain, publishedAt, publisherUrl, and imageUrl. |
| Keep results focused | Include or exclude publisher domains such as reuters.com, bbc.com, msn.com, or yahoo.com. |
| Reduce unnecessary proxy cost | Runs direct first by default and uses Apify Proxy only as a fallback when configured through API options. |
What You Can Scrape
- Keyword searches, including Google operators such as "exact phrase", OR, -exclude, site:domain.com, and intitle:term
- Google News topics such as Business, Technology, Sports, Health, Science, World, and Entertainment
- Google News topic, publication, section, search, and RSS URLs
- Localized editions of Google News by country and language
- Recent monitoring windows or larger historical archives
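The search operators listed above can be combined into a single keyword string. The following is a minimal sketch of one way to assemble such strings before putting them in the keywords field; `build_query` and its parameters are hypothetical helpers for illustration, not part of the actor.

```python
def build_query(phrase=None, any_of=(), exclude=(), site=None, intitle=None):
    """Compose a search string from the operators listed above.

    Hypothetical helper: the actor itself only receives the final string.
    """
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')            # "exact phrase"
    if any_of:
        parts.append(" OR ".join(any_of))      # term OR term
    parts.extend(f"-{term}" for term in exclude)  # -exclude
    if site:
        parts.append(f"site:{site}")           # site:domain.com
    if intitle:
        parts.append(f"intitle:{intitle}")     # intitle:term
    return " ".join(parts)


query = build_query(
    phrase="supply chain",
    any_of=["NVIDIA", "AMD"],
    exclude=["rumor"],
    site="reuters.com",
)
print(query)
```

Each resulting string would be one entry in the keywords list.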
Quick Start
- Open the actor on Apify and go to the Input tab.
- Enter one or more search terms in Search keywords.
- Choose Country and language.
- Set Articles per search.
- Keep Real publisher URLs on if you need direct article links.
- Run the actor and export the dataset as JSON, CSV, Excel, HTML, XML, or RSS.
Example input:
{"keywords": ["OpenAI", "Anthropic"],"maxArticles": 50,"timeframe": "1d","region_language": "US:en","decodeUrls": true,"extractImages": true}
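The same input can be built and submitted from code. The sketch below assumes the official `apify-client` Python package (1.x style, where `call` returns a dict containing `defaultDatasetId`); the token and actor ID are placeholders you would need to fill in, and `build_run_input` and `run_actor` are hypothetical helper names.

```python
import json


def build_run_input(keywords, max_articles=50, timeframe="1d",
                    region_language="US:en", decode_urls=True,
                    extract_images=True):
    """Return an input dict using the field names from the example input."""
    return {
        "keywords": list(keywords),
        "maxArticles": max_articles,
        "timeframe": timeframe,
        "region_language": region_language,
        "decodeUrls": decode_urls,
        "extractImages": extract_images,
    }


def run_actor(token, actor_id, run_input):
    """Start a run and return its dataset items.

    Requires `pip install apify-client`; imported lazily so the
    input builder above works without the package installed.
    """
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = build_run_input(["OpenAI", "Anthropic"])
print(json.dumps(run_input, indent=2))

# Placeholders only; replace with your own token and the actor's ID:
# items = run_actor("<YOUR_APIFY_TOKEN>", "<ACTOR_ID>", run_input)
```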
Common Use Cases
Brand Monitoring
Track brand mentions, executives, products, lawsuits, partnerships, or incidents.
{"keywords": ["Acme Corp", "Acme CEO", "\"Acme\" lawsuit"],"timeframe": "1d","maxArticles": 100,"excludeDomains": ["msn.com", "yahoo.com"]}
Market And Competitor Research
Collect news around companies, industries, funding rounds, regulations, or competitors.
{"keywords": ["AI chip market", "NVIDIA OR AMD", "semiconductor supply chain"],"timeframe": "7d","region_language": "US:en","maxArticles": 200,"decodeUrls": true}
Localized News Tracking
Get results from a specific country and language edition of Google News.
{"keywords": ["election"],"region_language": "FR:fr","timeframe": "7d","maxArticles": 100}
Topic Feeds
Scrape top stories from Google News topic feeds.
{"topics": ["TECHNOLOGY", "BUSINESS"],"region_language": "GB:en","maxArticles": 50}
Historical Archive
To collect more results than a single Google News RSS feed usually returns, set a larger article limit and, optionally, an exact date range.
{"query": "semiconductor supply chain","dateFrom": "2025-01-01","dateTo": "2025-12-31","maxArticles": 5000,"region_language": "US:en"}
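If you prefer to break a long archive into several smaller runs yourself (for example, to resume after a failure), a date range can be cut into consecutive windows and each window passed as dateFrom / dateTo. This is only an illustrative sketch: the 30-day window size is an assumption, and it does not describe how the actor splits requests internally.

```python
from datetime import date, timedelta


def split_date_range(date_from, date_to, days_per_window=30):
    """Yield (start, end) ISO date pairs covering date_from..date_to inclusive."""
    start = date.fromisoformat(date_from)
    last = date.fromisoformat(date_to)
    while start <= last:
        # Each window spans days_per_window days, clipped to the final date.
        end = min(start + timedelta(days=days_per_window - 1), last)
        yield start.isoformat(), end.isoformat()
        start = end + timedelta(days=1)


windows = list(split_date_range("2025-01-01", "2025-03-15"))
for window_start, window_end in windows:
    run_input = {
        "query": "semiconductor supply chain",
        "dateFrom": window_start,
        "dateTo": window_end,
        "maxArticles": 5000,
        "region_language": "US:en",
    }
    print(run_input["dateFrom"], "->", run_input["dateTo"])
```

Results from separate runs can overlap at the edges, so deduplicating on articleId afterwards is advisable.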
Output
Each dataset item is one Google News result.
{"position": 1,"sourcePosition": 3,"keyword": "OpenAI","title": "OpenAI announces new enterprise AI tools","description": "OpenAI announces new enterprise AI tools - Example News","source": "Example News","sourceUrl": "https://www.example.com","sourceDomain": "example.com","url": "https://www.example.com/openai-enterprise-ai-tools","publisherUrl": "https://www.example.com/openai-enterprise-ai-tools","googleNewsUrl": "https://news.google.com/rss/articles/CBMi...","rssLink": "https://news.google.com/rss/articles/CBMi...","guid": "CBMi...","articleId": "CBMi...","publishedAt": "2026-05-29T08:30:00.000Z","publishedTimestamp": 1780043400000,"image": "https://news.google.com/api/attachments/CC8i...","imageUrl": "https://news.google.com/api/attachments/CC8i...","sourceType": "keyword","tickers": ["OPENAI"],"entities": [{"ticker": "OPENAI", "name": "OpenAI", "type": "private"}],"metadata": {"scrapeTimestamp": "2026-05-29T08:31:02.123Z","sourceType": "keyword","timeframe": "1d","region": "US","language": "en","feedUrl": "https://news.google.com/rss/search?q=OpenAI..."}}
Important output fields:
| Field | Meaning |
|---|---|
| title | Article headline from Google News. |
| description | Plain-text RSS description when Google News provides it. |
| source | Publisher name. |
| sourceDomain | Normalized publisher domain. |
| publishedAt | Publication time in UTC. |
| url | Best available article URL. If URL decoding succeeds, this is the publisher URL; otherwise it is the Google News URL. |
| publisherUrl | Direct publisher article URL when decodeUrls is enabled and decoding succeeds. |
| googleNewsUrl | Original Google News result URL. |
| imageUrl | Google News thumbnail URL when available. |
| guid / articleId | Stable Google News identifiers, useful for deduplication. |
| tickers / entities | Stock, crypto, and company signals detected from the headline and RSS description. |
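When merging items from several runs, the fields above are enough to pick a link and drop repeats. The sketch below is one possible post-processing step; the helper names and the sample items (with made-up IDs) are illustrative, and it assumes publisherUrl is null or absent when decoding did not succeed.

```python
def best_url(item):
    """Prefer the decoded publisher URL, then url, then the Google News URL."""
    return item.get("publisherUrl") or item.get("url") or item.get("googleNewsUrl")


def dedupe_by_article_id(items):
    """Keep the first item seen for each articleId (falling back to guid)."""
    seen = set()
    unique = []
    for item in items:
        key = item.get("articleId") or item.get("guid")
        if key is not None and key in seen:
            continue  # already have this Google News article
        if key is not None:
            seen.add(key)
        unique.append(item)
    return unique


# Made-up sample rows shaped like the dataset items described above.
items = [
    {"articleId": "A1", "publisherUrl": "https://www.example.com/story",
     "googleNewsUrl": "https://news.google.com/rss/articles/A1"},
    {"articleId": "A1", "publisherUrl": None,
     "googleNewsUrl": "https://news.google.com/rss/articles/A1"},
    {"articleId": "B2", "publisherUrl": None,
     "url": "https://news.google.com/rss/articles/B2",
     "googleNewsUrl": "https://news.google.com/rss/articles/B2"},
]

unique = dedupe_by_article_id(items)
for item in unique:
    print(item["articleId"], best_url(item))
```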
Input Reference
Most users only need these fields:
| Field | Default | Description |
|---|---|---|
| keywords | [] | Search terms. Add one query per line. |
| topics | [] | Optional Google News topic feeds. |
| topicUrls | [] | Optional Google News URLs pasted from your browser. |
| timeframe | 1d | Search window: 1h, 1d, 7d, 1m, 1y, or all. |
| dateFrom / dateTo | - | Optional exact date range. Format: YYYY-MM-DD. |
| region_language | US:en | Country and language preset. |
| maxArticles | 10 | Number of articles to return per keyword, topic, or URL. |
| decodeUrls | true | Resolve Google News links to real publisher URLs. Turn off for the fastest RSS-only runs. |
| extractImages | true | Add thumbnails when Google News exposes them. |
| extractEntities | true | Extract ticker and company signals. |
| includeDomains | [] | Return only these publisher domains. |
| excludeDomains | [] | Remove these publisher domains. |
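Checking an input locally before starting a run can catch simple mistakes early. The sketch below validates only what the table above states (the timeframe values, the YYYY-MM-DD date format, and a positive maxArticles); the ordering check on the dates is an added sanity check, and `validate_input` is a hypothetical helper, not something the actor exposes.

```python
from datetime import datetime

ALLOWED_TIMEFRAMES = {"1h", "1d", "7d", "1m", "1y", "all"}


def validate_input(run_input):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    timeframe = run_input.get("timeframe", "1d")
    if timeframe not in ALLOWED_TIMEFRAMES:
        problems.append(
            f"timeframe {timeframe!r} is not one of {sorted(ALLOWED_TIMEFRAMES)}"
        )

    max_articles = run_input.get("maxArticles", 10)
    if not isinstance(max_articles, int) or max_articles < 1:
        problems.append("maxArticles must be a positive integer")

    parsed = {}
    for field in ("dateFrom", "dateTo"):
        value = run_input.get(field)
        if value is None:
            continue
        try:
            parsed[field] = datetime.strptime(value, "%Y-%m-%d").date()
        except (TypeError, ValueError):
            problems.append(f"{field} must use the YYYY-MM-DD format")

    # Added sanity check (an assumption, not stated in the table).
    if len(parsed) == 2 and parsed["dateFrom"] > parsed["dateTo"]:
        problems.append("dateFrom must not be later than dateTo")
    return problems


for problem in validate_input(
    {"timeframe": "2w", "dateFrom": "2025-12-31", "dateTo": "2025-01-01"}
):
    print(problem)
```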
Advanced API-only fields are also supported for automation and compatibility: query, q, searchQuery, queries, startUrls, maxResults, numberOfResults, limit, maxItems, country, language, gl, hl, lr, cr, nfpr, filter, deduplicate, proxyMode, proxyConfiguration, and concurrency/retry tuning.
Speed And Cost
For the fastest and cheapest runs, turn decodeUrls off. RSS-only mode is enough if Google News URLs are acceptable.
Keep decodeUrls on when you need real publisher links. URL decoding adds extra Google News requests per article, so it is slower than RSS-only mode.
extractImages is usually worth keeping on. It uses Google News thumbnail data and a feed-level image index, not one publisher-page request per article. Some Google News results do not expose thumbnails, so imageUrl can still be null.
By default, networking is direct-first. Proxy settings are hidden from the visual UI to avoid accidental proxy spend. API users can still set proxyMode to direct, auto, or apify.
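For budgeting, a rough upper bound on the per-result charge can be computed from the listed price of $0.70 per 1,000 results and the fact that maxArticles applies per keyword, topic, or URL. This is a back-of-the-envelope sketch: it assumes billing is purely per returned result, ignores any other platform charges, and overestimates whenever Google News exposes fewer results than requested.

```python
PRICE_PER_1000_RESULTS_USD = 0.70  # listed price for this actor


def estimate_max_cost_usd(num_queries, max_articles):
    """Upper bound on the result charge: every query returns max_articles items."""
    upper_bound_results = num_queries * max_articles
    return upper_bound_results / 1000 * PRICE_PER_1000_RESULTS_USD


# Three keywords at 200 articles each.
estimate = estimate_max_cost_usd(num_queries=3, max_articles=200)
print(f"At most ${estimate:.2f} for up to {3 * 200} results")
```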
Notes And Limitations
- This actor does not extract full article text. It focuses on stable Google News data: headline, RSS description, publisher, date, source domain, image, Google News URL, and optional resolved publisher URL.
- Some articles do not have thumbnails in Google News.
- Publisher URL decoding depends on Google News redirect/decode behavior and can occasionally fail. When that happens, the item is still returned with the Google News URL.
- Google News ranking and availability vary by country, language, query, and time.
- Very large archive runs depend on how many unique results Google News exposes for the query and date range.
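Because decoding and thumbnails can be missing for individual items, it can be useful to measure how often that happens in a finished run. The sketch below counts both cases over a list of dataset items; `summarize_run` is a hypothetical helper, the sample rows are made up, and it assumes publisherUrl and imageUrl are null or absent when unavailable.

```python
def summarize_run(items):
    """Count items with and without a decoded publisher URL and a thumbnail."""
    total = len(items)
    decoded = sum(1 for item in items if item.get("publisherUrl"))
    with_image = sum(1 for item in items if item.get("imageUrl"))
    return {
        "total": total,
        "decodedUrls": decoded,
        "undecodedUrls": total - decoded,
        "missingImages": total - with_image,
    }


# Made-up sample rows shaped like the dataset items.
sample_items = [
    {"publisherUrl": "https://www.example.com/a", "imageUrl": "https://news.google.com/api/attachments/x"},
    {"publisherUrl": None, "imageUrl": None},
    {"publisherUrl": "https://www.example.com/b", "imageUrl": None},
]

summary = summarize_run(sample_items)
print(summary)
```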
FAQ
Can I get direct publisher URLs instead of Google News links?
Yes. Keep decodeUrls enabled. The actor will resolve Google News redirects into publisher article URLs when possible.
Can I scrape more than 100 Google News results?
Yes. Set maxArticles above 100. The actor splits large keyword requests by date and deduplicates results by Google News ID.
Can I scrape Google News in different countries and languages?
Yes. Use region_language, for example US:en, DE:de, FR:fr, JP:ja, or BR:pt-419.
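A region_language preset is a country code and a language code joined by a colon, which matches the separate region and language values shown under metadata in the sample output. A minimal sketch for splitting a preset, using a hypothetical `parse_region_language` helper:

```python
def parse_region_language(preset):
    """Split a preset such as 'BR:pt-419' into (country, language)."""
    country, separator, language = preset.partition(":")
    if not separator or not country or not language:
        raise ValueError(f"expected COUNTRY:language, got {preset!r}")
    return country.upper(), language


for preset in ["US:en", "DE:de", "FR:fr", "JP:ja", "BR:pt-419"]:
    country, language = parse_region_language(preset)
    print(preset, "->", country, language)
```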
Does it use Apify Proxy?
Normal runs are direct-first and do not require a proxy. Advanced API users can enable Apify Proxy fallback or force proxy mode if needed.
Does it extract full article content?
No. Full publisher-page extraction is slower, less reliable across paywalls and consent pages, and often produces sparse or inconsistent records. This actor is optimized for reliable Google News result data.
Is this actor affiliated with Google?
No. This actor is not affiliated with Google. Users are responsible for using the data lawfully and respecting Google News and publisher terms.
Need help?
Open the Issues tab on the actor page and include your input JSON, run ID, expected result, and one example of missing or incorrect data.