Google News Scraper — Articles, Topics & Full Text
Pricing
Pay per usage
Scrape Google News articles by keyword, topic, or RSS feed. Extract titles, sources, dates, snippets, full text, images, and related articles. 40+ languages, 70+ countries, four strategies with auto-fallback. For media monitoring, PR, and market research.
Developer

Ricardo Akiyoshi
Google News Scraper
Scrape Google News articles by keyword, topic, or RSS feed. Extract article titles, publication sources, dates, snippets, full article text, images, and related articles. Supports 40+ languages and 70+ countries.
What does Google News Scraper do?
This actor scrapes Google News using four complementary strategies to deliver comprehensive news coverage:
- Google News Search Pages — Searches Google News for your keywords and extracts article cards with titles, sources, timestamps, and snippets
- Google News RSS Feeds — Parses structured RSS/XML feeds for reliable, well-formatted article data with related article clusters
- Google News Topic Pages — Scrapes curated topic sections (headlines, business, technology, science, health, sports, entertainment, world news)
- Full Article Extraction — Follows article links to extract the complete article body text using readability heuristics, plus high-resolution images
All four strategies run in parallel and results are deduplicated automatically. The scraper handles Google's anti-bot measures with proxy rotation, user-agent cycling, and intelligent rate limiting.
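The deduplication step could look roughly like the sketch below. It is only an illustration: the README does not say which key the actor deduplicates on, so matching on a normalized URL (lowercased host, no fragment, no `utm_*` tracking parameters) is an assumption, and `normalize_url` and `dedupe_articles` are hypothetical names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase scheme and host, drop the fragment, utm_* params and trailing slash."""
    parts = urlsplit(url.strip())
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path.rstrip("/"), urlencode(query), "")
    )


def dedupe_articles(articles):
    """Keep the first article seen for each normalized URL, preserving input order."""
    seen = set()
    unique = []
    for article in articles:
        key = normalize_url(article["url"])
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique
```

With this approach the result from whichever strategy reports an article first is the one that is kept.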
Use Cases
- Media Monitoring — Track brand mentions, competitor coverage, and industry news in real time
- PR & Communications — Monitor press coverage, measure media reach, and identify journalist sources
- Market Research — Analyze news sentiment and trends across industries and regions
- Competitive Intelligence — Track competitor announcements, product launches, and executive changes
- Academic Research — Collect news datasets for NLP, sentiment analysis, and media studies
- Financial Analysis — Monitor news for trading signals, earnings coverage, and market-moving events
- Content Curation — Build automated news feeds and newsletters from multiple topics
- Crisis Monitoring — Real-time alerts for breaking news about your brand or industry
Input Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| searchTerms | Array of strings | [] | Keywords or phrases to search for. Each term produces a separate search. |
| topic | Enum | headlines | Topic section: headlines, world, business, technology, science, health, sports, entertainment |
| language | String | en | Language code (ISO 639-1): en, es, fr, de, pt, ja, zh, etc. |
| country | String | US | Country code (ISO 3166-1): US, GB, DE, FR, BR, JP, IN, AU, etc. |
| maxAge | Enum | past_day | Article age filter: past_hour, past_day, past_week, past_month, past_year |
| maxResults | Integer | 200 | Maximum articles to scrape (1-10,000) |
| proxy | Object | — | Apify proxy configuration. Residential proxies recommended for high volume. |
Example Input
    {
        "searchTerms": ["artificial intelligence", "machine learning startups"],
        "topic": "technology",
        "language": "en",
        "country": "US",
        "maxAge": "past_week",
        "maxResults": 500
    }
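Before starting a run, it can help to check an input object against the documented fields. The sketch below is a hypothetical client-side helper, not part of the actor: `validate_input` is an invented name, and the allowed values and defaults are copied from the Input Configuration table.

```python
TOPICS = {"headlines", "world", "business", "technology",
          "science", "health", "sports", "entertainment"}
MAX_AGES = {"past_hour", "past_day", "past_week", "past_month", "past_year"}

# Defaults as listed in the Input Configuration table.
DEFAULTS = {
    "searchTerms": [],
    "topic": "headlines",
    "language": "en",
    "country": "US",
    "maxAge": "past_day",
    "maxResults": 200,
}


def validate_input(raw):
    """Merge the documented defaults into `raw` and return (config, errors)."""
    config = {**DEFAULTS, **raw}
    errors = []

    terms = config["searchTerms"]
    if not isinstance(terms, list) or not all(isinstance(t, str) for t in terms):
        errors.append("searchTerms must be an array of strings")
    if config["topic"] not in TOPICS:
        errors.append(f"unknown topic: {config['topic']!r}")
    if config["maxAge"] not in MAX_AGES:
        errors.append(f"unknown maxAge: {config['maxAge']!r}")

    max_results = config["maxResults"]
    if isinstance(max_results, bool) or not isinstance(max_results, int) \
            or not 1 <= max_results <= 10_000:
        errors.append("maxResults must be an integer between 1 and 10,000")

    return config, errors
```

Calling `validate_input` on the example input above returns an empty error list and fills in nothing, since every documented field except `proxy` is already set.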
Output
Each article is saved as a dataset item with the following fields:
| Field | Type | Description |
|---|---|---|
| title | String | Article headline |
| source | String | Publisher name (CNN, Reuters, BBC, etc.) |
| url | String | Direct link to the article |
| publishedAt | String | Publication date in ISO 8601 format |
| snippet | String | Article summary or preview text |
| fullText | String | Full article body text (when extraction succeeds) |
| topic | String | Category or topic label |
| images | Array | Image URLs found in the article, with alt text |
| relatedArticles | Array | Related article titles and URLs from the same story cluster |
| searchTerm | String | The search term that found this article |
| scrapedAt | String | Timestamp when the article was scraped |
| strategy | String | Which scraping strategy found the article |
Example Output
    {
        "title": "OpenAI Announces GPT-5 with Breakthrough Reasoning Capabilities",
        "source": "Reuters",
        "url": "https://www.reuters.com/technology/openai-gpt5-2026-03-01/",
        "publishedAt": "2026-03-01T14:30:00.000Z",
        "snippet": "OpenAI unveiled its latest language model on Friday, claiming significant improvements in mathematical reasoning and code generation...",
        "fullText": "OpenAI unveiled its latest language model on Friday, claiming significant improvements in mathematical reasoning and code generation. The new model, dubbed GPT-5, was trained on...",
        "topic": "technology",
        "images": [
            {
                "url": "https://www.reuters.com/images/openai-gpt5.jpg",
                "alt": "OpenAI CEO presenting GPT-5"
            }
        ],
        "relatedArticles": [
            {
                "title": "Google DeepMind Responds to GPT-5 Launch",
                "url": "https://www.theverge.com/2026/3/1/deepmind-response-gpt5"
            }
        ],
        "searchTerm": "artificial intelligence",
        "scrapedAt": "2026-03-01T15:00:00.000Z",
        "strategy": "rss_feed"
    }
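A dataset item like the one above might be consumed as in the following sketch. The helper names `parse_timestamp`, `scrape_delay_minutes`, and `best_text` are hypothetical. The sketch assumes timestamps always end in `Z` as in the example, and that `fullText` may be missing or empty when extraction fails.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 string with a trailing 'Z' into a timezone-aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def scrape_delay_minutes(item):
    """Minutes between publication and scraping for one dataset item."""
    published = parse_timestamp(item["publishedAt"])
    scraped = parse_timestamp(item["scrapedAt"])
    return (scraped - published).total_seconds() / 60


def best_text(item):
    """Prefer the full article text; fall back to the snippet when it is absent or empty."""
    return item.get("fullText") or item.get("snippet", "")
```

For the example item, `scrape_delay_minutes` gives 30.0, because the article was published at 14:30 UTC and scraped at 15:00 UTC.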
Pricing
This actor uses a pay-per-event pricing model:
- $0.004 per article scraped and added to the dataset
- Full article text extraction is included at no extra charge
- Related articles are included at no extra charge
Cost Examples
| Articles | Cost |
|---|---|
| 100 | $0.40 |
| 500 | $2.00 |
| 1,000 | $4.00 |
| 5,000 | $20.00 |
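The table follows directly from the per-article price. A small estimator, assuming the $0.004 rate above is the only charge (platform or proxy fees, if any, are not covered here), could be sketched as follows. `estimate_cost` and `max_articles_for_budget` are illustrative names.

```python
from decimal import Decimal

PRICE_PER_ARTICLE = Decimal("0.004")  # USD, from the pricing section above


def estimate_cost(article_count):
    """Return the cost in USD for a given number of scraped articles."""
    if article_count < 0:
        raise ValueError("article_count must be non-negative")
    return PRICE_PER_ARTICLE * article_count


def max_articles_for_budget(budget_usd):
    """Return how many articles fit into a budget given in USD."""
    return int(Decimal(str(budget_usd)) // PRICE_PER_ARTICLE)


for count in (100, 500, 1000, 5000):
    print(f"{count:>5} articles: ${estimate_cost(count):.2f}")
```

`Decimal` is used instead of floats so that sums such as 0.004 × 500 come out as exact dollar amounts.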
Scraping Strategies
The scraper uses four strategies simultaneously for maximum coverage:
Strategy 1: Google Search Pages
Queries google.com/search?tbm=nws with your keywords. Parses article cards from search results. Supports pagination for up to 500 results per keyword.
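As a rough illustration of what such a query can look like, the sketch below builds a search URL. Only `tbm=nws` comes from this README. Using `tbs=qdr:…` for the age filter, `start` for pagination, and `hl`/`gl` for language and country reflects commonly used Google search parameters and is an assumption about how the actor maps its inputs. `build_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed mapping from the actor's maxAge values to Google's "qdr" time filter.
AGE_TO_QDR = {
    "past_hour": "h",
    "past_day": "d",
    "past_week": "w",
    "past_month": "m",
    "past_year": "y",
}


def build_search_url(term, language="en", country="US",
                     max_age="past_day", page=0, page_size=10):
    """Build a Google news-search URL for one keyword and one result page."""
    params = {
        "q": term,
        "tbm": "nws",                            # news vertical, as named above
        "hl": language,                          # interface language (assumption)
        "gl": country,                           # country bias (assumption)
        "tbs": f"qdr:{AGE_TO_QDR[max_age]}",     # age filter (assumption)
        "start": page * page_size,               # result offset (assumption)
    }
    return "https://www.google.com/search?" + urlencode(params)


print(build_search_url("machine learning startups", max_age="past_week", page=2))
```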
Strategy 2: Google News RSS Feeds
Fetches news.google.com/rss/search?q=keyword for structured XML article data. RSS feeds provide clean, well-formatted data with publication dates and source information. Also extracts related article clusters from RSS descriptions.
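To make the RSS approach concrete, here is a minimal offline sketch that builds a feed URL and parses an RSS 2.0 document with the standard library. The `hl`, `gl`, and `ceid` parameters are an assumption based on how Google News feed URLs commonly look, the sample XML is invented for illustration, and `build_rss_url` and `parse_rss` are hypothetical names. Related-article clusters are not handled here.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from urllib.parse import urlencode


def build_rss_url(term, language="en", country="US"):
    """Build a Google News RSS search URL (locale parameters are assumptions)."""
    params = {
        "q": term,
        "hl": f"{language}-{country}",
        "gl": country,
        "ceid": f"{country}:{language}",
    }
    return "https://news.google.com/rss/search?" + urlencode(params)


def parse_rss(xml_text):
    """Extract title, link, source and ISO 8601 date from each <item> element."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        source = item.find("source")
        pub_date = item.findtext("pubDate", default="")
        articles.append({
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
            "source": (source.text or "") if source is not None else "",
            # RSS dates use the RFC 822 style; convert them to ISO 8601.
            "publishedAt": parsedate_to_datetime(pub_date).isoformat() if pub_date else "",
            "strategy": "rss_feed",
        })
    return articles


# Invented sample feed, used only to exercise the parser offline.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item>
  <title>Example headline</title>
  <link>https://news.example.com/story</link>
  <pubDate>Sun, 01 Mar 2026 14:30:00 GMT</pubDate>
  <source url="https://news.example.com">Example Source</source>
</item>
</channel></rss>"""

print(parse_rss(SAMPLE_FEED))
```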
Strategy 3: Topic Pages
Scrapes news.google.com/topics/... for curated editorial sections. Provides the same articles you see on the Google News homepage under each topic category.
Strategy 4: Full Article Extraction
Follows article links to the original publisher's website and extracts the full article body text using readability heuristics. Removes navigation, ads, sidebars, cookie banners, and other non-content elements. Skips paywalled domains automatically (WSJ, FT, NYT, etc.).
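The paywall skip can be pictured as a simple domain check before a link is followed. The sketch below is illustrative only: the domain set is a guess at how the publishers named in this README map to hostnames, the actor's real list is not documented here, and `is_paywalled` is a hypothetical name.

```python
from urllib.parse import urlsplit

# Illustrative list only; the actor's actual paywall list is not published here.
PAYWALLED_DOMAINS = {"wsj.com", "ft.com", "nytimes.com", "bloomberg.com"}


def is_paywalled(url):
    """Return True if the URL's host is a listed domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain)
               for domain in PAYWALLED_DOMAINS)


def links_to_extract(urls):
    """Filter out links that should be skipped for full-text extraction."""
    return [url for url in urls if not is_paywalled(url)]
```

Matching on the exact domain or a dot-prefixed suffix avoids false positives such as a host that merely ends in the letters `ft.com`.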
Tips for Best Results
- Use specific search terms — "Tesla quarterly earnings Q1 2026" works better than just "Tesla"
- Combine search terms with topics — Search terms and topic pages run in parallel, maximizing coverage
- Use residential proxies for large scrapes (1000+ articles) to avoid Google rate limits
- Set an appropriate maxAge: use past_hour for breaking news, past_week for research
- Monitor the run log for blocked request warnings. If you see many blocks, add proxy configuration
- Full text extraction works best on major news sites (Reuters, AP, BBC, CNN). Some sites may block or return partial content
Limitations
- Google may rate-limit requests without proxy configuration
- Paywalled sites (WSJ, FT, NYT, Bloomberg, etc.) are skipped for full-text extraction
- Some Google News redirect URLs may not decode properly for newer article formats
- Maximum ~500 search results per keyword (Google's pagination limit)
- Full article text extraction depends on the publisher's page structure
Integrations
Connect Google News Scraper to other tools and services:
- Webhooks — Get notified when the run completes
- API — Start runs and download results programmatically
- Integrations — Connect to Slack, Google Sheets, Zapier, Make, and more
- Scheduling — Run hourly, daily, or weekly for continuous monitoring
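For the API route, a run can be started with a POST request to the Apify API v2 "run actor" endpoint. The sketch below only builds the request and does not send it. The actor ID `username~google-news-scraper` is a placeholder, since the real ID is not given in this README, and `build_run_request` is a hypothetical helper.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Build (but do not send) a POST request that starts an actor run."""
    return urllib.request.Request(
        f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_run_request(
    "username~google-news-scraper",      # placeholder actor ID (assumption)
    "YOUR_APIFY_TOKEN",                  # placeholder API token
    {"searchTerms": ["artificial intelligence"], "maxAge": "past_week", "maxResults": 100},
)
print(request.get_method(), request.full_url)

# To actually start the run, pass the request to urllib.request.urlopen(request)
# and read the JSON response, which describes the run and its default dataset.
```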