Google News Scraper
Pricing: from $5.00 per 1,000 results
A scraper for Google News aimed at developer automation, data integration, and AI training datasets. It includes built-in captcha bypass, headful or headless browser execution, and proxy support for reliable scraping at scale.
Developer: codingfrontend
Maintained by Community
Actor stats
- Bookmarked: 1
- Total users: 4
- Monthly active users: 1
- Last modified: 4 hours ago
Features
- News Articles: Scrapes news articles from news.google.com for any search query
- Deep Article Extraction: Optionally visits each article page to extract full content, author, and metadata
- Article Metadata: Captures title, source publication, publication date, and author information
- Full Article Text: Extracts the full article text and reports word count and reading time estimates for each article
- SEO Metadata: Retrieves canonical URLs, OG tags, Twitter card data, keywords, and article sections
- Media Content: Detects and captures video URLs embedded in articles
- Localization Support: Supports language and country codes for region-specific news feeds
- Proxy Support: Built-in Apify Proxy with residential proxies for reliable news.google.com access
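The word count and reading time estimates listed above are not specified further, so the following is only a minimal sketch of how such values are commonly derived. The 200 words-per-minute rate, the rounding rule, and both function names are assumptions for illustration, not the actor's documented method.

```python
import re

# Assumption: the actor's real reading speed is not documented.
ASSUMED_WORDS_PER_MINUTE = 200


def count_words(text: str) -> int:
    """Count whitespace-separated tokens as words."""
    return len(re.findall(r"\S+", text))


def estimate_reading_minutes(words: int, wpm: int = ASSUMED_WORDS_PER_MINUTE) -> int:
    """Round words / wpm to whole minutes, with a 1-minute floor for non-empty text."""
    if words <= 0:
        return 0
    return max(1, round(words / wpm))


if __name__ == "__main__":
    body = "Google News aggregates headlines from many publishers."
    words = count_words(body)
    print(words, estimate_reading_minutes(words))
```

Under these assumed values an 850-word article comes out at 4 minutes; a different rate or rounding rule would give a different figure.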
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | String | Yes | "Elon Musk" | Search term to find news articles on Google News |
| maxItems | Integer | Yes | 10 | Maximum number of news articles to retrieve (10–5000) |
| hl | String | No | "en-US" | Language code for the Google News interface (e.g., en-US, en-IN, fr-FR) |
| gl | String | No | "US" | Country code for Google News (e.g., US, IN, GB, FR) |
| deepScrape | Boolean | No | true | Visit each article page to extract full content, author, and metadata |
| proxyConfiguration | Object | No | Apify Residential | Proxy settings for the scraper |
Input Schema Example
{
  "query": "artificial intelligence",
  "maxItems": 50,
  "hl": "en-US",
  "gl": "US",
  "deepScrape": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
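When the input is assembled programmatically, it can help to check it against the documented constraints before starting a run. The sketch below builds an input object with the field names and defaults from the parameter table and rejects a `maxItems` outside the stated 10–5000 range. The helper name `build_input` is hypothetical, and validating on the client side is an assumption about workflow, not something the actor requires.

```python
import json
from typing import Any

# Range stated in the parameter table for maxItems.
MIN_ITEMS, MAX_ITEMS = 10, 5000


def build_input(
    query: str,
    max_items: int = 10,
    hl: str = "en-US",
    gl: str = "US",
    deep_scrape: bool = True,
) -> dict[str, Any]:
    """Return an input dict shaped like the documented input schema."""
    if not query.strip():
        raise ValueError("query is required and must not be empty")
    if not MIN_ITEMS <= max_items <= MAX_ITEMS:
        raise ValueError(f"maxItems must be between {MIN_ITEMS} and {MAX_ITEMS}")
    return {
        "query": query.strip(),
        "maxItems": max_items,
        "hl": hl,
        "gl": gl,
        "deepScrape": deep_scrape,
        # Mirrors the documented default of Apify residential proxies.
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_input("artificial intelligence", max_items=50), indent=2))
```

The resulting dict can be serialized to JSON and pasted into the actor's input editor or passed to whichever client is used to start the run.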
Output Schema
The scraper outputs structured JSON data for each news article found on Google News.
Main Fields
| Field | Type | Description |
|---|---|---|
| position | Integer | Article position in the results |
| title | String | Article headline |
| link | String | URL to the full article |
| source | String | Source name (e.g., "CNN", "BBC News") |
| date | String | Publication date as shown on Google News |
| author | String | Article author (from deep scrape) |
| publisher | String | Publisher name |
| publishedDate | String | ISO publication date (from deep scrape) |
| description | String | Article description or summary |
| snippet | String | Short text snippet shown in Google News |
| wordCount | Integer | Approximate word count of the article |
| readingTime | Number | Estimated reading time in minutes |
| deepScrapeStatus | String | Status of deep scraping (success/failed/skipped) |
| domain | String | Domain of the article URL |
| keywords | String | Keywords extracted from the article metadata |
| language | String | Language of the article |
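For downstream code, the fields in the table can be mapped onto a typed record. The sketch below covers a subset of the fields and keeps only articles whose `deepScrapeStatus` is `success`. The class and function names are hypothetical, and treating the deep-scrape fields as possibly missing (for example when deep scraping is skipped or fails) is an assumption, since the table does not say which fields are always present.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class NewsArticle:
    position: int
    title: str
    link: str
    source: str
    date: str
    deep_scrape_status: str
    # Fields described as coming from the deep scrape; assumed optional here.
    author: Optional[str] = None
    published_date: Optional[str] = None
    word_count: Optional[int] = None

    @classmethod
    def from_record(cls, record: dict[str, Any]) -> "NewsArticle":
        """Build an article from one output record, tolerating missing fields."""
        return cls(
            position=int(record["position"]),
            title=record["title"],
            link=record["link"],
            source=record.get("source", ""),
            date=record.get("date", ""),
            # Assumption: a record without a status is treated as skipped.
            deep_scrape_status=record.get("deepScrapeStatus", "skipped"),
            author=record.get("author"),
            published_date=record.get("publishedDate"),
            word_count=record.get("wordCount"),
        )


def deep_scraped(records: list[dict[str, Any]]) -> list[NewsArticle]:
    """Return only the articles whose deep scrape succeeded."""
    articles = [NewsArticle.from_record(r) for r in records]
    return [a for a in articles if a.deep_scrape_status == "success"]
```

Filtering on the status first avoids treating empty author or date fields from failed page visits as real data.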
Article Example
{
  "position": 1,
  "title": "OpenAI Releases GPT-5 With Improved Reasoning Capabilities",
  "link": "https://www.techcrunch.com/2025/01/15/openai-gpt-5-release",
  "source": "TechCrunch",
  "date": "2 hours ago",
  "author": "Jane Smith",
  "publisher": "TechCrunch",
  "publishedDate": "2025-01-15T08:00:00.000Z",
  "description": "OpenAI has officially launched GPT-5, featuring significantly improved reasoning and coding capabilities...",
  "snippet": "The new model shows a 40% improvement over GPT-4 on standard benchmarks...",
  "wordCount": 850,
  "readingTime": 4,
  "deepScrapeStatus": "success",
  "domain": "techcrunch.com",
  "keywords": "OpenAI, GPT-5, AI, language model",
  "language": "en"
}
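Several fields in the example arrive as plain strings that usually need light post-processing: `keywords` is a comma-separated string, `publishedDate` is an ISO timestamp ending in `Z`, and `domain` appears to be the link's host without a `www.` prefix. The helpers below are a sketch using only the Python standard library. Their names are hypothetical, and the comma delimiter and `www.` stripping are inferred from this single example rather than documented behavior.

```python
from datetime import datetime
from urllib.parse import urlparse


def split_keywords(keywords: str) -> list[str]:
    """Split a comma-separated keyword string into trimmed, non-empty terms."""
    return [k.strip() for k in keywords.split(",") if k.strip()]


def host_without_www(link: str) -> str:
    """Return the URL host with a leading 'www.' removed."""
    host = urlparse(link).hostname or ""
    return host[4:] if host.startswith("www.") else host


def parse_published(value: str) -> datetime:
    """Parse an ISO timestamp; older Python versions reject a trailing 'Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


if __name__ == "__main__":
    article = {
        "link": "https://www.techcrunch.com/2025/01/15/openai-gpt-5-release",
        "publishedDate": "2025-01-15T08:00:00.000Z",
        "keywords": "OpenAI, GPT-5, AI, language model",
    }
    print(split_keywords(article["keywords"]))
    print(host_without_www(article["link"]))
    print(parse_published(article["publishedDate"]).isoformat())
```

Parsing `publishedDate` into a timezone-aware datetime makes it straightforward to sort or filter articles by age, which the relative `date` string ("2 hours ago") does not allow.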