Google News Scraper
Pricing: $19.99/month + usage
📰 Google News Scraper extracts real‑time headlines, snippets, publishers, timestamps & links from Google News by topic, keyword, region & language. ⚡ Ideal for media monitoring, PR, SEO, market research & competitive intelligence. 🔎 Clean JSON/CSV output.
Developer: ScrapePilot
Google News Scraper
Google News Scraper is a real-time Google News scraping tool that collects headlines, publishers, timestamps, links, snippets, and thumbnails from Google News RSS results, filtered by query, region, language, and time period. It replaces manual monitoring by turning Google News into structured, machine-readable data that marketers, developers, data analysts, and researchers can use at scale. With built‑in proxy fallback and pagination strategies, it supports media monitoring, SEO analysis, and automated intelligence pipelines with clean JSON/CSV exports.
What data / output can you get?
Below are the exact fields this Google News data extraction actor saves to the Apify dataset in real time:
| Data type | Description | Example value |
|---|---|---|
| position | Sequential rank within the collected set | 1 |
| title | News article title | “Tesla shares rise after delivery update” |
| link | Direct article URL (resolved from Google redirect) | “https://www.bloomberg.com/news/...” |
| domain | Publisher domain (derived from source name or URL) | “bloomberg.com” |
| source | Publisher name (parsed from RSS title) | “Bloomberg” |
| date | Human‑readable relative time (e.g., “2 hours ago”) | “3 hours ago” |
| date_utc | ISO‑8601 UTC timestamp | “2026-04-03T03:25:12+00:00” |
| snippet | Cleaned article snippet extracted from RSS description | “Company reported stronger‑than‑expected deliveries...” |
| thumbnail | Base64 data URL of a representative image (if found) | “data:image/jpeg;base64,/9j/4AAQSkZJRgABAQ...” |
| block_position | Same as position (useful for downstream ordering) | 1 |
Notes:
- Items are pushed as they’re processed (streaming save).
- You can export your dataset as JSON, CSV, or Excel from the Apify platform.
- Thumbnails are encoded as base64 data URLs when an image is found via Open Graph/Twitter Card tags or on‑page images; if none are found, the field may be an empty string.
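Because thumbnails arrive as base64 data URLs, downstream code usually needs to turn them back into raw image bytes. The following is a minimal sketch using only the Python standard library; the helper name `decode_thumbnail` is hypothetical and not part of the actor.

```python
import base64
from typing import Optional, Tuple


def decode_thumbnail(data_url: str) -> Optional[Tuple[str, bytes]]:
    """Split a base64 data URL into (mime_type, raw_bytes).

    Returns None for an empty string or a value that is not a
    well-formed base64 data URL.
    """
    if not data_url or not data_url.startswith("data:"):
        return None
    header, sep, payload = data_url.partition(",")
    if not sep or not header.endswith(";base64"):
        return None
    mime_type = header[len("data:"):-len(";base64")]
    try:
        # validate=True rejects characters outside the base64 alphabet
        return mime_type, base64.b64decode(payload, validate=True)
    except ValueError:
        return None


# Illustrative value only; real items carry full JPEG/PNG payloads.
sample = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("ascii")
print(decode_thumbnail(sample))
```

The returned bytes can be written to a file or handed to an image library; items with an empty `thumbnail` simply yield `None`.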
Key features
- ⚡ Real‑time RSS extraction: Processes Google News RSS feeds for your query and saves items to the dataset as they’re discovered, ideal for real-time dashboards and alerts.
- 🌍 Region & language targeting: Control result location and language via gl (Google Country), hl (UI Language), lr (Language Results), and cr (Country Results) to build localized feeds.
- ⏱️ Time period filtering: Filter results by last_hour, last_day, last_week, last_month, last_year, or a custom date range (MM/DD/YYYY).
- 🧠 Clean snippets & normalized dates: HTML descriptions are cleaned into concise snippets; dates are provided both as human‑readable (date) and ISO‑8601 UTC (date_utc).
- 🖼️ Smart thumbnail capture: Attempts Open Graph/Twitter preview images first, then falls back to on‑page images; outputs base64 data URLs for portability.
- 🛡️ Automatic proxy fallback: Robust request pipeline escalates from no proxy ➜ datacenter ➜ residential proxies on block/timeout to maximize reliability.
- 🔁 Resilient retries & pacing: Built‑in retry logic and gentle rate limiting keep runs stable under varying network conditions.
- 🧩 Developer‑friendly workflow: Built with the Apify Python SDK and aiohttp for performance; consume results via the Apify API from Python, Node.js, or any HTTP client.
How to use Google News Scraper - step by step
- Create or log in to your Apify account: Access the Apify Console to run the actor.
- Open the “Google News Scraper” actor: Find it in your dashboard and click “Run”.
- Enter your input:
  - Required:
    - query: Your search term (e.g., “Elon Musk”)
    - maxItems: Number of results to retrieve (100–5000)
  - Optional:
    - gl (Google Country), hl (UI Language), lr (Language Results), cr (Country Results)
    - time_period and optional time_period_min / time_period_max for custom ranges
    - nfpr (0/1) to exclude autocorrected results
    - filter (0/1) to control similar/omitted results filter
    - proxyConfiguration to configure proxies (defaults to no proxy, with automatic fallback inside the run)
- Review key settings: Choose your region/language, set a time window (e.g., last_day), and adjust filters (nfpr, filter) as needed.
- Start the run: Click “Start”. The actor will fetch, parse, and stream results to the dataset as it processes strategies and pages.
- Monitor progress: Logs will show proxy fallback decisions, retries, and how many new articles were found.
- Export results: Open the run’s Dataset and export to JSON, CSV, or Excel, or consume via the Apify API for downstream automation.
Pro Tip: Schedule runs and use the Apify API to pipe results into BI dashboards, databases, or NLP pipelines.
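For API-driven automation, the sketch below prepares the two HTTP calls involved: starting a run and reading its dataset. It assumes the public Apify API v2 endpoints (`/acts/{actorId}/runs` and `/datasets/{datasetId}/items`) and uses only the standard library. The actor ID, token, and dataset ID shown are placeholders, and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that starts an actor run."""
    return urllib.request.Request(
        f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {token}"},
        method="POST",
    )


def dataset_items_url(dataset_id: str, fmt: str = "json", limit: Optional[int] = None) -> str:
    """Build the URL that returns a dataset's items in the given format."""
    params = {"format": fmt}
    if limit is not None:
        params["limit"] = str(limit)
    return f"{API_BASE}/datasets/{dataset_id}/items?{urllib.parse.urlencode(params)}"


# Placeholder values: substitute your own actor ID and API token.
request = build_run_request(
    "<username>~<actor-name>", "<APIFY_TOKEN>", {"query": "Tesla", "maxItems": 200}
)
# Sending it would look like:
#   run = json.load(urllib.request.urlopen(request))["data"]
#   items_url = dataset_items_url(run["defaultDatasetId"])
print(request.get_method(), request.full_url)
print(dataset_items_url("<datasetId>", fmt="csv", limit=50))
```

Nothing is sent over the network until `urlopen` is called, so the request can be inspected or logged first. The official `apify-client` packages wrap the same endpoints if you prefer a higher-level interface.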
Use cases
| Use case name | Description |
|---|---|
| Media monitoring for PR | Track mentions of brands, executives, and products across regions/languages and export daily digests. |
| SEO trend discovery | Analyze fresh headlines and snippets to spot trending keywords and content gaps for faster content planning. |
| Competitive intelligence | Monitor competitor announcements, launches, and coverage to inform strategy and positioning. |
| Market and finance tracking | Collect business and financial news, earnings updates, and sentiment indicators for analysis. |
| AI/NLP training datasets | Feed clean, structured news items into LLMs, topic modeling, and classification pipelines. |
| Academic & policy research | Build time‑bounded corpora by topic, region, and language for research reproducibility. |
| API-driven news feeds | Power internal apps with a Google News scraper API by consuming the Apify dataset programmatically. |
Why choose Google News Scraper?
Built for precision, automation, and reliability, this Google News web scraping actor turns Google News into structured data with minimal setup.
- 🚀 Fast & scalable: Async fetching with pagination strategies collects hundreds to thousands of items per run.
- 🌐 Multilingual & multi‑region: Fine‑tune gl, hl, lr, cr for localized and language‑specific feeds.
- 🧾 Structured, analytics‑ready output: Clean snippets, normalized dates, and base64 thumbnails in consistent JSON records.
- 🧪 Developer‑ready: Python-based actor; results accessible via the Apify API from any environment (Python, Node.js, etc.).
- 🛡️ Robust reliability: Automatic proxy fallback (none ➜ datacenter ➜ residential) plus retries and backoff.
- 💸 Cost‑effective: Use the trial minutes to test runs before scaling to larger workloads.
- 🔌 Easy integrations: Export to CSV/JSON/Excel or pull via API into tools and pipelines.
Unlike brittle browser extensions, this actor needs no Google API keys: it relies on stable RSS endpoints and resilient networking to deliver consistent, high‑quality data.
Is it legal / ethical to use Google News Scraper?
Yes — when done responsibly. This actor processes publicly available Google News RSS feeds and does not access private or authenticated data.
Guidelines for compliant use:
- Respect platform terms and robots directives.
- Avoid abusive traffic; let retries/backoff handle transient issues.
- Use data for permitted, ethical purposes (research, monitoring, analytics).
- Attribute original publishers where appropriate and follow fair-use principles.
- Consult your legal team for edge cases in your jurisdiction.
Input parameters & output format
Example input JSON
{"query": "Tesla","maxItems": 200,"gl": "United States","hl": "English","lr": "English","cr": "United States","time_period": "last_day","nfpr": 1,"filter": 1,"proxyConfiguration": {"useApifyProxy": false}}
Parameters
| Field | Type | Description | Default | Required |
|---|---|---|---|---|
| maxItems | integer | Maximum number of search results to retrieve (min 100, max 5000). If out of range, the actor clamps to the nearest bound. | 100 | Yes |
| query | string | The search term to use. | “Elon Musk” | Yes |
| gl | string (enum) | Google Country to use for the query (e.g., “United States”). | — | No |
| hl | string (enum) | Google UI Language for returned results (e.g., “English”). | — | No |
| lr | string (enum) | Limit results to a specific language (Language Results). | — | No |
| cr | string (enum) | Limit results to a specific country (Country Results). | — | No |
| time_period | string (enum) | Time period for results: last_hour, last_day, last_week, last_month, last_year, custom. | — | No |
| time_period_min | string | Minimum date for custom time period (MM/DD/YYYY). | — | No |
| time_period_max | string | Maximum date for custom time period (MM/DD/YYYY). | — | No |
| nfpr | integer | Exclude results from auto‑corrected queries (0 or 1). | 0 | No |
| filter | integer | Enable/disable Similar Results and Omitted Results filters (0 or 1). | 1 | No |
| proxyConfiguration | object | Configure proxies. The run starts with no proxy, then falls back to datacenter, then residential if needed. | {"useApifyProxy": false} | No |
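If you assemble the input programmatically, it can help to mirror the documented constraints on the client side before starting a run. The sketch below is one way to do that: it clamps `maxItems` to the 100–5000 range, checks `time_period` against the listed values, and checks custom dates against MM/DD/YYYY. `build_input` is a hypothetical helper, not part of the actor, and the actor applies its own validation regardless.

```python
from datetime import datetime
from typing import Any, Dict, Optional

MIN_ITEMS, MAX_ITEMS = 100, 5000
TIME_PERIODS = {"last_hour", "last_day", "last_week", "last_month", "last_year", "custom"}


def build_input(
    query: str,
    max_items: int = 100,
    time_period: Optional[str] = None,
    time_period_min: Optional[str] = None,
    time_period_max: Optional[str] = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Return an input dict that respects the documented parameter limits."""
    if not query or not query.strip():
        raise ValueError("query is required")
    run_input: Dict[str, Any] = {
        "query": query.strip(),
        # Clamp to the nearest bound, as the parameter table describes
        "maxItems": max(MIN_ITEMS, min(MAX_ITEMS, int(max_items))),
    }
    if time_period is not None:
        if time_period not in TIME_PERIODS:
            raise ValueError(f"unknown time_period: {time_period!r}")
        run_input["time_period"] = time_period
    if time_period == "custom":
        for key, value in (("time_period_min", time_period_min),
                           ("time_period_max", time_period_max)):
            if value is not None:
                datetime.strptime(value, "%m/%d/%Y")  # raises ValueError if malformed
                run_input[key] = value
    run_input.update(extra)  # e.g. gl, hl, lr, cr, nfpr, filter, proxyConfiguration
    return run_input


print(build_input("Tesla", max_items=50, time_period="last_day", gl="United States"))
```

Extra keyword arguments are passed through unchanged, so enum fields such as `gl` or `hl` still need values the actor accepts.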
Example output item (JSON)
{"position": 1,"title": "Tesla shares rise after delivery update","link": "https://www.bloomberg.com/news/...","domain": "bloomberg.com","source": "Bloomberg","date": "2 hours ago","date_utc": "2026-04-03T03:25:12+00:00","snippet": "Tesla posted stronger-than-expected deliveries for the quarter, boosting investor sentiment ahead of earnings...","thumbnail": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQ...truncated...","block_position": 1}
Notes:
- thumbnail may be empty if no suitable image is found or retrievable.
- date is a friendly relative time; date_utc is an ISO‑8601 UTC timestamp for precise filtering.
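As an illustration of filtering on `date_utc`, the sketch below keeps only items published within a given number of hours. It assumes every non-empty `date_utc` carries a UTC offset, as in the example item above; `published_within` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List, Optional


def published_within(
    items: Iterable[Dict], hours: float, now: Optional[datetime] = None
) -> List[Dict]:
    """Return items whose date_utc falls within the last `hours` hours."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=hours)
    return [
        item for item in items
        # Items without a date_utc value are skipped
        if item.get("date_utc") and datetime.fromisoformat(item["date_utc"]) >= cutoff
    ]


items = [
    {"title": "Recent story", "date_utc": "2026-04-03T03:25:12+00:00"},
    {"title": "Older story", "date_utc": "2026-04-02T09:00:00+00:00"},
    {"title": "Undated story", "date_utc": ""},
]
reference = datetime(2026, 4, 3, 6, 0, tzinfo=timezone.utc)
print([item["title"] for item in published_within(items, hours=3, now=reference)])
```

Passing an explicit `now` keeps the filter deterministic in tests; in production it defaults to the current UTC time.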
FAQ
Is there a free tier or trial?
Yes. This actor includes trial minutes (120) so you can evaluate performance and output before subscribing.
Does it support Python or Node.js integrations?
Yes. The actor is built in Python, and you can consume results from the Apify API using Python, Node.js, or any HTTP client, making it a flexible Google News scraper API for your stack.
How many results can I scrape per run?
You can request between 100 and 5000 items via maxItems. The actor uses multiple strategies and pagination to increase coverage within your configured limits.
What filters can I use for language, country, and time?
You can set gl (Google Country), hl (UI Language), lr (Language Results), cr (Country Results), and time_period (including custom ranges with time_period_min and time_period_max).
What data fields are included in the output?
Each item includes position, title, link, domain, source, date, date_utc, snippet, thumbnail (base64 data URL when available), and block_position.
Does it fetch article images?
Yes. It tries Open Graph/Twitter Card images first, then inspects on‑page images. Valid images are downloaded and returned as base64 data URLs in thumbnail.
How does the proxy system work?
The run starts without a proxy. If blocks or errors occur, it automatically falls back to datacenter proxies and then to residential proxies, with retries and backoff for resilience.
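The escalation described above can be sketched as a loop over proxy tiers with retries and exponential backoff inside each tier. This is an illustration of the general pattern, not the actor's actual code; `fetch_with_fallback`, the tier names, and the stand-in fetchers are all hypothetical.

```python
import time
from typing import Callable, Optional, Sequence, Tuple

Fetcher = Callable[[str], str]


def fetch_with_fallback(
    url: str,
    tiers: Sequence[Tuple[str, Fetcher]],
    retries_per_tier: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, str]:
    """Try each tier in order; return (tier_name, body) from the first success."""
    last_error: Optional[Exception] = None
    for tier_name, fetch in tiers:
        for attempt in range(retries_per_tier):
            try:
                return tier_name, fetch(url)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
                sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"all proxy tiers failed for {url}") from last_error


def blocked(url: str) -> str:
    """Stand-in fetcher that simulates a block."""
    raise ConnectionError("blocked")


def succeeds(url: str) -> str:
    """Stand-in fetcher that simulates a successful response."""
    return "<rss/>"


tiers = [("none", blocked), ("datacenter", succeeds), ("residential", succeeds)]
print(fetch_with_fallback("https://example.com/feed", tiers, sleep=lambda _: None))
```

Injecting `sleep` lets tests skip the delays. The residential tier is reached only if the datacenter tier exhausts its retries, which keeps the more expensive proxies as a last resort.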
Is this a Google News RSS feed scraper or a SERP scraper?
This actor targets Google News RSS endpoints for reliability and speed, then enriches results (e.g., thumbnails) by visiting article pages when needed.
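To make the RSS step concrete, here is a minimal sketch of turning feed items into records with `title`, `source`, and `date_utc` fields. It assumes item titles follow a "Headline - Publisher" layout and that `pubDate` is an RFC 822 date. The sample feed is invented for illustration, and the sketch does not resolve Google redirect links or fetch thumbnails.

```python
import xml.etree.ElementTree as ET
from datetime import timezone
from email.utils import parsedate_to_datetime
from typing import Dict, List

# Invented sample feed; real feeds contain many items and more elements.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item>
  <title>Tesla shares rise after delivery update - Bloomberg</title>
  <link>https://news.google.com/rss/articles/EXAMPLE</link>
  <pubDate>Fri, 03 Apr 2026 03:25:12 GMT</pubDate>
</item>
</channel></rss>"""


def parse_items(rss_xml: str) -> List[Dict]:
    """Parse RSS <item> elements into flat records."""
    records = []
    for position, item in enumerate(ET.fromstring(rss_xml).iter("item"), start=1):
        raw_title = item.findtext("title", default="")
        # Assumption: the publisher name follows the last " - " in the title
        title, sep, source = raw_title.rpartition(" - ")
        if not sep:
            title, source = raw_title, ""
        pub_date = item.findtext("pubDate")
        date_utc = ""
        if pub_date:
            date_utc = parsedate_to_datetime(pub_date).astimezone(timezone.utc).isoformat()
        records.append({
            "position": position,
            "title": title,
            "source": source,
            "link": item.findtext("link", default=""),  # still the redirect URL
            "date_utc": date_utc,
        })
    return records


print(parse_items(SAMPLE_RSS))
```

Splitting on the last separator rather than the first keeps headlines that themselves contain " - " intact.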
Final thoughts
Google News Scraper is built to turn Google News into structured, analysis‑ready data for marketers, developers, analysts, and researchers. With robust filters (language, country, time), resilient proxy fallback, and clean JSON output, it streamlines everything from media monitoring to SEO trend analysis and AI dataset creation. Developers can automate end‑to‑end using the Apify API to power real‑time Google News scraping without API keys or browser automation. Start extracting smarter news insights today.