# Google News Scraper (`solidcode/google-news-scraper`) Actor

Search Google News by keyword and extract article titles, sources, publication dates, URLs, and thumbnails. Filter by time range, language, and country.

- **URL**: https://apify.com/solidcode/google-news-scraper.md
- **Developed by:** [SolidCode](https://apify.com/solidcode) (community)
- **Categories:** News, Automation, Developer tools
- **Stats:** 3 total users, 2 monthly users, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $1.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Google News Scraper

Extract fresh news articles from Google News at scale — headlines, publishers, bylines, publication dates, thumbnails, and direct publisher URLs for any topic, keyword, or news section worldwide. Built for analysts, researchers, and teams who need reliable news data without monthly rental fees.

### Why This Scraper?

- **Every Google News surface in one actor** — keyword search, topic sections (Business, Tech, Sports…), publisher feeds, and Full Coverage story pages, all with the same clean output
- **Advanced search operators** — exact-phrase matching, title-only search, include/exclude specific publishers, and term exclusions — no need to craft Google query strings yourself
- **Bypass the 100-article cap** — set a date range and the scraper automatically sweeps it in windows to return thousands of articles per keyword
- **Direct publisher URLs** — optional one-click resolution of Google's `/read/...` redirects into the real publisher links (nytimes.com, bbc.com, reuters.com, …)
- **19 languages, 47 countries** — get localized news from the US, UK, France, Germany, Japan, Brazil, India, and more
- **Author bylines included** — extracted automatically when Google exposes them (~30% of articles)
- **Clean, predictable output** — 13 typed fields, ISO 8601 timestamps, no base64 image bloat, no emoji-laden enum values
- **Pay only for what you get** — no monthly rental, transparent per-result pricing

### Use Cases

**Media Monitoring & PR**
- Track brand, product, or executive mentions across global news outlets
- Monitor competitor coverage in near real-time with the "Past hour" or "Past 24 hours" filter
- Build press clip archives by publisher for stakeholder reports

**Market & Trend Research**
- Measure volume and sentiment around emerging topics, industries, or technologies
- Pull a full year of coverage on a topic with the date-range sweep
- Localize research by country and language for international markets

**Financial & Investment Intelligence**
- Gather news around public companies, tickers, or macro themes
- Track regulatory, policy, or geopolitical stories by country
- Feed fresh news into sentiment or event-driven trading models

**Content & SEO**
- Identify trending stories inside a niche or topic section
- Analyze which publishers dominate coverage of a given keyword
- Curate topical newsletters automatically

**Academic & Journalistic Research**
- Build news corpora for NLP, bias, or misinformation research
- Investigate how a single event was covered by different publishers with Full Coverage pages
- Archive news for longitudinal studies with arbitrary date ranges

### Getting Started

#### Simple Keyword Search

The minimum input — one or more keywords:

```json
{
    "keywords": ["Artificial Intelligence"],
    "maxResults": 50
}
````

#### Recent News Only

Limit to the past 24 hours and sort newest first:

```json
{
    "keywords": ["OpenAI", "Anthropic"],
    "timeFilter": "24h",
    "sortBy": "date",
    "maxResults": 100
}
```

#### Advanced Operators — Precise Queries

Match an exact phrase, limit to the article title, and filter by publisher:

```json
{
    "keywords": ["climate change"],
    "exactPhrase": true,
    "inTitleOnly": true,
    "sources": ["BBC", "Reuters", "The Guardian"],
    "excludeTerms": ["opinion"],
    "timeFilter": "week",
    "maxResults": 100
}
```

#### Historical Sweep — Beyond the 100-Article Cap

Set a custom date range to pull thousands of articles by automatically splitting the range into smaller windows:

```json
{
    "keywords": ["Federal Reserve"],
    "dateFrom": "2024-01-01",
    "dateTo": "2024-12-31",
    "maxResults": 0,
    "sortBy": "date"
}
```
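
The windowing idea behind the sweep can be illustrated with a minimal sketch. The Actor's real window size and splitting strategy are not documented here, so the fixed `window_days` parameter and the helper name `split_date_range` are assumptions made purely for illustration:

```python
from datetime import date, timedelta


def split_date_range(date_from: str, date_to: str, window_days: int = 7) -> list[tuple[str, str]]:
    """Split an inclusive YYYY-MM-DD range into consecutive inclusive windows.

    Hypothetical helper: the Actor's actual windowing logic may differ.
    """
    if window_days < 1:
        raise ValueError("window_days must be at least 1")
    start = date.fromisoformat(date_from)
    end = date.fromisoformat(date_to)
    if start > end:
        raise ValueError("date_from must not be after date_to")

    windows = []
    cursor = start
    while cursor <= end:
        # Each window spans window_days days, clipped to the end of the range.
        window_end = min(cursor + timedelta(days=window_days - 1), end)
        windows.append((cursor.isoformat(), window_end.isoformat()))
        cursor = window_end + timedelta(days=1)
    return windows


windows = split_date_range("2024-01-01", "2024-12-31", window_days=7)
print(len(windows), windows[0], windows[-1])
```

Under this assumed 7-day window, the 2024 range above yields 53 windows. If each window cost one query, that would sit well below the default `maxRequestsPerKeyword` of 300.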

#### Topic Sections, Publisher Feeds, and Story Pages

Paste any Google News URL — topic, publisher, or Full Coverage page:

```json
{
    "startUrls": [
        "https://news.google.com/topics/CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB",
        "https://news.google.com/publications/CAAqBwgKMKbdrQww0L-7Aw",
        "https://news.google.com/stories/CAAqNggKIjBDQklTSGpvSmMzUnZjbmt0TXpZd1NoRUtEd2lnODRLd0NoRC1sTU9qQmlnQVAB"
    ],
    "maxResults": 100
}
```

#### Direct Publisher URLs

Resolve Google's `/read/...` redirects into the real publisher links:

```json
{
    "keywords": ["SpaceX launch"],
    "timeFilter": "week",
    "resolvePublisherUrls": true,
    "maxResults": 25
}
```

#### Localized News — Any Language, Any Country

Get news for France, in French:

```json
{
    "keywords": ["élection"],
    "language": "fr",
    "country": "FR",
    "timeFilter": "week",
    "maxResults": 50
}
```

### Input Reference

#### Search

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `keywords` | string\[] | `["Technology"]` | Topics or terms to search. Each keyword runs independently. Use `OR` inside a single entry (e.g. `"apple OR tesla"`) to combine two queries into one. |
| `exactPhrase` | boolean | `false` | Wrap each keyword in quotes so Google matches the phrase exactly. |
| `inTitleOnly` | boolean | `false` | Only match keywords that appear in the article title. Cuts noise from articles that mention the term in passing. |
| `excludeTerms` | string\[] | `[]` | Words to exclude from results (e.g. `["opinion", "sponsored"]`). |
| `sources` | string\[] | `[]` | Restrict results to specific publishers (e.g. `["BBC", "Reuters"]`). Multiple sources combined with OR. |
| `excludeSources` | string\[] | `[]` | Skip articles from specific publishers. |
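
As a rough illustration of what these options do to a keyword, the sketch below composes a query string from `exactPhrase`, `inTitleOnly`, and `excludeTerms`, following the input schema's description (quotes for exact phrases, the `intitle:` operator, and `-term` exclusions). The helper `build_query` and the order in which operators are combined are assumptions; `sources` and `excludeSources` are left out because the operator used for them is not documented here.

```python
from typing import Sequence


def build_query(
    keyword: str,
    exact_phrase: bool = False,
    in_title_only: bool = False,
    exclude_terms: Sequence[str] = (),
) -> str:
    """Compose a search query string (hypothetical helper, not the Actor's code)."""
    # exactPhrase: wrap the keyword in double quotes.
    term = f'"{keyword}"' if exact_phrase else keyword
    # inTitleOnly: prefix with the intitle: operator.
    if in_title_only:
        term = f"intitle:{term}"
    # excludeTerms: append each excluded word as -term.
    parts = [term] + [f"-{excluded}" for excluded in exclude_terms]
    return " ".join(parts)


print(build_query("climate change", exact_phrase=True, in_title_only=True, exclude_terms=["opinion"]))
# intitle:"climate change" -opinion
```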

#### URL Mode

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `startUrls` | URL\[] | `[]` | Any `news.google.com` URL — search, topic, publisher feed, or Full Coverage story page. Each URL is fetched and all article cards are extracted. Bypasses keyword operators. |

Either `keywords` or `startUrls` must be provided. Both can be set — keywords are processed first.
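
A caller might check this rule before starting a run. The sketch below is a client-side pre-flight check under that assumption; `plan_targets` is a hypothetical helper, and it lists keywords ahead of start URLs to mirror the processing order described above.

```python
def plan_targets(actor_input: dict) -> list[tuple[str, str]]:
    """Return (kind, value) pairs in processing order, or raise if the input has neither."""
    keywords = actor_input.get("keywords") or []
    start_urls = actor_input.get("startUrls") or []
    if not keywords and not start_urls:
        raise ValueError("Provide at least one of 'keywords' or 'startUrls'")
    # Keywords come first, then start URLs.
    return [("keyword", k) for k in keywords] + [("startUrl", u) for u in start_urls]


targets = plan_targets({
    "keywords": ["OpenAI"],
    "startUrls": ["https://news.google.com/publications/CAAqBwgKMKbdrQww0L-7Aw"],
})
print(targets)
```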

#### Time Filter

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `timeFilter` | string | `"any"` | Preset window: `any`, `hour`, `24h`, `week`, `month`, `year`. |
| `dateFrom` | string | — | Start of custom date range (`YYYY-MM-DD`, inclusive). |
| `dateTo` | string | — | End of custom date range (`YYYY-MM-DD`, inclusive). When both are set, the range is swept in smaller windows — can return thousands of articles per keyword. |
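
The precedence between the preset and the custom range can be sketched as follows. The date pattern and the preset values come from the input schema; the helper name `effective_time_window` and the fallback to the preset when only one bound is set are assumptions, since that case is not described here.

```python
import re

DATE_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{2}-[0-9]{2}$")
TIME_FILTERS = {"any", "hour", "24h", "week", "month", "year"}


def effective_time_window(actor_input: dict) -> dict:
    """Decide which time setting applies (hypothetical client-side check)."""
    date_from = actor_input.get("dateFrom")
    date_to = actor_input.get("dateTo")
    for name, value in (("dateFrom", date_from), ("dateTo", date_to)):
        if value is not None and not DATE_PATTERN.match(value):
            raise ValueError(f"{name} must use the YYYY-MM-DD format")

    # A complete custom range overrides the preset time filter.
    if date_from and date_to:
        return {"mode": "range", "dateFrom": date_from, "dateTo": date_to}

    # Assumption: with zero or one bound, fall back to the preset.
    time_filter = actor_input.get("timeFilter", "any")
    if time_filter not in TIME_FILTERS:
        raise ValueError(f"Unknown timeFilter: {time_filter!r}")
    return {"mode": "preset", "timeFilter": time_filter}


print(effective_time_window({"timeFilter": "week", "dateFrom": "2024-01-01", "dateTo": "2024-12-31"}))
```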

#### Localization

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `language` | string | `"en"` | Interface language — English, Spanish, French, German, Italian, Portuguese, Dutch, Polish, Russian, Turkish, Japanese, Korean, Chinese, Arabic, Hindi, Indonesian, Thai, Vietnamese. |
| `country` | string | `"US"` | Country bias for results — 47 countries supported, or `"any"` for no regional preference. |

#### Output Options

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `includeAuthor` | boolean | `true` | Extract the article byline when shown on the card. |
| `resolvePublisherUrls` | boolean | `false` | Recover the real publisher URL for each Google News redirect. Adds a small cost per resolved article. |
| `sortBy` | string | `"relevance"` | `"relevance"` keeps Google's native ordering; `"date"` re-sorts by publication date (newest first). |
| `deduplicateAcrossKeywords` | boolean | `false` | Drop articles that appear in more than one keyword's results. Off by default — each keyword's results are independent. |
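
The effect of `sortBy: "date"` and `deduplicateAcrossKeywords` can be approximated on already-fetched records, for example when merging datasets from several runs. This is a sketch only: using `url` as the duplicate key and placing records without `publishedAt` last are assumptions, not documented behavior of the Actor.

```python
from datetime import datetime, timezone


def parse_published_at(value):
    """Parse an ISO 8601 UTC timestamp such as 2026-04-18T14:32:00Z, or return None."""
    if value is None:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def dedupe_and_sort(records, sort_by="relevance", dedupe=False):
    """Optionally drop repeated URLs (first occurrence wins) and sort newest first."""
    result, seen = [], set()
    for record in records:
        if dedupe:
            if record["url"] in seen:
                continue
            seen.add(record["url"])
        result.append(record)
    if sort_by == "date":
        # Records without a timestamp sort as the oldest possible date.
        oldest = datetime.min.replace(tzinfo=timezone.utc)
        result.sort(key=lambda r: parse_published_at(r.get("publishedAt")) or oldest, reverse=True)
    return result


records = [
    {"keyword": "OpenAI", "url": "u1", "publishedAt": "2026-04-18T14:32:00Z"},
    {"keyword": "OpenAI", "url": "u2", "publishedAt": "2026-04-19T08:00:00Z"},
    {"keyword": "Anthropic", "url": "u1", "publishedAt": "2026-04-18T14:32:00Z"},
    {"keyword": "Anthropic", "url": "u3", "publishedAt": None},
]
print([r["url"] for r in dedupe_and_sort(records, sort_by="date", dedupe=True)])
```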

#### Limits

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `maxResults` | integer | `50` | Max articles per keyword or start URL. Use `0` for as many as Google returns. Without a date range, Google caps at ~100 per keyword. |
| `maxRequestsPerKeyword` | integer | `300` | Safety cap on date-range sweep queries per keyword. |

### Output

Each record is one article with up to 13 structured fields:

```json
{
    "keyword": "Artificial Intelligence",
    "sourceUrl": "https://news.google.com/search?q=Artificial+Intelligence+when%3A7d&hl=en-US&gl=US&ceid=US%3Aen",
    "title": "OpenAI unveils new reasoning model with major benchmark gains",
    "description": null,
    "source": "The New York Times",
    "author": "Cade Metz",
    "url": "https://news.google.com/read/CBMiiAFBVV95cUxPa...",
    "publisherUrl": "https://www.nytimes.com/2026/04/18/technology/openai-reasoning-model.html",
    "publishedAt": "2026-04-18T14:32:00Z",
    "publishedRelative": "2 days ago",
    "imageUrl": "https://lh3.googleusercontent.com/...",
    "storyUrl": "https://news.google.com/stories/CAAqNggKIjBDQklTSGpvSmMzUnZjbmt0TXpZd1NoRUtEd2lnODRLd0NoRC1sTU9qQmlnQVAB",
    "scrapedAt": "2026-04-20T13:45:12Z"
}
```

#### All Available Fields

| Field | Type | Description |
|-------|------|-------------|
| `keyword` | string \| null | The search keyword that produced this result. `null` for start-URL results. |
| `sourceUrl` | string | The URL this record was fetched from — lets you trace records back to their input. |
| `title` | string | Article headline. |
| `description` | null | Always `null` — Google News cards don't include article snippets. Kept for schema stability. |
| `source` | string | Publisher name (e.g. "The New York Times"). |
| `author` | string \| null | Article byline when Google exposes it on the card. |
| `url` | string | Google News redirect URL for the article. Always present. |
| `publisherUrl` | string \| null | Direct publisher URL. Populated when `resolvePublisherUrls` is enabled. |
| `publishedAt` | string \| null | ISO 8601 publication timestamp (UTC). |
| `publishedRelative` | string | Raw relative timestamp as shown by Google in the selected language (e.g. `"3 days ago"`, `"Il y a 2 heures"`). |
| `imageUrl` | string \| null | Thumbnail URL (~99% of cards have one). |
| `storyUrl` | string \| null | URL to the Full Coverage page for this story, when Google groups the article into one. |
| `scrapedAt` | string | ISO 8601 timestamp of when the record was captured. |
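
For downstream code, the field table can be mirrored as a typed record. The sketch below uses a `TypedDict` whose name `Article` and the helper `best_link` are illustrative assumptions; the field names and nullability follow the table above.

```python
from typing import Optional, TypedDict


class Article(TypedDict):
    """One output record, typed after the field table (illustrative only)."""
    keyword: Optional[str]
    sourceUrl: str
    title: str
    description: None
    source: str
    author: Optional[str]
    url: str
    publisherUrl: Optional[str]
    publishedAt: Optional[str]
    publishedRelative: str
    imageUrl: Optional[str]
    storyUrl: Optional[str]
    scrapedAt: str


def best_link(article: Article) -> str:
    """Prefer the direct publisher URL and fall back to the Google News redirect."""
    return article["publisherUrl"] or article["url"]


sample: Article = {
    "keyword": "Artificial Intelligence",
    "sourceUrl": "https://news.google.com/search?q=Artificial+Intelligence",
    "title": "OpenAI unveils new reasoning model with major benchmark gains",
    "description": None,
    "source": "The New York Times",
    "author": "Cade Metz",
    "url": "https://news.google.com/read/CBMiiAFBVV95cUxPa...",
    "publisherUrl": None,  # resolvePublisherUrls was off in this hypothetical run
    "publishedAt": "2026-04-18T14:32:00Z",
    "publishedRelative": "2 days ago",
    "imageUrl": None,
    "storyUrl": None,
    "scrapedAt": "2026-04-20T13:45:12Z",
}
print(best_link(sample))
```

Because `url` is always present and `publisherUrl` may be `null`, a fallback of this kind keeps link handling uniform whether or not resolution was enabled.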

### Tips for Best Results

- **Beat the 100-article cap with a date range** — any query capped at ~100 results can be expanded into thousands by setting `dateFrom` and `dateTo`. The scraper automatically splits the range into smaller windows.
- **Use exact-phrase + title-only for precision** — combining `exactPhrase: true` with `inTitleOnly: true` removes ~90% of noise from broad keyword searches.
- **Narrow by publisher with `sources`** — ideal for monitoring a specific outlet's coverage of a topic without scraping their entire site.
- **Turn on `resolvePublisherUrls` selectively** — it costs two extra HTTP requests per article. Leave it off for large scans and only enable when direct publisher links matter.
- **Match `language` to `country`** — for best results, pair the market you care about with its primary language (e.g. `language: "ja"` + `country: "JP"`). Mismatches can silently return generic results.
- **Sort by date for freshness** — `sortBy: "date"` is perfect for newsfeed-style use cases where the newest articles matter most.

### Pricing

**$1.00 per 1,000 articles** — no monthly rental, no tiered plans. You pay only for the articles you actually receive.

| Articles | Estimated Cost |
|----------|----------------|
| 100 | $0.10 |
| 1,000 | $1.00 |
| 10,000 | $10.00 |
| 100,000 | $100.00 |

**Optional**: `resolvePublisherUrls` adds **$1.00 per 1,000 resolved URLs** on top (charged only when a direct publisher URL was successfully recovered). Failed decodes are never charged.

Platform fees (compute, proxy, storage) are additional and depend on your Apify plan.
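
The per-result arithmetic above can be captured in a small estimator. This is a sketch based only on the two listed rates; the function name `estimate_cost` is an assumption, and the result excludes any platform fees.

```python
from decimal import Decimal

PRICE_PER_ARTICLE = Decimal("1.00") / 1000       # $1.00 per 1,000 articles
PRICE_PER_RESOLVED_URL = Decimal("1.00") / 1000  # $1.00 per 1,000 resolved URLs


def estimate_cost(articles: int, resolved_urls: int = 0) -> Decimal:
    """Estimate per-event charges in USD, rounded to cents (platform fees excluded)."""
    if articles < 0 or resolved_urls < 0:
        raise ValueError("counts must be non-negative")
    if resolved_urls > articles:
        raise ValueError("resolved_urls cannot exceed articles")
    total = articles * PRICE_PER_ARTICLE + resolved_urls * PRICE_PER_RESOLVED_URL
    return total.quantize(Decimal("0.01"))


print(estimate_cost(10_000))                        # articles only
print(estimate_cost(10_000, resolved_urls=10_000))  # every URL resolved successfully
```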

### Integrations

Export data in JSON, CSV, Excel, XML, or RSS. Connect to 1,500+ apps via:

- **Zapier** / **Make** / **n8n** — Workflow automation
- **Google Sheets** — Direct spreadsheet export
- **Slack** / **Email** — Notifications on new results
- **Webhooks** — Get notified when a run completes
- **Apify API** — Full programmatic access

### Legal & Ethical Use

This Actor is designed for legitimate media monitoring, market research, and journalistic work. Users are responsible for complying with applicable laws and Google's Terms of Service, as well as the terms of the publishers whose articles appear in the results. Respect copyright when redistributing headlines or summaries, and do not use collected data for spam, harassment, or any illegal purpose.

# Actor input Schema

## `keywords` (type: `array`):

One or more topics or terms to search on Google News. Each keyword is searched independently and results are combined in the output. You can also use 'OR' inside a single entry (e.g. "apple OR tesla") to merge two queries into one.

## `exactPhrase` (type: `boolean`):

Wrap each keyword in quotes so Google matches the phrase exactly (e.g. "artificial intelligence" — articles must contain those words next to each other in that order, not just both words anywhere).

## `inTitleOnly` (type: `boolean`):

Only match keywords that appear in the article title (uses Google's intitle: operator). Reduces noise from articles that mention the term in passing.

## `excludeTerms` (type: `array`):

Words to exclude from results (e.g. \["opinion", "sponsored"]). Applied as -term to every keyword.

## `sources` (type: `array`):

Restrict results to specific publishers (e.g. \["BBC", "Reuters"]). Use the publisher name as it appears on Google News. Multiple sources are combined with OR.

## `excludeSources` (type: `array`):

Skip articles from specific publishers (e.g. \["CNN", "Fox News"]).

## `startUrls` (type: `array`):

Any news.google.com URL — search URL, topic page (e.g. /topics/CAAqI...), publisher feed (/publications/...), or Full Coverage story page (/stories/...). Each URL is fetched and all article cards on the page are extracted. Bypasses keyword and operator settings (it's a direct pass-through).

## `timeFilter` (type: `string`):

Limit results to articles published within this time window. Ignored when a custom date range is provided below.

## `dateFrom` (type: `string`):

Earliest publication date to include, in YYYY-MM-DD format (e.g. 2024-01-01). When both start and end dates are set, the scraper sweeps the full range and can return far more than 100 results per keyword by splitting the range into smaller windows.

## `dateTo` (type: `string`):

Latest publication date to include, in YYYY-MM-DD format (e.g. 2024-12-31). Both bounds are inclusive. When set with a start date, it overrides the time range filter above.

## `language` (type: `string`):

Interface language for news results.

## `country` (type: `string`):

Country bias for results. Pick 'Any' for no regional preference.

## `includeAuthor` (type: `boolean`):

Extract the article author / byline when shown on the card (~30% of articles include one — Google strips it from the rest).

## `resolvePublisherUrls` (type: `boolean`):

Recover the real publisher URL (e.g. nytimes.com/...) for each Google News /read/... link. Costs two extra HTTP requests per article (Google encrypts the publisher URL — decoding requires fetching a per-article signature first). Leave off unless you need direct publisher links.

## `sortBy` (type: `string`):

Order of records in the output. 'Relevance' keeps Google's native ordering; 'Date (newest first)' re-sorts by publication date after fetching.

## `deduplicateAcrossKeywords` (type: `boolean`):

When you search multiple keywords or start URLs, drop articles that appear in more than one set of results (kept once, attributed to whichever keyword saw it first). By default each keyword's results are independent.

## `maxResults` (type: `integer`):

Maximum number of articles to return for each keyword or start URL. Use 0 for as many as Google returns. Without a date range, Google caps responses at ~100 articles per keyword; with a date range, the scraper can return thousands.

## `maxRequestsPerKeyword` (type: `integer`):

Safety limit on how many date-window queries the scraper will fire for a single keyword. Raise for wide ranges on high-volume keywords; lower to cap compute. Only applies when a custom date range is set.

## Actor input object example

```json
{
  "keywords": [
    "Technology",
    "Artificial Intelligence"
  ],
  "exactPhrase": false,
  "inTitleOnly": false,
  "excludeTerms": [],
  "sources": [],
  "excludeSources": [],
  "startUrls": [],
  "timeFilter": "any",
  "language": "en",
  "country": "US",
  "includeAuthor": true,
  "resolvePublisherUrls": false,
  "sortBy": "relevance",
  "deduplicateAcrossKeywords": false,
  "maxResults": 50,
  "maxRequestsPerKeyword": 300
}
```

# Actor output Schema

## `overview` (type: `string`):

Table of scraped news articles with title, source, author, publication date, and link.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "keywords": [
        "Technology",
        "Artificial Intelligence"
    ],
    "exactPhrase": false,
    "inTitleOnly": false,
    "excludeTerms": [],
    "sources": [],
    "excludeSources": [],
    "startUrls": [],
    "timeFilter": "any",
    "language": "en",
    "country": "US",
    "includeAuthor": true,
    "resolvePublisherUrls": false,
    "sortBy": "relevance",
    "deduplicateAcrossKeywords": false,
    "maxResults": 50,
    "maxRequestsPerKeyword": 300
};

// Run the Actor and wait for it to finish
const run = await client.actor("solidcode/google-news-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "keywords": [
        "Technology",
        "Artificial Intelligence",
    ],
    "exactPhrase": False,
    "inTitleOnly": False,
    "excludeTerms": [],
    "sources": [],
    "excludeSources": [],
    "startUrls": [],
    "timeFilter": "any",
    "language": "en",
    "country": "US",
    "includeAuthor": True,
    "resolvePublisherUrls": False,
    "sortBy": "relevance",
    "deduplicateAcrossKeywords": False,
    "maxResults": 50,
    "maxRequestsPerKeyword": 300,
}

# Run the Actor and wait for it to finish
run = client.actor("solidcode/google-news-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
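
Where the client library is not available, the synchronous endpoint listed in the OpenAPI specification can be called with the standard library alone. The sketch below only builds the request so it can be inspected without network access; the helper name `build_run_request` is an assumption, and the reduced input is just an example.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/solidcode~google-news-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    # The OpenAPI specification passes the token as a query parameter.
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request("<YOUR_API_TOKEN>", {"keywords": ["Technology"], "maxResults": 10})
print(request.get_method(), request.full_url.split("?")[0])

# To actually send it (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)  # assumed to be a JSON array of article records
```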

## CLI example

```bash
echo '{
  "keywords": [
    "Technology",
    "Artificial Intelligence"
  ],
  "exactPhrase": false,
  "inTitleOnly": false,
  "excludeTerms": [],
  "sources": [],
  "excludeSources": [],
  "startUrls": [],
  "timeFilter": "any",
  "language": "en",
  "country": "US",
  "includeAuthor": true,
  "resolvePublisherUrls": false,
  "sortBy": "relevance",
  "deduplicateAcrossKeywords": false,
  "maxResults": 50,
  "maxRequestsPerKeyword": 300
}' |
apify call solidcode/google-news-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=solidcode/google-news-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Google News Scraper",
        "description": "Search Google News by keyword and extract article titles, sources, publication dates, URLs, and thumbnails. Filter by time range, language, and country.",
        "version": "1.0",
        "x-build-id": "c2NkB3bQWLHd8Pb9p"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/solidcode~google-news-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-solidcode-google-news-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/solidcode~google-news-scraper/runs": {
            "post": {
                "operationId": "runs-sync-solidcode-google-news-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/solidcode~google-news-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-solidcode-google-news-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "keywords": {
                        "title": "Search keywords",
                        "type": "array",
                        "description": "One or more topics or terms to search on Google News. Each keyword is searched independently and results are combined in the output. You can also use 'OR' inside a single entry (e.g. \"apple OR tesla\") to merge two queries into one.",
                        "default": [
                            "Technology"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "exactPhrase": {
                        "title": "Match exact phrase",
                        "type": "boolean",
                        "description": "Wrap each keyword in quotes so Google matches the phrase exactly (e.g. \"artificial intelligence\" — articles must contain those words next to each other in that order, not just both words anywhere).",
                        "default": false
                    },
                    "inTitleOnly": {
                        "title": "Title-only match",
                        "type": "boolean",
                        "description": "Only match keywords that appear in the article title (uses Google's intitle: operator). Reduces noise from articles that mention the term in passing.",
                        "default": false
                    },
                    "excludeTerms": {
                        "title": "Exclude terms",
                        "type": "array",
                        "description": "Words to exclude from results (e.g. [\"opinion\", \"sponsored\"]). Applied as -term to every keyword.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "sources": {
                        "title": "Include only these publishers",
                        "type": "array",
                        "description": "Restrict results to specific publishers (e.g. [\"BBC\", \"Reuters\"]). Use the publisher name as it appears on Google News. Multiple sources are combined with OR.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "excludeSources": {
                        "title": "Exclude these publishers",
                        "type": "array",
                        "description": "Skip articles from specific publishers (e.g. [\"CNN\", \"Fox News\"]).",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "startUrls": {
                        "title": "Start URLs (advanced)",
                        "type": "array",
                        "description": "Any news.google.com URL — search URL, topic page (e.g. /topics/CAAqI...), publisher feed (/publications/...), or Full Coverage story page (/stories/...). Each URL is fetched and all article cards on the page are extracted. Bypasses keyword and operator settings (it's a direct pass-through).",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "timeFilter": {
                        "title": "Time range",
                        "enum": [
                            "any",
                            "hour",
                            "24h",
                            "week",
                            "month",
                            "year"
                        ],
                        "type": "string",
                        "description": "Limit results to articles published within this time window. Ignored when a custom date range is provided below.",
                        "default": "any"
                    },
                    "dateFrom": {
                        "title": "Start date (custom range)",
                        "pattern": "^[0-9]{4}-[0-9]{2}-[0-9]{2}$",
                        "type": "string",
                        "description": "Earliest publication date to include, in YYYY-MM-DD format (e.g. 2024-01-01). When both start and end dates are set, the scraper sweeps the full range and can return far more than 100 results per keyword by splitting the range into smaller windows."
                    },
                    "dateTo": {
                        "title": "End date (custom range)",
                        "pattern": "^[0-9]{4}-[0-9]{2}-[0-9]{2}$",
                        "type": "string",
                        "description": "Latest publication date to include, in YYYY-MM-DD format (e.g. 2024-12-31). Both bounds are inclusive. When set with a start date, it overrides the time range filter above."
                    },
                    "language": {
                        "title": "Language",
                        "enum": [
                            "en",
                            "es",
                            "fr",
                            "de",
                            "it",
                            "pt",
                            "nl",
                            "pl",
                            "ru",
                            "tr",
                            "ja",
                            "ko",
                            "zh-CN",
                            "zh-TW",
                            "ar",
                            "hi",
                            "id",
                            "th",
                            "vi"
                        ],
                        "type": "string",
                        "description": "Interface language for news results.",
                        "default": "en"
                    },
                    "country": {
                        "title": "Country / region",
                        "enum": [
                            "any",
                            "US",
                            "GB",
                            "CA",
                            "AU",
                            "IE",
                            "IN",
                            "NZ",
                            "ZA",
                            "FR",
                            "DE",
                            "ES",
                            "IT",
                            "NL",
                            "BE",
                            "PT",
                            "PL",
                            "AT",
                            "CH",
                            "SE",
                            "NO",
                            "DK",
                            "FI",
                            "BR",
                            "MX",
                            "AR",
                            "CL",
                            "CO",
                            "JP",
                            "KR",
                            "CN",
                            "TW",
                            "HK",
                            "SG",
                            "ID",
                            "TH",
                            "VN",
                            "PH",
                            "MY",
                            "RU",
                            "TR",
                            "SA",
                            "AE",
                            "EG",
                            "IL",
                            "NG",
                            "KE"
                        ],
                        "type": "string",
                        "description": "Country bias for results. Pick 'any' for no regional preference.",
                        "default": "US"
                    },
                    "includeAuthor": {
                        "title": "Include author byline",
                        "type": "boolean",
                        "description": "Extract the article author / byline when shown on the card (~30% of articles include one — Google strips it from the rest).",
                        "default": true
                    },
                    "resolvePublisherUrls": {
                        "title": "Resolve direct publisher URLs",
                        "type": "boolean",
                        "description": "Recover the real publisher URL (e.g. nytimes.com/...) for each Google News /read/... link. Costs two extra HTTP requests per article (Google encrypts the publisher URL — decoding requires fetching a per-article signature first). Leave off unless you need direct publisher links.",
                        "default": false
                    },
                    "sortBy": {
                        "title": "Sort order",
                        "enum": [
                            "relevance",
                            "date"
                        ],
                        "type": "string",
                        "description": "Order of records in the output. 'relevance' keeps Google's native ordering; 'date' re-sorts by publication date (newest first) after fetching.",
                        "default": "relevance"
                    },
                    "deduplicateAcrossKeywords": {
                        "title": "Deduplicate across keywords",
                        "type": "boolean",
                        "description": "When you search multiple keywords or start URLs, keep only one copy of any article that appears in more than one set of results, attributed to whichever keyword or start URL returned it first. By default each keyword's results are independent.",
                        "default": false
                    },
                    "maxResults": {
                        "title": "Max results per keyword",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum number of articles to return for each keyword or start URL. Use 0 for as many as Google returns. Without a date range, Google caps responses at ~100 articles per keyword; with a date range, the scraper can return thousands.",
                        "default": 50
                    },
                    "maxRequestsPerKeyword": {
                        "title": "Max requests per keyword (date-range mode)",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Safety limit on how many date-window queries the scraper will fire for a single keyword. Raise for wide ranges on high-volume keywords; lower to cap compute. Only applies when a custom date range is set.",
                        "default": 300
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
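
The input constraints in the schema above (date pattern, enum values, numeric minimums) can be checked locally before a run is started. The following is a minimal Python sketch under stated assumptions: the helper name `check_input` is hypothetical and not part of the Actor or the Apify client, it only covers fields visible in the schema above, and the "start date not after end date" check is an extra sanity check that the schema itself does not state.

```python
import re
from datetime import date

# Values copied from the input schema above.
DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")
TIME_FILTERS = {"any", "hour", "24h", "week", "month", "year"}
SORT_ORDERS = {"relevance", "date"}


def check_input(run_input: dict) -> list[str]:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    parsed = {}

    for key in ("dateFrom", "dateTo"):
        value = run_input.get(key)
        if value is None:
            continue
        if not isinstance(value, str) or not DATE_PATTERN.fullmatch(value):
            problems.append(f"{key} must be in YYYY-MM-DD format, got {value!r}")
            continue
        try:
            parsed[key] = date.fromisoformat(value)
        except ValueError:
            # Matches the pattern but is not a real calendar date (e.g. month 13).
            problems.append(f"{key} is not a valid calendar date: {value!r}")

    if run_input.get("timeFilter", "any") not in TIME_FILTERS:
        problems.append("timeFilter must be one of " + ", ".join(sorted(TIME_FILTERS)))
    if run_input.get("sortBy", "relevance") not in SORT_ORDERS:
        problems.append("sortBy must be 'relevance' or 'date'")

    max_results = run_input.get("maxResults", 50)
    if not isinstance(max_results, int) or max_results < 0:
        problems.append("maxResults must be an integer >= 0")

    max_requests = run_input.get("maxRequestsPerKeyword", 300)
    if not isinstance(max_requests, int) or max_requests < 1:
        problems.append("maxRequestsPerKeyword must be an integer >= 1")

    # Assumption (not in the schema): a range with start after end is a mistake.
    if "dateFrom" in parsed and "dateTo" in parsed and parsed["dateFrom"] > parsed["dateTo"]:
        problems.append("dateFrom is later than dateTo")

    return problems


if __name__ == "__main__":
    candidate = {"dateFrom": "2024-01-01", "dateTo": "2024-12-31", "sortBy": "date"}
    print(check_input(candidate) or "input looks consistent with the schema")
```

Per the field descriptions above, `timeFilter` is ignored once both `dateFrom` and `dateTo` are set, so the sketch does not treat that combination as an error.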

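The run response described by `runsResponseSchema` nests every field under a top-level `data` object. As a small illustration, the Python sketch below pulls out the fields most often needed after starting a run. The function name `summarize_run` is hypothetical, and the sample values are made-up placeholders rather than real API output.

```python
def summarize_run(response: dict) -> dict:
    """Hypothetical helper: extract a few fields from a run response object."""
    data = response.get("data") or {}
    return {
        "run_id": data.get("id"),
        "status": data.get("status"),
        # Results are stored in the run's default dataset.
        "dataset_id": data.get("defaultDatasetId"),
        "cost_usd": data.get("usageTotalUsd"),
    }


# Placeholder response shaped like runsResponseSchema (values are invented).
sample = {
    "data": {
        "id": "RUN_ID_PLACEHOLDER",
        "status": "READY",
        "defaultDatasetId": "DATASET_ID_PLACEHOLDER",
        "usageTotalUsd": 0.00005,
    }
}

if __name__ == "__main__":
    print(summarize_run(sample))
```

Using `.get()` throughout keeps the helper tolerant of missing fields, since the schema above does not mark any of these properties as required.
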