# 📰 Google News Extractor / Scraper (`khadinakbar/google-news-scraper`) Actor

Extract Google News articles by keyword or topic. No login, API key, or cookies required. Bulk search queries, 50+ regions, full-text extraction for AI/RAG, deduplication, MCP-optimized output schema. Export to JSON/CSV or integrate via API.

- **URL**: https://apify.com/khadinakbar/google-news-scraper.md
- **Developed by:** [Khadin Akbar](https://apify.com/khadinakbar) (community)
- **Categories:** News, Automation
- **Stats:** 3 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $3.00 / 1,000 articles scraped

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 📰 Google News Extractor — No Login, Full Text & Bulk Search

Extract structured news article data from Google News by keyword, topic, or URL — no login, API key, cookies, or browser required. Built for developers, data teams, AI/LLM pipelines, and automated news monitoring workflows.

Runs on [Apify](https://apify.com): schedule it, call it via API, chain it with other Actors, or integrate it directly into your MCP-compatible AI agent.

---

### What It Does

This Actor fetches Google News RSS feeds, parses them into clean structured records, and optionally visits article pages to extract full body text. Each run produces a consistent JSON dataset you can export to CSV, Excel, JSON, or push to any downstream system.

**Output fields per article:** `title`, `source_name`, `source_url`, `google_news_url`, `published_at`, `description`, `image_url`, `search_query`, `topic`, `full_text`, `word_count`, `scraped_at`

---

### Key Features

**No login or API key required** — works out of the box with no authentication setup.

**Bulk keyword search** — pass multiple search queries in one run. Each query gets its own RSS feed and result set. Run 50 queries in a single call.

**Built-in topic sections** — monitor World, Technology, Business, Science, Health, Sports, Entertainment, or Nation headlines with a single click.

**50+ region/language editions** — scrape Google News in any edition: US English, German, French, Japanese, Arabic, Spanish, and 45+ more. Every language correctly sets `hl`, `gl`, and `ceid` parameters.

**Google search operators supported**: use quotes for exact phrases, minus to exclude terms, `OR` for alternatives, and `site:` to filter by publisher:

| Operator | Example | Effect |
|----------|---------|--------|
| `"exact phrase"` | `"interest rate hike"` | Match exact wording |
| `-keyword` | `apple -fruit` | Exclude term |
| `OR` | `AI OR "machine learning"` | Either term |
| `site:` | `site:reuters.com earnings` | Publisher filter |

**Time range filtering** — filter by past hour, 24 hours, 7 days, 30 days, or 1 year.

**Full article text extraction** — optionally visit each article URL and extract the full body text. Produces `full_text` and `word_count` fields. Designed for AI/RAG pipelines, sentiment analysis, and NLP workloads.

**Deduplication** — across multiple queries and topics, duplicate articles are removed automatically.

**Custom topic URLs** — paste any Google News section URL to extract niche topic feeds not covered by built-in sections.

**Structured output schema** — fully defined dataset schema with typed fields, descriptions, and examples. MCP-compatible: field names and descriptions are optimized for LLM tool-use routing.

---

### Input Options

| Field | Description |
|-------|-------------|
| `searchQueries` | Array of keywords/phrases. Supports Google operators. |
| `topics` | Built-in sections: WORLD, TECHNOLOGY, BUSINESS, etc. |
| `topicUrls` | Custom Google News section URLs (HTML or RSS format). |
| `startUrls` | Raw RSS feed URLs for advanced use. |
| `maxResultsPerQuery` | 1–100 articles per query/topic (default: 100). |
| `regionLanguage` | Edition code like `US:en`, `DE:de`, `JP:ja` (50+ options). |
| `timeRange` | `1h`, `1d`, `7d`, `30d`, `1y`, or `any`. |
| `extractFullText` | Visit article pages and extract body text. |
| `decodeUrls` | Attempt to resolve real article URLs from Google redirects. |
| `deduplicateResults` | Remove duplicate articles across queries (default: true). |

---

### Output Schema

Each article record has these fields:

| Field | Type | Description |
|-------|------|-------------|
| `title` | string | Article headline, cleaned (source name suffix removed) |
| `source_name` | string | Publisher name (e.g. "Reuters", "BBC News") |
| `source_url` | string\|null | Real article URL (when resolvable) |
| `google_news_url` | string | Google News redirect URL (always present) |
| `published_at` | string | ISO 8601 publication datetime |
| `description` | string\|null | Article snippet/summary |
| `image_url` | string\|null | Article thumbnail image URL |
| `search_query` | string\|null | The query that found this article |
| `topic` | string\|null | Built-in topic section (if applicable) |
| `full_text` | string\|null | Full article body text (requires `extractFullText`) |
| `word_count` | integer\|null | Word count of full text (requires `extractFullText`) |
| `scraped_at` | string | ISO 8601 extraction timestamp |

---

### Usage Examples

#### Monitor a brand or topic daily
```json
{
  "searchQueries": ["OpenAI", "Anthropic", "Google DeepMind"],
  "maxResultsPerQuery": 20,
  "timeRange": "1d",
  "deduplicateResults": true
}
````

#### Collect top headlines across sections

```json
{
  "topics": ["TECHNOLOGY", "BUSINESS", "SCIENCE", "HEALTH"],
  "maxResultsPerQuery": 10
}
```

#### AI/RAG pipeline — full text extraction

```json
{
  "searchQueries": ["large language models", "AI regulation EU"],
  "maxResultsPerQuery": 15,
  "extractFullText": true,
  "timeRange": "7d"
}
```

#### Monitor non-English news

```json
{
  "searchQueries": ["Künstliche Intelligenz", "Bundesliga"],
  "regionLanguage": "DE:de",
  "maxResultsPerQuery": 25
}
```

#### Bulk competitive intelligence

```json
{
  "searchQueries": [
    "Tesla earnings", "Ford EV", "GM electric",
    "Rivian news", "Lucid Motors", "NIO stock",
    "BYD sales", "Volkswagen EV", "BMW electric", "Mercedes EV"
  ],
  "maxResultsPerQuery": 10,
  "timeRange": "7d",
  "deduplicateResults": true
}
```

---

### Performance & Cost

| Mode | Articles/min | Cost estimate |
|------|-------------|---------------|
| Metadata only | ~200 | $0.003 per article |
| With full text | ~30–60 | $0.003 + $0.005 per article |

**Example costs:**

- 100 articles (metadata): **$0.30**
- 100 articles (with full text): **$0.80**
- 1,000 articles (metadata only): **$3.00**
- Daily monitoring, 10 queries × 20 articles: **$0.60/day**

Google News RSS returns up to 100 articles per feed. A single run with 10 search queries can collect up to 1,000 articles.
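
The example costs above follow from simple per-article arithmetic. The sketch below reproduces them, assuming the two per-article event prices from the table and that every query returns its full quota with no duplicates removed. The function name is illustrative and not part of the Actor, and Apify platform usage is billed separately and is not included.

```python
from decimal import Decimal

# Per-article event prices taken from the table above (USD).
METADATA_PRICE = Decimal("0.003")
FULL_TEXT_SURCHARGE = Decimal("0.005")


def estimate_event_cost(queries: int, articles_per_query: int, full_text: bool = False) -> Decimal:
    """Upper-bound event cost: assumes each query yields its full quota of articles."""
    articles = queries * articles_per_query
    per_article = METADATA_PRICE + (FULL_TEXT_SURCHARGE if full_text else Decimal("0"))
    return articles * per_article


print(estimate_event_cost(10, 20))        # 0.600 (daily monitoring example)
print(estimate_event_cost(1, 100, True))  # 0.800 (100 articles with full text)
```

Because deduplication and sparse feeds can only lower the article count, the real event charge should not exceed this figure for the same input.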

---

### API & MCP Integration

#### Call via Apify API

```bash
curl -X POST "https://api.apify.com/v2/acts/khadinakbar~google-news-scraper/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"searchQueries": ["artificial intelligence"], "maxResultsPerQuery": 10}'
```

#### Use in an AI agent (MCP)

This Actor is designed for use via the [Apify MCP server](https://apify.com/apify/actors-mcp-server). When an LLM agent calls `call-actor` with `khadinakbar/google-news-scraper`, the typed input and output schemas give it explicit field names and descriptions to work from, which reduces the chance of malformed or invented parameters.

**Typical LLM agent prompt:**

> "Get the latest 10 news articles about AI regulation from the past week using Google News"

**Agent call:**

```json
{
  "searchQueries": ["AI regulation"],
  "maxResultsPerQuery": 10,
  "timeRange": "7d"
}
```

---

### Comparison vs. Competitors

| Feature | This Actor | data\_xplorer | easyapi | scrapestorm |
|---------|-----------|-------------|---------|-------------|
| Output schema (typed) | ✅ | ❌ | ❌ | ❌ |
| MCP-optimized fields | ✅ | ❌ | ❌ | ❌ |
| Bulk keywords (1 run) | ✅ | ✅ | ❌ | ✅ |
| Custom topic URLs | ✅ | ❌ | ❌ | ❌ |
| Full text extraction | ✅ | ❌ | ❌ | ❌ |
| Deduplication | ✅ | ❌ | ❌ | ❌ |
| 50+ regions | ✅ | ✅ | ✅ | ✅ |
| Dataset schema | ✅ | ❌ | ❌ | ❌ |
| Limited permissions | ✅ | ❌ | ❌ | ❌ |

---

### Notes

- **`source_url`** may be null for some articles. Google News encodes article URLs using a Base64 scheme that requires JavaScript to decode. The `decodeUrls` option attempts HTTP redirect following but cannot resolve all URLs. The `google_news_url` field is always populated and can be used to visit the article in a browser.
- **`full_text`** requires `source_url` to be resolvable. Articles where `source_url` is null will have `full_text: null` even when `extractFullText` is enabled.
- **Rate limits**: Google News RSS feeds are publicly accessible, and this Actor keeps its request rate within safe limits.
- **No blocked content**: This Actor only fetches publicly available RSS data and article pages. It does not bypass paywalls.
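
When consuming the dataset, the caveats about `source_url` and `full_text` usually come down to a null check and a fallback link. Here is a minimal sketch, assuming records shaped like the output schema. The helper names and the sample records are hypothetical.

```python
def best_link(article: dict) -> str:
    """Prefer the resolved publisher URL; fall back to the Google News redirect URL."""
    return article.get("source_url") or article["google_news_url"]


def split_by_full_text(articles: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate records that carry body text from those where none was extracted."""
    with_text = [a for a in articles if a.get("full_text")]
    without_text = [a for a in articles if not a.get("full_text")]
    return with_text, without_text


# Hypothetical sample records, trimmed to the fields used here.
sample = [
    {
        "title": "A",
        "source_url": "https://example.com/a",
        "google_news_url": "https://news.google.com/rss/articles/abc",
        "full_text": "Body text",
    },
    {
        "title": "B",
        "source_url": None,
        "google_news_url": "https://news.google.com/rss/articles/def",
        "full_text": None,
    },
]

ready, skipped = split_by_full_text(sample)
print([a["title"] for a in ready])      # ['A']
print([best_link(a) for a in skipped])  # ['https://news.google.com/rss/articles/def']
```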

---

### Support

Found an issue or want a feature? Open a request via the Issues tab on the Actor page.

# Actor input Schema

## `searchQueries` (type: `array`):

List of search keywords or phrases to look up on Google News. Each query fetches its own RSS feed. Supports Google search operators: use quotes for exact match ("climate change"), minus to exclude (-bitcoin), OR for alternatives (AI OR "machine learning"), site: to filter by publisher (site:reuters.com). Leave empty if using topics, topicUrls, or startUrls instead.

## `topics` (type: `array`):

Select from Google News built-in topic sections. Each selected topic fetches its own RSS feed of top headlines. Use this for broad category monitoring without keywords. Can be combined with searchQueries.

## `topicUrls` (type: `array`):

Advanced: Paste the URL of any Google News section, topic page, or custom RSS feed directly. Both HTML page URLs (https://news.google.com/topics/...) and RSS URLs (https://news.google.com/rss/topics/...) are accepted — they are automatically converted. Use this for niche topics not covered by built-in topic sections.

## `startUrls` (type: `array`):

Advanced: Provide raw Google News RSS feed URLs directly. Use for custom queries already formatted as RSS (e.g. from Google Alerts exports). Each URL must be a valid RSS feed returning XML.

## `maxResultsPerQuery` (type: `integer`):

Maximum number of articles to extract per search query or topic feed. Google News RSS feeds return up to 100 articles per request. Default is 100. For bulk jobs with many queries, set lower (e.g. 10–20) to stay within budget.

## `regionLanguage` (type: `string`):

Controls the Google News edition to query — determines language, regional sources, and geographically relevant articles. Format: COUNTRY\_CODE:language\_code (e.g. US:en, GB:en, DE:de, FR:fr, JP:ja). Defaults to US:en (US English). Use this to monitor non-English news or regional publications.
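
To show how an edition code relates to the `hl`, `gl`, and `ceid` feed parameters, the sketch below builds a search feed URL in the same shape as the `startUrls` example in this document. It is an assumption that the Actor splits the code this way internally. The function name is hypothetical, and some Google News editions may expect a locale-style `hl` value that this simple split does not produce.

```python
from urllib.parse import urlencode


def build_search_feed_url(query: str, region_language: str = "US:en") -> str:
    """Split 'COUNTRY:language' into gl/hl and reuse the full code as ceid."""
    country, language = region_language.split(":", 1)
    params = {"q": query, "hl": language, "gl": country, "ceid": region_language}
    # Keep ':' unescaped so ceid reads 'US:en', as in the startUrls example.
    return "https://news.google.com/rss/search?" + urlencode(params, safe=":")


print(build_search_feed_url("bitcoin"))
# https://news.google.com/rss/search?q=bitcoin&hl=en&gl=US&ceid=US:en
```

A URL built like this could be passed through `startUrls` for cases the built-in inputs do not cover.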

## `timeRange` (type: `string`):

Filter articles by how recently they were published. Use '1h' for breaking news, '1d' for daily monitoring, '7d' for weekly digests. Defaults to 'any' (no time filter — returns all available articles).

## `extractFullText` (type: `boolean`):

When enabled, the actor visits each article page and extracts the full body text. Produces a full\_text field and word\_count field on each record. Ideal for AI/LLM pipelines, RAG (Retrieval-Augmented Generation), sentiment analysis, and NLP workloads. Requires source\_url to be resolvable — increases run time and cost (additional article-full-text charge applies per article with text extracted).

## `decodeUrls` (type: `boolean`):

When enabled, attempts to resolve the real article URL (source\_url) by following Google News redirect links. Note: Google News uses JavaScript-based URL encoding that cannot be fully resolved via HTTP redirects alone — source\_url may still be null for some articles. Increases run time. Disable for faster metadata-only extraction.

## `deduplicateResults` (type: `boolean`):

When enabled (default), removes duplicate articles across queries and topics based on URL. Prevents the same article from appearing multiple times when it matches several search queries or topics simultaneously. Disable only if you need to know which queries each article appeared in (the search\_query field tracks this).
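
The trade-off described here can be illustrated with a small URL-keyed filter. This is a sketch only: which URL field the Actor uses as the key and which copy it keeps are not documented, so the choices below (prefer `source_url`, fall back to `google_news_url`, keep the first occurrence) are assumptions, as are the sample records.

```python
def deduplicate_by_url(articles: list[dict]) -> list[dict]:
    """Keep the first record seen for each URL and drop later repeats."""
    seen: set[str] = set()
    unique: list[dict] = []
    for article in articles:
        key = article.get("source_url") or article.get("google_news_url")
        if key in seen:
            continue
        seen.add(key)
        unique.append(article)
    return unique


# Hypothetical records: the same article matched by two different queries.
records = [
    {"google_news_url": "https://news.google.com/rss/articles/abc", "search_query": "OpenAI"},
    {"google_news_url": "https://news.google.com/rss/articles/abc", "search_query": "Anthropic"},
    {"google_news_url": "https://news.google.com/rss/articles/xyz", "search_query": "Anthropic"},
]

for article in deduplicate_by_url(records):
    print(article["search_query"], article["google_news_url"])
```

Under these assumptions the shared article survives only with `search_query` set to `"OpenAI"`, which is why deduplication has to be disabled to see every query an article matched.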

## Actor input object example

```json
{
  "searchQueries": [
    "artificial intelligence",
    "climate change",
    "stock market crash"
  ],
  "topics": [
    "TECHNOLOGY",
    "BUSINESS"
  ],
  "topicUrls": [
    "https://news.google.com/topics/CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB"
  ],
  "startUrls": [
    {
      "url": "https://news.google.com/rss/search?q=bitcoin&hl=en&gl=US&ceid=US:en"
    }
  ],
  "maxResultsPerQuery": 25,
  "regionLanguage": "US:en",
  "timeRange": "1d",
  "extractFullText": false,
  "decodeUrls": false,
  "deduplicateResults": true
}
```

# Actor output Schema

## `results` (type: `string`):

Array of news article objects from Google News RSS feeds. Each record has consistent fields with null for missing values.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

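// Prepare Actor input (sample values from the README; adjust as needed)
const input = {
    searchQueries: ['artificial intelligence'],
    maxResultsPerQuery: 10,
};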

// Run the Actor and wait for it to finish
const run = await client.actor("khadinakbar/google-news-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

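# Prepare the Actor input (sample values from the README; adjust as needed)
run_input = {
    "searchQueries": ["artificial intelligence"],
    "maxResultsPerQuery": 10,
}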

# Run the Actor and wait for it to finish
run = client.actor("khadinakbar/google-news-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{"searchQueries": ["artificial intelligence"], "maxResultsPerQuery": 10}' |
apify call khadinakbar/google-news-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=khadinakbar/google-news-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "📰 Google News Extractor / Scraper",
        "description": "Extract Google News articles by keyword or topic. No login, API key, or cookies required. Bulk search queries, 50+ regions, full-text extraction for AI/RAG, deduplication, MCP-optimized output schema. Export to JSON/CSV or integrate via API.",
        "version": "0.1",
        "x-build-id": "8yqko3Zb7fWSPymeD"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/khadinakbar~google-news-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-khadinakbar-google-news-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/khadinakbar~google-news-scraper/runs": {
            "post": {
                "operationId": "runs-sync-khadinakbar-google-news-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/khadinakbar~google-news-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-khadinakbar-google-news-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchQueries": {
                        "title": "Search Queries (Keywords)",
                        "type": "array",
                        "description": "List of search keywords or phrases to look up on Google News. Each query fetches its own RSS feed. Supports Google search operators: use quotes for exact match (\"climate change\"), minus to exclude (-bitcoin), OR for alternatives (AI OR \"machine learning\"), site: to filter by publisher (site:reuters.com). Leave empty if using topics, topicUrls, or startUrls instead.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "topics": {
                        "title": "Built-in Google News Topics",
                        "type": "array",
                        "description": "Select from Google News built-in topic sections. Each selected topic fetches its own RSS feed of top headlines. Use this for broad category monitoring without keywords. Can be combined with searchQueries.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "WORLD",
                                "NATION",
                                "BUSINESS",
                                "TECHNOLOGY",
                                "ENTERTAINMENT",
                                "SPORTS",
                                "SCIENCE",
                                "HEALTH"
                            ],
                            "enumTitles": [
                                "🌍 World",
                                "🗽 Nation (US)",
                                "💼 Business",
                                "💻 Technology",
                                "🎬 Entertainment",
                                "⚽ Sports",
                                "🔬 Science",
                                "❤️ Health"
                            ]
                        }
                    },
                    "topicUrls": {
                        "title": "Custom Google News Section URLs",
                        "type": "array",
                        "description": "Advanced: Paste the URL of any Google News section, topic page, or custom RSS feed directly. Both HTML page URLs (https://news.google.com/topics/...) and RSS URLs (https://news.google.com/rss/topics/...) are accepted — they are automatically converted. Use this for niche topics not covered by built-in topic sections.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "startUrls": {
                        "title": "Direct RSS Feed URLs",
                        "type": "array",
                        "description": "Advanced: Provide raw Google News RSS feed URLs directly. Use for custom queries already formatted as RSS (e.g. from Google Alerts exports). Each URL must be a valid RSS feed returning XML.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "maxResultsPerQuery": {
                        "title": "Max Articles Per Query / Topic",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Maximum number of articles to extract per search query or topic feed. Google News RSS feeds return up to 100 articles per request. Default is 100. For bulk jobs with many queries, set lower (e.g. 10–20) to stay within budget.",
                        "default": 100
                    },
                    "regionLanguage": {
                        "title": "Region & Language",
                        "enum": [
                            "US:en",
                            "GB:en",
                            "AU:en",
                            "CA:en",
                            "IN:en",
                            "DE:de",
                            "AT:de",
                            "CH:de",
                            "FR:fr",
                            "BE:fr",
                            "CH:fr",
                            "ES:es",
                            "MX:es",
                            "AR:es",
                            "CO:es",
                            "IT:it",
                            "PT:pt",
                            "BR:pt",
                            "NL:nl",
                            "PL:pl",
                            "RU:ru",
                            "JP:ja",
                            "CN:zh-Hans",
                            "TW:zh-Hant",
                            "KR:ko",
                            "SA:ar",
                            "EG:ar",
                            "TR:tr",
                            "IL:he",
                            "SE:sv",
                            "NO:no",
                            "DK:da",
                            "FI:fi",
                            "CZ:cs",
                            "HU:hu",
                            "RO:ro",
                            "GR:el",
                            "UA:uk",
                            "ID:id",
                            "TH:th",
                            "VN:vi",
                            "NG:en",
                            "ZA:en",
                            "KE:en",
                            "GH:en",
                            "PK:en",
                            "BD:en",
                            "PH:en",
                            "SG:en",
                            "NZ:en",
                            "IE:en"
                        ],
                        "type": "string",
                        "description": "Controls the Google News edition to query — determines language, regional sources, and geographically relevant articles. Format: COUNTRY_CODE:language_code (e.g. US:en, GB:en, DE:de, FR:fr, JP:ja). Defaults to US:en (US English). Use this to monitor non-English news or regional publications.",
                        "default": "US:en"
                    },
                    "timeRange": {
                        "title": "Time Range (Published Within)",
                        "enum": [
                            "any",
                            "1h",
                            "1d",
                            "7d",
                            "30d",
                            "1y"
                        ],
                        "type": "string",
                        "description": "Filter articles by how recently they were published. Use '1h' for breaking news, '1d' for daily monitoring, '7d' for weekly digests. Defaults to 'any' (no time filter — returns all available articles).",
                        "default": "any"
                    },
                    "extractFullText": {
                        "title": "Extract Full Article Text",
                        "type": "boolean",
                        "description": "When enabled, the actor visits each article page and extracts the full body text. Produces a full_text field and word_count field on each record. Ideal for AI/LLM pipelines, RAG (Retrieval-Augmented Generation), sentiment analysis, and NLP workloads. Requires source_url to be resolvable — increases run time and cost (additional article-full-text charge applies per article with text extracted).",
                        "default": false
                    },
                    "decodeUrls": {
                        "title": "Decode Real Article URLs",
                        "type": "boolean",
                        "description": "When enabled, attempts to resolve the real article URL (source_url) by following Google News redirect links. Note: Google News uses JavaScript-based URL encoding that cannot be fully resolved via HTTP redirects alone — source_url may still be null for some articles. Increases run time. Disable for faster metadata-only extraction.",
                        "default": false
                    },
                    "deduplicateResults": {
                        "title": "Deduplicate Results",
                        "type": "boolean",
                        "description": "When enabled (default), removes duplicate articles across queries and topics based on URL. Prevents the same article from appearing multiple times when it matches several search queries or topics simultaneously. Disable only if you need to know which queries each article appeared in (the search_query field tracks this).",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
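
The run response described by `runsResponseSchema` nests everything under a top-level `data` object. As a minimal sketch, assuming the response body has already been parsed into a Python dict, the helper below pulls out the fields most integrations need: the run ID, its status, the default dataset ID, and the cost. `summarize_run` is a hypothetical name, not part of any Apify library, and the sample payload values are made up for illustration.

```python
from typing import Any, Dict

# Sample payload shaped like runsResponseSchema.
# All values are invented for illustration only.
SAMPLE_RUN_RESPONSE: Dict[str, Any] = {
    "data": {
        "id": "run-123",
        "status": "READY",
        "defaultDatasetId": "dataset-456",
        "usageTotalUsd": 0.00005,
        "usageUsd": {"KEY_VALUE_STORE_WRITES": 0.00005, "DATASET_WRITES": 0},
    }
}


def summarize_run(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract commonly needed fields from a run response (hypothetical helper)."""
    data = response.get("data", {})
    usage_usd = data.get("usageUsd", {})
    # Keep only the cost categories with a non-zero charge.
    nonzero_costs = {name: cost for name, cost in usage_usd.items() if cost}
    return {
        "run_id": data.get("id"),
        "status": data.get("status"),
        "dataset_id": data.get("defaultDatasetId"),
        "total_usd": data.get("usageTotalUsd", 0.0),
        "nonzero_costs": nonzero_costs,
    }


if __name__ == "__main__":
    print(summarize_run(SAMPLE_RUN_RESPONSE))
```

The `dataset_id` value is the one to pass on when fetching the scraped articles, since results are stored in the run's default dataset.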

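The `deduplicateResults` description notes that disabling deduplication lets you see which queries each article appeared in, via the `search_query` field. A rough sketch of that grouping is shown below. It assumes each dataset record is a dict carrying `search_query` plus some URL field; the default key name `url` is a guess, not confirmed by the schema above, so pass the real field name as `url_key`. `queries_per_article` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def queries_per_article(
    records: Iterable[Dict[str, Any]],
    url_key: str = "url",  # assumed field name; adjust to the actual output schema
) -> Dict[str, List[str]]:
    """Map each article URL to the distinct search queries that returned it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for record in records:
        url = record.get(url_key)
        query = record.get("search_query")
        if url is None or query is None:
            continue  # skip records that cannot be attributed
        if query not in grouped[url]:
            grouped[url].append(query)
    return dict(grouped)


if __name__ == "__main__":
    # Invented records, as they might look with deduplicateResults set to false.
    sample = [
        {"url": "https://example.com/a", "search_query": "ai"},
        {"url": "https://example.com/a", "search_query": "chips"},
        {"url": "https://example.com/b", "search_query": "ai"},
    ]
    print(queries_per_article(sample))
```

With deduplication left at its default of `true`, each URL appears once, so this grouping only makes sense on runs where `deduplicateResults` was disabled.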