# Google News Scraper - Articles, Sources & Monitoring (`scrapesage/google-news-scraper`) Actor

Scrape Google News by keyword, topic, top headlines or city. Get real publisher URLs (decoded, not redirect links), source, date, snippet, related coverage and full article text, author & image. Monitor mode returns only new articles. Export JSON, CSV, Excel.

- **URL**: https://apify.com/scrapesage/google-news-scraper.md
- **Developed by:** [Scrape Sage](https://apify.com/scrapesage) (community)
- **Categories:** News, Automation, AI
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

From $4.00 / 1,000 scraped articles

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
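As a rough illustration of the pay-per-event arithmetic, the sketch below estimates a run's cost from the starting price listed above. The function name is hypothetical, and because the price is quoted as "from $4.00", the result is best read as a lower-bound estimate rather than a quote.

```python
from decimal import Decimal

# Starting price listed above: $4.00 per 1,000 scraped articles.
PRICE_PER_1000_ARTICLES = Decimal("4.00")


def estimate_min_cost(article_count: int) -> Decimal:
    """Return the estimated minimum cost in USD for `article_count` articles."""
    if article_count < 0:
        raise ValueError("article_count must be non-negative")
    cost = PRICE_PER_1000_ARTICLES * article_count / 1000
    return cost.quantize(Decimal("0.01"))


print(estimate_min_cost(200))  # 0.80
```

Setting `maxItems` to the same number used here keeps the run within the estimated budget.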

## What's an Apify Actor?

Actors are software tools running on the Apify platform, built for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) and as [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Google News Scraper — Articles, Real URLs & Monitoring (Source, Date, Full Text)

Extract **complete Google News data** by keyword, topic, top headlines, or city — including the field other scrapers get wrong: the **real publisher URL**. This actor decodes Google News' redirect links into the actual article URL (`https://www.reuters.com/...`), then optionally opens each article for the **full text, author, publish date, lead image, section, and keywords**. Turn on **monitor mode** to return only articles you haven't seen before — perfect for brand, competitor, and topic tracking.

No login, no cookies, no browser, no API key — fast RSS + JSON extraction with 99%+ reliability.

### Why this Google News scraper?

Most Google News scrapers hand back the encoded `news.google.com/rss/articles/CBMi…` redirect link — useless for outreach, de-duplication, or content extraction — and stop at the headline. This actor ships the **richest dataset in the category**:

| Data | Typical scrapers | This actor |
|---|---|---|
| Real publisher URL (decoded, not a redirect) | ❌ Google redirect link | ✅ `https://publisher.com/article` |
| Source / publisher name | partial | ✅ |
| Publish date (ISO) | partial | ✅ |
| Snippet / description | partial | ✅ |
| **Related coverage cluster** (other outlets on the same story) | ❌ | ✅ |
| **Full article text** | ❌ | ✅ opt-in |
| Author(s), publish & modified dates | ❌ | ✅ opt-in |
| Lead image, section, keywords, word count | ❌ | ✅ opt-in |
| Search / topic / local / headlines feeds in one actor | partial | ✅ |
| **Monitor mode** — only new articles since last run | ❌ | ✅ |
| Any language & country edition | partial | ✅ |

### Use cases

- **Media & brand monitoring** — track every mention of your brand, product, or executives across thousands of publishers. Run on a [Schedule](https://docs.apify.com/platform/schedules) with **monitor mode** to get only the new articles each time.
- **Competitor & market intelligence** — watch competitors, categories, and industry topics; feed alerts into Slack or a CRM the moment something breaks.
- **PR & reputation tracking** — measure coverage volume, see which outlets pick up a story (via the related-coverage cluster), and capture author bylines for outreach.
- **AI, RAG & LLM pipelines** — clean, LLM-ready JSON with full article text is ideal for summarization, sentiment, embeddings, and retrieval-augmented generation.
- **News aggregation & newsletters** — power apps, digests, and dashboards with structured, deduplicated news for any keyword, topic, or city.
- **Finance & trading signals** — pull the latest news for tickers, companies, and sectors with precise timestamps and real source URLs.

### How to use

1. [Sign up for Apify](https://console.apify.com/sign-up) — the free plan is enough to try this actor.
2. Open the **Google News Scraper**, enter **search terms** (and/or **topics**, **locations**, or paste Google News **URLs**), and click **Start**.
3. Watch results stream into the dataset table.
4. **Export** as JSON, CSV, Excel, XML, or RSS — or pull results programmatically via the [Apify API](https://docs.apify.com/api/v2).

### Input

```json
{
    "searchTerms": ["electric vehicles", "\"interest rates\""],
    "topics": ["TECHNOLOGY", "BUSINESS"],
    "locations": ["San Francisco"],
    "includeTopHeadlines": false,
    "language": "en-US",
    "country": "US",
    "timeWindow": "7d",
    "resolveArticleUrls": true,
    "includeArticleContent": true,
    "maxItems": 200,
    "monitorMode": true,
    "monitorKey": "ev-watch"
}
````

- **searchTerms** — keywords/phrases. Google News operators work: `"exact phrase"`, `OR`, `-exclude`, `site:reuters.com`, `intitle:`, `when:7d`.
- **topics** — `WORLD`, `NATION`, `BUSINESS`, `TECHNOLOGY`, `ENTERTAINMENT`, `SPORTS`, `SCIENCE`, `HEALTH`, or a raw `CAAq…` topic token.
- **locations** — cities/regions for local news (`New York`, `California`, `London`).
- **includeTopHeadlines** — also pull the top-stories feed for your edition.
- **startUrls** — paste Google News search/topic/RSS/article URLs (or any publisher article URL) directly.
- **language / country** — the Google News edition (`hl` / `gl`), e.g. `en-GB` + `GB`, `de` + `DE`, `pt-BR` + `BR`.
- **timeWindow** — restrict search-term feeds to recent articles (`1h`, `12h`, `1d`, `7d`, `1y`).
- **resolveArticleUrls** *(default true)* — decode every Google News link into the real publisher URL.
- **includeArticleContent** *(default false)* — open each article for full text, author, dates, image, section, keywords, word count.
- **includeRelatedArticles** *(default true)* — attach the related-coverage cluster for each story.
- **maxItems / maxItemsPerFeed** — output caps to control cost.
- **sinceDate** — keep only articles since an ISO date or relative window (`24h`, `3d`, `2w`).
- **includeKeywords / excludeKeywords / includeSources / excludeSources** — client-side filters.
- **dedupeByResolvedUrl** *(default true)* — drop the same story repeated across feeds.
- **monitorMode / monitorKey** *(default false / `default`)* — return only articles not seen in previous runs.

### Output

One record per article (`type: "article"`):

```json
{
    "type": "article",
    "title": "The Cybercab is the lightest, most efficient Tesla ever made",
    "source": "The Verge",
    "url": "https://www.theverge.com/transportation/950596/tesla-cybercab-efficient-weight-range-epa",
    "urlResolved": true,
    "googleNewsUrl": "https://news.google.com/rss/articles/CBMiqwFBVV95cUx…",
    "publishedAt": "2026-06-16T15:02:39.000Z",
    "snippet": "Against all odds, the Tesla Cybercab is in production…",
    "imageUrl": "https://platform.theverge.com/wp-content/uploads/…/cybercab.jpg",
    "feedType": "search",
    "query": "tesla",
    "topic": null,
    "locationQuery": null,
    "language": "en-US",
    "country": "US",
    "relatedArticles": [
        { "title": "Tesla starts Cybercab production", "source": "Electrek", "googleNewsUrl": "https://news.google.com/rss/articles/CBMi…" }
    ],
    "relatedCount": 2,
    "contentExtracted": true,
    "author": "Andrew J. Hawkins",
    "authors": ["Andrew J. Hawkins"],
    "articlePublishedAt": "2026-06-16T15:02:39.000Z",
    "articleModifiedAt": "2026-06-16T15:18:02.000Z",
    "section": "Transportation",
    "keywords": ["Autonomous Cars", "Electric Cars", "Tesla", "Transportation"],
    "wordCount": 496,
    "canonicalUrl": "https://www.theverge.com/transportation/950596/tesla-cybercab-efficient-weight-range-epa",
    "fullText": "Against all odds, the Tesla Cybercab is in production. And while…",
    "scrapedAt": "2026-06-16T16:49:35.000Z"
}
```

#### What to expect (field coverage)

| Field group | Always present | Present when enabled / published |
|---|---|---|
| **Core** | title, source, googleNewsUrl, publishedAt, feedType, discovery context | snippet (search feeds & content) |
| **Real URL** | — | `url` + `urlResolved` with `resolveArticleUrls` (~95–100% resolve) |
| **Related coverage** | relatedCount | `relatedArticles` on clustered stories (topic/headlines/local) |
| **Full content** | — | fullText, author(s), dates, imageUrl, section, keywords, wordCount with `includeArticleContent` |

A blank field means the publisher didn't expose it (e.g. some sites omit author/section, and paywalled articles return only a summary) — never because the scraper skipped it. Nothing is dropped, so you always get the richest record available.
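Because optional fields can be blank, downstream code usually needs a fallback. The sketch below shows one way to pick the most useful link and text from a record. The helper names are hypothetical; the field names (`url`, `urlResolved`, `googleNewsUrl`, `contentExtracted`, `fullText`, `snippet`) are taken from the sample record above.

```python
from typing import Optional


def best_link(record: dict) -> Optional[str]:
    """Prefer the decoded publisher URL; fall back to the Google News link."""
    if record.get("urlResolved") and record.get("url"):
        return record["url"]
    return record.get("googleNewsUrl")


def best_text(record: dict) -> str:
    """Prefer the full text when content was extracted, else use the snippet."""
    if record.get("contentExtracted") and record.get("fullText"):
        return record["fullText"]
    return record.get("snippet") or ""


record = {
    "urlResolved": False,
    "googleNewsUrl": "https://news.google.com/rss/articles/EXAMPLE",  # placeholder
    "snippet": "Short summary",
}
print(best_link(record))  # https://news.google.com/rss/articles/EXAMPLE
print(best_text(record))  # Short summary
```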

### Automate & schedule

Run this actor on autopilot and pull results into your own stack:

- **[Apify API](https://docs.apify.com/api/v2)** — start runs, fetch datasets, and manage schedules over REST.
- **[apify-client for JavaScript](https://docs.apify.com/api/client/js/)** and **[apify-client for Python](https://docs.apify.com/api/client/python/)** — official SDKs.
- **[Schedules](https://docs.apify.com/platform/schedules)** — run it hourly/daily to monitor a brand, competitor, topic, or city. Pair with **monitor mode** so each run returns only the new articles.
- **[Webhooks](https://docs.apify.com/platform/integrations/webhooks)** — trigger downstream actions (CRM import, Slack alert, summarizer) the moment a run finishes.

```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'MY_APIFY_TOKEN' });

const run = await client.actor('scrapesage/google-news-scraper').call({
    searchTerms: ['"my brand"'],
    language: 'en-US',
    country: 'US',
    resolveArticleUrls: true,
    includeArticleContent: true,
    monitorMode: true,
    monitorKey: 'my-brand',
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Got ${items.length} new articles`);
```

### Integrate with any app

Connect the dataset to 5,000+ apps — no code required:

- **[Make](https://docs.apify.com/platform/integrations/make)** — multi-step automation scenarios.
- **[Zapier](https://docs.apify.com/platform/integrations/zapier)** — push new articles straight into Slack, Sheets, or your CRM.
- **[Slack](https://docs.apify.com/platform/integrations/slack)** — get notified when a monitored search finds news.
- **[Google Drive / Sheets](https://docs.apify.com/platform/integrations/drive)** — auto-export every run to a spreadsheet.
- **[Airbyte](https://docs.apify.com/platform/integrations/airbyte)** — pipe results into your data warehouse.
- **[GitHub](https://docs.apify.com/platform/integrations/github)** — trigger runs from commits or releases.

### Use with AI assistants (MCP)

The output is clean, LLM-ready JSON with full article text. Call this actor from Claude, ChatGPT, or any agent framework through the **[Apify MCP server](https://docs.apify.com/platform/integrations/mcp)** — ask your assistant to "monitor Google News for my company and summarize today's coverage" and let it run this scraper for you.

### More scrapers from scrapesage

Build a complete **media monitoring & market-intelligence stack**:

- **[Telegram Scraper](https://apify.com/scrapesage/telegram-scraper)** — channels, messages, media, and search.
- **[Substack Scraper](https://apify.com/scrapesage/substack-scraper)** — newsletters, posts, and creator leads.
- **[YouTube Scraper](https://apify.com/scrapesage/youtube-scraper)** — channels, videos, and creator leads.
- **[Bluesky Scraper](https://apify.com/scrapesage/bluesky-scraper)** — profiles, posts, followers, and leads.
- **[Google Ads Transparency Scraper](https://apify.com/scrapesage/google-ads-transparency-scraper)** — who's advertising what on Google.
- **[Facebook Ad Library Scraper](https://apify.com/scrapesage/facebook-ad-library-scraper)** — competitor ad intelligence on Meta & Instagram.
- **[Product Hunt Scraper](https://apify.com/scrapesage/product-hunt-scraper)** — launches, makers, and leads.
- **[GitHub Scraper](https://apify.com/scrapesage/github-scraper)** — repos, developers, and contact leads.

### Tips

- **Real URLs are the value** — keep `resolveArticleUrls` on. It decodes the Google News redirect into the actual publisher link so you can de-duplicate, click through, and extract content.
- **Full text** — turn on `includeArticleContent` for summarization, sentiment, and embeddings. It adds one fast request per article; paywalled sites return only a summary.
- **Monitoring** — combine [Schedules](https://docs.apify.com/platform/schedules) with `monitorMode` + a unique `monitorKey` per brand/topic to receive only new articles each run. Monitor mode is independent of the schedule — the schedule starts the run, monitor mode deduplicates against prior runs.
- **Coverage depth** — Google News returns up to ~100 articles per feed. To go wider, split into several `searchTerms`, add `topics`/`locations`, or narrow with `timeWindow`.
- **Editions** — set `language` + `country` to target a specific edition (e.g. `en-GB` + `GB`, `de` + `DE`). Local news uses `locations`.

### FAQ

**How do I get the real article URL instead of the Google News link?** It's automatic — `resolveArticleUrls` is on by default. Every record includes both `url` (the decoded publisher URL) and `googleNewsUrl` (the original Google News link).

**Does it need the Google News API or a key?** No. Google News has no public API for this; the actor reads the public RSS feeds and resolves URLs the same way a browser does — no key or login needed.

**Can I get the full article text?** Yes — enable `includeArticleContent`. The actor opens each article and extracts the body, author, dates, image, section, and keywords from the page's structured data, with a readability fallback.

**How do I monitor news automatically?** Create a [Schedule](https://docs.apify.com/platform/schedules) (e.g. hourly), turn on `monitorMode`, and give each watch-list a unique `monitorKey`. Each run returns only articles not seen before. Add a [webhook](https://docs.apify.com/platform/integrations/webhooks) or [Zapier zap](https://docs.apify.com/platform/integrations/zapier) to push them into Slack or your CRM.

**Does monitor mode conflict with Apify Schedules?** No — they're complementary. The schedule decides *when* the actor runs; monitor mode decides *what's new* by comparing against a named key-value store from previous runs.

**Can I scrape news in other languages and countries?** Yes. Set `language` (e.g. `fr`, `de`, `pt-BR`) and `country` (e.g. `FR`, `DE`, `BR`) to any Google News edition.

**Can I export to Google Sheets, CSV, or Excel?** Yes — one click in the dataset view, or automatically on every run via the [Google Drive integration](https://docs.apify.com/platform/integrations/drive).

**A field is empty — why?** Some publishers don't expose an author, section, or keywords, and paywalled articles return only a summary. Fields are blank only when the data isn't published — never because the scraper skipped them.

**Is scraping Google News legal?** This actor collects publicly available data only. You're responsible for using the data in compliance with applicable laws (e.g. GDPR/CCPA for personal data), Google's terms, and each publisher's terms — including copyright when storing full article text.

### Need help?

Open an issue on the actor's **Issues** tab, or visit the [Apify help center](https://help.apify.com/). Feature requests are welcome — this actor is actively maintained.

# Actor input Schema

## `searchTerms` (type: `array`):

Keywords or phrases to search Google News for. Each term runs as its own feed and returns up to ~100 of the latest matching articles. Google News search operators work too: quotes for exact phrases (`"electric vehicles"`), `OR`, `-exclude`, `site:reuters.com`, `intitle:`, `allintitle:`, and `when:7d` for a time window.
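To show how the operators combine into a single search term, here is a minimal sketch. The `build_search_term` helper and its parameter names are hypothetical; it only concatenates the operators listed above (`"exact phrase"`, `-exclude`, `site:`, `when:`) into one string suitable for `searchTerms`.

```python
def build_search_term(phrase, *, exact=False, site=None, exclude=(), when=None):
    """Combine a phrase with Google News search operators into one term."""
    parts = [f'"{phrase}"' if exact else phrase]
    parts += [f"-{word}" for word in exclude]  # words to exclude
    if site:
        parts.append(f"site:{site}")  # restrict to one publisher domain
    if when:
        parts.append(f"when:{when}")  # time window, e.g. 7d
    return " ".join(parts)


term = build_search_term(
    "electric vehicles", exact=True, site="reuters.com", exclude=["hybrid"], when="7d"
)
print(term)  # "electric vehicles" -hybrid site:reuters.com when:7d
```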

## `topics` (type: `array`):

Google News topic feeds. Use a friendly name — `WORLD`, `NATION`, `BUSINESS`, `TECHNOLOGY`, `ENTERTAINMENT`, `SPORTS`, `SCIENCE`, `HEALTH` — or paste a raw Google News topic token (the `CAAq…` string from a topic URL) for niche topics.

## `locations` (type: `array`):

Cities, states or places to pull local news for, e.g. `New York`, `San Francisco`, `California`, `London`. Each returns the Google News local headlines feed for that place.

## `includeTopHeadlines` (type: `boolean`):

Also pull the Google News top-stories feed for the chosen language/country.

## `startUrls` (type: `array`):

Paste Google News URLs directly: search pages (`news.google.com/search?q=…`), topic pages (`/topics/…`), RSS feeds (`/rss/search?q=…`), or individual article links (`/articles/…` / `/rss/articles/…`). Non-Google article URLs are also accepted and scraped for content directly.

## `language` (type: `string`):

Google News interface language, e.g. `en-US`, `en-GB`, `es`, `fr`, `de`, `pt-BR`. Controls the `hl` parameter.

## `country` (type: `string`):

Two-letter country/edition, e.g. `US`, `GB`, `CA`, `AU`, `IN`, `DE`. Controls the `gl` parameter and the news edition.
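The sketch below illustrates how `language` and `country` could map onto the `hl` and `gl` query parameters of a Google News RSS search URL. This is an assumption for illustration only: the Actor's actual request construction is not documented here, and real Google News URLs may carry additional parameters. The function name is hypothetical.

```python
from urllib.parse import urlencode


def rss_search_url(query: str, language: str = "en-US", country: str = "US") -> str:
    """Build an RSS search URL with the hl (language) and gl (country) parameters."""
    params = {"q": query, "hl": language, "gl": country}
    return "https://news.google.com/rss/search?" + urlencode(params)


print(rss_search_url("electric vehicles", "en-GB", "GB"))
# https://news.google.com/rss/search?q=electric+vehicles&hl=en-GB&gl=GB
```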

## `timeWindow` (type: `string`):

Restrict search-term feeds to recent articles using a Google News `when:` window — e.g. `1h`, `12h`, `1d`, `7d`, `1y`. Leave empty for no restriction. (Applies to `searchTerms`; topics/headlines/locations are always the latest.)

## `resolveArticleUrls` (type: `boolean`):

Decode each Google News redirect link into the article's REAL publisher URL (e.g. `https://www.reuters.com/...`). This is what makes the data usable for outreach, de-duplication and content extraction. Turn off for a faster, cheaper run that keeps only the Google News links.

## `includeArticleContent` (type: `boolean`):

Open each resolved article and extract the full text, author(s), publish/modified dates, lead image, section, keywords/tags and word count (from JSON-LD + meta tags with a readability fallback). Adds one request per article. Some paywalled sites expose only a summary.

## `includeRelatedArticles` (type: `boolean`):

Attach the cluster of related articles Google News groups under each story (title, source and Google News link for each).

## `maxItems` (type: `integer`):

Maximum number of articles to output across the whole run (hard cap to control cost).

## `maxItemsPerFeed` (type: `integer`):

Cap how many articles to take from each individual feed (0 = no per-feed cap; Google News returns up to ~100 per feed).

## `sinceDate` (type: `string`):

Keep only articles published on/after this point. Accepts an ISO date (`2026-06-01`) or a relative window (`24h`, `3d`, `2w`, `6m`).
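To make the two accepted forms concrete, the sketch below turns a `sinceDate` value into a UTC cutoff timestamp. It is an illustration, not the Actor's parser: the `since_cutoff` name is hypothetical, and the `m` unit is deliberately left out because its exact length is not specified above.

```python
import re
from datetime import datetime, timedelta, timezone

# Relative units handled by this sketch: hours, days, weeks.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def since_cutoff(value, now=None):
    """Return the earliest publish time to keep, as an aware UTC datetime."""
    now = now or datetime.now(timezone.utc)
    text = value.strip()
    match = re.fullmatch(r"(\d+)([hdw])", text)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(**{_UNITS[unit]: amount})
    # Otherwise expect an ISO date such as 2026-06-01 (raises ValueError if not).
    parsed = datetime.fromisoformat(text)
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


fixed_now = datetime(2026, 6, 16, tzinfo=timezone.utc)
print(since_cutoff("3d", now=fixed_now))  # 2026-06-13 00:00:00+00:00
print(since_cutoff("2026-06-01"))         # 2026-06-01 00:00:00+00:00
```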

## `includeKeywords` (type: `array`):

Keep only articles whose title or snippet contains at least one of these words/phrases (case-insensitive).

## `excludeKeywords` (type: `array`):

Drop articles whose title or snippet contains any of these words/phrases (case-insensitive).

## `includeSources` (type: `array`):

Keep only articles from these publishers (matches the source name, case-insensitive substring), e.g. `Reuters`, `BBC`, `TechCrunch`.

## `excludeSources` (type: `array`):

Drop articles from these publishers (source-name substring, case-insensitive).
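The four filters above can be pictured as a single predicate applied to each article. The sketch below is an illustration under stated assumptions, not the Actor's code: `keep_article` is a hypothetical name, and keyword matching is implemented as a plain case-insensitive substring check, which may differ from the Actor's exact matching rules.

```python
def _contains_any(text, needles):
    """Case-insensitive substring match against any of the needles."""
    lowered = text.lower()
    return any(needle.lower() in lowered for needle in needles)


def keep_article(article, include_keywords=(), exclude_keywords=(),
                 include_sources=(), exclude_sources=()):
    """Return True if the article passes all configured filters."""
    haystack = f"{article.get('title') or ''} {article.get('snippet') or ''}"
    source = article.get("source") or ""
    if include_keywords and not _contains_any(haystack, include_keywords):
        return False
    if exclude_keywords and _contains_any(haystack, exclude_keywords):
        return False
    if include_sources and not _contains_any(source, include_sources):
        return False
    if exclude_sources and _contains_any(source, exclude_sources):
        return False
    return True


article = {"title": "Tesla starts Cybercab production", "source": "Electrek"}
print(keep_article(article, include_keywords=["cybercab"]))  # True
print(keep_article(article, exclude_sources=["electrek"]))   # False
```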

## `dedupeByResolvedUrl` (type: `boolean`):

Remove duplicate articles (the same story can appear across several feeds) by their resolved publisher URL within a run.
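As a sketch of the idea, the function below keeps the first record for each resolved URL. The name `dedupe_by_url` is hypothetical, and falling back to `googleNewsUrl` when `url` is missing is an assumption made for this example, not documented Actor behavior.

```python
def dedupe_by_url(articles):
    """Keep the first article per URL, preserving the original order."""
    seen = set()
    unique = []
    for article in articles:
        # Assumption: fall back to the Google News link if no resolved URL exists.
        key = article.get("url") or article.get("googleNewsUrl")
        if key in seen:
            continue
        if key:
            seen.add(key)
        unique.append(article)
    return unique


feeds = [
    {"url": "https://example.com/story", "feedType": "search"},
    {"url": "https://example.com/story", "feedType": "topic"},
    {"url": "https://example.com/other", "feedType": "topic"},
]
print(len(dedupe_by_url(feeds)))  # 2
```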

## `monitorMode` (type: `boolean`):

Remember which articles were emitted in previous runs (in a named key-value store) and return ONLY articles not seen before — ideal for brand, competitor and topic monitoring. Works with Apify Schedules: the schedule starts the run, monitor mode deduplicates against prior runs.
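Conceptually, monitor mode behaves like a persistent "seen" set that survives between runs. The sketch below imitates that with an in-memory dictionary standing in for the named key-value store. It is a simplified model, not the Actor's implementation; in particular, using the article URL as the identity of an article is an assumption.

```python
def only_new(articles, seen_urls):
    """Return articles not yet in `seen_urls`, and record them as seen."""
    fresh = []
    for article in articles:
        key = article.get("url") or article.get("googleNewsUrl")
        if key and key not in seen_urls:
            seen_urls.add(key)
            fresh.append(article)
    return fresh


# Stand-in for the named store: one set of seen URLs per monitorKey.
store = {}

run1 = only_new([{"url": "https://example.com/a"}],
                store.setdefault("ev-watch", set()))
run2 = only_new([{"url": "https://example.com/a"}, {"url": "https://example.com/b"}],
                store.setdefault("ev-watch", set()))

print(len(run1))  # 1
print([a["url"] for a in run2])  # ['https://example.com/b']
```

A different `monitorKey` starts from an empty set, which is why separate watch-lists do not suppress each other's articles.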

## `monitorKey` (type: `string`):

Names the monitor store so you can keep separate watch-lists (e.g. one per brand or topic). Lowercase letters, numbers and hyphens.
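If watch-list names come from free text (for example a brand name), they need to be normalized to the allowed characters first. The sketch below is one possible way to do that; `to_monitor_key` is a hypothetical helper, not part of the Actor or the Apify client.

```python
import re


def to_monitor_key(name: str) -> str:
    """Normalize free text to lowercase letters, numbers and hyphens."""
    key = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not key:
        raise ValueError("name contains no usable characters")
    return key


print(to_monitor_key("EV Watch: Tesla!"))  # ev-watch-tesla
print(to_monitor_key("My_Brand"))          # my-brand
```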

## `maxConcurrency` (type: `integer`):

Maximum number of requests fetched in parallel (feed, URL-decode and content fetches).

## `proxyConfiguration` (type: `object`):

Proxy settings. Google News serves clean RSS to Apify datacenter IPs, so the default Apify proxy is plenty.

## Actor input object example

```json
{
  "searchTerms": [
    "artificial intelligence"
  ],
  "includeTopHeadlines": false,
  "language": "en-US",
  "country": "US",
  "timeWindow": "",
  "resolveArticleUrls": true,
  "includeArticleContent": false,
  "includeRelatedArticles": true,
  "maxItems": 100,
  "maxItemsPerFeed": 0,
  "sinceDate": "",
  "dedupeByResolvedUrl": true,
  "monitorMode": false,
  "monitorKey": "default",
  "maxConcurrency": 8,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# Actor output Schema

## `results` (type: `string`):

All scraped article records as JSON items in the default dataset.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchTerms": [
        "artificial intelligence"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("scrapesage/google-news-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "searchTerms": ["artificial intelligence"] }

# Run the Actor and wait for it to finish
run = client.actor("scrapesage/google-news-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchTerms": [
    "artificial intelligence"
  ]
}' |
apify call scrapesage/google-news-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scrapesage/google-news-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Google News Scraper - Articles, Sources & Monitoring",
        "description": "Scrape Google News by keyword, topic, top headlines or city. Get real publisher URLs (decoded, not redirect links), source, date, snippet, related coverage and full article text, author & image. Monitor mode returns only new articles. Export JSON, CSV, Excel.",
        "version": "1.0",
        "x-build-id": "m2BephXms5D9cbx0P"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scrapesage~google-news-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scrapesage-google-news-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scrapesage~google-news-scraper/runs": {
            "post": {
                "operationId": "runs-sync-scrapesage-google-news-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scrapesage~google-news-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-scrapesage-google-news-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchTerms": {
                        "title": "Search terms / keywords",
                        "type": "array",
                        "description": "Keywords or phrases to search Google News for. Each term runs as its own feed and returns up to ~100 of the latest matching articles. Google News search operators work too: quotes for exact phrases (`\"electric vehicles\"`), `OR`, `-exclude`, `site:reuters.com`, `intitle:`, `allintitle:`, and `when:7d` for a time window.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "topics": {
                        "title": "Topics",
                        "type": "array",
                        "description": "Google News topic feeds. Use a friendly name — `WORLD`, `NATION`, `BUSINESS`, `TECHNOLOGY`, `ENTERTAINMENT`, `SPORTS`, `SCIENCE`, `HEALTH` — or paste a raw Google News topic token (the `CAAq…` string from a topic URL) for niche topics.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "locations": {
                        "title": "Locations (local news)",
                        "type": "array",
                        "description": "Cities, states or places to pull local news for, e.g. `New York`, `San Francisco`, `California`, `London`. Each returns the Google News local headlines feed for that place.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "includeTopHeadlines": {
                        "title": "Include top headlines",
                        "type": "boolean",
                        "description": "Also pull the Google News top-stories feed for the chosen language/country.",
                        "default": false
                    },
                    "startUrls": {
                        "title": "Google News URLs",
                        "type": "array",
                        "description": "Paste Google News URLs directly: search pages (`news.google.com/search?q=…`), topic pages (`/topics/…`), RSS feeds (`/rss/search?q=…`), or individual article links (`/articles/…` / `/rss/articles/…`). Non-Google article URLs are also accepted and scraped for content directly.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "language": {
                        "title": "Language (hl)",
                        "type": "string",
                        "description": "Google News interface language, e.g. `en-US`, `en-GB`, `es`, `fr`, `de`, `pt-BR`. Controls the `hl` parameter.",
                        "default": "en-US"
                    },
                    "country": {
                        "title": "Country (gl)",
                        "type": "string",
                        "description": "Two-letter country/edition, e.g. `US`, `GB`, `CA`, `AU`, `IN`, `DE`. Controls the `gl` parameter and the news edition.",
                        "default": "US"
                    },
                    "timeWindow": {
                        "title": "Time window (search terms only)",
                        "type": "string",
                        "description": "Restrict search-term feeds to recent articles using a Google News `when:` window — e.g. `1h`, `12h`, `1d`, `7d`, `1y`. Leave empty for no restriction. (Applies to `searchTerms`; topics/headlines/locations are always the latest.)",
                        "default": ""
                    },
                    "resolveArticleUrls": {
                        "title": "Resolve real publisher URLs",
                        "type": "boolean",
                        "description": "Decode each Google News redirect link into the article's REAL publisher URL (e.g. `https://www.reuters.com/...`). This is what makes the data usable for outreach, de-duplication and content extraction. Turn off for a faster, cheaper run that keeps only the Google News links.",
                        "default": true
                    },
                    "includeArticleContent": {
                        "title": "Extract full article content",
                        "type": "boolean",
                        "description": "Open each resolved article and extract the full text, author(s), publish/modified dates, lead image, section, keywords/tags and word count (from JSON-LD + meta tags with a readability fallback). Adds one request per article. Some paywalled sites expose only a summary.",
                        "default": false
                    },
                    "includeRelatedArticles": {
                        "title": "Include related coverage",
                        "type": "boolean",
                        "description": "Attach the cluster of related articles Google News groups under each story (title, source and Google News link for each).",
                        "default": true
                    },
                    "maxItems": {
                        "title": "Max articles",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of articles to output across the whole run (hard cap to control cost).",
                        "default": 100
                    },
                    "maxItemsPerFeed": {
                        "title": "Max articles per feed",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Cap how many articles to take from each individual feed (0 = no per-feed cap; Google News returns up to ~100 per feed).",
                        "default": 0
                    },
                    "sinceDate": {
                        "title": "Only articles since",
                        "type": "string",
                        "description": "Keep only articles published on/after this point. Accepts an ISO date (`2026-06-01`) or a relative window (`24h`, `3d`, `2w`, `6m`).",
                        "default": ""
                    },
                    "includeKeywords": {
                        "title": "Must include keywords",
                        "type": "array",
                        "description": "Keep only articles whose title or snippet contains at least one of these words/phrases (case-insensitive).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "excludeKeywords": {
                        "title": "Exclude keywords",
                        "type": "array",
                        "description": "Drop articles whose title or snippet contains any of these words/phrases (case-insensitive).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "includeSources": {
                        "title": "Only these sources",
                        "type": "array",
                        "description": "Keep only articles from these publishers (matches the source name, case-insensitive substring), e.g. `Reuters`, `BBC`, `TechCrunch`.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "excludeSources": {
                        "title": "Exclude these sources",
                        "type": "array",
                        "description": "Drop articles from these publishers (source-name substring, case-insensitive).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "dedupeByResolvedUrl": {
                        "title": "Deduplicate by article URL",
                        "type": "boolean",
                        "description": "Remove duplicate articles (the same story can appear across several feeds) by their resolved publisher URL within a run.",
                        "default": true
                    },
                    "monitorMode": {
                        "title": "Monitor mode — only new articles",
                        "type": "boolean",
                        "description": "Remember which articles were emitted in previous runs (in a named key-value store) and return ONLY articles not seen before — ideal for brand, competitor and topic monitoring. Works with Apify Schedules: the schedule starts the run, monitor mode deduplicates against prior runs.",
                        "default": false
                    },
                    "monitorKey": {
                        "title": "Monitor key",
                        "type": "string",
                        "description": "Names the monitor store so you can keep separate watch-lists (e.g. one per brand or topic). Lowercase letters, numbers and hyphens.",
                        "default": "default"
                    },
                    "maxConcurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "Maximum number of requests fetched in parallel (feed, URL-decode and content fetches).",
                        "default": 8
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Proxy settings. Google News serves clean RSS to Apify datacenter IPs, so the default Apify proxy is sufficient.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
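
The input schema above lists defaults and numeric bounds for most fields. As a minimal sketch of how a caller might assemble a run input locally before starting a run, the snippet below merges overrides onto those defaults and checks the documented limits. The names `SCHEMA_DEFAULTS`, `MONITOR_KEY_PATTERN` and `build_run_input` are hypothetical helpers for illustration, not part of the Actor or of `apify-client`, and the regular expression is only one reading of "lowercase letters, numbers and hyphens".

```python
import copy
import re

# Defaults copied from the input schema above.
SCHEMA_DEFAULTS = {
    "includeTopHeadlines": False,
    "language": "en-US",
    "country": "US",
    "timeWindow": "",
    "resolveArticleUrls": True,
    "includeArticleContent": False,
    "includeRelatedArticles": True,
    "maxItems": 100,
    "maxItemsPerFeed": 0,
    "sinceDate": "",
    "dedupeByResolvedUrl": True,
    "monitorMode": False,
    "monitorKey": "default",
    "maxConcurrency": 8,
    "proxyConfiguration": {"useApifyProxy": True},
}

# Assumption: one interpretation of the monitorKey description
# ("Lowercase letters, numbers and hyphens").
MONITOR_KEY_PATTERN = re.compile(r"[a-z0-9-]+")


def build_run_input(**overrides):
    """Merge overrides onto the schema defaults and check the documented bounds."""
    run_input = copy.deepcopy(SCHEMA_DEFAULTS)
    run_input.update(overrides)

    if run_input["maxItems"] < 1:
        raise ValueError("maxItems must be at least 1")
    if run_input["maxItemsPerFeed"] < 0:
        raise ValueError("maxItemsPerFeed must be 0 (no cap) or greater")
    if not 1 <= run_input["maxConcurrency"] <= 20:
        raise ValueError("maxConcurrency must be between 1 and 20")
    if not MONITOR_KEY_PATTERN.fullmatch(run_input["monitorKey"]):
        raise ValueError("monitorKey allows lowercase letters, numbers and hyphens")
    return run_input


# Example: a monitored search plus one topic feed, capped at 50 articles.
run_input = build_run_input(
    searchTerms=['"electric vehicles" when:7d'],
    topics=["TECHNOLOGY"],
    monitorMode=True,
    monitorKey="ev-watch",
    maxItems=50,
)
```

The resulting dictionary could then be passed to the Python client, for example `ApifyClient(token).actor("scrapesage/google-news-scraper").call(run_input=run_input)`. The platform still validates the input against the schema, so a local check like this only catches mistakes earlier.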

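The `includeKeywords`, `excludeKeywords`, `includeSources` and `excludeSources` descriptions in the schema above all describe case-insensitive matching against the title, snippet or source name. The sketch below shows one way such filters could be reproduced client-side on already-downloaded items. It is an illustration, not the Actor's actual implementation: the item field names `title`, `snippet` and `source` are assumptions, as are the helper names `matches_any` and `keep_article` and the choice of plain substring matching for keywords.

```python
def matches_any(text, needles):
    """Return True if any needle occurs in text, ignoring case."""
    lowered = text.lower()
    return any(needle.lower() in lowered for needle in needles)


def keep_article(article, include_keywords=(), exclude_keywords=(),
                 include_sources=(), exclude_sources=()):
    """Apply the four filters; an empty filter list means 'no restriction'."""
    # Assumed item fields: "title", "snippet", "source".
    texts = [article.get("title", ""), article.get("snippet", "")]
    source = article.get("source", "")

    if include_keywords and not any(matches_any(t, include_keywords) for t in texts):
        return False
    if exclude_keywords and any(matches_any(t, exclude_keywords) for t in texts):
        return False
    if include_sources and not matches_any(source, include_sources):
        return False
    if exclude_sources and matches_any(source, exclude_sources):
        return False
    return True


# Made-up sample items, for illustration only.
articles = [
    {"title": "EV sales climb in Europe", "snippet": "Prices fall", "source": "Reuters"},
    {"title": "EV sales pitch", "snippet": "Sponsored content", "source": "Reuters"},
    {"title": "Battery recycling plant opens", "snippet": "Local jobs", "source": "Example Blog"},
    {"title": "Battery makers report earnings", "snippet": "Q2 results", "source": "BBC News"},
]

kept = [
    a["title"]
    for a in articles
    if keep_article(
        a,
        include_keywords=["EV sales", "battery"],
        exclude_keywords=["sponsored"],
        include_sources=["reuters", "bbc"],
    )
]
```

With these sample items, the second article is dropped by the exclude keyword and the third by the source filter, so `kept` holds the first and fourth titles.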