# Google News Intelligence Scraper (`lokki/google-news-intelligence-scraper`) Actor

Monitor brands, competitors, markets, and topics from Google News RSS with clean article data, source signals, freshness, deduplication, and business-ready intelligence labels.

- **URL**: https://apify.com/lokki/google-news-intelligence-scraper.md
- **Developed by:** [Ian Dikhtiar](https://apify.com/lokki) (community)
- **Categories:** News
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is billed by platform usage. The Actor itself is free to use; you pay only for the Apify platform resources it consumes, and those rates are lower on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools that run on the Apify platform and cover all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
"Actor" is always written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Google News Intelligence Scraper

Turn Google News into a daily intelligence feed for brand monitoring, competitor tracking, PR alerts, SEO research, and market trend discovery.

This Actor does **not** just dump headlines. It returns clean, flat, business-ready records with freshness, source, country edition, query context, deduplication, and lightweight intelligence labels so teams can plug the output straight into Sheets, Notion, Slack, n8n, dashboards, CRMs, or alerting workflows.

### Why this scraper exists

Most news scrapers solve the easy problem: “give me article links.”

Customers usually need to solve the harder, recurring problem:

- What changed about my brand today?
- Which competitors are getting press?
- Which product launches, lawsuits, funding events, security issues, or partnerships should I care about?
- What sources are repeatedly covering a topic?
- Can I run this every morning and get a clean dataset without babysitting it?

That is the product angle here: **daily monitoring, not one-off scraping**.

### Best use cases

- **Brand monitoring** — track mentions for your company, founders, products, and campaigns.
- **Competitor intelligence** — monitor rival launches, funding, partnerships, layoffs, and legal issues.
- **PR and media monitoring** — discover who is writing about your market.
- **SEO and content research** — collect fresh headlines and angles for a topic.
- **Investor / analyst workflows** — follow public-company news, earnings, M&A, regulatory events, and security incidents.
- **Automation feeds** — schedule daily runs and send high-signal results to Slack, email, Notion, Airtable, or n8n.

### What you get

Each result is a flat record:

- `query` — your original search term
- `searchQuery` — the exact Google News query used
- `country` — Google News edition, e.g. `US`, `GB`, `CA`
- `language` — language code, e.g. `en`
- `rank` — ranking inside the feed for that query/country
- `title` — article headline
- `source` — publisher name
- `sourceUrl` — publisher URL when Google provides it
- `publishedAt` — normalized ISO timestamp
- `ageHours` — article age at scrape time
- `freshnessBucket` — `breaking_0_6h`, `today_6_24h`, `recent_1_3d`, `week_3_7d`, or `older`
- `isRecent` — true when the article is under 24 hours old
- `intelligenceType` — rule-based business label such as `funding`, `product_launch`, `legal_regulatory`, `security`, `partnership`, `earnings_financial`, `m_and_a`, `hiring_people`, `review_analysis`, or `general`
- `url` — article URL / Google News URL, or canonical URL when article metadata fetching is enabled
- `googleNewsUrl` — original Google News link
- `guid` — Google News item ID
- `snippet` — cleaned article summary from the RSS feed
- `metaDescription`, `image`, `textPreview` — optional fields when article-page fetching is enabled
- `scrapedAt` — run timestamp
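
The freshness fields in this list can be derived from two timestamps. The following is a minimal sketch, not the Actor's own code: the function names are hypothetical, and the bucket boundaries are inferred from the bucket names, so how the Actor treats an article that is exactly 6 or 24 hours old is an assumption.

```python
from datetime import datetime


def parse_iso(ts: str) -> datetime:
    """Parse an ISO timestamp such as '2026-06-01T01:00:00.000Z'."""
    # Older Python versions do not accept a trailing 'Z', so swap it for an offset.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def age_hours(published_at: str, scraped_at: str) -> float:
    """Article age at scrape time, in hours."""
    delta = parse_iso(scraped_at) - parse_iso(published_at)
    return round(delta.total_seconds() / 3600, 2)


def freshness_bucket(hours: float) -> str:
    """Map an age in hours to one of the documented bucket names."""
    # Assumption: each bucket includes its lower bound and excludes its upper bound.
    if hours < 6:
        return "breaking_0_6h"
    if hours < 24:
        return "today_6_24h"
    if hours < 72:
        return "recent_1_3d"
    if hours < 168:
        return "week_3_7d"
    return "older"


def is_recent(hours: float) -> bool:
    """True when the article is under 24 hours old."""
    return hours < 24
```

Computing these values yourself can be useful when you store results and want to re-bucket them later, since `ageHours` is fixed at scrape time.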

### Input options

#### `queries`
Brands, competitors, products, people, markets, or advanced Google News queries.

Examples:

```json
["OpenAI", "Anthropic", "Apify", "\"AI agent\" funding", "site:techcrunch.com startup"]
````

#### `mode`

Controls how queries are interpreted:

- `raw` — keep your query exactly as written. Best for advanced operators.
- `exactPhrase` — wraps each query in quotes.
- `allWords` — requires all query words.
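
As a rough illustration of the three modes, the sketch below shows one way a query string could be transformed before it is sent to Google News. The `apply_mode` helper is hypothetical, and the `allWords` behavior (joining words with `AND`) is an assumption; the Actor may enforce "all words" differently.

```python
def apply_mode(query: str, mode: str = "raw") -> str:
    """Return the search string for one query under the given mode."""
    query = query.strip()
    if mode == "raw":
        # Keep operators such as site: or OR exactly as written.
        return query
    if mode == "exactPhrase":
        # Wrap the whole query in double quotes.
        return f'"{query}"'
    if mode == "allWords":
        # Assumption: require every word by joining the words with AND.
        return " AND ".join(query.split())
    raise ValueError(f"unknown mode: {mode!r}")


print(apply_mode("site:techcrunch.com startup"))   # raw: unchanged
print(apply_mode("AI agent", "exactPhrase"))       # quoted phrase
print(apply_mode("AI agent funding", "allWords"))  # words joined with AND
```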

#### `countries`

Google News editions to query. Examples: `US`, `GB`, `CA`, `AU`, `DE`, `FR`.
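
Because each query runs once per country, the number of feed requests is the product of the two lists. The sketch below plans those requests using the publicly known Google News RSS search URL pattern; `build_feed_url` and `plan_requests` are hypothetical names, and whether the Actor builds its URLs in exactly this way is an assumption.

```python
from itertools import product
from urllib.parse import urlencode

# Assumed endpoint: the commonly used Google News RSS search URL.
RSS_SEARCH_BASE = "https://news.google.com/rss/search"


def build_feed_url(query: str, country: str, language: str = "en") -> str:
    """Build an RSS search URL for one query in one country edition."""
    params = {
        "q": query,
        "hl": f"{language}-{country}",    # interface language, e.g. en-US
        "gl": country,                    # country edition, e.g. US
        "ceid": f"{country}:{language}",  # edition id, e.g. US:en
    }
    return f"{RSS_SEARCH_BASE}?{urlencode(params)}"


def plan_requests(queries, countries, language="en"):
    """One request per (query, country) pair."""
    return [
        {"query": q, "country": c, "url": build_feed_url(q, c, language)}
        for q, c in product(queries, countries)
    ]


jobs = plan_requests(["OpenAI", "Anthropic", "Apify"], ["US", "GB"])
print(len(jobs))  # 3 queries x 2 countries = 6 requests
```

Keeping the country list short is the simplest way to limit run time, since every added edition multiplies the number of feeds fetched.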

#### `language`

Language code, default `en`.

#### `maxItemsPerQuery`

Maximum results per query/country. Google News RSS usually returns up to about 100.

#### `freshnessHours`

Only keep articles published within this many hours. Default: `168` for the last 7 days. Use `24` for daily alerts or `0` to disable freshness filtering.
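
If you store raw results and want to apply the same window afterwards, the filter can be sketched as follows. This is an illustration under assumptions, not the Actor's implementation: `filter_fresh` is a hypothetical helper that reads the `publishedAt` field, and treating an article exactly at the cutoff as fresh is a choice made here.

```python
from datetime import datetime, timedelta, timezone


def parse_iso(ts: str) -> datetime:
    """Parse an ISO timestamp such as '2026-06-01T01:00:00.000Z'."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def filter_fresh(items, freshness_hours=168, now=None):
    """Keep items published within the last `freshness_hours` hours."""
    if freshness_hours == 0:
        # 0 disables freshness filtering, as described above.
        return list(items)
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=freshness_hours)
    return [it for it in items if parse_iso(it["publishedAt"]) >= cutoff]
```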

#### `includeArticleText`

Optional. Visits article pages to extract canonical URL, meta description, OpenGraph image, and a lightweight text preview. Leave off for fast, reliable daily monitoring. Turn on when you need richer metadata.

#### `classifyIntent`

Adds business intelligence labels using transparent rules.
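
To give a feel for what "transparent rules" can look like, here is a minimal keyword-based sketch. The label names come from the `intelligenceType` field described earlier, but the keyword lists, the rule order, and the `classify` function are illustrative assumptions rather than the Actor's actual rules.

```python
import re

# Hypothetical keyword rules; order matters because the first match wins.
RULES = [
    ("funding", ("raises", "funding", "series a", "series b", "seed round")),
    ("m_and_a", ("acquires", "acquisition", "merger")),
    ("legal_regulatory", ("lawsuit", "sues", "regulator", "antitrust")),
    ("security", ("breach", "vulnerability", "ransomware", "hack")),
    ("partnership", ("partners with", "partnership")),
    ("earnings_financial", ("earnings", "quarterly results", "revenue")),
    ("product_launch", ("launches", "unveils", "releases")),
    ("hiring_people", ("hires", "appoints", "steps down")),
    ("review_analysis", ("review", "analysis", "opinion")),
]


def classify(title: str, snippet: str = "") -> str:
    """Return the first label whose keyword appears as a whole word or phrase."""
    text = f"{title} {snippet}".lower()
    for label, keywords in RULES:
        for keyword in keywords:
            if re.search(rf"\b{re.escape(keyword)}\b", text):
                return label
    return "general"


print(classify("Startup raises $50M Series B"))
print(classify("Local team wins match"))
```

Rules like these are easy to audit and extend, at the cost of missing headlines that use unexpected wording.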

#### `deduplicate`

Removes duplicate stories across query/country combinations.
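
The input schema describes deduplication as based on canonical URL, title, and source fingerprints. The sketch below shows the general idea using only a normalized title and source, which is a simplification chosen here because Google News links for the same story can differ between editions; the `fingerprint` and `deduplicate` helpers are hypothetical.

```python
import hashlib
import re


def fingerprint(item: dict) -> str:
    """Hash a normalized title plus source so near-identical records collide."""
    # Lowercase the title and collapse punctuation and whitespace.
    title = re.sub(r"[^a-z0-9]+", " ", item.get("title", "").lower()).strip()
    source = item.get("source", "").lower().strip()
    return hashlib.sha256(f"{title}|{source}".encode("utf-8")).hexdigest()


def deduplicate(items):
    """Keep the first record seen for each fingerprint, preserving order."""
    seen = set()
    unique = []
    for item in items:
        key = fingerprint(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```

Keeping the first occurrence means the surviving record carries the `query` and `country` of whichever combination was fetched first.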

### Example input

```json
{
  "queries": ["OpenAI", "Anthropic", "Apify"],
  "mode": "raw",
  "countries": ["US", "GB"],
  "language": "en",
  "maxItemsPerQuery": 25,
  "freshnessHours": 168,
  "includeArticleText": false,
  "classifyIntent": true,
  "deduplicate": true
}
```

### Example output item

```json
{
  "query": "OpenAI",
  "searchQuery": "OpenAI",
  "country": "US",
  "language": "en",
  "rank": 1,
  "title": "Example headline about OpenAI",
  "source": "Example News",
  "publishedAt": "2026-06-01T01:00:00.000Z",
  "ageHours": 1.5,
  "freshnessBucket": "breaking_0_6h",
  "isRecent": true,
  "intelligenceType": "product_launch",
  "url": "https://news.google.com/rss/articles/...",
  "snippet": "Clean article summary...",
  "scrapedAt": "2026-06-01T02:30:00.000Z"
}
```
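
Records shaped like this one are straightforward to turn into alerts. The sketch below picks recent, non-`general` items and formats a one-line message that could be posted to Slack or email; `select_alerts` and `format_alert` are hypothetical helpers, and the choice of which labels count as high-signal is an assumption you should adapt.

```python
def select_alerts(items, ignore_types=("general",)):
    """Keep recent items whose label is not in the ignore list."""
    return [
        item
        for item in items
        if item.get("isRecent") and item.get("intelligenceType") not in ignore_types
    ]


def format_alert(item: dict) -> str:
    """Render one record as a single line of text."""
    return (
        f"[{item['intelligenceType']}] {item['title']} "
        f"({item['source']}, {item['ageHours']}h ago) {item['url']}"
    )


example = {
    "title": "Example headline about OpenAI",
    "source": "Example News",
    "ageHours": 1.5,
    "isRecent": True,
    "intelligenceType": "product_launch",
    "url": "https://news.google.com/rss/articles/...",
}
for alert in select_alerts([example]):
    print(format_alert(alert))
```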

### Reliability notes

- Uses Google News RSS endpoints, so it is lightweight and fast.
- No login required.
- Browser automation is not required.
- Proxies are usually unnecessary.
- Best scheduled daily or hourly for recurring monitoring workflows.

### Recommended schedules

- **Brand monitoring:** every 6–12 hours
- **Competitor intelligence:** daily at 7am
- **Breaking news / crisis monitoring:** hourly
- **SEO content research:** weekly
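
Apify schedules are defined with cron expressions, so the recommendations above could be written roughly as follows. The dictionary keys are hypothetical names, the 6-hour interval is one end of the suggested 6 to 12 hour range, and running the weekly job on Monday at 7am is an assumption.

```python
# Standard five-field cron expressions: minute hour day-of-month month day-of-week.
SCHEDULES = {
    "brand_monitoring": "0 */6 * * *",       # every 6 hours
    "competitor_intelligence": "0 7 * * *",  # daily at 7am
    "breaking_news": "0 * * * *",            # hourly, on the hour
    "seo_content_research": "0 7 * * 1",     # weekly, Monday at 7am (assumed day)
}

for name, expression in SCHEDULES.items():
    print(f"{name}: {expression}")
```

The time zone for each expression is set on the schedule itself, so "7am" means 7am in whichever zone you configure.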

### Marketplace differentiation

This Actor is positioned against generic Google News scrapers by focusing on:

- cleaner flat output
- daily monitoring use cases
- source and freshness fields
- deduplication across query/country combinations
- business signal labeling
- optional article-page metadata
- README and schema designed for non-technical buyers

### FAQ

#### Is this the same as Google Alerts?

No. Google Alerts sends emails. This Actor returns structured data you can automate, store, filter, enrich, and push into your own workflows.

#### Can I track competitors every day?

Yes. Put competitor names in `queries`, create an Apify task, and schedule it daily.

#### Does it scrape full article text?

By default, no. It collects Google News RSS data for speed and reliability. Enable `includeArticleText` for lightweight page metadata and previews.

#### Can I use Google News search operators?

Yes. Use `mode: raw` and write advanced queries such as `"AI agent" funding`, `site:techcrunch.com robotics`, or `OpenAI OR Anthropic`.

# Actor input Schema

## `queries` (type: `array`):

Brands, competitors, people, products, keywords, or market topics to monitor.

## `mode` (type: `string`):

How to interpret each query. Exact phrase wraps each query in quotes; all words requires every word to match; raw keeps the query, including any Google News operators, untouched.

## `countries` (type: `array`):

Google News editions to query. Use ISO-like country codes such as US, GB, CA, AU, DE, FR. Each query runs once per country.

## `language` (type: `string`):

Google News language code.

## `maxItemsPerQuery` (type: `integer`):

Google News RSS usually returns up to about 100 results. Keep this modest for daily monitoring.

## `freshnessHours` (type: `integer`):

Only keep articles published within this many hours. Set 0 to disable. Useful for daily recurring monitoring.

## `includeArticleText` (type: `boolean`):

When enabled, the Actor visits article URLs to extract the canonical URL, meta description, image, and a lightweight text preview. Slower but richer.

## `classifyIntent` (type: `boolean`):

Adds simple rule-based labels: funding, product launch, hiring, legal, partnership, earnings, security, layoffs, M&A, executive move, review/analysis, or general.

## `deduplicate` (type: `boolean`):

Remove duplicate articles across queries/countries using canonical URL/title/source fingerprints.

## `proxyConfiguration` (type: `object`):

Optional Apify proxy configuration. Not required for Google News RSS in normal use.

## Actor input object example

```json
{
  "queries": [
    "OpenAI",
    "Anthropic",
    "Apify"
  ],
  "mode": "raw",
  "countries": [
    "US"
  ],
  "language": "en",
  "maxItemsPerQuery": 50,
  "freshnessHours": 168,
  "includeArticleText": false,
  "classifyIntent": true,
  "deduplicate": true,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# Actor output Schema

## `results` (type: `string`):

No description

## `summary` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "queries": [
        "OpenAI",
        "Anthropic",
        "Apify"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("lokki/google-news-intelligence-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "queries": [
        "OpenAI",
        "Anthropic",
        "Apify",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("lokki/google-news-intelligence-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "queries": [
    "OpenAI",
    "Anthropic",
    "Apify"
  ]
}' |
apify call lokki/google-news-intelligence-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=lokki/google-news-intelligence-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Google News Intelligence Scraper",
        "description": "Monitor brands, competitors, markets, and topics from Google News RSS with clean article data, source signals, freshness, deduplication, and business-ready intelligence labels.",
        "version": "1.0",
        "x-build-id": "ayuVsiIlBy2jZ9BW4"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/lokki~google-news-intelligence-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-lokki-google-news-intelligence-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/lokki~google-news-intelligence-scraper/runs": {
            "post": {
                "operationId": "runs-sync-lokki-google-news-intelligence-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/lokki~google-news-intelligence-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-lokki-google-news-intelligence-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "queries"
                ],
                "properties": {
                    "queries": {
                        "title": "Search queries",
                        "type": "array",
                        "description": "Brands, competitors, people, products, keywords, or market topics to monitor.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "mode": {
                        "title": "Search mode",
                        "enum": [
                            "raw",
                            "exactPhrase",
                            "allWords"
                        ],
                        "type": "string",
                        "description": "How to interpret each query. Exact phrase wraps terms in quotes; all words requires every word; raw advanced keeps Google News operators untouched.",
                        "default": "raw"
                    },
                    "countries": {
                        "title": "Countries / editions",
                        "type": "array",
                        "description": "Google News editions to query. Use ISO-like country codes such as US, GB, CA, AU, DE, FR. Each query runs once per country.",
                        "default": [
                            "US"
                        ],
                        "items": {
                            "type": "string"
                        }
                    },
                    "language": {
                        "title": "Language",
                        "type": "string",
                        "description": "Google News language code.",
                        "default": "en"
                    },
                    "maxItemsPerQuery": {
                        "title": "Max articles per query/country",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Google News RSS usually returns up to about 100 results. Keep this modest for daily monitoring.",
                        "default": 50
                    },
                    "freshnessHours": {
                        "title": "Freshness window in hours",
                        "minimum": 0,
                        "maximum": 8760,
                        "type": "integer",
                        "description": "Only keep articles published within this many hours. Set 0 to disable. Useful for daily recurring monitoring.",
                        "default": 168
                    },
                    "includeArticleText": {
                        "title": "Fetch article pages for metadata",
                        "type": "boolean",
                        "description": "When enabled, the actor visits article URLs to extract canonical URL, meta description, image, and lightweight text preview. Slower but richer.",
                        "default": false
                    },
                    "classifyIntent": {
                        "title": "Add business intelligence labels",
                        "type": "boolean",
                        "description": "Adds simple rule-based labels: funding, product launch, hiring, legal, partnership, earnings, security, layoffs, M&A, executive move, review/analysis, or general.",
                        "default": true
                    },
                    "deduplicate": {
                        "title": "Deduplicate articles",
                        "type": "boolean",
                        "description": "Remove duplicate articles across queries/countries using canonical URL/title/source fingerprints.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional Apify proxy configuration. Not required for Google News RSS in normal use.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
