# Google News Extractor (`kawsar/google-news-extractor`) Actor

Google News Scraper that pulls article titles, publisher names, direct URLs, and publish dates from any keyword search, so you can track news coverage without setting up your own infrastructure.

- **URL**: https://apify.com/kawsar/google-news-extractor.md
- **Developed by:** [Kawsar](https://apify.com/kawsar) (community)
- **Categories:** News, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $0.99 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
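As a rough illustration of the per-result arithmetic, the sketch below estimates a charge from a result count. It assumes that results are the only billed event and that the listed starting price of $0.99 per 1,000 results applies; the function name `estimate_cost_usd` is hypothetical, and your actual rate depends on your subscription plan.

```python
def estimate_cost_usd(result_count: int, price_per_1000: float = 0.99) -> float:
    """Estimate the charge for a number of results at a per-1,000 rate.

    Assumption: only results are billed, at a flat rate per 1,000.
    """
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return round(result_count * price_per_1000 / 1000, 4)


# Example: 200 results at the listed starting price.
estimated = estimate_cost_usd(200)
print(f"Estimated cost: ${estimated:.3f}")
```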

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Google News Scraper

Google News Scraper pulls articles from Google News search results. Give it one keyword or a list of keywords and it returns titles, publisher names, direct article URLs, publish dates, and thumbnails for every result.

It supports locale and country targeting, so you can pull regional news feeds in any language Google News covers. Run multiple queries in a single Actor run and all results land in one dataset, tagged by query.

### What it collects

For each article:

- `articleTitle` - the headline as it appears on Google News
- `sourceName` - publisher or outlet name (e.g. Reuters, BBC, TechCrunch)
- `articleUrl` - direct link to the original article on the publisher's site, not a Google redirect
- `googleNewsUrl` - the full Google News read link
- `publishedAt` - ISO 8601 datetime in UTC (e.g. `2026-05-18T01:14:00Z`)
- `publishedText` - the relative label shown on the page (e.g. "4 hours ago", "Yesterday")
- `thumbnailUrl` - article image from Google News CDN
- `searchQuery` - which query produced this result (useful when running multiple queries)
- `scrapedAt` - ISO 8601 timestamp of when the article was collected
- `error` - null on success, error message if extraction failed for that item

### How to use

1. Open the Actor and click **Try for free**
2. Enter one search query in the **Search query** field, or add multiple terms in the **Search queries** list
3. Set language and country for regional results (optional, defaults to US English)
4. Set max items per query and click **Start**

Results land in the **Storage** tab as JSON, one item per article.

### Single query

```json
{
  "query": "Tesla USA",
  "language": "en-US",
  "country": "US",
  "maxItems": 50
}
````

### Multiple queries in one run

Use the `queries` field to run several searches at once. Each result includes a `searchQuery` field so you can filter or group by keyword downstream.

```json
{
  "queries": ["Tesla USA", "Apple earnings", "AI regulation EU", "Bitcoin price"],
  "language": "en-US",
  "country": "US",
  "maxItems": 50
}
```

This collects up to 50 articles per query, so the above would return up to 200 total results in one run.
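Grouping by `searchQuery` downstream can be sketched as follows. This is a minimal illustration on hand-written sample items, not output from a real run; the helper name `group_by_query` is hypothetical.

```python
from collections import defaultdict


def group_by_query(items):
    """Group dataset items by the query that produced them."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item.get("searchQuery")].append(item)
    return dict(grouped)


# Hand-written sample items, trimmed to the fields used here.
sample_items = [
    {"articleTitle": "Headline A", "searchQuery": "Tesla USA"},
    {"articleTitle": "Headline B", "searchQuery": "Apple earnings"},
    {"articleTitle": "Headline C", "searchQuery": "Tesla USA"},
]

grouped = group_by_query(sample_items)
for query, articles in grouped.items():
    print(query, len(articles))
```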

You can also combine both fields. If you set both `query` and `queries`, the Actor merges them and removes duplicate terms:

```json
{
  "query": "Tesla USA",
  "queries": ["Apple earnings", "Tesla USA"],
  "maxItems": 30
}
```

"Tesla USA" appears in both but runs only once.
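To make the merge behavior concrete, here is a minimal sketch of an order-preserving merge with deduplication. It is not the Actor's source code: the function name `merge_queries` is hypothetical, and exact, case-sensitive matching after trimming whitespace is an assumption, since the README does not document the matching rule.

```python
def merge_queries(query=None, queries=None):
    """Merge a single query and a query list, keeping first occurrences only.

    Assumption: terms are compared exactly (case-sensitive) after trimming.
    """
    candidates = ([query] if query else []) + list(queries or [])
    merged = []
    seen = set()
    for term in candidates:
        cleaned = term.strip()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            merged.append(cleaned)
    return merged


merged = merge_queries("Tesla USA", ["Apple earnings", "Tesla USA"])
print(merged)  # ['Tesla USA', 'Apple earnings']
```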

### All inputs

| Field | Required | Default | Description |
|-------|----------|---------|-------------|
| query | No | - | Single search term (e.g. "Tesla USA") |
| queries | No | - | List of search terms for multi-query runs |
| language | No | en-US | Google News language code (e.g. en-US, en-GB, de, fr, ja, es, pt-BR) |
| country | No | US | Two-letter country code (e.g. US, GB, DE, FR, JP, BR) |
| ceid | No | auto | Edition string like "US:en". Built automatically from country + language if left blank |
| maxItems | No | 100 | Max articles per query, up to 1000 |
| requestTimeoutSecs | No | 30 | Per-request timeout in seconds (5-120) |

At least one of `query` or `queries` must be provided.
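If you build inputs programmatically, a client-side check before starting a run can catch mistakes early. The sketch below mirrors the constraints listed in the table above (and the minimum of 1 for `maxItems` from the input schema); `validate_input` is a hypothetical helper, not part of any Apify library, and the Actor may apply its own validation.

```python
from typing import Any, Dict, List


def validate_input(run_input: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an input object (empty if none)."""
    problems = []
    if not run_input.get("query") and not run_input.get("queries"):
        problems.append("Provide at least one of 'query' or 'queries'.")
    max_items = run_input.get("maxItems", 100)
    if not 1 <= max_items <= 1000:
        problems.append("'maxItems' must be between 1 and 1000.")
    timeout = run_input.get("requestTimeoutSecs", 30)
    if not 5 <= timeout <= 120:
        problems.append("'requestTimeoutSecs' must be between 5 and 120.")
    return problems


print(validate_input({"queries": ["Tesla USA"], "maxItems": 50}))  # []
print(validate_input({"maxItems": 5000}))
```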

### Example output

```json
{
  "articleTitle": "Tesla Raises U.S. Model Y Prices. Its Stock Just Forged A Lower Buy Point.",
  "sourceName": "Investor's Business Daily",
  "articleUrl": "https://www.investors.com/news/tesla-raises-u-s-model-y-prices-lower-buy-point/",
  "googleNewsUrl": "https://news.google.com/read/CBMihgFBVV95cUxN...",
  "publishedAt": "2026-05-18T01:14:00Z",
  "publishedText": "4 hours ago",
  "thumbnailUrl": "https://news.google.com/api/attachments/CC8iK0Nn...",
  "searchQuery": "Tesla USA",
  "scrapedAt": "2026-05-18T05:30:00.000000+00:00",
  "error": null
}
```
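The two timestamp fields use slightly different ISO 8601 spellings (`Z` versus `+00:00`). A small sketch of parsing both with the standard library, using the values from the example above; the helper name `parse_iso_utc` is hypothetical.

```python
from datetime import datetime, timezone


def parse_iso_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp into a timezone-aware UTC datetime."""
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
    # so normalize it to an explicit offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value).astimezone(timezone.utc)


item = {
    "publishedAt": "2026-05-18T01:14:00Z",
    "scrapedAt": "2026-05-18T05:30:00.000000+00:00",
}

# Time between publication and collection.
age = parse_iso_utc(item["scrapedAt"]) - parse_iso_utc(item["publishedAt"])
print(age)  # 4:16:00
```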

### Regional targeting

Google News results vary by language and country. Set both fields to get the right regional feed.

**UK tech news:**

```json
{
  "query": "AI regulation",
  "language": "en-GB",
  "country": "GB"
}
```

**German business coverage:**

```json
{
  "query": "Volkswagen",
  "language": "de",
  "country": "DE"
}
```

**Brazilian Portuguese news:**

```json
{
  "query": "Petrobras",
  "language": "pt-BR",
  "country": "BR"
}
```

**Japanese tech news:**

```json
{
  "query": "Sony",
  "language": "ja",
  "country": "JP"
}
```

Leave `ceid` blank in all cases. The Actor builds it from country and language automatically.
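For intuition only, one plausible way to derive an edition string such as "US:en" is to join the country code with the primary language subtag. This is an assumption, not the Actor's documented rule: `build_ceid` is a hypothetical helper, and some Google News editions may use a different language part than this sketch produces.

```python
def build_ceid(country: str = "US", language: str = "en-US") -> str:
    """Join a country code and the primary language subtag, e.g. 'US:en'.

    Assumption: the edition string is '<COUNTRY>:<primary language subtag>'.
    """
    primary_language = language.split("-")[0].lower()
    return f"{country.upper()}:{primary_language}"


print(build_ceid("US", "en-US"))  # US:en
print(build_ceid("DE", "de"))     # DE:de
```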

### Common language and country codes

| Region | language | country |
|--------|----------|---------|
| US English | en-US | US |
| UK English | en-GB | GB |
| German | de | DE |
| French | fr | FR |
| Spanish (Spain) | es | ES |
| Spanish (Mexico) | es-419 | MX |
| Portuguese (Brazil) | pt-BR | BR |
| Japanese | ja | JP |
| Korean | ko | KR |
| Chinese (Simplified) | zh-CN | CN |
| Italian | it | IT |
| Dutch | nl | NL |
| Polish | pl | PL |
| Russian | ru | RU |

### What people use it for

**Brand monitoring** - run your company name, product names, and key competitors as a query list. Schedule the Actor daily or weekly to track coverage over time.

**SEO and content research** - see which publishers dominate news results for a keyword. Useful for identifying where to pitch stories or which outlets to monitor.

**Market intelligence** - collect news around earnings, product launches, regulatory changes, or industry events. Pipe results into a spreadsheet or database for analysis.

**Competitor tracking** - run competitor brand names as queries. The `searchQuery` field lets you group results by competitor in your downstream tool.

**Journalist research** - quickly gather source material, check who has covered a story, and see the publication timeline across outlets.

**News datasets** - build labeled training data, news corpora, or topic-specific archives. Each article includes the source name and publish datetime for filtering.

**Alerts and notifications** - schedule the Actor with a cron expression and connect it to a Zapier or Make webhook to push new articles into Slack, email, or any other channel.
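As an example of the SEO and content research use case, the sketch below counts which publishers appear most often in a result set. The sample items are hand-written for illustration, and `top_publishers` is a hypothetical helper.

```python
from collections import Counter


def top_publishers(items, limit=5):
    """Return (sourceName, count) pairs, most frequent first."""
    counts = Counter(
        item["sourceName"] for item in items if item.get("sourceName")
    )
    return counts.most_common(limit)


# Hand-written sample items, trimmed to the fields used here.
sample_items = [
    {"sourceName": "Reuters", "searchQuery": "AI regulation EU"},
    {"sourceName": "BBC", "searchQuery": "AI regulation EU"},
    {"sourceName": "Reuters", "searchQuery": "AI regulation EU"},
    {"sourceName": None, "searchQuery": "AI regulation EU"},
]

ranking = top_publishers(sample_items)
print(ranking)  # [('Reuters', 2), ('BBC', 1)]
```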

### Scheduling

You can schedule this Actor to run automatically from the Apify console:

1. Open the Actor and click **Schedule**
2. Set the interval (e.g. every 6 hours, once a day)
3. The dataset accumulates results across runs

To get only new articles since the last run, filter by `publishedAt` in your downstream tool using the timestamp from the previous run.
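The `publishedAt` filter described above can be sketched like this. It assumes you store the previous run's timestamp yourself as an ISO 8601 string; the helper names and sample items are hypothetical.

```python
from datetime import datetime, timezone


def parse_iso_utc(value):
    """Parse an ISO 8601 timestamp into a timezone-aware UTC datetime."""
    if value.endswith("Z"):  # "Z" is only accepted natively from Python 3.11
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value).astimezone(timezone.utc)


def published_after(items, cutoff_iso):
    """Keep only items published strictly after the cutoff timestamp."""
    cutoff = parse_iso_utc(cutoff_iso)
    return [
        item
        for item in items
        if item.get("publishedAt") and parse_iso_utc(item["publishedAt"]) > cutoff
    ]


# Hand-written sample items, trimmed to the fields used here.
sample_items = [
    {"articleTitle": "Older story", "publishedAt": "2026-05-17T08:00:00Z"},
    {"articleTitle": "Newer story", "publishedAt": "2026-05-18T01:14:00Z"},
]

new_items = published_after(sample_items, "2026-05-17T23:00:00Z")
print([item["articleTitle"] for item in new_items])  # ['Newer story']
```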

### Output fields reference

| Field | Type | Description |
|-------|------|-------------|
| articleTitle | string | Article headline |
| sourceName | string | Publisher name |
| articleUrl | string | Direct URL to the original article |
| googleNewsUrl | string | Full Google News read URL |
| publishedAt | string | ISO 8601 publish datetime (UTC) |
| publishedText | string | Relative time label (e.g. "Yesterday", "3 hours ago") |
| thumbnailUrl | string | Thumbnail image URL from Google News CDN |
| searchQuery | string | The query that returned this article |
| scrapedAt | string | ISO 8601 timestamp of when the article was scraped (UTC) |
| error | string or null | Error message if extraction failed for this item, otherwise null |
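Because each item carries its own `error` field, a downstream step can separate usable articles from failed extractions. A minimal sketch, with a hypothetical helper name and hand-written sample items (the error text is invented for illustration):

```python
def split_by_error(items):
    """Split items into (succeeded, failed) based on the 'error' field."""
    succeeded = [item for item in items if item.get("error") is None]
    failed = [item for item in items if item.get("error") is not None]
    return succeeded, failed


# Hand-written sample items; the error message is illustrative only.
sample_items = [
    {"articleTitle": "Headline A", "error": None},
    {"articleTitle": None, "error": "illustrative error message"},
]

succeeded, failed = split_by_error(sample_items)
print(len(succeeded), len(failed))  # 1 1
```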

### Performance

A single query typically returns up to 100 articles in under 30 seconds. Google News shows up to 100 results per search page.

For multi-query runs, the Actor processes each query sequentially. A 5-query run with 100 articles each completes in roughly 2-3 minutes.

Set `maxItems` lower if you only need the most recent headlines and want faster runs.

### Frequently asked questions

**Can I scrape Google News without a search query?**
The Actor requires at least one search term. Google News does have topic feeds and top stories sections, but those require different URL patterns. This Actor is focused on keyword search results.

**Are the article URLs direct links to the publisher?**
Yes. The Actor extracts the real article URL from each result, not the Google News redirect URL. The `articleUrl` field points straight to the publisher's page.

**How many results does Google News return per query?**
Typically around 100 results per search page. The Actor collects whatever Google News shows for your query, up to the `maxItems` limit you set.

**What happens if one query fails?**
The Actor logs the error and continues to the next query. Results for successful queries are still saved to the dataset, so a single failed query does not abort the whole run.

**Can I run the same query for multiple countries at once?**
Not in a single run, because country and language apply to all queries in a run. To get US and UK results for the same keyword, run the Actor twice with different country settings, or use the Apify scheduler with different input configurations.

**Does it collect the full article text?**
No. The Actor collects the data visible on the Google News search results page: title, source, URL, publish date, and thumbnail. To get full article text you would need to follow the `articleUrl` and scrape the publisher's page separately.
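For the multi-country question above, one approach is to generate a separate input object per region and start one run for each. The sketch below only builds the inputs; `inputs_per_region` is a hypothetical helper, and how you start the runs (client library, CLI, or scheduler) is up to you.

```python
def inputs_per_region(queries, regions, max_items=100):
    """Build one input object per (language, country) pair."""
    return [
        {
            "queries": list(queries),
            "language": language,
            "country": country,
            "maxItems": max_items,
        }
        for language, country in regions
    ]


# Same keyword for the US and UK editions, using codes from the table above.
run_inputs = inputs_per_region(
    ["AI regulation"],
    [("en-US", "US"), ("en-GB", "GB")],
    max_items=50,
)
for run_input in run_inputs:
    print(run_input["country"], run_input["language"])
```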

# Actor input Schema

## `queries` (type: `array`):

List of search queries to run in a single actor run. Results for each query are collected separately and include a 'searchQuery' field so you can tell them apart.

## `language` (type: `string`):

Google News language code. Examples: en-US, en-GB, de, fr, es, ja.

## `country` (type: `string`):

Two-letter country code for regional results. Examples: US, GB, DE, FR, JP.

## `maxItems` (type: `integer`):

Maximum number of articles to collect per run.

## `requestTimeoutSecs` (type: `integer`):

Per-request timeout in seconds.

## Actor input object example

```json
{
  "queries": [
    "Tesla USA",
    "Apple earnings",
    "AI regulation"
  ],
  "language": "en-US",
  "country": "US",
  "maxItems": 100,
  "requestTimeoutSecs": 30
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "queries": [
        "Tesla USA"
    ],
    "language": "en-US",
    "country": "US"
};

// Run the Actor and wait for it to finish
const run = await client.actor("kawsar/google-news-extractor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "queries": ["Tesla USA"],
    "language": "en-US",
    "country": "US",
}

# Run the Actor and wait for it to finish
run = client.actor("kawsar/google-news-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "queries": [
    "Tesla USA"
  ],
  "language": "en-US",
  "country": "US"
}' |
apify call kawsar/google-news-extractor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=kawsar/google-news-extractor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Google News Extractor",
        "description": "Google News Scraper that pulls article titles, publisher names, direct URLs, and publish dates from any keyword search, so you can track news coverage without setting up your own infrastructure.",
        "version": "0.0",
        "x-build-id": "RmofbBjbKC2OAzXvf"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/kawsar~google-news-extractor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-kawsar-google-news-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/kawsar~google-news-extractor/runs": {
            "post": {
                "operationId": "runs-sync-kawsar-google-news-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/kawsar~google-news-extractor/run-sync": {
            "post": {
                "operationId": "run-sync-kawsar-google-news-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "queries": {
                        "title": "Search queries (multiple)",
                        "type": "array",
                        "description": "List of search queries to run in a single actor run. Results for each query are collected separately and include a 'searchQuery' field so you can tell them apart.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "language": {
                        "title": "Language code (hl)",
                        "type": "string",
                        "description": "Google News language code. Examples: en-US, en-GB, de, fr, es, ja.",
                        "default": "en-US"
                    },
                    "country": {
                        "title": "Country code (gl)",
                        "type": "string",
                        "description": "Two-letter country code for regional results. Examples: US, GB, DE, FR, JP.",
                        "default": "US"
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of articles to collect per run.",
                        "default": 100
                    },
                    "requestTimeoutSecs": {
                        "title": "Request timeout (seconds)",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Per-request timeout in seconds.",
                        "default": 30
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
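If you prefer not to install a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below only constructs the request with the Python standard library and does not send it; `build_request` is a hypothetical helper, and the token is passed as a query parameter as described in the specification.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/kawsar~google-news-extractor/run-sync-get-dataset-items"


def build_request(token, run_input):
    """Build (but do not send) a POST request that runs the Actor synchronously."""
    query_string = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query_string}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "<YOUR_API_TOKEN>",
    {"queries": ["Tesla USA"], "language": "en-US", "country": "US"},
)
print(request.get_method(), request.full_url)

# To actually send it (requires a valid token and network access):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```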
