# News Intelligence Scraper — AI Agent Real-Time News API (`logiover/news-intelligence-scraper`) Actor

Multi-source real-time news aggregator for AI agents: Google News, Bing News and DuckDuckGo News merged, deduplicated, source-ranked and sentiment-scored. Give it one topic or company and get back a clean, structured news feed. No API key, no browser.

- **URL**: https://apify.com/logiover/news-intelligence-scraper.md
- **Developed by:** [Logiover](https://apify.com/logiover) (community)
- **Categories:** News, Developer tools
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## News Intelligence Scraper — AI Agent Real-Time News API

> **Multi-source real-time news aggregator for AI agents.** Drop in a topic, company name or keyword and get back a clean, deduplicated, sentiment-scored news feed merged from Google News, Bing News and DuckDuckGo News — all in one Apify Actor run. No API key, no headless browser, no per-source scrapers to maintain.

Built for the new wave of **AI agents that need fresh, grounded information** — analyst agents tracking a market, brand-monitoring agents watching sentiment, research agents summarizing "what's happening with X this week", and RAG pipelines that must cite current sources instead of relying on training-data knowledge with a cutoff date.

---

### 🎯 What this Actor is for

Large language models don't know what happened yesterday. When an AI agent is asked *"what's the latest on OpenAI?"* or *"summarize this week's electric-vehicle news"*, it needs **structured, current, multi-source news** — not a single publisher's RSS or a raw HTML page to re-parse. `news-intelligence-scraper` is that real-time grounding layer:

- **One topic → many sources.** A single query is fanned out to Google News, Bing News and DuckDuckGo News RSS feeds in parallel, then merged.
- **Deduplicated.** The same story syndicated across outlets (or the same wire story on multiple aggregators) is collapsed into one row with a `duplicateCount` and the list of source feeds that carried it.
- **Sentiment-scored.** A lightweight lexicon model tags each headline + snippet with a `-1..+1` score and a `positive`/`negative`/`neutral` label — no ML dependencies, fast, and good enough for trend signals.
- **Time-filtered.** Keep only the last N days; sort newest-first.
- **AI-agent friendly schema.** Predictable fields, ISO dates, nullable values, per-item source attribution. Drop straight into a prompt or a vector store.
- **Batch by default.** Feed 50 topics and get 50 merged feeds back in one run — perfect for monitoring dashboards and trend reports.
- **No keys, no browser.** Pure HTTP + RSS parsing on a small Node 20 container. Cheap, fast, resilient.

---

### ✨ Key features

- **🌐 Three source feeds** — Google News RSS, Bing News RSS, DuckDuckGo News RSS, fetched in parallel per query. Pick any subset.
- **🔀 Cross-source deduplication** — exact key dedup (source-domain + normalized title) plus fuzzy token-Jaccard title similarity (>0.72) to catch syndicated/wire copies across outlets.
- **📈 Sentiment scoring** — AFINN-style lexicon (~250 weighted terms + negation handling) producing a normalized `-1..+1` score and `positive`/`negative`/`neutral` label on every item.
- **📅 Time filtering** — `daysBack` keeps only items published within a window (0 = no filter). Items without a parseable date are kept (sorted last).
- **🏷️ Source attribution** — every item carries the outlet name, the source domain, and the list of feeds (`sourceFeeds`) that surfaced it. `duplicateCount` shows how many raw copies were merged.
- **🌍 Localization** — Google News `hl`/`gl` params for language + country targeting (en-US, tr-TR, de-DE, fr-FR, …).
- **📰 Company mode** — pass a company name or domain to track brand news specifically.
- **📚 Bulk mode** — many topics in one run, each producing its own merged feed, tagged with the originating `query`.
- **🌐 Proxy-aware** — Apify datacenter proxy by default to avoid per-IP rate limits on news RSS endpoints.
- **💰 Pay-per-result** — charged per saved news item, not per run. Empty results (no matches) are free.

---

### 🤖 Why AI agents need this

News is one of the highest-value grounding tasks for agentic systems. The reasons are simple: news is **time-sensitive** (yesterday's answer is wrong today), **fragmented** (no single source has everything), and **noisy** (the same story is republished dozens of times). An agent that browses one publisher gets a biased, partial view. An agent that hits a single news API gets rate-limited or charged per call. `news-intelligence-scraper` solves all three at once:

1. **Brand & reputation monitoring.** A comms agent watches a company name across three feeds, deduplicates syndications, and surfaces the sentiment trend over 30 days.
2. **Market intelligence.** An analyst agent queries a basket of 20 industry keywords weekly and builds a sentiment-weighted news index.
3. **Event grounding.** A research agent answering *"why did X stock move?"* pulls this week's deduped news for the ticker's company, sorted by sentiment, and summarizes the negative cluster.
4. **Competitor tracking.** A GTM agent monitors competitor names and surfaces only the genuinely new items (dedup kills the wire echo chamber).
5. **RAG freshness.** A support/analyst agent embeds the latest N news items per topic into a vector store so its answers cite current events instead of stale training data.
6. **Crisis detection.** A monitoring agent runs every hour on a watchlist and alerts when the negative-sentiment item count crosses a threshold.

Each of these is one Actor call (or a scheduled run). The output is a clean table of articles ready for an LLM to read, summarize, or cite.

---

### 📦 What you get (output schema)

Every run streams **one news article per row** to the default dataset. An item looks like:

```json
{
  "query": "openai",
  "title": "OpenAI announces new reasoning model",
  "url": "https://techcrunch.com/2026/07/01/openai-...",
  "snippet": "The company said the new model improves on... (first 500 chars)",
  "source": "TechCrunch",
  "sourceDomain": "techcrunch.com",
  "sourceFeeds": ["googleNews", "bingNews"],
  "publishedAt": "Wed, 01 Jul 2026 14:30:00 GMT",
  "publishedDate": "2026-07-01",
  "language": "en-US",
  "sentimentScore": 0.42,
  "sentimentLabel": "positive",
  "duplicateCount": 3,
  "scrapedAt": "2026-07-02T12:00:00.000Z"
}
````

Use the **Overview** view to scan all items newest-first with sentiment, or the **By query** view to pivot on the originating topic.
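To illustrate how rows with this schema might be handed to an LLM, here is a minimal sketch that renders items as a numbered, citation-ready text block. It relies only on fields shown in the sample item above; the function name `to_prompt_context`, the `limit` parameter and the fallback strings are hypothetical choices, not part of the Actor.

```python
def to_prompt_context(items, limit=10):
    """Render dataset rows as numbered lines an LLM can cite by index."""
    lines = []
    for index, item in enumerate(items[:limit], start=1):
        # Fields are nullable in the output schema, so fall back to placeholders.
        source = item.get("source") or "unknown source"
        date = item.get("publishedDate") or "undated"
        label = item.get("sentimentLabel") or "unscored"
        lines.append(f"[{index}] {item['title']} ({source}, {date}, {label})")
        lines.append(f"    {item['url']}")
    return "\n".join(lines)


# Row copied from the sample item above (URL left elided as in the sample).
sample = [{
    "title": "OpenAI announces new reasoning model",
    "url": "https://techcrunch.com/2026/07/01/openai-...",
    "source": "TechCrunch",
    "publishedDate": "2026-07-01",
    "sentimentLabel": "positive",
}]
print(to_prompt_context(sample))
```

Numbering the items lets the model refer to `[1]`, `[2]` and so on in its answer, which makes citations easy to map back to URLs.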

***

### 🚀 How to use

#### 1. Aggregate news for one topic

```json
{
  "mode": "topic",
  "query": "openai",
  "sources": ["googleNews", "bingNews", "duckduckgoNews"],
  "maxPerSource": 50,
  "maxResults": 100,
  "daysBack": 7,
  "sentiment": true
}
```

#### 2. Track a company's news

```json
{
  "mode": "company",
  "query": "stripe.com",
  "daysBack": 30,
  "maxResults": 200
}
```

#### 3. Bulk: many topics in one run

```json
{
  "mode": "bulk",
  "queries": ["openai", "anthropic", "mistral ai", "electric vehicles", "AI regulation"],
  "daysBack": 7,
  "maxResults": 40
}
```
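Because every row from a bulk run is tagged with its originating `query`, the merged dataset can be pivoted per topic on the client side. The sketch below is one possible way to do that, assuming rows shaped like the output schema; `sentiment_by_query` and the sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def sentiment_by_query(items):
    """Map each query to (item_count, mean sentimentScore or None)."""
    counts = defaultdict(int)
    scores = defaultdict(list)
    for item in items:
        counts[item["query"]] += 1
        # sentimentScore may be missing or null, e.g. when sentiment is disabled.
        if item.get("sentimentScore") is not None:
            scores[item["query"]].append(item["sentimentScore"])
    return {
        query: (count, round(mean(scores[query]), 3) if scores[query] else None)
        for query, count in counts.items()
    }


# Hypothetical rows from a bulk run over two topics.
rows = [
    {"query": "openai", "sentimentScore": 0.4},
    {"query": "openai", "sentimentScore": -0.2},
    {"query": "anthropic", "sentimentScore": None},
]
print(sentiment_by_query(rows))
```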

#### From code (Apify API client)

```js
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('logiover/news-intelligence-scraper').call({
  mode: 'topic',
  query: 'openai',
  daysBack: 7,
  sentiment: true,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
const positive = items.filter(i => i.sentimentLabel === 'positive');
console.log(`${positive.length} positive items of ${items.length}`);
```

#### As an MCP tool for AI agents

Wrap this Actor in an MCP server. An agent calls the tool with a topic and receives a clean, deduplicated, sentiment-tagged news feed in its context — no browsing, no HTML parsing, no per-source API juggling on the agent side.

***

### 🔧 Input fields

| Field | Type | Default | Description |
|---|---|---|---|
| `mode` | enum | `topic` | `topic` (one topic), `company` (company news), `bulk` (many topics). |
| `query` | string | — | Topic/keyword/company for `topic` & `company` modes. Quoted phrases respected. |
| `queries` | array | — | Topics for `bulk` mode. |
| `sources` | array | all | Which feeds to aggregate: `googleNews`, `bingNews`, `duckduckgoNews`. |
| `maxPerSource` | int | 50 | Cap pulled from each source per query (1–200). |
| `maxResults` | int | 200 | Final cap on deduplicated items saved per query (1–2000). |
| `daysBack` | int | 30 | Keep only items within N days. 0 = no filter (0–365). |
| `language` | string | `en-US` | Google News `hl` (e.g. `tr-TR`, `de-DE`). |
| `country` | string | `US` | Google News `gl` (e.g. `GB`, `DE`). |
| `dedup` | bool | true | Merge near-duplicates across sources (URL + title similarity). |
| `sentiment` | bool | true | Run lexicon sentiment on title+snippet. |
| `useApifyProxy` | bool | true | Route through Apify datacenter proxy. |

***

### 🧩 How it works

1. **Build feed URLs.** For the query, construct the RSS URL for each enabled source: Google News (`/rss/search?q=…&hl=…&gl=…&ceid=…`), Bing News (`/news/search?q=…&format=rss`), DuckDuckGo News (`/?q=…&iar=news&format=rss`).
2. **Fetch in parallel.** All sources for one query are fetched concurrently over the Apify proxy with a browser-like User-Agent and a retry/backoff policy for transient errors.
3. **Parse RSS.** A source-agnostic regex parser extracts `<item>` blocks and reads `title`, `link`, `description`, `pubDate`/`dc:date`, and `<source>` (name + url). HTML entities and tags are stripped.
4. **Normalize.** Each item is mapped to a flat record with `title`, `url`, `snippet`, `source` (outlet name), `sourceDomain`, `sourceFeed`, `publishedAt`.
5. **Exact dedup.** Items are keyed by `sourceDomain + normalized-title-prefix`. Collisions merge: `duplicateCount` increments, `sourceFeeds` unions, the richer snippet/earliest date wins.
6. **Fuzzy dedup.** Across different domains, a token-Jaccard similarity on normalized titles (>0.72) collapses syndicated/wire copies (e.g. the same AP story on 12 outlets) into one row.
7. **Time filter.** If `daysBack > 0`, items with a parseable date older than the cutoff are dropped; items without a date are kept (sorted last).
8. **Sort.** Newest-first by `publishedAt`.
9. **Sentiment.** The title + snippet are tokenized; each token is looked up in the lexicon (with negation handling), and the score is normalized to `-1..+1` and bucketed into `positive`/`negative`/`neutral`.
10. **Stream.** Each item is pushed to the dataset and one `result` event is charged.
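Steps 7 and 8 can be sketched as follows. This is an illustration of the described behaviour (drop dated items older than the cutoff, keep undated items, sort newest-first with undated last), not the Actor's source code. The function names and the `now` parameter are hypothetical; `publishedAt` is assumed to hold an RFC 822 style string as in the sample output.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_pub_date(raw):
    """Parse an RFC 822 style date string; return an aware datetime or None."""
    if not raw:
        return None
    try:
        parsed = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        # Treat dates without an offset as UTC so comparisons stay valid.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def filter_and_sort(items, days_back, now=None):
    """Apply the daysBack window, then sort newest-first with undated items last."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days_back) if days_back > 0 else None
    dated, undated = [], []
    for item in items:
        published = parse_pub_date(item.get("publishedAt"))
        if published is None:
            undated.append(item)  # no parseable date: always kept
        elif cutoff is None or published >= cutoff:
            dated.append((published, item))
    dated.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in dated] + undated


# Hypothetical items; `now` is pinned so the example is reproducible.
items = [
    {"title": "A", "publishedAt": "Wed, 01 Jul 2026 14:30:00 GMT"},
    {"title": "B", "publishedAt": "Mon, 01 Jun 2026 08:00:00 GMT"},
    {"title": "C", "publishedAt": "not a date"},
    {"title": "D", "publishedAt": "Tue, 30 Jun 2026 09:00:00 GMT"},
]
fixed_now = datetime(2026, 7, 2, 12, 0, tzinfo=timezone.utc)
print([i["title"] for i in filter_and_sort(items, 7, now=fixed_now)])
```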

***

### 💡 Tips & best practices

- **Use all three sources for coverage.** Google News is broadest; Bing and DuckDuckGo catch outlets Google deprioritizes. The dedup step makes more sources strictly better (up to your `maxResults`).
- **Set `daysBack` for freshness.** For dashboards, 7 days; for trend reports, 30; for historical deep-dives, raise `maxResults` and widen the window.
- **Bulk mode for watchlists.** Pass 20–50 topics and let the Actor loop. Each topic's items are tagged with `query` so you can pivot downstream.
- **Sentiment is a signal, not a verdict.** Lexicon sentiment is fast and cheap but misses sarcasm and context. Use it to *rank* and *filter*, not to make final judgments — let the LLM read the items for nuance.
- **Localize for non-English markets.** Set `language: "de-DE"`, `country: "DE"` for German news; `tr-TR`/`TR` for Turkish, etc. The sentiment lexicon is English-centric, so consider disabling `sentiment` for non-English or treating labels as approximate.
- **Schedule recurring runs.** News changes hourly. Schedule a run every few hours over your watchlist and diff datasets to detect new items.
- **Combine with related Actors.** Pair with `company-deep-research-scraper` (for company context), `discussion-intelligence-scraper` (for social/forum opinion), and `bulk-rss-feed-reader` (for direct publisher feeds).
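The scheduling tip above can be combined with sentiment filtering into a simple monitoring loop: diff the latest run against the previous one, then check how many of the new items are negative. The sketch below assumes `url` is a good enough identity key for a row, which is an assumption rather than something the Actor guarantees; `new_items`, `negative_alert` and the sample rows are hypothetical.

```python
def new_items(previous, current):
    """Rows from `current` whose url did not appear in `previous`."""
    seen = {item["url"] for item in previous if item.get("url")}
    return [item for item in current if item.get("url") not in seen]


def negative_alert(items, threshold=3):
    """Return (alert, negatives); alert is True at `threshold` or more negative rows."""
    negatives = [i for i in items if i.get("sentimentLabel") == "negative"]
    return len(negatives) >= threshold, negatives


# Hypothetical rows from two consecutive scheduled runs.
last_run = [{"url": "https://example.com/a", "sentimentLabel": "neutral"}]
this_run = [
    {"url": "https://example.com/a", "sentimentLabel": "neutral"},
    {"url": "https://example.com/b", "sentimentLabel": "negative"},
    {"url": "https://example.com/c", "sentimentLabel": "negative"},
]
fresh = new_items(last_run, this_run)
alert, negatives = negative_alert(fresh, threshold=2)
print(len(fresh), alert)
```

In a real pipeline the two lists would come from the datasets of consecutive runs, for example via `client.dataset(...).iterate_items()` as in the Python example in the API section.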

***

### ❓ FAQ

#### Does this Actor need any API keys?

No. It reads public RSS feeds from Google News, Bing News and DuckDuckGo News. You only need an Apify account.

#### Why three sources instead of just Google News?

Single-source news is biased and incomplete. Different aggregators surface different outlets and different rankings. Merging three and deduplicating gives broader coverage and a `duplicateCount` signal (how widely a story was syndicated) that's itself useful.

#### How does deduplication work?

Two stages: (1) exact key dedup on `source-domain + normalized-title-prefix` catches the same article republished; (2) fuzzy token-Jaccard title similarity (>0.72) catches wire/syndicated stories phrased slightly differently across outlets. `duplicateCount` records how many raw copies merged into the saved row.
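A rough sketch of the two stages, under stated assumptions: the normalization rules, the 80-character title prefix, and counting a lone article as `duplicateCount: 1` are illustrative guesses, since the README does not spell them out. Only the 0.72 Jaccard threshold and the key shape (source domain plus normalized title prefix) come from the description above. Input rows are assumed to carry a single `sourceFeed` each, as in the normalization step.

```python
import re


def normalize_title(title):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9\s]", " ", title.lower()).split())


def jaccard(a, b):
    """Token-set Jaccard similarity of two normalized titles."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def dedupe(items, threshold=0.72, prefix_len=80):
    """Merge exact-key and fuzzy-title duplicates, tracking counts and feeds."""
    merged = []
    for item in items:
        norm = normalize_title(item["title"])
        key = (item["sourceDomain"], norm[:prefix_len])
        match = None
        for kept in merged:
            same_key = kept["_key"] == key  # stage 1: exact key
            similar = (  # stage 2: fuzzy match across different domains
                kept["sourceDomain"] != item["sourceDomain"]
                and jaccard(kept["_norm"], norm) > threshold
            )
            if same_key or similar:
                match = kept
                break
        if match is None:
            merged.append({**item, "_key": key, "_norm": norm,
                           "duplicateCount": 1, "sourceFeeds": [item["sourceFeed"]]})
        else:
            match["duplicateCount"] += 1
            if item["sourceFeed"] not in match["sourceFeeds"]:
                match["sourceFeeds"].append(item["sourceFeed"])
    hidden = {"_key", "_norm", "sourceFeed"}
    return [{k: v for k, v in row.items() if k not in hidden} for row in merged]


# Hypothetical raw items from three feeds.
raw = [
    {"title": "OpenAI announces new reasoning model", "sourceDomain": "techcrunch.com", "sourceFeed": "googleNews"},
    {"title": "OpenAI Announces New Reasoning Model!", "sourceDomain": "techcrunch.com", "sourceFeed": "bingNews"},
    {"title": "OpenAI announces a new reasoning model", "sourceDomain": "reuters.com", "sourceFeed": "googleNews"},
    {"title": "Stripe raises new funding round", "sourceDomain": "stripe.com", "sourceFeed": "duckduckgoNews"},
]
for row in dedupe(raw):
    print(row["title"], row["duplicateCount"], row["sourceFeeds"])
```

The pairwise comparison is quadratic in the number of items, which is acceptable at the documented cap of 200 items per source but would need an index (for example, token blocking) at larger scale.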

#### Is the sentiment accurate?

It's a fast lexicon model (~250 weighted terms + negation), not a transformer. It's good for *trends and ranking* (e.g. "show me the most negative items"), less reliable on sarcasm or domain-specific jargon. For production-grade sentiment, post-process the items with an LLM.
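To show what a lexicon model with negation handling looks like in practice, here is a deliberately tiny sketch. The eight-word lexicon, the negator list, the normalization (sum of weights divided by hits times the maximum weight) and the neutral band are all assumptions made for illustration; the Actor's actual word list, weights and thresholds are not documented in this README.

```python
import re

# Illustrative lexicon only; not the Actor's real word list or weights.
LEXICON = {"growth": 2, "win": 2, "record": 1, "strong": 2,
           "lawsuit": -2, "crash": -3, "decline": -2, "fraud": -3}
NEGATORS = {"not", "no", "never", "without"}
MAX_WEIGHT = max(abs(weight) for weight in LEXICON.values())


def score_text(text, neutral_band=0.05):
    """Return (score in -1..+1, label) for a headline or snippet."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total, hits = 0, 0
    for index, token in enumerate(tokens):
        weight = LEXICON.get(token)
        if weight is None:
            continue
        # Flip polarity when the previous token is a negator ("no fraud").
        if index > 0 and tokens[index - 1] in NEGATORS:
            weight = -weight
        total += weight
        hits += 1
    score = total / (hits * MAX_WEIGHT) if hits else 0.0
    score = round(max(-1.0, min(1.0, score)), 2)
    if score > neutral_band:
        label = "positive"
    elif score < -neutral_band:
        label = "negative"
    else:
        label = "neutral"
    return score, label


for headline in ["Company posts record growth",
                 "Shares crash after fraud lawsuit",
                 "Regulator finds no fraud",
                 "Quarterly meeting scheduled"]:
    print(headline, score_text(headline))
```

The third headline shows why negation handling matters: without it, "no fraud" would score as strongly negative.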

#### How far back can I get news?

The RSS feeds return recent items (typically the last few days to weeks depending on the source and query volume). `daysBack` filters within that window. For months/years of history, combine with `wayback-machine-url-extractor` or a dedicated archive Actor.

#### Why do some items have no `publishedDate`?

Some feeds omit `<pubDate>`. Those items are kept (they may still be relevant) but sorted last. The `publishedAt` raw string is always preserved when available.

#### Can I get the full article text?

This Actor returns title + snippet (the RSS `<description>`). For full article bodies, pass the `url` field into a content extractor like `website-text-markdown-crawler`.

#### How is this priced?

Pay-per-result: one `result` event per saved (deduplicated) news item. Runs that yield zero items (no matches) are free.
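For budgeting, an upper bound on result charges follows from the per-query cap: at most `maxResults` items are saved per query. The sketch below assumes the listed starting price of $3.50 per 1,000 results applies unchanged; actual rates depend on your subscription plan and Store discounts. `max_cost_usd` is a hypothetical helper.

```python
def max_cost_usd(num_queries, max_results, price_per_1000=3.50):
    """Worst-case result charges if every query reaches its maxResults cap."""
    return round(num_queries * max_results * price_per_1000 / 1000, 2)


# The bulk example above: 5 topics with maxResults = 40.
print(max_cost_usd(5, 40))
# The single-topic example: 1 topic with maxResults = 100.
print(max_cost_usd(1, 100))
```

Real costs are usually lower, since deduplication and the `daysBack` window reduce the number of saved items.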

#### Will I get rate-limited?

The Actor uses the Apify datacenter proxy and polite delays, and news RSS endpoints are generally lenient. For very high-frequency runs, lower `maxPerSource` and leave more time between runs (the input schema exposes no delay setting).

#### Can AI agents call this directly?

Yes. Expose it through an MCP server or Apify tool integration; the agent passes a topic and gets a clean JSON news feed back. This is the primary design target.

***

### 🔗 Related Actors

- **company-deep-research-scraper** — company dossier (tech stack, socials, contacts) for context.
- **discussion-intelligence-scraper** — Reddit + Hacker News + Product Hunt + Stack Exchange opinion.
- **bulk-rss-feed-reader** — read specific publisher RSS feeds directly.
- **substack-newsletter-scraper** — Substack newsletter posts.
- **google-news-scraper** — single-source Google News.
- **website-text-markdown-crawler** — extract full article body from a news URL.
- **hacker-news-search-scraper** — HN-specific search.
- **reddit-subreddit-scraper** / **reddit-search-scraper** — Reddit-specific.

***

### 📝 Changelog

#### 2026-07-02 — v1.0

- Initial release.
- 3 modes: `topic`, `company`, `bulk`.
- 3 sources: Google News, Bing News, DuckDuckGo News (any subset).
- Two-stage dedup (exact key + fuzzy title Jaccard).
- Lexicon sentiment (-1..+1, positive/negative/neutral).
- Time filtering (`daysBack`), localization (`hl`/`gl`).
- Apify datacenter proxy default.
- Pay-per-result (`result` event per saved item).

***

### ⚖️ Disclaimer

This Actor reads publicly available RSS feeds. It does not authenticate, bypass access controls, or scrape behind paywalls. News content is owned by the respective publishers; respect their Terms of Service. Use for monitoring, research and AI-agent grounding on data that is already public.

# Actor input Schema

## `mode` (type: `string`):

How to run.

• **topic** — aggregate news for one or many topics/keywords (highest volume)
• **company** — news about a specific company (by name or domain)
• **bulk** — batch many topics into one run, each producing a merged feed

## `query` (type: `string`):

Free-text query for **topic** or **company** mode, e.g. `openai`, `"electric vehicles"`, `AI regulation`. Quoted phrases are respected.

## `queries` (type: `array`):

Array of topics/keywords for **bulk** mode.

## `sources` (type: `array`):

Which RSS feeds to aggregate. More sources = more coverage + dedup benefit. Defaults to all.

## `maxPerSource` (type: `integer`):

Cap pulled from each source per query. Higher = more coverage before dedup.

## `maxResults` (type: `integer`):

Final cap on deduplicated items saved per query.

## `daysBack` (type: `integer`):

Only keep items published within this many days. 0 = no time filter.

## `language` (type: `string`):

BCP-47 language for Google News, e.g. `en-US`, `tr-TR`, `de-DE`, `fr-FR`.

## `country` (type: `string`):

ISO 3166-1 alpha-2 country for Google News, e.g. `US`, `GB`, `DE`.

## `dedup` (type: `boolean`):

Merge near-duplicate articles across sources (URL + title similarity). Recommended.

## `sentiment` (type: `boolean`):

Run a lightweight lexicon sentiment score (-1..+1) on each title+snippet. Cheap, no ML deps.

## `useApifyProxy` (type: `boolean`):

Route through Apify datacenter proxy. Recommended to avoid per-IP rate limits on news RSS.

## Actor input object example

```json
{
  "mode": "topic",
  "query": "openai",
  "queries": [
    "openai",
    "anthropic"
  ],
  "sources": [
    "googleNews",
    "bingNews"
  ],
  "maxPerSource": 30,
  "maxResults": 60,
  "daysBack": 7,
  "language": "en-US",
  "country": "US",
  "dedup": true,
  "sentiment": true,
  "useApifyProxy": true
}
```

# Actor output Schema

## `results` (type: `string`):

Full dataset of merged, deduplicated, sentiment-scored news items.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "mode": "topic",
    "query": "openai",
    "queries": [
        "openai",
        "anthropic"
    ],
    "sources": [
        "googleNews",
        "bingNews"
    ],
    "maxPerSource": 30,
    "maxResults": 60,
    "daysBack": 7,
    "language": "en-US",
    "country": "US",
    "dedup": true,
    "sentiment": true,
    "useApifyProxy": true
};

// Run the Actor and wait for it to finish
const run = await client.actor("logiover/news-intelligence-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "mode": "topic",
    "query": "openai",
    "queries": [
        "openai",
        "anthropic",
    ],
    "sources": [
        "googleNews",
        "bingNews",
    ],
    "maxPerSource": 30,
    "maxResults": 60,
    "daysBack": 7,
    "language": "en-US",
    "country": "US",
    "dedup": True,
    "sentiment": True,
    "useApifyProxy": True,
}

# Run the Actor and wait for it to finish
run = client.actor("logiover/news-intelligence-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "mode": "topic",
  "query": "openai",
  "queries": [
    "openai",
    "anthropic"
  ],
  "sources": [
    "googleNews",
    "bingNews"
  ],
  "maxPerSource": 30,
  "maxResults": 60,
  "daysBack": 7,
  "language": "en-US",
  "country": "US",
  "dedup": true,
  "sentiment": true,
  "useApifyProxy": true
}' |
apify call logiover/news-intelligence-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=logiover/news-intelligence-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "News Intelligence Scraper — AI Agent Real-Time News API",
        "description": "Multi-source real-time news aggregator for AI agents: Google News, Bing News and DuckDuckGo News merged, deduplicated, source-ranked and sentiment-scored. One topic or company to clean structured news feed. No API key, no browser.",
        "version": "1.0",
        "x-build-id": "gcbEYS10kIbYNtnGj"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/logiover~news-intelligence-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-logiover-news-intelligence-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/logiover~news-intelligence-scraper/runs": {
            "post": {
                "operationId": "runs-sync-logiover-news-intelligence-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/logiover~news-intelligence-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-logiover-news-intelligence-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Mode",
                        "enum": [
                            "topic",
                            "company",
                            "bulk"
                        ],
                        "type": "string",
                        "description": "How to run.\n\n• **topic** — aggregate news for one or many topics/keywords (highest volume)\n• **company** — news about a specific company (by name or domain)\n• **bulk** — batch many topics into one run, each producing a merged feed",
                        "default": "topic"
                    },
                    "query": {
                        "title": "Topic / keyword (topic & company)",
                        "type": "string",
                        "description": "Free-text query for **topic** or **company** mode, e.g. `openai`, `\"electric vehicles\"`, `AI regulation`. Quoted phrases are respected."
                    },
                    "queries": {
                        "title": "Topics (bulk mode)",
                        "type": "array",
                        "description": "Array of topics/keywords for **bulk** mode.",
                        "items": {
                            "type": "string"
                        },
                        "default": []
                    },
                    "sources": {
                        "title": "News sources",
                        "type": "array",
                        "description": "Which RSS feeds to aggregate. More sources = more coverage + dedup benefit. Defaults to all.",
                        "items": {
                            "type": "string"
                        },
                        "default": [
                            "googleNews",
                            "bingNews"
                        ]
                    },
                    "maxPerSource": {
                        "title": "Max items per source",
                        "minimum": 1,
                        "maximum": 200,
                        "type": "integer",
                        "description": "Cap pulled from each source per query. Higher = more coverage before dedup.",
                        "default": 50
                    },
                    "maxResults": {
                        "title": "Max results (after dedup)",
                        "minimum": 1,
                        "maximum": 2000,
                        "type": "integer",
                        "description": "Final cap on deduplicated items saved per query.",
                        "default": 200
                    },
                    "daysBack": {
                        "title": "Days back",
                        "minimum": 0,
                        "maximum": 365,
                        "type": "integer",
                        "description": "Only keep items published within this many days. 0 = no time filter.",
                        "default": 30
                    },
                    "language": {
                        "title": "Language (hl)",
                        "type": "string",
                        "description": "BCP-47 language for Google News, e.g. `en-US`, `tr-TR`, `de-DE`, `fr-FR`.",
                        "default": "en-US"
                    },
                    "country": {
                        "title": "Country (gl)",
                        "type": "string",
                        "description": "ISO 3166-1 alpha-2 country for Google News, e.g. `US`, `GB`, `DE`.",
                        "default": "US"
                    },
                    "dedup": {
                        "title": "Deduplicate",
                        "type": "boolean",
                        "description": "Merge near-duplicate articles across sources (URL + title similarity). Recommended.",
                        "default": true
                    },
                    "sentiment": {
                        "title": "Score sentiment",
                        "type": "boolean",
                        "description": "Compute a lightweight lexicon-based sentiment score (-1 to +1) for each title and snippet. Cheap, with no ML dependencies.",
                        "default": true
                    },
                    "useApifyProxy": {
                        "title": "Use Apify datacenter proxy",
                        "type": "boolean",
                        "description": "Route through Apify datacenter proxy. Recommended to avoid per-IP rate limits on news RSS.",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
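
The numeric bounds and defaults in the input schema above can be checked locally before a run is started. The following is a minimal sketch, not part of the Actor: the helper name `build_input` is hypothetical, and it only covers the fields whose constraints are visible in the schema (`maxPerSource`, `maxResults`, `daysBack`, `language`, `country`, `dedup`, `sentiment`, `useApifyProxy`). Any other field is passed through unchanged.

```python
# Bounds and defaults copied from the input schema above: (minimum, maximum, default).
INT_FIELDS = {
    "maxPerSource": (1, 200, 50),
    "maxResults": (1, 2000, 200),
    "daysBack": (0, 365, 30),
}
BOOL_DEFAULTS = {"dedup": True, "sentiment": True, "useApifyProxy": True}
STR_DEFAULTS = {"language": "en-US", "country": "US"}


def build_input(overrides=None):
    """Hypothetical helper: apply schema defaults and check bounds locally."""
    remaining = dict(overrides or {})
    result = {}

    for name, (low, high, default) in INT_FIELDS.items():
        value = remaining.pop(name, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{name} must be an integer")
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        result[name] = value

    for name, default in BOOL_DEFAULTS.items():
        value = remaining.pop(name, default)
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a boolean")
        result[name] = value

    for name, default in STR_DEFAULTS.items():
        value = remaining.pop(name, default)
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a string")
        result[name] = value

    # Fields not covered by this sketch are passed through as-is.
    result.update(remaining)
    return result
```

Calling `build_input({"daysBack": 0})` keeps every other default and disables the time filter, while an out-of-range value such as `maxPerSource=201` raises `ValueError` before any request is sent.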

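The run object described by `runsResponseSchema` can be reduced to the few fields most integrations need. This is a sketch under stated assumptions: `summarize_run` is a hypothetical helper, the response is assumed to be already decoded from JSON into a `dict`, and only `data.id` is treated as required.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as "2025-01-08T00:00:00.000Z"."""
    if not value:
        return None
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_run(response):
    """Hypothetical helper: pick out key fields from a run response."""
    run = response["data"]
    started = parse_timestamp(run.get("startedAt"))
    finished = parse_timestamp(run.get("finishedAt"))
    duration = None
    if started and finished:
        duration = (finished - started).total_seconds()
    return {
        "run_id": run["id"],
        "status": run.get("status"),
        "dataset_id": run.get("defaultDatasetId"),
        "duration_secs": duration,
        "cost_usd": run.get("usageTotalUsd"),
    }
```

The `dataset_id` value is the one to pass on when fetching results, since the deduplicated news items are stored in the run's default dataset.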