Pricing

from $3.50 / 1,000 results

News Intelligence Scraper — AI Agent Real-Time News API

Multi-source real-time news aggregator for AI agents: Google News, Bing News and DuckDuckGo News merged, deduplicated, source-ranked and sentiment-scored. Give it one topic or company and get back a clean, structured news feed. No API key, no browser.

Developer

Logiover

Last modified

6 days ago

News Intelligence Scraper — Real-Time Multi-Source News API for AI Agents (No API Key)

Drop in a topic, keyword or company name and get back a clean, deduplicated, sentiment-scored news feed merged from Google News, Bing News and DuckDuckGo News in a single Apify run. Every article arrives as a flat record with headline, URL, snippet, source outlet, source domain, publish date, sentiment score & label, and a duplicate count — the real-time grounding layer for AI agents, brand-monitoring bots and RAG pipelines that must cite current sources instead of stale training data. Fast, no browser, no API key, no per-source scrapers to maintain.

🏆 Why this news scraper?

15 fields per article · up to 2,000 deduplicated items per query, many queries per run · 3 news sources merged & cross-deduplicated in one call · lexicon sentiment (-1..+1) with no ML deps · direct HTTP + RSS (no browser, no key) · export to JSON / CSV / Excel. The multi-source news API alternative for monitoring, market intelligence and agentic grounding.

✨ What this Actor does / Key features

🌐 Three sources, one query — Google News, Bing News and DuckDuckGo News RSS feeds are fanned out per query and merged. Pick any subset.
🔀 Cross-source deduplication — exact key dedup (source-domain + normalized title) plus fuzzy token-Jaccard title similarity (>0.72) collapses syndicated/wire copies across outlets into one row.
📈 Sentiment scoring — an AFINN-style lexicon (~250 weighted terms + negation handling) tags every headline+snippet with a normalized -1..+1 sentimentScore and a positive/negative/neutral label. Fast, no ML dependencies.
🏷️ Full source attribution — each item carries the outlet source, the sourceDomain, the list of feeds (sourceFeeds) that surfaced it, and a duplicateCount showing how many raw copies were merged (a syndication signal).
📅 Time filtering — daysBack keeps only items published within a rolling window; results are sorted newest-first.
🌍 Localization — Google News hl/gl params for language + country targeting (en-US, tr-TR, de-DE, fr-FR, …).
🏢 Company mode — pass a company name or domain to track brand news specifically.
📚 Bulk mode — feed dozens of topics and get one merged feed per topic in a single run, each tagged with its originating query.
🤖 AI-agent-friendly schema — predictable flat fields, ISO dates, nullable values, per-item attribution — drop straight into a prompt or a vector store.
⚡ No keys, no browser — pure HTTP + RSS parsing on a small Node 20 container. Cheap, fast, resilient. Empty results (no matches) are free.

🚀 Quick start (3 steps)

1. Configure: pick a mode (topic, company or bulk), type your query (or queries for bulk), and optionally set sources, daysBack and sentiment.
2. Run: click Start. The Actor fetches every source in parallel, merges, deduplicates, scores sentiment and streams articles into your dataset.
3. Get your data: open the Output tab and export to JSON, CSV, Excel or XML, or pull it via the Apify API.

📥 Input

The only required field is mode. Use query for topic/company modes and queries for bulk mode.

Example — aggregate news for one topic

{
  "mode": "topic",
  "query": "openai",
  "sources": ["googleNews", "bingNews", "duckduckgoNews"],
  "maxPerSource": 50,
  "maxResults": 100,
  "daysBack": 7,
  "sentiment": true
}

Example — track a company's news (brand monitoring)

{
  "mode": "company",
  "query": "stripe.com",
  "daysBack": 30,
  "maxResults": 200
}

Example — bulk watchlist, many topics in one run

{
  "mode": "bulk",
  "queries": ["openai", "anthropic", "mistral ai", "electric vehicles", "AI regulation"],
  "daysBack": 7,
  "maxResults": 40
}

Field	Type	Description
`mode`	string	`topic` (one topic/keyword), `company` (company news by name or domain), or `bulk` (many topics). Required.
`query`	string	Free-text topic/keyword/company for `topic` & `company` modes. Quoted phrases are respected.
`queries`	array	Array of topics/keywords for `bulk` mode.
`sources`	array	Which feeds to aggregate: `googleNews`, `bingNews`, `duckduckgoNews`. More sources = more coverage + dedup benefit.
`maxPerSource`	integer	Maximum number of items pulled from each source per query (1–200). Default 50.
`maxResults`	integer	Final cap on deduplicated items saved per query (1–2000). Default 200.
`daysBack`	integer	Keep only items published within N days (0–365). `0` = no time filter. Default 30.
`language`	string	Google News `hl` (BCP-47), e.g. `en-US`, `tr-TR`, `de-DE`.
`country`	string	Google News `gl` (ISO 3166-1 alpha-2), e.g. `US`, `GB`, `DE`.
`dedup`	boolean	Merge duplicate and near-duplicate articles across sources (exact source-domain + normalized-title key, then fuzzy title similarity). Recommended.
`sentiment`	boolean	Run lexicon sentiment on each title+snippet. Cheap, no ML deps.
`useApifyProxy`	boolean	Route through the Apify datacenter proxy to avoid per-IP rate limits on news RSS. Recommended.

Tip: Using all three sources generally gives broader coverage within your maxResults cap, because the dedup step folds the overlap away and the duplicateCount field shows how widely each story was syndicated. Set daysBack: 7 for dashboards, 30 for trend reports.
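
The ranges in the table above can be checked before a run is started. The following is a minimal sketch, not the Actor's own validation: the helper name `build_input` and its Python-style parameter names are hypothetical, and only the emitted keys and their limits are taken from the table.

```python
from typing import Any, Dict, List, Optional

ALLOWED_MODES = {"topic", "company", "bulk"}
ALLOWED_SOURCES = {"googleNews", "bingNews", "duckduckgoNews"}


def build_input(
    mode: str,
    query: Optional[str] = None,
    queries: Optional[List[str]] = None,
    sources: Optional[List[str]] = None,
    max_per_source: int = 50,   # documented default
    max_results: int = 200,     # documented default
    days_back: int = 30,        # documented default
) -> Dict[str, Any]:
    """Build an input dict and reject values outside the documented ranges."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    if mode == "bulk" and not queries:
        raise ValueError("bulk mode needs a non-empty 'queries' list")
    if mode != "bulk" and not query:
        raise ValueError(f"{mode} mode needs a 'query' string")
    if sources is not None and set(sources) - ALLOWED_SOURCES:
        raise ValueError(f"unknown sources: {sorted(set(sources) - ALLOWED_SOURCES)}")
    if not 1 <= max_per_source <= 200:
        raise ValueError("maxPerSource must be between 1 and 200")
    if not 1 <= max_results <= 2000:
        raise ValueError("maxResults must be between 1 and 2000")
    if not 0 <= days_back <= 365:
        raise ValueError("daysBack must be between 0 and 365")

    payload: Dict[str, Any] = {
        "mode": mode,
        "maxPerSource": max_per_source,
        "maxResults": max_results,
        "daysBack": days_back,
    }
    if mode == "bulk":
        payload["queries"] = list(queries or [])
    else:
        payload["query"] = query
    if sources is not None:
        # Only sent when chosen explicitly; the Actor's own default is not assumed here.
        payload["sources"] = list(sources)
    return payload
```

For example, `build_input("topic", query="openai", days_back=7, max_results=100)` yields the same keys as the first input example, minus the optional `sources` and `sentiment` fields.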

📤 Output

One row per deduplicated article — 15 fields, exportable to JSON, CSV, Excel or XML. Here is a trimmed sample record:

{
  "query": "openai",
  "title": "OpenAI announces new reasoning model",
  "url": "https://techcrunch.com/2026/07/01/openai-reasoning-model",
  "snippet": "The company said the new model improves on prior reasoning benchmarks while cutting latency… (first ~500 chars of the RSS description)",
  "source": "TechCrunch",
  "sourceDomain": "techcrunch.com",
  "sourceFeeds": ["googleNews", "bingNews"],
  "publishedAt": "Wed, 01 Jul 2026 14:30:00 GMT",
  "publishedDate": "2026-07-01",
  "language": "en-US",
  "sentimentScore": 0.42,
  "sentimentLabel": "positive",
  "imageUrl": "https://techcrunch.com/…/thumb.jpg",
  "duplicateCount": 3,
  "scrapedAt": "2026-07-02T12:00:00.000Z"
}

Use the Overview view to scan all items newest-first with sentiment and source, or the By query view to pivot on the originating topic.

🤖 Why AI agents need this

News is one of the highest-value grounding tasks for agentic systems: it's time-sensitive (yesterday's answer is wrong today), fragmented (no single source has everything), and noisy (the same story is republished dozens of times). An agent that browses one publisher gets a partial view; an agent that hits a single news API gets rate-limited or charged per call. This Actor solves all three at once — a single call returns a clean, deduplicated, sentiment-tagged table of articles ready for an LLM to read, summarize or cite.

💡 Use cases

Brand & reputation monitoring — watch a company name across three feeds, deduplicate syndications, and surface the sentiment trend over 30 days.
Market intelligence — query a basket of industry keywords weekly and build a sentiment-weighted news index.
Event grounding — answer "why did X move?" by pulling this week's deduped news for the company, sorted by sentiment, and summarizing the negative cluster.
Competitor tracking — monitor competitor names and surface only genuinely new items — dedup kills the wire echo chamber.
RAG freshness — embed the latest N items per topic into a vector store so answers cite current events, not stale training data.
Crisis detection — run hourly on a watchlist and alert when the negative-sentiment item count crosses a threshold.
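
The crisis-detection use case can be sketched as a small post-processing step over the dataset items. This is an illustration only: the function name `negative_alerts` is hypothetical, the "at or above the threshold" rule is an assumption, and only the `query` and `sentimentLabel` fields come from the output schema.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def negative_alerts(items: Iterable[Mapping], threshold: int) -> Dict[str, int]:
    """Return {query: negative item count} for queries at or above the threshold."""
    counts = Counter(
        item["query"]
        for item in items
        if item.get("sentimentLabel") == "negative"
    )
    return {query: n for query, n in counts.items() if n >= threshold}


sample = [
    {"query": "acme", "sentimentLabel": "negative"},
    {"query": "acme", "sentimentLabel": "negative"},
    {"query": "acme", "sentimentLabel": "positive"},
    {"query": "globex", "sentimentLabel": "negative"},
]
alerts = negative_alerts(sample, threshold=2)
print(alerts)
```

Run hourly, a non-empty result would be the signal to notify a channel or open a ticket.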

👥 Who uses it

AI-agent and RAG builders needing a real-time grounding layer · comms & PR teams running brand-monitoring dashboards · market-intelligence and equity-research analysts · GTM/competitive-intelligence teams tracking rivals · data teams building news datasets and sentiment indices.

💰 Pricing

This Actor runs on a simple pay-per-result model — one charge per saved (deduplicated) news item, with no separate Apify platform fees to calculate. Runs that yield zero items (no matches) are free. Try it on the free tier first, then scale up. See the Pricing tab on this page for the current rate.

❓ Frequently Asked Questions

Is it legal to scrape news this way?

This Actor reads publicly available RSS feeds from Google News, Bing News and DuckDuckGo News. It does not authenticate, bypass access controls, or scrape behind paywalls. News content is owned by the respective publishers; respect their Terms of Service and use the data for monitoring, research and AI-agent grounding.

Do these news services have a public API? Is this an API alternative?

There is no single unified, free public news API that merges Google, Bing and DuckDuckGo and deduplicates across them. This Actor works as that multi-source news API alternative: it reads each provider's public RSS, merges and cross-deduplicates the results, scores sentiment, and hands you one clean structured feed, with no per-source API keys or client code to maintain.

Do I need an API key or a login?

No. The Actor pulls public RSS feeds over direct HTTP, so no account, login or API key is needed on the news side. You only need an Apify account.

How much data can I get?

Up to 2,000 deduplicated items per query, with many queries per run in bulk mode. RSS feeds return recent items (typically the last few days to weeks depending on the source and query volume); daysBack filters within that window.

How do I export the news data to CSV or JSON?

Run the Actor, then export the resulting dataset as CSV, JSON, Excel or XML from the Apify console, or pull it via the Apify API.
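
For the API route, a hedged sketch of building a dataset export URL is shown below. It assumes the Apify API v2 dataset items endpoint with its `format` and `fields` query parameters; the helper name `dataset_export_url` and the dataset ID `abc123` are placeholders, and authentication (needed for private datasets) is left out.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
EXPORT_FORMATS = {"json", "csv", "xlsx", "xml"}


def dataset_export_url(
    dataset_id: str,
    fmt: str = "json",
    fields: Optional[Iterable[str]] = None,
) -> str:
    """Build the download URL for a run's dataset in the requested format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"format must be one of {sorted(EXPORT_FORMATS)}")
    params = {"format": fmt}
    if fields:
        # Restrict the export to selected columns, e.g. title and url only.
        params["fields"] = ",".join(fields)
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


# "abc123" is a placeholder; use the dataset ID of your own run.
csv_url = dataset_export_url("abc123", "csv")
print(csv_url)
```

The resulting URL can be fetched with any HTTP client or pasted into a spreadsheet import.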

Why three sources instead of just Google News?

Single-source news is biased and incomplete — different aggregators surface different outlets and rankings. Merging three and deduplicating gives broader coverage plus a duplicateCount signal (how widely a story was syndicated) that's itself useful.

How does deduplication work?

Two stages: (1) exact key dedup on source-domain + normalized-title-prefix catches the same article republished; (2) fuzzy token-Jaccard title similarity (>0.72) catches wire/syndicated stories phrased slightly differently across outlets. duplicateCount records how many raw copies merged.
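
The two stages can be sketched as follows. This is an illustrative reimplementation, not the Actor's source: the normalization rules and helper names are assumptions, the full normalized title is used where the Actor uses a title prefix, and `duplicateCount` is assumed to include the kept copy. Only the 0.72 threshold and the key shape come from the description above.

```python
import re
from typing import Dict, List, Set, Tuple
from urllib.parse import urlparse

JACCARD_THRESHOLD = 0.72  # threshold quoted above


def normalize_title(title: str) -> str:
    """Lowercase and keep only alphanumeric tokens (assumed normalization)."""
    return " ".join(re.findall(r"[a-z0-9]+", title.lower()))


def domain_of(url: str) -> str:
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def jaccard(a: Set[str], b: Set[str]) -> float:
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def dedupe(items: List[dict]) -> List[dict]:
    merged: List[dict] = []
    by_key: Dict[Tuple[str, str], dict] = {}
    for item in items:
        norm = normalize_title(item["title"])
        tokens = set(norm.split())
        key = (domain_of(item["url"]), norm)
        # Stage 1: exact key match on (source domain, normalized title).
        target = by_key.get(key)
        if target is None:
            # Stage 2: fuzzy match on title tokens against rows kept so far.
            for kept in merged:
                if jaccard(tokens, kept["_tokens"]) > JACCARD_THRESHOLD:
                    target = kept
                    break
        if target is None:
            row = {**item, "duplicateCount": 1, "_tokens": tokens}
            merged.append(row)
            by_key[key] = row
        else:
            target["duplicateCount"] += 1
    for row in merged:
        del row["_tokens"]  # internal helper field, not part of the output
    return merged


rows = dedupe([
    {"title": "OpenAI announces new reasoning model", "url": "https://techcrunch.com/a"},
    {"title": "OpenAI Announces New Reasoning Model!", "url": "https://www.techcrunch.com/b"},
    {"title": "OpenAI announces a new reasoning model", "url": "https://reuters.com/c"},
    {"title": "Stripe raises prices in Europe", "url": "https://ft.com/d"},
])
print([(r["title"], r["duplicateCount"]) for r in rows])
```

In the sample, the second item collapses through the exact key and the third through the fuzzy stage (5 shared tokens out of 6, about 0.83), so two rows remain. The pairwise comparison is quadratic, which is acceptable at a few thousand items per query.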

Is the sentiment accurate?

It's a fast lexicon model (~250 weighted terms + negation), not a transformer. It's great for trends and ranking (e.g. "show me the most negative items") and less reliable on sarcasm or domain jargon. For production-grade sentiment, post-process the items with an LLM. The lexicon is English-centric, so treat labels as approximate for non-English queries (or disable sentiment).
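
To make the approach concrete, here is a toy lexicon scorer in the same spirit. Everything specific in it is an assumption for illustration: the nine-word lexicon, the negator list, the normalization (sum of weights divided by matched terms times the maximum weight) and the ±0.05 label cut-offs are not taken from the Actor.

```python
import re
from typing import Tuple

# Tiny illustrative lexicon with AFINN-like weights in the range -5..+5.
LEXICON = {
    "good": 3, "wins": 3, "growth": 2, "improves": 2, "record": 1,
    "lawsuit": -2, "fails": -2, "breach": -3, "crash": -3,
}
NEGATORS = {"not", "no", "never", "without"}
MAX_WEIGHT = 5
NEUTRAL_BAND = 0.05  # assumed cut-off for the neutral label


def score_sentiment(text: str) -> Tuple[float, str]:
    """Return (score in -1..+1, label) for a headline plus snippet."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0
    hits = 0
    for i, token in enumerate(tokens):
        weight = LEXICON.get(token)
        if weight is None:
            continue
        # Negation handling: flip the sign when the previous token is a negator.
        if i > 0 and tokens[i - 1] in NEGATORS:
            weight = -weight
        total += weight
        hits += 1
    score = 0.0 if hits == 0 else max(-1.0, min(1.0, total / (hits * MAX_WEIGHT)))
    if score > NEUTRAL_BAND:
        label = "positive"
    elif score < -NEUTRAL_BAND:
        label = "negative"
    else:
        label = "neutral"
    return round(score, 2), label


print(score_sentiment("Data breach triggers lawsuit"))
print(score_sentiment("Results were not good"))
```

The second call shows the negation rule turning a positive term negative. Sarcasm and jargon pass straight through a scorer like this, which is why an LLM pass is suggested for production-grade labels.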

Why do some items have no `publishedDate`?

Some feeds omit <pubDate>. Those items are kept (they may still be relevant) but sorted last; the raw publishedAt string is preserved when available.
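
The same ordering can be reproduced when post-processing items yourself. The sketch below is an assumption about the behaviour, not the Actor's code: it parses the RFC 822 style `publishedAt` string with the standard library, applies a `daysBack`-like window, sorts newest-first and appends undated items at the end. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import List, Optional


def parse_pub_date(raw: Optional[str]) -> Optional[datetime]:
    """Parse an RSS <pubDate> string; return None when missing or malformed."""
    if not raw:
        return None
    try:
        parsed = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # treat naive dates as UTC
    return parsed


def filter_and_sort(items: List[dict], days_back: int,
                    now: Optional[datetime] = None) -> List[dict]:
    now = now or datetime.now(timezone.utc)
    # days_back == 0 means no time filter, as in the input table.
    cutoff = now - timedelta(days=days_back) if days_back > 0 else None
    dated, undated = [], []
    for item in items:
        when = parse_pub_date(item.get("publishedAt"))
        if when is None:
            undated.append(item)  # kept, but sorted last
        elif cutoff is None or when >= cutoff:
            dated.append((when, item))
    dated.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in dated] + undated


feed = [
    {"title": "a", "publishedAt": "Wed, 01 Jul 2026 14:30:00 GMT"},
    {"title": "b", "publishedAt": "Mon, 01 Jun 2026 09:00:00 GMT"},
    {"title": "c", "publishedAt": None},
    {"title": "d", "publishedAt": "Thu, 02 Jul 2026 08:00:00 GMT"},
]
fixed_now = datetime(2026, 7, 2, 12, 0, tzinfo=timezone.utc)
print([i["title"] for i in filter_and_sort(feed, days_back=7, now=fixed_now)])
```

With a 7-day window the June item drops out and the undated item lands last.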

Can I get the full article text?

This Actor returns title + snippet (the RSS <description>). For full article bodies, pass the url field into a content extractor like the Website Text & Markdown Crawler.

Can AI agents call this directly?

Yes — this is the primary design target. Expose it through an MCP server or Apify tool integration; the agent passes a topic and gets a clean JSON news feed back, no browsing or HTML parsing on the agent side.
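
On the agent side, the flat records can be turned into a citable context block with a few lines of code. This is one possible layout under stated assumptions: the function name `to_prompt_context`, the `[n]` citation format and the five-item limit are illustrative choices, while the field names come from the output schema.

```python
from typing import List


def to_prompt_context(items: List[dict], limit: int = 5) -> str:
    """Render the first `limit` items as numbered, citable source lines."""
    lines = []
    for n, item in enumerate(items[:limit], start=1):
        date = item.get("publishedDate") or "undated"
        lines.append(f"[{n}] {item['title']} ({item['source']}, {date}) {item['url']}")
        if item.get("snippet"):
            lines.append(f"    {item['snippet']}")
    return "\n".join(lines)


context = to_prompt_context([
    {
        "title": "OpenAI announces new reasoning model",
        "source": "TechCrunch",
        "publishedDate": "2026-07-01",
        "url": "https://techcrunch.com/2026/07/01/openai-reasoning-model",
        "snippet": "The company said the new model improves on prior benchmarks.",
    },
    {
        "title": "Undated wire item",
        "source": "Example Wire",
        "publishedDate": None,
        "url": "https://example.com/wire",
        "snippet": None,
    },
])
print(context)
```

The numbered markers give the model something stable to cite, and the URL on each line lets a reader verify the claim.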

🔗 More AI-intelligence & research scrapers by logiover

Building an agentic or competitive-intelligence pipeline? Pair this Actor with the rest of the AI-research suite:

Actor	What it does
Google News Scraper	Single-source Google News, deep
Discussion Intelligence Scraper	Reddit + HN + Product Hunt + Stack Exchange opinion
Company Deep Research Scraper	Full company dossier: tech stack, socials, contacts
AI Deep Research	Autonomous multi-source research agent
AI Web Search	Structured web search results for agents
AI Web Extract	Extract structured data from any page
AI Citation Source Finder	Find citable sources for AI-generated claims
B2B AI Visibility Tracker	Track how brands surface in AI answers
CVE Security Advisory Monitor	Fresh CVEs & security advisories
arXiv Paper Scraper	Research paper metadata, abstracts & authors
Semantic Scholar Research Scraper	Peer-reviewed papers, citations & influence
GitHub Activity Stream	Real-time commits, releases & events

👉 Browse all logiover scrapers on Apify Store — 180+ actors across real estate, jobs, crypto, social media & B2B data.

⏰ Scheduling & integration

News changes hourly. Schedule this Actor on Apify to run every few hours over your watchlist and diff datasets to detect new items. Export results to JSON, CSV or Excel, sync to Google Sheets, or push to your database, BI tools and webhooks through the Apify API. Connect it to Make, n8n or Zapier to build automated monitoring and alerting pipelines — or wrap it in an MCP server so AI agents can pull a fresh, deduplicated news feed straight into their context.
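
Diffing two scheduled runs can be as simple as comparing URLs. The sketch below assumes the `url` field is a stable identity for an article across runs, which may not hold if a feed rewrites its links; the function name `new_items` is hypothetical.

```python
from typing import List


def new_items(previous: List[dict], current: List[dict]) -> List[dict]:
    """Return items from the current run whose URL was not in the previous run."""
    seen = {item["url"] for item in previous}
    fresh = []
    for item in current:
        if item["url"] not in seen:
            fresh.append(item)
            seen.add(item["url"])  # also guards against repeats within one run
    return fresh


last_run = [{"url": "https://example.com/1"}, {"url": "https://example.com/2"}]
this_run = [
    {"url": "https://example.com/2"},
    {"url": "https://example.com/3"},
    {"url": "https://example.com/3"},
]
print(new_items(last_run, this_run))
```

Only the genuinely new article survives, which is the list to forward to a webhook or alerting step.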

⭐ Support & feedback

Found a bug or need an extra field? Open an issue on the Issues tab — response is usually fast. If this Actor saves you time, a ★★★★★ review on the Store page genuinely helps and is hugely appreciated. 🙏

⚖️ Legal

This Actor reads publicly available RSS feeds only. It does not authenticate, bypass access controls, or scrape behind paywalls. News content is owned by the respective publishers; respect their Terms of Service. Use for monitoring, research and AI-agent grounding on data that is already public, and comply with any applicable local laws.

📝 Changelog

2026-07-06

✨ README overhaul: keyword-rich hero + badges, full 15-field output reference with a realistic sample, ready-to-run example scenarios, high-intent FAQ (legality, API alternative, no-key, CSV/JSON export, data volume), and cross-links to the wider AI-research scraper suite.

2026-07-02 — v1.0

Initial release.
3 modes: topic, company, bulk.
3 sources: Google News, Bing News, DuckDuckGo News (any subset).
Two-stage dedup (exact key + fuzzy title Jaccard).
Lexicon sentiment (-1..+1, positive/negative/neutral).
Time filtering (daysBack), localization (hl/gl).
Apify datacenter proxy default.
Pay-per-result (result event per saved item).
