Pricing

Pay per event

Google News Scraper & RSS URL Extractor

Google News scraper that queries the public RSS feed for fresh news headlines and article URLs. It supports language and country scoping, deduplication across queries, and deterministic article URL extraction for downstream NLP.

Developer

naoki anzai

Last modified: 2 months ago

📰 Google News Scraper

Build content discovery pipelines by extracting fresh article URLs and metadata from Google News. This actor queries the Google News RSS feed to find the latest articles matching your target keywords. It is designed for data engineers and developers who need to feed news links into downstream processing tools, such as an Article Content Extractor or LLM ingestion scripts.

Instead of scraping entire news sites blindly, use this tool to discover relevant, localized content. You can restrict results to a specific language and country, which suits global topic monitoring or localized sentiment analysis. With `deduplicate` enabled (the default), overlapping queries contribute each article URL to the final dataset only once.

Every run delivers structured records containing the article URL, headline, publisher name, short snippet, and publication timestamp (the original RSS date plus an ISO version). That makes it straightforward to schedule weekly topic digests, track industry trends over time, or gather a continuous stream of training data for AI models. No paid news API is required.
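
To make the query-plus-locale idea concrete, here is a minimal sketch of how a Google News RSS search URL might be assembled from a query, a language code, and a country code. The function name `build_feed_url` is hypothetical, and the `hl` / `gl` / `ceid` parameter layout is an assumption based on the publicly visible feed URLs, not a description of this actor's internals.

```python
from urllib.parse import urlencode


def build_feed_url(query: str, language: str = "en", country: str = "US") -> str:
    """Build a Google News RSS search URL for one query (assumed URL layout)."""
    params = {
        "q": query,                       # search query, URL-encoded below
        "hl": language,                   # interface language (assumption)
        "gl": country,                    # country edition (assumption)
        "ceid": f"{country}:{language}",  # edition id such as US:en (assumption)
    }
    return "https://news.google.com/rss/search?" + urlencode(params)


print(build_feed_url("OpenAI"))
# https://news.google.com/rss/search?q=OpenAI&hl=en&gl=US&ceid=US%3Aen
```

Changing `language` and `country` (for example `"ja"` and `"JP"`) is what scopes the feed to a local edition in this sketch.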

Store Quickstart

Start with Quickstart (company news) for a reliable first run.
Use Brand Monitoring to track multiple companies or themes.
Use Google News → Article Cleanup when your next step is article extraction.

Where this actor fits

Surface	Best for
Google News Scraper	Discover current article URLs by query
Article Content Extractor	Clean the discovered article/news/blog pages
Website Content Extractor	Clean discovered non-article pages
RSS Feed Aggregator	Discover fresh URLs from known publishers and blogs

Key Features

🔎 Query-based discovery — Pull article URLs from Google News RSS without a paid API
🌍 Localized results — Tune by language and country
🔄 Deduplication — Remove duplicate URLs across multiple queries
📰 Publisher context — Keep headline, source, description, and publish date
⚡ Fast feeder step — Lightweight discovery before deeper extraction

Use Cases

Who	Why
PR teams	Find the latest media mentions to hand off for cleanup
Competitive intelligence	Build newsroom watchlists from search queries
Content ops	Discover trending stories before enrichment
AI / RAG teams	Create a steady article URL feed for downstream extraction

Input

Field	Type	Default	Description
`queries`	`string[]`	required	Search queries (max 50)
`language`	`string`	`en`	Google News language code
`country`	`string`	`US`	Google News country code
`maxItems`	`integer`	`25`	Max articles per query
`deduplicate`	`boolean`	`true`	Remove duplicate links across queries
`timeoutMs`	`integer`	`15000`	Request timeout
`delivery`	`string`	`dataset`	`dataset` or `webhook`
`webhookUrl`	`string`	—	Webhook target when `delivery=webhook`
`dryRun`	`boolean`	`false`	Run without saving
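
The table above can be read as a small schema. The sketch below applies the documented defaults and checks the documented constraints before a run is started. `normalize_input` is a hypothetical helper for client-side checking; the actor's own validation may differ.

```python
# Defaults copied from the Input table above.
DEFAULTS = {
    "language": "en",
    "country": "US",
    "maxItems": 25,
    "deduplicate": True,
    "timeoutMs": 15000,
    "delivery": "dataset",
    "dryRun": False,
}


def normalize_input(raw: dict) -> dict:
    """Fill in defaults and reject input that contradicts the table."""
    queries = raw.get("queries")
    if not isinstance(queries, list) or not queries:
        raise ValueError("`queries` is required and must be a non-empty list")
    if len(queries) > 50:
        raise ValueError("`queries` accepts at most 50 entries")
    if not all(isinstance(q, str) and q.strip() for q in queries):
        raise ValueError("every query must be a non-empty string")

    merged = {**DEFAULTS, **raw}
    if merged["delivery"] not in ("dataset", "webhook"):
        raise ValueError("`delivery` must be `dataset` or `webhook`")
    if merged["delivery"] == "webhook" and not merged.get("webhookUrl"):
        raise ValueError("`webhookUrl` is needed when delivery=webhook")
    return merged


print(normalize_input({"queries": ["OpenAI", "Google AI"], "maxItems": 20}))
```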

Input Example

{
  "queries": ["OpenAI", "Google AI"],
  "language": "en",
  "country": "US",
  "maxItems": 20,
  "deduplicate": true
}

Input Examples

Example: Daily tech news in English

{
  "queries": [
    "AI safety",
    "open source"
  ],
  "language": "en",
  "country": "US",
  "maxItems": 30
}

Example: Localized news (JP)

{
  "queries": [
    "人工知能"
  ],
  "language": "ja",
  "country": "JP",
  "maxItems": 50
}

Example: Multi-keyword dedupe run

{
  "queries": [
    "climate",
    "renewable energy"
  ],
  "language": "en",
  "country": "US",
  "maxItems": 40,
  "deduplicate": true
}
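
As an illustration of what deduplication across queries means for a run like this, the sketch below keeps the first record seen for each `link` and drops later repeats. It is an assumed approach (exact string match on `link`), not the actor's confirmed logic; `dedupe_by_link` and the sample rows are hypothetical.

```python
def dedupe_by_link(records: list[dict]) -> list[dict]:
    """Keep the first record for each link, preserving input order."""
    seen: set[str] = set()
    unique: list[dict] = []
    for record in records:
        link = record["link"]
        if link in seen:
            continue  # the same article already surfaced for an earlier query
        seen.add(link)
        unique.append(record)
    return unique


rows = [
    {"link": "https://example.com/a", "query": "climate"},
    {"link": "https://example.com/b", "query": "climate"},
    {"link": "https://example.com/a", "query": "renewable energy"},
]
print(dedupe_by_link(rows))  # two rows remain; the repeat of /a is dropped
```

One consequence of keeping the first occurrence is that the `query` field on a surviving row names only the first query that surfaced it.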

Output

Field	Type	Description
`title`	string	Article headline
`link`	string	Direct article URL for downstream cleanup
`source`	string	Publisher name
`pubDate`	string	Original RSS publish date
`pubDateISO`	string	ISO timestamp version of `pubDate`
`description`	string	Short Google News snippet
`query`	string	Search query that surfaced the row

Output Example

{
  "title": "Codex for (almost) everything",
  "link": "https://openai.com/index/codex-for-almost-everything",
  "source": "OpenAI",
  "pubDate": "Thu, 16 Apr 2026 10:00:00 GMT",
  "pubDateISO": "2026-04-16T10:00:00.000Z",
  "description": "The updated Codex app for macOS and Windows adds computer use...",
  "query": "OpenAI"
}
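
The relationship between `pubDate` and `pubDateISO` in the example above can be reproduced with the standard library. This is a sketch of one way to do the conversion; `to_iso_utc` is a hypothetical helper, and treating a date without a timezone as UTC is an assumption.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def to_iso_utc(pub_date: str) -> str:
    """Convert an RFC 2822 style RSS date to an ISO 8601 UTC string with milliseconds."""
    parsed = parsedate_to_datetime(pub_date)
    if parsed.tzinfo is None:
        # Assumption: a date without timezone information is treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    utc = parsed.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


print(to_iso_utc("Thu, 16 Apr 2026 10:00:00 GMT"))  # 2026-04-16T10:00:00.000Z
```

Sorting or filtering on the ISO string is usually simpler than on the original RSS date, because ISO timestamps in the same timezone sort chronologically as plain text.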

First-run buyer experience

Run Quickstart (company news).
Confirm the dataset shows real article URLs, not generic homepages.
Pick the top URLs and send them to Article Content Extractor.
If a discovered URL is actually a docs/product/policy page, clean it with Website Content Extractor instead.

Tips & Limitations

Use broad queries for the first run; refine later.
RSS is a discovery layer only — it does not return full article bodies.
Combine multiple narrower queries instead of one overloaded boolean query when relevance matters.

FAQ

Can I get full article text here?

No. This actor discovers URLs and returns metadata only. Use Article Content Extractor for full content.

Why use this instead of scraping the Google News UI?

The RSS surface is lighter, more stable, and better suited for recurring discovery runs.

Can I schedule recurring news monitoring?

Yes — run it on a schedule, then pass the discovered URLs into an article-cleanup step.

Content Intelligence Pack handoffs:

📰 Article Content Extractor — clean newsroom and blog article pages
📄 Website Content Extractor — clean discovered non-article pages
📡 RSS Feed Aggregator — discover fresh URLs from known publisher feeds

Cost

Pay Per Event:

actor-start: $0.01
dataset-item: $0.003 per output item
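
For budgeting, the two prices above combine into a simple upper bound. The sketch assumes `actor-start` is charged once per run and `dataset-item` once per output row, with every query returning its full `maxItems`; deduplication or sparse results would lower the real figure. `estimate_max_cost` is a hypothetical helper.

```python
ACTOR_START_USD = 0.01    # price per run start, from the list above
DATASET_ITEM_USD = 0.003  # price per output item, from the list above


def estimate_max_cost(num_queries: int, max_items: int, runs: int = 1) -> float:
    """Upper-bound cost in USD, assuming every query returns max_items rows."""
    items_per_run = num_queries * max_items
    return runs * (ACTOR_START_USD + items_per_run * DATASET_ITEM_USD)


# Two queries at 20 items each: 0.01 + 40 * 0.003
print(f"{estimate_max_cost(2, 20):.2f}")  # 0.13
```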

⭐ Was this helpful?

If this actor saved you time, please leave a ★ rating on Apify Store.
