Google News Scraper

Pricing

$4.99/month + usage

Scrape news articles from news.google.com with deep article content extraction

Developer

codingfrontend

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 14 hours ago
Google News Scraper 📰🔍

Scrape news articles directly from news.google.com with deep content extraction! 🚀

Features ✨

  • 📰 Scrape news articles from Google News (news.google.com)
  • 🔍 Search by any query term
  • 📄 Deep Scrape: visits each article page for full content (author, text, images, keywords, etc.)
  • 🌐 Support for multiple languages and countries
  • 📊 Rich metadata extraction (OG tags, Twitter cards, reading time, word count)
  • 💾 Results saved directly to your Apify dataset

Use Cases 💡

  • 📰 News monitoring and media tracking
  • 📊 Sentiment analysis and trend detection
  • 🏢 Brand monitoring and PR research
  • 📈 Content research and competitive intelligence
  • 🔍 Journalism research and fact-checking

How It Works 🛠️

  1. Navigates to news.google.com/search?q=your+query
  2. Scrolls to load articles (infinite scroll)
  3. Extracts article links, titles, sources, and dates
  4. Optionally visits each article page for deep content extraction
  5. Saves enriched results to your Apify dataset
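
Step 1 can be sketched in a few lines of Python. This is only an illustration of how the search URL might be assembled: the function name `build_search_url` is hypothetical, and forwarding `hl` and `gl` as URL query parameters of the same name is an assumption, since only the `q` parameter is shown above.

```python
from urllib.parse import urlencode

BASE_URL = "https://news.google.com/search"


def build_search_url(query: str, hl: str = "en-US", gl: str = "US") -> str:
    """Build a Google News search URL for a query.

    Assumption: hl and gl are passed through as URL parameters with the
    same names. The scraper may build its URL differently.
    """
    params = {"q": query, "hl": hl, "gl": gl}
    # urlencode escapes spaces as "+", matching the your+query form
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("Elon Musk"))
# https://news.google.com/search?q=Elon+Musk&hl=en-US&gl=US
```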

Input Parameters 📝

| Parameter | Type | Description | Default |
|---|---|---|---|
| query | string | Search term for Google News | "Elon Musk" |
| maxItems | integer | Maximum number of articles to collect (10-5000) | 100 |
| hl | string | Language code (e.g., en-US, en-IN, fr-FR) | "en-US" |
| gl | string | Country code (e.g., US, IN, GB) | "US" |
| deepScrape | boolean | Visit each article page for full content | true |
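
Before starting a run, it can help to check an input locally against the documented defaults and the documented `maxItems` range. The sketch below is a client-side helper only. `normalize_input` is a hypothetical name, and whether the actor itself rejects or clamps out-of-range values is not stated here, so raising an error is an assumption.

```python
# Defaults copied from the parameter table above
DEFAULTS = {
    "query": "Elon Musk",
    "maxItems": 100,
    "hl": "en-US",
    "gl": "US",
    "deepScrape": True,
}


def normalize_input(raw: dict) -> dict:
    """Fill in documented defaults and check the documented maxItems range."""
    merged = {**DEFAULTS, **raw}
    max_items = merged["maxItems"]
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise ValueError("maxItems must be an integer")
    if not 10 <= max_items <= 5000:
        raise ValueError("maxItems must be between 10 and 5000")
    if not isinstance(merged["query"], str) or not merged["query"].strip():
        raise ValueError("query must be a non-empty string")
    return merged


print(normalize_input({"query": "OpenAI", "maxItems": 50}))
```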

Output 📊

Overview Fields

| Field | Description |
|---|---|
| position | Article position (1, 2, 3...) |
| title | Article headline |
| link | Full article URL |
| source | Publisher name |
| date | Relative date (e.g., "2 hours ago") |
| author | Article author |
| publisher | Publisher name (from meta tags) |
| publishedDate | ISO date from article page |
| description | Article description |
| snippet | Short text excerpt |
| wordCount | Number of words in article |
| readingTime | Estimated reading time (minutes) |
| deepScrapeStatus | "success" or "failed" |
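
How `wordCount` and `readingTime` are computed is not documented. As a rough illustration, the sketch below counts whitespace-separated words and assumes a reading speed of 200 words per minute, rounded to the nearest minute. Both choices are assumptions, and the function names are hypothetical.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def reading_time_minutes(words: int, words_per_minute: int = 200) -> int:
    """Estimate reading time in whole minutes, never less than one.

    Assumption: 200 words per minute, rounded to the nearest minute.
    """
    return max(1, round(words / words_per_minute))


print(reading_time_minutes(850))  # 4
```

Under these assumptions an 850-word article is estimated at 4 minutes.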

Metadata Fields (Deep Scrape)

| Field | Description |
|---|---|
| domain | Article domain |
| canonicalUrl | Canonical URL |
| ogType | Open Graph type |
| twitterCard | Twitter card type |
| twitterSite | Twitter account |
| section | Article section/category |
| language | Article language |
| keywords | Article keywords |
| tags | Article tags |
| videoUrl | Video URL if present |
| copyright | Copyright info |
| mainImage | Main article image URL |
| articleText | Full article text |
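
Several of these fields correspond to standard page markup such as Open Graph and Twitter card meta tags. The sketch below shows one way such values could be read from an article's HTML using only the Python standard library. It is not the actor's implementation: the tag-to-field mapping in `META_FIELDS` and the names `MetadataParser` and `extract_metadata` are assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical mapping from page markup to the output field names above
META_FIELDS = {
    "og:type": "ogType",
    "og:image": "mainImage",
    "twitter:card": "twitterCard",
    "twitter:site": "twitterSite",
    "article:section": "section",
}


class MetadataParser(HTMLParser):
    """Collect a few metadata fields from <html>, <meta>, and <link> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta":
            # Open Graph tags use property=, Twitter card tags use name=
            key = attr.get("property") or attr.get("name")
            if key in META_FIELDS and attr.get("content"):
                # setdefault keeps the first occurrence of each field
                self.fields.setdefault(META_FIELDS[key], attr["content"])
        elif tag == "link" and attr.get("rel") == "canonical" and attr.get("href"):
            self.fields.setdefault("canonicalUrl", attr["href"])
        elif tag == "html" and attr.get("lang"):
            self.fields.setdefault("language", attr["lang"])


def extract_metadata(html: str) -> dict:
    parser = MetadataParser()
    parser.feed(html)
    return parser.fields


sample = """<html lang="en"><head>
<meta property="og:type" content="article">
<meta name="twitter:card" content="summary_large_image">
<link rel="canonical" href="https://example.com/story">
</head></html>"""
print(extract_metadata(sample))
```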

Input Example

{
  "query": "Elon Musk",
  "maxItems": 100,
  "hl": "en-US",
  "gl": "US",
  "deepScrape": true
}
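
An input like this could be submitted through the Apify REST API. The sketch below only builds the HTTP request for the synchronous "run and get dataset items" endpoint. It does not send anything. The actor ID `codingfrontend~google-news-scraper` is a placeholder guessed from the developer and actor names, so check the actor's API tab for the real ID. `build_run_request` is a hypothetical helper name.

```python
import json
from urllib.request import Request

# Placeholder: replace with the actor ID shown on its Apify Store page
ACTOR_ID = "codingfrontend~google-news-scraper"


def build_run_request(token: str, run_input: dict, actor_id: str = ACTOR_ID) -> Request:
    """Build a POST request that runs the actor and returns its dataset items."""
    url = f"https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"
    return Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


run_input = {
    "query": "Elon Musk",
    "maxItems": 100,
    "hl": "en-US",
    "gl": "US",
    "deepScrape": True,
}
request = build_run_request("<YOUR_APIFY_TOKEN>", run_input)
print(request.get_method(), request.full_url)

# To actually start the run (requires a valid token and network access):
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       items = json.load(response)
```

Synchronous endpoints are subject to a time limit, so a large `maxItems` with `deepScrape` enabled may be better served by starting the run asynchronously and reading the dataset afterwards.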

Output Sample

{
  "position": 1,
  "title": "Elon Musk's Bold Plan for AI in Orbit",
  "link": "https://www.cnn.com/...",
  "domain": "www.cnn.com",
  "source": "CNN",
  "date": "2 hours ago",
  "author": "Jane Doe",
  "publisher": "CNN",
  "publishedDate": "2025-10-25T15:53:21.000Z",
  "description": "Elon Musk unveiled plans for orbiting AI data centers...",
  "snippet": "Elon Musk unveiled plans for orbiting AI data centers...",
  "articleText": "Full article text extracted from the page...",
  "wordCount": 850,
  "readingTime": 4,
  "mainImage": "https://cdn.cnn.com/image.jpg",
  "canonicalUrl": "https://www.cnn.com/2025/10/25/tech/elon-musk-ai",
  "ogType": "article",
  "twitterCard": "summary_large_image",
  "language": "en",
  "keywords": ["Elon Musk", "AI", "SpaceX"],
  "deepScrapeStatus": "success",
  "scrapedAt": "2025-10-25T16:30:00.000Z"
}
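
When post-processing records like this one, the relative `date` field usually needs converting to an absolute timestamp. The sketch below prefers `publishedDate` when it is present and otherwise subtracts the relative offset from `scrapedAt`. It is a hypothetical helper, not part of the actor, and it only understands the "N minutes/hours/days ago" form. Any other date text returns `None`.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

RELATIVE = re.compile(r"(\d+)\s+(minute|hour|day)s?\s+ago", re.IGNORECASE)


def parse_iso_utc(value: str) -> datetime:
    """Parse an ISO timestamp such as 2025-10-25T16:30:00.000Z as UTC."""
    # Older Python versions reject a trailing "Z" in fromisoformat
    return datetime.fromisoformat(value.replace("Z", "+00:00")).astimezone(timezone.utc)


def approximate_published(item: dict) -> Optional[datetime]:
    """Return publishedDate if present, else estimate it from date and scrapedAt."""
    if item.get("publishedDate"):
        return parse_iso_utc(item["publishedDate"])
    match = RELATIVE.fullmatch((item.get("date") or "").strip())
    if not match or not item.get("scrapedAt"):
        return None
    amount, unit = int(match.group(1)), match.group(2).lower()
    # timedelta accepts minutes=, hours=, and days= keyword arguments
    return parse_iso_utc(item["scrapedAt"]) - timedelta(**{unit + "s": amount})


item = {"date": "2 hours ago", "scrapedAt": "2025-10-25T16:30:00.000Z"}
print(approximate_published(item))  # 2025-10-25 14:30:00+00:00
```

The estimate is only as precise as the relative label, so `publishedDate` is the better source whenever the deep scrape succeeds.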

Proxy Configuration 🌐

For best results, use the RESIDENTIAL proxy group:

{
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}