Google News Scraper - Articles, Sources, Real URLs, No API Key

Pricing

Pay per event

Developer

Renzo Madueno

Maintained by Community

Scrape Google News into clean structured data - with the real publisher URL, not the Google redirect

This Google News scraper turns any keyword search, topic section, top-stories feed, or location feed into structured, export-ready data. For every article you get the headline, the real resolved publisher URL (not the useless news.google.com/rss/articles/... redirect link), the source/publisher name, the publish date in ISO 8601, a clean text snippet, and the image when available.

No Google News API key. No login. No browser automation. Just fast, cheap HTTP requests against the public Google News RSS feeds, with the redirect links decoded into the actual article URLs you can click, crawl, or store.

If you have used other Google News tools and ended up with thousands of news.google.com redirect links that you could not open programmatically, this actor solves exactly that problem. Resolving the real article URL is the single most important feature when you actually want to use the data downstream.

Why this scraper is different

Most Google News scrapers hand back the Google redirect link (https://news.google.com/rss/articles/CBMi...). That link is an encoded token bound to Google that can change between sessions, so it cannot be reliably opened programmatically, crawled, or de-duplicated. This actor decodes the token through Google's own resolution endpoint and returns the genuine publisher URL, for example https://www.nytimes.com/2026/06/10/business/economy/back-office-workers-ai.html. Each resolved URL is paired back to its own article, so titles, sources, and URLs always line up.

You also get four input modes in one actor - keyword search, top stories, topic sections, and local/geo headlines - plus full control over language and country.

What data can you extract?

| Field | Type | Description |
| --- | --- | --- |
| title | string | Article headline as shown on Google News |
| url | string | Real publisher article URL (resolved from the Google redirect) |
| urlResolved | boolean | true if the real URL was decoded, false if only the redirect is available |
| googleNewsUrl | string | The original Google News redirect link (kept for reference) |
| source | string | Publisher / source name (e.g., "The New York Times") |
| sourceUrl | string | Publisher homepage URL |
| pubDate | string | Publish date in ISO 8601 (e.g., 2026-06-10T09:02:42.000Z) |
| description | string | Clean text snippet / summary of the article |
| image | string | Article image URL when available |
| language | string | Language code used for the request (e.g., en-US) |
| country | string | Country code used for the request (e.g., US) |
| query | string | The search query (search mode) |
| topic | string | The topic section (topic mode) |
| mode | string | Which mode produced the record |
| scrapedAt | string | ISO 8601 timestamp of when the record was collected |

Use cases

  • Media monitoring & PR - Track every time a brand, person, product, or competitor is mentioned in the news, with direct links to the original coverage.
  • Brand & reputation tracking - Watch keyword feeds and store the real article URLs for clipping reports and sentiment analysis.
  • Market & competitive intelligence - Follow an industry topic (BUSINESS, TECHNOLOGY) and build a dated dataset of headlines and sources.
  • News aggregation apps - Power a newsfeed product with clean titles, snippets, images, and clickable publisher links.
  • SEO & content research - See which publishers rank in Google News for a query and what angles they cover.
  • AI / LLM pipelines & RAG - Feed an LLM real, fetchable article URLs (not dead redirects) so it can read and summarize the source.
  • Academic & data science - Collect structured, time-stamped news data for research on coverage, framing, and media trends.

Input parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| mode | string | Yes | search | search, topStories, topic, or location |
| query | string | When mode=search | artificial intelligence | Keyword(s). Supports Google News operators like "openai" OR "anthropic" |
| topic | string | When mode=topic | BUSINESS | One of TOP, WORLD, NATION, BUSINESS, TECHNOLOGY, ENTERTAINMENT, SPORTS, SCIENCE, HEALTH |
| location | string | When mode=location | - | City / place name for local headlines (e.g., "New York") |
| language | string | No | en-US | Interface language code (e.g., en-US, es-419, fr, de) |
| country | string | No | US | Two-letter country code (e.g., US, GB, ES, IN) |
| maxItems | integer | No | 50 | Maximum number of articles to return (1-100) |
| resolveUrls | boolean | No | true | Decode Google redirect links into the real publisher URL |
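
The defaults and constraints in the table above can be checked on the client side before starting a run. The following is a minimal sketch, not the actor's own validation code: `normalize_input` is a hypothetical helper, and the only rules it applies are the ones listed in the table.

```python
from typing import Any

MODES = {"search", "topStories", "topic", "location"}
TOPICS = {"TOP", "WORLD", "NATION", "BUSINESS", "TECHNOLOGY",
          "ENTERTAINMENT", "SPORTS", "SCIENCE", "HEALTH"}


def normalize_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: fill documented defaults and check documented limits."""
    cfg: dict[str, Any] = {
        "mode": raw.get("mode", "search"),
        "language": raw.get("language", "en-US"),
        "country": raw.get("country", "US"),
        "maxItems": raw.get("maxItems", 50),
        "resolveUrls": raw.get("resolveUrls", True),
    }
    if cfg["mode"] not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")

    max_items = cfg["maxItems"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int):
        raise ValueError("maxItems must be an integer")
    if not 1 <= max_items <= 100:
        raise ValueError("maxItems must be between 1 and 100")

    # Only the field that belongs to the chosen mode is carried over.
    if cfg["mode"] == "search":
        cfg["query"] = raw.get("query", "artificial intelligence")
    elif cfg["mode"] == "topic":
        topic = raw.get("topic", "BUSINESS")
        if topic not in TOPICS:
            raise ValueError(f"topic must be one of {sorted(TOPICS)}")
        cfg["topic"] = topic
    elif cfg["mode"] == "location":
        location = raw.get("location")
        if not location:
            raise ValueError("location is required when mode=location")
        cfg["location"] = location
    return cfg


run_input = normalize_input({"mode": "topic", "topic": "TECHNOLOGY", "maxItems": 25})
```

Here `run_input` carries the chosen topic plus the default language, country, and resolveUrls values, ready to be passed to the actor as its JSON input.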

Example input

{
  "mode": "search",
  "query": "artificial intelligence",
  "language": "en-US",
  "country": "US",
  "maxItems": 25,
  "resolveUrls": true
}

Example output

{
  "title": "The Hidden Workers Most Threatened by A.I. - The New York Times",
  "url": "https://www.nytimes.com/2026/06/10/business/economy/back-office-workers-ai.html",
  "urlResolved": true,
  "googleNewsUrl": "https://news.google.com/rss/articles/CBMihgFBVV95cUx...?oc=5",
  "source": "The New York Times",
  "sourceUrl": "https://www.nytimes.com",
  "pubDate": "2026-06-10T09:02:42.000Z",
  "description": "The Hidden Workers Most Threatened by A.I.",
  "image": null,
  "language": "en-US",
  "country": "US",
  "query": "artificial intelligence",
  "mode": "search",
  "scrapedAt": "2026-06-12T10:31:11.000Z"
}
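
A record like the one above usually needs a little post-processing before analysis. The sketch below assumes Python on the consuming side and shows two common steps: parsing the ISO 8601 timestamps, and stripping the " - Source" suffix that appears on the title in this example. Both helper names are hypothetical and are not part of the actor.

```python
from datetime import datetime


def parse_iso_utc(value: str) -> datetime:
    """Parse an ISO 8601 string; 'Z' is rewritten so older Pythons accept it."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def headline_without_source(title: str, source: str) -> str:
    """Drop a trailing ' - <source>' from the title when it is present."""
    suffix = f" - {source}"
    if source and title.endswith(suffix):
        return title[: -len(suffix)]
    return title


record = {
    "title": "The Hidden Workers Most Threatened by A.I. - The New York Times",
    "source": "The New York Times",
    "pubDate": "2026-06-10T09:02:42.000Z",
    "scrapedAt": "2026-06-12T10:31:11.000Z",
}

published = parse_iso_utc(record["pubDate"])
scraped = parse_iso_utc(record["scrapedAt"])
age = scraped - published  # how old the article was when it was collected
headline = headline_without_source(record["title"], record["source"])
```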

How it works

The actor fetches the public Google News RSS feed for your chosen mode, parses each item (title, link, source, date, description, image), de-duplicates the results, and then resolves each Google redirect link into the real publisher URL by calling Google's own article-resolution endpoint in efficient batches. The whole pipeline is HTTP-only - no headless browser - which keeps runs fast and inexpensive. If resolution ever fails for an item, you still keep the headline, source, date, snippet, and the original Google link, and urlResolved is set to false.
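
The first three stages of that pipeline (build the feed URL, parse the items, de-duplicate) can be sketched with the Python standard library. This is an illustration under stated assumptions, not the actor's source code: the feed paths and the hl/gl/ceid query parameters are assumptions about the public RSS layout, and de-duplicating on the Google link is an assumed key. URL resolution is left out because the source does not document the resolution endpoint.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from urllib.parse import quote, urlencode

# Assumed base URL; the source only says "public Google News RSS feeds".
BASE = "https://news.google.com/rss"


def build_feed_url(mode, language="en-US", country="US",
                   query="", topic="", location=""):
    """Map an input mode to an assumed Google News RSS URL."""
    params = {}
    if mode == "search":
        path = "/search"
        params["q"] = query
    elif mode == "topic":
        path = f"/headlines/section/topic/{quote(topic)}"
    elif mode == "location":
        path = f"/headlines/section/geo/{quote(location)}"
    elif mode == "topStories":
        path = ""
    else:
        raise ValueError(f"unknown mode: {mode}")
    # hl, gl and ceid are assumed parameters; this ceid derivation is a guess.
    params.update(hl=language, gl=country,
                  ceid=f"{country}:{language.split('-')[0]}")
    return f"{BASE}{path}?{urlencode(params)}"


def parse_items(xml_text):
    """Extract title, link, source and date from each RSS <item>."""
    items = []
    for item in ET.fromstring(xml_text).iter("item"):
        source = item.find("source")
        pub = item.findtext("pubDate")
        items.append({
            "title": item.findtext("title", default=""),
            "googleNewsUrl": item.findtext("link", default=""),
            "source": source.text if source is not None else None,
            "sourceUrl": source.get("url") if source is not None else None,
            # RSS dates are RFC 822 style; convert them to ISO 8601.
            "pubDate": parsedate_to_datetime(pub).isoformat() if pub else None,
        })
    return items


def dedupe(items):
    """Keep the first item per Google link (assumed de-duplication key)."""
    seen, unique = set(), []
    for item in items:
        if item["googleNewsUrl"] not in seen:
            seen.add(item["googleNewsUrl"])
            unique.append(item)
    return unique


# Made-up feed content, used instead of a live request.
ITEM_A = """<item><title>Story A - Example Times</title>
<link>https://news.google.com/rss/articles/TOKEN_A</link>
<pubDate>Wed, 10 Jun 2026 09:02:42 GMT</pubDate>
<source url="https://example.com">Example Times</source></item>"""
ITEM_B = ITEM_A.replace("Story A", "Story B").replace("TOKEN_A", "TOKEN_B")
SAMPLE_RSS = f'<rss version="2.0"><channel>{ITEM_A}{ITEM_A}{ITEM_B}</channel></rss>'

feed_url = build_feed_url("search", query="artificial intelligence")
articles = dedupe(parse_items(SAMPLE_RSS))
```

The sample feed is parsed from a string so the sketch runs offline. In a real run, the XML would come from an HTTP GET on `feed_url`.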

Pricing

This actor uses pay-per-event pricing: a small charge per run start plus a small charge per article returned. You only pay for the articles you actually get. There are no monthly platform fees beyond your Apify plan.

Frequently asked questions

Do I need a Google News API key?

No. Google News does not offer a public articles API. This actor uses the public RSS feeds plus URL resolution, so there is nothing to sign up for and no key to manage.

Does it return the real article URL or the Google redirect?

The real publisher URL. With resolveUrls enabled (default), every Google News redirect link is decoded into the actual article URL you can open, crawl, or store. The original redirect is also kept in googleNewsUrl.

Can I scrape Google News in other languages or countries?

Yes. Set language (e.g., es-419, fr, de) and country (e.g., ES, GB, IN) to get localized results, just like changing your Google News region.

What modes are supported?

Keyword search, top stories, topic sections (Business, Technology, Sports, Science, Health, and more), and local/geo headlines for a city or place.

How many articles can I get per run?

Up to 100 articles per run via maxItems. Run the actor multiple times with different queries, topics, or regions to build a larger dataset.
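
When several runs are combined, the same article can appear more than once, for example under two overlapping queries. Below is a minimal sketch of merging run outputs, assuming each run has already been loaded as a list of records. `merge_runs` is a hypothetical helper, and keying on url with a fallback to googleNewsUrl is an assumption about what makes a record unique.

```python
def merge_runs(*runs):
    """Combine records from several runs, keeping the first copy of each article."""
    merged = {}
    for run in runs:
        for record in run:
            # Prefer the resolved publisher URL; fall back to the Google link.
            key = record.get("url") or record.get("googleNewsUrl")
            if key and key not in merged:
                merged[key] = record
    return list(merged.values())


run_ai = [
    {"url": "https://example.com/a", "query": "artificial intelligence"},
    {"url": "https://example.com/b", "query": "artificial intelligence"},
]
run_openai = [
    {"url": "https://example.com/b", "query": "openai"},
    {"url": None, "googleNewsUrl": "https://news.google.com/rss/articles/TOKEN_C",
     "query": "openai"},
]

dataset = merge_runs(run_ai, run_openai)
```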

Is it legal to scrape Google News?

This actor only reads publicly available Google News RSS feeds. You are responsible for complying with Google's terms and applicable laws, and for how you use the collected data. Review the feed's usage notes and consult your legal counsel for commercial use.

What if an article URL cannot be resolved?

You still get the headline, source, publish date, snippet, and the original Google link. The urlResolved field tells you which records have the real URL so you can filter if needed.
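
Filtering on urlResolved can be as small as the sketch below, which splits a list of records into those with a real publisher URL and those that only carry the Google link. `split_by_resolution` is a hypothetical helper name.

```python
def split_by_resolution(records):
    """Return (resolved, unresolved) lists based on the urlResolved flag."""
    resolved = [r for r in records if r.get("urlResolved")]
    unresolved = [r for r in records if not r.get("urlResolved")]
    return resolved, unresolved


records = [
    {"title": "Story A", "urlResolved": True, "url": "https://example.com/a"},
    {"title": "Story B", "urlResolved": False,
     "googleNewsUrl": "https://news.google.com/rss/articles/TOKEN_B"},
]

resolved, unresolved = split_by_resolution(records)
crawlable_urls = [r["url"] for r in resolved]
```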

Can I use the output with other tools?

Yes. Export to JSON, CSV, or Excel from the Apify dataset, or pull it via the Apify API. The resolved URLs make the data immediately usable in crawlers, dashboards, spreadsheets, and LLM pipelines.
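
If the records are pulled as JSON and a spreadsheet-friendly file is needed locally, the conversion can be done with the Python standard library. This is a sketch only: the column order is an arbitrary choice, `to_csv` is a hypothetical helper, and the Apify dataset export already covers CSV directly.

```python
import csv
import io

# Column order is an arbitrary choice for this sketch.
FIELDS = ["title", "url", "urlResolved", "source", "pubDate", "description"]


def to_csv(records):
    """Serialize records to CSV text, ignoring fields not listed in FIELDS."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)  # missing keys become empty cells
    return buffer.getvalue()


csv_text = to_csv([
    {"title": "Story A", "url": "https://example.com/a", "urlResolved": True,
     "source": "Example Times", "pubDate": "2026-06-10T09:02:42.000Z",
     "description": "Short snippet", "mode": "search"},
])
```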