Google News Scraper + Article Extractor

Pricing

from $1.00 / 1,000 results

Search Google News by keyword or browse 8 built-in topics - World, Business, Technology, Sports, Science, Health and more. Optional add-on: extract full article text for NLP and sentiment analysis. Filter by site and date range, in 50+ languages. Near-zero start fee ($0.00005/run). Results cost $0.004 each on the Free tier, down to $0.001 on the Business tier.

Rating: 5.0 (2)
Developer: Phantom Coder (maintained by Community)
Actor stats: 3 bookmarked, 3 total users, 1 monthly active user, last modified 4 days ago

Extract structured news data from Google News by keyword search or topic feed - headlines, publishers, timestamps, article links, and optionally the full article body text, ready for export to JSON, CSV, or any downstream pipeline.

What you get

  • Title - article headline, cleaned (no publisher suffix)
  • URL - Google News article link that redirects to the full article
  • Source - publisher name (e.g., Reuters, BBC, TechCrunch)
  • Source URL - publisher homepage (e.g., reuters.com)
  • Published at - ISO 8601 timestamp
  • Snippet - article summary shown on Google News
  • Query - the keyword or topic that surfaced this article
  • Article URL (with extractFullText add-on) - direct publisher URL, resolved from the Google News redirect
  • Text (with extractFullText add-on) - full article body text extracted from the publisher page

Results are deduplicated automatically across multiple queries and topics in a single run.

How to use

  1. Go to the Input tab and enter one or more search queries, select topics, or both.
  2. Set optional filters: date range, site restriction, exclude words, language, and country.
  3. To get the full article text (for NLP, sentiment analysis, or content pipelines), enable Extract full article text in the Add-on section.
  4. Click Start. A typical run of 100 articles finishes in under 10 seconds (or ~15 minutes with full text extraction enabled).
  5. Download results from the Output tab as JSON, CSV, or XLSX - or connect directly via the Apify API.
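
For step 5, a minimal sketch of calling the Actor over the Apify API with only the Python standard library. It assumes the Apify API v2 endpoint that runs an Actor synchronously and returns its dataset items. The Actor ID passed in is a placeholder (this page does not state it), and `build_run_request` and `run_actor` are illustrative names, not part of any SDK.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs an Actor synchronously
    and asks for its dataset items as JSON."""
    # Actor IDs written as "username/actor-name" use "~" instead of "/" in API paths.
    path_id = urllib.parse.quote(actor_id.replace("/", "~"), safe="~")
    url = f"{API_BASE}/acts/{path_id}/run-sync-get-dataset-items?format=json"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and return the list of article records."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Usage would look like `run_actor("<username>/<actor-name>", "<your API token>", {"queries": ["Tesla earnings"]})`. Synchronous runs are time-limited by the platform, so runs with full text extraction, which take minutes, are probably better started asynchronously and read from the dataset afterwards.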

Note: Google News RSS returns up to 100 articles per query. To get broader coverage, run multiple focused queries (e.g., "Tesla earnings" and "Tesla stock") rather than one broad query.

Input configuration

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| queries | string[] | | Search keywords. Each query runs separately and results are deduplicated. |
| topics | enum[] | | Browse predefined Google News topic feeds. Can be combined with keyword queries. |
| maxResultsPerQuery | integer | 100 | Max articles per query or topic (1–100). |
| extractFullText | boolean | false | Fetch and extract full article body text from each publisher page. Adds articleUrl and text fields. Memory is set automatically. Charged separately per successful extraction. |
| dateRange | enum | anytime | Filter by recency. Applies to keyword searches only. |
| siteFilter | string | | Restrict results to one domain, e.g. reuters.com. |
| excludeWords | string[] | | Words to exclude from search results, e.g. ["opinion", "sponsored"]. |
| language | string | en | ISO 639-1 language code (e.g. en, de, fr, ja). |
| country | string | US | ISO 3166-1 country code (e.g. US, GB, DE, JP). |
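
The parameter table can be turned into a small pre-flight check before a run is started. The sketch below is an illustration only: `validate_input` is a hypothetical helper, and the rule that at least one query or topic must be present is an assumption drawn from step 1 of "How to use", not a documented constraint of the Actor.

```python
ALLOWED_KEYS = {
    "queries", "topics", "maxResultsPerQuery", "extractFullText",
    "dateRange", "siteFilter", "excludeWords", "language", "country",
}


def validate_input(run_input: dict) -> list:
    """Return a list of human-readable problems; an empty list means the input looks fine."""
    problems = []

    # Catch misspelled parameter names early.
    unknown = sorted(set(run_input) - ALLOWED_KEYS)
    if unknown:
        problems.append(f"unknown parameters: {', '.join(unknown)}")

    # Assumption: a run needs at least one keyword query or one topic.
    if not run_input.get("queries") and not run_input.get("topics"):
        problems.append("provide at least one query or topic")

    # maxResultsPerQuery is documented as an integer from 1 to 100 (default 100).
    limit = run_input.get("maxResultsPerQuery", 100)
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 100:
        problems.append("maxResultsPerQuery must be an integer from 1 to 100")

    # The array-typed parameters should be lists of strings.
    for key in ("queries", "topics", "excludeWords"):
        value = run_input.get(key, [])
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"{key} must be a list of strings")

    return problems
```

Calling `validate_input({"topics": ["BUSINESS"], "maxResultsPerQuery": 500})` reports the out-of-range limit, while a well-formed input returns an empty list.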

Example input - keyword search with filters:

{
  "queries": ["artificial intelligence regulation", "OpenAI GPT"],
  "maxResultsPerQuery": 50,
  "dateRange": "past_week",
  "excludeWords": ["opinion", "sponsored"],
  "language": "en",
  "country": "US"
}

Example input - topic browsing:

{
  "topics": ["TECHNOLOGY", "BUSINESS", "SCIENCE"],
  "maxResultsPerQuery": 100,
  "language": "en",
  "country": "US"
}

Example input - full article text extraction:

{
  "queries": ["climate change policy"],
  "maxResultsPerQuery": 50,
  "extractFullText": true
}

Available topics

| Value | Google News feed |
| --- | --- |
| WORLD | Top world headlines |
| NATION | US national news |
| BUSINESS | Business and finance |
| TECHNOLOGY | Tech and science |
| ENTERTAINMENT | Entertainment and culture |
| SPORTS | Sports news |
| SCIENCE | Science and environment |
| HEALTH | Health and medicine |

Note: siteFilter, excludeWords, and dateRange apply to keyword queries only. Topic feeds are curated by Google and have no query string to modify, so those filters have no effect when using topics.
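
Because these filters are silently ignored for topic feeds, it can help to check an input before scheduling it. A minimal sketch, assuming the behaviour described in the note above; `ignored_filters` is a hypothetical helper name.

```python
# Filters that, per the note above, only affect keyword queries.
KEYWORD_ONLY_FILTERS = ("siteFilter", "excludeWords", "dateRange")


def ignored_filters(run_input: dict) -> list:
    """Return the filters that will have no effect at all because the input
    contains topics but no keyword queries."""
    # With at least one keyword query the filters still apply to that part
    # of the run, so nothing is reported.
    if run_input.get("queries") or not run_input.get("topics"):
        return []
    ignored = []
    for name in KEYWORD_ONLY_FILTERS:
        value = run_input.get(name)
        # "anytime" is the documented default for dateRange, i.e. no filtering.
        if value and value != "anytime":
            ignored.append(name)
    return ignored
```

For example, `ignored_filters({"topics": ["BUSINESS"], "dateRange": "past_day"})` returns `["dateRange"]`, a hint that the date range should be dropped or a keyword query added.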

Output format

Each article is a flat JSON record:

{
  "title": "OpenAI releases GPT-5 with improved reasoning",
  "url": "https://news.google.com/rss/articles/CBMi...",
  "source": "The Verge",
  "sourceUrl": "https://www.theverge.com",
  "publishedAt": "2026-06-09T14:30:00.000Z",
  "snippet": "OpenAI releases GPT-5 with improved reasoning The Verge",
  "query": "OpenAI GPT"
}

For topic results, the query field is prefixed with "topic:", for example:

{
  "query": "topic:BUSINESS",
  ...
}
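
The query field makes it straightforward to separate keyword results from topic results after download. A minimal sketch, assuming records shaped like the examples above; `split_by_origin` is an illustrative name.

```python
from collections import defaultdict

TOPIC_PREFIX = "topic:"


def split_by_origin(records: list) -> dict:
    """Group records by the keyword query or topic feed that surfaced them."""
    keywords = defaultdict(list)
    topics = defaultdict(list)
    for record in records:
        query = record.get("query", "")
        if query.startswith(TOPIC_PREFIX):
            # "topic:BUSINESS" is stored under the key "BUSINESS".
            topics[query[len(TOPIC_PREFIX):]].append(record)
        else:
            keywords[query].append(record)
    return {"keywords": dict(keywords), "topics": dict(topics)}
```

One caveat of this approach: a keyword query that itself begins with "topic:" would be grouped with the topic feeds.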

Full article text extraction

When extractFullText: true is set, the Actor launches a browser for each article, resolves the Google News redirect to the real publisher URL, fetches the article page, and extracts the main body text. Two additional fields are added to each result where extraction succeeds:

{
  "title": "OpenAI releases GPT-5 with improved reasoning",
  "url": "https://news.google.com/rss/articles/CBMi...",
  "source": "The Verge",
  "sourceUrl": "https://www.theverge.com",
  "publishedAt": "2026-06-09T14:30:00.000Z",
  "snippet": "OpenAI releases GPT-5 with improved reasoning The Verge",
  "query": "OpenAI GPT",
  "articleUrl": "https://www.theverge.com/2026/6/9/openai-gpt5",
  "text": "OpenAI has released GPT-5, its most capable language model to date..."
}

What to expect:

  • ~60% success rate across mixed publishers. Open-access sites (BBC, Reuters, TechCrunch) extract reliably. Paywalled content (WSJ, FT, NYT) is attempted but returns no text.
  • Only successful extractions are charged - paywalled and inaccessible articles are not charged.
  • Memory is set automatically to 1,024 MB when this option is enabled.
  • Performance: ~8 seconds per article. A run of 50 articles with extraction takes roughly 7 minutes total.

Note: If extraction fails for a specific article (paywall, geo-block, or timeout), that article is still returned in the dataset - it just won't have articleUrl or text fields, and won't be charged for the add-on.
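
Since failed extractions stay in the dataset without the text field, downstream code has to handle both shapes. The sketch below separates the two and measures the achieved success rate; it assumes only the record layout shown above, and both function names are illustrative.

```python
def split_by_extraction(records: list) -> tuple:
    """Return (extracted, missing): records with body text and records without."""
    extracted = [r for r in records if r.get("text")]
    missing = [r for r in records if not r.get("text")]
    return extracted, missing


def success_rate(records: list) -> float:
    """Fraction of records whose full text was extracted (0.0 for an empty run)."""
    if not records:
        return 0.0
    extracted, _ = split_by_extraction(records)
    return len(extracted) / len(records)
```

Only the `extracted` list should be passed to a sentiment model or summarizer; the `missing` list still carries title, source, and snippet for headline-level analysis.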

Use cases and scheduling

Brand monitoring - run every hour to catch news as it breaks:

0 * * * *

Input: { "queries": ["your brand name", "your competitor"], "dateRange": "past_hour" }

Daily industry digest - aggregate tech and business headlines every morning:

0 8 * * *

Input: { "topics": ["TECHNOLOGY", "BUSINESS"], "maxResultsPerQuery": 100 }

NLP and sentiment analysis pipeline - extract full article text for downstream processing:

{
  "queries": ["electric vehicles", "EV market"],
  "maxResultsPerQuery": 50,
  "dateRange": "past_week",
  "extractFullText": true
}

Feed the text field into your sentiment model, summarizer, or LLM - no need to follow and scrape each article separately.

Weekly research report - broad keyword sweep once a week:

0 9 * * 1

Input: { "queries": ["your topic"], "dateRange": "past_week", "maxResultsPerQuery": 100 }

Site-specific monitoring - track what one publication says about a topic:

{
  "queries": ["climate change"],
  "siteFilter": "nytimes.com",
  "dateRange": "past_day"
}

Connect results to a Google Sheet or Slack channel via Apify integrations for zero-code delivery.

Pricing

This Actor uses pay-per-event pricing: you only pay for what you scrape.

Base results

| Tier | Price per article | Per 1,000 articles |
| --- | --- | --- |
| Free | $0.004 | $4.00 |
| Starter | $0.003 | $3.00 |
| Scale | $0.002 | $2.00 |
| Business | $0.001 | $1.00 |

Full text extraction add-on (extractFullText: true)

Charged in addition to the base result price, only when text is successfully extracted. Paywalled and inaccessible articles are not charged.

| Tier | Price per extracted article | Per 1,000 extractions |
| --- | --- | --- |
| Free | $0.010 | $10.00 |
| Starter | $0.008 | $8.00 |
| Scale | $0.006 | $6.00 |
| Business | $0.004 | $4.00 |

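Example: 100 articles on the Free tier, all successfully extracted = (100 × $0.004) + (100 × $0.010) = $1.40 total. Because only successful extractions are charged, a run where some extractions fail costs less.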

The per-run start fee is $0.00005 - effectively negligible. Most competitors in this category charge $0.05–$0.09 just to start a run, which adds up fast if you run the Actor frequently for monitoring. At 100 runs per month, that is $5–$9 in start fees alone before counting results.

Cost example (no extraction): 500 articles/day × 30 days = 15,000 articles/month. At the Free tier: $60. At the Business tier: $15.
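
The pricing rules above reduce to a short formula: articles × base price + successful extractions × add-on price + runs × start fee. A minimal sketch of a cost estimator using the prices from the two tables; `estimate_cost` is a hypothetical helper, and `Decimal` is used to avoid floating-point rounding on small per-item prices.

```python
from decimal import Decimal

# Prices copied from the pricing tables above (USD per item).
BASE_PRICE = {
    "Free": Decimal("0.004"),
    "Starter": Decimal("0.003"),
    "Scale": Decimal("0.002"),
    "Business": Decimal("0.001"),
}
EXTRACTION_PRICE = {
    "Free": Decimal("0.010"),
    "Starter": Decimal("0.008"),
    "Scale": Decimal("0.006"),
    "Business": Decimal("0.004"),
}
START_FEE = Decimal("0.00005")  # charged once per run


def estimate_cost(tier: str, articles: int, extracted: int = 0, runs: int = 0) -> Decimal:
    """Estimate the cost in USD for a number of articles, successful
    full-text extractions, and run starts on the given tier."""
    if extracted > articles:
        raise ValueError("extracted cannot exceed articles")
    return (
        articles * BASE_PRICE[tier]
        + extracted * EXTRACTION_PRICE[tier]
        + runs * START_FEE
    )
```

`estimate_cost("Free", 100, extracted=100)` reproduces the $1.40 example, and `estimate_cost("Free", 15000)` gives the $60 monthly figure. With the ~60% extraction success rate quoted earlier, 100 articles on the Free tier would come to about $1.00. An hourly schedule over 30 days is 720 runs, or roughly $0.036 in start fees.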

Limitations

  1. Google News RSS caps at 100 articles per query - this is a Google-imposed limit, not a scraper limitation. Use multiple focused queries to get broader coverage.
  2. The url field is a Google News redirect link. It opens the full article in a browser but requires JavaScript to resolve to the final publisher URL programmatically. Enable extractFullText to get the resolved articleUrl.
  3. The date range filter applies to keyword searches only - topic feeds always return the latest articles regardless of this setting.
  4. The snippet field is the summary shown on Google News, which for individual articles is often just the title followed by the publisher name.
  5. Proxy rotation is used for reliable RSS access - this is handled automatically, no setup needed.

Privacy and ethical use

This Actor reads Google News RSS feeds, a publicly available data source designed for programmatic access. With extractFullText enabled, it also fetches publicly accessible publisher pages. It does not bypass publisher paywalls or access user accounts or private content. Use it in accordance with Google's Terms of Service and applicable data protection regulations.