Guardian News Scraper
Pricing
from $2.00 / 1,000 results
Scrape full articles from The Guardian, including headline, body, authors, section, and tags. Supports `mode: latest` to fetch the newest articles via The Guardian's world RSS feed. HTTP-only.
Developer
Xtractoo
Maintained by Community
Actor stats
Bookmarked: 0
Total users: 3
Monthly active users: 2
Last modified: 2 days ago
The Guardian Article Scraper
Extract full article text, author, publication date, section, and description from any theguardian.com article URL. The Guardian is one of the world's most-read English-language news sites with extensive international coverage across politics, culture, and science.
Why Use This Actor?
- Academic research - Guardian long-form journalism is widely used in media studies and political research.
- Content curation - aggregate Guardian articles by topic for newsletters or reading lists.
- Sentiment and bias analysis - Guardian editorial stance makes it a reference in media bias research.
- Open access - Guardian content is freely available globally with no paywall or geo-restriction.
How It Works
This actor uses only HTTP requests - no browser, no Selenium, no Playwright. Articles are extracted in seconds with RAM usage well under 256 MB.
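As a rough illustration of what browserless extraction can look like, the sketch below pulls Open Graph metadata out of an HTML string using only Python's standard library. The Actor's real extraction logic is not documented here, so the `MetaExtractor` class, the `extract_meta` helper, and the sample HTML are all hypothetical.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect <meta property="og:..."> values from raw HTML (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        if prop.startswith("og:") and attr.get("content") is not None:
            # Keep the first value seen for each property, without the "og:" prefix.
            self.meta.setdefault(prop[3:], attr["content"])


def extract_meta(html: str) -> dict:
    parser = MetaExtractor()
    parser.feed(html)
    return parser.meta


# Hand-written sample HTML, not a real Guardian page.
sample_html = """
<html><head>
<meta property="og:title" content="Example headline">
<meta property="og:description" content="Example standfirst">
<meta name="viewport" content="width=device-width">
</head><body><p>Body text</p></body></html>
"""

print(extract_meta(sample_html))
```

Because nothing here renders JavaScript, memory use stays small, which is consistent with the HTTP-only approach described above.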
Input
{"url": "https://www.theguardian.com/world/2026/apr/13/example-article","urls": ["https://www.theguardian.com/world/2026/apr/13/article-one","https://www.theguardian.com/technology/2026/apr/12/article-two"],"mode": "article","limit": 10}
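The input above can also be assembled programmatically. The following is a minimal sketch, assuming that passing `urls` alone (without `url`) is accepted, which the example does not confirm; `build_article_input` is a hypothetical helper, not part of the Actor.

```python
import json
from urllib.parse import urlparse


def build_article_input(urls, limit=10):
    """Build an article-mode input dict from a list of Guardian URLs."""
    cleaned = []
    for url in urls:
        parts = urlparse(url)
        host = parts.hostname or ""
        # Accept theguardian.com and its subdomains over HTTPS only.
        on_guardian = host == "theguardian.com" or host.endswith(".theguardian.com")
        if parts.scheme != "https" or not on_guardian:
            raise ValueError(f"Not a Guardian article URL: {url}")
        if url not in cleaned:  # drop duplicates, keep order
            cleaned.append(url)
    return {"urls": cleaned, "mode": "article", "limit": limit}


payload = build_article_input(
    [
        "https://www.theguardian.com/world/2026/apr/13/article-one",
        "https://www.theguardian.com/technology/2026/apr/12/article-two",
    ]
)
print(json.dumps(payload, indent=2))
```

Validating the host up front avoids spending a run on URLs the Actor was not built for.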
Output
{"url": "https://www.theguardian.com/world/2026/may/15/mali-airstrikes-rebel-alliance-separatists","source": "The Guardian","title": "Mali’s forces target rebel alliance in junta’s fight to keep power","description": "Army supported by Russian mercenaries launches airstrikes after offensive by coalition of Islamist extremists and Tuareg separatists","content": "Mali’s armed forces, supported by Russian mercenaries, have launched airstrikes targeting a rebel alliance of Islamist extremists and Tuareg separatists as the ruling junta struggles to maintain its hold on power in the unstable west African country. Earlier this week warplanes targeted the key northern town of Kidal, which was lost when the rebels launched a surprise offensive across much of Mali in late April....","image": "https://i.guim.co.uk/img/media/e6d26af1123d872554af9a427c5d33abf01bc499/650_22_3090_2473/master/3090.jpg?width=1200&height=630&quality=85&auto=format&fit=crop&precrop=40:21,offset-x50,offset-y0&overlay-align=bottom%2Cleft&overlay-width=100p&overlay-base64=L2ltZy9zdGF0aWMvb3ZlcmxheXMvdGctZGVmYXVsdC5wbmc&enable=upscale&s=46f9527d36a676fc922f988649bb5fe9","language": "en","word_count": 847,"published_date": "2026-05-15T14:57:35.000Z","modified_date": "","authors": [],"categories": "","tags": ""}
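When consuming these records downstream, note that `modified_date` can be an empty string and `published_date` is an ISO 8601 timestamp with a trailing `Z`. The sketch below shows one way to load a record into a typed structure; the `Article` dataclass and both helper functions are hypothetical, and the sample record is abbreviated from the output above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Article:
    url: str
    title: str
    word_count: int
    published: Optional[datetime]
    modified: Optional[datetime]


def parse_iso(value: str) -> Optional[datetime]:
    """Parse a timestamp like 2026-05-15T14:57:35.000Z; empty strings become None."""
    if not value:
        return None
    # Older Python versions reject a trailing "Z", so spell out the UTC offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def to_article(record: dict) -> Article:
    return Article(
        url=record["url"],
        title=record["title"],
        word_count=int(record.get("word_count", 0)),
        published=parse_iso(record.get("published_date", "")),
        modified=parse_iso(record.get("modified_date", "")),
    )


# Abbreviated copy of the sample output record.
sample_record = {
    "url": "https://www.theguardian.com/world/2026/may/15/mali-airstrikes-rebel-alliance-separatists",
    "title": "Mali’s forces target rebel alliance in junta’s fight to keep power",
    "word_count": 847,
    "published_date": "2026-05-15T14:57:35.000Z",
    "modified_date": "",
}

article = to_article(sample_record)
print(article.published, article.modified)
```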
Fetch Latest News
Set mode to "latest" to fetch the newest article URLs and titles from The Guardian instead of extracting a single article.
Input:
{"mode": "latest","limit": 10}
Output - array of objects:
[{"url": "https://www.theguardian.com/world/2026/apr/20/madagascar-gen-z-protesters-fear-new-regime","title": "Arrests fuel fears among Madagascar’s gen Z protesters that new regime no better than one they overthrew","published_date": "Mon, 20 Apr 2026 04:00:02 GMT","source": "The Guardian"}//...]
Source: https://www.theguardian.com/world/rss (RSS feed)
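To make the mapping from RSS to this output shape concrete, here is a minimal sketch that converts RSS `<item>` elements into the same four fields. It is an illustration only and not the Actor's actual code; `rss_to_items` is a hypothetical helper, and the sample feed is hand-written (the second item is invented for the example).

```python
import xml.etree.ElementTree as ET


def rss_to_items(rss_xml: str, limit: int = 10) -> list:
    """Map RSS <item> elements to latest-mode style dicts, newest first as listed."""
    root = ET.fromstring(rss_xml.strip())
    items = []
    for item in root.iter("item"):
        items.append(
            {
                "url": (item.findtext("link") or "").strip(),
                "title": (item.findtext("title") or "").strip(),
                "published_date": (item.findtext("pubDate") or "").strip(),
                "source": "The Guardian",
            }
        )
        if len(items) >= limit:
            break
    return items


# Hand-written sample feed; it does not reproduce the real feed's full structure.
sample_rss = """
<rss version="2.0">
  <channel>
    <title>World news | The Guardian</title>
    <item>
      <title>Arrests fuel fears among Madagascar’s gen Z protesters that new regime no better than one they overthrew</title>
      <link>https://www.theguardian.com/world/2026/apr/20/madagascar-gen-z-protesters-fear-new-regime</link>
      <pubDate>Mon, 20 Apr 2026 04:00:02 GMT</pubDate>
    </item>
    <item>
      <title>Hypothetical second headline</title>
      <link>https://www.theguardian.com/world/2026/apr/19/hypothetical-second-article</link>
      <pubDate>Sun, 19 Apr 2026 18:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""

for entry in rss_to_items(sample_rss, limit=10):
    print(entry["published_date"], entry["url"])
```

Note that `published_date` in this mode keeps the RSS date format (`Mon, 20 Apr 2026 04:00:02 GMT`), unlike the ISO 8601 timestamps returned in article mode.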
Cron Schedule: Auto-Fetch Newest Articles
Combine mode: "latest" and mode: "article" to keep a fresh feed running on autopilot:
- Schedule a recurring run of this Actor with `{"mode": "latest", "limit": 20}` via Apify Schedules (UI ▸ Schedules ▸ Create new). A cron expression like `*/30 * * * *` runs it every 30 minutes.
- Webhook the dataset of the latest run into another Actor run with `mode: "article"` and the new URLs as input. Apify integrations let you chain runs via the "Actor finished" webhook without any glue code.
- The article-mode run extracts the full body, image, authors, and metadata for each URL and appends them to your master dataset.
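The webhook chain itself needs no code. If you also want to avoid re-extracting URLs that earlier runs already handled, a small filtering step could sit between the two runs. The sketch below assumes you keep a set of previously seen URLs somewhere persistent; `next_article_input` is a hypothetical helper, and the `limit` field is omitted from its output because its meaning in article mode is not documented here.

```python
def next_article_input(latest_items, seen_urls, limit=20):
    """Turn latest-mode items into an article-mode input, skipping known URLs.

    Returns None when there is nothing new to fetch.
    """
    fresh = []
    for item in latest_items:
        url = item.get("url")
        if url and url not in seen_urls and url not in fresh:
            fresh.append(url)
    fresh = fresh[:limit]
    seen_urls.update(fresh)  # remember what was queued so the next run skips it
    if not fresh:
        return None
    return {"mode": "article", "urls": fresh}


# Hypothetical latest-mode results from two consecutive scheduled runs.
seen = set()
run_one = [
    {"url": "https://www.theguardian.com/world/2026/apr/20/article-a"},
    {"url": "https://www.theguardian.com/world/2026/apr/20/article-b"},
]
run_two = run_one + [{"url": "https://www.theguardian.com/world/2026/apr/20/article-c"}]

print(next_article_input(run_one, seen))
print(next_article_input(run_two, seen))
```

The second call only queues `article-c`, since the first two URLs were recorded during the first call.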
Common cron expressions:
| Frequency | Cron |
|---|---|
| Every 15 minutes | */15 * * * * |
| Hourly | 0 * * * * |
| Every 6 hours | 0 */6 * * * |
| Daily at 06:00 UTC | 0 6 * * * |
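To sanity-check an expression before saving a schedule, a tiny matcher can confirm when it would fire. This is a simplified sketch that supports only `*`, `*/n`, and plain numbers, which covers the expressions in the table; it does not handle ranges, lists, or cron's rule for combining day-of-month with day-of-week, and it says nothing about how Apify parses schedules.

```python
from datetime import datetime


def field_matches(field: str, value: int, start: int = 0) -> bool:
    """Match one cron field; `start` is the lowest legal value for that field."""
    if field == "*":
        return True
    if field.startswith("*/"):
        # Steps count from the start of the field's range (0 for minutes, 1 for days).
        return (value - start) % int(field[2:]) == 0
    return int(field) == value


def cron_matches(expr: str, when: datetime) -> bool:
    minute, hour, dom, month, dow = expr.split()
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(dom, when.day, start=1)
        and field_matches(month, when.month, start=1)
        # cron counts Sunday as 0; isoweekday() counts it as 7.
        and field_matches(dow, when.isoweekday() % 7)
    )


moment = datetime(2026, 4, 20, 6, 0)  # treated as UTC for this example
for expr in ["*/15 * * * *", "0 * * * *", "0 */6 * * *", "0 6 * * *"]:
    print(expr, cron_matches(expr, moment))
```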
Notes
- The Guardian rarely paywalls content; full article text is usually returned.
- For high-volume production use, register for The Guardian's free Content API.
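For readers who go the Content API route, the sketch below builds a search request URL. The endpoint and parameter names (`q`, `page-size`, `order-by`, `show-fields`, `api-key`) follow The Guardian's Open Platform documentation as best understood and should be verified against it; `build_search_url` and the placeholder key are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Search endpoint of The Guardian Content API (verify against the official docs).
CONTENT_API_SEARCH = "https://content.guardianapis.com/search"


def build_search_url(query: str, api_key: str, page_size: int = 10) -> str:
    """Build a Content API search URL; the caller is responsible for sending it."""
    params = {
        "q": query,
        "page-size": page_size,
        "order-by": "newest",
        "show-fields": "headline,byline,bodyText",
        "api-key": api_key,
    }
    return f"{CONTENT_API_SEARCH}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder; register with The Guardian to obtain a real key.
print(build_search_url("mali airstrikes", "YOUR_API_KEY"))
```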
Other News Actors
Need a different news source? All actors in this collection:
| Actor | Source |
|---|---|
| aljazeera-scraper | Al Jazeera |
| apnews-scraper | AP News |
| bbc-scraper | BBC News |
| cnbc-scraper | CNBC |
| forbes-scraper | Forbes |
| fortune-scraper | Fortune |
| ft-scraper | Financial Times |
| guardian-scraper | The Guardian |
| msn-scraper | MSN News |
| nytimes-scraper | New York Times |
| reuters-scraper | Reuters |
| scmp-scraper | South China Morning Post |
| techcrunch-scraper | TechCrunch |
| upi-scraper | UPI |
| yahoo-finance-scraper | Yahoo Finance |
| smart-news-loader | Any URL - adaptive HTTP loader |
| bloomberg-scraper | Bloomberg |
All actors support mode: "latest" for fetching newest article URLs from each source.