GDELT Worldwide News Scraper
Pricing
$2.00 / 1,000 articles returned
Search worldwide news with the public GDELT 2.0 DOC API (no API key, no login). Filter by timespan, source country, and language; sort by newest, oldest, or relevance. Returns clean articles with title, URL, domain, country, language, publish date, and image.
Developer
Dami's Studio
Maintained by Community
Search worldwide news through the GDELT 2.0 DOC API — a public, no-key, no-login JSON endpoint that indexes online news from across the planet in 65+ languages. No proxy or anti-bot handling needed.
What it does
Given a search query, the actor calls the GDELT DOC API in ArtList mode and returns clean, structured articles. You can filter by recency, source country, and source language, and sort by newest, oldest, or relevance.
Input
| Field | Type | Default | Notes |
|---|---|---|---|
| `query` | string | (required) | Keywords. Quote phrases: `"climate change"`. Very short or common single words may be rejected by GDELT. |
| `maxItems` | integer | 100 | Capped at 250; GDELT returns at most 250 articles per request. |
| `sort` | string | `DateDesc` | `DateDesc` (newest), `DateAsc` (oldest), `HybridRel` (relevance). |
| `timespan` | string | - | Recency window, e.g. `1d`, `3d`, `1w`, `1m`, `3m`. GDELT covers roughly the last 3 months. |
| `sourceCountry` | string | - | Appended to the query as `sourcecountry:{code}` (e.g. `US`, `UK`, `FR`). |
| `sourceLang` | string | - | Appended to the query as `sourcelang:{code}` (e.g. `english`, `french`). |
| `proxyConfiguration` | object | - | Optional; not needed (public API). |
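
As a rough illustration of how the input fields above could map onto a GDELT DOC API request, here is a minimal Python sketch. The helper name `build_doc_url` and its argument names are hypothetical and are not taken from the actor's source; the endpoint and the `query`, `mode`, `format`, `maxrecords`, `sort`, and `timespan` parameters are those of the public DOC API.

```python
from urllib.parse import quote, urlencode

# Public GDELT 2.0 DOC API endpoint.
GDELT_DOC_URL = "https://api.gdeltproject.org/api/v2/doc/doc"
MAX_RECORDS = 250  # GDELT returns at most 250 articles per request


def build_doc_url(query, max_items=100, sort="DateDesc",
                  timespan=None, source_country=None, source_lang=None):
    """Build an ArtList request URL from actor-style input (hypothetical helper)."""
    # Country and language filters are appended to the query as operators.
    terms = [query.strip()]
    if source_country:
        terms.append(f"sourcecountry:{source_country}")
    if source_lang:
        terms.append(f"sourcelang:{source_lang}")

    params = {
        "query": " ".join(terms),
        "mode": "ArtList",
        "format": "json",
        # Clamp to the 1..250 range the API supports.
        "maxrecords": max(1, min(int(max_items), MAX_RECORDS)),
        "sort": sort,
    }
    if timespan:
        params["timespan"] = timespan

    # quote_via=quote encodes spaces as %20 and phrase quotes as %22.
    return f"{GDELT_DOC_URL}?{urlencode(params, quote_via=quote)}"


if __name__ == "__main__":
    print(build_doc_url('"climate change"', max_items=50, timespan="1w",
                        source_country="US", source_lang="english"))
```

Clamping `maxItems` before the request keeps the 250-article cap in one place rather than relying on GDELT to truncate the response.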
Output
Each successful row:
{"ok": true,"title": "…","url": "https://…","domain": "example.com","sourceCountry": "United States","language": "English","publishedAt": "2026-06-11T12:00:00.000Z","socialImage": "https://…"}
Results are de-duplicated by URL. Each `ok:true` article is billed as one article charge unit. Diagnostic rows (`ok:false`) and empty or blocked runs are never charged.
Nullable fields: GDELT does not always populate every field. Any of title, url, domain, sourceCountry, language, publishedAt, and socialImage can be null for a given article (e.g. socialImage is often missing, and publishedAt is null when GDELT's seendate is unparseable). Rows with neither a url nor a title are dropped before charging.
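
The de-duplication and null handling described above could be sketched in Python as follows. This is an assumption-laden illustration, not the actor's code: the raw field names (`url`, `title`, `domain`, `sourcecountry`, `language`, `seendate`, `socialimage`) and the `YYYYMMDDTHHMMSSZ` timestamp format are assumed from typical ArtList responses, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_seendate(raw):
    """Convert an assumed 'YYYYMMDDTHHMMSSZ' timestamp to ISO 8601, or None."""
    try:
        dt = datetime.strptime(raw, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)
    except (TypeError, ValueError):
        return None  # missing or unparseable seendate
    return dt.strftime("%Y-%m-%dT%H:%M:%S.000Z")


def normalize_articles(raw_articles):
    """Map raw GDELT articles to output rows, de-duplicated by URL."""
    rows, seen_urls = [], set()
    for art in raw_articles:
        url = art.get("url") or None      # empty strings become None
        title = art.get("title") or None
        if url is None and title is None:
            continue  # nothing usable: dropped before charging
        if url is not None:
            if url in seen_urls:
                continue  # duplicate URL
            seen_urls.add(url)
        rows.append({
            "ok": True,
            "title": title,
            "url": url,
            "domain": art.get("domain") or None,
            "sourceCountry": art.get("sourcecountry") or None,
            "language": art.get("language") or None,
            "publishedAt": parse_seendate(art.get("seendate")),
            "socialImage": art.get("socialimage") or None,
        })
    return rows
```

Consumers of the dataset should treat every field except `ok` as optional, for example by checking `row["publishedAt"] is not None` before sorting by date.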
Diagnostics
The actor never fails silently. Instead it writes a single diagnostic row (ok:false) with an errorCode and never charges for it:
- `BAD_INPUT`: GDELT rejected the query (e.g. "query too short"). Quote phrases and avoid overly short or common terms.
- `NO_RESULTS`: the query was valid but matched nothing. Broaden it or widen the timespan.
- `RATE_LIMITED` / `SERVER_ERROR` / `NETWORK`: transient issues; the actor retried with backoff first.
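
One way to picture how a raw response might be mapped to these codes is the Python sketch below. It is a guess at the logic, not the actor's implementation: the function name `classify_response`, the `message` field, and the exact status-code thresholds are assumptions. `NETWORK` is omitted because it would come from a connection failure rather than from a response.

```python
import json


def classify_response(status, body):
    """Return (articles, diagnostic); exactly one of the two is None."""
    def diag(code, message):
        # Diagnostic rows carry ok:false and are never charged.
        return None, {"ok": False, "errorCode": code, "message": message}

    if status == 429:
        return diag("RATE_LIMITED", "HTTP 429 from GDELT")
    if status >= 500:
        return diag("SERVER_ERROR", f"HTTP {status} from GDELT")

    text = (body or "").strip()
    if not text:
        return diag("BAD_INPUT", "empty response body")
    try:
        data = json.loads(text)  # guarded: GDELT may answer in plain text
    except json.JSONDecodeError:
        return diag("BAD_INPUT", text[:200])

    articles = data.get("articles") if isinstance(data, dict) else None
    if not articles:
        return diag("NO_RESULTS", "query matched no articles")
    return articles, None
```

Checking the body rather than only the status code matters here because an error string can arrive with HTTP 200.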
Notes / quirks
- GDELT requires the query to be URL-encoded and phrases to be quoted; the actor handles both.
- On a malformed query GDELT may return a `text/plain` error string (sometimes with HTTP 200) or an empty body instead of JSON. The actor guards `JSON.parse` and surfaces a clear `BAD_INPUT` diagnostic.
- GDELT's index covers roughly the last 3 months of news.
- The actor rotates a real browser User-Agent per request attempt for retry resilience, and supports an optional proxy (`proxyConfiguration`). Neither is required, since GDELT is a public no-key API with no anti-bot protection, so leave the proxy unset unless you hit IP-level rate limits.
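
The retry behaviour mentioned in these notes (backoff plus a fresh User-Agent per attempt) might look roughly like the Python sketch below. Everything in it is illustrative: the User-Agent strings are placeholders, the attempt count and delays are assumptions, and `fetch` is a hypothetical callable taking `(url, headers)` and returning `(status, body)`, injected so the logic can be exercised without network access.

```python
import itertools
import time

# Illustrative placeholder User-Agent strings, not the actor's actual list.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.4 Safari/605.1.15",
]


def fetch_with_retry(fetch, url, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url, headers) -> (status, body), retrying transient failures."""
    agents = itertools.cycle(USER_AGENTS)
    last_error = None
    for attempt in range(max_attempts):
        headers = {"User-Agent": next(agents)}  # rotate per attempt
        try:
            status, body = fetch(url, headers)
            if status != 429 and status < 500:
                return status, body  # success or a non-transient answer
            last_error = f"HTTP {status}"
        except OSError as exc:  # network-level failure
            last_error = str(exc)
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 1s, 2s, 4s
    raise RuntimeError(f"giving up after {max_attempts} attempts: {last_error}")
```

A caller would catch the final `RuntimeError` and write the matching diagnostic row instead of letting the run fail.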