Wikidata Scraper

Pricing

$0.20 / 1,000 items returned

Search Wikidata by name or resolve Q-ids to full records via the public Wikibase API. Get labels, descriptions, aliases, instance-of, occupation, citizenship, simplified claims, and the Wikipedia link. No key, no login.

Developer

Dami's Studio

Maintained by Community

Search Wikidata by name or phrase, or resolve a list of Q-ids into full, structured records — straight from the public MediaWiki / Wikibase API. No API key, no login.

Two modes:

  • Search — free-text query (e.g. douglas adams) → matching entities with their Q-id, label and description.
  • Entities — a list of Q-ids (e.g. ["Q42","Q5"]) → full records: label, description, aliases, instance-of, occupation, citizenship, a simplified claims summary, and the linked Wikipedia article.
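As a rough illustration of how the two modes could map onto the public Wikibase API, the sketch below builds query parameters for the wbsearchentities and wbgetentities actions. The helper names (search_params, entities_params) are hypothetical and the actor's real requests are not shown in this listing, so treat the mapping as an assumption rather than its implementation.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"


def search_params(query: str, language: str = "en", limit: int = 10) -> dict[str, str]:
    """Parameters for a free-text search (wbsearchentities)."""
    return {
        "action": "wbsearchentities",
        "search": query,
        "language": language,
        "limit": str(min(limit, 50)),  # search is capped at 50 per query
        "format": "json",
    }


def entities_params(ids: list[str], language: str = "en") -> dict[str, str]:
    """Parameters for resolving Q-ids to full records (wbgetentities)."""
    return {
        "action": "wbgetentities",
        "ids": "|".join(ids),  # the API takes a pipe-separated id list
        "languages": f"{language}|mul",  # chosen language plus multilingual
        "props": "labels|descriptions|aliases|claims|sitelinks",
        "format": "json",
    }


if __name__ == "__main__":
    print(f"{API_URL}?{urlencode(search_params('douglas adams'))}")
    print(f"{API_URL}?{urlencode(entities_params(['Q42', 'Q5']))}")
```

Keeping parameter construction separate from the HTTP call makes each mode easy to check without touching the network.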

Input

Field      Notes
mode       search (default) or entities.
query      Search term. Used in search mode.
ids        Array of Q-ids, e.g. ["Q42","Q5"]. Used in entities mode. Batched 50/request.
language   Term language, e.g. en, fr, de. Falls back to the multilingual (mul) value when missing.
maxItems   Max entities to return (search mode is capped at 50 per query by the API).
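The batching of ids into groups of 50 can be sketched as follows. This is a minimal illustration, not the actor's code: the names clean_ids and batch_ids are hypothetical, and the clean-up rules (trim, upper-case, drop anything that is not a Q-id, de-duplicate) are assumptions about what sensible input handling might look like.

```python
import re
from collections.abc import Iterator

QID_PATTERN = re.compile(r"^Q[1-9][0-9]*$")
BATCH_SIZE = 50  # batch size stated for the ids field


def clean_ids(raw_ids: list[str]) -> list[str]:
    """Trim, upper-case, validate and de-duplicate Q-ids, keeping input order."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for raw in raw_ids:
        qid = raw.strip().upper()
        if QID_PATTERN.match(qid) and qid not in seen:
            seen.add(qid)
            cleaned.append(qid)
    return cleaned


def batch_ids(ids: list[str], size: int = BATCH_SIZE) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` ids, one per API request."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


if __name__ == "__main__":
    ids = clean_ids([" q42", "Q5", "Q42", "P31"])
    for batch in batch_ids(ids):
        print("|".join(batch))
```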

Output

One dataset row per entity.

Search mode rows:

{ ok, id, label, description, aliases[], url }

Entities mode rows:

{ ok, id, label, description, aliases[],
instanceOf[], // P31, as Q-ids
occupation[], // P106, as Q-ids
countryOfCitizenship[], // P27, as Q-ids
claimsSummary, // { Pxx: [simplified values] }
enwikiTitle, enwikiUrl, // linked English Wikipedia article (null if none)
url } // https://www.wikidata.org/wiki/{id}
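To make the entities-mode fields concrete, here is a hedged sketch of how the claim-derived and sitelink fields could be read from one entity object as returned by wbgetentities. The function names are hypothetical, and the way the English Wikipedia URL is built from the sitelink title is an assumption.

```python
from typing import Any, Optional
from urllib.parse import quote


def entity_ids(claims: dict[str, Any], prop: str) -> list[str]:
    """Collect the Q-ids referenced by one property's statements."""
    ids: list[str] = []
    for statement in claims.get(prop, []):
        value = statement.get("mainsnak", {}).get("datavalue", {}).get("value")
        if isinstance(value, dict) and "id" in value:
            ids.append(value["id"])
    return ids


def claim_fields(entity: dict[str, Any]) -> dict[str, Any]:
    """Build the P31 / P106 / P27 and Wikipedia-link fields of an output row."""
    claims = entity.get("claims", {})
    title: Optional[str] = entity.get("sitelinks", {}).get("enwiki", {}).get("title")
    return {
        "id": entity["id"],
        "instanceOf": entity_ids(claims, "P31"),
        "occupation": entity_ids(claims, "P106"),
        "countryOfCitizenship": entity_ids(claims, "P27"),
        "enwikiTitle": title,
        # Assumed URL scheme: spaces become underscores, the rest is percent-encoded.
        "enwikiUrl": (
            "https://en.wikipedia.org/wiki/" + quote(title.replace(" ", "_"))
            if title
            else None
        ),
        "url": f"https://www.wikidata.org/wiki/{entity['id']}",
    }
```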

How claims are simplified

claimsSummary flattens each property's statements into a list of scalar values:

  • wikibase-entityid → "Q…" (the referenced item; not re-resolved, so look it up separately if you need its label)
  • time → the time string (e.g. +1952-03-11T00:00:00Z)
  • quantity → the amount (e.g. +1.96)
  • monolingualtext / string → the text
  • globecoordinate → "lat,lon"
  • anything else → compact JSON, trimmed

To keep rows a sensible size, the summary keeps up to 60 properties and up to 20 values per property. instanceOf, occupation and countryOfCitizenship are surfaced as their own fields for convenience (also as Q-ids).
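The flattening rules above can be sketched roughly like this, working on the claims object of a wbgetentities response. It is an illustration under stated assumptions, not the actor's code: the function names are hypothetical, the 200-character trim for the fallback JSON is a guess (the listing only says "trimmed"), and skipping statements that carry no value is likewise assumed.

```python
import json
from typing import Any

MAX_PROPERTIES = 60   # limit stated above
MAX_VALUES = 20       # limit stated above
MAX_JSON_CHARS = 200  # assumed trim length; the real value is not documented


def simplify_datavalue(datavalue: dict[str, Any]) -> str:
    """Reduce one Wikibase datavalue to a scalar string."""
    kind = datavalue.get("type")
    value = datavalue.get("value")
    if kind == "string":
        return str(value)
    if isinstance(value, dict):
        if kind == "wikibase-entityid":
            return str(value.get("id", ""))
        if kind == "time":
            return str(value.get("time", ""))
        if kind == "quantity":
            return str(value.get("amount", ""))
        if kind == "monolingualtext":
            return str(value.get("text", ""))
        if kind == "globecoordinate":
            return f"{value.get('latitude')},{value.get('longitude')}"
    # Anything else: compact JSON, trimmed.
    compact = json.dumps(value, separators=(",", ":"), ensure_ascii=False)
    return compact[:MAX_JSON_CHARS]


def simplify_claims(claims: dict[str, list[dict[str, Any]]]) -> dict[str, list[str]]:
    """Build { Pxx: [simplified values] } within the property and value limits."""
    summary: dict[str, list[str]] = {}
    for prop, statements in list(claims.items())[:MAX_PROPERTIES]:
        values: list[str] = []
        for statement in statements:
            datavalue = statement.get("mainsnak", {}).get("datavalue")
            if datavalue is None:
                continue  # "somevalue" / "novalue" snaks carry no datavalue
            values.append(simplify_datavalue(datavalue))
            if len(values) == MAX_VALUES:
                break
        if values:
            summary[prop] = values
    return summary
```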

The multilingual (mul) fallback

Many international entities store their canonical label/aliases under Wikidata's special mul (multilingual) language rather than per-language. For example, Q42's label lives under mul, not en. This actor always requests both your chosen language and mul, so you still get label: "Douglas Adams" for Q42.
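A minimal sketch of that fallback, assuming the usual Wikibase JSON shape for labels ({ code: { value } }) and aliases ({ code: [{ value }] }). The helper names pick_label and pick_aliases are hypothetical.

```python
from typing import Any, Optional


def pick_label(labels: dict[str, dict[str, Any]], language: str) -> Optional[str]:
    """Return the label in `language`, else the multilingual (mul) label."""
    for code in (language, "mul"):
        entry = labels.get(code)
        if entry and entry.get("value"):
            return entry["value"]
    return None


def pick_aliases(aliases: dict[str, list[dict[str, Any]]], language: str) -> list[str]:
    """Return aliases in `language`, else the multilingual (mul) aliases."""
    for code in (language, "mul"):
        entries = aliases.get(code)
        if entries:
            return [entry["value"] for entry in entries]
    return []


if __name__ == "__main__":
    # Illustrative input: a label stored only under "mul".
    labels = {"mul": {"language": "mul", "value": "Douglas Adams"}}
    print(pick_label(labels, "en"))
```

The same lookup order serves descriptions as well, although descriptions are normally stored per language.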

Pricing

Pay-per-result: you are charged one item per genuine entity row (ok: true). You are never charged for:

  • empty/invalid input — a single ok: false diagnostic row with errorCode: "BAD_INPUT",
  • no matches / no existing ids (NO_RESULTS),
  • rate limits or network errors (RATE_LIMITED / NETWORK).
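The billing rule can be worked through with a small calculation. This sketch assumes the listed price of $0.20 per 1,000 items and that only rows with ok set to true count; the function names are hypothetical, and Decimal is used simply to avoid floating-point rounding in currency arithmetic.

```python
from decimal import Decimal
from typing import Any

PRICE_PER_1000 = Decimal("0.20")  # listed price in USD per 1,000 items


def billable_rows(rows: list[dict[str, Any]]) -> int:
    """Count rows that are genuine entity results (ok is true)."""
    return sum(1 for row in rows if row.get("ok") is True)


def estimated_cost(rows: list[dict[str, Any]]) -> Decimal:
    """Estimated charge in USD for a dataset of rows."""
    return PRICE_PER_1000 * billable_rows(rows) / 1000


if __name__ == "__main__":
    rows = [
        {"ok": True, "id": "Q42"},
        {"ok": True, "id": "Q5"},
        {"ok": False, "errorCode": "NO_RESULTS"},  # diagnostic row, not billed
    ]
    print(billable_rows(rows), estimated_cost(rows))
```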

Proxy

The Wikidata API is a public JSON API that requires no authentication and has no anti-bot protection, so no proxy is needed and the actor runs without one by default (saving proxy credits). Enable Apify Proxy only if you hit IP rate limits at very high volume. A descriptive User-Agent is sent on every request, per Wikimedia's API etiquette.
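For readers calling the API directly, a direct (no proxy) request with a descriptive User-Agent might look like the sketch below. It uses only the Python standard library. The User-Agent string is a placeholder and the function names are hypothetical; the header the actor actually sends is not given in this listing.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"
# Placeholder: replace with your own tool name and contact details.
USER_AGENT = "example-wikidata-client/0.1 (https://example.org/contact)"


def build_request(params: dict[str, str]) -> urllib.request.Request:
    """Create a GET request that identifies the client via User-Agent."""
    url = f"{API_URL}?{urlencode(params)}"
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_json(params: dict[str, str], timeout: float = 30.0) -> Any:
    """Send the request directly (no proxy configured) and decode the JSON body."""
    with urllib.request.urlopen(build_request(params), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_request({"action": "wbgetentities", "ids": "Q42", "format": "json"})
    print(request.full_url)
```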

Examples

Search:

{ "mode": "search", "query": "douglas adams", "language": "en", "maxItems": 10 }

Resolve entities:

{ "mode": "entities", "ids": ["Q42", "Q5"], "language": "en" }