Wikidata Entity Search Scraper
Pricing: from $14.00 / 1,000 result items
Developer: ParseForge
🌐 Wikidata Entity Search Scraper
🚀 Search Wikidata's open knowledge graph of 100M+ entities by name.
🕒 Last updated: 2026-05-06 · 📊 22 fields per record · 100M+ entities · people, places, brands, books, films, concepts · all claims, sitelinks, multilingual labels
The Wikidata Entity Search Scraper searches Wikidata's open knowledge graph of 100M+ entities by name. Output includes the canonical Q-ID, label, description, aliases, all claims (P-properties), sitelinks to every Wikipedia language edition, and structured facts.
Wikidata is the structured-data backbone of Wikipedia and one of the largest open knowledge graphs in the world. Filters run server-side, so a single run can resolve every entity matching a name, fetch full claim trees, or pull entities in non-English languages.
| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| ML pipelines, knowledge-graph engineers, journalists, fact-checkers, content recommendation engines, search developers | Entity resolution, knowledge-graph augmentation, fact-checking, content enrichment, multilingual search, ML training datasets |
📋 What the Wikidata Entity Search Scraper does
Five filtering workflows in a single run:
- 🔍 Free-text search. Match entity labels and aliases.
- 🌐 Multilingual. Search in 20+ languages (en, es, fr, de, it, ja, zh, ko, ar, hi, pt, nl, ru).
- 🆔 Item or property. Search Q-entities (items) or P-entities (properties).
- 📊 Full claims fetch. Optional: pull every statement, sitelink, and structured fact per entity.
- 🏷️ Image extraction. Auto-extracts the entity's primary image from claim P18.
💡 Why it matters: clean, server-side filtering removes the parser-and-pagination work from your team and keeps your dataset fresh on every run.
🎬 Full Demo
🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.
⚙️ Input
| Input | Type | Default | Behavior |
|---|---|---|---|
| maxItems | integer | 10 | Records to return. Free plan caps at 10; paid plans allow up to 1,000,000. |
| query | string | "tesla" | Term to search Wikidata entities. |
| language | string | "en" | Search language code (ISO 639). |
| entityType | string | "item" | `item` (Q) or `property` (P). |
| fetchClaims | boolean | true | Fetch full claims, sitelinks, and aliases per entity. |
Example: all entities matching "tesla".

```json
{ "maxItems": 50, "query": "tesla", "language": "en", "fetchClaims": true }
```

Example: Spanish-language Madrid entities.

```json
{ "maxItems": 100, "query": "madrid", "language": "es" }
```
📊 Output
Each record contains 22 fields. Download the dataset as CSV, Excel, JSON, or XML.
🧾 Schema
| Field | Type | Example |
|---|---|---|
| 🖼️ thumbnailUrl | string | null |
| 🆔 entityId | string | "Q478214" |
| 📛 label | string | "Tesla" |
| 📝 description | string | "American automotive, energy storage and solar power company" |
| 🏷️ aliases | array | ["Tesla Inc", "Tesla Motors"] |
| 🆔 instanceOfId | string | null |
| 📊 sitelinkCount | number | 106 |
| 📊 claimCount | number | 147 |
| 📊 claims | object | {"P31": ["Q43229"], "P159": ["Q485176"]} |
| 🌐 wikidataUrl | string | "https://www.wikidata.org/wiki/Q478214" |
| 📚 wikipediaEnUrl | string | "https://en.wikipedia.org/wiki/Tesla,_Inc." |
✨ Why choose this Actor
| Capability | |
|---|---|
| 📚 | 100M+ entities. People, places, brands, books, films, concepts in a single query. |
| 🌐 | Multilingual. 20+ languages with native-language labels and aliases. |
| 📊 | Full structured facts. All P-property claims for entity-resolution pipelines. |
| 🔗 | Sitelinks to Wikipedia. Direct links to every Wikipedia language edition. |
| ⚡ | Fast. 100 entities in under 30 seconds. |
📈 How it compares to alternatives
| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| ⭐ This Actor | $5 free credit | 100M+ entities | Live per run | query, language, type, claims | ⚡ 2 min |
| Wikidata SPARQL endpoint | Free | All | Live | SPARQL | 🐢 SPARQL knowledge |
| Manual Wikidata browse | Free | Manual | Live | Web filters | 🕒 Manual |
| DBpedia | Free | Subset | Stale | SPARQL | 🐢 Setup |
Pick this Actor when you want broad coverage, server-side filtering, and no pipeline maintenance.
🚀 How to use
- 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
- 🌐 Open the Actor. Go to the Wikidata Entity Search Scraper page on the Apify Store.
- 🎯 Set input. Pick your filters and `maxItems`.
- 🚀 Run it. Click Start and let the Actor collect your data.
- 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.
⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.
💼 Business use cases
🔌 Automating Wikidata Entity Search Scraper
Control the scraper programmatically for scheduled runs and pipeline integrations:
- 🟢 Node.js. Install the `apify-client` NPM package.
- 🐍 Python. Use the `apify-client` PyPI package.
- 📚 See the Apify API documentation for full details.
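A minimal Python sketch of a programmatic run. The Actor ID `parseforge/wikidata-entity-search-scraper` is an assumption (check the Store page for the exact ID), and the token is read from the `APIFY_TOKEN` environment variable:

```python
import os

def build_run_input(query, language="en", max_items=50, fetch_claims=True):
    """Assemble the Actor input matching the schema in the Input section."""
    return {
        "query": query,
        "language": language,
        "maxItems": max_items,
        "fetchClaims": fetch_claims,
    }

def run_search(query, **kwargs):
    # Imported here so build_run_input stays usable without the package installed.
    from apify_client import ApifyClient  # pip install apify-client
    client = ApifyClient(os.environ["APIFY_TOKEN"])
    # Start the Actor and wait for the run to finish.
    run = client.actor("parseforge/wikidata-entity-search-scraper").call(
        run_input=build_run_input(query, **kwargs)
    )
    # Collect the dataset items produced by the finished run.
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

For example, `run_search("tesla", max_items=100)` would return a list of 22-field records like those in the schema above.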
The Apify Schedules feature lets you trigger this Actor on any cron interval. Hourly, daily, or weekly refreshes keep downstream databases in sync automatically.
🌟 Beyond business use cases
Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.
🤖 Ask an AI assistant about this scraper
Open a ready-to-send prompt about this ParseForge Actor in the AI of your choice:
- 💬 ChatGPT
- 🧠 Claude
- 🔍 Perplexity
- 🅒 Copilot
❓ Frequently Asked Questions
🧩 How does it work?
Provide a query and language. The Actor queries Wikidata's wbsearchentities endpoint and optionally fetches full claims via wbgetentities.
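For reference, the underlying MediaWiki endpoint can also be queried directly. A sketch of the kind of request URL involved (the parameter names are standard `wbsearchentities` parameters; the `limit` default here is illustrative):

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

def search_url(query, language="en", entity_type="item", limit=10):
    # wbsearchentities matches entity labels and aliases,
    # mirroring the Actor's free-text search.
    params = {
        "action": "wbsearchentities",
        "search": query,
        "language": language,
        "type": entity_type,  # "item" for Q-entities, "property" for P-entities
        "format": "json",
        "limit": limit,
    }
    return f"{API}?{urlencode(params)}"
```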
📊 How many fields per record?
22 base fields plus a claims object with every P-property and a sitelinks map across Wikipedia languages.
🌐 Which languages are supported?
20+, including English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Japanese, Chinese, Korean, Arabic, Hindi, Turkish, Polish, Swedish, Finnish, Danish, Norwegian, Czech.
🆔 What's the difference between Q and P entities?
Q-entities are items (people, places, things). P-entities are properties (relations like 'instance of', 'located in', 'date of birth').
🔁 Can I schedule runs?
Yes. Use Apify Schedules to refresh entity caches or track entity creations on a topic.
⚖️ Is this data public?
Yes. Wikidata publishes under CC0; you can use the data freely without attribution.
💳 Do I need a paid Apify plan?
No. The free plan covers preview runs. A paid plan unlocks higher item counts and scheduling.
🆘 What if a run fails?
Apify retries transient errors. Partial datasets are preserved.
🖼️ Does it return entity images?
Yes when the entity has claim P18 set. The Actor extracts the Commons image URL automatically.
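If you only have the P18 file name rather than the extracted URL, Commons' `Special:FilePath` redirect resolves it. A small helper, assuming the claim value is a plain Commons file name:

```python
from urllib.parse import quote

def commons_file_url(filename):
    # Special:FilePath redirects to the actual file on upload.wikimedia.org.
    return (
        "https://commons.wikimedia.org/wiki/Special:FilePath/"
        + quote(filename.replace(" ", "_"))
    )
```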
📚 How do I use the claims field?
Each P-property maps to an array of values. Decode P-IDs via Wikidata's property page.
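A sketch of working with the `claims` object, using the shape shown in the schema above (each P-ID maps to an array of values; the property labels are from Wikidata, the record itself is illustrative):

```python
# Human-readable labels for a few common properties
# (look up any P-ID on its Wikidata property page).
PROPERTY_LABELS = {"P31": "instance of", "P159": "headquarters location"}

def decode_claims(claims):
    """Map P-IDs to readable labels, keeping unknown ones as-is."""
    return {PROPERTY_LABELS.get(pid, pid): values for pid, values in claims.items()}

record_claims = {"P31": ["Q43229"], "P159": ["Q485176"]}
decoded = decode_claims(record_claims)
# decoded == {"instance of": ["Q43229"], "headquarters location": ["Q485176"]}
```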
🔌 Integrate with any app
Wikidata Entity Search Scraper connects to any cloud service via Apify integrations:
- Make - Automate multi-step workflows
- Zapier - Connect with 5,000+ apps
- Slack - Get run notifications in your channels
- Airbyte - Pipe data into your warehouse
- GitHub - Trigger runs from commits and releases
- Google Drive - Export datasets straight to Sheets
You can also use webhooks to trigger downstream actions when a run finishes.
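Apify webhook payloads carry an `eventType` plus a `resource` object describing the run. A minimal sketch of pulling the dataset ID out of a successful-run notification (field names per Apify's webhook documentation; the sample payload is illustrative):

```python
import json

def dataset_id_from_webhook(body):
    """Return the dataset ID if the run succeeded, else None."""
    payload = json.loads(body)
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    return payload["resource"]["defaultDatasetId"]

sample = json.dumps({
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "abc123"},  # illustrative ID
})
```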
🔗 Recommended Actors
- 📚 Open Library Books - 30M+ books and editions
- 📖 Project Gutenberg Books - 75,000+ free public-domain books
- 🎨 Openverse Media - 800M+ openly licensed images and audio
- 📰 Hacker News Search - Every HN story since 2007
- 🌏 World Bank Indicators - Country economic indicators
💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.
🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.
⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the Wikimedia Foundation, Wikidata, Wikipedia, or any contributing editor. All trademarks mentioned are the property of their respective owners. Only publicly available open data is collected.