Wikidata Entity Search Scraper

Pricing: from $14.00 / 1,000 result items · Developer: ParseForge · Maintained by Community

🌐 Wikidata Entity Search Scraper

🚀 Search Wikidata's open knowledge graph of 100M+ entities by name.

🕒 Last updated: 2026-05-06 · 📊 22 fields per record · 100M+ entities · people, places, brands, books, films, concepts · all claims, sitelinks, multilingual labels

The Wikidata Entity Search Scraper searches Wikidata's open knowledge graph of 100M+ entities by name. Output includes the canonical Q-ID, label, description, aliases, all claims (P-properties), sitelinks to every Wikipedia language edition, and structured facts.

Wikidata is the structured-data backbone of Wikipedia and one of the largest open knowledge graphs in the world. Filters run server-side, so a single run can resolve every entity matching a name, fetch full claim trees, or pull entities in non-English languages.

🎯 Target audience: ML pipelines, knowledge-graph engineers, journalists, fact-checkers, content recommendation engines, search developers.
💡 Primary use cases: entity resolution, knowledge-graph augmentation, fact-checking, content enrichment, multilingual search, ML training datasets.

📋 What the Wikidata Entity Search Scraper does

Five filtering workflows in a single run:

  • 🔍 Free-text search. Match entity labels and aliases.
  • 🌐 Multilingual. Search in 20+ languages (en, es, fr, de, it, ja, zh, ko, ar, hi, pt, nl, ru).
  • 🆔 Item or property. Search Q-entities (items) or P-entities (properties).
  • 📊 Full claims fetch. Optional: pull every statement, sitelink, and structured fact per entity.
  • 🏷️ Image extraction. Auto-extracts the entity's primary image from claim P18.

💡 Why it matters: clean, server-side filtering removes the parser-and-pagination work from your team and keeps your dataset fresh on every run.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.


⚙️ Input

| Input | Type | Default | Behavior |
|---|---|---|---|
| maxItems | integer | 10 | Records to return. Free plan caps at 10; paid plans go up to 1,000,000. |
| query | string | "tesla" | Term to search Wikidata entities for. |
| language | string | "en" | Search language code (ISO 639). |
| entityType | string | "item" | `item` (Q) or `property` (P). |
| fetchClaims | boolean | true | Fetch full claims, sitelinks, and aliases per entity. |

Example: all entities matching Tesla.

{
  "maxItems": 50,
  "query": "tesla",
  "language": "en",
  "fetchClaims": true
}

Example: Spanish-language Madrid entities.

{
  "maxItems": 100,
  "query": "madrid",
  "language": "es"
}

📊 Output

Each record contains 22 fields. Download the dataset as CSV, Excel, JSON, or XML.

🧾 Schema

| Field | Type | Example |
|---|---|---|
| 🖼️ thumbnailUrl | string | null |
| 🆔 entityId | string | "Q478214" |
| 📛 label | string | "Tesla" |
| 📝 description | string | "American automotive, energy storage and solar power company" |
| 🏷️ aliases | array | ["Tesla Inc", "Tesla Motors"] |
| 🆔 instanceOfId | string | null |
| 📊 sitelinkCount | number | 106 |
| 📊 claimCount | number | 147 |
| 📊 claims | object | {"P31": ["Q43229"], "P159": ["Q485176"]} |
| 🌐 wikidataUrl | string | "https://www.wikidata.org/wiki/Q478214" |
| 📚 wikipediaEnUrl | string | "https://en.wikipedia.org/wiki/Tesla,_Inc." |


✨ Why choose this Actor

  • 📚 100M+ entities. People, places, brands, books, films, concepts in a single query.
  • 🌐 Multilingual. 20+ languages with native-language labels and aliases.
  • 📊 Full structured facts. All P-property claims for entity-resolution pipelines.
  • 🔗 Sitelinks to Wikipedia. Direct links to every Wikipedia language edition.
  • ⚡ Fast. 100 entities in under 30 seconds.

📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| ⭐ This Actor | $5 free credit | 100M+ entities | Live per run | query, language, type, claims | ⚡ 2 min |
| Wikidata SPARQL endpoint | Free | All | Live | SPARQL | 🐢 Requires SPARQL knowledge |
| Manual Wikidata browsing | Free | Manual | Live | Web filters | 🕒 Manual |
| DBpedia | Free | Subset | Stale | SPARQL | 🐢 Setup |

Pick this Actor when you want broad coverage, server-side filtering, and no pipeline maintenance.


🚀 How to use

  1. 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. 🌐 Open the Actor. Go to the Wikidata Entity Search Scraper page on the Apify Store.
  3. 🎯 Set input. Pick your filters and maxItems.
  4. 🚀 Run it. Click Start and let the Actor collect your data.
  5. 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.


💼 Business use cases

🤖 Knowledge Graphs

  • Entity resolution and disambiguation
  • Augment internal KGs with Wikidata facts
  • Build cross-language entity links
  • Train named-entity-recognition models

🔍 Search & Discovery

  • Power semantic search with structured facts
  • Build autocomplete with multilingual labels
  • Resolve ambiguous entity names
  • Cross-language search experiments

📰 Journalism & Fact-Checking

  • Verify entities mentioned in stories
  • Pull biographical and corporate facts
  • Cross-reference claims via P-properties
  • Map relationship networks

🤖 ML & NLP

  • Train entity-linking models
  • Build retrieval-augmented agents
  • Generate training datasets for NER
  • Multilingual KB embedding

🔌 Automating Wikidata Entity Search Scraper

Control the scraper programmatically for scheduled runs and pipeline integrations:

  • 🟢 Node.js. Install the apify-client NPM package.
  • 🐍 Python. Use the apify-client PyPI package.
  • 📚 See the Apify API documentation for full details.
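The steps above can be sketched with the official apify-client Python package. The field names follow this page's Input table; the Actor ID string below is an assumption (copy the exact one from the Actor's API tab), and the helper names are ours.

```python
import os

# Hypothetical Actor ID -- check the Actor's API tab for the real one.
ACTOR_ID = "parseforge/wikidata-entity-search-scraper"

def build_run_input(query, language="en", max_items=50, fetch_claims=True):
    """Assemble the run input document described in the Input table above."""
    return {
        "query": query,
        "language": language,
        "maxItems": max_items,
        "fetchClaims": fetch_claims,
    }

def run_scraper(query, token=None):
    """Start a run and yield its dataset records one by one."""
    from apify_client import ApifyClient  # pip install apify-client
    client = ApifyClient(token or os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=build_run_input(query))
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()
```

For scheduled pipelines, the same `run_scraper` call can sit behind an Apify Schedule or any external cron job.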

The Apify Schedules feature lets you trigger this Actor on any cron interval. Hourly, daily, or weekly refreshes keep downstream databases in sync automatically.


🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Knowledge-graph research
  • Reproducible KB snapshots
  • Cross-cultural KB studies
  • Course material on Wikidata

🎨 Personal and creative

  • Personal knowledge dashboards
  • Side projects with structured facts
  • Newsletter content
  • Hobbyist KB exploration

🤝 Non-profit and civic

  • Open-knowledge contributions
  • Civic literacy datasets
  • Cultural heritage cataloging
  • Multilingual literacy projects

🧪 Experimentation

  • Train entity-linking ML models
  • Prototype KB-aware chat agents
  • Build entity-resolution pipelines
  • Test cross-language search


❓ Frequently Asked Questions

🧩 How does it work?

Provide a query and language. The Actor queries Wikidata's wbsearchentities endpoint and optionally fetches full claims via wbgetentities.
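As a rough illustration of that flow, here is how a wbsearchentities request can be assembled. The parameter names are the MediaWiki API's own; the helper itself is a sketch, not the Actor's internal code.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def search_url(query, language="en", entity_type="item", limit=10):
    """Build a wbsearchentities request URL for a name lookup."""
    params = {
        "action": "wbsearchentities",
        "search": query,
        "language": language,
        "type": entity_type,   # "item" for Q-entities, "property" for P-entities
        "limit": limit,
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"
```

Fetching that URL returns candidate entities whose Q-IDs can then be passed to wbgetentities for full claims.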

📊 How many fields per record?

22 base fields plus a claims object with every P-property and a sitelinks map across Wikipedia languages.

🌐 Which languages are supported?

20+, including English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Japanese, Chinese, Korean, Arabic, Hindi, Turkish, Polish, Swedish, Finnish, Danish, Norwegian, Czech.

🆔 What's the difference between Q and P entities?

Q-entities are items (people, places, things). P-entities are properties (relations like 'instance of', 'located in', 'date of birth').

🔁 Can I schedule runs?

Yes. Use Apify Schedules to refresh entity caches or track entity creations on a topic.

⚖️ Is this data public?

Yes. Wikidata publishes under CC0; you can use the data freely without attribution.

💳 Do I need a paid Apify plan?

No. The free plan covers preview runs. A paid plan unlocks higher item counts and scheduling.

🆘 What if a run fails?

Apify retries transient errors. Partial datasets are preserved.

🖼️ Does it return entity images?

Yes when the entity has claim P18 set. The Actor extracts the Commons image URL automatically.
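For reference, a P18 value is a Wikimedia Commons file name, and the standard Special:FilePath redirect turns it into a direct image URL. This helper sketches that convention and is not the Actor's internal code.

```python
from urllib.parse import quote

def commons_file_url(p18_filename):
    """Turn a P18 claim value (a Commons file name) into a direct image URL.

    Special:FilePath redirects to the current version of the file on
    Wikimedia Commons, so the link stays valid if the file is re-uploaded.
    """
    return "https://commons.wikimedia.org/wiki/Special:FilePath/" + quote(p18_filename)
```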

📚 How do I use the claims field?

Each P-property maps to an array of values. Decode P-IDs via Wikidata's property page.
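A minimal sketch of that lookup, using a trimmed claims object shaped like the schema example above (the helper name is ours):

```python
# A trimmed claims object, shaped like the schema example in the Output section.
claims = {
    "P31": ["Q43229"],    # P31 = "instance of"; Q43229 = "organization"
    "P159": ["Q485176"],  # P159 = "headquarters location"
}

def values_for(claims, prop):
    """Return the value list for a P-property, or [] when the claim is absent."""
    return claims.get(prop, [])
```

P-IDs resolve to human-readable names on their Wikidata property pages (e.g. https://www.wikidata.org/wiki/Property:P31).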


🔌 Integrate with any app

Wikidata Entity Search Scraper connects to any cloud service via Apify integrations:

  • Make - Automate multi-step workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get run notifications in your channels
  • Airbyte - Pipe data into your warehouse
  • GitHub - Trigger runs from commits and releases
  • Google Drive - Export datasets straight to Sheets

You can also use webhooks to trigger downstream actions when a run finishes.


💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.


⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the Wikimedia Foundation, Wikidata, Wikipedia, or any contributing editor. All trademarks mentioned are the property of their respective owners. Only publicly available open data is collected.