Pricing

from $10.00 / 1,000 result items

Wikidata Lexemes Scraper

Search and extract Wikidata Lexemes (L-namespace). Returns lemma, language QID, lexical category, senses, glosses, statements, and optional inflected forms for each lexeme. Distinct from Q-entities.

Developer

ParseForge

🧬 Wikidata Lexemes Scraper

🚀 Export structured lexicographic data in minutes. Pull lemmas, lexical categories, grammatical forms, and senses from the Wikidata Lexeme namespace across hundreds of languages. No API key, no registration, no SPARQL skills required.

The Wikidata Lexemes Scraper queries the L-namespace on wikidata.org and returns 15 fields per record, including lexeme ID, lemma, language QID, lexical category QID, every documented sense, every inflected form, statement metadata, and a link back to the canonical Wikidata Lexeme page. The L-namespace is a structured, machine-readable companion to Wiktionary and powers downstream dictionaries, linguistic research, and language documentation.

The dataset covers more than 1.4 million lexemes spanning over 1,500 languages, from major world languages down to documented endangered and historical languages. This Actor turns the namespace into downloadable CSV, Excel, JSON, or XML in under five minutes. Lemma search, language filter, and lexical category filter all run from the same input form.

🎯 Target Audience: Linguists, NLP engineers, dictionary builders, language-documentation teams, computational lexicographers, knowledge-graph engineers

💡 Primary Use Cases: Multilingual lemma dictionaries, training data for morphology models, inflection tables for language apps, structured glosses for translation pipelines

📋 What the Wikidata Lexemes Scraper does

Three lookup workflows in a single run:

🔍 Lemma search. Query the L-namespace by any lemma string in any UI language.
🌐 Language filter. Restrict results to a single language QID such as Q1860 English or Q150 French.
🔤 Lexical category filter. Limit to a part of speech via QID, for example Q1084 noun or Q24905 verb.
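
As a rough illustration of how the three filters combine, here is a minimal Python sketch that assembles a run input. The key names `search`, `languageQid`, and `lexicalCategoryQid` are assumptions made for this example (only `maxItems` is named on this page), so check the Actor's input form for the real keys.

```python
import re

# QIDs are "Q" followed by digits, e.g. Q1860 (English) or Q1084 (noun).
QID_PATTERN = re.compile(r"Q[1-9]\d*")


def build_input(search, language_qid=None, category_qid=None, max_items=100):
    """Assemble a run input dict. Key names are hypothetical, except maxItems."""
    for qid in (language_qid, category_qid):
        if qid is not None and not QID_PATTERN.fullmatch(qid):
            raise ValueError(f"not a valid QID: {qid!r}")
    if max_items < 1:
        raise ValueError("max_items must be at least 1")

    run_input = {"search": search, "maxItems": max_items}
    # Filters are optional; leave them out rather than sending empty values.
    if language_qid:
        run_input["languageQid"] = language_qid
    if category_qid:
        run_input["lexicalCategoryQid"] = category_qid
    return run_input


# English nouns matching "water", capped at 50 items.
example_input = build_input("water", "Q1860", "Q1084", max_items=50)
```

Validating QIDs before a run catches a common slip, such as pasting a lexeme ID (L...) where a language QID (Q...) is expected.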

Each record includes the lexeme ID, lemma, lemma language code, language QID, lexical category QID, a short description, every documented sense with multilingual glosses, every inflected form with grammatical feature QIDs, statement count and properties, last-modified timestamp, the canonical Lexeme URL, and the scrape timestamp.

💡 Why it matters: structured lexicographic data powers morphological analyzers, inflection tables, multilingual search, and machine translation. Building your own pipeline means writing SPARQL queries, handling pagination across the L-namespace, and joining sense and form data by hand. This Actor skips all of that and pulls current data from Wikidata on every run.
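
For a sense of what the do-it-yourself route involves, the sketch below builds a single paged lemma search against the public Wikidata `wbsearchentities` API, restricted to lexemes. It only illustrates the first step of a hand-rolled pipeline and is not necessarily how this Actor works internally. The function name and defaults are made up for the example.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"


def lexeme_search_url(lemma, ui_language="en", limit=50, offset=0):
    """Build one page of a wbsearchentities request restricted to lexemes."""
    params = {
        "action": "wbsearchentities",
        "search": lemma,
        "language": ui_language,
        "type": "lexeme",
        "limit": limit,
        "continue": offset,  # paging offset for the next batch of results
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


first_page = lexeme_search_url("water")
second_page = lexeme_search_url("water", offset=50)
```

Each page still returns only search hits. Senses and forms would need further requests and a join step, which is the work the Actor takes off your hands.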

📊 Data fields

Each record includes 15 fields: description, formCount, forms, languageQid, lastModified, lemma, lemmaLanguage, lexemeId, lexemeUrl, lexicalCategoryQid, scrapedAt, senseCount, senses, statementCount, statementProperties. These field names come straight from the Actor's dataset schema, so what you see here is what lands in your dataset.
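
If you post-process the dataset in code, a quick completeness check can catch schema drift early. The sketch below is one possible way to do it in Python. The field names are copied from the list above, while the helper name and the sample record are invented for illustration.

```python
EXPECTED_FIELDS = {
    "description", "formCount", "forms", "languageQid", "lastModified",
    "lemma", "lemmaLanguage", "lexemeId", "lexemeUrl", "lexicalCategoryQid",
    "scrapedAt", "senseCount", "senses", "statementCount", "statementProperties",
}


def missing_fields(record):
    """Return the expected field names absent from a record, sorted."""
    return sorted(EXPECTED_FIELDS - set(record))


# Illustrative record only: the values are placeholders, not real scraper output.
sample = {name: None for name in EXPECTED_FIELDS}
sample.update(lexemeId="L1", lemma="example", languageQid="Q1860")
del sample["scrapedAt"]  # simulate a record that lost one field

print(missing_fields(sample))  # ['scrapedAt']
```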

🚀 How to use

📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
🌐 Open the Actor. Go to the Wikidata Lexemes Scraper page on the Apify Store.
🎯 Set input. Enter a lemma search, pick a language QID and lexical category QID, and set maxItems.
🚀 Run it. Click Start and let the Actor collect your data.
📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to downloaded dataset: 3-5 minutes. No coding required.
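
Once you have a JSON download, a few lines of Python can turn the nested `forms` field into flat inflection rows. This is a sketch under assumptions: the shape of each form (the keys `representation` and `grammaticalFeatures`) is a guess for illustration and is not confirmed by this page, so adjust the keys to match your actual export.

```python
import json


def inflection_rows(records):
    """Flatten each record's forms into (lexemeId, lemma, form, features) rows."""
    rows = []
    for record in records:
        for form in record.get("forms") or []:
            rows.append((
                record["lexemeId"],
                record["lemma"],
                form.get("representation"),  # assumed key name
                tuple(form.get("grammaticalFeatures") or []),  # assumed key name
            ))
    return rows


# Stand-in for a downloaded JSON export; IDs and feature QIDs are placeholders.
exported = json.loads("""
[
  {"lexemeId": "L1", "lemma": "example", "forms": [
    {"representation": "example", "grammaticalFeatures": ["Q110786"]},
    {"representation": "examples", "grammaticalFeatures": ["Q146786"]}
  ]},
  {"lexemeId": "L2", "lemma": "sample", "forms": []}
]
""")

for row in inflection_rows(exported):
    print(row)
```

With a real export you would replace the inline string with `json.load` on the downloaded file. Records without forms simply contribute no rows.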

🔗 Recommended Actors

📖 Wiktionary Definitions Scraper - Multilingual dictionary entries with definitions and examples
📚 Wikipedia Scraper - Encyclopedic articles and references
📰 ArXiv Scraper - Scientific preprint metadata
📈 Indexmundi Scraper - Global demographic and economic indicators
🗺️ Nominatim OSM Scraper - Geocode addresses via OpenStreetMap

💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.

⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by Wikidata, the Wikimedia Foundation, or any of its contributors. All trademarks mentioned are the property of their respective owners. Only publicly available open lexicographic data is collected.

🆘 Need Help?

If you hit a bug, have questions about setup, or need a scraper we haven't built yet, open our contact form or write to parseforge@protonmail.com. We also take on paid custom data projects.

For faster answers, join our Discord. It's the best place to get support and suggest new actors.
