Wikipedia Page Summaries Scraper

Pull Wikipedia article summaries via REST API. Returns title, description, extract (plain + HTML), thumbnail, language, page ID, content URLs (desktop + mobile + edit), coordinates, page type, and timestamps. Look up specific titles or get search results.

Pricing: from $8.00 / 1,000 result items
Rating: 0.0 (0 reviews)
Developer: ParseForge (Maintained by Community)
Actor stats: 0 bookmarked · 2 total users · 1 monthly active user · last modified 3 days ago

📚 Wikipedia Article Summary Scraper

🚀 Pull Wikipedia article summaries with thumbnail, extract, coordinates, Wikidata link, revision ID, and language. Lookup or search modes.

🕒 Last updated: 2026-05-08 · 📊 25 fields per record · 60M+ Wikipedia pages · 300+ languages · summary extract, thumbnail, coords, Wikidata, revision · lookup or search

The Wikipedia Article Summary Scraper pulls structured summaries from Wikipedia's REST API. Output includes thumbnail and original-image URLs (with widths and heights), page ID, title and display title, normalized + canonical title, description and description source, summary extract (plain text + HTML), page type, namespace, Wikibase item ID, language code and direction, last-modified timestamp, revision ID, geographic coordinates, and desktop / mobile / edit / revisions URLs.

Two modes in one Actor: lookup by title (one per line) and search (via Wikipedia's opensearch endpoint). The Actor covers Wikipedia in any of its 300+ language editions; set the language input to es, fr, de, etc.
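The two modes correspond to two public Wikipedia endpoints. A hedged, stdlib-only sketch of the URLs involved (illustrative helpers, not the Actor's actual source code):

```python
from urllib.parse import quote

def summary_url(title: str, language: str = "en") -> str:
    """Lookup mode: Wikipedia's REST summary endpoint for one page."""
    # Canonical Wikipedia titles use underscores instead of spaces.
    return (f"https://{language}.wikipedia.org/api/rest_v1/page/summary/"
            f"{quote(title.replace(' ', '_'))}")

def opensearch_url(query: str, language: str = "en", limit: int = 10) -> str:
    """Search mode: the classic opensearch action with ranked matches."""
    return (f"https://{language}.wikipedia.org/w/api.php?action=opensearch"
            f"&search={quote(query)}&limit={limit}&format=json")
```

Both endpoints are open and need no API key, which is what makes fresh data on every run possible.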

🎯 Target audience: knowledge-graph builders, content marketers, ML researchers, journalists, encyclopedia apps, education platforms.
💡 Primary use cases: knowledge-graph extraction, encyclopedic-content displays, summary embeddings, fact-card UIs, education content.

📋 What the Wikipedia Article Summary Scraper does

Five capabilities in a single Actor:

  • 🔍 Lookup mode. One title per line, returns rich summary per page.
  • 🔍 Search mode. Wikipedia's opensearch with ranked matches.
  • 🌐 300+ languages. Switch language with a single input.
  • 🗺️ Coordinates included. Lat / lng for places, when the page is geo-tagged.
  • 🔗 Wikidata link. Direct Wikibase item ID for cross-language joins.

💡 Why it matters: clean, server-side filtering and fresh data on every run.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.


⚙️ Input

  • maxItems (integer, default 10). Records to return. Free plan caps at 10; paid plans allow up to 1,000,000.
  • mode (string, default "lookup"). Either lookup or search.
  • titles (string, default ""). Newline-separated titles (lookup mode).
  • query (string, default ""). Search term (search mode).
  • language (string, default "en"). Wikipedia language code (e.g. `en`, `es`, `fr`, `de`, `ja`).

Example: look up famous scientists.

{
  "maxItems": 50,
  "mode": "lookup",
  "titles": "Albert Einstein\nMarie Curie\nIsaac Newton\nCharles Darwin\nGalileo Galilei",
  "language": "en"
}

Example: search topic in Spanish.

{
  "maxItems": 20,
  "mode": "search",
  "query": "fotografía",
  "language": "es"
}

📊 Output

Each record contains 25 fields; the key fields are listed below. Download as CSV, Excel, JSON, or XML.

🧾 Schema

Field · Type · Example

  🖼️ thumbnailUrl · string · "https://upload.wikimedia.org/.../thumb.jpg"
  🆔 pageId · number · 736
  📛 title · string · "Albert Einstein"
  📛 displayTitle · string · "Albert Einstein"
  📛 normalizedTitle · string · "Albert Einstein"
  📛 canonicalTitle · string · "Albert_Einstein"
  📜 description · string · "German-born theoretical physicist (1879-1955)"
  📝 extract · string · "Albert Einstein was a German-born theoretical physicist..."
  📝 extractHtml · string · "<p><b>Albert Einstein</b> was a German-born theoretical physicist..."
  🏷️ type · string · "standard"
  🌐 language · string · "en"
  🔤 languageCode · string · "en"
  🔗 wikibaseItem · string · "Q937"
  🗺️ coordinatesLat · number · 52.5
  🗺️ coordinatesLng · number · 13.4
  📅 timestamp · string · "2026-04-29T14:32:11Z"
  🆔 revisionId · number · 1238762345
  🌐 desktopUrl · string · "https://en.wikipedia.org/wiki/Albert_Einstein"
  📱 mobileUrl · string · "https://en.m.wikipedia.org/wiki/Albert_Einstein"
  ✏️ editDesktopUrl · string · "https://en.wikipedia.org/wiki/Albert_Einstein?action=edit"
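For readers curious how the flat schema relates to the raw data, here is a hedged sketch of deriving it from a Wikipedia REST /page/summary response. The raw keys are Wikipedia's; the mapping itself is an assumption for illustration, not the Actor's actual source:

```python
def flatten_summary(raw: dict) -> dict:
    """Map a raw /page/summary response onto the flat record schema above."""
    thumb = raw.get("thumbnail") or {}
    coords = raw.get("coordinates") or {}
    urls = raw.get("content_urls") or {}
    titles = raw.get("titles") or {}
    return {
        "thumbnailUrl": thumb.get("source"),
        "pageId": raw.get("pageid"),
        "title": raw.get("title"),
        "displayTitle": titles.get("display"),
        "normalizedTitle": titles.get("normalized"),
        "canonicalTitle": titles.get("canonical"),
        "description": raw.get("description"),
        "extract": raw.get("extract"),
        "extractHtml": raw.get("extract_html"),
        "type": raw.get("type"),
        "language": raw.get("lang"),
        "wikibaseItem": raw.get("wikibase_item"),
        "coordinatesLat": coords.get("lat"),
        "coordinatesLng": coords.get("lon"),
        "timestamp": raw.get("timestamp"),
        "revisionId": raw.get("revision"),
        "desktopUrl": (urls.get("desktop") or {}).get("page"),
        "mobileUrl": (urls.get("mobile") or {}).get("page"),
    }
```

Fields absent from a given page (thumbnail, coordinates) come through as null rather than raising errors.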

✨ Why choose this Actor

  • 🔗 Wikidata link. Wikibase item ID lets you join across languages and other knowledge graphs.
  • 🌍 300+ languages. The same Actor pulls summaries from any Wikipedia language edition.
  • 🗺️ Geo coordinates. Lat / lng exposed when the page is geo-tagged.
  • 📜 Plain + HTML extracts. Both formats in every record.
  • 🆓 No auth. Wikipedia's REST API is open; no key needed.

📈 How it compares to alternatives

Approach (cost · coverage · refresh · filters · setup):

  • ⭐ This Actor: $5 free credit · 60M+ pages · live per run · 2 modes · ⚡ ~2 min
  • Wikipedia REST direct: free · same coverage · live · DIY filtering · 🐢 requires code
  • DBpedia / Wikidata: free · triplestore · live · SPARQL queries · 🐢 hours
  • Manual scraping: free · all pages · live · DIY filtering · 🐢 days

🚀 How to use

  1. 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
  2. 🌐 Open the Actor. Find the Wikipedia Article Summary Scraper on the Apify Store.
  3. 🎯 Set input. Pick filters and maxItems.
  4. 🚀 Run it. Click Start.
  5. 📥 Download. Grab results in the Dataset tab as CSV, Excel, JSON, or XML.

⏱️ Total time from signup to dataset: 3-5 minutes. No coding required.


💼 Business use cases

📚 Knowledge + Education

  • Encyclopedia-app summaries
  • Quiz / trivia content
  • Reference card UIs
  • Reading-comprehension exercises
  • Summary embeddings
  • Fine-tune QA models
  • Knowledge-graph seeding
  • Multilingual training data

📰 Journalism + Content

  • Background fact cards
  • Multi-language coverage
  • Topic-page generation
  • Subject-A vs subject-B comparisons

🌐 Localization

  • Multi-language content seeding
  • Cross-language entity matching
  • Translated summaries
  • Localized SEO content

🔌 Automating Wikipedia Article Summary Scraper

Control the scraper programmatically:

  • 🟢 Node.js. Install the apify-client NPM package.
  • 🐍 Python. Use the apify-client PyPI package.
  • 📚 See the Apify API documentation for full details.
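A stdlib-only sketch of driving the Actor over Apify's HTTP API (the Actor ID below is a placeholder; copy the real one from the Actor's API tab; run-sync-get-dataset-items is Apify's standard synchronous run endpoint):

```python
import json
import os
import urllib.request

# Placeholder Actor ID -- replace with the real one from the Actor's API tab.
ACTOR_ID = "parseforge~wikipedia-page-summaries-scraper"

def run_sync_url(actor_id: str, token: str) -> str:
    # run-sync-get-dataset-items starts a run and returns its dataset items.
    return (f"https://api.apify.com/v2/acts/{actor_id}"
            f"/run-sync-get-dataset-items?token={token}")

def fetch_summaries(titles, language="en", token=None):
    """Run lookup mode for a list of titles and return the dataset items."""
    token = token or os.environ["APIFY_TOKEN"]
    payload = json.dumps({
        "maxItems": len(titles),
        "mode": "lookup",
        "titles": "\n".join(titles),
        "language": language,
    }).encode()
    req = urllib.request.Request(
        run_sync_url(ACTOR_ID, token), data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (needs a real APIFY_TOKEN in the environment):
#   items = fetch_summaries(["Albert Einstein", "Marie Curie"])
```

The official apify-client packages wrap this same API with retries and pagination, so prefer them for production use.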

The Apify Schedules feature lets you trigger this Actor on any cron interval.


🌟 Beyond business use cases

Data like this powers more than commercial workflows.

🎓 Research and academia

  • Reproducible Wikipedia corpora
  • Cross-language entity studies
  • Educational fact-card content
  • Knowledge-graph research

🎨 Personal and creative

  • Personal study aids
  • Trivia apps
  • Side projects with summary data
  • Reading-list backbones

🤝 Non-profit and civic

  • Free encyclopedic content
  • Library reference tools
  • Heritage knowledge preservation
  • Educational outreach

🧪 Experimentation

  • Train summarization models
  • Prototype QA systems
  • Build entity-disambiguation tools
  • Test multilingual pipelines


❓ Frequently Asked Questions

🧩 How does it work?

Pick lookup or search mode and supply titles or a query. The Actor calls the Wikipedia REST API and returns one record per page.

📊 How many fields per record?

25, including thumbnail, page ID, title fields, description, extract (plain + HTML), type, language metadata, Wikibase item, coordinates, timestamps, and URLs.

🌍 Which languages are supported?

Any Wikipedia language. Set the language input to the standard code (en, es, fr, de, ja, zh, ar, etc).

🗺️ Are geo-coordinates always present?

Only on geo-tagged pages (places, landmarks). On other pages the coordinate fields are null.

🔗 What's the wikibaseItem field?

The Wikidata Q-number for the entity. Lets you join across languages and other knowledge bases.
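Because the Q-number is language-independent, a cross-language join reduces to a dictionary merge. A minimal sketch with illustrative records (shapes follow the output schema; values are examples, not scraped data):

```python
# Records from two hypothetical runs of the Actor, one per language.
en = [{"wikibaseItem": "Q937", "title": "Albert Einstein", "language": "en"},
      {"wikibaseItem": "Q7186", "title": "Marie Curie", "language": "en"}]
es = [{"wikibaseItem": "Q937", "title": "Albert Einstein", "language": "es"},
      {"wikibaseItem": "Q7186", "title": "Marie Curie", "language": "es"}]

def join_on_wikidata(a, b):
    """Pair records from two language runs by their shared Q-number."""
    by_q = {rec["wikibaseItem"]: rec for rec in b}
    return {rec["wikibaseItem"]: (rec, by_q[rec["wikibaseItem"]])
            for rec in a if rec["wikibaseItem"] in by_q}

pairs = join_on_wikidata(en, es)
```

The same key also joins against Wikidata dumps or any other knowledge base that carries Q-numbers.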

📜 Is the full article body returned?

No, only the summary extract (lead section). Use Wikipedia's full-content API for the entire body.

🆓 Do I need an API key?

No. Wikipedia REST is open.

🔁 Can I schedule runs?

Yes. Schedule daily to refresh summaries.

⚖️ Is this data free to use?

Yes. Wikipedia content is licensed CC-BY-SA. Attribution required for redistribution.

💳 Do I need a paid Apify plan?

No. The free plan covers preview runs (10 records).


🔌 Integrate with any app

Wikipedia Article Summary Scraper connects to any cloud service via Apify integrations:

  • Make - Automate multi-step workflows
  • Zapier - Connect with 5,000+ apps
  • Slack - Get run notifications
  • Airbyte - Pipe data into your warehouse
  • GitHub - Trigger runs from commits
  • Google Drive - Export datasets to Sheets

💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.


🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.


⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the Wikimedia Foundation, Wikipedia editors, or any cited reference work. All trademarks mentioned are the property of their respective owners. Only publicly available open data is collected.