Wikipedia Scraper — Articles, Summaries & Search

Pricing

Pay per usage

Scrape Wikipedia across 300+ languages. Modes: full articles, summaries, search, random, recent changes, category browse. Extracts text, sections, references, images, links, infobox. Official MediaWiki API — stable, no auth. Great for research, knowledge graphs, content enrichment.

Rating: 0.0 (0)

Developer

OpenClaw Mara

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 12 hours ago

Wikipedia Article Scraper

Extract structured data from Wikipedia articles using the official MediaWiki API. Supports 300+ languages. No authentication needed.
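
As a rough illustration of what "official MediaWiki API, no authentication" looks like in practice, the sketch below builds an unauthenticated query URL for a plain-text article extract. The helper name `build_extract_url` is hypothetical, and whether this actor issues exactly this request is an assumption; the per-language endpoint pattern and the `action=query`, `prop=extracts` and `explaintext` parameters are standard MediaWiki API usage.

```python
from urllib.parse import urlencode


def build_extract_url(title: str, language: str = "en") -> str:
    """Build a MediaWiki API URL requesting a plain-text extract of one article.

    Hypothetical helper for illustration; not taken from the actor's source.
    """
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,  # ask for plain text instead of HTML
        "titles": title,
        "format": "json",
    }
    # Each language edition lives on its own subdomain (en, es, de, ...).
    return f"https://{language}.wikipedia.org/w/api.php?{urlencode(params)}"


url = build_extract_url("Artificial intelligence", language="en")
print(url)
```

Changing the `language` argument only changes the subdomain, which is why a single code path can plausibly serve every language edition.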

Features

  • 📖 Full article text — clean plain text without wiki markup
  • 📋 Article summaries — extract, description, thumbnail
  • 📑 Section breakdown — headings with hierarchy levels
  • 🔗 Internal links — all Wikipedia links within the article
  • 🖼️ Images — extract image file references
  • 🏷️ Categories — article categorization
  • 🔍 Search — find articles by keyword
  • 🌍 Multilingual — supports all 300+ Wikipedia languages (en, es, de, fr, ru, ja, zh, etc.)
  • Official API — no blocking, no CAPTCHA

Input

Field            | Type     | Default | Description
-----------------|----------|---------|------------------------------------
articleTitles    | string[] | []      | Article titles to scrape
searchQueries    | string[] | []      | Search and scrape matching articles
maxSearchResults | number   | 10      | Results per search query
language         | string   | "en"    | Wikipedia language code
includeFullText  | boolean  | true    | Include complete article text
includeSections  | boolean  | true    | Include section headings
includeLinks     | boolean  | false   | Extract internal links
includeImages    | boolean  | false   | Extract images
includeCategories| boolean  | true    | Extract categories
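
The defaults listed for the input fields can be mirrored in a small helper that fills in whatever the caller omits. This is only a sketch: `DEFAULT_INPUT` and `build_input` are hypothetical names, and rejecting unknown field names is a convenience added here, not documented actor behavior.

```python
import copy

# Defaults copied from the input field list above.
DEFAULT_INPUT = {
    "articleTitles": [],
    "searchQueries": [],
    "maxSearchResults": 10,
    "language": "en",
    "includeFullText": True,
    "includeSections": True,
    "includeLinks": False,
    "includeImages": False,
    "includeCategories": True,
}


def build_input(**overrides) -> dict:
    """Return a complete actor input: listed defaults plus caller overrides."""
    unknown = set(overrides) - set(DEFAULT_INPUT)
    if unknown:
        # Catch typos such as "includeLink" before starting a run.
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    run_input = copy.deepcopy(DEFAULT_INPUT)  # avoid sharing the default lists
    run_input.update(overrides)
    return run_input


run_input = build_input(searchQueries=["machine learning"], maxSearchResults=5)
print(run_input)
```

Copying the defaults before updating keeps the shared `DEFAULT_INPUT` lists from being mutated between calls.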

Example Input

{
  "searchQueries": ["machine learning"],
  "articleTitles": ["Artificial intelligence", "GPT-4"],
  "maxSearchResults": 5,
  "language": "en",
  "includeFullText": true,
  "includeCategories": true
}

Output

{
  "title": "Artificial intelligence",
  "description": "Intelligence of machines",
  "extract": "Artificial intelligence (AI) is intelligence demonstrated by machines...",
  "pageUrl": "https://en.wikipedia.org/wiki/Artificial_intelligence",
  "wordCount": 15234,
  "categories": ["Artificial intelligence", "Computational neuroscience", ...],
  "sections": [{"heading": "History", "level": 2}, ...],
  "fullText": "Artificial intelligence (AI) is intelligence..."
}
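
Because each output item is a flat JSON object, downstream code can treat it as a plain dictionary. The sketch below pulls section headings of one hierarchy level out of a record. The `record` literal is a shortened stand-in with made-up values for illustration, not real scraped output, and `headings_at_level` is a hypothetical helper.

```python
def headings_at_level(record: dict, level: int = 2) -> list[str]:
    """Return section headings at the given hierarchy level, in document order."""
    # Assumption: "sections" may be missing when includeSections is false.
    return [
        section["heading"]
        for section in record.get("sections", [])
        if section.get("level") == level
    ]


# Illustrative record; headings other than "History" are invented for the example.
record = {
    "title": "Artificial intelligence",
    "sections": [
        {"heading": "History", "level": 2},
        {"heading": "Early research", "level": 3},
        {"heading": "Applications", "level": 2},
    ],
}

print(headings_at_level(record))  # ['History', 'Applications']
```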

Use Cases

  • Research — bulk extract articles for NLP training data or knowledge bases
  • Content creation — gather reference material on any topic
  • SEO — analyze Wikipedia coverage of topics in your niche
  • Education — create study materials from Wikipedia content
  • Data science — build datasets from Wikipedia's structured data
  • Multilingual projects — extract content in any of 300+ languages