Pricing

from $1.00 / 1,000 pages returned

Wikipedia Scraper

Scrapes Wikipedia via the public MediaWiki API: search by keyword for matching pages with snippets and word counts, or fetch exact titles to get the plain-text extract, thumbnail, categories, and URL. Titles are batched 50 at a time, with an option to return the full article text.

Rating

5.0

(2)

Developer

Dami's Studio

Last modified: a month ago

Two modes

1. Search: set `searchQuery`. Returns matching articles with `title`, `pageid`, `url`, a plain-text `snippet` (the API's HTML is stripped for you), `wordcount`, `size`, and `timestamp`. The actor paginates automatically (50 per request) up to `maxItems`.

2. Page data: set `pageTitles` (a list of exact article titles). Returns `title`, `pageid`, `url`, the plain-text `extract`, a `thumbnail` image URL, and `categories`. Titles are batched 50 at a time. Turn on `fullText` to get the whole article instead of just the intro.

(If both are provided, search mode wins. Provide one or the other.)
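
As a rough sketch of how the two modes could map onto MediaWiki API requests, the snippet below builds the query parameters for each request. The actor's own source is not shown here, so `build_requests`, the exact parameter set, and the choice to apply `max_items` to the title list are assumptions. `list=search`, `prop=extracts`, `pageimages`, and `categories` are standard MediaWiki API modules.

```python
from typing import Dict, Iterator, List, Optional

BATCH_SIZE = 50  # both modes work in chunks of 50, per the description above


def build_requests(
    search_query: str = "",
    page_titles: Optional[List[str]] = None,
    full_text: bool = False,
    max_items: int = 50,
) -> Iterator[Dict[str, object]]:
    """Yield one MediaWiki query-parameter dict per request (hypothetical helper)."""
    base = {"action": "query", "format": "json"}
    if search_query:
        # Search mode wins when both inputs are present.
        for offset in range(0, max_items, BATCH_SIZE):
            yield {
                **base,
                "list": "search",
                "srsearch": search_query,
                "srlimit": min(BATCH_SIZE, max_items - offset),
                "sroffset": offset,
            }
        return
    # Page mode. Capping the title list at max_items is an assumption.
    titles = (page_titles or [])[:max_items]
    for start in range(0, len(titles), BATCH_SIZE):
        params = {
            **base,
            "prop": "extracts|pageimages|categories",
            "explaintext": 1,       # plain text instead of HTML
            "pithumbsize": 400,     # thumbnail width cap in pixels
            "clshow": "!hidden",    # skip hidden categories
            "titles": "|".join(titles[start:start + BATCH_SIZE]),
        }
        if not full_text:
            params["exintro"] = 1   # intro section only
        yield params
```

A real client would also stop early when the search runs out of results and follow the API's `continue` tokens; this sketch only shows how the batch sizes and mode precedence play out.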

What you get per row

| Field | Mode | Notes |
| --- | --- | --- |
| `title` | both | Article title. |
| `pageid` | both | Stable Wikipedia page id (used to dedupe). |
| `url` | both | Canonical article URL. |
| `snippet` | search | Plain-text match snippet (HTML stripped). |
| `wordcount`, `size`, `timestamp` | search | Article word count, byte size, last-edit time. |
| `extract` | page | Plain-text article text (intro, or full body with `fullText`). |
| `thumbnail` | page | Lead image URL (up to 400px), if the page has one. |
| `categories` | page | Visible category names (hidden categories excluded). |
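
To illustrate what "HTML stripped" means for `snippet`, here is a minimal sketch that turns one raw MediaWiki search hit into a search-mode row. The helper names are hypothetical, and building `url` from the title is an assumption made for the example; the actor may obtain the canonical URL differently.

```python
import html
import re
from urllib.parse import quote

_TAG_RE = re.compile(r"<[^>]+>")


def strip_html(snippet: str) -> str:
    """Drop tags such as <span class="searchmatch"> and decode HTML entities."""
    return html.unescape(_TAG_RE.sub("", snippet)).strip()


def search_hit_to_row(hit: dict, language: str = "en") -> dict:
    """Shape one item of the API's query.search list like a search-mode row."""
    title = hit["title"]
    return {
        "ok": True,
        "title": title,
        "pageid": hit["pageid"],
        # Assumption: derive the article URL from the title.
        "url": f"https://{language}.wikipedia.org/wiki/{quote(title.replace(' ', '_'))}",
        "snippet": strip_html(hit.get("snippet", "")),
        "wordcount": hit.get("wordcount"),
        "size": hit.get("size"),
        "timestamp": hit.get("timestamp"),
    }
```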

Input

| Field | Notes |
| --- | --- |
| `searchQuery` | Keywords, e.g. `machine learning`. Leave empty if using titles. |
| `pageTitles` | List of exact titles, e.g. `["Apify", "Web scraping"]`. |
| `fullText` | Page mode only. Full article text vs. just the intro. Default off. |
| `language` | Wikipedia edition: `en`, `fr`, `de`, `es`, `ja`, … Default `en`. |
| `maxItems` | Cap on returned pages. Default 50. |
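
The defaults in the table can be captured in a small typed container, which is handy for validating input on the caller's side before starting a run. This is a sketch only: `ActorInput` is a hypothetical class, and raising `ValueError` when neither field is set is an assumption about what a caller might want, not documented actor behaviour.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class ActorInput:
    """Input fields with the defaults listed in the table above."""
    searchQuery: str = ""
    pageTitles: List[str] = field(default_factory=list)
    fullText: bool = False
    language: str = "en"
    maxItems: int = 50

    @classmethod
    def from_dict(cls, raw: dict) -> "ActorInput":
        # Ignore unknown keys so extra fields do not break parsing.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})

    def mode(self) -> str:
        if self.searchQuery.strip():
            return "search"  # search wins if both are provided
        if self.pageTitles:
            return "page"
        # Assumption: treat "neither provided" as a caller error.
        raise ValueError("Provide either searchQuery or pageTitles")
```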

Output

One dataset row per page (`ok: true`). Charged per page. Empty searches or unknown titles return a non-charged diagnostic row with an `errorCode` and a human-readable reason instead of silently returning nothing.
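
When consuming the dataset, it can help to separate page rows from diagnostic rows before further processing. The sketch below assumes diagnostic rows carry a falsy or missing `ok` field; the cost helper assumes the listed starting price of $1.00 per 1,000 pages, which may differ on your plan.

```python
from typing import Iterable, List, Tuple


def split_rows(rows: Iterable[dict]) -> Tuple[List[dict], List[dict]]:
    """Return (page_rows, diagnostic_rows) based on the `ok` flag."""
    pages: List[dict] = []
    diagnostics: List[dict] = []
    for row in rows:
        # Assumption: diagnostic rows have ok set to False or absent.
        (pages if row.get("ok") else diagnostics).append(row)
    return pages, diagnostics


def estimated_cost(pages: List[dict], price_per_1000: float = 1.00) -> float:
    """Rough charge estimate: only page rows count, diagnostics are free."""
    return len(pages) * price_per_1000 / 1000
```

Logging each diagnostic row's `errorCode` next to the title or query that produced it makes misspelled titles easy to spot.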

Example

{ "searchQuery": "machine learning", "language": "en", "maxItems": 30 }

{ "pageTitles": ["Apify", "Web scraping"], "fullText": false, "language": "en" }
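
Either input can be sent from Python through the `apify-client` package. The sketch below takes the client as a parameter so it stays importable without credentials; `run_wikipedia_scraper` is a hypothetical helper, and the actor id is left as a placeholder because this page does not state it.

```python
from typing import Any, Dict, List

SEARCH_INPUT = {"searchQuery": "machine learning", "language": "en", "maxItems": 30}
PAGE_INPUT = {"pageTitles": ["Apify", "Web scraping"], "fullText": False, "language": "en"}


def run_wikipedia_scraper(client: Any, actor_id: str, run_input: Dict[str, Any]) -> List[dict]:
    """Start the actor, wait for it to finish, and return all dataset rows."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (needs `pip install apify-client` and an API token):
#
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_APIFY_TOKEN>")
#   rows = run_wikipedia_scraper(client, "<username>/<actor-name>", SEARCH_INPUT)
```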

Notes

Uses https://{language}.wikipedia.org/w/api.php. In line with Wikimedia's User-Agent policy, the actor always sends a descriptive User-Agent that includes contact information. Results are deduped by pageid.
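
If you call the same API yourself, the three points above look roughly like this. The helper names and the User-Agent string format are illustrative assumptions, not the values the actor sends.

```python
from typing import Dict, Iterable, List


def api_endpoint(language: str = "en") -> str:
    """Endpoint for a given Wikipedia language edition."""
    return f"https://{language}.wikipedia.org/w/api.php"


def request_headers(contact: str) -> Dict[str, str]:
    """Descriptive User-Agent with a contact (example tool name and version)."""
    return {"User-Agent": f"wikipedia-scraper-example/0.1 ({contact})"}


def dedupe_by_pageid(rows: Iterable[dict]) -> List[dict]:
    """Keep the first row seen for each pageid, preserving order."""
    seen = set()
    unique: List[dict] = []
    for row in rows:
        pageid = row.get("pageid")
        if pageid in seen:
            continue
        seen.add(pageid)
        unique.append(row)
    return unique
```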

Wikipedia Scraper - Articles & Full Text for AI / RAG

flash_scraper/wikipedia-scraper

Turn Wikipedia into clean, structured data. Search by keyword or fetch exact titles in 300+ languages; get the intro summary or full article plaintext plus word count, categories, thumbnail & URL. Free, keyless MediaWiki API. Built for RAG, LLM training & research. Export CSV/JSON/Excel.

Flash Scrape

Wikipedia Scraper

automation-lab/wikipedia-scraper

Search and extract Wikipedia articles — titles, summaries, full content, categories, and images. Uses the free MediaWiki API.

Stas Persiianenko

Wikipedia Article Scraper

cloud9_ai/wikipedia-scraper

Scrape Wikipedia articles by search keyword or exact title. Returns summaries, full article text, categories, and links. Supports 300+ languages.

cloud9

Wikipedia Article Extractor

glassventures/wikipedia-article-extractor

Extract Wikipedia articles via MediaWiki API. Get full text, summaries, sections, categories, images, links. Multi-language. Perfect for AI/ML training data and RAG.

Glass Ventures

Wikipedia Article Scraper

kayhermes/wikipedia-scraper

Khoa Nguyen

Wikipedia Article Scraper

rupom888/wikipedia-article-scraper

Scrape Wikipedia articles using the official MediaWiki REST API. Search by keyword, look up specific titles, or scrape by URL. Extracts full article text, sections, infobox data, categories, references, images, and related articles. Supports 300+ languages.

Syed Rupom

Wikipedia Scraper

leftwinglautus/wikipedia-scraper

Scrape Wikipedia articles via the official Wikipedia API. Search articles, get summaries, full content, and categories.

Moeeze Hassan

Wikipedia Article Extractor

receptional_blender/wikipedia-article-extractor

Extract clean, structured Wikipedia articles as JSON — summary, description, canonical URL, thumbnail, geo-coordinates and optional full text — for any list of titles or a search query, in any language edition. Powered by Wikipedia's free public APIs; no key, login or proxy required.

Assia Fadli

Wikipedia Scraper

exuberant_volley/wikipedia-scraper

Scrape Wikipedia articles by search term or exact titles via the official MediaWiki API — summary extract, page image, canonical URL and last-edited date. Keyless, clean JSON, no personal data.

ScrapeForge

Wikipedia Article Scraper - Search & Extract Content

klondikeking/wikipedia-article-scraper

Search and extract Wikipedia article metadata, summaries, and content via the official MediaWiki API. No scraping overhead — pure API integration with high reliability.