Wikipedia Pro Scraper - Sections, Infobox, References
Pricing: Pay per event
Wikipedia scraper for AI/RAG. Extracts structured sections, infobox key-value data, references. Multilingual, batch-friendly. Ready for vector databases.
Developer: WETYR CORPORATION
Maintained by Community
Wikipedia Pro Scraper
The Wikipedia scraper built for AI/RAG pipelines. Extracts:
- Structured sections (Introduction, History, etc.) ready for chunking
- Infobox key-value data (birth date, country, founder, etc.), suited to fact extraction
- References with full citation text
- Internal wikilinks for knowledge graph building
- Image URLs for multimodal datasets
- Categories and language links (langlinks) for cross-language enrichment
Multilingual (300+ Wikipedia editions). Properly attributed under CC BY-SA.
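As a rough illustration of how sectioned output could feed a vector database, the sketch below turns one scraped record into per-section chunks. The record layout (`title`, `url`, `sections` with `title` and `text`) and the helper names `split_text` and `sections_to_chunks` are assumptions made for this example; the Actor's real output schema is not documented on this page and may differ.

```python
from typing import Iterator


def split_text(text: str, max_chars: int) -> Iterator[str]:
    """Greedily pack whitespace-separated words into pieces of at most max_chars.

    A single word longer than max_chars is kept whole rather than cut.
    """
    piece = ""
    for word in text.split():
        candidate = f"{piece} {word}" if piece else word
        if len(candidate) > max_chars and piece:
            yield piece
            piece = word
        else:
            piece = candidate
    if piece:
        yield piece


def sections_to_chunks(record: dict, max_chars: int = 800) -> list[dict]:
    """Build one or more chunks per section, each carrying its source metadata."""
    chunks = []
    for section in record.get("sections", []):
        for i, piece in enumerate(split_text(section.get("text", ""), max_chars)):
            chunks.append({
                "id": f"{record['title']}#{section['title']}#{i}",
                "text": piece,
                "metadata": {
                    "source": record.get("url"),
                    "section": section["title"],
                    "license": "CC BY-SA",  # keep attribution with every chunk
                },
            })
    return chunks


# Hypothetical record; field names are assumed, not taken from the Actor.
sample = {
    "title": "Example article",
    "url": "https://en.wikipedia.org/wiki/Example",
    "sections": [
        {"title": "Introduction", "text": "Short introductory text."},
        {"title": "History", "text": "Some longer historical text. " * 50},
    ],
}

for chunk in sections_to_chunks(sample, max_chars=200):
    print(chunk["id"], len(chunk["text"]))
```

Splitting on section boundaries first, and only then by length, keeps each chunk tied to a single heading, which tends to make retrieved passages easier to cite.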
Why Pro vs other Wikipedia scrapers
Most Wikipedia scrapers on Apify return raw HTML or stripped plaintext. Ours gives you:
- Sectioned output for vector DB chunking
- Parsed infobox (no manual table parsing)
- Clean text with citation markers removed
- Wikilink data for building knowledge graphs
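To make the knowledge-graph use case concrete, here is a minimal sketch that builds a directed adjacency map from scraped records. It assumes each record exposes a `title` and a `wikilinks` list of target article titles; both field names and the helper `build_link_graph` are hypothetical and should be adapted to the Actor's actual output.

```python
def build_link_graph(records: list[dict]) -> dict[str, set[str]]:
    """Map each article title to the set of article titles it links to."""
    graph: dict[str, set[str]] = {}
    for record in records:
        source = record["title"]
        targets = graph.setdefault(source, set())  # keep link-free articles as nodes
        for target in record.get("wikilinks", []):
            if target != source:  # ignore self-links
                targets.add(target)
    return graph


# Hypothetical records; field names are assumed for illustration.
records = [
    {"title": "A", "wikilinks": ["B", "C", "A"]},
    {"title": "B", "wikilinks": ["C"]},
    {"title": "C", "wikilinks": []},
]

graph = build_link_graph(records)
for source in sorted(graph):
    print(source, "->", sorted(graph[source]))
```

The resulting dictionary can be loaded into a graph library or a graph database as an edge list if richer queries are needed.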
Pricing
- $0.05 per actor start
- $0.01 per article scraped
- $0.003 per infobox parsed
Typical run: one actor start plus 1,000 articles, each with an infobox: $0.05 + (1,000 × $0.01) + (1,000 × $0.003) = $13.05.
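The same arithmetic can be sketched as a small estimator for other batch sizes. The prices are copied from the list above; the function name `estimate_run_cost` is made up for this example, and it assumes the infobox fee is charged only for articles where an infobox is actually parsed, which this page does not state explicitly.

```python
from decimal import Decimal

# Prices in USD, copied from the pricing list above.
PRICE_ACTOR_START = Decimal("0.05")
PRICE_PER_ARTICLE = Decimal("0.01")
PRICE_PER_INFOBOX = Decimal("0.003")


def estimate_run_cost(articles: int, infoboxes: int, starts: int = 1) -> Decimal:
    """Estimate the cost of a run in USD.

    Assumption: `infoboxes` counts only articles whose infobox was parsed,
    so it can never exceed `articles`.
    """
    if articles < 0 or infoboxes < 0 or starts < 0:
        raise ValueError("counts must be non-negative")
    if infoboxes > articles:
        raise ValueError("infoboxes cannot exceed articles")
    return (
        starts * PRICE_ACTOR_START
        + articles * PRICE_PER_ARTICLE
        + infoboxes * PRICE_PER_INFOBOX
    )


print(estimate_run_cost(articles=1000, infoboxes=1000))  # 13.050
print(estimate_run_cost(articles=1000, infoboxes=600))
```

`Decimal` is used instead of `float` so that fractional-cent prices such as $0.003 add up exactly.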