Wikipedia Pageviews Scraper

Pricing

from $8.25 / 1,000 items

Pull Wikipedia pageview metrics for any article in any language edition. Daily or monthly granularity, filter by access type (desktop, mobile, app) and agent type (user, spider, automated). Pick a date range. Export to JSON, CSV, or Excel for SEO research and content benchmarking.

Developer

ParseForge

Maintained by Community

📚 Wikipedia Pageviews Scraper

🚀 Pull daily and monthly Wikipedia pageviews for any article in any language. Filter by date range, access type, and agent type. No API key, no registration, no quota negotiation.

🕒 Last updated: 2026-05-01 · 📊 8 fields per row · 📚 300+ language editions · 📅 daily and monthly granularity · 🗓️ data from July 2015 onward

The Wikipedia Pageviews Scraper queries the official Wikimedia REST API and returns the number of times any Wikipedia article was viewed during a date range. Each row reports the language project, article title, timestamp, access type, agent type, and view count. The endpoint covers every Wikipedia language edition, and the underlying dataset goes back to July 2015, giving you more than a decade of continuous traffic history per article.

Wikipedia is the eighth most visited website in the world with billions of pageviews per month. Pageview trends are a leading indicator for cultural moments, search demand, breaking news, and product launches. Building your own pipeline against the Wikimedia API means handling URL encoding, paginated date ranges, and per-language hosts. This Actor handles all of that and lets you focus on the analysis.

| 🎯 Target Audience | 💡 Primary Use Cases |
| --- | --- |
| SEO teams, journalists, trend researchers, market analysts, academics, dashboard builders | Search demand forecasting, cultural research, content benchmarking, trend tracking, comparative analysis |

📋 What the Wikipedia Pageviews Scraper does

Five filtering workflows in a single run:

  • 📚 Per-article views. Submit any Wikipedia article title and pull its full traffic history for the date range you choose.
  • 🌍 Any language edition. Pick from the 20 language projects listed in the input schema, including English, Spanish, German, French, Japanese, Russian, and Chinese Wikipedia.
  • 📅 Daily or monthly granularity. Daily rollups give you weekday seasonality. Monthly rollups give you long-term trend lines.
  • 📱 Access type filter. Slice traffic by desktop, mobile web, mobile app, or all-access combined.
  • 🤖 Agent type filter. Separate human (user) traffic from spiders and automated agents to clean up trend lines.

Each row in the dataset reports the project (e.g. en.wikipedia), URL-encoded article title, granularity, timestamp in YYYYMMDD00 format, access slice, agent slice, and view count. Dataset entries go back to July 2015.
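
As a rough illustration of the weekday seasonality that daily rollups expose, the sketch below averages `views` per weekday. It assumes rows shaped like the dataset records described above (a `timestamp` in YYYYMMDD00 form and an integer `views`). The function name `mean_views_by_weekday` is hypothetical and not part of the Actor.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_views_by_weekday(rows):
    """Average the `views` of daily rows per weekday name."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        # The first eight characters of the timestamp are the calendar date.
        day = datetime.strptime(row["timestamp"][:8], "%Y%m%d")
        name = WEEKDAYS[day.weekday()]
        totals[name] += row["views"]
        counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}
```

Feeding it a few weeks of daily rows for one article gives a quick view of which days of the week draw the most traffic.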

💡 Why it matters: pageview data is one of the cleanest free signals for tracking real-world attention. When a celebrity dies, a film trailer drops, or a country votes, the matching Wikipedia article spikes within hours. SEO teams use the pageview series as a free proxy for search demand. Researchers cite it in studies of collective attention. Dashboard builders embed it as a public-interest gauge.


🎬 Full Demo

🚧 Coming soon: a 3-minute walkthrough showing how to go from sign-up to a downloaded dataset.


⚙️ Input

| Input | Type | Default | Behavior |
| --- | --- | --- | --- |
| maxItems | integer | 10 | Rows to return. Free plan caps at 10, paid plan at 1,000,000. |
| articles | array of strings | ["Albert_Einstein"] | Article titles with underscores in place of spaces. One title per array entry. |
| project | string | "en.wikipedia.org" | Wikipedia language project. Pick from the enum of 20 supported language editions. |
| granularity | string | "daily" | Either daily or monthly. |
| startDate | string | 30 days ago | ISO date YYYY-MM-DD. Earliest supported is 2015-07-01. |
| endDate | string | yesterday | ISO date YYYY-MM-DD. |
| access | string | "all-access" | all-access, desktop, mobile-app, or mobile-web. |
| agent | string | "all-agents" | all-agents, user, spider, or automated. |

Example: daily English-Wikipedia views for three articles in April 2026.

{
  "maxItems": 100,
  "articles": ["Albert_Einstein", "ChatGPT", "Taylor_Swift"],
  "project": "en.wikipedia.org",
  "granularity": "daily",
  "startDate": "2026-04-01",
  "endDate": "2026-04-30",
  "access": "all-access",
  "agent": "user"
}

Example: monthly Spanish-Wikipedia views since 2020.

{
  "maxItems": 1000,
  "articles": ["Lionel_Messi", "Real_Madrid_CF"],
  "project": "es.wikipedia.org",
  "granularity": "monthly",
  "startDate": "2020-01-01",
  "endDate": "2026-04-01"
}
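
If you generate inputs from code, a small helper can apply the documented defaults and catch obvious mistakes before a run starts. The sketch below is a minimal example under stated assumptions: the field names and defaults come from the input table, while the helper name `build_run_input` and its validation rules are hypothetical and not part of the Actor.

```python
from datetime import date, timedelta

EARLIEST = date(2015, 7, 1)  # earliest startDate listed in the input table


def build_run_input(articles, today, project="en.wikipedia.org",
                    granularity="daily", start=None, end=None,
                    access="all-access", agent="all-agents", max_items=10):
    """Return an input dict using the documented field names and defaults."""
    end = end or today - timedelta(days=1)        # default: yesterday
    start = start or today - timedelta(days=30)   # default: 30 days ago
    if start < EARLIEST:
        raise ValueError("startDate must be 2015-07-01 or later")
    if start > end:
        raise ValueError("startDate must not be after endDate")
    if granularity not in ("daily", "monthly"):
        raise ValueError("granularity must be 'daily' or 'monthly'")
    return {
        "maxItems": max_items,
        # Titles use underscores in place of spaces; capitalization is kept.
        "articles": [title.replace(" ", "_") for title in articles],
        "project": project,
        "granularity": granularity,
        "startDate": start.isoformat(),
        "endDate": end.isoformat(),
        "access": access,
        "agent": agent,
    }
```

Passing `today` explicitly keeps the helper deterministic, which makes scheduled jobs easier to test.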

⚠️ Good to Know: Wikipedia article titles are case sensitive and use underscores, not spaces. Submit Albert_Einstein, not albert einstein. Articles that have been moved or deleted return zero rows. The Wikimedia API is unauthenticated but expects a descriptive User-Agent string, which the Actor sends automatically.
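
For readers curious what a direct request looks like, the sketch below builds a per-article URL for the public Wikimedia pageviews route. It is not the Actor's own code, which is not shown here. The helper name `pageviews_url` is hypothetical, and dates are passed as YYYYMMDD strings.

```python
from urllib.parse import quote

BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(project, article, start, end, granularity="daily",
                  access="all-access", agent="all-agents"):
    """Build a per-article pageviews URL; start and end are YYYYMMDD strings."""
    # Spaces become underscores, then everything else is percent-encoded,
    # including "/" so that titles such as AC/DC stay in one path segment.
    title = quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/{access}/{agent}/{title}/{granularity}/{start}/{end}"
```

A real request should also send a descriptive User-Agent header, as noted above.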


📊 Output

Each row contains 8 fields. Download the dataset as CSV, Excel, JSON, or XML.

🧾 Schema

| Field | Type | Example |
| --- | --- | --- |
| 🌐 project | string | "en.wikipedia" |
| 📄 article | string | "Albert_Einstein" |
| ⏱️ granularity | string | "daily" |
| 📅 timestamp | string | "2026040100" |
| 📱 access | string | "all-access" |
| 🤖 agent | string | "all-agents" |
| 👁️ views | integer | 15626 |
| 🕒 scrapedAt | ISO 8601 | "2026-05-01T02:00:11.931Z" |
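
When loading exported rows into your own code, it can help to convert the string fields into typed values. The sketch below is one possible approach and assumes each record carries exactly the eight fields in the schema. The `PageviewRow` class and `parse_row` function are hypothetical names, not part of the Actor.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class PageviewRow:
    project: str
    article: str
    granularity: str
    day: date
    access: str
    agent: str
    views: int
    scraped_at: datetime


def parse_row(raw):
    """Convert one dataset record (a dict) into a typed PageviewRow."""
    # timestamp is YYYYMMDD00; the trailing "00" hour slot is dropped.
    day = datetime.strptime(raw["timestamp"][:8], "%Y%m%d").date()
    # Replace the trailing "Z" so older Python versions can parse the offset.
    scraped_at = datetime.fromisoformat(raw["scrapedAt"].replace("Z", "+00:00"))
    return PageviewRow(raw["project"], raw["article"], raw["granularity"],
                       day, raw["access"], raw["agent"],
                       int(raw["views"]), scraped_at)
```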


✨ Why choose this Actor

| | Capability |
| --- | --- |
| 🆓 | Free official source. Pulls directly from the public Wikimedia REST API, no scraping of HTML pages. |
| 🌍 | Major Wikipedia languages. Pick from 20 enum-listed projects, request more if you need them. |
| 📅 | Decade of history. Data goes back to July 2015, with daily and monthly rollups. |
| 🧪 | Clean filter slices. Separate desktop from mobile, separate human traffic from spiders. |
| 🚀 | Sub-10-second runs. A typical 100-row pull finishes in under 10 seconds. |
| 🛠️ | Bulk article support. Submit dozens of articles in a single run, results pushed in order. |
| 🔄 | Export anywhere. Output ships as CSV, Excel, JSON, or XML through the Apify dataset endpoints. |

📊 The Wikimedia Foundation reports more than 18 billion pageviews per month across all editions.


📈 How it compares to alternatives

| Approach | Cost | Coverage | Refresh | Filters | Setup |
| --- | --- | --- | --- | --- | --- |
| Manual queries to the Wikimedia REST API | Free | Full | Live | Manual | Engineer hours |
| Third-party paid SEO suites | $$$ subscription | Partial | Daily | Built-in | Account setup |
| Generic web traffic estimators | $$ subscription | Estimated | Weekly | Limited | Account setup |
| ⭐ Wikipedia Pageviews Scraper (this Actor) | Pay-per-event | Full | Live | Granularity, access, agent | None |

The same data the Wikimedia Foundation publishes, exposed as clean structured records you can pipe into anything.


🚀 How to use

  1. 🆓 Create a free Apify account. Sign up here and get $5 in free credit.
  2. 🔍 Open the Actor. Search for "Wikipedia Pageviews" in the Apify Store.
  3. ⚙️ Set your inputs. Pick articles, project, date range, granularity, and any filters.
  4. ▶️ Click Start. Most runs finish in under 10 seconds.
  5. 📥 Download. Export as CSV, Excel, JSON, or XML, or wire it into a Make / Zapier flow.

⏱️ Total time from sign-up to first dataset: under five minutes.


💼 Business use cases

📈 SEO & content teams

  • Forecast search demand by tracking Wikipedia traffic on related terms
  • Find rising topics before they show up in keyword tools
  • Benchmark content performance against the canonical Wikipedia article
  • Justify content investments with neutral third-party numbers

📰 Journalists & research

  • Quantify public attention to figures and events
  • Study attention bursts around news cycles
  • Compare regional interest across language editions
  • Cite a free, reproducible, primary source in stories

💰 Finance & market research

  • Use article traffic as an alt-data signal for consumer interest
  • Track brand and product awareness over time
  • Benchmark IPO or product-launch attention
  • Build leading indicators for niche markets

🧠 Data science & ML

  • Generate time-series features for downstream models
  • Train demand-forecasting models on cultural attention
  • Build dashboards for trend monitoring
  • Cross-reference pageviews with social and search data

🌟 Beyond business use cases

Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.

🎓 Research and academia

  • Empirical datasets for papers, thesis work, and coursework
  • Longitudinal studies tracking changes across snapshots
  • Reproducible research with cited, versioned data pulls
  • Classroom exercises on data analysis and ethical scraping

🎨 Personal and creative

  • Side projects, portfolio demos, and indie app launches
  • Data visualizations, dashboards, and infographics
  • Content research for bloggers, YouTubers, and podcasters
  • Hobbyist collections and personal trackers

🤝 Non-profit and civic

  • Transparency reporting and accountability projects
  • Advocacy campaigns backed by public-interest data
  • Community-run databases for local issues
  • Investigative journalism on public records

🧪 Experimentation

  • Prototype AI and machine-learning pipelines with real data
  • Validate product-market hypotheses before engineering spend
  • Train small domain-specific models on niche corpora
  • Test dashboard concepts with live input

🔌 Automating Wikipedia Pageviews Scraper

Run this Actor on a schedule, from your codebase, or inside another tool:

Schedule daily, weekly, or monthly runs from the Apify Console. Export results to Google Sheets, S3, or your own webhook with the built-in integrations.
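
To start a run from your own codebase, one option is the Apify API's synchronous "run and get dataset items" endpoint. The sketch below uses only the Python standard library and is a minimal example, not an official client. The helper names are hypothetical, and `actor_id` is a placeholder in `username~actor-name` form that you need to copy from the Actor's page, since the ID is not stated in this document.

```python
import json
import urllib.request

API = "https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"


def build_run_request(actor_id, token, run_input):
    """Prepare (but do not send) the POST request that starts a run."""
    return urllib.request.Request(
        API.format(actor_id=actor_id),
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id, token, run_input, timeout=120):
    """Send the request and return the dataset rows as a list of dicts."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any credit is spent.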


❓ Frequently Asked Questions

📅 How far back does the data go?

The Wikimedia REST API serves pageview data from July 1, 2015 onward. Earlier data is not available through this endpoint.

🌍 Which language editions are supported?

The input schema lists 20 widely-used Wikipedia language projects including English, Spanish, German, French, Italian, Portuguese, Russian, Chinese, Japanese, Korean, Arabic, Hebrew, Turkish, Polish, Dutch, Indonesian, Vietnamese, Hindi, Persian, and Ukrainian. Open a request via our contact form if you need a different project.

⏱️ What granularity does the API support?

Daily and monthly. Hourly granularity is available through a different endpoint and is not part of this Actor.

🔠 How do I format article titles?

Use underscores in place of spaces and match the exact capitalization of the Wikipedia URL. Albert_Einstein works, albert einstein does not. Special characters and non-Latin scripts are URL-encoded automatically.

📦 Can I pass dozens of articles at once?

Yes. The articles input is a string array. The Actor processes each title sequentially and pushes results in order until maxItems is reached.
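
The ordering and cutoff described here can be pictured with a short sketch. This is an illustration of the stated behavior, not the Actor's source code. `collect_rows` and the `fetch` callback are hypothetical names, and `fetch` stands in for whatever retrieves the rows for a single title.

```python
def collect_rows(articles, fetch, max_items):
    """Yield rows article by article, stopping once max_items rows are emitted."""
    emitted = 0
    for title in articles:
        if emitted >= max_items:
            return  # cap already reached, skip the remaining titles
        for row in fetch(title):
            if emitted >= max_items:
                return
            yield row
            emitted += 1
```

One consequence of this model is that a low `maxItems` can be used up by the first titles in the array, so later titles may return nothing.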

🤖 What is the difference between agent types?

user traffic is human visits. spider traffic is search engine crawlers. automated traffic is known automation tools. all-agents sums everything. For trend analysis, filter by user to remove crawler noise.

📱 What is the difference between access types?

desktop covers traffic from desktop browsers, mobile-web from mobile browsers, mobile-app from the official Wikipedia app. all-access sums everything.

💼 Can I use the data for commercial work?

Yes. Wikipedia pageview data is published under the Creative Commons CC0 license and can be used commercially. Always cite Wikimedia as the data source.

💳 Do I need a paid plan to use this?

The free plan returns up to 10 rows per run, which is enough for testing. Paid plans return up to 1,000,000 rows.
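
To pick a sensible `maxItems`, you can estimate the row count before running. The sketch below assumes one row per article per day or month with both range endpoints included, so treat the result as an upper bound: titles with gaps return fewer rows. The price default is the "from $8.25 / 1,000 items" figure on the listing and may not match your plan. Both function names are hypothetical.

```python
from datetime import date


def estimate_rows(n_articles, start, end, granularity="daily"):
    """Upper-bound row count for a run, counting both endpoints."""
    if end < start:
        raise ValueError("end must not be before start")
    if granularity == "daily":
        periods = (end - start).days + 1
    elif granularity == "monthly":
        periods = (end.year - start.year) * 12 + (end.month - start.month) + 1
    else:
        raise ValueError("granularity must be 'daily' or 'monthly'")
    return n_articles * periods


def estimate_cost(rows, usd_per_1000=8.25):
    """Rough cost in USD at a flat per-1,000-rows price."""
    return rows * usd_per_1000 / 1000
```

For example, three articles over April 2026 at daily granularity come to at most 90 rows, which fits under a `maxItems` of 100.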

⚠️ What if a run fails or returns empty?

The most common cause is a misspelled article title. Confirm the exact slug on Wikipedia, then retry. If the issue persists, open our contact form and include the run URL.

📊 Can I trust the numbers?

Yes. The data comes directly from the Wikimedia Foundation's published pageview metrics, the same numbers used in their public dashboards.

This Actor uses the official Wikimedia REST API, not scraping. The API is publicly documented and explicitly built for programmatic access.


🔌 Integrate with any app

  • Make - drop run results into 1,800+ apps with a no-code visual builder.
  • Zapier - trigger automations off completed runs.
  • Slack - post run summaries to a channel.
  • Google Sheets - sync each run into a spreadsheet.
  • Webhooks - notify your own services on run finish.
  • Airbyte - load runs into Snowflake, BigQuery, or Postgres.

💡 Pro Tip: browse the complete ParseForge collection for more pre-built scrapers and data tools.


🆘 Need Help? Open our contact form and we'll route the question to the right person.


Wikipedia is a registered trademark of the Wikimedia Foundation. This Actor is not affiliated with or endorsed by the Wikimedia Foundation. It is built on the public Wikimedia REST API and respects all published rate limits.