W3C Standards Catalog Scraper
Pricing
from $13.00 / 1,000 result items
Scrape W3C standards catalog: title, status, type, date, editors, abstract, shortname, group, deliverer, errata, and specification URL. Covers Recommendations, Working Drafts, Notes, and Candidate Recommendations. Export web standards to JSON, CSV, or Excel for developer tooling.
Developer
ParseForge
📐 W3C Standards Catalog Scraper
🚀 Export the full W3C Web standards catalog in seconds. Pull 1,696 specifications including HTML, CSS, ARIA, WebSocket, Web Components, and every other open Web standard with maturity status, deliverers, and full version history.
🕒 Last updated: 2026-05-23 · 📊 15 fields per record · 📚 1,696 specifications · 🏛️ All W3C working groups · 🔖 9 maturity levels
The W3C Standards Catalog Scraper exports the official W3C specifications corpus, returning 15 fields per record, including shortname, title, maturity status, description, latest version URL, first version URL, working-group deliverer shortnames, and full version history when requested. The dataset is the authoritative catalog of Web standards published by the World Wide Web Consortium since 1994.
The catalog covers 1,696 specifications spanning HTML, CSS, the DOM, Web APIs, ARIA accessibility standards, WebSocket, Web Components, payment APIs, internationalization, security, and privacy, produced by dozens of working groups. A second mode enumerates W3C working groups and community groups themselves, returning the org chart of the open Web.
| 🎯 Target Audience | 💡 Primary Use Cases |
|---|---|
| Web developers, browser engineers, standards researchers, accessibility auditors, technical writers, conformance teams, framework authors | Conformance audits, "supported standards" dashboards, browser feature trackers, accessibility coverage, framework spec mapping, standards research |
📋 What the W3C Standards Catalog Scraper does
Four workflows in one Actor:
- 📚 Full specifications catalog. Every W3C spec from Recommendation to Working Draft to Retired, with shortname, title, status, and links.
- 🏛️ Working groups directory. Switch to `mode: "groups"` to enumerate the W3C organisational chart of working groups and community groups.
- 🔖 Status and group filters. Narrow to one maturity level (Recommendation, Candidate Recommendation, Working Draft, Group Note, Retired, Superseded, Rescinded, Proposed Recommendation) or to one working-group shortname (`css`, `webapps`, `html`, `aria`).
- 🗂️ Optional version history. Toggle `includeVersions` to pull the per-spec version list with one extra call per record.
Each record carries the canonical shortname, the human title, the maturity status, the editor's draft URL, the latest and first version URLs, the deliverers (working-group shortnames), and a stable API URL back to the W3C catalog.
💡 Why it matters: the Web is an open platform because standards are public, traceable, and versioned. Building a conformance tracker, browser-feature tracker, or framework dashboard on top of them usually means parsing inconsistent HTML, scraping multiple pages, and stitching the org chart together by hand. This Actor gives you the structured catalog in one run.
🎬 Full Demo
🚧 Coming soon: a 3-minute walkthrough showing how to filter by working group and export the catalog as JSON.
⚙️ Input
| Input | Type | Default | Behavior |
|---|---|---|---|
| maxItems | integer | 10 | Records to return. Free plan caps at 10, paid plan at 1,000,000. |
| mode | string | "specifications" | "specifications" for standards, "groups" for working groups. |
| status | string | "" | One of 9 maturity levels. Empty = any. |
| groupShortname | string | "" | Filter to one group shortname (e.g. css, html, aria). |
| includeVersions | boolean | false | When true, pulls per-spec version history. Adds ~1 extra call per record. |
Example: every CSS Working Group specification with version history.
{"maxItems": 200, "mode": "specifications", "groupShortname": "css", "includeVersions": true}
Example: all current Recommendations across W3C.
{"maxItems": 500, "mode": "specifications", "status": "Recommendation"}
⚠️ Good to Know: version history is fetched on demand. Pulling 1,000 specs with `includeVersions: true` doubles the call count and runtime. Leave it off for the catalog overview, turn it on for archival use cases.
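The input table above can also be expressed as a small builder, which is handy when inputs are generated by a script. This is a minimal sketch: `build_run_input` is a hypothetical helper, not part of the Actor, and it assumes that leaving out an empty filter behaves the same as passing its default `""`.

```python
from typing import Any

# Allowed modes, taken from the input table above.
MODES = {"specifications", "groups"}


def build_run_input(
    max_items: int = 10,
    mode: str = "specifications",
    status: str = "",
    group_shortname: str = "",
    include_versions: bool = False,
) -> dict[str, Any]:
    """Return an input dict that uses the field names from the input table."""
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}")
    run_input: dict[str, Any] = {"maxItems": max_items, "mode": mode}
    # Assumption: an omitted filter is equivalent to its documented default.
    if status:
        run_input["status"] = status
    if group_shortname:
        run_input["groupShortname"] = group_shortname
    if include_versions:
        run_input["includeVersions"] = True
    return run_input


# Same request as the CSS Working Group example above.
css_input = build_run_input(200, group_shortname="css", include_versions=True)
```

Validating `mode` locally catches a typo before a run is started and billed.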
📊 Output
Each record contains 15 fields. Download the dataset as CSV, Excel, JSON, or XML.
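For scripted downloads, the same formats are available from the Apify dataset items endpoint. The sketch below only builds the URL; `dataset_export_url` is a hypothetical helper, the dataset ID is a placeholder you would copy from a finished run, and non-public datasets also need an API token on the request.

```python
from urllib.parse import urlencode

# Format keys for the formats listed above; "xlsx" is the Excel export.
EXPORT_FORMATS = {"csv", "xlsx", "json", "xml"}


def dataset_export_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the download URL for one dataset in the requested format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"format must be one of {sorted(EXPORT_FORMATS)}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


# "YOUR_DATASET_ID" is a placeholder, not a real dataset.
csv_url = dataset_export_url("YOUR_DATASET_ID", "csv")
```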
🧾 Schema
| Field | Type | Example |
|---|---|---|
| 🆔 shortname | string \| null | "css-color-4" |
| 📜 title | string \| null | "CSS Color Module Level 4" |
| 🔖 status | string \| null | "Candidate Recommendation" |
| 📝 description | string \| null | "This module describes CSS color values..." |
| 🗂️ seriesShortname | string \| null | "css-color" |
| 🔢 seriesVersion | string \| null | "4" |
| ✏️ editorDraftUrl | string \| null | "https://drafts.csswg.org/css-color/" |
| 🔗 shortlink | string \| null | "https://www.w3.org/TR/css-color-4/" |
| 🆕 latestVersionUrl | string \| null | "https://www.w3.org/TR/2024/CR-css-color-4-20240314/" |
| 🥇 firstVersionUrl | string \| null | "https://api.w3.org/specifications/css-color-4/versions/1" |
| 🏛️ groupShortnames | string[] \| null | ["css"] |
| 📚 versionsCount | number \| null | 12 |
| 📑 versionHistory | string[] \| null | array of version URLs |
| 🔌 apiUrl | string | "https://api.w3.org/specifications/css-color-4" |
| 🕒 scrapedAt | string (ISO 8601) | "2026-05-23T00:00:00.000Z" |
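When consuming the dataset from Python, the schema above maps onto a typed record fairly directly. The following is a sketch, not something shipped with the Actor: `SpecRecord` and `from_item` are hypothetical names, `versionsCount` is assumed to be an integer, and the sample values are copied from the Example column.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass(frozen=True)
class SpecRecord:
    # Attribute names mirror the schema's field names so items map across directly.
    shortname: Optional[str]
    title: Optional[str]
    status: Optional[str]
    description: Optional[str]
    seriesShortname: Optional[str]
    seriesVersion: Optional[str]
    editorDraftUrl: Optional[str]
    shortlink: Optional[str]
    latestVersionUrl: Optional[str]
    firstVersionUrl: Optional[str]
    groupShortnames: Optional[list[str]]
    versionsCount: Optional[int]  # assumption: the schema's "number" is a whole count
    versionHistory: Optional[list[str]]
    apiUrl: str
    scrapedAt: str

    @classmethod
    def from_item(cls, item: dict[str, Any]) -> "SpecRecord":
        """Build a record from one dataset item; missing keys become None."""
        return cls(**{f.name: item.get(f.name) for f in fields(cls)})


# Values taken from the Example column of the schema table.
example = SpecRecord.from_item({
    "shortname": "css-color-4",
    "title": "CSS Color Module Level 4",
    "status": "Candidate Recommendation",
    "groupShortnames": ["css"],
    "versionsCount": 12,
    "apiUrl": "https://api.w3.org/specifications/css-color-4",
    "scrapedAt": "2026-05-23T00:00:00.000Z",
})
```

Ignoring unknown keys keeps the loader tolerant if the Actor adds fields later.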
✨ Why choose this Actor
| | Capability |
|---|---|
| 📚 | Full catalog. 1,696 specifications across every W3C working group. |
| 🔖 | Maturity filters. Slice by Recommendation, Candidate Recommendation, Working Draft, Group Note, Retired, Superseded, Rescinded, Proposed Recommendation. |
| 🏛️ | Two modes. Specifications or working groups. Run both to map the open Web's org chart. |
| 📑 | Version history. Optional per-spec version trail so you can build archival dashboards. |
| 🔌 | Stable identifiers. Shortname plus apiUrl gives you durable joins back to the W3C source. |
| ⚡ | Fast. 10 specifications in under 15 seconds. |
| 🚫 | No authentication. Public W3C API. No login or token needed. |
📊 The open Web runs on these specs. A clean, queryable copy of the catalog is the foundation of every conformance tracker, browser feature dashboard, and accessibility audit.
📈 How it compares to alternatives
| Approach | Cost | Coverage | Refresh | Filters | Setup |
|---|---|---|---|---|---|
| ⭐ W3C Standards Catalog Scraper (this Actor) | $5 free credit, then pay-per-use | 1,696 specs | Live per run | status, group, mode, versions | ⚡ 2 min |
| W3C TR/ index by hand | Free | All published | Manual | None | 🐢 Days to parse |
| MDN BCD data | Free | Browser-feature focused | Quarterly | Some | ⏳ Different shape |
| Static caniuse export | Free | Browser-support focused | Periodic | Some | 🕒 Different shape |
Pick this Actor when you need a structured catalog of W3C specifications themselves, not browser support data.
🚀 How to use
- 📝 Sign up. Create a free account with $5 credit (takes 2 minutes).
- 🌐 Open the Actor. Go to the W3C Standards Catalog Scraper page on the Apify Store.
- 🎯 Set input. Pick a mode (`specifications` or `groups`), optionally filter by status or group, and set `maxItems`.
- 🚀 Run it. Click Start and let the Actor collect your data.
- 📥 Download. Grab your results in the Dataset tab as CSV, Excel, JSON, or XML.
⏱️ Total time from signup to a downloaded catalog: 3-5 minutes. No coding required.
🔌 Automating W3C Standards Catalog Scraper
Control the scraper programmatically for scheduled runs and pipeline integrations:
- 🟢 Node.js. Install the `apify-client` NPM package.
- 🐍 Python. Use the `apify-client` PyPI package.
- 📚 See the Apify API documentation for full details.
The Apify Schedules feature lets you trigger this Actor on any cron interval. Weekly catalog refreshes are common for browser-feature trackers and accessibility tooling.
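The Python route mentioned above can be sketched as follows, assuming the `apify-client` package is installed and an API token is available. `ACTOR_ID` is a placeholder to replace with the ID shown on the Actor's page, and `fetch_catalog` and `count_by_status` are hypothetical helper names.

```python
from collections import Counter
from typing import Any, Iterable

# Placeholder: copy the real Actor ID from the Actor's page in Apify Console.
ACTOR_ID = "<username>/<actor-name>"


def fetch_catalog(token: str, run_input: dict[str, Any]) -> list[dict[str, Any]]:
    """Run the Actor once and return every dataset item (needs network access)."""
    # Imported lazily so count_by_status works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The Actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def count_by_status(items: Iterable[dict[str, Any]]) -> Counter:
    """Count records per maturity status; null statuses go under 'unknown'."""
    return Counter(item.get("status") or "unknown" for item in items)


# Typical use, left commented out because it starts a billable run:
# import os
# items = fetch_catalog(os.environ["APIFY_TOKEN"], {"maxItems": 500, "mode": "specifications"})
# print(count_by_status(items))
```

A per-status count is a quick sanity check after each scheduled refresh, since a sudden drop in one bucket usually points to a filter or input change.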
🌟 Beyond business use cases
Data like this powers more than commercial workflows. The same structured records support research, education, civic projects, and personal initiatives.
❓ Frequently Asked Questions
🧩 How does it work?
Pick a mode, optionally set a status or group filter, and click Start. The Actor walks the W3C catalog page by page and emits a clean structured record per specification or per working group.
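To illustrate what "page by page" means in practice, here is a generic pagination loop. It is not the Actor's source code: the page shape (`items` plus a total `pages` count) and the `fetch_page` callback are assumptions made for this sketch, and the demo uses made-up in-memory pages instead of real W3C responses.

```python
from typing import Any, Callable, Iterator

Page = dict[str, Any]


def iter_catalog(fetch_page: Callable[[int], Page], max_items: int) -> Iterator[dict[str, Any]]:
    """Yield up to max_items entries, requesting pages 1, 2, ... until the last one."""
    emitted = 0
    page_number = 1
    while emitted < max_items:
        page = fetch_page(page_number)
        for entry in page["items"]:
            if emitted >= max_items:
                return
            yield entry
            emitted += 1
        # Stop once the last page reported by the source has been consumed.
        if page_number >= page["pages"]:
            return
        page_number += 1


# Made-up stand-in pages; a real fetcher would issue one HTTP request per page.
FAKE_PAGES: dict[int, Page] = {
    1: {"pages": 2, "items": [{"shortname": "spec-a"}, {"shortname": "spec-b"}]},
    2: {"pages": 2, "items": [{"shortname": "spec-c"}]},
}
first_two = [e["shortname"] for e in iter_catalog(FAKE_PAGES.__getitem__, 2)]
everything = [e["shortname"] for e in iter_catalog(FAKE_PAGES.__getitem__, 100)]
```

Stopping as soon as `max_items` is reached is what keeps small test runs from fetching pages they will never use.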
📚 Is the dataset complete?
The W3C catalog reports 1,696 specifications at the time of writing. The Actor pages through the entire catalog when no filters are set and maxItems is high enough.
🔖 What maturity levels are supported?
Recommendation, Proposed Recommendation, Candidate Recommendation, Working Draft, Group Note, Retired, Superseded Recommendation, and Rescinded Recommendation. Filter to one or leave the field empty for the full catalog.
🏛️ Can I get only one working group's specs?
Yes. Set groupShortname to the group's shortname (for example css, html, aria, webapps). The Actor resolves the deliverers for each spec and filters server-side.
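Since `groupShortname` takes a single shortname, selecting several groups at once could be done client-side on a full catalog export instead. A minimal sketch, where `filter_by_groups` is a hypothetical helper and the second and third records are made up for illustration:

```python
from typing import Any, Iterable


def filter_by_groups(items: Iterable[dict[str, Any]], wanted: set[str]) -> list[dict[str, Any]]:
    """Keep records delivered by at least one of the wanted groups."""
    return [
        item
        for item in items
        # groupShortnames may be null, so fall back to an empty list.
        if wanted.intersection(item.get("groupShortnames") or [])
    ]


records = [
    {"shortname": "css-color-4", "groupShortnames": ["css"]},
    {"shortname": "example-aria-spec", "groupShortnames": ["aria"]},  # made-up record
    {"shortname": "example-orphan", "groupShortnames": None},  # made-up record
]
css_and_aria = filter_by_groups(records, {"css", "aria"})
```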
📑 Should I enable version history?
Only when you need it. Each version pull adds one extra call per record. For a catalog overview, leave it off. For an archival dataset, turn it on.
⏰ Can I schedule regular runs?
Yes. Use Apify Schedules to refresh the catalog weekly or monthly into a downstream dashboard.
⚖️ Is this data legal to use?
Yes. The W3C catalog is published under terms that permit reuse. The specs themselves are open standards, freely available for reading and implementation.
💼 Can I use this commercially?
Yes. The Actor returns metadata about open Web standards. Commercial conformance dashboards, browser-feature trackers, and accessibility tooling are all valid use cases.
💳 Do I need a paid Apify plan?
No. The free Apify plan is enough for testing and small runs (10 records per run). A paid plan lifts the limit and gives you access to scheduling, higher concurrency, and larger catalog pulls.
🔁 What happens if a run fails partway through?
Apify retries transient errors automatically. Records already pushed to the dataset are preserved, so a re-run picks up cleanly with the same input.
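If you keep the partial results of a failed run alongside the re-run, the two sets can be combined without duplicates. The sketch below assumes both sets are already loaded as lists of records and uses `apiUrl` as the key because the schema lists it as non-null; `merge_runs` is a hypothetical helper.

```python
from typing import Any


def merge_runs(
    previous: list[dict[str, Any]], rerun: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Merge two result sets keyed by apiUrl; records from the re-run win."""
    merged = {item["apiUrl"]: item for item in previous}
    # Later records overwrite earlier ones that share the same apiUrl.
    merged.update({item["apiUrl"]: item for item in rerun})
    return list(merged.values())


partial = [
    {"apiUrl": "https://api.w3.org/specifications/css-color-4", "versionsCount": None},
]
full = [
    {"apiUrl": "https://api.w3.org/specifications/css-color-4", "versionsCount": 12},
]
combined = merge_runs(partial, full)
```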
🆘 What if I need help?
Our support team is here to help. Contact us through the Apify platform or use the Tally form linked below.
🔌 Integrate with any app
W3C Standards Catalog Scraper connects to any cloud service via Apify integrations:
- Make - Automate multi-step workflows
- Zapier - Connect with 5,000+ apps
- Slack - Get run notifications in your channels
- Airbyte - Pipe spec data into your warehouse
- GitHub - Trigger runs from repo commits
- Google Drive - Export datasets straight to Sheets
You can also use webhooks to fire downstream actions when a run finishes. Push a fresh standards catalog into your conformance dashboard, or alert your team in Slack when a new Working Draft drops.
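The "new Working Draft" alert boils down to comparing two catalog snapshots. Here is a sketch of that comparison, assuming both snapshots are lists of records from this Actor and that "new" means an `apiUrl` absent from the previous snapshot; `new_working_drafts` is a hypothetical helper and the sample URLs are made up.

```python
from typing import Any


def new_working_drafts(
    previous: list[dict[str, Any]], current: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Return Working Drafts in current whose apiUrl was not in previous."""
    seen = {item["apiUrl"] for item in previous}
    return [
        item
        for item in current
        if item.get("status") == "Working Draft" and item["apiUrl"] not in seen
    ]


# Made-up snapshots for illustration only.
last_week = [{"apiUrl": "https://example.test/spec-a", "status": "Working Draft"}]
this_week = last_week + [
    {"apiUrl": "https://example.test/spec-b", "status": "Working Draft"},
    {"apiUrl": "https://example.test/spec-c", "status": "Recommendation"},
]
fresh = new_working_drafts(last_week, this_week)
```

The resulting list is what a webhook handler would format into a Slack message or dashboard update.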
🔗 Recommended Actors
- 📨 IETF Datatracker Drafts Scraper - Internet standards drafts, RFCs, and charters
- 📚 arXiv Scraper - Open-access research papers across all fields
- 📊 OEC Economic Complexity Trade Scraper - International trade flows by country and product
- 📈 Indexmundi Scraper - Global demographic and economic indicators
- 🌐 Nominatim OSM Scraper - Geocode addresses via OpenStreetMap
💡 Pro Tip: browse the complete ParseForge collection for more reference-data scrapers.
🆘 Need Help? Open our contact form to request a new scraper, propose a custom data project, or report an issue.
⚠️ Disclaimer: this Actor is an independent tool and is not affiliated with, endorsed by, or sponsored by the W3C or its member organisations. All trademarks mentioned are the property of their respective owners. Only publicly available W3C catalog data is collected.