Open VSX Extension Registry Scraper
Pricing
Pay per event
Scrape Open VSX extension metadata, downloads, ratings, publishers, versions, repositories, licenses, and file URLs from the public registry API.
Developer
Stas Persiianenko
Maintained by Community
Scrape Open VSX extension metadata from the public Open VSX Registry API.
Use this actor to collect extension names, namespaces, versions, downloads, ratings, reviews, publishers, repositories, licenses, category tags, version counts, and optional file URLs for VSIX packages, manifests, README files, changelogs, icons, signatures, and checksums.
What does Open VSX Extension Registry Scraper do?
The actor searches open-vsx.org and returns structured extension records.
It can:
- Search Open VSX by keyword.
- Fetch exact extensions by `namespace.name`, `namespace/name`, or URL.
- Enrich records from the extension detail API.
- Capture downloads, ratings, and review counts.
- Extract publisher names and provider metadata.
- Include repositories, licenses, homepages, bugs URLs, and taxonomy.
- Optionally include version identifiers and file URLs.
Who is it for?
Extension publishers
Track competing extensions, category visibility, download counts, and rating changes in the open-source IDE ecosystem.
Developer-tool marketers
Build lists of Open VSX extensions by language, framework, or theme niche and enrich them for content, outreach, or partnership workflows.
OSS intelligence teams
Monitor extension metadata for supply-chain research, package provenance, licensing, repository references, and ecosystem adoption.
IDE platform teams
Compare catalog coverage across Open VSX categories and identify popular extensions for curated recommendations.
Data analysts
Create repeatable datasets of extension popularity, publisher metadata, and searchable marketplace rankings.
Why use this actor?
Open VSX is a public extension registry used by VSCodium, Eclipse Theia, Gitpod-style environments, and other VS Code-compatible open-source developer tools.
The website exposes useful JSON endpoints, but analysts still need a repeatable scraper that handles search pagination, exact extension IDs, deduplication, enrichment, and clean dataset output.
This actor packages that workflow into a reusable Apify actor.
How does it work?
The actor uses the public Open VSX API:
- Search endpoint: `https://open-vsx.org/api/-/search`
- Detail endpoint: `https://open-vsx.org/api/<namespace>/<extension>`
No browser automation is used.
No login is required.
No API key is required for Open VSX.
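As a rough illustration of the two endpoints above, the following sketch only builds request URLs and sends nothing over the network. The query parameter names `query`, `size`, and `offset` are assumptions about the Open VSX search API, and the helper names `search_url` and `detail_url` are hypothetical; this is not the actor's own code.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://open-vsx.org/api"


def search_url(query: str, size: int = 50, offset: int = 0) -> str:
    """Build a search URL. `query`, `size`, and `offset` are assumed parameter names."""
    params = urlencode({"query": query, "size": size, "offset": offset})
    return f"{BASE_URL}/-/search?{params}"


def detail_url(namespace: str, name: str) -> str:
    """Build the detail URL for a single extension."""
    return f"{BASE_URL}/{quote(namespace, safe='')}/{quote(name, safe='')}"


print(search_url("python", size=10))
print(detail_url("ms-python", "python"))
```

Paging through search results would then amount to increasing `offset` by `size` until no more results come back, assuming the endpoint pages that way.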
Data you can extract
| Field | Description |
|---|---|
| `id` | Combined extension ID, such as `ms-python.python` |
| `namespace` | Open VSX namespace |
| `name` | Extension name |
| `displayName` | Human-readable extension title |
| `version` | Latest version returned by Open VSX |
| `downloadCount` | Total download count shown by the registry |
| `averageRating` | Average user rating |
| `reviewCount` | Number of reviews |
| `verified` | Whether the namespace or extension is verified |
| `deprecated` | Whether the extension is deprecated |
| `publisherName` | Publisher display name |
| `license` | License string when available |
| `repository` | Source repository URL |
| `extensionUrl` | Human-readable Open VSX extension page |
| `apiUrl` | Detail API URL |
| `versions` | Optional list of version identifiers |
| `files` | Optional file URL map |
How much does it cost to scrape Open VSX extensions?
The actor uses pay-per-event pricing.
You pay a small start fee per run and a per-extension fee for each dataset item produced.
Because the actor uses lightweight HTTP API calls and no browser, it is designed for low compute usage and predictable pricing.
For cost control:
- Start with `maxItems: 100`.
- Disable `includeVersions` unless you need version lists.
- Disable `includeFiles` unless you need package or manifest URLs.
- Use exact `extensionIds` for small monitoring jobs.
- Use broader `queries` for market research jobs.
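The pricing model described above (one start fee per run plus a fee per dataset item) can be estimated with simple arithmetic. The sketch below takes both rates as arguments because this page does not state the actual prices. The function name `estimate_run_cost` and the numbers in the usage line are hypothetical placeholders.

```python
def estimate_run_cost(items: int, start_fee: float, per_item_fee: float) -> float:
    """Return the start fee plus the per-item fee times the number of dataset items."""
    if items < 0:
        raise ValueError("items must be non-negative")
    return start_fee + items * per_item_fee


# Hypothetical rates for illustration only; check the actor's pricing tab for real values.
print(estimate_run_cost(items=100, start_fee=0.01, per_item_fee=0.001))
```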
Input options
queries
Search terms used for Open VSX discovery.
Examples:
- `python`
- `java`
- `theme`
- `docker`
- `kubernetes`
- `ai`
extensionIds
Exact extensions to fetch.
Supported formats:
- `ms-python.python`
- `ms-python/python`
- `https://open-vsx.org/extension/ms-python/python`
- `https://open-vsx.org/api/ms-python/python`
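To show how the four formats can map to the same extension, here is a minimal sketch of a normalizer. It is not the actor's implementation: the function name `parse_extension_ref` is hypothetical, and splitting dotted IDs on the first `.` is an assumption about how namespaces and names are delimited.

```python
from urllib.parse import urlparse


def parse_extension_ref(ref: str) -> tuple[str, str]:
    """Normalize one of the four documented formats to (namespace, name)."""
    ref = ref.strip()
    if ref.startswith(("http://", "https://")):
        # Page URLs look like /extension/<namespace>/<name>,
        # API URLs look like /api/<namespace>/<name>.
        parts = [p for p in urlparse(ref).path.split("/") if p]
        if len(parts) >= 3 and parts[0] in ("extension", "api"):
            return parts[1], parts[2]
        raise ValueError(f"Unrecognized Open VSX URL: {ref}")
    # Plain IDs: split on "/" if present, otherwise on the first "." (assumption).
    separator = "/" if "/" in ref else "."
    namespace, _, name = ref.partition(separator)
    if not namespace or not name:
        raise ValueError(f"Unrecognized extension ID: {ref}")
    return namespace, name


for example in (
    "ms-python.python",
    "ms-python/python",
    "https://open-vsx.org/extension/ms-python/python",
    "https://open-vsx.org/api/ms-python/python",
):
    print(parse_extension_ref(example))
```

Normalizing to a single `(namespace, name)` pair early makes it straightforward to deduplicate explicit IDs against search results.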
maxItems
Maximum unique extensions to save across all searches and explicit IDs.
includeDetails
Fetches detail metadata for richer output.
This is enabled by default and recommended.
includeVersions
Adds the versions array and versionCount.
Use it for version monitoring or compatibility research.
includeFiles
Adds file URLs such as VSIX download, manifest, README, license, changelog, icon, signature, and checksum URLs when Open VSX exposes them.
Example input: keyword search
```json
{
  "queries": ["python", "java"],
  "maxItems": 100,
  "includeDetails": true,
  "includeVersions": false,
  "includeFiles": false
}
```
Example input: exact extensions
```json
{
  "extensionIds": ["ms-python.python", "redhat.java", "vscodevim.vim"],
  "maxItems": 25,
  "includeDetails": true,
  "includeVersions": true,
  "includeFiles": false
}
```
Example input: deep metadata workflow
```json
{
  "queries": ["theme", "material", "dark"],
  "maxItems": 150,
  "includeDetails": true,
  "includeVersions": true,
  "includeFiles": true
}
```
Output example
```json
{
  "id": "ms-python.python",
  "namespace": "ms-python",
  "name": "python",
  "displayName": "Python",
  "version": "2026.4.0",
  "downloadCount": 50805072,
  "averageRating": 3.8,
  "reviewCount": 15,
  "verified": true,
  "deprecated": false,
  "license": "MIT",
  "repository": "https://github.com/Microsoft/vscode-python.git",
  "extensionUrl": "https://open-vsx.org/extension/ms-python/python",
  "apiUrl": "https://open-vsx.org/api/ms-python/python",
  "searchQuery": "python",
  "inputSource": "search"
}
```
Tips for better results
Use short, concrete search terms.
Combine language and category terms for broad market maps.
Use exact IDs for recurring monitoring.
Set includeVersions only when version history matters.
Set includeFiles only when you need package artifacts or manifest links.
Raise maxItems gradually for large discovery jobs.
Integrations
You can connect the dataset to:
- Google Sheets for extension market reports.
- Airtable for publisher tracking.
- Slack alerts for new or deprecated extensions.
- Data warehouses for ecosystem intelligence.
- BI tools for category and download trend dashboards.
- Compliance workflows that review licenses and repository links.
API usage with Node.js
```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor('automation-lab/open-vsx-extension-registry-scraper').call({
    queries: ['python'],
    maxItems: 100,
    includeDetails: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items.slice(0, 3));
```
API usage with Python
```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')

run = client.actor('automation-lab/open-vsx-extension-registry-scraper').call(run_input={
    'queries': ['theme'],
    'maxItems': 100,
    'includeDetails': True,
})

items = client.dataset(run['defaultDatasetId']).list_items().items
print(items[:3])
```
API usage with cURL
```bash
curl -X POST 'https://api.apify.com/v2/acts/automation-lab~open-vsx-extension-registry-scraper/runs?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"queries":["python"],"maxItems":100,"includeDetails":true}'
```
Using with MCP and AI agents
You can expose this actor to MCP-compatible tools through Apify MCP.
Claude Code command:
```bash
claude mcp add --transport http apify-open-vsx-extension-registry-scraper "https://mcp.apify.com/?tools=automation-lab/open-vsx-extension-registry-scraper"
```
Claude Code MCP URL:
https://mcp.apify.com/?tools=automation-lab/open-vsx-extension-registry-scraper
Claude Desktop, Cursor, or VS Code MCP JSON config:
```json
{
  "mcpServers": {
    "apify-open-vsx-extension-registry-scraper": {
      "url": "https://mcp.apify.com/?tools=automation-lab/open-vsx-extension-registry-scraper"
    }
  }
}
```
Example prompts:
- "Find the top Open VSX Python extensions and summarize their publishers."
- "Compare Open VSX theme extensions by downloads and ratings."
- "Extract metadata for ms-python.python, redhat.java, and vscodevim.vim."
Common workflows
Competitive extension research
Run searches for your language, framework, and category keywords.
Sort by downloads and ratings.
Review publishers, repositories, and license data.
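The sorting step can be done on downloaded dataset items with a few lines of Python. This sketch ranks by `downloadCount` first and uses `averageRating` as a tie-breaker, which is one possible ordering rather than a recommended one. The sample records are made up for illustration, and `rank_extensions` is a hypothetical helper.

```python
def rank_extensions(items: list[dict]) -> list[dict]:
    """Sort by downloads, then rating, highest first; missing values count as zero."""
    return sorted(
        items,
        key=lambda r: (r.get("downloadCount") or 0, r.get("averageRating") or 0.0),
        reverse=True,
    )


# Made-up sample records using field names from the output table.
records = [
    {"id": "a.one", "downloadCount": 500, "averageRating": 4.5},
    {"id": "b.two", "downloadCount": 1200, "averageRating": 3.9},
    {"id": "c.three", "downloadCount": 500, "averageRating": None},
]

for record in rank_extensions(records):
    print(record["id"], record["downloadCount"], record["averageRating"])
```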
Extension catalog monitoring
Run exact extensionIds daily or weekly.
Store the output in a database.
Detect version, download, rating, or deprecation changes.
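Change detection between two stored runs can be sketched as a field-by-field comparison keyed on `id`. The snapshot data below is invented for illustration, and `diff_snapshots` and `TRACKED_FIELDS` are hypothetical names, not part of the actor.

```python
TRACKED_FIELDS = ("version", "downloadCount", "averageRating", "deprecated")


def diff_snapshots(previous: list[dict], current: list[dict]) -> dict:
    """Return {extension id: {field: (old, new)}} for tracked fields that changed."""
    old_by_id = {item["id"]: item for item in previous}
    changes = {}
    for item in current:
        old = old_by_id.get(item["id"])
        if old is None:
            continue  # New extension with no baseline to compare against.
        changed = {
            field: (old.get(field), item.get(field))
            for field in TRACKED_FIELDS
            if old.get(field) != item.get(field)
        }
        if changed:
            changes[item["id"]] = changed
    return changes


# Invented snapshots from two runs.
yesterday = [{"id": "redhat.java", "version": "1.0.0", "downloadCount": 100, "deprecated": False}]
today = [{"id": "redhat.java", "version": "1.1.0", "downloadCount": 140, "deprecated": False}]

print(diff_snapshots(yesterday, today))
```

Extensions that appear only in the newer snapshot are skipped here; a real monitoring job would likely report them separately.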
Compliance and provenance checks
Enable detail enrichment.
Collect repository, license, publisher, and file URLs.
Review records with missing license or suspicious repository metadata.
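One way to triage records for the review step is a small rule-based filter. What counts as "suspicious" is a judgment call, so the single heuristic below (a repository value that is not an `https` URL) is only an assumption for illustration. `flag_record` is a hypothetical helper.

```python
from urllib.parse import urlparse


def flag_record(record: dict) -> list[str]:
    """Return the reasons a record may need manual review (empty list if none)."""
    reasons = []
    if not record.get("license"):
        reasons.append("missing license")
    repository = record.get("repository")
    if not repository:
        reasons.append("missing repository")
    else:
        parsed = urlparse(repository)
        # Assumed heuristic: expect an https URL with a host name.
        if parsed.scheme != "https" or not parsed.netloc:
            reasons.append("repository is not an https URL")
    return reasons


print(flag_record({"id": "example.ext", "license": "", "repository": None}))
```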
Related scrapers
You may also find these automation-lab actors useful:
- https://apify.com/automation-lab/website-contact-finder
- https://apify.com/automation-lab/github-repository-scraper
- https://apify.com/automation-lab/domain-to-company-enricher
Limitations
The actor only returns data exposed by Open VSX public APIs.
Some fields may be missing for older or minimally maintained extensions.
Search ranking is controlled by Open VSX and can change over time.
The actor does not download VSIX files; it only returns URLs when includeFiles is enabled.
Troubleshooting
Why did I get fewer items than expected?
Open VSX may return duplicate extensions across queries, and the actor deduplicates by namespace and name. Increase maxItems or add more search terms.
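The effect of deduplication on item counts can be illustrated with a short sketch. It assumes a case-insensitive `(namespace, name)` key and a hard cap at `max_items`; the actor's real logic may differ, and `dedupe` is a hypothetical name.

```python
def dedupe(items: list[dict], max_items: int) -> list[dict]:
    """Keep the first occurrence of each (namespace, name) pair, up to max_items."""
    seen = set()
    unique = []
    for item in items:
        if len(unique) >= max_items:
            break
        # Case-insensitive key is an assumption for this sketch.
        key = (item["namespace"].lower(), item["name"].lower())
        if key in seen:
            continue
        seen.add(key)
        unique.append(item)
    return unique


# Two queries that both return ms-python.python yield three unique items, not four.
combined = [
    {"namespace": "ms-python", "name": "python"},
    {"namespace": "ms-python", "name": "debugpy"},
    {"namespace": "redhat", "name": "java"},
    {"namespace": "ms-python", "name": "python"},
]
print(len(dedupe(combined, max_items=100)))
```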
Why are license or repository fields missing?
Not every extension publishes complete manifest metadata. The actor returns these fields when Open VSX exposes them.
Should I enable version and file enrichment?
Enable it for monitoring, package analysis, or compliance workflows. Keep it disabled for lightweight popularity or search visibility reports.
Legality
Open VSX exposes public registry metadata through public API endpoints. You should still use the data responsibly, respect Open VSX terms, and avoid collecting or redistributing data in ways that violate applicable laws or third-party rights.
FAQ
Can I scrape exact Open VSX extension URLs?
Yes. Add Open VSX page URLs, API URLs, namespace.name, or namespace/name values to extensionIds.
Does the actor download VSIX packages?
No. It extracts package and file URLs when includeFiles is enabled, but it does not download binary VSIX files.
Data freshness
Every run fetches current data from Open VSX at execution time.
Use scheduled Apify tasks for recurring monitoring.
Support
If a field is missing, check whether Open VSX exposes it on the extension page or API response.
If you need additional fields from the Open VSX API, open an issue on the actor page with an example extension URL and the desired field.
Changelog
Initial version supports keyword search, exact extension IDs, detail enrichment, optional version lists, optional file URLs, and pay-per-extension output.