Open VSX Extension Registry Scraper

🔎 Scrape Open VSX extension metadata, downloads, ratings, publishers, versions, repositories, licenses, and file URLs from the public registry API.

Pricing

Pay per event

Developer

Stas Persiianenko

Maintained by Community

Scrape Open VSX extension metadata from the public Open VSX Registry API.

Use this actor to collect extension names, namespaces, versions, downloads, ratings, reviews, publishers, repositories, licenses, category tags, version counts, and optional file URLs for VSIX packages, manifests, README files, changelogs, icons, signatures, and checksums.

What does Open VSX Extension Registry Scraper do?

The actor searches open-vsx.org and returns structured extension records.

It can:

  • 🔎 Search Open VSX by keyword.
  • 🎯 Fetch exact extensions by namespace.name, namespace/name, or URL.
  • 📦 Enrich records from the extension detail API.
  • ⭐ Capture downloads, ratings, and review counts.
  • 🧑‍💻 Extract publisher names and provider metadata.
  • 🧾 Include repositories, licenses, homepages, bugs URLs, and taxonomy.
  • 🗂️ Optionally include version identifiers and file URLs.

Who is it for?

Extension publishers

Track competing extensions, category visibility, download counts, and rating changes in the open-source IDE ecosystem.

Developer-tool marketers

Build lists of Open VSX extensions by language, framework, or theme niche and enrich them for content, outreach, or partnership workflows.

OSS intelligence teams

Monitor extension metadata for supply-chain research, package provenance, licensing, repository references, and ecosystem adoption.

IDE platform teams

Compare catalog coverage across Open VSX categories and identify popular extensions for curated recommendations.

Data analysts

Create repeatable datasets of extension popularity, publisher metadata, and searchable marketplace rankings.

Why use this actor?

Open VSX is a public extension registry used by VSCodium, Eclipse Theia, Gitpod-style environments, and other VS Code-compatible open-source developer tools.

The website exposes useful JSON endpoints, but analysts still need a repeatable scraper that handles search pagination, exact extension IDs, deduplication, enrichment, and clean dataset output.

This actor packages that workflow into a reusable Apify actor.

How does it work?

The actor uses the public Open VSX API:

  • Search endpoint: https://open-vsx.org/api/-/search
  • Detail endpoint: https://open-vsx.org/api/<namespace>/<extension>

No browser automation is used.

No login is required.

No API key is required for Open VSX.
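
The request flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the actor's own code: the `query`, `offset`, and `size` parameters and the `extensions` and `totalSize` response keys are assumptions based on the public Open VSX search API and should be checked against its documentation. The helper names (`build_search_url`, `fetch_json`, `search_all`) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_URL = "https://open-vsx.org/api/-/search"


def build_search_url(query, offset=0, size=50):
    """Build the URL for one search page (parameter names are assumed)."""
    params = {"query": query, "offset": offset, "size": size}
    return f"{SEARCH_URL}?{urlencode(params)}"


def fetch_json(url):
    """Plain HTTP GET without authentication; returns the decoded JSON body."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def search_all(query, max_items, fetch=fetch_json, page_size=50):
    """Page through search results until max_items or the results run out."""
    results = []
    offset = 0
    while len(results) < max_items:
        page = fetch(build_search_url(query, offset, page_size))
        extensions = page.get("extensions", [])
        if not extensions:
            break
        results.extend(extensions)
        offset += len(extensions)
        total = page.get("totalSize")  # assumed response key
        if total is not None and offset >= total:
            break
    return results[:max_items]
```

Passing `fetch` as a parameter keeps the pagination logic testable without network access; calling `search_all("python", 100)` with the default fetcher would issue real requests to open-vsx.org.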

Data you can extract

| Field | Description |
| --- | --- |
| id | Combined extension ID, such as ms-python.python |
| namespace | Open VSX namespace |
| name | Extension name |
| displayName | Human-readable extension title |
| version | Latest version returned by Open VSX |
| downloadCount | Total download count shown by the registry |
| averageRating | Average user rating |
| reviewCount | Number of reviews |
| verified | Whether the namespace or extension is verified |
| deprecated | Whether the extension is deprecated |
| publisherName | Publisher display name |
| license | License string when available |
| repository | Source repository URL |
| extensionUrl | Human Open VSX extension page |
| apiUrl | Detail API URL |
| versions | Optional list of version identifiers |
| files | Optional file URL map |

How much does it cost to scrape Open VSX extensions?

The actor uses pay-per-event pricing.

You pay a small start fee per run and a per-extension fee for each dataset item produced.

Because the actor uses lightweight HTTP API calls and no browser, it is designed for low compute usage and predictable pricing.
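
As a rough illustration of how the two fees combine, the sketch below adds a per-run start fee to a per-item fee. The prices in the usage note are placeholders invented for the example, not the actor's real rates; check the actor's pricing tab for current values. The function name `estimate_run_cost` is hypothetical.

```python
def estimate_run_cost(items, start_fee_usd, per_item_fee_usd):
    """Estimate one run's cost: start fee plus a fee per dataset item."""
    if items < 0:
        raise ValueError("items must be non-negative")
    return start_fee_usd + items * per_item_fee_usd
```

For example, with a hypothetical start fee of 0.01 USD and a hypothetical per-item fee of 0.002 USD, `estimate_run_cost(100, 0.01, 0.002)` gives the estimated cost of a 100-item run.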

For cost control:

  • Start with maxItems: 100.
  • Disable includeVersions unless you need version lists.
  • Disable includeFiles unless you need package or manifest URLs.
  • Use exact extensionIds for small monitoring jobs.
  • Use broader queries for market research jobs.

Input options

queries

Search terms used for Open VSX discovery.

Examples:

  • python
  • java
  • theme
  • docker
  • kubernetes
  • ai

extensionIds

Exact extensions to fetch.

Supported formats:

  • ms-python.python
  • ms-python/python
  • https://open-vsx.org/extension/ms-python/python
  • https://open-vsx.org/api/ms-python/python
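
The four formats above all reduce to a namespace and an extension name. The following is a minimal sketch of one way to normalize them, assuming the namespace itself contains no dot; it is not the actor's actual parser, and `parse_extension_id` and `detail_url` are hypothetical helper names.

```python
from urllib.parse import urlparse

API_BASE = "https://open-vsx.org/api"


def parse_extension_id(value):
    """Return (namespace, name) for any of the supported input formats."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        parts = [p for p in urlparse(value).path.split("/") if p]
        # Drop the leading "extension" (page URL) or "api" (API URL) segment.
        if parts and parts[0] in ("extension", "api"):
            parts = parts[1:]
    elif "/" in value:
        parts = value.split("/")
    else:
        # Split on the first dot only (assumes a dot-free namespace).
        parts = value.split(".", 1)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"Unrecognized extension ID: {value!r}")
    return parts[0], parts[1]


def detail_url(namespace, name):
    """Build the detail endpoint URL for one extension."""
    return f"{API_BASE}/{namespace}/{name}"
```

Each supported format yields the same pair, so `detail_url(*parse_extension_id("ms-python/python"))` produces the detail API URL listed above.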

maxItems

Maximum unique extensions to save across all searches and explicit IDs.
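
A cap on unique extensions across several sources can be sketched as below. This is an illustrative approach under stated assumptions, not the actor's implementation: records are keyed by namespace and name, the comparison is case-insensitive (an assumption), and `collect_unique` is a hypothetical name.

```python
def collect_unique(batches, max_items):
    """Merge result batches, keeping the first record per (namespace, name)."""
    seen = set()
    unique = []
    for batch in batches:
        for record in batch:
            # Case-insensitive key; this normalization is an assumption.
            key = (record["namespace"].lower(), record["name"].lower())
            if key in seen:
                continue
            seen.add(key)
            unique.append(record)
            if len(unique) >= max_items:
                return unique
    return unique
```

Because duplicates are dropped before the cap is applied, two overlapping queries can yield fewer saved records than the sum of their individual results.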

includeDetails

Fetches detail metadata for richer output.

This is enabled by default and recommended.

includeVersions

Adds the versions array and versionCount.

Use it for version monitoring or compatibility research.

includeFiles

Adds file URLs such as VSIX download, manifest, README, license, changelog, icon, signature, and checksum URLs when Open VSX exposes them.

Example input: keyword search

{
    "queries": ["python", "java"],
    "maxItems": 100,
    "includeDetails": true,
    "includeVersions": false,
    "includeFiles": false
}

Example input: exact extensions

{
    "extensionIds": ["ms-python.python", "redhat.java", "vscodevim.vim"],
    "maxItems": 25,
    "includeDetails": true,
    "includeVersions": true,
    "includeFiles": false
}

Example input: deep metadata workflow

{
    "queries": ["theme", "material", "dark"],
    "maxItems": 150,
    "includeDetails": true,
    "includeVersions": true,
    "includeFiles": true
}

Output example

{
    "id": "ms-python.python",
    "namespace": "ms-python",
    "name": "python",
    "displayName": "Python",
    "version": "2026.4.0",
    "downloadCount": 50805072,
    "averageRating": 3.8,
    "reviewCount": 15,
    "verified": true,
    "deprecated": false,
    "license": "MIT",
    "repository": "https://github.com/Microsoft/vscode-python.git",
    "extensionUrl": "https://open-vsx.org/extension/ms-python/python",
    "apiUrl": "https://open-vsx.org/api/ms-python/python",
    "searchQuery": "python",
    "inputSource": "search"
}

Tips for better results

Use short, concrete search terms.

Combine language and category terms for broad market maps.

Use exact IDs for recurring monitoring.

Set includeVersions only when version history matters.

Set includeFiles only when you need package artifacts or manifest links.

Raise maxItems gradually for large discovery jobs.

Integrations

You can connect the dataset to:

  • Google Sheets for extension market reports.
  • Airtable for publisher tracking.
  • Slack alerts for new or deprecated extensions.
  • Data warehouses for ecosystem intelligence.
  • BI tools for category and download trend dashboards.
  • Compliance workflows that review licenses and repository links.
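
For spreadsheet-style destinations, dataset items can be flattened to CSV first. The sketch below uses only the Python standard library and the field names from the output table; the column selection and the `items_to_csv` name are illustrative choices, not part of the actor.

```python
import csv
import io

# Illustrative column subset taken from the documented output fields.
COLUMNS = ["id", "displayName", "version", "downloadCount",
           "averageRating", "reviewCount", "license", "repository"]


def items_to_csv(items, columns=COLUMNS):
    """Serialize dataset items to CSV text, leaving missing fields blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for item in items:
        # Build each row from the chosen columns only; extra keys are dropped.
        writer.writerow({column: item.get(column, "") for column in columns})
    return buffer.getvalue()
```

The returned string can be written to a file and imported into a spreadsheet or BI tool; nested fields such as `versions` or `files` are left out here because they do not map cleanly to a single cell.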

API usage with Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/open-vsx-extension-registry-scraper').call({
    queries: ['python'],
    maxItems: 100,
    includeDetails: true,
});

// Read the scraped records from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items.slice(0, 3));

API usage with Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/open-vsx-extension-registry-scraper').call(run_input={
    'queries': ['theme'],
    'maxItems': 100,
    'includeDetails': True,
})

# Read the scraped records from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items[:3])

API usage with cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~open-vsx-extension-registry-scraper/runs?token=YOUR_APIFY_TOKEN' \
    -H 'Content-Type: application/json' \
    -d '{"queries":["python"],"maxItems":100,"includeDetails":true}'

Using with MCP and AI agents

You can expose this actor to MCP-compatible tools through Apify MCP.

Claude Code command:

claude mcp add --transport http apify-open-vsx-extension-registry-scraper "https://mcp.apify.com/?tools=automation-lab/open-vsx-extension-registry-scraper"

Claude Code MCP URL:

https://mcp.apify.com/?tools=automation-lab/open-vsx-extension-registry-scraper

Claude Desktop, Cursor, or VS Code MCP JSON config:

{
    "mcpServers": {
        "apify-open-vsx-extension-registry-scraper": {
            "url": "https://mcp.apify.com/?tools=automation-lab/open-vsx-extension-registry-scraper"
        }
    }
}

Example prompts:

  • "Find the top Open VSX Python extensions and summarize their publishers."
  • "Compare Open VSX theme extensions by downloads and ratings."
  • "Extract metadata for ms-python.python, redhat.java, and vscodevim.vim."

Common workflows

Competitive extension research

Run searches for your language, framework, and category keywords.

Sort by downloads and ratings.

Review publishers, repositories, and license data.
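
The sorting step can be done directly on the dataset items. This is a minimal sketch that ranks by `downloadCount` and breaks ties with `averageRating`; the tie-breaking order and the `rank_extensions` name are illustrative assumptions.

```python
def rank_extensions(items, top=10):
    """Return the top items by download count, then by average rating."""
    def sort_key(item):
        # Treat missing or null metrics as zero so sparse records sort last.
        return (item.get("downloadCount") or 0, item.get("averageRating") or 0.0)

    return sorted(items, key=sort_key, reverse=True)[:top]
```

Records without download or rating data fall to the bottom of the ranking rather than raising an error.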

Extension catalog monitoring

Run exact extensionIds daily or weekly.

Store the output in a database.

Detect version, download, rating, or deprecation changes.
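
Change detection between two stored runs can be sketched as a field-by-field comparison keyed on `id`. The tracked field list and the `diff_snapshots` name are assumptions made for illustration; adjust them to whatever your database stores.

```python
TRACKED_FIELDS = ("version", "downloadCount", "averageRating", "deprecated")


def diff_snapshots(previous, current, fields=TRACKED_FIELDS):
    """Return {extension id: {field: (old, new)}} for tracked fields that changed."""
    old_by_id = {item["id"]: item for item in previous}
    changes = {}
    for item in current:
        old = old_by_id.get(item["id"])
        if old is None:
            continue  # Newly seen extension: nothing to compare against.
        changed = {
            field: (old.get(field), item.get(field))
            for field in fields
            if old.get(field) != item.get(field)
        }
        if changed:
            changes[item["id"]] = changed
    return changes
```

The result can feed an alerting step, for example posting only the extensions whose `deprecated` flag changed.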

Compliance and provenance checks

Enable detail enrichment.

Collect repository, license, publisher, and file URLs.

Review records with missing license or suspicious repository metadata.
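
A first-pass review filter might look like the sketch below. What counts as "suspicious" is a policy decision; here it is assumed to mean a repository value that is not an http(s) URL, and `flag_for_review` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def flag_for_review(items):
    """Return (id, reasons) pairs for records that need a manual look."""
    flagged = []
    for item in items:
        reasons = []
        if not item.get("license"):
            reasons.append("missing license")
        repository = item.get("repository")
        if not repository:
            reasons.append("missing repository")
        elif urlparse(repository).scheme not in ("http", "https"):
            # Assumed rule: anything other than an http(s) URL gets reviewed.
            reasons.append("repository is not an http(s) URL")
        if reasons:
            flagged.append((item["id"], reasons))
    return flagged
```

Stricter rules, such as an allowlist of repository hosts or license identifiers, can be added as further checks in the same loop.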

Limitations

The actor only returns data exposed by Open VSX public APIs.

Some fields may be missing for older or minimally maintained extensions.

Search ranking is controlled by Open VSX and can change over time.

The actor does not download VSIX files; it only returns URLs when includeFiles is enabled.

Troubleshooting

Why did I get fewer items than expected?

Open VSX may return duplicate extensions across queries, and the actor deduplicates by namespace and name. Increase maxItems or add more search terms.

Why are license or repository fields missing?

Not every extension publishes complete manifest metadata. The actor returns these fields when Open VSX exposes them.

Should I enable version and file enrichment?

Enable it for monitoring, package analysis, or compliance workflows. Keep it disabled for lightweight popularity or search visibility reports.

Legality

Open VSX exposes public registry metadata through public API endpoints. You should still use the data responsibly, respect Open VSX terms, and avoid collecting or redistributing data in ways that violate applicable laws or third-party rights.

FAQ

Can I scrape exact Open VSX extension URLs?

Yes. Add Open VSX page URLs, API URLs, namespace.name, or namespace/name values to extensionIds.

Does the actor download VSIX packages?

No. It extracts package and file URLs when includeFiles is enabled, but it does not download binary VSIX files.

Data freshness

Every run fetches current data from Open VSX at execution time.

Use scheduled Apify tasks for recurring monitoring.

Support

If a field is missing, check whether Open VSX exposes it on the extension page or API response.

If you need additional fields from the Open VSX API, open an issue on the actor page with an example extension URL and the desired field.

Changelog

Initial version supports keyword search, exact extension IDs, detail enrichment, optional version lists, optional file URLs, and pay-per-extension output.