GitHub Marketplace Scraper

Pricing

from $0.03 / 1,000 extracted items

Extract GitHub Marketplace apps and actions from public pages for vendor intelligence, category monitoring, and developer-tool leads.

Rating

0.0 (0)

Developer

Stas Persiianenko

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 5 days ago

Extract public GitHub Marketplace apps and actions for vendor intelligence, lead generation, ecosystem research, and partner discovery.

The actor reads GitHub Marketplace browse, search, category, and detail pages without logging in. It returns names, slugs, Marketplace URLs, listing type, short descriptions, pricing/free-trial signals, badge signals, images, optional detail text, and public install/setup links.

What does GitHub Marketplace Scraper do?

GitHub Marketplace Scraper turns GitHub Marketplace pages into a clean dataset.

It can collect marketplace listings from:

  • 🔎 App browse pages
  • 🔎 Action browse pages
  • 🔎 Search result pages
  • 🔎 Category pages
  • 🔎 Individual app detail pages
  • 🔎 Individual action detail pages

Use it when you need repeatable GitHub Marketplace data instead of manual copy-paste.

Who is it for?

This scraper is useful for several kinds of teams:

  • 🧩 Developer-tool companies tracking adjacent vendors
  • 🤝 Partnership teams looking for integration leads
  • 🛠️ DevOps agencies building vendor shortlists
  • 📈 Ecosystem analysts monitoring Marketplace categories
  • 🧪 Product managers researching pricing and positioning
  • 🧲 Growth teams collecting GitHub app/action lead lists

Why use this actor?

GitHub Marketplace is a high-signal directory of tools that sell to developers.

Manual browsing is slow because useful data is split across search pages, category pages, and listing detail pages.

This actor gives you structured records that are easier to filter, enrich, deduplicate, and monitor over time.

What data can you extract?

Each dataset item represents one Marketplace listing.

| Field | Description |
| --- | --- |
| name | Marketplace app or action name |
| slug | Stable GitHub Marketplace slug |
| marketplaceUrl | Public listing URL |
| listingType | App or Action label when visible |
| developer | Publisher/creator signal when available |
| description | Short description |
| categories | Category signals from detail pages |
| pricingSignals | Free, paid, pricing, subscription, or open-source text signals |
| freeTrial | Free-trial signal when found |
| badges | Badge signals such as verified |
| detailText | Optional long detail text |
| imageUrls | Logo, screenshot, or OpenGraph image URLs |
| installLinks | Install, setup, pricing, or getting-started links |
| setupLinks | Docs, support, website, or guide links |
| sourceUrl | Page where the listing was discovered |
| scrapedAt | Timestamp of extraction |

How much does it cost to scrape GitHub Marketplace?

The actor uses pay-per-event pricing.

  • A small start event is charged once per run: $0.005.
  • A per-listing event is charged for each saved dataset item. The BRONZE tier costs $0.000051062 per listing (about $0.05 per 1,000 listings); higher-volume plans pay a lower per-listing rate.
  • Detail-page extraction costs more compute time because it opens each listing page, but it does not change the event type.

Keep maxItems low for first tests, then increase it when the output matches your workflow.
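
The pricing above reduces to simple arithmetic. A rough sketch, assuming the BRONZE rates quoted above (other tiers will differ) and counting event charges only; the function name estimate_run_cost is illustrative and not part of the actor or the Apify API:

```python
# Rough event-charge estimate for one run under the BRONZE rates quoted above.
# The function name is hypothetical; it is not part of the actor or the Apify API.
START_EVENT_USD = 0.005          # charged once per run
PER_LISTING_USD = 0.000051062    # charged per saved dataset item (BRONZE tier)


def estimate_run_cost(listings: int) -> float:
    """Return approximate event charges in USD for a run that saves `listings` items."""
    if listings < 0:
        raise ValueError("listings must be non-negative")
    return START_EVENT_USD + listings * PER_LISTING_USD


if __name__ == "__main__":
    for n in (50, 1_000, 10_000):
        print(f"{n:>6} listings: ${estimate_run_cost(n):.4f}")
```

For small test runs the start event dominates the total; for large runs the per-listing charge does.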

Input options

You can drive the actor with URLs, searches, categories, or a mix.

Start URLs

Use startUrls when you already know the Marketplace pages to scrape.

Examples:

[
    { "url": "https://github.com/marketplace?type=apps" },
    { "url": "https://github.com/marketplace?type=actions&query=security" },
    { "url": "https://github.com/marketplace/render" }
]
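
If you generate many start URLs, it can help to build them in code rather than by hand. The sketch below assumes only the URL shape seen in the examples above (a type parameter plus an optional query parameter); the helper name marketplace_url is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

MARKETPLACE_BASE = "https://github.com/marketplace"


def marketplace_url(listing_type: str, query: Optional[str] = None) -> str:
    """Build a Marketplace browse/search URL in the shape used by the examples above."""
    if listing_type not in ("apps", "actions"):
        raise ValueError("listing_type must be 'apps' or 'actions'")
    params = {"type": listing_type}
    if query:
        params["query"] = query  # urlencode escapes spaces and special characters
    return f"{MARKETPLACE_BASE}?{urlencode(params)}"


# Build a startUrls array for two GitHub Actions searches.
start_urls = [{"url": marketplace_url("actions", q)} for q in ("deploy", "test")]
```

The resulting start_urls list can be passed as the startUrls input field.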

Search queries

Use searchQueries for keyword discovery.

["security", "ci", "monitoring"]

Categories

Use categories for Marketplace category slugs.

["continuous-integration", "code-quality"]

Listing type

Set type to either:

  • apps
  • actions

Maximum listings

maxItems limits the number of records saved.

Open detail pages

Set includeDetails to true when you need long descriptions, categories, and public setup/resource links.

Example input: app vendor discovery

{
    "searchQueries": ["security", "code review", "monitoring"],
    "type": "apps",
    "maxItems": 50,
    "includeDetails": true
}

Example input: GitHub Actions research

{
    "startUrls": [
        { "url": "https://github.com/marketplace?type=actions&query=deploy" },
        { "url": "https://github.com/marketplace?type=actions&query=test" }
    ],
    "maxItems": 100,
    "includeDetails": false
}
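
When inputs are assembled programmatically, a small builder can catch mistakes before a run is started. This is a sketch only: the field names come from the input options above, but the helper build_input and its validation rules (at least one source, a positive maxItems) are assumptions made here, not checks the actor is documented to perform.

```python
from typing import Any, Dict, Optional, Sequence


def build_input(
    search_queries: Sequence[str] = (),
    categories: Sequence[str] = (),
    start_urls: Sequence[str] = (),
    listing_type: Optional[str] = None,
    max_items: int = 25,
    include_details: bool = False,
) -> Dict[str, Any]:
    """Assemble an input object using the field names described above."""
    # Assumed rule: a run needs at least one source of listings.
    if not (search_queries or categories or start_urls):
        raise ValueError("provide at least one search query, category, or start URL")
    if listing_type is not None and listing_type not in ("apps", "actions"):
        raise ValueError("type must be 'apps' or 'actions'")
    if max_items < 1:
        raise ValueError("max_items must be at least 1")

    run_input: Dict[str, Any] = {"maxItems": max_items, "includeDetails": include_details}
    if search_queries:
        run_input["searchQueries"] = list(search_queries)
    if categories:
        run_input["categories"] = list(categories)
    if start_urls:
        run_input["startUrls"] = [{"url": url} for url in start_urls]
    if listing_type:
        run_input["type"] = listing_type
    return run_input
```

For example, build_input(search_queries=["security"], listing_type="apps") yields an object that can be passed as the run input.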

Example output

{
    "name": "Render",
    "slug": "render",
    "marketplaceUrl": "https://github.com/marketplace/render",
    "listingType": "App",
    "developer": null,
    "description": "Continuous integration and deploys with Render, the modern cloud for ambitious developers",
    "categories": [],
    "pricingSignals": [],
    "freeTrial": false,
    "badges": [],
    "detailText": null,
    "imageUrls": ["https://avatars.githubusercontent.com/ml/4937?s=400&v=4"],
    "installLinks": [],
    "setupLinks": [],
    "sourceUrl": "https://github.com/marketplace?type=apps",
    "scrapedAt": "2026-06-29T02:09:06.842Z"
}

Tips for best results

  • Start with one or two search terms.
  • Use includeDetails=false for broad monitoring.
  • Use includeDetails=true for lead qualification.
  • Mix app and action URLs if you need both ecosystems.
  • Keep searches specific: security, ci, deploy, code review, or observability.

Integrations

Common workflows include:

  • Export to Google Sheets for vendor review.
  • Send records to a CRM as developer-tool leads.
  • Store daily snapshots in a database for monitoring.
  • Join Marketplace listings with GitHub organization or website enrichment.
  • Feed records into competitive-intelligence dashboards.
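
For spreadsheet-style exports, array fields need to be flattened into single cells. A minimal sketch, assuming items shaped like the example output above; the helper name items_to_csv, the column selection, and the "; " separator are choices made here for illustration:

```python
import csv
import io
from typing import Any, Dict, Iterable

# Columns to export; array fields such as pricingSignals are joined into one cell.
COLUMNS = [
    "name", "slug", "marketplaceUrl", "listingType",
    "description", "pricingSignals", "badges", "scrapedAt",
]


def items_to_csv(items: Iterable[Dict[str, Any]]) -> str:
    """Flatten dataset items into CSV text suitable for a spreadsheet import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for item in items:
        row = {}
        for column in COLUMNS:
            value = item.get(column)
            if isinstance(value, list):
                value = "; ".join(str(v) for v in value)
            # Missing or null fields become empty cells.
            row[column] = "" if value is None else value
        writer.writerow(row)
    return buffer.getvalue()
```

The returned text can be written to a file and imported into Google Sheets or loaded into a database staging table.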

API usage with Node.js

import { ApifyClient } from 'apify-client';

// Read the API token from the environment instead of hard-coding it.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/github-marketplace-scraper').call({
    searchQueries: ['security'],
    type: 'apps',
    maxItems: 25,
    includeDetails: false,
});

// Fetch the saved listings from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

API usage with Python

from apify_client import ApifyClient

# Replace the placeholder with your Apify API token.
client = ApifyClient('MY_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/github-marketplace-scraper').call(run_input={
    'searchQueries': ['security'],
    'type': 'apps',
    'maxItems': 25,
    'includeDetails': False,
})

# Fetch the saved listings from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)

API usage with cURL

curl -X POST "https://api.apify.com/v2/acts/automation-lab~github-marketplace-scraper/runs?token=$APIFY_TOKEN" \
    -H 'Content-Type: application/json' \
    -d '{"searchQueries":["security"],"type":"apps","maxItems":25,"includeDetails":false}'

MCP usage

Use this actor from Claude Desktop, Claude Code, or other MCP clients through Apify MCP.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/github-marketplace-scraper

Claude Code setup:

claude mcp add apify-github-marketplace "https://mcp.apify.com/?tools=automation-lab/github-marketplace-scraper"

Claude Desktop JSON config:

{
    "mcpServers": {
        "apify-github-marketplace": {
            "url": "https://mcp.apify.com/?tools=automation-lab/github-marketplace-scraper"
        }
    }
}

Example prompts:

  • "Scrape GitHub Marketplace security apps and summarize pricing signals."
  • "Find GitHub Actions related to deployment and return their Marketplace URLs."
  • "Monitor GitHub Marketplace CI tools and highlight new vendors."

Data quality notes

GitHub Marketplace pages are public, but the visible HTML can change.

The actor is designed to parse stable page signals first: listing links, names, descriptions, type labels, badges, images, and public links.

When GitHub does not expose a field on a page, the actor returns null or an empty array rather than inventing data.

Legality

This actor extracts publicly available GitHub Marketplace pages.

You are responsible for using the data in compliance with GitHub terms, privacy law, and your own obligations.

Do not use scraped data for spam, abuse, credential attacks, or unwanted outreach.

FAQ

Can this scraper access private GitHub Marketplace data?

No. It only reads public pages and does not log in.

Troubleshooting

Why did I get fewer results than expected?

GitHub may show a limited number of listings for a query or category. Add more specific start URLs or search queries.

Why are detail fields empty?

Set includeDetails=true to visit each listing detail page. Broad listing runs keep it off by default for speed and cost control.

Why are pricing signals not exact prices?

The actor extracts visible text signals such as free, paid, or pricing. Marketplace billing details vary by vendor, and exact prices may only be available on external vendor pages.

Changelog

0.1

Initial private build for GitHub Marketplace app/action extraction.

Support

If a page stops parsing correctly, open an issue with:

  • The exact input JSON
  • A sample Marketplace URL
  • Expected fields
  • A run URL if available

Limits

The actor does not log in to GitHub.

It does not install apps or actions.

It does not scrape private repositories.

It does not present text signals as official vendor pricing; they reflect only what GitHub visibly shows on the listing.

Monitoring workflow

For recurring monitoring, run the actor daily or weekly with the same input and compare records by slug and marketplaceUrl.

Useful comparison fields include description, pricingSignals, badges, categories, and scrapedAt.
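
The comparison step can be sketched as a diff between two snapshots keyed by slug. This is an illustration under stated assumptions, not part of the actor: the helper names are hypothetical, and scrapedAt is left out of the compared fields because it changes on every run (it is more useful for recording when a change was observed).

```python
from typing import Any, Dict, Iterable, List

# Fields compared between runs; scrapedAt is excluded because it always differs.
COMPARE_FIELDS = ("description", "pricingSignals", "badges", "categories")


def index_by_slug(items: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Key items by slug; later duplicates overwrite earlier ones."""
    return {item["slug"]: item for item in items if item.get("slug")}


def diff_snapshots(
    previous: Iterable[Dict[str, Any]], current: Iterable[Dict[str, Any]]
) -> Dict[str, Any]:
    """Report listings added, removed, or changed between two runs."""
    old, new = index_by_slug(previous), index_by_slug(current)
    changed: Dict[str, List[str]] = {}
    for slug in sorted(set(old) & set(new)):
        fields = [f for f in COMPARE_FIELDS if old[slug].get(f) != new[slug].get(f)]
        if fields:
            changed[slug] = fields
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": changed,
    }
```

Keying by slug also deduplicates listings that were discovered from more than one source page within the same run.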

Lead generation workflow

For lead generation, combine Marketplace data with company websites, GitHub organizations, LinkedIn research, or CRM enrichment.

Use includeDetails=true for smaller lead lists where setup links and detail text matter.
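
One way to prepare listings for website or CRM enrichment is to pull vendor domains out of the public links. The sketch below assumes setupLinks and installLinks hold plain URL strings; the example output above only shows empty arrays, so the real shape may differ. The helper name vendor_domains is hypothetical.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse


def vendor_domains(item: Dict[str, Any]) -> List[str]:
    """Collect non-GitHub hostnames from a listing's public links.

    Assumes each link is a plain URL string; other shapes are skipped.
    """
    links = (item.get("setupLinks") or []) + (item.get("installLinks") or [])
    domains = set()
    for link in links:
        if not isinstance(link, str):
            continue
        host = (urlparse(link).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Skip GitHub's own pages; they do not identify the vendor's website.
        if host and host != "github.com" and not host.endswith(".github.com"):
            domains.add(host)
    return sorted(domains)
```

The resulting domains can serve as join keys against company or CRM records.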

Competitive intelligence workflow

For competitive intelligence, run category searches over time.

Track changes in visible positioning, badge signals, listing descriptions, and new entrants in relevant categories.