Pricing

from $20.00 / 1,000 results

🐙 GitHub Scraper — Repos, Stars & Code Data

Extract repo data from GitHub — stars, forks, contributors, languages, issues & READMEs. Build developer tools, open source analytics & technology trend trackers. Pay per repo.

Rating: 0.0 (0)

Developer: NexGenData

Last modified: 3 days ago

🐙 GitHub Scraper — Repos, Stars, Forks, Contributors & Topic Discovery

Pay-per-result GitHub scraper that extracts full repo metadata: stargazer counts, fork graphs, contributor lists, languages, topics, license, README content, and release cadence. Built for VC scouts, devtool marketers, OSS-funded analytics, and competitive intelligence. It is positioned as an alternative, with no rate limit for the end user, to GitHub's REST API (5,000 requests/hour authenticated cap), GraphQL v4 (point quota), GitHub Archive on BigQuery (storage cost), Octoverse-style reports, and SaaS aggregators like Sourcegraph Cloud ($299+/mo) and OpenSauced.

Why GitHub Scraper Beats the GitHub REST API, GraphQL & Sourcegraph

Feature	NexGenData GitHub Scraper	GitHub REST API	GitHub GraphQL v4	Sourcegraph Cloud
Cost	$0.002 / repo, pay-per-result	Free + rate-limited	Free + point quota	$299-1000+ / month
Rate limit	None for end user	5,000 req/hr (authenticated)	5,000 points/hr	Plan-dependent
Auth	Apify token	GitHub PAT (often hits cap)	GitHub PAT	Account + plan
Bulk export	Direct dataset → JSON/CSV/Excel	Per-call REST	Per-query GraphQL	UI + limited API
Contributor enrichment	Yes	Multi-call required	Multi-query	Yes
Topic / trending discovery	Yes	Limited	Limited	Yes
Free trial	Free Apify credits on signup	Free for low volume	Free for low volume	30-day trial

VC scouts, devtool marketers, and OSS-funded analysts pick this actor instead of rolling their own PAT-rotation rig to dodge GitHub's 5000-req/hr limit. It is a drop-in alternative to GitHub's API for "I just need to pull 50K repos by topic + their stargazer-history slope" — the kind of query you cannot do in one GitHub API call.

What You Get Per Repo

Each dataset item is one JSON record per repo. Most fields are scalars; a few, such as top_contributors and releases, are arrays:

owner, name, full_name, html_url, description
stars, forks, watchers, open_issues, closed_issues, open_prs, closed_prs
primary_language, languages_breakdown (% by bytes)
topics, license_spdx, default_branch
created_at, updated_at, pushed_at, last_release_at
contributors_count, top_contributors — array of {login, contributions}
releases — array of {tag_name, published_at, prerelease, download_count}
commit_activity_52w — weekly commit counts
star_history — sampled time series
readme_text, has_wiki, has_pages, archived, disabled
funding_links — Open Collective, GitHub Sponsors, Patreon, etc.
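
A minimal sketch of how one such record might be consumed in Python. The sample values are made up, and the shape of languages_breakdown (assumed here to be a mapping from language name to percentage) is an assumption rather than documented behavior; summarize_repo is a hypothetical helper, not part of the actor.

```python
from typing import Any


def summarize_repo(item: dict[str, Any]) -> str:
    """Build a one-line summary from a single dataset record."""
    # Assumed shape: {"Rust": 91.5, "Shell": 8.5}, percentages by bytes.
    languages = item.get("languages_breakdown") or {}
    if languages:
        main_language = max(languages, key=languages.get)
    else:
        main_language = item.get("primary_language")
    # top_contributors is listed above as an array of {login, contributions}.
    contributors = item.get("top_contributors") or []
    first_listed = contributors[0]["login"] if contributors else "n/a"
    return (
        f'{item["full_name"]}: {item["stars"]} stars, '
        f"mostly {main_language}, first listed contributor {first_listed}"
    )


# Hypothetical record for illustration only.
sample = {
    "full_name": "example-org/example-repo",
    "stars": 1200,
    "primary_language": "Rust",
    "languages_breakdown": {"Rust": 91.5, "Shell": 8.5},
    "top_contributors": [{"login": "alice", "contributions": 340}],
}
print(summarize_repo(sample))
```

Using .get() with fallbacks keeps the helper tolerant of runs where contributor enrichment was switched off.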

Use Cases

VC devtool scouting — surface fast-growing OSS repos in your thesis (LLM tooling, observability, dev experience) without burning GitHub PAT quota
Developer-tool marketers — find repos using your competitor's library to retarget
OSS funding programs — score grant candidates on commit cadence, contributor diversity, and adoption signals
Hiring teams — discover top contributors in a language / topic
Investor due-diligence — verify a startup's claim of "X stars in Y months" with raw star-history data
Competitive intel — track release cadence + open-issue backlog of a competitor's OSS
Newsletters / Substacks — automate "top 10 LLM repos this month" content
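
For the due-diligence and scouting cases above, a growth rate can be derived from star_history. The sketch below assumes each sample is an object with an ISO date and a cumulative stars count; the listing only says "sampled time series", so the field names date and stars are hypothetical and should be checked against real output.

```python
from datetime import date


def stars_per_month(star_history: list[dict]) -> float:
    """Average stars gained per 30 days between the first and last sample."""
    if len(star_history) < 2:
        raise ValueError("need at least two samples")
    # Sort by date so the order of samples in the record does not matter.
    samples = sorted(star_history, key=lambda s: s["date"])
    first, last = samples[0], samples[-1]
    days = (date.fromisoformat(last["date"]) - date.fromisoformat(first["date"])).days
    if days <= 0:
        raise ValueError("samples must span at least one day")
    return (last["stars"] - first["stars"]) / days * 30


# Made-up history: 3,000 stars gained over 60 days.
history = [
    {"date": "2024-01-01", "stars": 1000},
    {"date": "2024-03-01", "stars": 4000},
]
growth = stars_per_month(history)
print(growth)
```

A claim of "X stars in Y months" can then be compared against this figure over the same window.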

Quick Start

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the actor and wait for the run to finish.
run = client.actor("nexgendata/github-scraper").call(run_input={
    "queries": ["topic:llm stars:>500", "language:rust stars:>1000"],  # GitHub search syntax
    "includeContributors": True,
    "includeReleases": True,
    "maxResults": 5000,
})

# Stream the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["full_name"], item["stars"], item["primary_language"])
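
Once the items are loaded, a common next step is ranking them locally. The following is a sketch only: top_by_language is a hypothetical helper, and the sample items are invented stand-ins for what iterate_items() would yield.

```python
from collections import defaultdict


def top_by_language(items, n=3):
    """Group records by primary_language and keep the n most-starred repos in each."""
    groups = defaultdict(list)
    for item in items:
        # Repos without a detected language are grouped under "unknown".
        groups[item.get("primary_language") or "unknown"].append(item)
    return {
        language: [
            repo["full_name"]
            for repo in sorted(repos, key=lambda r: r["stars"], reverse=True)[:n]
        ]
        for language, repos in groups.items()
    }


# Invented records standing in for dataset items.
sample_items = [
    {"full_name": "a/x", "stars": 50, "primary_language": "Rust"},
    {"full_name": "b/y", "stars": 900, "primary_language": "Rust"},
    {"full_name": "c/z", "stars": 10, "primary_language": None},
]
ranked = top_by_language(sample_items, n=1)
print(ranked)
```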

Pricing

Pay-per-event — no PAT quota, no monthly minimum.

Actor Start: $0.0001
Per repo: $0.002
Per contributor enrichment: $0.0005

A 5000-repo topic sweep with contributors costs about $10-15: the repos alone come to 5,000 × $0.002 = $10, and contributor enrichment accounts for the rest. The equivalent in GitHub-API terms requires hours of PAT rotation and exponential backoff.
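
The arithmetic behind that estimate can be sketched with the three event prices listed above. One assumption is labeled in the code: the listing does not say whether contributor enrichment is billed once per repo or once per contributor, so the example bills it once per repo.

```python
from decimal import Decimal

# Event prices as listed in this section.
ACTOR_START = Decimal("0.0001")
PER_REPO = Decimal("0.002")
PER_CONTRIBUTOR_ENRICHMENT = Decimal("0.0005")


def estimate_cost(repos: int, enrichments: int = 0, starts: int = 1) -> Decimal:
    """Estimated USD cost of a run from its event counts."""
    return (
        starts * ACTOR_START
        + repos * PER_REPO
        + enrichments * PER_CONTRIBUTOR_ENRICHMENT
    )


# Assumption: one enrichment event per repo. If billing is per contributor,
# pass the total number of contributors enriched instead.
sweep_cost = estimate_cost(repos=5000, enrichments=5000)
print(sweep_cost)
```

Decimal is used instead of float so that fractions of a cent add up exactly.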

Related Actors

Use case	Actor
Daily / weekly trending repos	GitHub Trending Scraper
Deep stargazer-history + analytics	GitHub Repo Stats
GitLab projects + MRs	GitLab Scraper
Docker Hub image pull counts	Docker Hub Scraper
npm package download stats	npm Package Stats
PyPI package download stats	PyPI Package Stats
Dev.to articles + dev audience	Dev.to Scraper
Developer-tools MCP server	Developer Tools MCP Server

FAQ

Q: Why not just use the GitHub API directly? Two reasons: (1) the 5000 req/hr cap throttles anything over ~3000 repos with contributor enrichment, and (2) star-history requires sampling the stargazer event stream — a paginated multi-call dance that's painful in any GraphQL implementation.
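
To make the rate-limit point concrete, here is a rough back-of-the-envelope sketch. The calls_per_repo figure is a hypothetical input (the real number depends on pagination and which endpoints you hit); only the 5,000 requests/hour cap comes from the text above.

```python
def hours_at_cap(repos: int, calls_per_repo: int, cap_per_hour: int = 5000) -> int:
    """Whole rate-limit windows a single token needs to finish the job."""
    total_calls = repos * calls_per_repo
    # Integer ceiling division: a partly used hour still counts as an hour.
    return (total_calls + cap_per_hour - 1) // cap_per_hour


# Hypothetical workload: 50,000 repos at 3 API calls each
# (for example repo metadata, contributors, releases).
hours_needed = hours_at_cap(repos=50_000, calls_per_repo=3)
print(hours_needed)
```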

Q: Do you respect GitHub's TOS? We use unauthenticated public-page extraction plus optional authenticated API calls when the user provides a PAT. All data we extract is publicly available without login.

Q: Can I scrape private repos? No — public data only.

Q: How fresh are stargazer counts? Live per run. Star history is sampled at points along the repo's life.

Q: What about commit-level data? This actor stops at repo-level metadata. For commit-diff content extraction, layer on github-repo-stats, which deep-dives into individual repos.

Q: Can I filter by topic or language? Yes — queries accepts any GitHub search syntax (topic:, language:, stars:, created:, pushed:, etc.).
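
Query strings can be assembled programmatically before being passed to queries. The helper below is a hypothetical convenience, not part of the actor; it only concatenates standard GitHub search qualifiers.

```python
def build_query(*, topic=None, language=None, min_stars=None, pushed_after=None) -> str:
    """Join the supplied filters into a GitHub search query string."""
    parts = []
    if topic:
        parts.append(f"topic:{topic}")
    if language:
        parts.append(f"language:{language}")
    if min_stars is not None:
        parts.append(f"stars:>{min_stars}")
    if pushed_after:
        # Expects an ISO date string such as "2024-01-01".
        parts.append(f"pushed:>{pushed_after}")
    return " ".join(parts)


# Reproduces the two queries from the Quick Start.
queries = [
    build_query(topic="llm", min_stars=500),
    build_query(language="rust", min_stars=1000),
]
print(queries)
```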

About NexGenData

NexGenData publishes 260+ buyer-intent actors covering SEC filings, YC alumni, lead generation, competitive intelligence, stock fundamentals across 30+ exchanges, and more. All pay-per-result. Browse the full catalog at https://apify.com/nexgendata?fpr=2ayu9b

How NexGenData Pricing Works

Every NexGenData actor uses pay-per-event pricing — you only pay for results that actually land in your dataset. No monthly minimum, no seat fees, no surprise overage bills.

Actor Start: a single-event charge each time you spin the actor up (scaled to memory size)
Result / item: charged per item written to the default dataset
No charge for retries, internal proxy rotation, or failed sub-requests — those are absorbed by the platform

Apify Platform Bonus

New to Apify? Sign up with the NexGenData referral link — you get free platform credits on signup (enough for several thousand free results) and you help fund the maintenance of this actor fleet.

Integration Surface

Every actor in the NexGenData catalog can be triggered from:

Apify console — point-and-click run
Apify API — REST + webhooks
Apify Python / JS SDKs — programmatic batch
Zapier, Make.com, n8n — official integrations
MCP — many actors are exposed as MCP tools for Claude / ChatGPT / Cursor agents
Schedules — built-in cron for daily / weekly / monthly runs
Webhooks — POST results to any HTTPS endpoint on dataset write
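
As an illustration of the REST route, the sketch below builds (but does not send) a request to start a run through the Apify API using only the standard library. The actor name and input fields are taken from the Quick Start; build_run_request is a hypothetical helper, and the endpoint shape should be checked against the current Apify API reference.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST request that starts an actor run."""
    # The API addresses actors as "username~actor-name", so swap the slash.
    api_actor_id = actor_id.replace("/", "~")
    return urllib.request.Request(
        f"{API_BASE}/acts/{api_actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_run_request(
    "nexgendata/github-scraper",
    "YOUR_APIFY_TOKEN",
    {"queries": ["topic:llm stars:>500"], "maxResults": 100},
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
print(request.full_url)
```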

Support

NexGenData maintains 260+ Apify actors and ships updates regularly. Bug reports via the Apify console issues tab get a response within 24 hours. Roadmap requests are welcome — high-demand features ship in the next version.

Home: thenextgennexus.com
Full catalog: apify.com/nexgendata
