GitHub Repository Scraper — Stars, Forks, Languages & More

Scrape GitHub repository data using the REST API v3. Get stars, forks, languages, topics, contributors, releases. Search repos by keyword. Perfect for tech stack analysis and competitive intelligence. $0.002/repo.

Developer

Ken Digital

Maintained by Community

Scrape GitHub repository metadata at scale using the official REST API v3. Get structured data on stars, forks, languages, topics, contributors, releases, and more — ready for lead generation, market research, and competitive analysis.

What it does

  • Scrape specific repos — provide a list of owner/repo strings
  • Search GitHub — use any GitHub search syntax to discover repos
  • Enrich with extras — optionally fetch top contributors and latest releases
  • Handle rate limits — automatic back-off with X-RateLimit headers; optional token support for 5,000 req/hr

Output fields

| Field | Type | Description |
|---|---|---|
| owner | string | Repository owner login |
| name | string | Repository name |
| fullName | string | owner/name |
| url | string | GitHub URL |
| description | string | Repository description |
| stars | int | Stargazer count |
| forks | int | Fork count |
| openIssues | int | Open issue count |
| language | string | Primary language |
| languages | object | All languages with byte counts |
| topics | array | Topic tags |
| createdAt | string | ISO 8601 creation date |
| updatedAt | string | Last update date |
| pushedAt | string | Last push date |
| license | string | SPDX license identifier |
| isArchived | bool | Whether the repo is archived |
| isFork | bool | Whether the repo is a fork |
| defaultBranch | string | Default branch name |
| size | int | Repository size in KB |
| watchers | int | Watcher/subscriber count |
| homepage | string | Homepage URL |
| topContributors | array | Top 30 contributors (opt-in) |
| latestReleases | array | Last 5 releases (opt-in) |
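
To make the field list concrete, here is a minimal sketch of how a GitHub REST `GET /repos/{owner}/{repo}` payload could be mapped onto these output fields. The function name `to_output_record` is hypothetical and the mapping is an assumption, not the actor's actual code. In particular, `watchers` is taken from `subscribers_count` on the assumption that this matches the "Watcher/subscriber count" description.

```python
from typing import Any


def to_output_record(repo: dict[str, Any], languages: dict[str, int]) -> dict[str, Any]:
    """Map a GitHub repository payload to the output fields listed above.

    `repo` is the JSON body of GET /repos/{owner}/{repo};
    `languages` is the JSON body of GET /repos/{owner}/{repo}/languages.
    Hypothetical helper: the actor's real mapping may differ.
    """
    license_info = repo.get("license") or {}  # "license" is null for unlicensed repos
    return {
        "owner": repo["owner"]["login"],
        "name": repo["name"],
        "fullName": repo["full_name"],
        "url": repo["html_url"],
        "description": repo.get("description"),
        "stars": repo.get("stargazers_count", 0),
        "forks": repo.get("forks_count", 0),
        "openIssues": repo.get("open_issues_count", 0),
        "language": repo.get("language"),
        "languages": languages,
        "topics": repo.get("topics", []),
        "createdAt": repo.get("created_at"),
        "updatedAt": repo.get("updated_at"),
        "pushedAt": repo.get("pushed_at"),
        "license": license_info.get("spdx_id"),
        "isArchived": repo.get("archived", False),
        "isFork": repo.get("fork", False),
        "defaultBranch": repo.get("default_branch"),
        "size": repo.get("size", 0),
        # Assumption: "watchers" means subscribers, not the legacy watchers_count.
        "watchers": repo.get("subscribers_count", 0),
        "homepage": repo.get("homepage"),
    }


# Invented sample payload, used only to exercise the mapping.
sample_repo = {
    "owner": {"login": "example-owner"},
    "name": "example-repo",
    "full_name": "example-owner/example-repo",
    "html_url": "https://github.com/example-owner/example-repo",
    "stargazers_count": 12,
    "license": None,
    "archived": False,
    "fork": True,
}
record = to_output_record(sample_repo, {"Python": 2048})
```

Fields missing from the payload fall back to `None`, `0`, `False`, or an empty list, so every record has the same keys.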

Input examples

Scrape specific repositories

{
  "repos": ["apify/crawlee", "microsoft/playwright", "facebook/react"]
}

Search for Python web scraping tools

{
  "searchQuery": "web scraping language:python stars:>100",
  "maxRepos": 50
}
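
For orientation, the sketch below shows how a `searchQuery` and `maxRepos` pair like the one above could translate into a request against GitHub's repository search endpoint. The helper `build_search_url` is hypothetical, and how the actor actually paginates is an assumption. The endpoint and its `q`, `per_page`, and `page` parameters are part of the GitHub REST API.

```python
from urllib.parse import urlencode

GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_search_url(search_query: str, max_repos: int, page: int = 1) -> str:
    """Build a repository-search URL for one page of results.

    Hypothetical helper; the actor's own request logic is not shown in this README.
    """
    per_page = min(max_repos, 100)  # GitHub caps per_page at 100
    params = {"q": search_query, "per_page": per_page, "page": page}
    return f"{GITHUB_SEARCH_URL}?{urlencode(params)}"


url = build_search_url("web scraping language:python stars:>100", max_repos=50)
```

Qualifiers such as `language:` and `stars:>` are passed through inside `q` unchanged; only URL encoding is applied.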

Full enrichment with auth

{
  "repos": ["vercel/next.js"],
  "searchQuery": "framework language:typescript stars:>5000",
  "maxRepos": 100,
  "includeContributors": true,
  "includeReleases": true,
  "githubToken": "ghp_xxxxxxxxxxxx"
}

Rate limits

| Mode | Requests/hour | Repos/run (approx.) |
|---|---|---|
| No token | 60 | ~20–30 (2–3 API calls per repo) |
| With token | 5,000 | ~1,500–2,500 |

Tip: Create a free personal access token (no scopes needed for public repos) to unlock 5,000 requests/hour.
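
The approximate repos-per-run figures follow from dividing the hourly request budget by the number of API calls per repository. The sketch below reproduces that arithmetic and shows one way a back-off delay could be derived from GitHub's `X-RateLimit-Remaining` and `X-RateLimit-Reset` headers. Both function names are hypothetical and the back-off policy is an assumption, not the actor's actual implementation.

```python
import time
from typing import Mapping, Optional


def repos_per_hour(requests_per_hour: int, calls_per_repo: int) -> int:
    """Upper bound on repositories that fit into one hourly request budget."""
    return requests_per_hour // calls_per_repo


def seconds_to_wait(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to pause before the next request (0.0 if quota remains).

    X-RateLimit-Reset is a UTC epoch timestamp in seconds. Header names are
    looked up with exact casing here; a real HTTP client usually offers
    case-insensitive header access.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


no_token = (repos_per_hour(60, 3), repos_per_hour(60, 2))        # (20, 30)
with_token = (repos_per_hour(5000, 3), repos_per_hour(5000, 2))  # (1666, 2500)
```

A caller would sleep for `seconds_to_wait(response_headers)` seconds before retrying once the quota is exhausted.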

Pricing

$0.002 per repository scraped (pay per event).

Cost comparison

| Repos | This actor | GitHub API (your infra) | Manual research |
|---|---|---|---|
| 10 | $0.02 | Free + your time | ~30 min |
| 100 | $0.20 | Free + your time | ~5 hours |
| 500 | $1.00 | Free + your time | ~2 days |
| 1,000 | $2.00 | Free + your time | ~1 week |
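
The "This actor" column is simply the per-repository price multiplied by the repository count. A small sketch, using a hypothetical `estimate_cost` helper and `Decimal` to avoid floating-point rounding in currency values:

```python
from decimal import Decimal

PRICE_PER_REPO = Decimal("0.002")  # USD, from the Pricing section


def estimate_cost(repo_count: int) -> Decimal:
    """Estimated charge in USD for scraping `repo_count` repositories."""
    return PRICE_PER_REPO * repo_count


costs = {n: estimate_cost(n) for n in (10, 100, 500, 1000)}
```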

You pay for data, not infrastructure. No servers to maintain, no code to write, no rate-limit handling to build.

Use cases

  • Lead generation — Find companies using specific technologies, contact repo owners
  • Competitive analysis — Track competitor open-source projects, compare stars/forks growth
  • Technology research — Discover trending tools in any language or domain
  • Talent sourcing — Identify top contributors to relevant projects
  • Investment research — Gauge open-source traction for developer tools companies
  • Academic research — Collect repository metadata for software engineering studies
  • Dependency auditing — Assess health (activity, issues, releases) of your dependencies
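
As an illustration of the dependency-auditing use case, the sketch below flags potentially unhealthy repositories from the output fields `isArchived`, `pushedAt`, and `latestReleases`. The function `health_flags`, the flag names, and the 365-day staleness threshold are all assumptions chosen for the example, not part of the actor.

```python
from datetime import datetime, timedelta, timezone
from typing import Any


def health_flags(item: dict[str, Any], now: datetime, stale_after_days: int = 365) -> list[str]:
    """Return warning flags for one dataset item (hypothetical helper).

    `now` must be timezone-aware so it can be compared with `pushedAt`.
    """
    flags = []
    if item.get("isArchived"):
        flags.append("archived")
    pushed_at = item.get("pushedAt")
    if pushed_at:
        # GitHub timestamps end in "Z"; normalise for datetime.fromisoformat.
        pushed = datetime.fromisoformat(pushed_at.replace("Z", "+00:00"))
        if now - pushed > timedelta(days=stale_after_days):
            flags.append("stale")
    # latestReleases is opt-in, so only judge it when the key is present.
    if "latestReleases" in item and not item["latestReleases"]:
        flags.append("no-releases")
    return flags


# Invented sample item, used only to exercise the function.
sample_item = {"isArchived": True, "pushedAt": "2020-01-01T00:00:00Z", "latestReleases": []}
flags = health_flags(sample_item, now=datetime(2024, 1, 1, tzinfo=timezone.utc))
```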

Technical details

  • Uses GitHub REST API v3 (api.github.com)
  • Automatic rate-limit detection and back-off via X-RateLimit-* headers
  • No browser or proxy needed — pure API calls
  • Async execution with httpx for fast throughput
  • Outputs clean, structured JSON to the Apify dataset
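
To illustrate what "pure API calls" can look like, the sketch below lists the GitHub REST endpoints that would cover the output fields for one repository. The endpoints themselves exist in the GitHub REST API, but which ones the actor actually calls, and in what order, is an assumption. `repo_endpoints` is a hypothetical helper.

```python
API_BASE = "https://api.github.com"


def repo_endpoints(
    full_name: str,
    include_contributors: bool = False,
    include_releases: bool = False,
) -> list[str]:
    """Return the API URLs needed for one "owner/repo" string (hypothetical helper)."""
    owner, _, name = full_name.partition("/")
    if not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/repo', got {full_name!r}")
    base = f"{API_BASE}/repos/{owner}/{name}"
    urls = [base, f"{base}/languages"]  # core metadata + language byte counts
    if include_contributors:
        urls.append(f"{base}/contributors?per_page=30")  # top 30 contributors
    if include_releases:
        urls.append(f"{base}/releases?per_page=5")  # last 5 releases
    return urls


urls = repo_endpoints("apify/crawlee", include_contributors=True, include_releases=True)
```

Each optional extra adds one request per repository, which is why enabling them reduces how many repositories fit into the hourly rate limit.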

🔗 More Scrapers by Ken Digital

| Scraper | What it does | Price |
|---|---|---|
| YouTube Channel Scraper | Videos, stats, metadata | $0.001/video |
| France Job Scraper | WTTJ + France Travail + Hellowork | $0.005/job |
| France Real Estate Scraper | 5 sources + DVF price analysis | $0.008/listing |
| Website Content Crawler | HTML → Markdown for AI/RAG | $0.001/page |
| Google Trends Scraper | Keywords, regions, related queries | $0.002/keyword |
| GitHub Repo Scraper | Stars, forks, languages, topics | $0.002/repo |
| RSS News Aggregator | Multi-source feed parsing | $0.0005/article |
| Instagram Profile Scraper | Followers, bio, posts | $0.0015/profile |
| Google Maps Scraper | Businesses, reviews, contacts | $0.002/result |
| TikTok Scraper | Videos, likes, shares | $0.001/video |
| Google SERP Scraper | Search results, PAA, snippets | $0.003/search |
| Trustpilot Scraper | Reviews, ratings, sentiment | $0.001/review |

🔗 Quick Integration

Python

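from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
# Replace {...} with an input object such as those under "Input examples".
run = client.actor("joyouscam35875/github-repo-scraper").call(run_input={...})
# Iterate over the scraped repositories in the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)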

Node.js

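import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });
// Replace {...} with an input object such as those under "Input examples".
const run = await client.actor('joyouscam35875/github-repo-scraper').call({...});
// Fetch the scraped repositories from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();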

No-code: Make / Zapier / n8n

Search for this actor in the Apify connector. No code needed.