GitHub Repository Scraper — Stars, Forks, Languages & More
Pricing
Pay per usage
Scrape GitHub repository data using the REST API v3. Get stars, forks, languages, topics, contributors, releases. Search repos by keyword. Perfect for tech stack analysis and competitive intelligence. $0.002/repo.
Scrape GitHub repository metadata at scale using the official REST API v3. Get structured data on stars, forks, languages, topics, contributors, releases, and more — ready for lead generation, market research, and competitive analysis.
What it does
- Scrape specific repos — provide a list of `owner/repo` strings
- Search GitHub — use any GitHub search syntax to discover repos
- Enrich with extras — optionally fetch top contributors and latest releases
- Handle rate limits — automatic back-off with `X-RateLimit` headers; optional token support for 5,000 req/hr
Output fields
| Field | Type | Description |
|---|---|---|
| `owner` | string | Repository owner login |
| `name` | string | Repository name |
| `fullName` | string | `owner/name` |
| `url` | string | GitHub URL |
| `description` | string | Repository description |
| `stars` | int | Stargazer count |
| `forks` | int | Fork count |
| `openIssues` | int | Open issue count |
| `language` | string | Primary language |
| `languages` | object | All languages with byte counts |
| `topics` | array | Topic tags |
| `createdAt` | string | ISO 8601 creation date |
| `updatedAt` | string | Last update date |
| `pushedAt` | string | Last push date |
| `license` | string | SPDX license identifier |
| `isArchived` | bool | Whether the repo is archived |
| `isFork` | bool | Whether the repo is a fork |
| `defaultBranch` | string | Default branch name |
| `size` | int | Repository size in KB |
| `watchers` | int | Watcher/subscriber count |
| `homepage` | string | Homepage URL |
| `topContributors` | array | Top 30 contributors (opt-in) |
| `latestReleases` | array | Last 5 releases (opt-in) |
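To make the schema concrete, here is what a single dataset item might look like. Field names follow the table above; all values are illustrative, not real data.

```python
# Illustrative dataset item. Field names come from the output schema;
# the values below are made up for demonstration purposes only.
sample_item = {
    "owner": "apify",
    "name": "crawlee",
    "fullName": "apify/crawlee",
    "url": "https://github.com/apify/crawlee",
    "description": "A web scraping and browser automation library.",
    "stars": 15000,
    "forks": 600,
    "openIssues": 120,
    "language": "TypeScript",
    "languages": {"TypeScript": 2500000, "JavaScript": 40000},
    "topics": ["web-scraping", "crawler"],
    "createdAt": "2018-09-27T00:00:00Z",
    "updatedAt": "2024-01-01T00:00:00Z",
    "pushedAt": "2024-01-01T00:00:00Z",
    "license": "Apache-2.0",
    "isArchived": False,
    "isFork": False,
    "defaultBranch": "master",
    "size": 12345,
    "watchers": 200,
    "homepage": "https://crawlee.dev",
}

# fullName is always the "<owner>/<name>" pair
assert sample_item["fullName"] == f"{sample_item['owner']}/{sample_item['name']}"
```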
Input examples
Scrape specific repositories
```json
{
  "repos": ["apify/crawlee", "microsoft/playwright", "facebook/react"]
}
```
Search for Python web scraping tools
```json
{
  "searchQuery": "web scraping language:python stars:>100",
  "maxRepos": 50
}
```
Full enrichment with auth
```json
{
  "repos": ["vercel/next.js"],
  "searchQuery": "framework language:typescript stars:>5000",
  "maxRepos": 100,
  "includeContributors": true,
  "includeReleases": true,
  "githubToken": "ghp_xxxxxxxxxxxx"
}
```
Rate limits
| Mode | Requests/hour | Repos/run (approx) |
|---|---|---|
| No token | 60 | ~20–30 (2–3 API calls per repo) |
| With token | 5,000 | ~1,500–2,500 |
Tip: Create a free personal access token (no scopes needed for public repos) to unlock 5,000 requests/hour.
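The actor's exact back-off logic isn't published, but rate-limit-aware GitHub clients typically derive a sleep interval from the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers. A minimal sketch of that idea (the function name and margin are my own, not the actor's code):

```python
def backoff_seconds(remaining: int, reset_epoch: int, now_epoch: int) -> float:
    """Seconds to sleep before the next GitHub API call.

    remaining:   value of the X-RateLimit-Remaining header
    reset_epoch: value of the X-RateLimit-Reset header (Unix time
                 when the quota window resets)
    now_epoch:   current Unix time
    """
    if remaining > 0:
        return 0.0  # quota left, no need to wait
    # Quota exhausted: wait until the window resets, plus a 1 s margin.
    return max(0.0, reset_epoch - now_epoch) + 1.0

# With quota remaining there is no wait...
assert backoff_seconds(remaining=10, reset_epoch=1_700_000_000, now_epoch=1_699_999_000) == 0.0
# ...and with an exhausted quota we sleep until the reset time.
assert backoff_seconds(remaining=0, reset_epoch=1_700_000_000, now_epoch=1_699_999_000) == 1001.0
```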
Pricing
$0.002 per repository scraped (pay per event).
Cost comparison
| Repos | This actor | GitHub API (your infra) | Manual research |
|---|---|---|---|
| 10 | $0.02 | Free + your time | ~30 min |
| 100 | $0.20 | Free + your time | ~5 hours |
| 500 | $1.00 | Free + your time | ~2 days |
| 1,000 | $2.00 | Free + your time | ~1 week |
You pay for data, not infrastructure. No servers to maintain, no code to write, no rate limits to handle.
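At the flat per-event rate, run costs are trivial to estimate before you start. A quick sketch reproducing the table above:

```python
PRICE_PER_REPO = 0.002  # USD per repository, the pay-per-event rate above

def estimated_cost(repos: int) -> float:
    """Estimated run cost in USD for scraping `repos` repositories."""
    return round(repos * PRICE_PER_REPO, 2)

assert estimated_cost(100) == 0.20
assert estimated_cost(1_000) == 2.00
```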
Use cases
- Lead generation — Find companies using specific technologies, contact repo owners
- Competitive analysis — Track competitor open-source projects, compare stars/forks growth
- Technology research — Discover trending tools in any language or domain
- Talent sourcing — Identify top contributors to relevant projects
- Investment research — Gauge open-source traction for developer tools companies
- Academic research — Collect repository metadata for software engineering studies
- Dependency auditing — Assess health (activity, issues, releases) of your dependencies
Technical details
- Uses GitHub REST API v3 (`api.github.com`)
- Automatic rate-limit detection and back-off via `X-RateLimit-*` headers
- No browser or proxy needed — pure API calls
- Async execution with `httpx` for fast throughput
- Outputs clean, structured JSON to the Apify dataset
🔗 More Scrapers by Ken Digital
| Scraper | What it does | Price |
|---|---|---|
| YouTube Channel Scraper | Videos, stats, metadata | $0.001/video |
| France Job Scraper | WTTJ + France Travail + Hellowork | $0.005/job |
| France Real Estate Scraper | 5 sources + DVF price analysis | $0.008/listing |
| Website Content Crawler | HTML → Markdown for AI/RAG | $0.001/page |
| Google Trends Scraper | Keywords, regions, related queries | $0.002/keyword |
| GitHub Repo Scraper | Stars, forks, languages, topics | $0.002/repo |
| RSS News Aggregator | Multi-source feed parsing | $0.0005/article |
| Instagram Profile Scraper | Followers, bio, posts | $0.0015/profile |
| Google Maps Scraper | Businesses, reviews, contacts | $0.002/result |
| TikTok Scraper | Videos, likes, shares | $0.001/video |
| Google SERP Scraper | Search results, PAA, snippets | $0.003/search |
| Trustpilot Scraper | Reviews, ratings, sentiment | $0.001/review |
🔗 Quick Integration
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("joyouscam35875/github-repo-scraper").call(run_input={...})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
```
Node.js
```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('joyouscam35875/github-repo-scraper').call({...});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
```
No-code: Make / Zapier / n8n
Search for this actor in the Apify connector. No code needed.