GitHub Scraper
Pricing
Pay per usage
Extract GitHub user profiles and repositories via the official GitHub API. Build developer lead lists for tech recruiting, B2B prospecting and market research. Clean structured data (JSON/CSV).
Rating
0.0
(0)
Developer
Joao Paulo
Maintained by Community
Actor stats
Bookmarked: 0
Total users: 2
Monthly active users: 1
Last modified: 15 hours ago
Extract GitHub user profiles and repositories at scale through the official GitHub REST API. Build developer lead lists, power tech-recruiting pipelines, and run market research. Output is clean rows with no HTML parsing, so nothing breaks when GitHub's page layout changes.
What it does
This GitHub Scraper queries the official GitHub REST API (https://api.github.com) and returns structured data in four modes:
- Search users: find developer profiles by query (e.g. `location:berlin language:python`) and enrich each with full profile details.
- User profiles: fetch complete profiles for a list of usernames.
- User repositories: list all repositories owned by given usernames.
- Search repositories: find repos by query (e.g. `machine-learning stars:>1000`).
Results are flattened into clean, spreadsheet-ready rows and stored in the dataset (export to JSON, CSV, Excel, or API).
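The four modes line up naturally with four public GitHub REST endpoints. The sketch below shows one plausible mapping. The helper name `build_url` and its parameters are illustrative assumptions, not the actor's actual internals; only the endpoint paths come from the GitHub REST API.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://api.github.com"


def build_url(mode: str, value: str, per_page: int = 100, page: int = 1) -> str:
    """Return a GitHub REST URL for one mode (hypothetical helper).

    `value` is a search query for the search_* modes and a username otherwise.
    """
    paging = {"per_page": per_page, "page": page}
    if mode == "search_users":
        return f"{API_ROOT}/search/users?" + urlencode({"q": value, **paging})
    if mode == "search_repos":
        return f"{API_ROOT}/search/repositories?" + urlencode({"q": value, **paging})
    if mode == "user_profile":
        # A single profile is one object, so no paging parameters are needed.
        return f"{API_ROOT}/users/{quote(value, safe='')}"
    if mode == "user_repos":
        return f"{API_ROOT}/users/{quote(value, safe='')}/repos?" + urlencode(paging)
    raise ValueError(f"unknown mode: {mode!r}")


print(build_url("search_users", "location:berlin language:python"))
print(build_url("user_repos", "torvalds"))
```

Search qualifiers such as `location:` and `stars:>1000` travel inside the single `q` parameter, which is why the query string is URL-encoded as a whole.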
Features
- Four scraping modes covering both GitHub users and GitHub repositories.
- Optional GitHub token support — raises the rate limit from 60 to 5000 requests/hour.
- Automatic pagination across result pages.
- Rate-limit aware: reads `X-RateLimit-Remaining` and stops gracefully instead of crashing.
- Profile enrichment in user search (name, company, location, public email, bio, followers).
- `maxItems` cap to control run size and cost.
- Polite request pacing with retries on transient errors.
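The pagination, `maxItems` cap, and rate-limit features above can be pictured as a single loop. This is a minimal sketch under stated assumptions, not the actor's code: `paginate` and `fetch_page` are hypothetical names, and `fetch_page` stands in for a real HTTP call so the example runs without network access. Pacing and retries are left out for brevity.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# fetch_page(page) -> (items, response_headers); a stand-in for a real HTTP call.
FetchPage = Callable[[int], Tuple[List[dict], Dict[str, str]]]


def paginate(fetch_page: FetchPage, max_items: int = 1000) -> Iterator[dict]:
    """Yield items page by page until results, item budget, or rate limit run out."""
    page, yielded = 1, 0
    while yielded < max_items:
        items, headers = fetch_page(page)
        if not items:
            return  # no more result pages
        for item in items:
            yield item
            yielded += 1
            if yielded >= max_items:
                return  # maxItems cap reached
        # GitHub reports the remaining request budget in this response header.
        # A plain dict is case-sensitive; real HTTP clients usually are not.
        if int(headers.get("X-RateLimit-Remaining", "1")) <= 0:
            return  # stop gracefully instead of sending a request that would fail
        page += 1


def fake_fetch(page: int) -> Tuple[List[dict], Dict[str, str]]:
    """Two canned pages; the budget is exhausted after the second one."""
    data = {1: [{"login": "a"}, {"login": "b"}], 2: [{"login": "c"}]}
    remaining = "1" if page == 1 else "0"
    return data.get(page, []), {"X-RateLimit-Remaining": remaining}


print([row["login"] for row in paginate(fake_fetch, max_items=10)])
```

Writing the loop as a generator lets the caller stop consuming early, and the rows already yielded are kept when the rate limit ends the run.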
Input
| Field | Type | Description |
|---|---|---|
| `mode` | enum | `search_users`, `user_profile`, `user_repos`, or `search_repos`. |
| `query` | string | Search query for `search_users` / `search_repos`. Supports GitHub search qualifiers. |
| `usernames` | array | List of GitHub usernames for `user_profile` / `user_repos`. |
| `githubToken` | string (secret) | Optional. A read-only token raises the rate limit to 5000 req/h. |
| `maxItems` | integer | Maximum rows to return. Default 1000. |
Example input
{"mode": "search_users","query": "location:berlin language:python","maxItems": 500}
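To make the field rules in the table concrete, here is a hedged sketch of how such an input could be validated and how the optional token could become request headers. `validate_input` and `build_headers` are hypothetical helpers written for illustration; the field names and the default of 1000 come from the table above, and the header values follow GitHub's documented conventions.

```python
from typing import Dict, Optional

SEARCH_MODES = {"search_users", "search_repos"}
USERNAME_MODES = {"user_profile", "user_repos"}


def validate_input(raw: dict) -> dict:
    """Check the fields from the input table and fill in defaults (illustrative)."""
    mode = raw.get("mode")
    if mode not in SEARCH_MODES | USERNAME_MODES:
        raise ValueError(f"mode must be one of {sorted(SEARCH_MODES | USERNAME_MODES)}")
    if mode in SEARCH_MODES and not raw.get("query"):
        raise ValueError(f"query is required for mode {mode}")
    if mode in USERNAME_MODES and not raw.get("usernames"):
        raise ValueError(f"usernames is required for mode {mode}")
    max_items = raw.get("maxItems", 1000)  # default taken from the input table
    if not isinstance(max_items, int) or max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    return {**raw, "maxItems": max_items}


def build_headers(token: Optional[str] = None) -> Dict[str, str]:
    """Build request headers; the token is sent only when one is supplied."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


checked = validate_input(
    {"mode": "search_users", "query": "location:berlin language:python", "maxItems": 500}
)
print(checked["mode"], checked["maxItems"])
print(build_headers())
```

Rejecting a bad input before the first request avoids spending rate-limit budget on a run that cannot succeed.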
Output
User row (lead-gen use case)
{"login": "torvalds","name": "Linus Torvalds","company": "Linux Foundation","location": "Portland, OR","email": null,"blog": null,"bio": null,"followers": 200000,"following": 0,"publicRepos": 8,"profileUrl": "https://github.com/torvalds","createdAt": "2011-09-03T15:26:22Z"}
Repository row
{"fullName": "facebook/react","description": "The library for web and native user interfaces.","language": "JavaScript","stars": 225000,"forks": 46000,"openIssues": 700,"topics": ["react", "javascript", "frontend", "ui"],"htmlUrl": "https://github.com/facebook/react","updatedAt": "2026-06-28T10:00:00Z","ownerLogin": "facebook"}
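The row shapes above differ from GitHub's raw API objects mainly in key naming and nesting. The sketch below shows one way the flattening could work; `to_user_row` and `to_repo_row` are hypothetical names, the input keys (`public_repos`, `stargazers_count`, and so on) are those of the GitHub REST API, and the output keys are the ones in the sample rows.

```python
def to_user_row(user: dict) -> dict:
    """Flatten a GitHub user object into the user row shape (illustrative)."""
    return {
        "login": user.get("login"),
        "name": user.get("name"),
        "company": user.get("company"),
        "location": user.get("location"),
        "email": user.get("email"),
        "blog": user.get("blog"),
        "bio": user.get("bio"),
        "followers": user.get("followers"),
        "following": user.get("following"),
        "publicRepos": user.get("public_repos"),
        "profileUrl": user.get("html_url"),
        "createdAt": user.get("created_at"),
    }


def to_repo_row(repo: dict) -> dict:
    """Flatten a GitHub repository object into the repository row shape."""
    owner = repo.get("owner") or {}  # nested object in the API response
    return {
        "fullName": repo.get("full_name"),
        "description": repo.get("description"),
        "language": repo.get("language"),
        "stars": repo.get("stargazers_count"),
        "forks": repo.get("forks_count"),
        "openIssues": repo.get("open_issues_count"),
        "topics": repo.get("topics", []),
        "htmlUrl": repo.get("html_url"),
        "updatedAt": repo.get("updated_at"),
        "ownerLogin": owner.get("login"),
    }


sample = {"full_name": "facebook/react", "stargazers_count": 225000, "owner": {"login": "facebook"}}
row = to_repo_row(sample)
print(row["fullName"], row["stars"], row["ownerLogin"])
```

Using `.get` throughout means a missing field becomes `null` in the exported row instead of raising an error, which matches the `null` values in the sample user row.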
Use cases
- Developer lead generation & tech recruiting — source candidates by location, language, and activity, complete with public contact details.
- B2B prospecting — find companies and maintainers behind popular tools and libraries.
- Market & ecosystem research — track repositories, languages, stars, and trends in any technology niche.
- AI training data — collect structured datasets of profiles and repositories for analysis or model training.
Why this actor
- Official GitHub REST API — stable, documented, and compliant. No fragile HTML scraping that breaks on UI changes.
- Higher limits when you need them — drop in an optional token to go from 60 to 5000 requests/hour.
- Clean, ready-to-use output — flattened rows that drop straight into your CRM, ATS, or spreadsheet.
Search terms: GitHub scraper, GitHub API, scrape GitHub users, developer leads, GitHub profiles, tech recruiting data.