GitHub Scraper - Repos, Stars, Issues & Profiles
Pricing
$5.00 / 1,000 results scraped
Scrape GitHub repositories, profiles, and code without authentication. Extract repo stats (stars, forks, issues, PRs), README content, commit history, contributor lists, and file trees. Search by topic, language, or stars. Export to JSON/CSV.
Developer: CryptoSignals Agent
GitHub Scraper — Repos, Users, Profiles & Organizations
Extract structured data from GitHub — no API key needed. Search repositories, discover developers, analyze organizations, and get detailed repo information. Export results to JSON, CSV, Excel, or connect via Zapier / Make.com integration.
Why Use This GitHub Scraper?
GitHub hosts over 100 million developers and 300+ million repositories. Whether you're doing competitive analysis, recruiting developers, researching technologies, or building datasets — this scraper gives you structured data from GitHub's public API without authentication.
No API key needed. No GitHub developer account or OAuth tokens required. Just configure your input, run the actor, and download structured data.
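To illustrate what "no API key" means in practice, here is a minimal sketch of the kind of unauthenticated request the actor can issue. The endpoint is GitHub's public repository-search route; the `build_search_url` helper is hypothetical, not part of the actor:

```python
from urllib.parse import urlencode

# Hypothetical helper: build an unauthenticated GitHub search request.
# No token or Authorization header is attached -- GitHub's public REST
# API accepts anonymous requests, at a lower rate limit.
def build_search_url(query, language=None, per_page=30):
    q = query if language is None else f"{query} language:{language}"
    params = urlencode({"q": q, "per_page": per_page})
    return f"https://api.github.com/search/repositories?{params}"

url = build_search_url("machine learning", language="python")
# e.g. https://api.github.com/search/repositories?q=machine+learning+language%3Apython&per_page=30
```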
Features
- Search repositories by keyword, topic, or technology with language filtering
- Search users by keyword, location, or expertise
- User profiles — complete developer profiles with top repositories
- Repository details — full metadata including contributors and README excerpts
- Organization repos — list all public repositories for any GitHub org
- No API key needed — uses GitHub's public REST API
- JSON & CSV export — download results in JSON, CSV, Excel, XML, or RSS
- Zapier / Make.com integration — connect to 5,000+ apps via webhooks
- Smart rate limiting — automatic delays and retries to stay within API limits
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| action | string | Yes | search-repos | Action to perform (see table below) |
| query | string | Depends | — | Search query, username, or org name |
| url | string | No | — | GitHub URL (overrides query for profile/repo actions) |
| maxItems | integer | No | 30 | Maximum results to return (1–500) |
| language | string | No | — | Filter by programming language (e.g. python, rust) |
Action Types
| Action | Description | Query Example |
|---|---|---|
| search-repos | Search repositories by keyword | "machine learning framework" |
| search-users | Search users/developers | "location:Berlin language:python" |
| user-profile | Get user profile + top repos | "torvalds" or URL |
| repo-details | Full repo details + contributors | "python/cpython" or URL |
| org-repos | All repos for an organization | "google" or URL |
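The five actions map naturally onto public GitHub REST routes. The mapping below is illustrative (the actor's internals are not published), but the endpoints themselves are real GitHub API routes:

```python
# Illustrative mapping from actor actions to the GitHub REST endpoints
# they plausibly correspond to. The actor's internals are an assumption;
# the endpoint paths are real GitHub API routes.
ENDPOINTS = {
    "search-repos": "/search/repositories?q={query}",
    "search-users": "/search/users?q={query}",
    "user-profile": "/users/{query}",
    "repo-details": "/repos/{query}",      # query is "owner/name"
    "org-repos": "/orgs/{query}/repos",
}

def endpoint_for(action, query):
    return "https://api.github.com" + ENDPOINTS[action].format(query=query)
```

For example, `endpoint_for("repo-details", "python/cpython")` resolves to `https://api.github.com/repos/python/cpython`.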
Example Input
```json
{
  "action": "search-repos",
  "query": "machine learning",
  "language": "python",
  "maxItems": 50
}
```
Output Format
Repository Search Result
```json
{
  "name": "tensorflow/tensorflow",
  "url": "https://github.com/tensorflow/tensorflow",
  "description": "An Open Source Machine Learning Framework for Everyone",
  "stars": 187000,
  "forks": 74200,
  "language": "C++",
  "topics": ["machine-learning", "deep-learning", "tensorflow"],
  "open_issues": 2100,
  "last_updated": "2026-03-20T10:30:00Z",
  "created_at": "2015-11-07T01:19:32Z"
}
```
User Profile Result
```json
{
  "login": "torvalds",
  "name": "Linus Torvalds",
  "bio": null,
  "public_repos": 7,
  "followers": 220000,
  "following": 0,
  "company": "Linux Foundation",
  "location": "Portland, OR",
  "url": "https://github.com/torvalds",
  "top_repos": [
    {
      "name": "linux",
      "stars": 180000,
      "language": "C",
      "description": "Linux kernel source tree"
    }
  ]
}
```
How to Use with Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Search for Python machine learning repositories
run = client.actor("cryptosignals/github-scraper").call(run_input={
    "action": "search-repos",
    "query": "machine learning",
    "language": "python",
    "maxItems": 20,
})

for repo in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{repo['name']} — {repo['stars']} stars — {repo.get('language', 'N/A')}")
```

```python
# Get all repos for an organization
run = client.actor("cryptosignals/github-scraper").call(run_input={
    "action": "org-repos",
    "query": "google",
    "maxItems": 100,
})

for repo in client.dataset(run["defaultDatasetId"]).iterate_items():
    # description can be null, so fall back to an empty string
    print(f"{repo['name']} — {(repo.get('description') or '')[:60]}")
```
Use Cases
- Developer recruiting — Find developers by location, language, and contribution history
- Competitive analysis — Track competitor open-source projects, stars, and contributor growth
- Technology research — Discover trending libraries and frameworks in any language
- Academic research — Build datasets of repositories for software engineering studies
- Organization audit — List all public repos for a company and their activity levels
- Talent mapping — Identify active contributors in specific technology ecosystems
Working Around Rate Limits
GitHub's public API allows 60 requests per hour without authentication. This scraper handles rate limiting automatically with built-in delays and retries, but for large-scale scraping (hundreds of repos or users), you may hit limits.
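The built-in handling can be sketched as follows. This is a hypothetical illustration of the delay-and-retry behaviour described above, based on GitHub's real `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers; the helper itself is not part of the actor:

```python
import time

# Sketch: inspect GitHub's rate-limit headers and decide how long to
# sleep before the next request. X-RateLimit-Reset is a Unix timestamp
# marking when the quota window resets.
def wait_if_rate_limited(headers, now=None):
    """Return how many seconds to sleep before the next request."""
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    now = time.time() if now is None else now
    return max(0.0, reset_at - now)

# With 0 calls left and a reset 30 s in the future, sleep ~30 s:
# wait_if_rate_limited({"X-RateLimit-Remaining": "0",
#                       "X-RateLimit-Reset": "1030"}, now=1000.0)  # -> 30.0
```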
For higher throughput, use residential proxies to distribute requests across multiple IPs. ThorData offers residential proxies that work well with GitHub scraping — configure them in the actor's proxy settings to avoid rate limit blocks.
Integrations
Connect this actor to your existing tools:
- Google Sheets — Export results directly to a spreadsheet
- Zapier / Make.com — Trigger workflows when new repos match your criteria
- Slack — Get notifications when new repositories appear in your search
- API — Call the actor programmatically from any language
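For the API route, a run can be started with a plain HTTP POST to Apify's v2 endpoint, where the actor ID is written as `owner~name` in the URL. A minimal sketch (the `run_actor_request` helper is hypothetical; the endpoint format is Apify's documented API v2 convention):

```python
import json
from urllib.parse import quote, urlencode

# Hypothetical helper: assemble the Apify API v2 "run actor" request.
# POST the returned body to the returned URL with
# Content-Type: application/json to start a run.
def run_actor_request(actor_id, token, run_input):
    url = (
        "https://api.apify.com/v2/acts/"
        + quote(actor_id.replace("/", "~"))
        + "/runs?"
        + urlencode({"token": token})
    )
    return url, json.dumps(run_input)

url, body = run_actor_request(
    "cryptosignals/github-scraper", "YOUR_API_TOKEN",
    {"action": "search-repos", "query": "machine learning"},
)
```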
FAQ
Is this legal? Yes. This scraper only accesses GitHub's public REST API, the same API available to any developer. It respects rate limits and only collects publicly available data.
Do I need a GitHub account? No. The scraper uses unauthenticated API access. No GitHub account, API key, or OAuth token is needed.
How many results can I get? Up to 500 results per run. GitHub's search API returns a maximum of 1,000 results per query — the scraper paginates automatically up to your maxItems limit.
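The pagination described above can be sketched as a simple page plan: GitHub's search API serves at most 100 items per page and 1,000 items per query, so a run fetches full pages until maxItems is covered, truncating the last page. The `page_plan` helper is illustrative, not the actor's actual code:

```python
# Sketch of the pagination logic: yield (page_number, items_to_take)
# pairs until maxItems (capped at GitHub's 1,000-result limit) is met.
def page_plan(max_items, per_page=100, hard_cap=1000):
    target = min(max_items, hard_cap)
    page = 1
    while target > 0:
        take = min(per_page, target)
        yield page, take
        target -= take
        page += 1

# list(page_plan(250)) -> [(1, 100), (2, 100), (3, 50)]
```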