GitHub User Scraper — Profiles, Repos & Orgs
Pricing
Pay per event
Fetch GitHub user or organisation metadata via the GitHub REST API: name, bio, company, location, blog, public repo count, follower count, and optionally the user's most recently updated public repositories. Export to JSON or CSV. The GitHub REST API is free to call, and an optional token raises the rate limit.
Developer: DevilScrapes (maintained by Community)
🎯 What this scrapes
Every GitHub user and organisation has a public profile page — and a matching REST API endpoint. This Actor ingests a list of usernames or org slugs, fans them out in parallel, and returns one structured row per profile: display name, bio, company, location, blog URL, public repo count, follower and following counts, hireable flag, account-creation timestamp, and an optional public-repos sub-array. Organisations and personal accounts share the same output shape; the type field tells them apart.
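The parallel fan-out described above can be sketched with a bounded worker pool. This is a minimal illustration, assuming an asyncio-based design; `fetch_all` and `fetch_one` are hypothetical names, not the Actor's actual code, and the default of 6 simply mirrors the `concurrency` input.

```python
import asyncio


async def fetch_all(logins, fetch_one, concurrency=6):
    """Run fetch_one(login) for every login, with at most `concurrency` in flight.

    `fetch_one` is a caller-supplied coroutine function (hypothetical here).
    Results come back in the same order as `logins`.
    """
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(login):
        # The semaphore blocks once `concurrency` fetches are already running.
        async with semaphore:
            return await fetch_one(login)

    return await asyncio.gather(*(guarded(login) for login in logins))
```

Because `asyncio.gather` preserves input order, each result lines up with the login that produced it, even though the requests finish in arbitrary order.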
🔥 Features
- 🛡️ Browser fingerprint rotation: `curl-cffi` replays real Chrome / Firefox / Safari TLS handshakes so requests look like a browser, not a Python script.
- 🌐 Proxy rotation via Apify Proxy: fresh session and exit IP on every block; residential pool available on paid plans.
- 🔁 Retries with exponential backoff: up to 5 attempts per request on `408 / 429 / 5xx`, `Retry-After` honoured.
- 🧱 Rate-limit-aware pacing: when the target pushes back, we slow down and surface a clear status message; you never get a silent empty dataset.
- 🧊 Clean, typed rows: Pydantic-validated output, ISO-8601 timestamps, stable `login` IDs, direct JSON / CSV / Excel export from Apify Console.
- 💰 Pay-Per-Event pricing: you pay only for rows that reach your dataset. No data, no charge (minus the small `actor-start` warm-up fee).
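The retry behaviour listed above can be approximated as follows. This is a sketch under stated assumptions, not the Actor's implementation: the 1-second base delay and 60-second cap are made-up values, and only the numeric form of `Retry-After` is handled.

```python
from typing import Optional

MAX_ATTEMPTS = 5                 # from the feature list
RETRYABLE_STATUSES = {408, 429}  # plus any 5xx, checked below


def is_retryable(status: int) -> bool:
    """True for 408, 429, and every 5xx status code."""
    return status in RETRYABLE_STATUSES or 500 <= status <= 599


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based).

    A numeric Retry-After header takes priority. Otherwise the delay
    doubles each attempt (base, 2*base, 4*base, ...) up to `cap`.
    `base` and `cap` are assumed values for illustration.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not parsed in this sketch
    return min(cap, base * 2 ** (attempt - 1))
```

With these defaults, attempts 1 to 5 wait 1, 2, 4, 8, and 16 seconds unless the server supplies its own `Retry-After` value.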
💡 Use cases
- Technical recruiting: find developers whose bio mentions a framework and whose `hireable` flag is `true`; enrich a sourcer's shortlist at scale.
- B2B lead gen: pull GitHub orgs in a target geography, filter by `public_repos > N`, then enrich with the org's website or LinkedIn.
- ATS enrichment: pipe candidate GitHub profile URLs through this Actor to append repo count, follower count, and company to your applicant-tracker rows.
- Developer-OSINT: map a pseudonymous developer's public identity, including linked Twitter handle, declared company, registered location, and account-creation date.
- DevRel contributor intelligence: rank contributors to a project by follower count and organisation, then prioritise outreach.
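As an illustration of the recruiting filter, here is a small post-processing sketch over exported rows. The function name `shortlist` and the sample rows are hypothetical; only the field names (`bio`, `hireable`, `public_repos`) come from the output schema.

```python
def shortlist(rows, keyword, min_repos=0):
    """Keep profiles that are hireable, mention `keyword` in their bio,
    and have at least `min_repos` public repositories."""
    keyword = keyword.lower()
    return [
        row for row in rows
        if row.get("hireable") is True                 # null and false are both excluded
        and keyword in (row.get("bio") or "").lower()  # bio may be null
        and row.get("public_repos", 0) >= min_repos
    ]


# Made-up rows for illustration only.
sample_rows = [
    {"login": "alice-example", "bio": "Django and FastAPI developer", "hireable": True, "public_repos": 12},
    {"login": "bob-example", "bio": None, "hireable": True, "public_repos": 40},
    {"login": "carol-example", "bio": "Django core contributor", "hireable": None, "public_repos": 90},
]
matches = shortlist(sample_rows, "django")
```

Note that `hireable` is often `null`, so a strict `is True` check will drop many profiles; loosen it if you only want to exclude explicit `false` values.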
⚙️ How to use it
- Click Try for free at the top of the Actor page.
- Paste GitHub usernames (or profile URLs) into the Usernames or org slugs field — one per line.
- Optionally add a GitHub personal access token to raise the rate limit from 60 to 5 000 requests per hour.
- Toggle Include public repos list if you need the repos sub-array.
- Click Start. Results stream into the run's dataset in real time.
- Export from Storage → Dataset as JSON, CSV, or Excel — or pull via the Apify API.
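For the API route in the last step, the sketch below builds the download URL for a run's dataset. It assumes the Apify API v2 dataset-items endpoint; `dataset_items_url` is a hypothetical helper and the dataset ID is a placeholder. Depending on your account's access settings, the request may also need your Apify API token.

```python
from urllib.parse import quote, urlencode

APIFY_API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the URL that returns a dataset's items in the given format
    (for example "json", "csv", or "xlsx")."""
    query = urlencode({"format": fmt})
    return f"{APIFY_API_BASE}/datasets/{quote(dataset_id, safe='')}/items?{query}"


# "YOUR_DATASET_ID" is a placeholder; use the ID shown on the run's Storage tab.
csv_url = dataset_items_url("YOUR_DATASET_ID", "csv")
```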
📥 Input
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `usernames` | array | yes | `["torvalds", "apify"]` | GitHub usernames or org slugs, one per line. Full profile URLs are accepted; the host is stripped automatically. |
| `githubToken` | string | no | none | Personal access token. Unauthenticated: 60 req/hr. With token: 5 000 req/hr. |
| `includeRepos` | boolean | no | `false` | Appends a `repos` array (up to `maxReposPerUser` most-recently-updated public repos). One extra API call per user. |
| `maxReposPerUser` | integer | no | `30` | Cap on repos returned per user when `includeRepos` is true. Hard ceiling: 100. |
| `concurrency` | integer | no | `6` | Parallel API requests in flight at once. |
| `proxyConfiguration` | object | no | `{"useApifyProxy": false}` | Apify Proxy config. Enable residential proxies for large-volume runs. |
Example input
    {
        "usernames": ["apify", "torvalds"],
        "includeRepos": true,
        "maxReposPerUser": 10,
        "concurrency": 4,
        "proxyConfiguration": {"useApifyProxy": false}
    }
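If you assemble the `usernames` array yourself, a pre-cleaning step like the one below keeps the input tidy. It is a rough sketch of URL-to-login normalisation, not the Actor's own parsing logic; `normalize_login` and `build_usernames` are hypothetical helpers.

```python
from urllib.parse import urlparse


def normalize_login(value: str) -> str:
    """Reduce a username, @handle, or profile URL to the bare login."""
    value = value.strip()
    if "://" in value:
        value = urlparse(value).path            # drops scheme, host, and query
    elif value.lower().startswith("github.com/"):
        value = value[len("github.com"):]       # scheme-less URL
    # First path segment is the login; anything after it (a repo name) is ignored.
    return value.strip("/").split("/")[0].lstrip("@")


def build_usernames(values):
    """Normalise and de-duplicate, preserving first-seen order.
    GitHub logins are case-insensitive, so comparison is done in lower case."""
    seen, result = set(), []
    for raw in values:
        login = normalize_login(raw)
        if login and login.lower() not in seen:
            seen.add(login.lower())
            result.append(login)
    return result
```

De-duplicating before the run also avoids paying for the same profile twice.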
📤 Output
One dataset item per GitHub profile.
| Field | Type | Notes |
|---|---|---|
| `login` | string | GitHub username or org slug. |
| `type` | string | `User` or `Organization`. |
| `name` | string \| null | Display name. |
| `company` | string \| null | Self-declared company. |
| `blog` | string \| null | Profile blog or homepage URL. |
| `location` | string \| null | Self-declared location. |
| `email` | string \| null | Public email; null when the user keeps it private. |
| `bio` | string \| null | Profile bio. |
| `twitter_username` | string \| null | Linked X (Twitter) handle. |
| `public_repos` | integer | Public repository count. |
| `public_gists` | integer | Public gist count. |
| `followers` | integer | Follower count. |
| `following` | integer | Following count. |
| `html_url` | string | Profile URL. |
| `avatar_url` | string | Avatar image URL. |
| `hireable` | boolean \| null | Self-declared hireable flag. |
| `created_at` | string | Account creation timestamp (ISO-8601). |
| `updated_at` | string | Last profile-update timestamp (ISO-8601). |
| `repos` | array \| null | Public repos summary when `includeRepos` is true. |
| `scraped_at` | string | When this row was written (ISO-8601). |
Example output
    {
        "login": "apify",
        "type": "Organization",
        "name": "Apify",
        "company": null,
        "blog": "https://apify.com",
        "location": "Prague, Czechia",
        "email": null,
        "bio": "The full-stack web scraping and browser automation platform.",
        "public_repos": 412,
        "followers": 856,
        "hireable": null,
        "html_url": "https://github.com/apify",
        "created_at": "2012-09-21T08:07:25Z",
        "scraped_at": "2025-11-01T10:30:00Z"
    }
💰 Pricing
Pay-Per-Event — you pay only when these events fire:
| Event | USD | What it covers |
|---|---|---|
| `actor-start` | $0.005 | One-off warm-up charge per run |
| `result` | $0.003 | Per dataset item written |
1 000 profiles at the rates above ≈ $3.00. No subscription, no minimum commitment, no card required to start — every new Apify account gets $5 of free credit.
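The arithmetic behind that estimate, as a small sketch. `estimate_cost` is a hypothetical helper; the two rates are taken from the table above, and `Decimal` is used to avoid floating-point rounding on currency values.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.005")  # charged once per run
RESULT_USD = Decimal("0.003")       # charged per dataset item written


def estimate_cost(rows: int, runs: int = 1) -> Decimal:
    """Estimated charge in USD for `rows` dataset items spread over `runs` runs."""
    return ACTOR_START_USD * runs + RESULT_USD * rows


cost_1000 = estimate_cost(1000)  # 0.005 + 1000 * 0.003 = 3.005
```

Splitting the same 1 000 profiles across ten runs adds nine more `actor-start` charges, so batching usernames into fewer runs is slightly cheaper.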
🚧 Limitations
- Only the `/users/{login}` REST endpoint is called, plus one extra call per user for the repos list when `includeRepos` is enabled. Contribution graphs, sponsor tiers, starred repos, and commit activity are not included in this Actor.
- Private-org membership and private repos require OAuth scopes that are outside what a read-only personal token provides; those fields are out of scope.
- GitHub's unauthenticated rate limit is 60 requests per hour per IP. For bulk runs of 500+ users, supply a GitHub token or enable Apify Proxy to spread requests across IPs.
- Data freshness reflects whatever GitHub returns at run time. Cached or recently-deleted accounts may return stale or empty rows.
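To see why a token matters for bulk runs, the sketch below estimates how many hourly rate-limit windows a run consumes from a single IP. The helper names are hypothetical, and the model is deliberately simple: it ignores retries and proxy IP rotation, and it counts two requests per user when `includeRepos` is on, per the input table.

```python
import math

UNAUTHENTICATED_PER_HOUR = 60
AUTHENTICATED_PER_HOUR = 5000


def requests_needed(users: int, include_repos: bool = False) -> int:
    """One profile call per user, plus one repos call when include_repos is set."""
    return users * (2 if include_repos else 1)


def hours_of_quota(users: int, include_repos: bool = False,
                   has_token: bool = False) -> int:
    """Number of hourly rate-limit windows the run needs from a single IP."""
    limit = AUTHENTICATED_PER_HOUR if has_token else UNAUTHENTICATED_PER_HOUR
    return math.ceil(requests_needed(users, include_repos) / limit)
```

Under these assumptions, 500 users take 9 hourly windows without a token but fit in a single window with one.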
❓ FAQ
Is this a GitHub profile scraper as well as a user scraper?
Yes — the same Actor handles both personal user accounts and organisation profiles. The type field in each row tells you which you got (User vs Organization). If you search for "github profile scraper" this is the tool.
What about the GitHub contributor API: can I get contribution data?
GitHub's public REST API does not expose a user's contribution graph or commit-activity heatmap. This Actor returns what the /users/{login} endpoint publishes: bio, follower count, repo count, hireable flag, and (optionally) the public-repos list. For commit-level data you need the repository contributors endpoint, which is a separate scrape.
Why is email null for most profiles?
GitHub hides email addresses by default. We surface exactly what the public API returns; we do not probe commit metadata or decode noreply addresses.
Are GitHub organisations supported?
Yes — org slugs and personal usernames use the same endpoint and produce the same row shape. The type field distinguishes them.
Does this respect GitHub's ToS and GDPR?
We request only publicly published profile fields via the official REST API — no login walls, no scraping of private content. Whether your downstream use of the data complies with GDPR or GitHub's Acceptable Use Policy is your legal call; we surface the data, you own the compliance obligation.
Can I run this at scale for recruiting?
With a GitHub token you get 5 000 API requests per hour, enough for a few thousand profiles per run. We handle rate-limit responses and back off automatically; you set the concurrency and we do the rest.
💬 Your feedback
Spotted a bug, hit a weird edge case, or need an extra field? Open an issue on the Actor's Issues tab in Apify Console — we ship fixes weekly and read every report.