GitHub Scraper

Pricing

Pay per usage

Extract GitHub user profiles and repositories via the official GitHub API. Build developer lead lists for tech recruiting, B2B prospecting and market research. Clean structured data (JSON/CSV).

Rating

0.0

(0)

Developer

Joao Paulo

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 15 hours ago

Extract GitHub user profiles and repositories at scale through the official GitHub REST API. Build developer lead lists, power tech-recruiting pipelines, and run market research with clean, structured rows and no HTML parsing that breaks when the GitHub UI changes.

What it does

This GitHub Scraper queries the official GitHub REST API (https://api.github.com) and returns structured data in four modes:

  • Search users — find developer profiles by query (e.g. location:berlin language:python) and enrich each with full profile details.
  • User profiles — fetch complete profiles for a list of usernames.
  • User repositories — list all repositories owned by given usernames.
  • Search repositories — find repos by query (e.g. machine-learning stars:>1000).

Results are flattened into clean, spreadsheet-ready rows and stored in the dataset, which you can export as JSON, CSV, or Excel, or read through the API.
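As a rough illustration of how the four modes could map onto GitHub REST endpoints, here is a minimal sketch. The endpoint paths (`/search/users`, `/search/repositories`, `/users/{username}`, `/users/{username}/repos`) are part of the public GitHub REST API; the function name `build_url` and its parameters are hypothetical and not taken from the actor's source.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://api.github.com"


def build_url(mode: str, value: str, page: int = 1, per_page: int = 100) -> str:
    """Return the REST URL for one request in the given mode (hypothetical helper).

    `value` is a search query for the search modes and a username otherwise.
    """
    paging = {"per_page": per_page, "page": page}
    if mode == "search_users":
        return f"{API_ROOT}/search/users?" + urlencode({"q": value, **paging})
    if mode == "search_repos":
        return f"{API_ROOT}/search/repositories?" + urlencode({"q": value, **paging})
    if mode == "user_profile":
        # A single profile is one object, so no paging parameters are needed.
        return f"{API_ROOT}/users/{quote(value, safe='')}"
    if mode == "user_repos":
        return f"{API_ROOT}/users/{quote(value, safe='')}/repos?" + urlencode(paging)
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    print(build_url("search_users", "location:berlin language:python"))
    print(build_url("user_repos", "torvalds", page=2))
```

Search qualifiers such as `location:berlin` are passed through unchanged inside the `q` parameter; `urlencode` only percent-encodes them for transport.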

Features

  • Four scraping modes covering both GitHub users and GitHub repositories.
  • Optional GitHub token support — raises the rate limit from 60 to 5000 requests/hour.
  • Automatic pagination across result pages.
  • Rate-limit aware: reads X-RateLimit-Remaining and stops gracefully instead of crashing.
  • Profile enrichment in user search (name, company, location, public email, bio, followers).
  • maxItems cap to control run size and cost.
  • Polite request pacing with retries on transient errors.
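The token and rate-limit behaviour listed above can be sketched roughly as follows. This is an assumption about how such logic might look, not the actor's actual code: the helper names (`auth_headers`, `should_stop`, `seconds_until_reset`) and the `reserve` parameter are hypothetical. The `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers and the `Authorization: Bearer` request header are documented by GitHub.

```python
import time
from typing import Mapping, Optional


def auth_headers(token: Optional[str] = None) -> dict:
    """Build request headers; the token is optional, as in the actor input."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def _lowercase_keys(headers: Mapping[str, str]) -> dict:
    # HTTP header names are case-insensitive; normalise so a plain dict works too.
    return {key.lower(): value for key, value in headers.items()}


def should_stop(response_headers: Mapping[str, str], reserve: int = 0) -> bool:
    """Return True when the remaining request budget is at or below `reserve`."""
    remaining = _lowercase_keys(response_headers).get("x-ratelimit-remaining")
    if remaining is None:
        return False  # header missing: nothing to act on
    return int(remaining) <= reserve


def seconds_until_reset(response_headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Seconds until the window resets (X-RateLimit-Reset is epoch seconds)."""
    reset = _lowercase_keys(response_headers).get("x-ratelimit-reset")
    if reset is None:
        return 0.0
    current = time.time() if now is None else now
    return max(0.0, float(reset) - current)


if __name__ == "__main__":
    sample = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1000"}
    print(should_stop(sample), seconds_until_reset(sample, now=940.0))
```

A scraper loop would typically check `should_stop` after each response and end the run with the rows collected so far rather than raising an error.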

Input

Field        Type             Description
mode         enum             search_users, user_profile, user_repos, or search_repos.
query        string           Search query for search_users / search_repos. Supports GitHub search qualifiers.
usernames    array            List of GitHub usernames for user_profile / user_repos.
githubToken  string (secret)  Optional. A read-only token raises the rate limit to 5000 req/h.
maxItems     integer          Maximum rows to return. Default 1000.
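To show how a maxItems cap could interact with pagination, here is a minimal sketch. It assumes a page size of 100, the largest `per_page` value the GitHub REST API accepts. The function `plan_pages` is hypothetical and only illustrates the arithmetic; the actor's real paging logic is not documented here.

```python
from typing import List, Tuple


def plan_pages(max_items: int = 1000, per_page: int = 100) -> List[Tuple[int, int]]:
    """Return (page_number, rows_to_keep) pairs that add up to at most max_items."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    plan = []
    remaining = max_items
    page = 1
    while remaining > 0:
        keep = min(per_page, remaining)  # the last page may be only partly used
        plan.append((page, keep))
        remaining -= keep
        page += 1
    return plan


if __name__ == "__main__":
    print(plan_pages(250))
```

With `maxItems` set to 250, this plan requests three pages and keeps only the first 50 rows of the third. A real run would also stop early when a page comes back with fewer rows than requested.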

Example input

{
  "mode": "search_users",
  "query": "location:berlin language:python",
  "maxItems": 500
}
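Before starting a run, it can help to check an input object against the field table. The sketch below is one possible check, assuming that search modes require `query` and username modes require `usernames`. The function `validate_input` is hypothetical, and the actor's own validation rules may differ.

```python
SEARCH_MODES = {"search_users", "search_repos"}
USERNAME_MODES = {"user_profile", "user_repos"}
DEFAULT_MAX_ITEMS = 1000  # default stated in the input table


def validate_input(actor_input: dict) -> dict:
    """Return a copy of the input with maxItems filled in, or raise ValueError."""
    mode = actor_input.get("mode")
    if mode not in SEARCH_MODES | USERNAME_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    if mode in SEARCH_MODES and not str(actor_input.get("query") or "").strip():
        raise ValueError(f"mode {mode} needs a non-empty 'query'")
    if mode in USERNAME_MODES and not actor_input.get("usernames"):
        raise ValueError(f"mode {mode} needs a non-empty 'usernames' list")
    max_items = actor_input.get("maxItems", DEFAULT_MAX_ITEMS)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 1:
        raise ValueError("'maxItems' must be a positive integer")
    return {**actor_input, "maxItems": max_items}


if __name__ == "__main__":
    print(validate_input({"mode": "user_profile", "usernames": ["torvalds"]}))
```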

Output

User row (lead-gen use case)

{
  "login": "torvalds",
  "name": "Linus Torvalds",
  "company": "Linux Foundation",
  "location": "Portland, OR",
  "email": null,
  "blog": null,
  "bio": null,
  "followers": 200000,
  "following": 0,
  "publicRepos": 8,
  "profileUrl": "https://github.com/torvalds",
  "createdAt": "2011-09-03T15:26:22Z"
}
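The user row above appears to be a renamed subset of the object GitHub returns from `GET /users/{username}`. A minimal sketch of that flattening follows. The snake_case source fields (`public_repos`, `html_url`, `created_at`) are real GitHub API fields. The function `flatten_user` and the choice to turn empty strings into null are assumptions, not the actor's confirmed behaviour.

```python
from typing import Any, Optional


def _none_if_blank(value: Any) -> Optional[Any]:
    # Assumption: blank strings (GitHub returns "" for an unset blog) become null.
    if isinstance(value, str) and not value.strip():
        return None
    return value


def flatten_user(raw: dict) -> dict:
    """Map a raw GitHub user object onto the row shape shown above."""
    return {
        "login": raw.get("login"),
        "name": _none_if_blank(raw.get("name")),
        "company": _none_if_blank(raw.get("company")),
        "location": _none_if_blank(raw.get("location")),
        "email": _none_if_blank(raw.get("email")),
        "blog": _none_if_blank(raw.get("blog")),
        "bio": _none_if_blank(raw.get("bio")),
        "followers": raw.get("followers"),
        "following": raw.get("following"),
        "publicRepos": raw.get("public_repos"),
        "profileUrl": raw.get("html_url"),
        "createdAt": raw.get("created_at"),
    }


if __name__ == "__main__":
    print(flatten_user({"login": "torvalds", "blog": "", "public_repos": 8}))
```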

Repository row

{
  "fullName": "facebook/react",
  "description": "The library for web and native user interfaces.",
  "language": "JavaScript",
  "stars": 225000,
  "forks": 46000,
  "openIssues": 700,
  "topics": ["react", "javascript", "frontend", "ui"],
  "htmlUrl": "https://github.com/facebook/react",
  "updatedAt": "2026-06-28T10:00:00Z",
  "ownerLogin": "facebook"
}
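A repository row can be derived from a raw GitHub repository object in the same way. The sketch below assumes the row is built from the standard API fields `full_name`, `stargazers_count`, `forks_count`, `open_issues_count`, `topics`, `html_url`, `updated_at`, and the nested `owner.login`. The function `flatten_repo` is hypothetical.

```python
def flatten_repo(raw: dict) -> dict:
    """Map a raw GitHub repository object onto the row shape shown above."""
    owner = raw.get("owner") or {}  # nested object in the API response
    return {
        "fullName": raw.get("full_name"),
        "description": raw.get("description"),
        "language": raw.get("language"),
        "stars": raw.get("stargazers_count"),
        "forks": raw.get("forks_count"),
        "openIssues": raw.get("open_issues_count"),
        "topics": list(raw.get("topics") or []),
        "htmlUrl": raw.get("html_url"),
        "updatedAt": raw.get("updated_at"),
        "ownerLogin": owner.get("login"),
    }


if __name__ == "__main__":
    sample = {
        "full_name": "facebook/react",
        "stargazers_count": 225000,
        "owner": {"login": "facebook"},
    }
    print(flatten_repo(sample))
```

Pulling `owner.login` up to a top-level `ownerLogin` column keeps every value scalar or a simple list, which is what lets the rows export cleanly to CSV.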

Use cases

  • Developer lead generation & tech recruiting — source candidates by location, language, and activity, complete with public contact details.
  • B2B prospecting — find companies and maintainers behind popular tools and libraries.
  • Market & ecosystem research — track repositories, languages, stars, and trends in any technology niche.
  • AI training data — collect structured datasets of profiles and repositories for analysis or model training.

Why this actor

  • Official GitHub REST API — stable, documented, and compliant. No fragile HTML scraping that breaks on UI changes.
  • Higher limits when you need them — drop in an optional token to go from 60 to 5000 requests/hour.
  • Clean, ready-to-use output — flattened rows that drop straight into your CRM, ATS, or spreadsheet.

Search terms: GitHub scraper, GitHub API, scrape GitHub users, developer leads, GitHub profiles, tech recruiting data.