GitHub Scraper - Repos, Issues, PRs & Contributors

Scrape GitHub repositories, issues, pull requests, contributors, releases, and trending repos. Uses the official GitHub REST API. Optional GitHub token for higher rate limits.

Developer: kade (maintained by community)

GitHub Scraper — Repos, Issues, PRs, Contributors & Trending

The only dedicated GitHub scraper on Apify. Extract data from GitHub repositories, issues, pull requests, contributors, releases, trending repos, and user profiles using the official GitHub REST API. No browser. No proxy required. Works without a GitHub token (60 req/hr free) or with your own token (5,000 req/hr).

What you can scrape

| Mode | What you get |
|---|---|
| repo | Stars, forks, watchers, topics, language, license, dates |
| issues | Title, state, labels, assignees, comment count, body |
| pull_requests | PR title, state, merged status, labels, author, dates |
| contributors | Login, commit count, profile URL |
| releases | Tag, name, assets, download count, publish date |
| search_repos | Search by keyword, language, stars, topic, org |
| trending | Today's / this week's / this month's trending repos |
| user | User/org profile + all their public repos |

Why use GitHub Scraper?

  • Developer intelligence — track competitor repos, monitor open source projects, analyze tech stacks by stars/forks
  • Lead generation — find contributors and maintainers in your niche (JS devs, Python ML engineers, Rust systems devs)
  • Trend monitoring — scrape trending repos daily to spot emerging tech before it's mainstream
  • Research & data science — build datasets from GitHub activity, issues, releases
  • No dedicated actor exists — this is the only GitHub-focused scraper on Apify

How to use

  1. Open the Input tab
  2. Select a Scrape Mode (repo, issues, trending, search_repos, etc.)
  3. For repo/issues/PRs: add repos in owner/repo format (e.g. microsoft/vscode)
  4. Optionally add your GitHub Token for higher rate limits (5000/hr vs 60/hr)
  5. Click Start. Results appear in the Output tab as the run progresses, usually within seconds
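
The same steps can also be driven from code rather than the Console. The sketch below assumes the `apify-client` Python package and an `APIFY_TOKEN` environment variable. `ACTOR_ID` is a placeholder because this page does not state the actor's ID, and `build_run_input` is a hypothetical helper, not part of the actor.

```python
import os

# Placeholder: replace with this actor's real ID from its Apify Store page.
ACTOR_ID = "<username>/<actor-name>"


def build_run_input(mode, repos=None, max_items=100, github_token=None):
    """Assemble an input object using the parameter names from the Input section."""
    run_input = {"scrapeMode": mode, "maxItems": max_items}
    if repos:
        run_input["repos"] = list(repos)
    if github_token:
        run_input["githubToken"] = github_token
    return run_input


def run_actor(run_input):
    """Start a run, wait for it to finish, and return its dataset items."""
    # Third-party package (pip install apify-client), imported lazily so the
    # helper above can be used without it.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires network access, an Apify token, and the real actor ID):
#   items = run_actor(build_run_input("repo", repos=["microsoft/vscode"]))
```

Keeping input assembly separate from the network call makes it easy to inspect or log the input before starting a paid run.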

Input

| Parameter | Type | Default | Description |
|---|---|---|---|
| scrapeMode | string | repo | What to scrape: repo, issues, pull_requests, contributors, releases, search_repos, trending, user |
| repos | string[] | | Repos in owner/repo format. Used by repo/issues/PRs/contributors/releases modes. |
| searchQuery | string | | GitHub search query (e.g. language:python stars:>1000 topic:llm). Used by search_repos mode. |
| usernames | string[] | | GitHub usernames or orgs. Used by user mode. |
| trendingLanguage | string | | Filter trending repos by language (e.g. python, rust). Empty = all languages. |
| trendingPeriod | string | daily | daily, weekly, or monthly. Used by trending mode. |
| issueState | string | open | Filter issues/PRs by state: open, closed, all. |
| maxItems | integer | 100 | Max items per repo (0 = unlimited). |
| githubToken | string | | Optional GitHub PAT (classic, no scopes needed for public data). Increases rate limit 83x. |
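
Checking an input object against the parameters above before starting a run can catch typos early. The following is a minimal sketch of such a check. `validate_input` and the `owner/repo` pattern are illustrative assumptions; the actor's own validation rules may differ.

```python
import re

VALID_MODES = {
    "repo", "issues", "pull_requests", "contributors",
    "releases", "search_repos", "trending", "user",
}
# Modes that the parameter table says read the `repos` list.
REPO_MODES = {"repo", "issues", "pull_requests", "contributors", "releases"}
# Loose owner/repo shape check (an assumption, not GitHub's exact naming rules).
REPO_PATTERN = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?/[A-Za-z0-9._-]+")


def validate_input(run_input):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    mode = run_input.get("scrapeMode", "repo")
    if mode not in VALID_MODES:
        problems.append(f"unknown scrapeMode: {mode!r}")
    if mode in REPO_MODES:
        repos = run_input.get("repos") or []
        if not repos:
            problems.append(f"{mode} mode needs at least one repo")
        for repo in repos:
            if not REPO_PATTERN.fullmatch(repo):
                problems.append(f"not in owner/repo format: {repo!r}")
    if mode == "search_repos" and not run_input.get("searchQuery"):
        problems.append("search_repos mode needs searchQuery")
    if mode == "user" and not run_input.get("usernames"):
        problems.append("user mode needs usernames")
    if run_input.get("trendingPeriod", "daily") not in {"daily", "weekly", "monthly"}:
        problems.append("trendingPeriod must be daily, weekly, or monthly")
    if run_input.get("issueState", "open") not in {"open", "closed", "all"}:
        problems.append("issueState must be open, closed, or all")
    max_items = run_input.get("maxItems", 100)
    if not isinstance(max_items, int) or max_items < 0:
        problems.append("maxItems must be a non-negative integer")
    return problems
```

A full GitHub URL such as `https://github.com/microsoft/vscode` fails the pattern, which matches the `owner/repo` format the table asks for.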

Example inputs

Get today's trending Python repos

{
  "scrapeMode": "trending",
  "trendingLanguage": "python",
  "trendingPeriod": "daily"
}

Get repo details for multiple repos

{
  "scrapeMode": "repo",
  "repos": ["microsoft/vscode", "torvalds/linux", "facebook/react"]
}

Search repos by topic

{
  "scrapeMode": "search_repos",
  "searchQuery": "topic:machine-learning language:python stars:>500",
  "maxItems": 200,
  "githubToken": "ghp_..."
}
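
For orientation, the search query above corresponds to GitHub's repository search endpoint. The sketch below builds the request URL for one page and estimates how many pages a given `maxItems` would need. It assumes 100 results per page (GitHub's maximum) and relies on GitHub's documented cap of 1,000 results per search query. Both helpers are hypothetical, and how the actor actually pages through results is not stated on this page.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def build_search_url(query, page=1, per_page=100):
    """Return the URL for one page of repository search results."""
    params = {"q": query, "per_page": per_page, "page": page}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def pages_needed(max_items, per_page=100, search_cap=1000):
    """Number of pages to fetch; 0 means 'unlimited', bounded by the search cap."""
    wanted = search_cap if max_items == 0 else min(max_items, search_cap)
    return -(-wanted // per_page)  # ceiling division


url = build_search_url("topic:machine-learning language:python stars:>500")
print(url)
print(pages_needed(200))
```

Because of the 1,000-result cap, a very broad query is better split into narrower ones (for example by star range) than run with `maxItems` set to 0.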

Get all open issues from a repo

{
  "scrapeMode": "issues",
  "repos": ["openai/openai-python"],
  "issueState": "open",
  "maxItems": 500
}

Output examples

Repository

{
  "type": "repository",
  "fullName": "microsoft/vscode",
  "owner": "microsoft",
  "name": "vscode",
  "description": "Visual Studio Code",
  "url": "https://github.com/microsoft/vscode",
  "stars": 186550,
  "forks": 40566,
  "language": "TypeScript",
  "topics": ["editor", "electron", "typescript"],
  "license": "MIT",
  "createdAt": "2015-09-03T20:23:38Z",
  "updatedAt": "2026-06-19T14:22:00Z"
}
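
The fields in this item line up with fields in GitHub's REST API repository object. The sketch below shows one plausible mapping from the API response to the output shape above. The source field names (`full_name`, `stargazers_count`, `forks_count`, `license.spdx_id`, and so on) are real API fields, but the choice of mapping is an assumption, not the actor's confirmed implementation.

```python
def to_repository_item(api_repo):
    """Map a GitHub REST API repository object to the output shape shown above."""
    license_info = api_repo.get("license") or {}  # license is null for unlicensed repos
    return {
        "type": "repository",
        "fullName": api_repo["full_name"],
        "owner": api_repo["owner"]["login"],
        "name": api_repo["name"],
        "description": api_repo.get("description"),
        "url": api_repo["html_url"],
        "stars": api_repo["stargazers_count"],
        "forks": api_repo["forks_count"],
        "language": api_repo.get("language"),
        "topics": api_repo.get("topics", []),
        "license": license_info.get("spdx_id"),
        "createdAt": api_repo["created_at"],
        "updatedAt": api_repo["updated_at"],
    }
```

Knowing the likely source fields helps when comparing this actor's output with data pulled directly from the API.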

Issue

{
  "type": "issue",
  "id": 12345,
  "title": "Extensions not loading after update",
  "state": "open",
  "author": "username",
  "labels": ["bug", "P1"],
  "commentsCount": 23,
  "createdAt": "2026-06-01T10:00:00Z"
}
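
One detail worth knowing when comparing issue counts with the raw API: GitHub's REST endpoint for listing repository issues also returns pull requests, marked by a `pull_request` key. The sketch below separates the two and maps an API issue to most of the fields shown above. Whether the actor filters this way is an assumption, and `id` is left out because this page does not say whether it holds the issue number or the API's internal ID.

```python
def split_issues_and_prs(api_items):
    """Separate true issues from pull requests in an issues-endpoint response."""
    issues, pulls = [], []
    for item in api_items:
        (pulls if "pull_request" in item else issues).append(item)
    return issues, pulls


def to_issue_item(api_issue):
    """Map a GitHub REST API issue object to the output fields shown above."""
    user = api_issue.get("user") or {}  # user can be null for deleted accounts
    return {
        "type": "issue",
        "title": api_issue["title"],
        "state": api_issue["state"],
        "author": user.get("login"),
        "labels": [label["name"] for label in api_issue.get("labels", [])],
        "commentsCount": api_issue.get("comments", 0),
        "createdAt": api_issue["created_at"],
    }
```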

Trending repo

{
  "type": "trending_repo",
  "fullName": "owner/repo-name",
  "language": "Python",
  "stars": 45200,
  "starsToday": 892,
  "trendingPeriod": "daily"
}

GitHub token and rate limits

Without a token, the limit is 60 API requests/hour, which is fine for small batches (5-10 repos). With a token, the limit is 5,000 API requests/hour, which you need for large batches or for fetching many issues/PRs.
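
A rough request budget shows when a token becomes necessary. The sketch below assumes one API request per page of up to 100 items (GitHub's maximum page size) and ignores any extra requests the actor may make, so treat the result as a lower bound. Both functions are hypothetical helpers.

```python
import math

UNAUTHENTICATED_LIMIT = 60   # requests per hour without a token
AUTHENTICATED_LIMIT = 5000   # requests per hour with a token


def requests_needed(repo_count, items_per_repo, per_page=100):
    """Lower-bound estimate: one request per page, at least one per repo."""
    pages_per_repo = max(1, math.ceil(items_per_repo / per_page))
    return repo_count * pages_per_repo


def fits_in_hourly_limit(request_count, has_token):
    """True if the estimated request count fits in one hour's quota."""
    limit = AUTHENTICATED_LIMIT if has_token else UNAUTHENTICATED_LIMIT
    return request_count <= limit


needed = requests_needed(repo_count=20, items_per_repo=500)
print(needed, fits_in_hourly_limit(needed, has_token=False))
```

Under these assumptions, 500 issues from each of 20 repos needs at least 100 requests, which exceeds the unauthenticated quota but is a small fraction of the authenticated one.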

How to get a free token:

  1. Go to github.com/settings/tokens
  2. Click "Generate new token (classic)"
  3. Give it a name, set no expiry, check no scopes (read-only public data needs no scopes)
  4. Copy the ghp_... token and paste into the githubToken field
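
To confirm a new token works before pasting it in, you can query GitHub's `/rate_limit` endpoint, which does not count against the quota. This is a minimal sketch using only the standard library. The function names and the `User-Agent` value are illustrative choices, and the network call only runs if you invoke `fetch_core_limit` yourself.

```python
import json
import urllib.request

RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def build_rate_limit_request(token=None):
    """Build (but do not send) a request to GitHub's rate-limit endpoint."""
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
        "User-Agent": "token-check-example",  # GitHub requires some User-Agent
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(RATE_LIMIT_URL, headers=headers)


def fetch_core_limit(token=None):
    """Send the request and return the hourly core limit (needs network access)."""
    with urllib.request.urlopen(build_rate_limit_request(token), timeout=10) as response:
        payload = json.load(response)
    return payload["resources"]["core"]["limit"]
```

A valid token should report a core limit of 5000, while an anonymous call reports 60.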

Your token is never stored — it's used only during the run.

Pricing

This actor uses the Pay Per Event model — you pay per item returned.

  • 100 repos: ~$0.01
  • 1,000 issues: ~$0.05
  • trending (15 repos): ~$0.01
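
These figures can be turned into a quick estimate for a planned run. The per-item rates below are derived from the approximate numbers listed above, not from an official price list, so treat the result as a ballpark. `estimate_cost` is a hypothetical helper.

```python
# Approximate per-item rates implied by the examples above (assumptions).
APPROX_PRICE_PER_ITEM = {
    "repo": 0.01 / 100,     # ~$0.01 per 100 repos
    "issues": 0.05 / 1000,  # ~$0.05 per 1,000 issues
}


def estimate_cost(mode, item_count):
    """Return an approximate cost in USD for `item_count` items of one mode."""
    return round(APPROX_PRICE_PER_ITEM[mode] * item_count, 4)


print(estimate_cost("issues", 5000))
```

The trending example (15 repos for about $0.01) does not follow the per-repo rate, which suggests small runs are rounded up, so very small estimates are likely to be too low.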

GitHub's API is fast and returns structured JSON — no browser overhead, runs complete in seconds.

Tips

  • Lead gen: use contributors mode on active repos in your target niche → get a list of active developers
  • Tech trend monitoring: schedule trending mode daily → track which languages and frameworks are surging
  • Competitive intel: issues + pull_requests on a competitor's repo → see what users are asking for
  • OSS research: search_repos with topic:your-tech stars:>100 → map the ecosystem
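
As an example of the lead-gen tip, contributor items from several repos can be merged into a single ranked list. The sketch below assumes each item has `login` and `commits` fields. Those names are hypothetical, since this page lists the contributor fields only as "login, commit count, profile URL" without exact keys.

```python
from collections import Counter


def top_contributors(items, limit=10):
    """Sum commit counts per login across repos and return the most active."""
    totals = Counter()
    for item in items:
        totals[item["login"]] += item.get("commits", 0)
    return totals.most_common(limit)


sample_items = [
    {"login": "alice", "commits": 5},
    {"login": "bob", "commits": 3},
    {"login": "alice", "commits": 2},
]
print(top_contributors(sample_items))
```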

FAQ

Do I need a GitHub account? No. All public data is accessible without any login. You only need an account if you want to create a token for the higher rate limit.

Is this against GitHub's ToS? This actor uses the official GitHub REST API with proper headers and rate limit handling. GitHub's API is designed for programmatic access. Always comply with GitHub's API usage terms.
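
To illustrate what rate limit handling typically involves, the sketch below computes how long a client should wait based on GitHub's documented response headers (`retry-after`, `x-ratelimit-remaining`, `x-ratelimit-reset`). It is an illustration under the assumption that header names arrive lower-cased in a plain dict, not a description of this actor's internal logic.

```python
import time


def seconds_until_retry(headers, now=None):
    """Return how many seconds to wait before the next request (0 if none)."""
    now = time.time() if now is None else now
    # Secondary rate limits: GitHub may send an explicit retry-after in seconds.
    retry_after = headers.get("retry-after")
    if retry_after is not None:
        return max(0.0, float(retry_after))
    # Primary rate limit: wait only once the remaining quota reaches zero.
    remaining = int(headers.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    # x-ratelimit-reset is a UTC epoch timestamp in seconds.
    reset_at = float(headers.get("x-ratelimit-reset", now))
    return max(0.0, reset_at - now)
```

Passing `now` explicitly keeps the function deterministic, which makes it straightforward to test without waiting on a real clock.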

What about private repos? Only public data is accessible. Private repos require authorized access (for example via OAuth), which is not supported.

Something broken? Open an issue on the Issues tab with the repo and error message.