Pricing

from $2.00 / 1,000 repositories scraped

GitHub Scraper - Repos, Users & Issues

Scrape GitHub repositories, users, and issues via the official GitHub API. Get stars, forks, languages, topics, issues, user profiles, and follower counts. No login needed (optional token for higher limits).

Developer

Md Jakaria Mirza

Last modified: 14 days ago

What It Extracts

Repository name, owner, description, URL, homepage, language, license, topics, stars, forks, watchers, open issues, archive/fork status, default branch, created/updated/pushed dates
Optional nested recent issues and pull requests for each repository
User or organization login, name, type, company, location, bio, public counts, followers, following, profile URL, and created/updated dates
Optional nested repositories for each user or organization
Scrape timestamp plus machine-readable run diagnostics

Quick Start

Use this small input first:

{
  "repos": ["openai/openai-node"],
  "users": [],
  "searchQueries": [],
  "includeIssues": false,
  "maxIssuesPerRepo": 0,
  "includeUserRepos": false,
  "maxReposPerUser": 0,
  "maxResults": 1,
  "githubToken": "",
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

For larger jobs, add a GitHub token with no scopes for public data. Prefer a token over proxy rotation for normal GitHub API usage.

Input

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| `repos` | array | `["openai/openai-node"]` | `owner/repo` or GitHub repository URLs. |
| `users` | array | `[]` | GitHub usernames or organization names. |
| `searchQueries` | array | `[]` | GitHub repository search syntax, e.g. `topic:typescript stars:>1000`. |
| `includeIssues` | boolean | `false` | Adds recent issues and pull requests inside each repository record. |
| `maxIssuesPerRepo` | integer | `20` | Applies only when `includeIssues` is enabled. |
| `includeUserRepos` | boolean | `false` | Adds repositories inside each user or organization record. |
| `maxReposPerUser` | integer | `20` | Applies only when `includeUserRepos` is enabled. |
| `maxResults` | integer | `1` | Maximum repositories from explicit repos and search queries. Use 1-5 for tests. |
| `githubToken` | secret string | empty | Optional token for higher GitHub API limits. |
| `proxyConfiguration` | object | disabled | Keep disabled unless you intentionally need proxy routing. |

Input is validated before any request. Repository URLs must point to one exact owner/repo path, profile URLs must point to one exact user or organization, and duplicates are removed case-insensitively. A run accepts at most 1,000 repository inputs, 500 user or organization inputs, and 20 repository search queries.
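The validation rules above can be sketched in a few lines of Python. This is a minimal illustration, not the Actor's own code: the helper names `normalize_repo` and `dedupe_repos`, the allowed-character pattern, and the choice to apply the 1,000-input limit after deduplication are all assumptions.

```python
import re
from urllib.parse import urlparse

MAX_REPO_INPUTS = 1000  # per-run repository limit stated above

# Assumed character set for owner and repository names (illustrative only).
_NAME = re.compile(r"^[A-Za-z0-9._-]+$")


def normalize_repo(value: str) -> str:
    """Return 'owner/repo' for an 'owner/repo' string or a GitHub repository URL."""
    text = value.strip()
    if text.lower().startswith(("http://", "https://")):
        parsed = urlparse(text)
        if parsed.netloc.lower() not in ("github.com", "www.github.com"):
            raise ValueError(f"not a GitHub URL: {value!r}")
        text = parsed.path.strip("/")
    if text.endswith(".git"):
        text = text[:-4]
    parts = text.split("/")
    # Exactly two non-empty segments: deeper paths such as /issues are rejected.
    if len(parts) != 2 or not all(_NAME.match(part) for part in parts):
        raise ValueError(f"expected exactly owner/repo: {value!r}")
    return "/".join(parts)


def dedupe_repos(values: list[str]) -> list[str]:
    """Normalize inputs and drop case-insensitive duplicates, keeping first spellings."""
    seen: set[str] = set()
    result: list[str] = []
    for value in values:
        repo = normalize_repo(value)
        key = repo.lower()
        if key not in seen:
            seen.add(key)
            result.append(repo)
    if len(result) > MAX_REPO_INPUTS:
        raise ValueError(f"at most {MAX_REPO_INPUTS} repository inputs per run")
    return result
```

Running a list through `dedupe_repos` before starting a run catches malformed targets locally, before any record is charged.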

Output Dataset

The default dataset contains repository records and user/organization records. The Store table includes separate views for repositories and users/organizations, while JSON exports include the full nested issue or repository arrays when those options are enabled.

The default key-value store also contains RUN_STATUS. It records saved repository and user counts, completed or skipped searches, duplicate and not-found targets, GitHub incomplete-search signals, non-public records skipped, spending/runtime stops, and a sanitized failure message when the official API cannot provide trustworthy data.

Verified sample from an existing successful run:

{
  "entityType": "repo",
  "repoId": 438419937,
  "fullName": "openai/openai-node",
  "name": "openai-node",
  "owner": "openai",
  "description": "Official JavaScript / TypeScript library for the OpenAI API",
  "url": "https://github.com/openai/openai-node",
  "homepage": "https://www.npmjs.com/package/openai",
  "language": "TypeScript",
  "stars": 10984,
  "forks": 1520,
  "watchers": 156,
  "openIssues": 264,
  "license": "Apache-2.0",
  "topics": ["nodejs", "openai", "typescript"],
  "isFork": false,
  "isArchived": false,
  "defaultBranch": "main",
  "createdAt": "2021-12-14T22:32:58Z",
  "updatedAt": "2026-06-21T04:29:27Z",
  "pushedAt": "2026-06-18T02:14:48Z",
  "issuesScrapedCount": 0,
  "issues": [],
  "scrapedAt": "2026-06-21T07:13:13.054Z"
}
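As a rough illustration of consuming an exported dataset, the sketch below totals stars per language using only the field names visible in the sample record (`entityType`, `language`, `stars`). The function name `stars_by_language` is hypothetical, and the sketch assumes user/organization records carry an `entityType` other than `"repo"`, which the sample does not confirm.

```python
from collections import defaultdict


def stars_by_language(records: list[dict]) -> dict[str, int]:
    """Sum the 'stars' field per 'language' across repository records."""
    totals: dict[str, int] = defaultdict(int)
    for record in records:
        # Assumption: only repository records use entityType == "repo".
        if record.get("entityType") != "repo":
            continue
        language = record.get("language") or "unknown"
        totals[language] += int(record.get("stars", 0))
    return dict(totals)


# Trimmed copy of the sample record above.
sample = {
    "entityType": "repo",
    "fullName": "openai/openai-node",
    "language": "TypeScript",
    "stars": 10984,
}
summary = stars_by_language([sample])
```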

Pricing And Cost Control

Current live pricing checked on 2026-07-15:

| Event | Active price |
| --- | --- |
| `repo-scraped` | `$0.002` per repository/user record |
| `apify-actor-start` | `$0.00005` per GB |

Nested issues and nested user repositories are included inside the saved repo/user record. The Actor saves and charges each repo/user record atomically and stops when the user's spending limit is reached.
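The event prices above make a run's event charges easy to estimate ahead of time. The sketch below is a back-of-the-envelope calculation only: the function names are hypothetical, "per GB" is assumed to mean gigabytes of memory allocated to the run, and any other platform charges are not modeled.

```python
from decimal import Decimal

PRICE_PER_RECORD = Decimal("0.002")      # repo-scraped event, from the table above
START_PRICE_PER_GB = Decimal("0.00005")  # apify-actor-start event, from the table above


def estimate_cost(records: int, memory_gb: int = 1) -> Decimal:
    """Estimated event charges in USD for a run that saves `records` records."""
    if records < 0 or memory_gb < 0:
        raise ValueError("records and memory_gb must be non-negative")
    return PRICE_PER_RECORD * records + START_PRICE_PER_GB * memory_gb


def max_records_for_budget(budget_usd: str, memory_gb: int = 1) -> int:
    """Largest number of records whose event charges fit within the budget."""
    remaining = Decimal(budget_usd) - START_PRICE_PER_GB * memory_gb
    if remaining <= 0:
        return 0
    return int(remaining // PRICE_PER_RECORD)
```

For example, `estimate_cost(1000)` comes to $2.00005 under these assumptions, which matches the headline price of $2.00 per 1,000 records plus one start event at 1 GB.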

Cost-control tips:

Test with one repository and maxResults: 1.
Keep includeIssues and includeUserRepos disabled until you need nested data.
Use a GitHub token for larger jobs instead of proxy routing.
Split very broad search queries into smaller runs.
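One way to split a broad search, sketched under assumptions: partition the star range into bands with GitHub's `stars:N..M` range qualifier, then group the resulting queries into batches that respect the 20-query-per-run input limit. The helper names and the choice of stars as the splitting dimension are illustrative, not something the Actor prescribes.

```python
MAX_QUERIES_PER_RUN = 20  # per-run search query limit from the Input section


def split_star_ranges(base_query: str, low: int, high: int, step: int) -> list[str]:
    """Split one query into several, each covering a band of `step` star counts."""
    if step <= 0 or low > high:
        raise ValueError("step must be positive and low must not exceed high")
    queries = []
    start = low
    while start <= high:
        end = min(start + step - 1, high)
        queries.append(f"{base_query} stars:{start}..{end}")
        start = end + 1
    return queries


def batch_queries(queries: list[str], size: int = MAX_QUERIES_PER_RUN) -> list[list[str]]:
    """Group queries into lists small enough for one run's searchQueries input."""
    return [queries[i:i + size] for i in range(0, len(queries), size)]


queries = split_star_ranges("topic:typescript", 1000, 3999, 1000)
```

Each batch can then be passed as `searchQueries` in a separate run, with `maxResults` set to match the budget for that run.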

Use Cases

Open-source ecosystem research
Repository and dependency intelligence
Technical due-diligence datasets
Organization and project monitoring
Public issue and pull-request analysis

Known Limits

Unauthenticated GitHub API calls are limited to 60 requests/hour per IP.
GitHub Search API exposes at most 1,000 results per query and can mark a response as incomplete; narrow search queries work better.
This Actor collects public GitHub data only. It does not bypass private repositories or authentication.
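The search limits above are visible in the raw GitHub Search API response, which carries `total_count` and `incomplete_results` fields alongside `items`. The sketch below shows how a caller might flag both conditions; the function name and the warning wording are hypothetical and do not describe what the Actor writes to RUN_STATUS.

```python
SEARCH_RESULT_CAP = 1000  # GitHub Search API exposes at most 1,000 results per query


def search_warnings(payload: dict) -> list[str]:
    """Return human-readable warnings for one GitHub search response payload."""
    warnings = []
    if payload.get("incomplete_results"):
        warnings.append("incomplete: GitHub reported partial results for this query")
    total = int(payload.get("total_count", 0))
    if total > SEARCH_RESULT_CAP:
        warnings.append(
            f"capped: {total} matches but only {SEARCH_RESULT_CAP} are reachable"
        )
    return warnings
```

A non-empty result is a hint that the query should be narrowed or split before relying on the data.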

Reliability Behavior

Requests have a 20-second timeout and bounded retries for network and transient 5xx failures.
When GitHub returns a real rate limit, the Actor honors the Retry-After or X-RateLimit-Reset header if the wait is short enough; longer waits fail clearly instead of hammering the API.
401, ordinary 403, rate-limit 403/429, confirmed 404, malformed JSON, and valid empty searches are handled as distinct outcomes.
Records are validated before the atomic dataset write and repo-scraped event charge.
Private repository payloads exposed by an over-scoped user token are skipped before nested requests or billing.
Requests run serially to reduce GitHub secondary-rate-limit risk, and a safe runtime guard stops before the platform timeout.
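The rate-limit and status handling described above can be approximated as follows. This is a sketch under stated assumptions rather than the Actor's implementation: the 60-second cap stands in for "short enough", which the text does not quantify, the outcome labels are invented for illustration, and Retry-After is assumed to be given in seconds.

```python
from typing import Mapping, Optional

MAX_WAIT_SECONDS = 60.0  # hypothetical cap; the text only says "short enough"


def _lower(headers: Mapping[str, str]) -> dict:
    """HTTP header names are case-insensitive, so compare them in lowercase."""
    return {key.lower(): value for key, value in headers.items()}


def rate_limit_wait(
    headers: Mapping[str, str], now: float, max_wait: float = MAX_WAIT_SECONDS
) -> Optional[float]:
    """Seconds to wait before retrying, or None when retrying is not worthwhile."""
    h = _lower(headers)
    if "retry-after" in h:
        wait = float(h["retry-after"])  # assumed to be a number of seconds
    elif "x-ratelimit-reset" in h:
        # X-RateLimit-Reset is an epoch timestamp in seconds.
        wait = max(0.0, float(h["x-ratelimit-reset"]) - now)
    else:
        return None
    return wait if wait <= max_wait else None


def classify(status: int, headers: Mapping[str, str]) -> str:
    """Map a response to one of the distinct outcomes listed above."""
    h = _lower(headers)
    limited = "retry-after" in h or h.get("x-ratelimit-remaining") == "0"
    if status == 429 or (status == 403 and limited):
        return "rate_limited"
    if status == 401:
        return "unauthorized"
    if status == 403:
        return "forbidden"
    if status == 404:
        return "not_found"
    if 500 <= status < 600:
        return "transient"
    return "ok" if 200 <= status < 300 else "unexpected"
```

Passing `now` explicitly (for example `time.time()`) keeps the wait calculation easy to test without touching the clock.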

Responsible Use

Use this Actor only for lawful collection and analysis of public GitHub data. Follow GitHub's terms, applicable privacy laws, anti-spam rules, and your own compliance requirements. Do not use exported profiles for spam or harassment.

License

Apache-2.0
