GitHub Repos Scraper

Extract GitHub repository data including stars, forks, issues, languages, and contributor info. Monitor open-source trends, track technology adoption, and analyze the developer ecosystem.

Pricing: Pay per usage
Developer: fatih dağüstü (maintained by Community)
Last modified: 3 days ago

GitHub Repositories Scraper

Scrape GitHub repositories using GitHub's free public API — no authentication required! Extract stars, forks, topics, languages, licenses, owner info, and more for competitor analysis, tech stack research, and lead generation.

Features

  • No API key needed — uses GitHub's free public API (60 req/hour)
  • Search repositories by keyword, topic, or technology
  • Filter by language — Python, JavaScript, TypeScript, Go, Rust, etc.
  • Sort by stars, forks, or recently updated
  • User/Org repos — fetch all repos from any GitHub user or organization
  • Rich data — stars, forks, watchers, topics, license, issues, branches, owner info
  • Pagination — scrape up to 1,000 results per search
  • Rate-limit safe — automatic delay and retry on rate limits

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| searchQuery | string | | Keywords to search (e.g., web scraping, machine learning) |
| language | string | | Filter by language (e.g., python, javascript) |
| sort | string | stars | Sort by: stars, forks, updated, help-wanted-issues |
| minStars | integer | 0 | Minimum star count filter |
| maxResults | integer | 100 | Maximum repos to return (up to 1000) |
| userOrOrg | string | | Fetch all repos from a GitHub user/organization |
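As a rough illustration of how these fields could translate into a GitHub repository search, here is a minimal sketch. The helper name `build_search_params` is hypothetical and the actor's real implementation may differ; the `language:` and `stars:>=` qualifiers and the `q`, `sort`, `order`, and `per_page` parameters are standard GitHub search syntax. The `userOrOrg` field is not handled here.

```python
def build_search_params(actor_input: dict) -> dict:
    """Translate actor input into GitHub search query parameters.

    Hypothetical helper for illustration; not the actor's actual code.
    """
    terms = []

    query = (actor_input.get("searchQuery") or "").strip()
    if query:
        terms.append(query)

    language = (actor_input.get("language") or "").strip()
    if language:
        terms.append(f"language:{language}")

    min_stars = int(actor_input.get("minStars") or 0)
    if min_stars > 0:
        terms.append(f"stars:>={min_stars}")

    # maxResults defaults to 100 and is capped at 1000 (see the table above).
    max_results = min(int(actor_input.get("maxResults") or 100), 1000)

    return {
        "q": " ".join(terms),
        "sort": actor_input.get("sort") or "stars",
        "order": "desc",
        # GitHub returns at most 100 items per page.
        "per_page": min(max_results, 100),
    }
```

The resulting dictionary could be passed as query parameters to `https://api.github.com/search/repositories`.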

Output

Each repository item contains:

{
  "name": "scrapy",
  "fullName": "scrapy/scrapy",
  "description": "Scrapy, a fast high-level web crawling & scraping framework for Python.",
  "url": "https://api.github.com/repos/scrapy/scrapy",
  "htmlUrl": "https://github.com/scrapy/scrapy",
  "stars": 52000,
  "forks": 10500,
  "watchers": 52000,
  "language": "Python",
  "topics": ["scraping", "crawling", "python", "spider"],
  "license": {
    "key": "bsd-3-clause",
    "name": "BSD 3-Clause \"New\" or \"Revised\" License",
    "spdxId": "BSD-3-Clause"
  },
  "createdAt": "2010-02-22T19:11:19Z",
  "updatedAt": "2024-01-15T12:34:00Z",
  "pushedAt": "2024-01-14T18:00:00Z",
  "size": 25600,
  "defaultBranch": "master",
  "openIssues": 234,
  "isArchived": false,
  "isFork": false,
  "owner": {
    "login": "scrapy",
    "avatarUrl": "https://avatars.githubusercontent.com/u/733635?v=4",
    "type": "Organization",
    "htmlUrl": "https://github.com/scrapy"
  }
}
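The item above appears to be a renamed subset of the repository object returned by GitHub's REST API. The following sketch shows one plausible mapping from the raw snake_case fields to the camelCase output; `to_output_item` is a hypothetical name and the actor's actual mapping is an assumption.

```python
def to_output_item(repo: dict) -> dict:
    """Map a raw GitHub REST API repository object to the output shape.

    Hypothetical helper for illustration; not the actor's actual code.
    """
    license_info = repo.get("license") or {}
    owner = repo.get("owner") or {}
    return {
        "name": repo.get("name"),
        "fullName": repo.get("full_name"),
        "description": repo.get("description"),
        "url": repo.get("url"),
        "htmlUrl": repo.get("html_url"),
        "stars": repo.get("stargazers_count", 0),
        "forks": repo.get("forks_count", 0),
        "watchers": repo.get("watchers_count", 0),
        "language": repo.get("language"),
        "topics": repo.get("topics") or [],
        # GitHub returns null when no license was detected.
        "license": {
            "key": license_info.get("key"),
            "name": license_info.get("name"),
            "spdxId": license_info.get("spdx_id"),
        } if license_info else None,
        "createdAt": repo.get("created_at"),
        "updatedAt": repo.get("updated_at"),
        "pushedAt": repo.get("pushed_at"),
        "size": repo.get("size"),
        "defaultBranch": repo.get("default_branch"),
        "openIssues": repo.get("open_issues_count", 0),
        "isArchived": repo.get("archived", False),
        "isFork": repo.get("fork", False),
        "owner": {
            "login": owner.get("login"),
            "avatarUrl": owner.get("avatar_url"),
            "type": owner.get("type"),
            "htmlUrl": owner.get("html_url"),
        },
    }
```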

Example Inputs

Find popular Python web scraping repositories

{
  "searchQuery": "web scraping",
  "language": "python",
  "sort": "stars",
  "minStars": 1000,
  "maxResults": 50
}

Get all Microsoft repositories

{
  "userOrOrg": "microsoft",
  "sort": "stars",
  "maxResults": 100
}

Find recently updated JavaScript tools

{
  "searchQuery": "automation tool",
  "language": "javascript",
  "sort": "updated",
  "maxResults": 200
}

Research AI/ML frameworks

{
  "searchQuery": "large language model",
  "language": "python",
  "sort": "stars",
  "minStars": 5000,
  "maxResults": 30
}

Use Cases

Competitor Analysis

Find all open-source projects competing with your product. Analyze their star growth, contributor count, and activity to benchmark your own project.

Tech Stack Research

Discover the most popular libraries and frameworks for any programming language or domain. Identify trending technologies before they go mainstream.

Lead Generation

Find GitHub organizations and users who build in your target technology stack. Reach out to companies actively using specific tools.

Market Research

Understand which technologies are gaining traction. Track star growth for repositories in your industry over time.
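Each run captures star counts at one point in time, so tracking growth means comparing the output of two runs. Here is a minimal sketch, assuming both runs were exported as lists of items in the output format shown earlier; `star_growth` is a hypothetical helper, not part of the actor.

```python
def star_growth(previous: list, current: list) -> list:
    """Compare two scraper outputs and rank repositories by stars gained.

    Only repositories present in both snapshots are included.
    """
    before = {item["fullName"]: item["stars"] for item in previous}
    growth = []
    for item in current:
        name = item["fullName"]
        if name in before:
            growth.append({
                "fullName": name,
                "stars": item["stars"],
                "delta": item["stars"] - before[name],
            })
    # Largest gain first.
    return sorted(growth, key=lambda row: row["delta"], reverse=True)
```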

Developer Outreach

Build lists of active open-source developers for hiring, partnerships, or community outreach campaigns.

Content Research

Find the most starred repositories in any category to create "top X libraries" blog posts or tutorials.

Rate Limits

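GitHub's unauthenticated REST API allows 60 requests per hour for most endpoints; search requests fall under a separate limit of 10 requests per minute. The scraper automatically adds a 1-second delay between requests and retries when it hits a rate limit. For most use cases (up to 1,000 results), a run takes about 10-15 minutes.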
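To illustrate the delay-and-retry idea, the sketch below derives a wait time from the rate-limit headers GitHub sends with each response (`Retry-After`, `X-RateLimit-Remaining`, `X-RateLimit-Reset`). The function name `seconds_until_retry` is hypothetical, and how the actor actually decides when to retry is an assumption.

```python
import time
from typing import Optional


def seconds_until_retry(headers: dict, now: Optional[float] = None) -> float:
    """Return how long to wait before the next request.

    Hypothetical helper for illustration; not the actor's actual code.
    """
    now = time.time() if now is None else now
    # Header names are case-insensitive, so normalize them first.
    lowered = {key.lower(): value for key, value in headers.items()}

    # An explicit Retry-After (in seconds) takes priority.
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        return max(float(retry_after), 0.0)

    # If the quota is exhausted, wait until the reset time (epoch seconds).
    remaining = lowered.get("x-ratelimit-remaining")
    reset = lowered.get("x-ratelimit-reset")
    if remaining is not None and int(remaining) == 0 and reset is not None:
        return max(float(reset) - now, 0.0)

    # Otherwise fall back to the 1-second delay described above.
    return 1.0
```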

Notes

  • The GitHub Search API caps results at 1,000 per query. Use specific queries to get the most relevant results.
  • For user/org repos, all public repositories are returned sorted by stars.
  • The topics field is populated only when the repository owner has set topics; otherwise it is empty.
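The 1,000-result cap also bounds how many requests a search needs. GitHub returns at most 100 items per page, so the page count can be estimated as below. This is only a sketch: `pages_needed` is a hypothetical helper, and the page size the actor really uses is an assumption.

```python
import math


def pages_needed(max_results: int, per_page: int = 100, cap: int = 1000) -> int:
    """Number of search result pages to fetch for a given maxResults."""
    # Clamp the request to the Search API cap, and never go below zero.
    total = max(0, min(max_results, cap))
    return math.ceil(total / per_page)
```

With these defaults, a `maxResults` of 250 needs 3 pages, and any value of 1,000 or more needs 10.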