Pricing

Pay per usage

GitHub Repository Scraper — Stars, Issues & Activity

Scrape any GitHub repository for stars, forks, issues, PRs, contributors, languages, topics, releases, license, last commit, and README preview. Search repos by keyword with language and star filters. Great for tech research and competitive analysis.

Developer

Ricardo Akiyoshi

Last modified

2 months ago

GitHub Repository Scraper

Scrape any GitHub repository for comprehensive metadata — stars, forks, issues, pull requests, contributors, languages, topics, releases, license, last commit, README preview, and homepage URL. Search repositories by keyword with language and star count filters.

What does GitHub Repository Scraper do?

This actor extracts detailed information from GitHub repository pages. You can either provide direct repository URLs or use search queries to discover repositories matching your criteria. Every scraped repository returns a rich data object with 18+ fields covering popularity, activity, and technical details.

Unlike GitHub's REST API, which caps unauthenticated clients at a low hourly request quota and needs a token for higher limits, this scraper works without any API key and extracts data directly from GitHub's public pages.

Features

Direct URL scraping — Provide any GitHub repo URL and get full metadata
Keyword search — Find repos by keyword, language, and star count
18+ data fields — Stars, forks, watchers, issues, PRs, contributors, languages, topics, releases, license, last commit, README preview, homepage
Multiple extraction strategies — Embedded JSON-LD, meta tags, and DOM parsing for maximum reliability
Deduplication — No duplicate repos in output, even with overlapping search results
Pagination — Automatically follows GitHub search pagination up to 1,000 results
Proxy support — Built-in proxy rotation to avoid rate limiting on large scrapes
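The "multiple extraction strategies" idea in the list above can be pictured as a fallback chain: try the most structured source first and fall through to the next one when it is missing or malformed. The sketch below is illustrative only. The function names, the regular expressions, and the choice of `og:title` and `og:description` are assumptions, not the actor's actual code, and the DOM-parsing step is left out.

```python
import json
import re
from typing import Callable, List, Optional

def from_json_ld(html: str) -> Optional[dict]:
    """Read name/description from the first JSON-LD script block, if present."""
    match = re.search(
        r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
        html,
        re.DOTALL,
    )
    if not match:
        return None
    try:
        data = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None  # malformed block: let the next strategy try
    if not isinstance(data, dict) or "name" not in data:
        return None
    return {"name": data["name"], "description": data.get("description")}

def from_meta_tags(html: str) -> Optional[dict]:
    """Read name/description from Open Graph meta tags (assumed attribute order)."""
    title = re.search(r'<meta[^>]*property="og:title"[^>]*content="([^"]*)"', html)
    if not title:
        return None
    desc = re.search(r'<meta[^>]*property="og:description"[^>]*content="([^"]*)"', html)
    return {"name": title.group(1), "description": desc.group(1) if desc else None}

# Ordered from most to least structured (hypothetical ordering).
STRATEGIES: List[Callable[[str], Optional[dict]]] = [from_json_ld, from_meta_tags]

def extract_metadata(html: str) -> Optional[dict]:
    """Return the first non-empty result, or None if every strategy fails."""
    for strategy in STRATEGIES:
        result = strategy(html)
        if result is not None:
            return result
    return None
```

Ordering the strategies in a list keeps the fallback logic in one place, so another source can be added without touching `extract_metadata`.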

Input

Field	Type	Default	Description
repoUrls	array	[]	Direct GitHub repository URLs to scrape
searchQuery	string	null	Search repos by keyword (e.g., "web scraper")
language	string	null	Filter by language (e.g., "python", "javascript")
sortBy	enum	"stars"	Sort: stars, forks, updated, best_match
minStars	integer	0	Minimum star count filter
maxResults	integer	200	Maximum repos to return (up to 1,000)
proxyConfiguration	object	Apify Proxy	Proxy settings for rate limit avoidance
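A small helper can catch invalid inputs before a run is started. The sketch below mirrors the field names, defaults, and the 1,000-result cap from the table above. The helper name `build_run_input` and the rule that at least one of `repoUrls` or `searchQuery` must be given are assumptions for illustration, not documented behavior of the actor.

```python
from typing import Any, Dict, List, Optional

ALLOWED_SORT = ("stars", "forks", "updated", "best_match")
MAX_RESULTS_CAP = 1000  # upper bound stated in the input table

def build_run_input(
    repo_urls: Optional[List[str]] = None,
    search_query: Optional[str] = None,
    language: Optional[str] = None,
    sort_by: str = "stars",
    min_stars: int = 0,
    max_results: int = 200,
) -> Dict[str, Any]:
    """Validate arguments and return a dict using the actor's field names."""
    # Assumption: a run needs at least one source of repositories.
    if not repo_urls and not search_query:
        raise ValueError("provide repo_urls, search_query, or both")
    if sort_by not in ALLOWED_SORT:
        raise ValueError(f"sort_by must be one of {ALLOWED_SORT}")
    if min_stars < 0:
        raise ValueError("min_stars must be non-negative")
    if not 1 <= max_results <= MAX_RESULTS_CAP:
        raise ValueError(f"max_results must be between 1 and {MAX_RESULTS_CAP}")

    run_input: Dict[str, Any] = {
        "repoUrls": list(repo_urls or []),
        "sortBy": sort_by,
        "minStars": min_stars,
        "maxResults": max_results,
    }
    # Optional fields are omitted rather than sent as null.
    if search_query:
        run_input["searchQuery"] = search_query
    if language:
        run_input["language"] = language
    return run_input
```

For example, `build_run_input(search_query="web scraper", language="python")` produces a dict that can be passed as `run_input` to the Apify client.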

Output

Each repository produces a data object like this:

{
  "name": "react",
  "owner": "facebook",
  "fullName": "facebook/react",
  "description": "The library for web and native user interfaces.",
  "stars": 232145,
  "forks": 47523,
  "watchers": 6723,
  "openIssues": 983,
  "pullRequests": 287,
  "language": "JavaScript",
  "languages": ["JavaScript", "TypeScript", "HTML", "CSS"],
  "license": "MIT",
  "topics": ["react", "javascript", "frontend", "ui", "declarative"],
  "lastCommit": "2026-03-01T14:32:00Z",
  "contributors": 1847,
  "releases": 215,
  "readmePreview": "React is a JavaScript library for building user interfaces...",
  "homepage": "https://react.dev",
  "url": "https://github.com/facebook/react",
  "scrapedAt": "2026-03-02T10:15:30Z"
}
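As a rough illustration of consuming one such record, the sketch below parses the two ISO 8601 timestamps and formats a one-line summary. The helper names are hypothetical, and the code assumes every field used is present, as in the sample object above.

```python
from datetime import datetime

def parse_timestamp(value: str) -> datetime:
    """Parse timestamps like "2026-03-01T14:32:00Z" into aware datetimes."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def hours_since_last_commit(item: dict) -> float:
    """Hours between the last commit and the moment the repo was scraped."""
    delta = parse_timestamp(item["scrapedAt"]) - parse_timestamp(item["lastCommit"])
    return delta.total_seconds() / 3600

def summarize(item: dict) -> str:
    """One-line, human-readable summary of a scraped repository."""
    return (
        f'{item["fullName"]}: {item["stars"]:,} stars, '
        f'{item["forks"]:,} forks, {item["license"]}'
    )
```

Measuring staleness against `scrapedAt` rather than the current time keeps the result reproducible when a dataset is analyzed later.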

Use Cases

Tech Stack Research

Discover and compare frameworks, libraries, and tools in any programming language. Filter by stars and activity to find the most popular and actively maintained options.

Competitor Analysis

Monitor competitor open source projects — track star growth, contributor activity, release frequency, and community engagement.

Open Source Intelligence (OSINT)

Gather intelligence on organizations' tech stacks by analyzing their public repositories. Identify technologies, team size (contributors), and development velocity.

Hiring & Talent Research

Find active open source contributors in specific languages or frameworks. Identify prolific developers by exploring contributor data across popular repositories.

Investment & Market Research

Spot emerging technologies by tracking rapidly growing repositories. Compare star counts, fork rates, and contributor growth across competing projects.
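Each run records a single point in time, so tracking growth means comparing two runs. A minimal sketch, assuming you have kept the dataset items from an earlier and a later scrape of the same repositories (the function name is hypothetical):

```python
from typing import Dict, Iterable, List, Tuple

def star_growth(earlier: Iterable[dict], later: Iterable[dict]) -> List[Tuple[str, int]]:
    """Return (fullName, stars gained) pairs, fastest-growing first."""
    before: Dict[str, int] = {item["fullName"]: item["stars"] for item in earlier}
    growth = []
    for item in later:
        name = item["fullName"]
        # Repos missing from the earlier snapshot have no baseline, so skip them.
        if name in before:
            growth.append((name, item["stars"] - before[name]))
    return sorted(growth, key=lambda pair: pair[1], reverse=True)
```

The same pattern works for `forks` or `contributors` by swapping the field name.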

Academic Research

Collect structured data on open source software ecosystems for academic studies. Analyze language trends, licensing patterns, and community dynamics.
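Language and licensing patterns of this kind reduce to simple tallies over the dataset. The sketch below counts values of any single-valued output field such as `language` or `license`. The helper name and the `"unknown"` bucket for missing values are illustrative choices.

```python
from collections import Counter
from typing import Iterable

def tally(items: Iterable[dict], field: str) -> Counter:
    """Count how often each value of `field` occurs across scraped repos."""
    # Missing or null values are grouped under "unknown" instead of dropped,
    # so the counts always add up to the number of repositories.
    return Counter(item.get(field) or "unknown" for item in items)
```

Calling `tally(items, "license").most_common(5)` then gives the five most frequent licenses in a result set.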

Examples

Scrape specific repositories

{
  "repoUrls": [
    "https://github.com/facebook/react",
    "https://github.com/vuejs/vue",
    "https://github.com/angular/angular"
  ]
}

Search for Python machine learning repos with 1000+ stars

{
  "searchQuery": "machine learning",
  "language": "python",
  "minStars": 1000,
  "sortBy": "stars",
  "maxResults": 100
}

Find the most-forked JavaScript frameworks

{
  "searchQuery": "framework",
  "language": "javascript",
  "sortBy": "forks",
  "maxResults": 50
}

Discover recently updated Rust projects

{
  "searchQuery": "async runtime",
  "language": "rust",
  "sortBy": "updated",
  "maxResults": 30
}

Pricing

Pay per result — you only pay for repositories successfully scraped. See the Pricing tab for current rates. Each repository with full metadata counts as one event.

Rate Limits & Proxies

GitHub allows unauthenticated access but may rate-limit aggressive scraping. For scrapes of 100+ repositories, enabling Apify Proxy is recommended. The actor automatically rotates user agents and adds delays to respect GitHub's servers.
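The 100-repository guideline can be applied automatically when building inputs. The sketch below adds a proxy setting for large runs. It assumes the actor accepts the conventional Apify proxy input shape `{"useApifyProxy": True}`, which this page does not spell out. Since the input table lists Apify Proxy as the default, the helper mainly makes that choice explicit.

```python
import copy
from typing import Any, Dict

PROXY_THRESHOLD = 100      # guideline from the text above
DEFAULT_MAX_RESULTS = 200  # default from the input table

def with_proxy_if_large(run_input: Dict[str, Any],
                        threshold: int = PROXY_THRESHOLD) -> Dict[str, Any]:
    """Return a copy of run_input with a proxy config added for large scrapes."""
    result = copy.deepcopy(run_input)  # never mutate the caller's dict
    direct = len(result.get("repoUrls", []))
    # A search can return up to maxResults repositories.
    searched = result.get("maxResults", DEFAULT_MAX_RESULTS) if result.get("searchQuery") else 0
    if max(direct, searched) >= threshold:
        # Assumed proxy shape; an existing proxyConfiguration is left untouched.
        result.setdefault("proxyConfiguration", {"useApifyProxy": True})
    return result
```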

Limitations

GitHub search returns a maximum of 1,000 results per query
Some private or restricted repositories may not be accessible
Contributor counts on very large repos (10,000+ contributors) may be approximate
README preview is truncated to the first 500 characters
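One common way to work around the 1,000-result cap is to split a broad search into several narrower queries, for example one per language, and merge the datasets afterwards. Deduplication inside the actor applies within a single run, so merged runs need their own pass. A minimal sketch, with a hypothetical helper name:

```python
from typing import Iterable, List

def merge_unique(*batches: Iterable[dict]) -> List[dict]:
    """Merge result batches, keeping the first record seen for each repository."""
    seen = set()
    merged = []
    for batch in batches:
        for item in batch:
            # GitHub treats owner/repo names case-insensitively.
            key = item["fullName"].lower()
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged
```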

Integration — Python

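from apify_client import ApifyClient

# Authenticate with your Apify API token
client = ApifyClient("YOUR_API_TOKEN")

# Start the actor and wait for the run to finish
run = client.actor("sovereigntaylor/github-repo-scraper").call(run_input={
    "searchQuery": "web scraper",
    "maxResults": 50
})

# Print each scraped repository with its star count
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item.get('fullName', 'N/A')}: {item.get('stars', 'N/A')} stars")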
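Dataset items like the ones iterated above are plain dicts, so they can be written to CSV with the standard library. The sketch below is one possible approach. The column selection and the choice to join list fields such as `topics` with semicolons are illustrative, not part of the actor.

```python
import csv
import io
from typing import Iterable, Sequence

DEFAULT_COLUMNS = ("fullName", "stars", "forks", "language", "license", "topics", "lastCommit")

def items_to_csv(items: Iterable[dict], columns: Sequence[str] = DEFAULT_COLUMNS) -> str:
    """Serialise scraped repository records to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(columns), lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = {}
        for column in columns:
            value = item.get(column, "")
            # Flatten list fields (languages, topics) into a single cell.
            row[column] = ";".join(value) if isinstance(value, list) else value
        writer.writerow(row)
    return buffer.getvalue()
```

Writing to a string keeps the helper easy to test. To save a file, write the returned text with `open(path, "w", newline="")`.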

Integration — JavaScript

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token
const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

// Start the actor and wait for the run to finish
const run = await client.actor('sovereigntaylor/github-repo-scraper').call({
    searchQuery: 'web scraper',
    maxResults: 50
});

// Print each scraped repository with its star count
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach(item => console.log(`${item.fullName ?? 'N/A'}: ${item.stars ?? 'N/A'} stars`));
