GitHub Repo Scraper

Pricing

from $1.00 / 1,000 results

Scrape GitHub repository stats, README, languages, contributors, and releases.

Developer

Artificially

Maintained by Community

GitHub Repository Scraper

Scrape detailed information from any public GitHub repository including stats, README content, languages, contributors, and releases.

Features

  • Repository Stats: Stars, forks, watchers, open issues, size
  • Repository Info: Description, license, topics, creation/update dates
  • README Content: Full README text (optional)
  • Language Breakdown: Bytes of code per language (optional)
  • Contributors: List of top contributors with commit counts (optional)
  • Releases: Recent releases with download links (optional)

Use Cases

  • Track popularity metrics across multiple repositories
  • Monitor competitor open-source projects
  • Build datasets for research on GitHub trends
  • Aggregate documentation from multiple repos
  • Track release schedules and changelogs

Input

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| repositories | array | Yes | - | List of repositories (e.g., ["facebook/react", "https://github.com/vercel/next.js"]) |
| includeReadme | boolean | No | true | Include full README content |
| includeLanguages | boolean | No | true | Include language breakdown |
| includeContributors | boolean | No | false | Include contributor list |
| maxContributors | number | No | 10 | Maximum contributors to fetch |
| includeReleases | boolean | No | false | Include release history |
| maxReleases | number | No | 5 | Maximum releases to fetch |

Example Input

{
  "repositories": [
    "facebook/react",
    "vercel/next.js",
    "https://github.com/microsoft/vscode"
  ],
  "includeReadme": true,
  "includeLanguages": true,
  "includeContributors": true,
  "maxContributors": 5
}
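The input can also be assembled programmatically. The following is a minimal sketch, not part of the Actor itself: `build_input` and `DEFAULTS` are hypothetical names, and the default values are simply copied from the Input table. It merges overrides onto those defaults and rejects field names the table does not list.

```python
import json

# Defaults copied from the Input table; the names here are illustrative only.
DEFAULTS = {
    "includeReadme": True,
    "includeLanguages": True,
    "includeContributors": False,
    "maxContributors": 10,
    "includeReleases": False,
    "maxReleases": 5,
}


def build_input(repositories, **overrides):
    """Return an input dict, rejecting unknown fields and an empty repo list."""
    if not repositories:
        raise ValueError("repositories is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    # Later entries win, so overrides replace the table defaults.
    return {"repositories": list(repositories), **DEFAULTS, **overrides}


run_input = build_input(
    ["facebook/react", "vercel/next.js", "https://github.com/microsoft/vscode"],
    includeContributors=True,
    maxContributors=5,
)
print(json.dumps(run_input, indent=2))
```

Spelling out the defaults makes it easier to see which optional sections a run will fetch.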

Output

Each repository produces a result with:

{
  "repoId": 10270250,
  "name": "react",
  "fullName": "facebook/react",
  "owner": "facebook",
  "ownerType": "Organization",
  "description": "The library for web and native user interfaces.",
  "url": "https://github.com/facebook/react",
  "homepage": "https://react.dev",
  "stars": 220000,
  "watchers": 220000,
  "forks": 45000,
  "openIssues": 1500,
  "language": "JavaScript",
  "topics": ["declarative", "frontend", "javascript", "library", "react", "ui"],
  "license": "MIT",
  "isPrivate": false,
  "isFork": false,
  "isArchived": false,
  "defaultBranch": "main",
  "createdAt": "2013-05-24T16:15:54Z",
  "updatedAt": "2024-01-15T10:30:00Z",
  "size": 350000,
  "languages": {
    "JavaScript": 5000000,
    "TypeScript": 2000000,
    "HTML": 50000
  },
  "readme": "# React\n\nReact is a JavaScript library...",
  "contributors": [
    {
      "username": "gaearon",
      "contributions": 2500,
      "profileUrl": "https://github.com/gaearon"
    }
  ],
  "scrapedAt": "2024-01-15T12:00:00Z"
}
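Because `languages` reports raw byte counts, a common follow-up step is converting them to percentages. The sketch below is one way to do that on a result shaped like the example above; `language_shares` is a hypothetical helper, not something the Actor provides.

```python
def language_shares(result):
    """Map each language to its percentage of total bytes, largest first."""
    languages = result.get("languages") or {}
    total = sum(languages.values())
    if total == 0:
        # No language data (for example, includeLanguages was disabled).
        return {}
    ranked = sorted(languages.items(), key=lambda item: item[1], reverse=True)
    return {name: round(100 * size / total, 1) for name, size in ranked}


# Trimmed copy of the example result above.
result = {
    "fullName": "facebook/react",
    "languages": {"JavaScript": 5000000, "TypeScript": 2000000, "HTML": 50000},
}
shares = language_shares(result)
for name, percent in shares.items():
    print(f"{name}: {percent}%")
```

For the sample numbers this reports JavaScript at roughly 71%, TypeScript at roughly 28%, and HTML at under 1%.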

Cost

This actor uses pay-per-result pricing:

| Cost Type | Amount |
| --- | --- |
| Start fee | $0.05 per run |
| Per repository | $0.001 |

No API key is required: the Actor uses GitHub's public, unauthenticated API.

Example Cost Calculation

  • 100 repos: $0.05 + (100 x $0.001) = $0.15
  • 1,000 repos: $0.05 + (1,000 x $0.001) = $1.05
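The same arithmetic can be expressed as a small helper for budgeting runs of any size. This is a sketch based only on the two fees listed above; `estimate_cost` is a hypothetical name, and `Decimal` is used so that the dollar amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Fees taken from the Cost table above.
START_FEE = Decimal("0.05")
PER_REPOSITORY = Decimal("0.001")


def estimate_cost(repo_count):
    """Return the estimated cost in dollars for one run over repo_count repos."""
    if repo_count < 0:
        raise ValueError("repo_count must not be negative")
    return START_FEE + PER_REPOSITORY * repo_count


for count in (100, 1000):
    print(f"{count} repos: ${estimate_cost(count):.2f}")
```

Because the start fee is charged per run, one large run costs less than many small runs over the same repositories.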

Rate Limits

GitHub's unauthenticated API allows 60 requests per hour per IP address, and each optional section you enable adds API calls for every repository. For high-volume scraping:

  • Use Apify proxy to distribute requests
  • Add delays between repositories
  • Consider using a GitHub personal access token
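To get a feel for how quickly the limit is reached, the sketch below estimates how many repositories fit into one hour. It rests on an assumption that is not documented here: that the Actor makes one request for the base repository data plus one request for each enabled optional section. The real request count may differ, and `requests_per_repo` and `repos_per_hour` are hypothetical helpers.

```python
HOURLY_LIMIT = 60  # unauthenticated limit quoted in the Rate Limits section

OPTIONAL_FLAGS = (
    "includeReadme",
    "includeLanguages",
    "includeContributors",
    "includeReleases",
)


def requests_per_repo(run_input):
    """Estimate API requests per repository.

    ASSUMPTION: one request for base metadata plus one per enabled option.
    """
    return 1 + sum(1 for flag in OPTIONAL_FLAGS if run_input.get(flag))


def repos_per_hour(run_input, hourly_limit=HOURLY_LIMIT):
    """Estimate how many repositories fit within the hourly request limit."""
    return hourly_limit // requests_per_repo(run_input)


default_flags = {"includeReadme": True, "includeLanguages": True}
everything = {flag: True for flag in OPTIONAL_FLAGS}
print(repos_per_hour(default_flags))
print(repos_per_hour(everything))
```

Under that assumption, turning off sections you do not need is the cheapest way to stretch the hourly budget from a single IP address.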

Tips

  1. Repository formats accepted:

    • owner/repo (e.g., facebook/react)
    • Full URL (e.g., https://github.com/facebook/react)
    • URL with a .git suffix (the suffix is handled automatically)

  2. Minimize API calls: Disable includeReadme, includeLanguages, includeContributors, and includeReleases if you don't need that data.

  3. Large repos: For repositories with many contributors or releases, lower maxContributors and maxReleases to avoid hitting rate limits.
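If you want to deduplicate a repository list before a run, the accepted formats from tip 1 can be normalized to a single owner/repo form. The following is an illustrative sketch, not the Actor's own parsing logic; `normalize_repo` is a hypothetical helper.

```python
def normalize_repo(value):
    """Return 'owner/repo' for a shorthand, a github.com URL, or a .git URL."""
    text = value.strip()
    for prefix in ("https://github.com/", "http://github.com/"):
        if text.startswith(prefix):
            text = text[len(prefix):]
            break
    text = text.strip("/")
    if text.endswith(".git"):
        text = text[: -len(".git")]
    parts = text.split("/")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"not a repository reference: {value!r}")
    # Keep only owner and repo; ignore any trailing path segments.
    return f"{parts[0]}/{parts[1]}"


raw = [
    "facebook/react",
    "https://github.com/facebook/react",
    "https://github.com/facebook/react.git",
    "vercel/next.js",
]
# dict.fromkeys keeps the first occurrence of each repo in its original order.
unique = list(dict.fromkeys(normalize_repo(item) for item in raw))
print(unique)
```

Since each repository is billed as a result, removing duplicates up front avoids paying for the same repository twice.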

Support

  • Built by: Artificially
  • Issues: Report bugs or request features via Apify Console