GitHub Scraper

Extract data from GitHub — repository details, developer profiles, trending repos, and search results. Stars, forks, languages, topics, and more. No API key needed.


Developer

Stas Persiianenko

Maintained by Community


Scrape data from GitHub — repositories, developer profiles, trending repos, and search results. Get stars, forks, languages, topics, licenses, followers, and more.

What does GitHub Scraper do?

GitHub Scraper extracts structured data from GitHub using its public API and web pages. It supports four modes:

  • Repository details — Full metadata for specific repos (stars, forks, topics, license, dates)
  • Developer profiles — Bio, followers, location, company, repos count for any user
  • Trending repositories — Today's/week's/month's hottest repos with star velocity
  • Search repositories — Find repos by keyword, sorted by stars

No GitHub API token required. Works with GitHub's public endpoints.
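As a sketch, each of the four modes maps to a different input payload. Field names follow the input parameters documented in this README; the specific URLs and query values below are illustrative, not required:

```json
{ "mode": "repos", "urls": ["https://github.com/facebook/react"] }

{ "mode": "profiles", "urls": ["https://github.com/torvalds"] }

{ "mode": "search", "searchQuery": "web scraping", "maxResults": 50 }

{ "mode": "trending", "trendingSince": "weekly", "trendingLanguage": "python" }
```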

Why scrape GitHub?

GitHub hosts 400M+ repositories and 100M+ developers. It's the primary source for:

  • 📊 Tech trend analysis — Track which languages and frameworks are gaining traction
  • 🔍 Competitive intelligence — Monitor competitor repos, star growth, and release cadence
  • 📈 Developer recruiting — Find active developers by language, location, and contribution history
  • 🏗️ Open source research — Analyze licensing, dependency patterns, and community health
  • 📰 Content creation — Curate trending repos for newsletters and social media

Use cases

  • Newsletter creators sharing weekly trending repos
  • VCs and investors tracking open-source momentum
  • Hiring managers building candidate lists from active contributors
  • Researchers studying open-source ecosystem dynamics
  • DevRel teams monitoring mentions and competitive landscape
  • Developers discovering new tools and libraries

How to scrape GitHub

  1. Go to GitHub Scraper on Apify Store
  2. Choose a mode: repos, profiles, trending, or search
  3. Enter GitHub URLs or a search query, depending on the mode
  4. Set the max results limit
  5. Click Start and wait for results
  6. Download data as JSON, CSV, or Excel

Data you can extract

Repository data

| Field | Type | Description |
|---|---|---|
| fullName | string | Owner/repo (e.g., facebook/react) |
| description | string | Repo description |
| stars | number | Star count |
| forks | number | Fork count |
| watchers | number | Watcher count |
| openIssues | number | Open issue count |
| language | string | Primary language |
| topics | array | Topic tags |
| license | string | License (MIT, Apache-2.0, etc.) |
| isArchived | boolean | Whether the repo is archived |
| createdAt | string | Creation date |
| updatedAt | string | Last update date |
| size | number | Repo size in KB |
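To make the schema concrete, here is a minimal sketch of how a raw GitHub REST API repository payload could map onto the repository fields above. The left-hand names come from api.github.com; the mapping function is an illustration, not the actor's actual source:

```python
# Sketch (assumption, not the actor's code): map a trimmed GitHub REST API
# repository payload onto the actor's output field names.
raw = {
    "full_name": "facebook/react",
    "description": "The library for web and native user interfaces.",
    "stargazers_count": 243711,
    "forks_count": 50668,
    "open_issues_count": 1150,
    "language": "JavaScript",
    "topics": ["react", "ui"],
    "license": {"spdx_id": "MIT"},
    "archived": False,
    "size": 942058,  # KB, as reported by the API
}

def to_actor_record(raw: dict) -> dict:
    """Rename GitHub API fields to the actor's output schema."""
    return {
        "fullName": raw["full_name"],
        "description": raw["description"],
        "stars": raw["stargazers_count"],
        "forks": raw["forks_count"],
        "openIssues": raw["open_issues_count"],
        "language": raw["language"],
        "topics": raw["topics"],
        "license": (raw.get("license") or {}).get("spdx_id"),
        "isArchived": raw["archived"],
        "size": raw["size"],
    }

record = to_actor_record(raw)
print(record["fullName"], record["stars"], record["license"])
```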

Profile data

| Field | Type | Description |
|---|---|---|
| username | string | GitHub username |
| name | string | Display name |
| bio | string | Profile bio |
| company | string | Company |
| location | string | Location |
| followers | number | Follower count |
| following | number | Following count |
| publicRepos | number | Public repo count |
| blog | string | Website URL |
| twitterUsername | string | X/Twitter handle |
Trending repository data

| Field | Type | Description |
|---|---|---|
| fullName | string | Owner/repo |
| description | string | Repo description |
| language | string | Primary language |
| stars | number | Total stars |
| starsToday | number | Stars gained in the period |
| forks | number | Fork count |
| builtBy | array | Top contributors with avatars |

Input parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| mode | string | "trending" | Mode: repos, profiles, trending, search |
| urls | array | [] | GitHub URLs (for repos/profiles mode) |
| searchQuery | string | "" | Search query (for search mode) |
| trendingSince | string | "daily" | Trending period: daily, weekly, monthly |
| trendingLanguage | string | "" | Filter by language (e.g., python) |
| maxResults | integer | 25 | Max results to return |

Input example

{
  "mode": "trending",
  "trendingSince": "daily",
  "maxResults": 25
}

Output example

Repository

{
  "name": "react",
  "fullName": "facebook/react",
  "owner": "facebook",
  "description": "The library for web and native user interfaces.",
  "url": "https://github.com/facebook/react",
  "homepageUrl": "https://react.dev",
  "language": "JavaScript",
  "stars": 243711,
  "forks": 50668,
  "watchers": 243711,
  "openIssues": 1150,
  "topics": ["declarative", "frontend", "javascript", "library", "react", "ui"],
  "license": "MIT",
  "isArchived": false,
  "isFork": false,
  "defaultBranch": "main",
  "createdAt": "2013-05-24T16:15:54Z",
  "updatedAt": "2026-03-08T09:17:07Z",
  "pushedAt": "2026-03-05T15:52:24Z",
  "size": 942058,
  "scrapedAt": "2026-03-08T09:45:54.218Z"
}

Profile

{
  "username": "torvalds",
  "name": "Linus Torvalds",
  "bio": null,
  "company": "Linux Foundation",
  "location": "Portland, OR",
  "followers": 289246,
  "following": 0,
  "publicRepos": 11,
  "avatarUrl": "https://avatars.githubusercontent.com/u/1024025?v=4",
  "url": "https://github.com/torvalds",
  "scrapedAt": "2026-03-08T09:45:50.123Z"
}

Pricing

GitHub Scraper uses pay-per-event pricing:

| Event | Price |
|---|---|
| Run started | $0.005 |
| Repository or profile extracted | $0.003 per item |

Cost examples

| Scenario | Items | Cost |
|---|---|---|
| Daily trending (25 repos) | 25 | ~$0.08 |
| 10 repo details | 10 | ~$0.04 |
| Search (50 results) | 50 | ~$0.16 |

Apify's free plan includes $5/month in platform credits — enough for ~60 trending scrapes.
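A quick sanity check of these numbers, using the per-event prices above:

```python
# Estimate run cost: $0.005 per run start plus $0.003 per extracted item.
RUN_START_USD = 0.005
PER_ITEM_USD = 0.003

def run_cost(items: int) -> float:
    """Estimated cost in USD for a single run that extracts `items` results."""
    return RUN_START_USD + items * PER_ITEM_USD

print(round(run_cost(25), 3))  # daily trending (25 repos): 0.08
print(round(run_cost(50), 3))  # search (50 results): 0.155, i.e. ~$0.16
print(int(5 / run_cost(25)))   # daily trending runs per $5 free credit: 62
```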

Using GitHub Scraper with the Apify API

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

const run = await client.actor('automation-lab/github-scraper').call({
  mode: 'trending',
  trendingSince: 'daily',
  maxResults: 25,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((repo) => {
  console.log(`${repo.fullName}: ${repo.stars} stars (+${repo.starsToday} today)`);
});

Python

from apify_client import ApifyClient

client = ApifyClient('YOUR_API_TOKEN')

run = client.actor('automation-lab/github-scraper').call(run_input={
    'mode': 'trending',
    'trendingSince': 'daily',
    'maxResults': 25,
})

items = client.dataset(run['defaultDatasetId']).list_items().items
for repo in items:
    print(f"{repo['fullName']}: {repo['stars']} stars (+{repo['starsToday']} today)")

cURL

curl "https://api.apify.com/v2/acts/automation-lab~github-scraper/run-sync-get-dataset-items?token=YOUR_API_TOKEN" \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"mode": "trending", "trendingSince": "daily", "maxResults": 10}'

Integrations

GitHub Scraper works with all Apify integrations:

  • Scheduled runs — Track trending repos daily or weekly
  • Webhooks — Get notified when a scrape completes
  • Google Sheets — Export repos and profiles to spreadsheets
  • Slack — Post trending repos to your team's channel
  • Zapier / Make — Automate workflows with GitHub data

Tips

  • 📈 Track trending daily — Schedule runs to build a history of trending repos
  • 🔍 Use search for competitive analysis — Search for keywords in your domain
  • 👥 Profile scraping — Great for building lists of developers by location or company
  • 🏷️ Filter by language — Use trendingLanguage to focus on specific tech stacks
  • Rate limits — The actor handles GitHub API rate limits automatically with retries
  • 📊 Combine modes — Run trending to discover repos, then repos mode for full details
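The "combine modes" tip can be sketched like this. The two sample items below stand in for a real trending run's dataset; in practice you would read them via `client.dataset(...).list_items()` as in the API examples above, then start a second run with the generated input:

```python
# Build a "repos" mode input from trending results to fetch full details.
# Sample data below is illustrative, not a real trending run's output.
trending_items = [
    {"fullName": "facebook/react", "stars": 243711},
    {"fullName": "vercel/next.js", "stars": 120000},
]

repos_input = {
    "mode": "repos",
    "urls": [f"https://github.com/{item['fullName']}" for item in trending_items],
    "maxResults": len(trending_items),
}
print(repos_input["urls"])
```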

GitHub provides a public REST API specifically designed for programmatic access. This scraper uses that API and publicly accessible web pages. It does not bypass authentication, rate limits, or access private data. Always review GitHub's Terms of Service and API usage policies.

FAQ

Do I need a GitHub API token? No. The scraper works without authentication using GitHub's public API (60 requests/hour per IP). For higher rate limits, a future version may support optional token input.

How many trending repos are shown? GitHub's trending page shows up to 25 repos per language/period combination.

Can I search for users or organizations? Currently, search mode finds repositories. Profile mode accepts direct profile URLs. User search may be added in a future version.

What about private repos? The scraper only accesses public data. Private repos are not visible without authentication.