GitHub — Repository Search & Data Scraper

Pricing

from $10.00 / 1,000 results


Scrapes GitHub repository data via the GitHub REST API v3. Supports keyword search, advanced qualifiers, user/org listing, trending repos, topic browsing, and direct repo lookup. Works without an API key (rate-limited) or with a personal access token for higher throughput.


Developer

Jamshaid Arif

Maintained by Community

🐙 GitHub — Repository Search & Data Scraper — Apify Actor

Features

  • 8 search modes — keyword search, user repos, org repos, topic browse, direct lookup, and 3 trending periods.
  • Advanced qualifiers — language, stars, forks, license, creation date, push date, archived/fork filters.
  • 4 optional extras — language breakdown, top contributors, releases with download counts, topic tags.
  • Smart rate limiting — automatic backoff on 403, separate delays for search vs. other endpoints, token-aware pacing.
  • 4 output formats — enriched (with computed metrics), raw, minimal, CSV-friendly.
  • Up to 1,000 search results — auto-paginates across 34 pages.
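The 34-page figure is consistent with GitHub's default page size of 30 results, since 1,000 / 30 rounds up to 34. The sketch below shows that arithmetic; the function name and the assumption that the actor uses 30 results per page are illustrative, not confirmed by the actor's source.

```python
import math

# GitHub's search API exposes at most 1,000 results per query.
SEARCH_RESULT_CAP = 1000


def pages_needed(total_results: int, per_page: int = 30) -> int:
    """Number of pages required to read every reachable search result.

    per_page=30 is GitHub's default page size; whether the actor uses
    that value is an assumption made for this sketch.
    """
    reachable = min(total_results, SEARCH_RESULT_CAP)
    return math.ceil(reachable / per_page)


print(pages_needed(1000))       # 34 pages at 30 results per page
print(pages_needed(5000, 100))  # 10 pages at 100 results per page
```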

Search Modes

| Mode | What it does |
| --- | --- |
| search | Keyword search with qualifiers (up to 1,000 results) |
| user_repos | All public repos from a specific GitHub user |
| org_repos | All public repos from an organization |
| topic_browse | Repos tagged with a specific topic |
| repos_by_name | Fetch specific repos by owner/name |
| trending_today | Repos created today, sorted by stars |
| trending_week | Repos created this week, sorted by stars |
| trending_month | Repos created this month, sorted by stars |
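GitHub has no official trending endpoint, so the trending modes above amount to a search filtered by creation date and sorted by stars. A minimal sketch of how such a query could be built follows. The window lengths (7 and 30 days) and the function name are assumptions; only the `created:` and `stars:` qualifiers are GitHub's documented search syntax.

```python
from datetime import date, timedelta

# Hypothetical look-back windows in days; the actor's exact definition of
# "this week" and "this month" is not documented here.
TRENDING_WINDOWS = {"trending_today": 0, "trending_week": 7, "trending_month": 30}


def trending_query(mode: str, today: date, min_stars: int = 0) -> str:
    """Build a search query for repos created inside the trending window."""
    since = today - timedelta(days=TRENDING_WINDOWS[mode])
    parts = [f"created:>={since.isoformat()}"]
    if min_stars > 0:
        parts.append(f"stars:>={min_stars}")
    return " ".join(parts)


# The query would be sent with sort=stars and order=desc.
print(trending_query("trending_week", date(2025, 4, 4), min_stars=50))
```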

Input Examples

Search Python Web Scraping Tools

{
  "mode": "search",
  "searchQuery": "web scraping",
  "language": "Python",
  "minStars": 100,
  "sortBy": "stars",
  "maxPages": 5,
  "fetchExtras": ["languages", "contributors"],
  "outputFormat": "enriched"
}

All Google Org Repos

{
  "mode": "org_repos",
  "orgName": "google",
  "maxPages": 10,
  "outputFormat": "enriched"
}

Trending Repos This Week

{
  "mode": "trending_week",
  "minStars": 50,
  "maxPages": 5,
  "outputFormat": "enriched"
}

Specific Repos with Releases

{
  "mode": "repos_by_name",
  "repoFullNames": "facebook/react, vuejs/vue, angular/angular, sveltejs/svelte",
  "fetchExtras": ["languages", "releases", "contributors"],
  "outputFormat": "enriched"
}
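
Because repoFullNames is a single comma-separated string rather than an array, it has to be split into owner/name pairs before each repo can be looked up. The helper below is a hypothetical sketch of that step, not the actor's own parser.

```python
def parse_repo_full_names(raw: str) -> list[tuple[str, str]]:
    """Split 'owner/name, owner/name' into (owner, name) tuples."""
    repos = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and doubled separators
        owner, sep, name = item.partition("/")
        if not sep or not owner or not name or "/" in name:
            raise ValueError(f"expected owner/name, got {item!r}")
        repos.append((owner, name))
    return repos


pairs = parse_repo_full_names(
    "facebook/react, vuejs/vue, angular/angular, sveltejs/svelte"
)
for owner, name in pairs:
    # Each pair maps onto GitHub's GET /repos/{owner}/{repo} endpoint.
    print(f"/repos/{owner}/{name}")
```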

MIT-Licensed TypeScript Projects

{
  "mode": "search",
  "searchQuery": "dashboard",
  "language": "TypeScript",
  "license": "mit",
  "minStars": 500,
  "pushedAfter": "2024-06-01",
  "sortBy": "stars"
}
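
The search inputs above presumably translate into a GitHub search query string made of a keyword plus qualifiers. The sketch below shows one plausible mapping. The qualifier syntax (`language:`, `license:`, `stars:>=`, `pushed:>`) is GitHub's own, while the function name and the field-to-qualifier mapping are assumptions made for illustration.

```python
def build_search_query(inp: dict) -> str:
    """Translate actor-style input fields into a GitHub search query."""
    parts = []
    if inp.get("searchQuery"):
        parts.append(inp["searchQuery"])
    if inp.get("language"):
        parts.append(f"language:{inp['language']}")
    if inp.get("license"):
        parts.append(f"license:{inp['license']}")
    if inp.get("minStars"):
        parts.append(f"stars:>={inp['minStars']}")
    if inp.get("pushedAfter"):
        # Assumed to be a strict "after" comparison on the last push date.
        parts.append(f"pushed:>{inp['pushedAfter']}")
    return " ".join(parts)


example = {
    "mode": "search",
    "searchQuery": "dashboard",
    "language": "TypeScript",
    "license": "mit",
    "minStars": 500,
    "pushedAfter": "2024-06-01",
    "sortBy": "stars",
}
print(build_search_query(example))
```

The resulting string would be passed as the q parameter of the search endpoint, with sortBy supplied as sort. Language names that contain spaces would need quoting, which this sketch does not handle.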

Enriched Output Fields

| Field | Example |
| --- | --- |
| full_name | scrapy/scrapy |
| owner | scrapy |
| owner_type | Organization |
| description | Scrapy, a fast high-level… |
| url | https://github.com/scrapy/scrapy |
| homepage | https://scrapy.org |
| language | Python |
| stars | 53,421 |
| forks | 13,812 |
| watchers | 53,421 |
| open_issues | 487 |
| size_kb | 23456 |
| license | BSD-3-Clause |
| topics | python, web-scraping, crawler |
| default_branch | master |
| is_fork | false |
| is_archived | false |
| has_wiki | true |
| has_pages | true |
| created_at | 2010-02-22T02:23:08Z |
| pushed_at | 2025-04-04T18:30:00Z |
| age_days | 5520 |
| stars_per_day | 9.68 |
| fork_to_star_ratio | 0.259 |
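
The last three fields are computed rather than returned by GitHub. The sketch below reproduces the example values from the listing, assuming age is measured in whole days since created_at and that the ratios are rounded to two and three decimals. The function name is hypothetical, and the reference time is set to the example's pushed_at only so the numbers line up; the actor presumably uses the time of the run.

```python
from datetime import datetime, timezone


def computed_metrics(stars: int, forks: int, created_at: str, now: datetime) -> dict:
    """Derive age and ratio metrics from raw repository fields."""
    created = datetime.strptime(created_at, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    age_days = (now - created).days
    # Guard against division by zero for brand-new or unstarred repos.
    safe_age = max(age_days, 1)
    return {
        "age_days": age_days,
        "stars_per_day": round(stars / safe_age, 2),
        "fork_to_star_ratio": round(forks / stars, 3) if stars else 0.0,
    }


metrics = computed_metrics(
    stars=53421,
    forks=13812,
    created_at="2010-02-22T02:23:08Z",
    now=datetime(2025, 4, 4, 18, 30, tzinfo=timezone.utc),
)
print(metrics)
# {'age_days': 5520, 'stars_per_day': 9.68, 'fork_to_star_ratio': 0.259}
```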

Extra Fields

| Extra | Fields Added |
| --- | --- |
| Languages | languages_breakdown (lang → bytes + %), languages_flat, language_count |
| Contributors | top_contributors (login, commits, avatar × 10), top_contributor |
| Releases | releases (tag, date, downloads × 5), latest_release, total_release_downloads |
| Topics | all_topics (array), topic_count |
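
As an illustration of the Languages extra, GitHub's /repos/{owner}/{repo}/languages endpoint returns a mapping of language name to byte count, from which the percentage breakdown can be derived. The sketch below is one way to do that. The exact shape of languages_breakdown and the comma-joined format of languages_flat are assumptions.

```python
def languages_extra(bytes_by_language: dict[str, int]) -> dict:
    """Turn GitHub's language -> bytes mapping into breakdown fields."""
    total = sum(bytes_by_language.values())
    # Largest language first, so the flat string reads in order of weight.
    ordered = sorted(bytes_by_language.items(), key=lambda kv: kv[1], reverse=True)
    breakdown = {
        lang: {
            "bytes": size,
            "percent": round(100 * size / total, 1) if total else 0.0,
        }
        for lang, size in ordered
    }
    return {
        "languages_breakdown": breakdown,
        "languages_flat": ", ".join(breakdown),
        "language_count": len(breakdown),
    }


# Made-up byte counts, used only to show the shape of the result.
extra = languages_extra({"HTML": 100, "Python": 900})
print(extra["languages_flat"])  # Python, HTML
```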

Rate Limits & Token

| Limit | Without Token | With Token |
| --- | --- | --- |
| Search API | 10 requests/min | 30 requests/min |
| Other endpoints | 60 requests/hr | 5,000 requests/hr |
| Search results cap | 1,000 | 1,000 |

To create a token: github.com → Settings → Developer Settings → Personal Access Tokens → Fine-grained tokens. No special permissions are needed for public repos.

The actor automatically handles rate limiting: it reads x-ratelimit-remaining headers and waits for resets when exhausted.
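
The wait-for-reset behaviour can be sketched as a small pure function. GitHub sends x-ratelimit-remaining and x-ratelimit-reset (a UTC epoch timestamp) on API responses. The function name, the one-second safety buffer, and the lowercase header keys are assumptions for this sketch and not the actor's actual code.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to sleep before the next request (0.0 if none needed)."""
    remaining = int(headers.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("x-ratelimit-reset", "0"))
    current = time.time() if now is None else now
    # Add a one-second buffer so the retry lands after the window resets.
    return max(0.0, reset_at - current) + 1.0


# Quota exhausted, reset 100 seconds away: wait 101 seconds.
wait = seconds_until_reset(
    {"x-ratelimit-remaining": "0", "x-ratelimit-reset": "1000"}, now=900.0
)
print(wait)  # 101.0
```

A caller would sleep for the returned number of seconds and then retry the request. With a plain dict the header names must already be lowercase, whereas HTTP client libraries usually provide case-insensitive header mappings.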