GitHub — Repository Search & Data Scraper
Pricing
from $10.00 / 1,000 results
Developer
Jamshaid Arif
🐙 GitHub — Repository Search & Data Scraper — Apify Actor
Scrapes GitHub repository data via the GitHub REST API v3. Supports keyword search, advanced qualifiers, user/org listing, trending repos, topic browsing, and direct repo lookup. Works without an API key (rate-limited) or with a personal access token for higher throughput.
Features
- 8 search modes — keyword search, user repos, org repos, topic browse, direct lookup, and 3 trending periods.
- Advanced qualifiers — language, stars, forks, license, creation date, push date, archived/fork filters.
- 4 optional extras — language breakdown, top contributors, releases with download counts, topic tags.
- Smart rate limiting — automatic backoff on 403, separate delays for search vs. other endpoints, token-aware pacing.
- 4 output formats — enriched (with computed metrics), raw, minimal, CSV-friendly.
- Up to 1,000 search results: auto-paginates across up to 34 pages.
Search Modes
| Mode | What it does |
|---|---|
| search | Keyword search with qualifiers (up to 1,000 results) |
| user_repos | All public repos from a specific GitHub user |
| org_repos | All public repos from an organization |
| topic_browse | Repos tagged with a specific topic |
| repos_by_name | Fetch specific repos by owner/name |
| trending_today | Repos created today, sorted by stars |
| trending_week | Repos created this week, sorted by stars |
| trending_month | Repos created this month, sorted by stars |
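The trending modes are described as "repos created in a period, sorted by stars", which can be expressed as an ordinary repository search. The sketch below shows one plausible translation into GitHub search parameters. The function name, the `TRENDING_WINDOWS` mapping, and the rolling 7-day and 30-day windows are assumptions made for illustration; the actor may define "this week" and "this month" differently.

```python
from datetime import date, timedelta

# Hypothetical look-back windows in days; the actor's real windows are not documented here.
TRENDING_WINDOWS = {"trending_today": 0, "trending_week": 7, "trending_month": 30}


def trending_search_params(mode: str, today: date, min_stars: int = 0) -> dict:
    """Build GitHub repository-search parameters for a trending mode (illustrative)."""
    since = today - timedelta(days=TRENDING_WINDOWS[mode])
    # `created:>=YYYY-MM-DD` and `stars:>=N` are standard GitHub search qualifiers.
    query = f"created:>={since.isoformat()}"
    if min_stars > 0:
        query += f" stars:>={min_stars}"
    return {"q": query, "sort": "stars", "order": "desc"}


params = trending_search_params("trending_week", date(2025, 4, 4), min_stars=50)
print(params["q"])
```

Passing the date explicitly keeps the function deterministic, which makes it easy to check what a given mode would search for.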
Input Examples
Search Python Web Scraping Tools
{"mode": "search","searchQuery": "web scraping","language": "Python","minStars": 100,"sortBy": "stars","maxPages": 5,"fetchExtras": ["languages", "contributors"],"outputFormat": "enriched"}
All Google Org Repos
{"mode": "org_repos","orgName": "google","maxPages": 10,"outputFormat": "enriched"}
Trending Repos This Week
{"mode": "trending_week","minStars": 50,"maxPages": 5,"outputFormat": "enriched"}
Specific Repos with Releases
{"mode": "repos_by_name","repoFullNames": "facebook/react, vuejs/vue, angular/angular, sveltejs/svelte","fetchExtras": ["languages", "releases", "contributors"],"outputFormat": "enriched"}
MIT-Licensed TypeScript Projects
{"mode": "search","searchQuery": "dashboard","language": "TypeScript","license": "mit","minStars": 500,"pushedAfter": "2024-06-01","sortBy": "stars"}
Enriched Output Fields
| Field | Example |
|---|---|
| full_name | scrapy/scrapy |
| owner | scrapy |
| owner_type | Organization |
| description | Scrapy, a fast high-level… |
| url | https://github.com/scrapy/scrapy |
| homepage | https://scrapy.org |
| language | Python |
| stars | 53,421 |
| forks | 13,812 |
| watchers | 53,421 |
| open_issues | 487 |
| size_kb | 23456 |
| license | BSD-3-Clause |
| topics | python, web-scraping, crawler |
| default_branch | master |
| is_fork | false |
| is_archived | false |
| has_wiki | true |
| has_pages | true |
| created_at | 2010-02-22T02:23:08Z |
| pushed_at | 2025-04-04T18:30:00Z |
| age_days | 5520 |
| stars_per_day | 9.68 |
| fork_to_star_ratio | 0.259 |
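The last three fields are computed rather than returned by GitHub. The sketch below shows one way they could be derived; the function name `computed_metrics` and the `as_of` reference timestamp are assumptions, since the source does not say which date the actor measures age against. The example row is consistent with a reference date of 2025-04-04.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse a GitHub-style ISO 8601 timestamp such as 2010-02-22T02:23:08Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def computed_metrics(repo: dict, as_of: str) -> dict:
    """Derive age and ratio metrics from raw repo fields (illustrative)."""
    # Clamp to 1 day so a repo created today does not divide by zero.
    age_days = max((parse_ts(as_of) - parse_ts(repo["created_at"])).days, 1)
    stars = repo["stars"]
    return {
        "age_days": age_days,
        "stars_per_day": round(stars / age_days, 2),
        "fork_to_star_ratio": round(repo["forks"] / stars, 3) if stars else 0.0,
    }


scrapy = {"created_at": "2010-02-22T02:23:08Z", "stars": 53421, "forks": 13812}
print(computed_metrics(scrapy, as_of="2025-04-04T18:30:00Z"))
```

With the values from the table this reproduces `age_days` 5520, `stars_per_day` 9.68, and `fork_to_star_ratio` 0.259.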
Extra Fields
| Extra | Fields Added |
|---|---|
| Languages | languages_breakdown (lang → bytes + %), languages_flat, language_count |
| Contributors | top_contributors (login, commits, avatar × 10), top_contributor |
| Releases | releases (tag, date, downloads × 5), latest_release, total_release_downloads |
| Topics | all_topics (array), topic_count |
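As an illustration of the Languages extra, the sketch below turns a language-to-bytes mapping (the shape returned by GitHub's repository languages endpoint) into the three fields named in the table. The exact structure of `languages_breakdown` and `languages_flat` is an assumption, as are the helper name and the sample byte counts.

```python
def language_breakdown(lang_bytes: dict) -> dict:
    """Compute per-language share from a {language: bytes} mapping (illustrative)."""
    total = sum(lang_bytes.values())
    # Largest language first, so the flat string reads in order of dominance.
    ordered = sorted(lang_bytes.items(), key=lambda kv: kv[1], reverse=True)
    breakdown = {
        lang: {"bytes": n, "percent": round(100 * n / total, 1) if total else 0.0}
        for lang, n in ordered
    }
    return {
        "languages_breakdown": breakdown,
        "languages_flat": ", ".join(breakdown),
        "language_count": len(breakdown),
    }


# Made-up byte counts, for demonstration only.
sample = language_breakdown({"Shell": 100, "Python": 900})
print(sample["languages_flat"])
```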
Rate Limits & Token
| | Without Token | With Token |
|---|---|---|
| Search API | 10 requests/min | 30 requests/min |
| Other endpoints | 60 requests/hr | 5,000 requests/hr |
| Search results cap | 1,000 | 1,000 |
To create a token: github.com → Settings → Developer Settings → Personal Access Tokens → Fine-grained tokens. No special permissions are needed for public repos.
The actor automatically handles rate limiting: it reads x-ratelimit-remaining headers and waits for resets when exhausted.
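The header-based waiting described above can be sketched as a small pure function. This is not the actor's code: the function name, the one-second safety buffer, and the use of a plain dict with lowercase keys are assumptions. It relies on GitHub's `x-ratelimit-remaining` and `x-ratelimit-reset` response headers, the latter being a UTC epoch timestamp in seconds.

```python
import time
from typing import Optional


def seconds_until_reset(headers: dict, now: Optional[float] = None, buffer: float = 1.0) -> float:
    """Return how long to sleep before the next request (illustrative).

    `headers` is assumed to be a dict with lowercase header names.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("x-ratelimit-remaining", 1))
    if remaining > 0:
        return 0.0  # quota left, no need to wait
    # x-ratelimit-reset is the epoch second at which the quota refills.
    reset_at = float(headers.get("x-ratelimit-reset", now))
    return max(reset_at - now, 0.0) + buffer


wait = seconds_until_reset(
    {"x-ratelimit-remaining": "0", "x-ratelimit-reset": "1000"}, now=970.0
)
print(wait)
```

A caller would pass the headers of the most recent response and `time.sleep()` for the returned duration before retrying.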