Pricing

from $0.99 / 1,000 results

GitHub Repository Scraper - Stars, Topics, Trending

Scrape GitHub repositories by search query and export stars, topics, forks and license to CSV or JSON. No API key is required, and the same query syntax covers trending-repo lists.

Developer

Logiover

✨ What this Actor does / Key features

🔌 Official GitHub Search API — stable, well-documented and structured; no HTML scraping, no headless browser, no anti-bot bypass.
🔎 Full GitHub search syntax — every qualifier GitHub supports: language:, topic:, stars:>N, forks:>N, created:>YYYY-MM-DD, pushed:>YYYY-MM-DD, archived:false, is:public, org:, user:, free text, and any combination.
🔢 Flexible sort & order — sort by stars, forks, updated or help-wanted-issues, ascending or descending.
⭐ Rich repo metadata — 20 fields per repo: stars, forks, open issues, watchers, language, topics, license, archived flag, dates, owner and owner type.
📅 Activity tracking — createdAt, updatedAt and pushedAt separate actively-developed repos from abandoned ones.
👤 Owner identity — owner login plus ownerType (User vs Organization) for downstream segmentation.
♾️ Unlimited mode — maxRepos=0 pulls every result the query allows (GitHub Search caps at 1,000 per query).
🧱 Flat schema — no nested JSON to wrangle; drops straight into a spreadsheet or warehouse.
⏱️ Schedule-friendly — deterministic and idempotent, ideal for daily ecosystem tracking.
🔓 No auth required — anonymous GitHub Search API access.

🚀 Quick start (3 steps)

1. Configure: type any GitHub search query (e.g. topic:ai language:python stars:>500), pick a sort and order, and set maxRepos.
2. Run: click Start. The Actor paginates the Search API in pages of 100 and streams repositories into your dataset.
3. Get your data: open the Output tab and export to JSON, CSV, Excel, HTML, XML or JSONL, or pull it via the Apify API.
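
The Actor's source is not shown on this page, so the following is only a sketch of what "pages of 100" looks like against GitHub's public repository search endpoint. The function name `page_urls` and the constants are illustrative assumptions, not the Actor's own code.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.github.com/search/repositories"
PAGE_SIZE = 100     # largest per_page value the Search API accepts
SEARCH_CAP = 1000   # GitHub Search returns at most 1,000 results per query


def page_urls(query, sort="stars", order="desc", max_repos=0):
    """Return the request URLs needed to collect up to max_repos results."""
    # max_repos=0 means "everything the query allows", i.e. the 1,000-row cap.
    wanted = SEARCH_CAP if max_repos == 0 else min(max_repos, SEARCH_CAP)
    pages = -(-wanted // PAGE_SIZE)  # ceiling division
    urls = []
    for page in range(1, pages + 1):
        params = {"q": query, "sort": sort, "order": order,
                  "per_page": PAGE_SIZE, "page": page}
        urls.append(f"{SEARCH_URL}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    for url in page_urls("topic:ai language:python stars:>500", max_repos=250):
        print(url)
```

A real client would also stop early once a page comes back with fewer than 100 items, and would pause between requests to stay under the unauthenticated rate limit.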

📥 Input

Everything is optional — with no input the Actor defaults to stars:>1000 sorted by stars.

{
  "searchQuery": "topic:ai language:python stars:>500",
  "sort": "stars",
  "order": "desc",
  "maxRepos": 0
}

Example — active Rust CLI devtools created this year

{
  "searchQuery": "language:rust topic:cli created:>2026-01-01 stars:>50 archived:false",
  "sort": "updated",
  "order": "desc",
  "maxRepos": 500
}

Example — every public repo owned by an organization

{
  "searchQuery": "org:vercel fork:false",
  "sort": "stars",
  "order": "desc",
  "maxRepos": 0
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `searchQuery` | string | No | Any GitHub Search query string. Free text and qualifiers both work: `language:python`, `topic:llm`, `stars:>1000`, `created:>2026-01-01`, `pushed:>2026-04-01`, `archived:false`, `is:public`, `org:openai`, `user:torvalds`. Default `stars:>1000`. |
| `sort` | enum | No | Sort field: `stars`, `forks`, `updated`, `help-wanted-issues`. Default `stars`. |
| `order` | enum | No | Sort direction: `desc` or `asc`. Default `desc`. |
| `maxRepos` | integer | No | Hard cap on rows. `0` = pull every available result (GitHub Search caps at 1,000 per query). |
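
If you build the input programmatically, a small validator can catch typos before a run starts. This is a sketch based only on the field table above: `build_input` is a hypothetical helper, and the `maxRepos` default of `0` is an assumption, since the table does not state one.

```python
ALLOWED_SORT = {"stars", "forks", "updated", "help-wanted-issues"}
ALLOWED_ORDER = {"desc", "asc"}


def build_input(search_query="stars:>1000", sort="stars", order="desc", max_repos=0):
    """Return an Actor input dict, rejecting values the field table does not list."""
    if sort not in ALLOWED_SORT:
        raise ValueError(f"sort must be one of {sorted(ALLOWED_SORT)}")
    if order not in ALLOWED_ORDER:
        raise ValueError(f"order must be one of {sorted(ALLOWED_ORDER)}")
    if not isinstance(max_repos, int) or max_repos < 0:
        raise ValueError("maxRepos must be a non-negative integer")
    return {
        "searchQuery": search_query,
        "sort": sort,
        "order": order,
        "maxRepos": max_repos,  # 0 = every available result (assumed default)
    }


if __name__ == "__main__":
    print(build_input("org:vercel fork:false"))
```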

Beating the 1,000-result cap: GitHub Search returns at most 1,000 rows per query. To scrape a larger niche, split the query into narrower windows — by date (created:2024-01-01..2024-06-30), star band (stars:100..500), language or topic. Each window gets its own 1,000-row budget.
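
The date-window split described above can be generated rather than typed by hand. A minimal sketch follows; `date_windows` and `windowed_queries` are hypothetical helper names, and the 30-day window size is an arbitrary starting point.

```python
from datetime import date, timedelta


def date_windows(start, end, days):
    """Yield (first_day, last_day) pairs that cover start..end without overlap."""
    cursor = start
    while cursor <= end:
        last = min(cursor + timedelta(days=days - 1), end)
        yield cursor, last
        cursor = last + timedelta(days=1)


def windowed_queries(base_query, start, end, days=30):
    """Append a created:A..B range to base_query for every window."""
    return [
        f"{base_query} created:{first.isoformat()}..{last.isoformat()}"
        for first, last in date_windows(start, end, days)
    ]


if __name__ == "__main__":
    for query in windowed_queries("topic:llm", date(2024, 1, 1), date(2024, 6, 30)):
        print(query)
```

Each generated string can be used as its own `searchQuery`. If a single window still hits 1,000 rows, shrink `days` for that range, and deduplicate on `id` when you combine windows cut along more than one dimension.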

📤 Output

One row per repository, 20 fields each, exportable to JSON, CSV, Excel, HTML, XML or JSONL. Here is a sample record:

{
  "id": 65600975,
  "fullName": "openai/whisper",
  "name": "whisper",
  "owner": "openai",
  "ownerType": "Organization",
  "description": "Robust Speech Recognition via Large-Scale Weak Supervision",
  "url": "https://github.com/openai/whisper",
  "homepage": "",
  "language": "Python",
  "topics": ["audio", "speech-recognition", "deep-learning", "pytorch"],
  "stars": 88231,
  "forks": 10241,
  "openIssues": 102,
  "watchers": 88231,
  "license": "MIT",
  "isArchived": false,
  "createdAt": "2022-09-21T16:35:42.000Z",
  "updatedAt": "2026-05-15T09:11:00.000Z",
  "pushedAt": "2026-05-12T14:22:18.000Z",
  "scrapedAt": "2026-07-06T10:00:00.000Z"
}
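
For readers curious how a raw Search API item could map onto this record, here is one plausible mapping. The left-hand keys come from the sample above and the right-hand keys are standard GitHub REST field names, but the pairing itself (for example, taking `license` from `spdx_id`) is an assumption, not the Actor's confirmed logic.

```python
from datetime import datetime, timezone


def flatten_repo(item, scraped_at=None):
    """Map one raw GitHub Search API item onto the 20-field row (assumed mapping)."""
    owner = item.get("owner") or {}
    license_info = item.get("license") or {}  # null for unlicensed repos
    return {
        "id": item.get("id"),
        "fullName": item.get("full_name"),
        "name": item.get("name"),
        "owner": owner.get("login"),
        "ownerType": owner.get("type"),
        "description": item.get("description"),
        "url": item.get("html_url"),
        "homepage": item.get("homepage") or "",
        "language": item.get("language"),
        "topics": item.get("topics") or [],
        "stars": item.get("stargazers_count"),
        "forks": item.get("forks_count"),
        "openIssues": item.get("open_issues_count"),
        "watchers": item.get("watchers_count"),
        "license": license_info.get("spdx_id"),
        "isArchived": item.get("archived"),
        "createdAt": item.get("created_at"),
        "updatedAt": item.get("updated_at"),
        "pushedAt": item.get("pushed_at"),
        "scrapedAt": scraped_at or datetime.now(timezone.utc).isoformat(),
    }
```

In GitHub's REST responses `watchers_count` carries the same value as `stargazers_count`, which would explain why `watchers` and `stars` match in the sample record.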

💡 Use cases

Open-source intelligence for VCs — run topic:ai stars:>500 created:>2026-01-01 weekly to catch new AI projects the moment they cross a star threshold, before they trend elsewhere.
Dependency & supply-chain mapping — pull every repo using a topic or library you care about (topic:kubernetes, topic:llm) and build a database of who depends on what.
Security & vulnerability research — find every public repo using a framework or language version; when a CVE drops, you have a pre-built, star-ranked target list.
Technical talent sourcing — language:rust stars:>100 sorted by updated returns maintainers of significant, actively-developed projects — a recruiter's shortlist.
Devtool adoption & competitive intelligence — track every repo that uses a competitor's library, compare growth month over month, and segment by language and topic.
Trending discovery & curation — build "best of GitHub this week/month" newsletters, dashboards or podcasts from a single scheduled scrape.
Ecosystem mapping for research & journalism — map every project in a niche (LLM agents, web3 infra, climate tech) and cross-reference languages, owners, dates and licenses.

👥 Who uses it

VCs & investors scouting open-source momentum · security & AppSec teams tracking dependencies · technical recruiters sourcing by stack · OSS maintainers monitoring competitors · devtool product & growth teams measuring adoption · data teams & journalists powering ecosystem dashboards.

💰 Pricing

This Actor runs on a simple pay-per-result model — you pay only for the repository rows saved to the dataset, and queries that match nothing are not billed. Try it on the free tier first, then scale up. See the Pricing tab on this page for the current rate.

❓ Frequently Asked Questions

Is using this GitHub scraper allowed? Yes. It uses the official public GitHub Search API and reads only publicly visible repository metadata. Use the data responsibly under GitHub's terms.

Do I need a GitHub account, login, token or API key? No. The Actor works against GitHub's public Search API without authentication — no login, no personal access token and no API key to manage. (Authenticated requests would raise the rate-limit ceiling; request a custom build if you need that.)

How many repos can I get per run? GitHub Search caps any single query at 1,000 results. Set maxRepos=0 to pull every available result for your query. To go beyond 1,000, split your query into narrower windows (by date, star band, language or topic) — each window has its own 1,000-row budget.

What search syntax does it support? Any GitHub Search qualifier and free-text combination. Common qualifiers: language:, topic:, stars:, forks:, size:, created:, pushed:, archived:, is:public, org:, user:. Comparators: >N, <N, >=N, <=N, N..M. Use exactly the same syntax you'd type into the github.com search bar.
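
When queries are assembled in code, a tiny builder keeps the qualifier syntax consistent. This is an illustrative sketch; `build_query` is a hypothetical helper and does not validate qualifier names against GitHub's list.

```python
def build_query(free_text="", qualifiers=None):
    """Join free text and name:value qualifiers into one GitHub search string."""
    text = free_text.strip()
    parts = [text] if text else []
    for name, value in (qualifiers or {}).items():
        if isinstance(value, bool):
            # GitHub expects lowercase true/false, e.g. archived:false
            value = "true" if value else "false"
        parts.append(f"{name}:{value}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query("cli", {
        "language": "rust",
        "stars": ">=100",
        "created": "2024-01-01..2024-06-30",
        "archived": False,
    }))
```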

What's the difference between updatedAt and pushedAt? updatedAt reflects any metadata change (a description edit, a star-count change), while pushedAt reflects an actual code push to a branch. Use pushedAt to identify truly active development.
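
As a sketch of that advice, the snippet below keeps only rows whose `pushedAt` falls inside a recent window. It assumes rows shaped like the sample record; `recently_pushed` and the 90-day default are illustrative choices.

```python
from datetime import datetime, timedelta, timezone


def parse_ts(value):
    """Parse an ISO-8601 timestamp such as 2026-05-12T14:22:18.000Z."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def recently_pushed(rows, days=90, now=None):
    """Return rows with a code push within the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [
        row for row in rows
        if row.get("pushedAt") and parse_ts(row["pushedAt"]) >= cutoff
    ]
```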

Is this a GitHub API alternative?

Yes. This Actor reads GitHub's official Search API for you, handles pagination and rate limits, flattens the nested response, and returns a warehouse-ready dataset — so it works as a no-setup GitHub API alternative with no token to manage.

Can I scrape GitHub without an API key?

Yes. The Actor reads public repository metadata through GitHub's anonymous public Search API, so no GitHub login, personal access token or API key is required, only an Apify account.

Can I use it as a GitHub trending repos scraper?

Yes. Sort by stars and filter by created: / pushed: dates to build any trending-style list you want, then export the trending-repos dataset for newsletters, dashboards or VC deal-flow tracking.

How do I export GitHub repos to CSV or JSON?

Run any search query, then download the resulting Apify Dataset as CSV, JSON, Excel, HTML, XML or JSONL. Every repository becomes one flat row, so the export drops straight into a spreadsheet or warehouse.
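
If you download JSON and want your own CSV layout, a few lines of standard-library Python are enough. This sketch assumes rows shaped like the sample record and joins the `topics` array with semicolons; how Apify's built-in CSV export renders arrays is not covered here, and `rows_to_csv` is a hypothetical helper.

```python
import csv
import io


def rows_to_csv(rows, list_separator=";"):
    """Serialise dataset rows to CSV text, joining list values into one cell."""
    if not rows:
        return ""
    buffer = io.StringIO()
    # Column order follows the first row; unknown keys in later rows are ignored.
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow({
            key: list_separator.join(value) if isinstance(value, list) else value
            for key, value in row.items()
        })
    return buffer.getvalue()
```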

How do I get every repo in an organization?

Use org:openai (replace with the org slug) as your searchQuery. Add fork:false to exclude forks: org:openai fork:false.

🔗 More dev, research & intelligence scrapers by logiover

Building a developer-ecosystem or market-intelligence pipeline? Pair the GitHub scraper with the rest of the suite:

| Actor | What it does |
| --- | --- |
| npm Package Intelligence Scraper | npm package metadata, downloads & dependency signals |
| Hugging Face Hub Intelligence Scraper | Models, datasets & Spaces with downloads and likes |
| CVE Security Advisory Monitor | New CVEs and security advisories for dependency risk |
| GitHub Activity Stream | Live GitHub events: commits, releases, issues |
| Semantic Scholar Research Scraper | Academic papers, citations & authors |
| arXiv Paper Scraper | Preprints by category, author and keyword |
| Google News Scraper | News coverage by keyword and source |
| SERP Keyword Research | Search-engine results & keyword intelligence |
| Company Deep Research Scraper | Consolidated company intelligence dossiers |
| Meta Ad Library Scraper | Competitor ad creative & spend signals |
| Product Hunt Daily Launches Scraper | Today's launches with votes and makers |
| Discussion Intelligence Scraper | Cross-platform discussion & sentiment mining |

👉 Browse all logiover scrapers on Apify Store — 180+ actors across real estate, jobs, crypto, social media & B2B data.

⏰ Scheduling & integration

Schedule this Actor to run hourly for new-trending-repo alerts in a fast-moving topic, daily for ecosystem dashboards and VC deal-flow trackers, weekly for newsletters and dependency reports, or monthly for "state of the X ecosystem" reports. Push new rows into Slack, Discord, Notion, Airtable, Google Sheets, Postgres, BigQuery or your CRM via Apify Webhooks and the Apify API. Connect it to Make, n8n or Zapier for no-code pipelines.
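
For alerting, the useful signal is usually the difference between two scheduled runs rather than the full dataset. The sketch below compares two lists of rows shaped like the sample record; `new_repos` and `star_growth` are hypothetical helpers, and loading the two datasets (for example via the Apify API) is left out.

```python
def new_repos(previous_rows, current_rows):
    """Return rows whose id did not appear in the previous run."""
    seen = {row["id"] for row in previous_rows}
    return [row for row in current_rows if row["id"] not in seen]


def star_growth(previous_rows, current_rows):
    """Map fullName -> star delta for repos present in both runs."""
    before = {row["id"]: row["stars"] for row in previous_rows}
    return {
        row["fullName"]: row["stars"] - before[row["id"]]
        for row in current_rows
        if row["id"] in before
    }
```

Matching on `id` rather than `fullName` keeps the comparison stable when a repository is renamed or transferred between runs.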

⭐ Support & feedback

Found a bug or need an extra field? Open an issue on the Issues tab — response is usually fast. If this Actor saves you time, a ★★★★★ review on the Store page genuinely helps and is hugely appreciated. 🙏

⚖️ Legal

This Actor uses the official public GitHub Search API and extracts only publicly visible repository metadata. It is intended for legitimate research, analytics and monitoring. You are responsible for complying with GitHub's terms of service and any applicable laws.

📝 Changelog

2026-07-06

✨ README overhaul: shields badge row, richer trimmed output sample, collapsible full field reference, ready-to-run example scenarios, dev/research suite cross-links and clearer quick-start.

2026-07-01

Maintenance pass: re-verified end-to-end on live data and confirmed successful runs within the 5-minute quality window on the default input.
Sharpened Store metadata (SEO title & description) and expanded the FAQ with high-intent, long-tail questions for easier discovery in Google and Apify Store search.
Added ready-to-run example tasks that cover common real-world use cases.

2026-06-15

Reliability pass: re-verified end-to-end on live data with real-world inputs. Routine maintenance build.

2026-06-07

Docs: added coverage for exporting GitHub repos to CSV/JSON, scraping GitHub without an API key, and building trending repos / GitHub stars data exports.

2026-06-05

🛡️ Reliability fix: results are no longer dropped by strict output validation — runs now complete cleanly even at high volume (thousands of results).
⚡ Stability & performance hardening; fresh rebuild.

2026-06-04

Verified live & refreshed build — reliability/maintenance pass.

2026-06-01

Maintenance & reliability pass: pulled the latest source and rebuilt the Actor on the current base image; build verified.
