GitHub Organisation Scraper
Pricing
Pay per event
Pull GitHub organisation metadata and each organisation's public repository list: display name, description, location, blog, follower and public-repo counts, plus a per-repo summary (name, stars, language, last push). Uses the free GitHub REST API, with an optional token for higher rate limits.
Developer: DevilScrapes (maintained by Community)
Actor stats: 0 bookmarks, 2 total users, 1 monthly active user, last modified 3 days ago
🎯 What this scrapes
GitHub publishes every org at api.github.com/orgs/{slug} and its public repos at /orgs/{slug}/repos. This Actor takes a list of org slugs, fans them out, and writes one row per organisation — with its public repos summary attached.
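The flow described above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: the helper names (`org_url`, `repos_url`, `build_row`, `fan_out`), the `sort=updated` query parameter, and the key names inside each repo summary are all assumptions made for the example.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Optional

API_ROOT = "https://api.github.com"


def org_url(slug: str) -> str:
    """Profile endpoint for one organisation."""
    return f"{API_ROOT}/orgs/{slug}"


def repos_url(slug: str, per_page: int = 30) -> str:
    """Public-repos endpoint; GitHub caps per_page at 100."""
    per_page = max(1, min(per_page, 100))
    # sort=updated is an assumption about how "recently updated" is chosen.
    return f"{API_ROOT}/orgs/{slug}/repos?type=public&sort=updated&per_page={per_page}"


def get_json(url: str, token: Optional[str] = None) -> Any:
    """GET a GitHub API URL and decode the JSON body."""
    headers = {"Accept": "application/vnd.github+json", "User-Agent": "org-sketch"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def build_row(org: Dict[str, Any], repos: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Shape one dataset row: org fields plus a per-repo summary list."""
    return {
        "login": org["login"],
        "name": org.get("name"),
        "public_repos": org.get("public_repos", 0),
        # Summary key names below are hypothetical.
        "repos": [
            {
                "name": repo["name"],
                "stars": repo.get("stargazers_count", 0),
                "language": repo.get("language"),
                "pushed_at": repo.get("pushed_at"),
            }
            for repo in repos
        ],
    }


def scrape_org(slug: str, token: Optional[str] = None, per_page: int = 30) -> Dict[str, Any]:
    """Two API calls per org: the profile, then its public repos."""
    return build_row(get_json(org_url(slug), token), get_json(repos_url(slug, per_page), token))


def fan_out(slugs: List[str], worker: Callable[[str], Any], concurrency: int = 4) -> List[Any]:
    """Run worker over all slugs with bounded parallelism, keeping input order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(worker, slugs))
```

Calling `fan_out(["apify", "anthropics"], scrape_org)` would hit the live API and return one row per organisation, in input order.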
🔥 What we handle for you
- 🛡️ Browser fingerprint rotation: `curl-cffi` impersonates real Chrome / Firefox / Safari TLS handshakes so the target sees a browser, not Python.
- 🌐 Residential proxy rotation via Apify Proxy: fresh session and exit IP on every block.
- 🔁 Retries with exponential backoff on `408 / 429 / 5xx`: up to 5 attempts per page, `Retry-After` honoured.
- 🧱 Rate-limit-aware pacing: when the target pushes back, we slow down instead of getting banned.
- 🧊 Clean, typed dataset rows: Pydantic-validated, ISO-8601 timestamps, stable IDs, JSON / CSV / Excel export straight from the Apify Console.
- 💰 Pay-Per-Event pricing: apart from the small per-run start charge, you only pay for results that hit your dataset.
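As an illustration of the retry behaviour listed above, here is a minimal sketch of exponential backoff that honours `Retry-After`. It is not the Actor's implementation: the function names, the 1-second base delay, and the 60-second cap are assumptions, and only the numeric (seconds) form of `Retry-After` is handled. The `send` callable is injected so the logic can be exercised without a network.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

# (status code, response headers, body)
Response = Tuple[int, Mapping[str, str], bytes]


def is_retryable(status: int) -> bool:
    """408, 429 and any 5xx are worth retrying."""
    return status in (408, 429) or 500 <= status <= 599


def backoff_delay(attempt: int, retry_after: Optional[str] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before the next try; attempt is 0-based."""
    if retry_after is not None and retry_after.strip().isdigit():
        # The server told us how long to wait; respect it, up to the cap.
        return min(float(retry_after), cap)
    # Otherwise double the delay on each attempt: base, 2*base, 4*base, ...
    return min(cap, base * 2 ** attempt)


def fetch_with_retries(send: Callable[[], Response], max_attempts: int = 5,
                       sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() up to max_attempts times, sleeping between retryable failures."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers, _body = response
        if not is_retryable(status):
            break
        sleep(backoff_delay(attempt, headers.get("Retry-After")))
        response = send()
    # Either a non-retryable response or the last failed attempt.
    return response
```

A production version would usually add random jitter to the delay so that parallel workers do not retry in lockstep; it is left out here to keep the delays predictable.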
💡 Use cases
- Lead gen — pull a list of dev-tool companies and surface contact info from public org profiles.
- M&A research — quantify the open-source surface of a target company.
- Hiring — rank organisations by location + activity to find engineering-heavy targets.
- Dependency intel — combine with the repo Actor to inventory a single company's libraries.
⚙️ How to use it
- Click Try for free at the top of the page.
- Fill in the input form — most fields have sensible defaults.
- Click Start. Output streams into the run's dataset.
- Export from Storage → Dataset as JSON, CSV, or Excel — or fetch via the API.
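For the API route in the last step, a small sketch of building the dataset download URL may help. It assumes the Apify API v2 dataset-items endpoint; the `dataset_items_url` helper and the `abc123` dataset ID are made up for the example, and the real ID comes from your run.

```python
from urllib.parse import urlencode

APIFY_API_ROOT = "https://api.apify.com/v2"

# Export formats named in this README, mapped to the API's format values.
FORMATS = {"json": "json", "csv": "csv", "excel": "xlsx"}


def dataset_items_url(dataset_id: str, export_format: str = "json") -> str:
    """Build the URL that returns a dataset's items in the chosen format."""
    key = export_format.lower()
    if key not in FORMATS:
        raise ValueError(f"unsupported format: {export_format!r}")
    query = urlencode({"format": FORMATS[key]})
    return f"{APIFY_API_ROOT}/datasets/{dataset_id}/items?{query}"


# "abc123" is a placeholder dataset ID.
print(dataset_items_url("abc123", "csv"))
```

Datasets that are not public also need your Apify API token, typically sent as an `Authorization: Bearer` header.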
📥 Input
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| `orgs` | array | yes | `["apify", "anthropics"]` | List of GitHub organisation slugs (the bit after github.com/). URLs are also accepted. |
| `githubToken` | string | no | (none) | Lifts rate limit from 60/hour to 5 000/hour. Read-only public access is sufficient. |
| `includeRepos` | boolean | no | `true` | Adds up to `maxReposPerOrg` recently-updated repos per org. One extra API call per org. |
| `maxReposPerOrg` | integer | no | `30` | Cap on repos returned per org. Hard ceiling 100 per page. |
| `concurrency` | integer | no | `4` | Parallel API requests. |
| `proxyConfiguration` | object | no | `{"useApifyProxy": false}` | Optional. GitHub doesn't IP-block at normal volumes. |
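Since `orgs` accepts both bare slugs and URLs, one plausible way to normalise them is sketched below. The helper names and the exact URL shapes handled are assumptions for illustration, and the slug check is deliberately loose (letters, digits, hyphens) rather than a full statement of GitHub's naming rules.

```python
import re
from typing import Iterable, List
from urllib.parse import urlparse

GITHUB_HOSTS = {"github.com", "www.github.com"}


def normalise_org(value: str) -> str:
    """Turn a slug or a github.com URL into a bare organisation slug."""
    text = value.strip()
    # Accept scheme-less URLs such as "github.com/apify".
    if "://" not in text and text.lower().startswith(("github.com/", "www.github.com/")):
        text = "https://" + text
    if "://" in text:
        parsed = urlparse(text)
        if parsed.netloc.lower() not in GITHUB_HOSTS:
            raise ValueError(f"not a GitHub URL: {value!r}")
        parts = [part for part in parsed.path.split("/") if part]
        # Handle both github.com/<slug> and github.com/orgs/<slug>/...
        if len(parts) >= 2 and parts[0] == "orgs":
            slug = parts[1]
        elif parts:
            slug = parts[0]
        else:
            raise ValueError(f"no organisation in URL: {value!r}")
    else:
        slug = text.strip("/")
    if not re.fullmatch(r"[A-Za-z0-9-]+", slug):
        raise ValueError(f"invalid organisation slug: {value!r}")
    return slug


def normalise_orgs(values: Iterable[str]) -> List[str]:
    """Normalise every entry and drop case-insensitive duplicates, keeping order."""
    seen = set()
    result = []
    for value in values:
        slug = normalise_org(value)
        if slug.lower() not in seen:
            seen.add(slug.lower())
            result.append(slug)
    return result
```

Deduplicating up front avoids paying for the same organisation twice when a slug and its URL both appear in the input.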
Example input
{"orgs": ["apify"], "includeRepos": false, "concurrency": 2, "proxyConfiguration": {"useApifyProxy": false}}
📤 Output
Every row is one dataset item.
| Field | Type | Notes |
|---|---|---|
| `login` | string | Organisation slug. |
| `name` | string or null | Display name. |
| `description` | string or null | Org bio. |
| `company` | string or null | Self-declared company string. |
| `blog` | string or null | Homepage URL. |
| `location` | string or null | Location. |
| `email` | string or null | Public email. |
| `twitter_username` | string or null | X handle. |
| `public_repos` | integer | Public repo count. |
| `public_gists` | integer | Public gist count. |
| `followers` | integer | Followers count. |
| `html_url` | string | Org profile URL. |
| `avatar_url` | string | Avatar URL. |
| `members_url_template` | string | Members API template. |
| `type` | string | Always `Organization`. |
| `is_verified` | boolean or null | Verified-org flag. |
| `created_at` | string | Org creation timestamp. |
| `updated_at` | string | Last profile update. |
| `repos` | array or null | Per-repo summary list (when `includeRepos=true`). |
| `scraped_at` | string | When this row was recorded. |
Example output
{"login": "apify", "name": "Apify", "description": "Web scraping and automation platform.", "blog": "https://apify.com", "location": "Prague, Czechia", "public_repos": 412, "html_url": "https://github.com/apify", "type": "Organization"}
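Because many fields are nullable, downstream code should not assume they are present. The sketch below shows one way to read a row defensively; the helper names and the contact fallback order (email, then blog, then profile URL) are assumptions chosen for the example, not part of the Actor.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO-8601 timestamp, tolerating a trailing 'Z' and missing values."""
    if not value:
        return None
    # Older Python versions do not accept 'Z' in fromisoformat, so swap it out.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def contact_summary(row: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the best available contact point and count attached repos."""
    return {
        "login": row["login"],
        "name": row.get("name") or row["login"],
        # Fall back from email to blog to the profile page.
        "contact": row.get("email") or row.get("blog") or row.get("html_url"),
        # repos may be absent or null when includeRepos is false.
        "repo_count": len(row.get("repos") or []),
        "created_at": parse_timestamp(row.get("created_at")),
    }


example = {
    "login": "apify",
    "name": "Apify",
    "blog": "https://apify.com",
    "html_url": "https://github.com/apify",
    "type": "Organization",
}
summary = contact_summary(example)
print(summary["contact"])  # prints https://apify.com
```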
💰 Pricing
Pay-Per-Event — you pay only when these events fire:
| Event | USD | What it is |
|---|---|---|
actor-start | $0.005 | One-off warm-up charge per run |
result | $0.003 | Per dataset item |
Example: a run that produces 1 000 results costs 1 000 × $0.003 = $3.00, plus the $0.005 start charge, so about $3.005 in total. No subscription, no minimum, no card to start. Apify gives every new account $5 of free credit.
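The arithmetic can be checked with a short estimator that uses the two event prices from the table above. The function names are illustrative, and it assumes one `result` event per dataset row and one `actor-start` event per run.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.005")  # charged once per run
RESULT_USD = Decimal("0.003")       # charged per dataset item


def estimate_cost(results: int, runs: int = 1) -> Decimal:
    """Total charge in USD for a number of results spread over some runs."""
    return ACTOR_START_USD * runs + RESULT_USD * results


def results_within_budget(budget_usd: Decimal, runs: int = 1) -> int:
    """How many results fit in a budget after paying the start charges."""
    remaining = budget_usd - ACTOR_START_USD * runs
    if remaining <= 0:
        return 0
    return int(remaining // RESULT_USD)


print(estimate_cost(1000))                    # prints 3.005
print(results_within_budget(Decimal("5")))    # prints 1665
```

Under these assumptions, the $5 free credit covers a single run of up to 1 665 organisations.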
🚧 Limitations
Member lists, private repos, and security advisories are out of scope. Some fields (location, email, blog) are user-supplied and frequently null.
❓ FAQ
Why is email empty?
Most orgs hide email. We surface what GitHub publishes — never guess.
What's a verified org?
GitHub flags some orgs as verified (Apple, Microsoft, etc.). Each row carries that flag as the `is_verified` boolean.
Are private members listed?
Never. The Actor only uses the public read-only API. If you need member listings, query GitHub yourself as a member of that org, using a token with the `read:org` scope.
Can I get GitHub Advanced Security (GHAS) findings?
Out of scope. Those come from the paid GitHub Advanced Security API.
💬 Your feedback
Spotted a bug, hit a weird edge case, or need a new field? Open an issue on the Actor's Issues tab on Apify Console — we ship fixes weekly and we read every report.