# Github Profile Scraper (`kawsar/github-profile-scraper`) Actor

GitHub Profile Scraper that pulls followers, repos, bio, location, and contact info from any public GitHub account, so recruiters and researchers can build prospect lists without clicking through profiles one by one.

- **URL**: https://apify.com/kawsar/github-profile-scraper.md
- **Developed by:** [Kawsar](https://apify.com/kawsar) (community)
- **Categories:** Lead generation, Developer tools, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Since this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## GitHub Profile Scraper: Extract Public Data from Any GitHub Account

GitHub Profile Scraper pulls public profile data from any GitHub user account and returns it as structured JSON. Paste in a username and get back name, bio, location, company, website, follower count, repo count, Twitter handle, email, and more. Or paste in a list of usernames and process them all in one run.

No browser automation, no manual copy-paste, no setup on your end. Just usernames in, data out.

### What data does this Actor extract?

Each scraped profile returns the following fields:

| Field | Description |
|---|---|
| `username` | GitHub handle (login name) |
| `name` | Full display name |
| `bio` | Profile bio text |
| `location` | City or country |
| `company` | Employer or organization |
| `websiteUrl` | Personal website or portfolio URL |
| `twitterUsername` | Twitter/X handle (from main profile) |
| `socialAccounts` | All linked social media accounts as `[{provider, url}]` (LinkedIn, YouTube, Instagram, etc.) |
| `avatarUrl` | Profile picture URL |
| `profileUrl` | Full GitHub profile URL |
| `email` | Public contact email (null if not set) |
| `followers` | Number of followers |
| `following` | Number of accounts followed |
| `publicRepos` | Count of public repositories |
| `publicGists` | Count of public gists |
| `totalStars` | Total stars received across all public repositories |
| `totalForks` | Total forks across all public repositories |
| `hireable` | Open-to-work flag (true, false, or null) |
| `accountType` | "User" or "Organization" |
| `createdAt` | Account creation date (ISO 8601) |
| `updatedAt` | Last profile update date (ISO 8601) |
| `scrapedAt` | Timestamp of when this record was collected |
| `error` | Error message if the profile could not be fetched, otherwise null |

### How to use it

#### Input

| Field | Type | Default | Description |
|---|---|---|---|
| `username` | string | `kawsarlog` | Single GitHub username to scrape |
| `usernames` | string list | — | Multiple usernames for batch scraping |
| `maxItems` | integer | `100` | Cap on profiles processed per run |
| `requestTimeoutSecs` | integer | `30` | Per-request timeout in seconds |

You can provide a single username, a list of usernames, or both. Duplicates are ignored. Usernames with a leading `@` are accepted and cleaned automatically.
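
The input rules above (a single username plus a list, duplicates ignored, leading `@` stripped) can be sketched in Python as follows. This only illustrates the described behavior and is not the Actor's actual code: the function name `normalize_usernames`, the whitespace trimming, and the case-insensitive duplicate check are assumptions.

```python
from typing import Iterable, List, Optional


def normalize_usernames(
    username: Optional[str] = None,
    usernames: Optional[Iterable[str]] = None,
    max_items: int = 100,
) -> List[str]:
    """Merge the single and batch inputs into one clean, ordered list."""
    candidates = ([username] if username else []) + list(usernames or [])
    seen = set()
    result = []
    for raw in candidates:
        # Assumption: surrounding whitespace is trimmed before the "@" is removed.
        cleaned = raw.strip().lstrip("@")
        if not cleaned:
            continue
        # Assumption: duplicates are compared case-insensitively,
        # because GitHub logins are case-insensitive.
        key = cleaned.lower()
        if key in seen:
            continue
        seen.add(key)
        result.append(cleaned)
    # Apply the per-run cap after de-duplication.
    return result[:max_items]


print(normalize_usernames("@torvalds", ["gvanrossum", "Torvalds", "kawsarlog"]))
# prints ['torvalds', 'gvanrossum', 'kawsarlog']
```

Running a helper like this before you submit a run gives you a preview of how many distinct profiles the input is likely to produce.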

#### Example input

```json
{
    "username": "torvalds",
    "usernames": ["gvanrossum", "kawsarlog"],
    "maxItems": 100,
    "requestTimeoutSecs": 30
}
````

#### Example output

```json
{
    "username": "torvalds",
    "name": "Linus Torvalds",
    "bio": "Just a geek",
    "location": "Portland, OR",
    "company": "Linux Foundation",
    "websiteUrl": null,
    "twitterUsername": null,
    "socialAccounts": [
        {"provider": "linkedin", "url": "https://www.linkedin.com/in/torvalds"}
    ],
    "avatarUrl": "https://avatars.githubusercontent.com/u/1024025?v=4",
    "profileUrl": "https://github.com/torvalds",
    "email": null,
    "followers": 239000,
    "following": 0,
    "publicRepos": 7,
    "publicGists": 0,
    "totalStars": 218400,
    "totalForks": 43200,
    "hireable": null,
    "accountType": "User",
    "createdAt": "2011-09-03T15:26:22Z",
    "updatedAt": "2024-12-15T08:41:12Z",
    "scrapedAt": "2026-04-13T10:00:00.000000+00:00",
    "error": null
}
```

### Use cases

- **Recruiting**: screen developer candidates by checking their public repo count, follower count, and account activity before reaching out
- **Sales prospecting**: build technical lead lists by collecting company, website, and contact info from GitHub profiles at scale
- **Open-source research**: track contributor profiles across multiple accounts without visiting each page manually
- **HR analytics**: compare candidate GitHub presence as part of a structured evaluation process
- **Market research**: map out the GitHub activity of a competitor's engineering team or open-source community
- **Data enrichment**: append GitHub profile data to existing contact lists using batch mode

### Batch scraping

Supply a list of usernames in the `usernames` field to scrape many profiles per run. The Actor processes them in order, handles errors per profile without stopping the run, and saves each result to the dataset as it goes.

If a username does not exist or returns an error, the output record for that username will have `null` for all data fields and a message in the `error` field.
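
Because failed usernames still produce a record, a downstream script can separate successes from failures by checking the `error` field. The sketch below assumes the items have already been downloaded as a list of dictionaries; `split_results` is a hypothetical helper, and the sample records (including the error text) are made up for illustration.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def split_results(items: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate successfully scraped profiles from error records."""
    ok = [item for item in items if item.get("error") is None]
    failed = [item for item in items if item.get("error") is not None]
    return ok, failed


# Hypothetical sample items shaped like the output described above.
sample = [
    {"username": "torvalds", "followers": 239000, "error": None},
    {"username": "no-such-user-xyz", "followers": None, "error": "Profile not found"},
]

ok, failed = split_results(sample)
print([p["username"] for p in ok])
print([(p["username"], p["error"]) for p in failed])
```

Keeping the failed records in a separate list makes it easy to retry only those usernames in a later run.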

### FAQ

**How many profiles can I scrape per run?**
The default limit is 100 profiles per run, adjustable up to 1,000 via the `maxItems` field.

**Does this work for organization accounts?**
Yes. GitHub organizations have public profiles with similar fields. The `accountType` field will return `"Organization"` instead of `"User"`.

**What happens if a profile is private or does not exist?**
The Actor logs an error for that username and continues with the rest. The failed profile gets an error record in the output dataset so nothing is silently skipped.

**Can I scrape private profile data?**
No. This Actor only collects data that GitHub exposes publicly. Private emails, private repos, and private account settings are not accessible.

**Can I input usernames with an @ symbol?**
Yes. The Actor strips the leading `@` automatically, so `@torvalds` and `torvalds` both work.

**What format is the output?**
JSON, saved to the Apify dataset. You can export it as JSON, CSV, or XLSX directly from the Apify console, or use the Apify API to pull it into your pipeline.
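
If you pull the JSON into your own pipeline rather than exporting from the console, the nested `socialAccounts` list needs flattening before it fits a spreadsheet row. Here is a minimal sketch using only the Python standard library. The helper name `profiles_to_csv`, the column selection, and the `"; "` separator are assumptions, not part of the Actor.

```python
import csv
import io
from typing import Any, Dict, List


def profiles_to_csv(items: List[Dict[str, Any]], columns: List[str]) -> str:
    """Render dataset items as CSV text, flattening `socialAccounts`."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for item in items:
        row = dict(item)
        accounts = row.get("socialAccounts") or []
        # Join the nested list into one cell so the row stays flat.
        row["socialAccounts"] = "; ".join(account["url"] for account in accounts)
        # None values (for example a missing email) are written as empty cells.
        writer.writerow(row)
    return buffer.getvalue()


items = [
    {
        "username": "torvalds",
        "email": None,
        "followers": 239000,
        "socialAccounts": [
            {"provider": "linkedin", "url": "https://www.linkedin.com/in/torvalds"}
        ],
    }
]
print(profiles_to_csv(items, ["username", "email", "followers", "socialAccounts"]))
```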

### Scheduling and monitoring

You can schedule this Actor to run on a cron schedule from the Apify console. This is useful for monitoring follower growth, tracking profile updates, or refreshing a contact list on a weekly basis.
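
As an illustration of follower-growth monitoring, the sketch below compares the items from two scheduled runs and reports the change per username. It assumes you have stored both result sets yourself; `follower_deltas` is a hypothetical helper and the numbers are invented.

```python
from typing import Any, Dict, List


def follower_deltas(
    previous: List[Dict[str, Any]], current: List[Dict[str, Any]]
) -> Dict[str, int]:
    """Return the follower change for each username present in both snapshots."""
    # Skip error records, whose data fields are null.
    before = {
        p["username"]: p["followers"] for p in previous if p.get("error") is None
    }
    deltas = {}
    for profile in current:
        name = profile["username"]
        if profile.get("error") is None and name in before:
            deltas[name] = profile["followers"] - before[name]
    return deltas


# Invented snapshots from two hypothetical weekly runs.
last_week = [{"username": "torvalds", "followers": 238500, "error": None}]
this_week = [
    {"username": "torvalds", "followers": 239000, "error": None},
    {"username": "gvanrossum", "followers": 25000, "error": None},
]
print(follower_deltas(last_week, this_week))
# prints {'torvalds': 500}
```

Usernames that appear in only one snapshot are left out, since there is nothing to compare them against.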

### GitHub profile scraper for recruiters and researchers

Whether you are building a candidate database, enriching a CRM with developer data, or mapping out an open-source community, this GitHub profile scraper gets you structured data without the manual work. Run it on demand or on a schedule and export the results wherever you need them.

# Actor input Schema

## `username` (type: `string`):

A single GitHub username to scrape (e.g. torvalds). Leading @ is stripped automatically.

## `usernames` (type: `array`):

Scrape multiple GitHub profiles in one run. Enter one username per line. Duplicates are ignored.

## `maxItems` (type: `integer`):

Maximum number of profiles to process per run. Hard cap is 1000.

## `requestTimeoutSecs` (type: `integer`):

Per-request timeout in seconds. Increase for slow connections.
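
The numeric fields have bounds, listed in the OpenAPI specification in this document: `maxItems` accepts 1 to 1000 and `requestTimeoutSecs` accepts 5 to 120. A client can check these before starting a run. The following is a minimal sketch, and `build_input` is a hypothetical helper rather than part of any Apify library.

```python
from typing import Any, Dict, Iterable

# (minimum, maximum, default) copied from the input schema in this document.
LIMITS = {
    "maxItems": (1, 1000, 100),
    "requestTimeoutSecs": (5, 120, 30),
}


def build_input(usernames: Iterable[str], **overrides: int) -> Dict[str, Any]:
    """Build an Actor input object, rejecting out-of-range numeric values."""
    unknown = set(overrides) - set(LIMITS)
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    run_input: Dict[str, Any] = {"usernames": list(usernames)}
    for field, (low, high, default) in LIMITS.items():
        value = overrides.get(field, default)
        if not isinstance(value, int) or not low <= value <= high:
            raise ValueError(
                f"{field} must be an integer between {low} and {high}, got {value!r}"
            )
        run_input[field] = value
    return run_input


print(build_input(["torvalds", "gvanrossum"], maxItems=50))
```

Validating locally catches an out-of-range value before a run is started, so you get the error immediately.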

## Actor input object example

```json
{
  "username": "torvalds",
  "usernames": [
    "kawsarlog",
    "torvalds"
  ],
  "maxItems": 100,
  "requestTimeoutSecs": 30
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "username": "kawsarlog",
    "usernames": [
        "kawsarlog",
        "torvalds",
        "gvanrossum"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("kawsar/github-profile-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "username": "kawsarlog",
    "usernames": [
        "kawsarlog",
        "torvalds",
        "gvanrossum",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("kawsar/github-profile-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "username": "kawsarlog",
  "usernames": [
    "kawsarlog",
    "torvalds",
    "gvanrossum"
  ]
}' |
apify call kawsar/github-profile-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=kawsar/github-profile-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Github Profile Scraper",
        "description": "GitHub Profile Scraper that pulls followers, repos, bio, location, and contact info from any public GitHub account, so recruiters and researchers can build prospect lists without clicking through profiles one by one.",
        "version": "0.0",
        "x-build-id": "rIfpMnHcng4XP3nHn"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/kawsar~github-profile-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-kawsar-github-profile-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/kawsar~github-profile-scraper/runs": {
            "post": {
                "operationId": "runs-sync-kawsar-github-profile-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/kawsar~github-profile-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-kawsar-github-profile-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "username": {
                        "title": "GitHub username",
                        "type": "string",
                        "description": "A single GitHub username to scrape (e.g. torvalds). Leading @ is stripped automatically."
                    },
                    "usernames": {
                        "title": "GitHub usernames (batch)",
                        "type": "array",
                        "description": "Scrape multiple GitHub profiles in one run. Enter one username per line. Duplicates are ignored.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of profiles to process per run. Hard cap is 1000.",
                        "default": 100
                    },
                    "requestTimeoutSecs": {
                        "title": "Request timeout (seconds)",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Per-request timeout in seconds. Increase for slow connections.",
                        "default": 30
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
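
For projects that do not use a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library and assumes the endpoint returns the dataset items as a JSON array. The helper names and the 300-second timeout are illustrative choices, and `run_and_fetch` is defined but not called here because it needs a real token.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

BASE_URL = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/kawsar~github-profile-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: Dict[str, Any]) -> urllib.request.Request:
    """Build the POST request: token as a query parameter, input as the JSON body."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(
    token: str, run_input: Dict[str, Any], timeout: float = 300.0
) -> List[Dict[str, Any]]:
    """Send the request and decode the response (assumed to be a JSON array)."""
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Inspect the request without sending it.
request = build_request("<YOUR_API_TOKEN>", {"usernames": ["torvalds"]})
print(request.get_method(), request.full_url)
```

To execute a real run, call `run_and_fetch` with your own API token in place of the placeholder.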
