Github Profile Scraper

Pricing

$20.00/month + usage

Scrapes GitHub user profiles including bio, repositories, followers, contributions, and more. Accepts a list of usernames and extracts comprehensive profile data.

Rating: 5.0 (1)

Developer: VulnV (Maintained by Community)

Last modified: 2 months ago

🚀 GitHub Profile Scraper ⚡ Extract Developer Profiles at Scale

Overview

The GitHub Profile Scraper is an Apify Actor that extracts data from public GitHub user profiles. It is suited to recruitment, developer research, competitive analysis, and building developer databases, and it returns each user's profile details, repositories, and contribution activity.

✅ Bulk username processing | ✅ Comprehensive profile data | ✅ Email extraction (when public) | ✅ Repository analysis | ✅ Contribution tracking

Complete Profile Data Extraction

Basic Information — Name, username, bio, location, website
Contact Details — Email addresses (when publicly visible)
Professional Details — Company, Twitter/X handle
Network Statistics — Followers, following counts
Repository Data — Public repositories count, pinned repositories with details
Activity Metrics — Contribution counts and contribution graph data
Social Links — Website, social media profiles
Starred Repositories — List of starred projects (when accessible)

Key Features

Bulk Processing — Process multiple GitHub usernames in one run
Smart Email Detection — Extracts emails using multiple methods including itemprop="email" elements (only for publicly visible emails)
Proxy Support — Built-in Apify proxy integration for reliable scraping
Error Handling — Robust error handling with detailed status reporting
Clean JSON Output — Structured, ready-to-use data format
Username Validation — Automatic username cleaning and validation with GitHub format requirements
Format Flexibility — Accepts various username formats and automatically normalizes them
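
The email detection described above can be illustrated with a minimal sketch. It assumes the public email is rendered inside an element carrying itemprop="email"; the class and function names are hypothetical, and it uses the standard-library HTML parser rather than BeautifulSoup so that it runs without extra dependencies. It is not the Actor's actual implementation.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class EmailItempropParser(HTMLParser):
    """Collects the text found inside the first itemprop="email" element."""

    def __init__(self):
        super().__init__()
        self._depth = 0   # > 0 while we are inside an itemprop="email" element
        self.chunks = []  # text fragments collected inside that element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif dict(attrs).get("itemprop") == "email":
            self._depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.chunks.append(data)


def extract_public_email(html):
    """Return the publicly visible email in the HTML, or None if absent."""
    parser = EmailItempropParser()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks).strip()
    return text if "@" in text else None
```

Profiles that do not expose an email simply yield None, which matches the null email value shown in the example results.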

🧾 Input Configuration

Submit an array of GitHub usernames via the input schema:

{
  "usernames": [
    "johndeveloper",
    "jane-coder",
    "techexpert123",
    "@another-user",
    "https://github.com/some-developer"
  ],
  "max_threads": 5,
  "proxy_configuration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

Note: The scraper automatically normalizes different username formats and validates them against GitHub's requirements. Invalid usernames will be skipped with warning messages.

Input Parameters

Usernames (usernames, required):
- Array of GitHub usernames to scrape
- Supported formats: username, @username, github.com/username, https://github.com/username
- Usernames must follow GitHub's rules: alphanumeric characters and hyphens only, no consecutive hyphens, no leading or trailing hyphen, at most 39 characters
- Invalid usernames are filtered out and logged as warnings
Max Threads (max_threads, optional):
- Number of concurrent scraping threads (1-20)
- Default: 5
- Higher values speed up processing but increase the chance of rate limiting
Proxy Configuration (proxy_configuration, recommended):
- Enable Apify Proxy to reduce the chance of rate limiting
- Recommended for bulk scraping
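
To check a username list before starting a run, the normalization and validation rules above can be approximated locally. The following is a sketch under the stated rules only; the function names are hypothetical and the Actor's own cleaning logic may differ in details.

```python
import re

# Alphanumeric groups separated by single hyphens: no leading, trailing,
# or consecutive hyphens. Length is checked separately (max 39 characters).
_USERNAME_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")
# Optional scheme and host in front of the username.
_PREFIX_RE = re.compile(r"^(?:https?://)?(?:www\.)?github\.com/", re.IGNORECASE)


def normalize_username(raw):
    """Return the bare username, or None if it breaks GitHub's format rules."""
    value = _PREFIX_RE.sub("", raw.strip())
    value = value.lstrip("@")
    value = value.split("/")[0]  # drop a trailing slash or extra path segments
    if len(value) <= 39 and _USERNAME_RE.fullmatch(value):
        return value
    return None


def clean_usernames(raw_usernames):
    """Split raw inputs into (valid usernames, skipped raw inputs)."""
    valid, skipped = [], []
    for raw in raw_usernames:
        name = normalize_username(raw)
        if name is None:
            skipped.append(raw)
        else:
            valid.append(name)
    return valid, skipped
```

For example, clean_usernames(["@another-user", "user--name"]) keeps another-user and reports user--name as skipped.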

📤 Output Format

Each GitHub profile returns structured data such as:

{
  "username": "johndeveloper",
  "status": "success",
  "name": "John Developer",
  "bio": "Full-stack developer passionate about open source",
  "location": "San Francisco, CA",
  "email": "john@example.com",
  "website": "https://johndeveloper.dev",
  "twitter": "john_codes",
  "followers": "1234",
  "following": "456",
  "repos_count": "42",
  "contribs": "567 contributions in the last year",
  "pinnedrepos": [
    {
      "name": "awesome-project",
      "url": "https://github.com/johndeveloper/awesome-project",
      "desc": "An innovative web application framework",
      "lang": "JavaScript",
      "stars": "2,500",
      "forks": "320"
    }
  ],
  "repos": [
    {
      "url": "https://github.com/johndeveloper/web-framework",
      "name": "web-framework",
      "desc": "Modern web development framework",
      "stars": "1850",
      "forks": "210",
      "languages": [
        {"lang": "JavaScript", "percent": "78.2%"},
        {"lang": "TypeScript", "percent": "18.5%"}
      ]
    }
  ],
  "starred_repos_list": [
    {
      "url": "https://github.com/example-org/popular-tool",
      "name": "popular-tool"
    }
  ],
  "contrib_matrix": [
    {
      "date": "2024-01-01",
      "count": "3",
      "level": "1"
    }
  ]
}
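
Counters in the sample above are strings (for example "1234", "2,500", or "567 contributions in the last year"), so numeric analysis usually needs a conversion step. The helper below is a hedged sketch with hypothetical names. Its handling of abbreviated values such as "1.2k" is an assumption about what the page may display, not something the sample output confirms.

```python
import re

# A number with optional thousands separators and decimals, optionally
# followed directly by a k/m suffix, e.g. "2,500", "567", "1.2k".
_NUMBER_RE = re.compile(r"(\d[\d,]*(?:\.\d+)?)([kKmM])?\b")
_MULTIPLIERS = {"k": 1_000, "m": 1_000_000}


def to_int(text):
    """Parse the first number found in text; return None if there is none."""
    if text is None:
        return None
    match = _NUMBER_RE.search(str(text))
    if not match:
        return None
    number = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").lower()
    return int(round(number * _MULTIPLIERS.get(suffix, 1)))


def normalize_counts(profile):
    """Return a copy of a profile item with top-level counters as integers."""
    result = dict(profile)
    for key in ("followers", "following", "repos_count", "contribs"):
        if key in result:
            result[key] = to_int(result[key])
    return result
```

Nested counters such as stars and forks in pinnedrepos can be converted the same way by calling to_int on each entry.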

Error Handling

Failed profiles return structured error information:

{
  "username": "nonexistent-user",
  "status": "not_found",
  "message": "User not found"
}

Common Error Cases:

not_found — User doesn't exist or profile is private
error — Network issues or scraping errors
Invalid usernames are filtered out before processing with warning logs
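
Because every dataset item carries a status field, results can be split into successes and failures after a run. This sketch assumes only the status values shown above (success, not_found, error); the function names are hypothetical.

```python
from collections import defaultdict


def group_by_status(items):
    """Group dataset items by their status field."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get("status", "unknown")].append(item)
    return dict(groups)


def usernames_to_retry(items):
    """Usernames whose status is "error", which may succeed on a later run."""
    return [item["username"] for item in items if item.get("status") == "error"]
```

Items with status not_found are left out of the retry list, since a missing user is unlikely to appear on a second attempt.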

💼 Common Use Cases

Recruitment & Talent Sourcing

Research developer profiles and technical expertise
Analyze contribution patterns and project involvement
Build comprehensive talent pipelines with GitHub activity data
Assess coding skills through repository analysis

Developer Research & Analysis

Study open source community members and contributors
Analyze technology trends through developer profiles
Research competitor team structures and technical expertise
Track developer career progression and project involvement

Lead Generation & Business Development

Extract contact information for developer outreach
Build databases of potential customers in tech sectors
Identify decision-makers in technology companies
Enrich existing contact databases with GitHub profiles

Community Building & Networking

Find developers with specific skills or interests
Build communities around particular technologies
Identify potential collaborators for open source projects
Research conference speakers and industry experts

📊 Output & Export Options

Dataset Storage

All extracted data is stored in an Apify dataset
Each profile becomes one dataset item
Status tracking for successful and failed extractions

Export Formats

JSON — Raw structured data for API integration
CSV — Spreadsheet-compatible format for analysis
Excel — Formatted spreadsheet with profile data
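
When working from the JSON export, nested fields such as pinnedrepos need to be flattened before they fit a spreadsheet row. The sketch below shows one way to do that locally with the standard library. The column selection and the pinned_repo_names column are illustrative choices, not the layout of the platform's own CSV export.

```python
import csv
import io

# Scalar fields copied as-is; chosen for illustration only.
COLUMNS = ["username", "status", "name", "location", "email", "followers"]


def profiles_to_csv(items):
    """Render profile items as CSV text, joining pinned repo names with ';'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=COLUMNS + ["pinned_repo_names"],
        lineterminator="\n",
    )
    writer.writeheader()
    for item in items:
        row = {column: item.get(column, "") for column in COLUMNS}
        pinned = item.get("pinnedrepos") or []
        row["pinned_repo_names"] = ";".join(repo.get("name", "") for repo in pinned)
        writer.writerow(row)
    return buffer.getvalue()
```

Failed items (status not_found or error) still produce a row, with the profile columns left empty.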

Data Processing

Clean, validated usernames
Structured error reporting
Comprehensive logging for troubleshooting

⚡ Quick Start Guide

Configure Input:
- Add GitHub usernames to the usernames array
- Set desired max_threads (recommended: 5-10)
- Enable proxy configuration for reliable scraping
Run the Actor:
- Execute through Apify Console or API
- Monitor progress through real-time logs
- Review extracted data in the dataset
Export Results:
- Download data in your preferred format
- Integrate with your existing tools and workflows
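
The steps above can also be scripted. The sketch below builds the input documented earlier and runs the Actor through the third-party apify-client package (pip install apify-client). The helper names are hypothetical, and the Actor ID is passed in as a parameter because this page does not state it.

```python
def build_run_input(usernames, max_threads=5, use_proxy=True):
    """Build the Actor input using the field names from the input schema."""
    if not 1 <= max_threads <= 20:
        raise ValueError("max_threads must be between 1 and 20")
    run_input = {"usernames": list(usernames), "max_threads": max_threads}
    if use_proxy:
        run_input["proxy_configuration"] = {"useApifyProxy": True}
    return run_input


def run_scraper(token, actor_id, run_input):
    """Start a run, wait for it to finish, and return the dataset items."""
    # Imported here so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The Actor run did not return any run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

A call would look like run_scraper(token, actor_id, build_run_input(["jane-coder"])), where token is your Apify API token and actor_id is the ID shown on the Actor's page in Apify Console.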

🛡️ Privacy & Compliance

Public Data Only — Extracts only publicly visible profile information
Respects Privacy Settings — Email extraction only works for publicly visible emails
Rate Limiting — Built-in delays and proxy support to respect GitHub's terms
Error Handling — Graceful handling of private or restricted profiles

🔧 Technical Details

Built With

Python & BeautifulSoup — Efficient HTML parsing and data extraction
Apify SDK — Robust actor framework with built-in storage and proxy support
Multi-threading — Concurrent processing for improved performance
Request Handling — Smart retry mechanisms and error recovery
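
As a rough illustration of how concurrent processing and retries can fit together, the sketch below runs a caller-supplied fetch function on a thread pool and converts repeated failures into error items. It is not the Actor's source code: scrape_all, fetch, and retries are hypothetical names, and only max_threads corresponds to a documented input.

```python
from concurrent.futures import ThreadPoolExecutor


def scrape_all(usernames, fetch, max_threads=5, retries=2):
    """Call fetch(username) for each username on a thread pool.

    A failing call is retried up to `retries` times; if every attempt fails,
    an item with status "error" is returned for that username instead.
    """

    def attempt(username):
        last_error = None
        for _ in range(retries + 1):
            try:
                return fetch(username)
            except Exception as exc:  # report any failure as an error item
                last_error = exc
        return {"username": username, "status": "error", "message": str(last_error)}

    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # pool.map keeps results in the same order as the input usernames.
        return list(pool.map(attempt, usernames))
```

Raising max_threads shortens the run when fetch is network-bound, at the cost of sending more requests at once.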

Performance

Process hundreds of profiles per run
Configurable concurrency for optimal speed
Proxy rotation for reliable access
Comprehensive error logging and recovery

📈 Example Results

Successful Profile Extraction

{
  "username": "jane-coder",
  "status": "success",
  "name": "Jane Smith",
  "bio": "Frontend developer specializing in React and TypeScript. Open source enthusiast.",
  "location": "Austin, TX",
  "email": null,
  "website": "https://jane-codes.dev",
  "followers": "3456",
  "following": "234",
  "repos_count": "87",
  "pinnedrepos": [
    {
      "name": "react-toolkit",
      "desc": "Comprehensive React development toolkit",
      "stars": "8500",
      "lang": "TypeScript"
    }
  ]
}

💡 Tips for Best Results

Enable Proxies — Use Apify proxy configuration for reliable large-scale scraping
Username Format — Ensure usernames follow GitHub's format rules:
- Only alphanumeric characters and hyphens allowed
- Cannot start or end with a hyphen
- No consecutive hyphens (e.g., user--name is invalid)
- Maximum 39 characters
- Invalid usernames will be skipped with warnings
Monitor Rate Limits — Use appropriate thread counts to avoid GitHub rate limiting
Handle Private Profiles — Some data may not be available for users with privacy settings
Email Availability — Email extraction only works for publicly visible emails (most users keep emails private)

🆘 Support & Feedback

For questions, feature requests, or technical support:

Visit the Apify Community Forum
Contact us through the Apify platform
Submit issues for improvements and bug reports

🌟 Explore More Actors

✨ Need more scraping solutions? Additional Actors for web automation and data extraction are available on Apify Store.

📧 For inquiries or custom development, reach out at apify@vulnv.com.
