gitWrap
Under maintenance

Pricing

Pay per usage

Scrapes your GitHub profile, repos, commits, issues, and languages to generate a yearly 'GitHub Wrapped' summary. Outputs top languages, frameworks, an activity heatmap, and repo highlights, suited to Spotify Wrapped-style visualizations. No GitHub API needed; fully automated with Apify.

Rating: 0.0 (0)

Developer: Arya Anand Pathak (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 1
  • Monthly active users: 0
  • Last modified: 12 days ago

GitHub Wrapped Scraper

Apify Actor to scrape GitHub profiles, repositories, commits, issues, and more from HTML only


Overview

This Apify Actor scrapes a GitHub user's profile, activity, repositories, and more, to generate a rich dataset for a 'Spotify Wrapped' style frontend. It does NOT use the GitHub API and works solely by scraping HTML pages using CheerioCrawler or PlaywrightCrawler.

Features

  • Scrapes profile info (avatar, name, repo count)
  • Scrapes all public repositories (name, stars, languages, etc.), plus commits, issues, and pull requests
  • Auto-detects major frameworks (React, Next.js, Express, Django, Flask, LangChain)
  • Provides heatmap of commit activity, top repos, languages, frameworks, and more
  • Follows Apify Actor conventions (pagination, retries, request queue, logging)
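
As an illustration of the framework auto-detection listed above, here is a minimal sketch that inspects the text of a package.json file. The mapping from npm package names to framework labels and the function name detectFrameworks are assumptions made for this example, not the Actor's confirmed logic. Django and Flask are Python packages, so they would not appear in package.json and are left out here; the README does not say how the Actor detects them.

```javascript
// Hypothetical mapping from npm package names to the framework labels
// used in the README. The Actor's real mapping may differ.
const NPM_FRAMEWORKS = {
  react: 'React',
  next: 'Next.js',
  express: 'Express',
  langchain: 'LangChain',
};

// Returns the framework labels found in a package.json string.
// Unparseable or non-object input yields an empty list instead of throwing.
function detectFrameworks(packageJsonText) {
  let manifest;
  try {
    manifest = JSON.parse(packageJsonText);
  } catch {
    return [];
  }
  if (manifest === null || typeof manifest !== 'object') {
    return [];
  }
  // Merge runtime and development dependencies into one lookup object.
  const deps = { ...manifest.dependencies, ...manifest.devDependencies };
  return Object.keys(NPM_FRAMEWORKS)
    .filter((name) => Object.prototype.hasOwnProperty.call(deps, name))
    .map((name) => NPM_FRAMEWORKS[name]);
}
```

Labels come back in the order of the mapping, so a repository that depends on both react and next yields React followed by Next.js.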

Usage

Input Schema (.actor/INPUT_SCHEMA.json)

{
  "githubUsername": "string (required)",
  "year": "number (default=2024)",
  "maxRepos": "number (default=30)"
}
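
To make the defaults above concrete, here is a minimal sketch of how the input could be normalized before crawling. The helper name normalizeInput and its validation rules are assumptions for illustration; only the field names and default values come from the schema above.

```javascript
// Hypothetical helper: applies the defaults listed in the input schema.
function normalizeInput(raw) {
  const input = raw ?? {};
  const githubUsername =
    typeof input.githubUsername === 'string' ? input.githubUsername.trim() : '';
  if (!githubUsername) {
    throw new Error('githubUsername is required');
  }
  // Fall back to the documented defaults when a value is missing or invalid.
  const year = Number.isInteger(input.year) ? input.year : 2024;
  const maxRepos =
    Number.isInteger(input.maxRepos) && input.maxRepos > 0 ? input.maxRepos : 30;
  return { githubUsername, year, maxRepos };
}
```

Called with only a username, this returns that username together with year 2024 and maxRepos 30.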

Output Schema (.actor/OUTPUT_SCHEMA.json)

Results look like:

{
  "username": "octocat",
  "displayName": "The Octocat",
  "profilePicture": "https://github.com/octocat.png",
  "year": 2024,
  "totalRepos": 10,
  "totalCommits": 210,
  "totalIssues": 12,
  "totalPullRequests": 3,
  "topLanguages": [{"language": "JavaScript", "count": 120}],
  "topFrameworks": ["React", "Next.js"],
  "activityHeatmap": {"2024-01-01": 2, ...},
  "mostActiveDays": ["2024-03-12", ...],
  "topRepositories": [{"name": "repo", "commits": 70, "stars": 35}],
  "generatedAt": "2024-12-06T12:00:00Z"
}
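
A frontend consuming this result may want to re-derive rankings from the heatmap. The sketch below assumes that mostActiveDays corresponds to the activityHeatmap entries with the highest counts; the README does not confirm how the Actor computes that field, and the function name and tie-breaking rule are hypothetical.

```javascript
// Hypothetical helper: ranks heatmap days by activity count (descending).
// Ties are broken by date (ascending) so the output is deterministic.
function mostActiveDays(activityHeatmap, limit = 3) {
  return Object.entries(activityHeatmap)
    .sort(([dayA, countA], [dayB, countB]) => countB - countA || dayA.localeCompare(dayB))
    .slice(0, limit)
    .map(([day]) => day);
}
```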

How It Works

  • Scrapes public GitHub user profile and repository HTML
  • Follows pagination for repos, commits, issues/PRs
  • Counts/aggregates user activity for heatmaps
  • Tries to fetch and parse package.json for frameworks
  • Handles HTTP 429/403 responses by sleeping and retrying
  • Uses Apify's RequestQueue for robust crawling
  • Pushes exactly one object as the final result
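
The aggregation step in the list above can be sketched as a per-day count. This is an illustration only: it assumes commit times are available as ISO 8601 strings and that days are bucketed in UTC, neither of which the README states, and the function name buildActivityHeatmap is hypothetical.

```javascript
// Hypothetical helper: counts commits per calendar day (UTC) for one year.
// Timestamps that cannot be parsed or fall outside the year are skipped.
function buildActivityHeatmap(commitTimestamps, year) {
  const heatmap = {};
  for (const timestamp of commitTimestamps) {
    const date = new Date(timestamp);
    if (Number.isNaN(date.getTime()) || date.getUTCFullYear() !== year) {
      continue;
    }
    const day = date.toISOString().slice(0, 10); // YYYY-MM-DD
    heatmap[day] = (heatmap[day] ?? 0) + 1;
  }
  return heatmap;
}
```

The resulting object has the same shape as the activityHeatmap field in the output: date strings as keys and counts as values.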

Quality & Limitations

  • Error handling and retries for robustness
  • Pagination for repos, commits, etc.
  • Only scrapes public data (no API, no login)
  • Mind GitHub's rate limits; the Actor sleeps and backs off when it hits them
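
The README does not spell out the back-off policy. As a rough sketch, assuming exponential delays with a cap and a caller-supplied request function, it might look like the following. All names, the delay values, and the retry count are hypothetical, not taken from main.js.

```javascript
// Status codes the README says trigger a sleep and retry.
const RETRYABLE_STATUSES = new Set([403, 429]);

// Exponential back-off: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls doRequest (any async function resolving to an object with a
// numeric `status`) until the status is not retryable or maxRetries
// retries have been used, then returns the last response.
async function fetchWithBackoff(doRequest, { maxRetries = 5, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    const response = await doRequest();
    if (!RETRYABLE_STATUSES.has(response.status) || attempt >= maxRetries) {
      return response;
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

Passing sleep as an option keeps the retry loop testable without real waiting. Apify's crawlers also ship their own retry handling, which may make a hand-rolled loop unnecessary.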

Development

  • main.js contains the Actor logic
  • All schema and metadata files are in the .actor directory

Author & License

MIT (c) 2025 github-wrapped-scraper