gitWrap
Under maintenance

Pricing

from $0.01 / 1,000 results


Scrapes your GitHub profile, repos, commits, issues, and languages to generate a yearly 'GitHub Wrapped' summary. Outputs top languages, frameworks, an activity heatmap, and repo highlights, suited to Spotify-Wrapped style visualizations. No API needed, fully automated with Apify.

Rating

0.0

(0)

Developer

Arya Anand Pathak

Maintained by Community

Actor stats

0

Bookmarked

1

Total users

0

Monthly active users

a month ago

Last modified

GitHub Wrapped Scraper

Apify Actor to scrape GitHub profiles, repositories, commits, issues, and more from HTML only


Overview

This Apify Actor scrapes a GitHub user's profile, activity, repositories, and more, to generate a rich dataset for a 'Spotify Wrapped' style frontend. It does NOT use the GitHub API and works solely by scraping HTML pages using CheerioCrawler or PlaywrightCrawler.

Features

  • Scrapes profile info (avatar, name, repo count)
  • Scrapes all public repositories (name, stars, languages, etc.), along with commits, issues, and pull requests
  • Auto-detects major frameworks (React, Next.js, Express, Django, Flask, LangChain)
  • Provides heatmap of commit activity, top repos, languages, frameworks, and more
  • Follows Apify Actor conventions (pagination, retries, request queue, logging)

Usage

Input Schema (.actor/INPUT_SCHEMA.json)

{
"githubUsername": "string (required)",
"year": "number (default=2024)",
"maxRepos": "number (default=30)"
}
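
The defaults listed above could be applied along these lines. This is a minimal sketch rather than the actor's actual code: `normalizeInput` is a hypothetical helper name, and the integer checks are an assumption that the schema itself does not state.

```javascript
// Hypothetical helper: applies the documented defaults and basic checks.
function normalizeInput(input = {}) {
  const { githubUsername, year = 2024, maxRepos = 30 } = input ?? {};
  if (typeof githubUsername !== 'string' || githubUsername.trim() === '') {
    throw new Error('githubUsername is required');
  }
  // Assumed validation: the schema only says "number" for both fields.
  if (!Number.isInteger(year) || !Number.isInteger(maxRepos) || maxRepos < 1) {
    throw new Error('year and maxRepos must be integers, with maxRepos >= 1');
  }
  return { githubUsername: githubUsername.trim(), year, maxRepos };
}
```

Calling `normalizeInput({ githubUsername: 'octocat' })` would fill in `year: 2024` and `maxRepos: 30`.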

Output Schema (.actor/OUTPUT_SCHEMA.json)

Results look like:

{
"username": "octocat",
"displayName": "The Octocat",
"profilePicture": "https://github.com/octocat.png",
"year": 2024,
"totalRepos": 10,
"totalCommits": 210,
"totalIssues": 12,
"totalPullRequests": 3,
"topLanguages": [{"language":"JavaScript","count":120}],
"topFrameworks": ["React","Next.js"],
"activityHeatmap": {"2024-01-01": 2, ...},
"mostActiveDays": ["2024-03-12",...],
"topRepositories": [{"name":"repo","commits":70,"stars":35}],
"generatedAt": "2024-12-06T12:00:00Z"
}
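
A field such as `mostActiveDays` can be derived from `activityHeatmap` roughly as follows. This is a sketch under assumptions: the helper name, the `limit` parameter, and the tie-breaking rule are illustrative, and the sample heatmap values are made up.

```javascript
// Hypothetical helper: ranks heatmap dates by commit count, highest first.
// Ties are broken by earlier date; the actor's real ordering is not documented.
function mostActiveDays(activityHeatmap, limit = 5) {
  return Object.entries(activityHeatmap)
    .sort(([dateA, a], [dateB, b]) => b - a || dateA.localeCompare(dateB))
    .slice(0, limit)
    .map(([date]) => date);
}

// Illustrative data only.
const sampleHeatmap = { '2024-01-01': 2, '2024-03-12': 9, '2024-03-13': 9, '2024-07-04': 4 };
console.log(mostActiveDays(sampleHeatmap, 3));
// prints [ '2024-03-12', '2024-03-13', '2024-07-04' ]
```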

How It Works

  • Scrapes public GitHub user profile and repository HTML
  • Follows pagination for repos, commits, issues/PRs
  • Counts/aggregates user activity for heatmaps
  • Tries to fetch and parse package.json for frameworks
  • Backs off and retries when GitHub responds with HTTP 429 or 403
  • Uses Apify's RequestQueue for robust crawling
  • Pushes exactly one object as final result
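
The package.json step above might look something like the following once the file's text has been fetched. This is only a sketch: the package-to-framework mapping and the `detectFrameworks` name are assumptions. It covers only the npm-based frameworks, since Django and Flask would not appear in a package.json.

```javascript
// Assumed mapping from npm package name to framework label.
const NPM_FRAMEWORKS = {
  react: 'React',
  next: 'Next.js',
  express: 'Express',
  langchain: 'LangChain',
};

// Hypothetical helper: takes raw package.json text, returns framework labels.
function detectFrameworks(packageJsonText) {
  let pkg;
  try {
    pkg = JSON.parse(packageJsonText);
  } catch {
    return []; // unparseable file: report nothing rather than fail the run
  }
  // Merge runtime and dev dependencies; either section may be missing.
  const deps = { ...pkg?.dependencies, ...pkg?.devDependencies };
  return Object.keys(NPM_FRAMEWORKS)
    .filter((name) => name in deps)
    .map((name) => NPM_FRAMEWORKS[name]);
}
```

Returning an empty list on a parse failure keeps one malformed repository from aborting the whole summary, which fits the "tries to" wording in the list.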

Quality & Limitations

  • Error handling and retries for robustness
  • Pagination for repos, commits, etc
  • Only scrapes public data (no API, no login)
  • GitHub may rate-limit large runs; the actor sleeps and backs off when that happens
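
One common way to implement this kind of backoff is exponential delay with a cap. The sketch below is illustrative, not the actor's confirmed policy: the function names, the 1-second base, the 60-second cap, the attempt limit, and the optional `Retry-After` handling are all assumptions.

```javascript
// Hypothetical backoff policy; base, cap, and attempt limit are illustrative.
const RETRYABLE_STATUSES = new Set([429, 403]);

// Returns how long to sleep before retry number `attempt` (0-based).
// A server-provided Retry-After value in seconds, if given, takes precedence.
function backoffDelayMs(attempt, retryAfterSeconds, baseMs = 1000, capMs = 60000) {
  if (Number.isFinite(retryAfterSeconds) && retryAfterSeconds > 0) {
    return Math.min(retryAfterSeconds * 1000, capMs);
  }
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// Retries only on rate-limit style statuses and only up to maxAttempts.
function shouldRetry(statusCode, attempt, maxAttempts = 5) {
  return RETRYABLE_STATUSES.has(statusCode) && attempt < maxAttempts;
}
```

With these defaults the delays grow as 1 s, 2 s, 4 s, 8 s, and so on until they reach the 60 s cap.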

Development

  • main.js contains actor logic
  • All schema/metadata are in .actor directory

Author & License

MIT (c) 2025 github-wrapped-scraper