Git Commit Authors & Emails
Pricing
from $0.70 / 1,000 results
Extract commit emails from one or more Git repositories and aggregate commit counts per email. Process multiple repos in one run, detect no-reply addresses, map author aliases, and publish both dataset rows and a structured OUTPUT record for analysis, exports, and automation workflows.
Rating: 5.0 (1)
Developer: njoylab
Actor stats
- Bookmarked: 0
- Total users: 2
- Monthly active users: 1
- Last modified: 13 days ago
Git Commit Emails Extractor (Apify Actor)
Extract commit emails from one or more Git repositories and aggregate commit counts per email.
What this actor does
- Accepts multiple repositories as input.
- Supports:
  - GitHub URL (https://github.com/owner/repo)
  - Git URL (git@github.com:owner/repo.git)
  - GitHub shorthand (owner/repo)
- Reads commit history from HEAD.
- Outputs:
  - Dataset rows with commit-email aggregates (one row per email, no type field)
  - Repository summaries in key-value store under OUTPUT
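The three accepted input forms and the HEAD history read can be sketched roughly as follows. This is an illustration, not the actor's actual code: the helper names (resolve_repository, parse_log, read_head_commits) are hypothetical, and the choice to resolve every form to an HTTPS clone URL ending in .git and to read author (rather than committer) fields are assumptions.

```python
import re
import subprocess

# Hypothetical patterns for the three documented input forms.
_PATTERNS = (
    r"https://github\.com/([\w.-]+)/([\w.-]+?)(?:\.git)?/?",  # GitHub URL
    r"git@github\.com:([\w.-]+)/([\w.-]+?)(?:\.git)?",        # Git (SSH) URL
    r"([\w.-]+)/([\w.-]+?)(?:\.git)?",                        # owner/repo shorthand
)


def resolve_repository(repo_input: str) -> str:
    """Map any supported input form to an HTTPS clone URL (assumed target form)."""
    value = repo_input.strip()
    for pattern in _PATTERNS:
        match = re.fullmatch(pattern, value)
        if match:
            owner, repo = match.groups()
            return f"https://github.com/{owner}/{repo}.git"
    raise ValueError(f"Unsupported repository input: {repo_input!r}")


def parse_log(output: str) -> list[tuple[str, str]]:
    """Split 'name<TAB>email' lines into (name, email) pairs."""
    commits = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Split on the last tab, since an email cannot contain one.
        name, _, email = line.rpartition("\t")
        commits.append((name, email))
    return commits


def read_head_commits(repo_dir: str) -> list[tuple[str, str]]:
    """Read author name and email for every commit reachable from HEAD."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", "HEAD", "--format=%an%x09%ae"],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)
```

read_head_commits expects a local clone in repo_dir; the other two helpers are pure functions and can be tried without network access.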
Input
{
  "repositories": ["https://github.com/<user>/<repository>"],
  "excludeNoReply": true,
  "normalizeEmails": true,
  "includeAuthorAliases": true
}
Input fields
- repositories (required): list of repositories to process.
- excludeNoReply (optional, default false): exclude *@users.noreply.github.com and *@noreply.*.
- normalizeEmails (optional, default true): lowercase emails before grouping.
- includeAuthorAliases (optional, default true): include all author names seen for each email.
- branch (optional): branch name used when cloning remote repositories. If the branch does not exist in a repository, the actor automatically falls back to that repository's default branch.
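As a minimal sketch of how these options might interact during grouping, the following assumes commits arrive as (author name, email) pairs. The function names, the use of the first-seen name as authorName, and the glob-style matching of the two no-reply patterns are assumptions for illustration; the actor's internal logic may differ.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# The two patterns listed for excludeNoReply, treated here as glob patterns.
NO_REPLY_PATTERNS = ("*@users.noreply.github.com", "*@noreply.*")


def is_no_reply(email: str) -> bool:
    """Return True if the email matches one of the no-reply patterns."""
    lowered = email.lower()
    return any(fnmatchcase(lowered, pattern) for pattern in NO_REPLY_PATTERNS)


def aggregate(commits, exclude_no_reply=False, normalize_emails=True,
              include_author_aliases=True):
    """Group (name, email) pairs into one row per email with a commit count."""
    counts = defaultdict(int)
    aliases = defaultdict(list)
    for name, email in commits:
        key = email.lower() if normalize_emails else email
        if exclude_no_reply and is_no_reply(key):
            continue
        counts[key] += 1
        if name not in aliases[key]:
            aliases[key].append(name)  # keep first-seen order

    rows = []
    for email, count in counts.items():
        row = {
            "email": email,
            "commitCount": count,
            "isNoReply": is_no_reply(email),
            "authorName": aliases[email][0],  # assumption: first name seen
        }
        if include_author_aliases:
            row["authorAliases"] = aliases[email]
        rows.append(row)
    return rows
```

With normalize_emails left at its default, Ada@Example.com and ada@example.com collapse into a single row. With it disabled, they are counted separately.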
Output examples
Dataset email item
{
  "repositoryInput": "apify/crawlee",
  "repositoryName": "apify/crawlee",
  "resolvedRepository": "https://github.com/apify/crawlee",
  "email": "[EMAIL_ADDRESS]",
  "commitCount": 123,
  "isNoReply": false,
  "authorName": "<author name>",
  "authorAliases": ["<author name>"]
}
OUTPUT (key-value store) summary
{
  "startedAt": "2026-02-12T00:00:00.000Z",
  "finishedAt": "2026-02-12T00:00:15.000Z",
  "settings": {
    "excludeNoReply": true,
    "normalizeEmails": true,
    "includeAuthorAliases": true,
    "branch": null
  },
  "repositorySummaries": [
    {
      "repositoryInput": "https://github.com/apify/crawlee",
      "repositoryName": "apify/crawlee",
      "resolvedRepository": "https://github.com/apify/crawlee.git",
      "totalCommitsInHead": 9339,
      "totalCommitRowsRead": 9339,
      "totalCommitsAfterFilters": 9339,
      "uniqueEmails": 450,
      "durationMs": 1450,
      "success": true
    }
  ]
}
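Once the OUTPUT record has been downloaded and parsed as JSON, a short script can condense it per repository. The sketch below is hypothetical: the summarize helper and its output keys are invented for illustration, and it assumes that totalCommitRowsRead minus totalCommitsAfterFilters equals the number of commits dropped by filters such as excludeNoReply, which the page does not state explicitly.

```python
def summarize(output_record: dict) -> list[dict]:
    """Condense each repository summary from a parsed OUTPUT record."""
    report = []
    for summary in output_record["repositorySummaries"]:
        emails = summary["uniqueEmails"]
        kept = summary["totalCommitsAfterFilters"]
        report.append({
            "repository": summary["repositoryName"],
            "ok": summary["success"],
            # Assumption: the difference is what the filters removed.
            "removedByFilters": summary["totalCommitRowsRead"] - kept,
            # Average commits per distinct email, guarded against zero.
            "commitsPerEmail": round(kept / emails, 2) if emails else 0.0,
        })
    return report
```

Applied to the example record above, this yields one entry for apify/crawlee with no commits removed by filters.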
Notes
- If contributors use masked emails (for example GitHub noreply), the actor cannot infer private real emails.
- Large repositories can produce large datasets.
Disclaimer
This actor extracts commit metadata that may include personal email addresses. Its purpose is to raise awareness about how data such as email addresses can be exposed in Git repositories. Use the output only for legitimate and compliant purposes, and always follow applicable privacy laws, platform terms, and anti-spam rules. You are responsible for how extracted data is stored, shared, and used.

