Git Commit Authors & Emails

Pricing

from $0.70 / 1,000 results

Extract commit emails from one or more Git repositories and aggregate commit counts per email. Process multiple repos in one run, detect no-reply addresses, map author aliases, and publish both dataset rows and a structured OUTPUT record for analysis, exports, and automation workflows.

Rating: 5.0 (1)

Developer: njoylab

Maintained by Community

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user

Last modified: 13 days ago

Git Commit Emails Extractor (Apify Actor)

Extract commit emails from one or more Git repositories and aggregate commit counts per email.

What this actor does

  • Accepts multiple repositories as input.
  • Supports:
    • GitHub URL (https://github.com/owner/repo)
    • Git URL (git@github.com:owner/repo.git)
    • GitHub shorthand (owner/repo)
  • Reads commit history from HEAD.
  • Outputs:
    • Dataset rows with commit-email aggregates (one row per email, no type field)
    • Repository summaries in key-value store under OUTPUT
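
The three accepted input forms can be reduced to a single owner/repo name before cloning. The following is a minimal sketch of one way to do that, not the actor's actual parsing code: the function name normalize_repository and the regular expressions are assumptions made for illustration.

```python
import re

# Hypothetical patterns; the actor's real parsing rules are not documented here.
_SSH = re.compile(r"^git@github\.com:(?P<owner>[^/]+)/(?P<repo>.+?)(?:\.git)?$")
_HTTPS = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?/?$"
)
_SHORTHAND = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def normalize_repository(value: str) -> str:
    """Return 'owner/repo' for a GitHub URL, a Git SSH URL, or shorthand."""
    value = value.strip()
    for pattern in (_SSH, _HTTPS):
        match = pattern.match(value)
        if match:
            return f"{match['owner']}/{match['repo']}"
    if _SHORTHAND.match(value):
        # Shorthand is already in owner/repo form.
        return value
    raise ValueError(f"Unrecognized repository input: {value!r}")
```

With this sketch, https://github.com/owner/repo, git@github.com:owner/repo.git, and owner/repo all map to the same owner/repo string, which makes it easy to deduplicate the repositories list before a run.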

Input

{
  "repositories": [
    "https://github.com/<user>/<repository>"
  ],
  "excludeNoReply": true,
  "normalizeEmails": true,
  "includeAuthorAliases": true
}

Input fields

  • repositories (required): list of repositories to process.
  • excludeNoReply (optional, default false): exclude addresses matching *@users.noreply.github.com or *@noreply.*.
  • normalizeEmails (optional, default true): lowercase emails before grouping.
  • includeAuthorAliases (optional, default true): include all author names seen for each email.
  • branch (optional): branch name used when cloning remote repositories. If the branch does not exist in a repository, the actor automatically falls back to that repository's default branch.
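
How the three boolean options interact can be sketched as a small aggregation over (author name, email) pairs. This is an illustrative approximation under stated assumptions, not the actor's source: the function names are hypothetical, the no-reply patterns are taken from the excludeNoReply description, and choosing the first-seen name as authorName is an assumption.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Patterns copied from the excludeNoReply field description.
NO_REPLY_PATTERNS = ("*@users.noreply.github.com", "*@noreply.*")


def is_no_reply(email: str) -> bool:
    """Check an email against the no-reply patterns, ignoring case."""
    lowered = email.lower()
    return any(fnmatchcase(lowered, pattern) for pattern in NO_REPLY_PATTERNS)


def aggregate(commits, exclude_no_reply=False, normalize_emails=True,
              include_author_aliases=True):
    """Group (author_name, email) pairs into one row per email."""
    counts = defaultdict(int)
    aliases = defaultdict(list)
    for name, email in commits:
        key = email.lower() if normalize_emails else email
        if exclude_no_reply and is_no_reply(key):
            continue
        counts[key] += 1
        if name not in aliases[key]:
            aliases[key].append(name)

    rows = []
    # Highest commit count first; ties are broken alphabetically by email.
    for email, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        row = {
            "email": email,
            "commitCount": count,
            "isNoReply": is_no_reply(email),
            # Assumption: the first name seen for an email is reported.
            "authorName": aliases[email][0],
        }
        if include_author_aliases:
            row["authorAliases"] = list(aliases[email])
        rows.append(row)
    return rows
```

In this sketch, turning normalizeEmails off would keep Ada@Example.com and ada@example.com as separate rows, which is why lowercasing before grouping is usually the more useful setting.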

Output examples

Dataset email item

{
  "repositoryInput": "apify/crawlee",
  "repositoryName": "apify/crawlee",
  "resolvedRepository": "https://github.com/apify/crawlee",
  "email": "[EMAIL_ADDRESS]",
  "commitCount": 123,
  "isNoReply": false,
  "authorName": "<author name>",
  "authorAliases": [
    "<author name>"
  ]
}
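
Because dataset items carry a repository field alongside each email, a run over several repositories can be rolled up into per-email totals after export. The sketch below assumes the items have already been loaded as a list of dictionaries shaped like the example above; the function name top_contributors is hypothetical.

```python
from collections import Counter


def top_contributors(items, limit=10):
    """Sum commitCount per email across all repositories in a run.

    `items` is assumed to be a list of dataset rows, for example loaded
    from a JSON export of the run's dataset.
    """
    totals = Counter()
    for item in items:
        totals[item["email"]] += item["commitCount"]
    # Returns (email, total) pairs, highest total first.
    return totals.most_common(limit)
```

This kind of roll-up is only meaningful when normalizeEmails was enabled for the run, since otherwise the same address can appear under different capitalizations.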

OUTPUT (key-value store) summary

{
  "startedAt": "2026-02-12T00:00:00.000Z",
  "finishedAt": "2026-02-12T00:00:15.000Z",
  "settings": {
    "excludeNoReply": true,
    "normalizeEmails": true,
    "includeAuthorAliases": true,
    "branch": null
  },
  "repositorySummaries": [
    {
      "repositoryInput": "https://github.com/apify/crawlee",
      "repositoryName": "apify/crawlee",
      "resolvedRepository": "https://github.com/apify/crawlee.git",
      "totalCommitsInHead": 9339,
      "totalCommitRowsRead": 9339,
      "totalCommitsAfterFilters": 9339,
      "uniqueEmails": 450,
      "durationMs": 1450,
      "success": true
    }
  ]
}
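
The OUTPUT record lends itself to a quick post-run check. The following sketch derives the run duration, lists failed repositories, and computes how many commits the filters removed per repository. It assumes the record has been loaded as a dictionary with the fields shown above; the function names are hypothetical, and reading totalCommitRowsRead minus totalCommitsAfterFilters as "commits removed by filters" is an interpretation of the field names, not documented behavior.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat accepts it on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_output(output: dict) -> dict:
    """Derive a few health indicators from an OUTPUT record."""
    started = parse_timestamp(output["startedAt"])
    finished = parse_timestamp(output["finishedAt"])
    repos = output["repositorySummaries"]
    return {
        "runSeconds": (finished - started).total_seconds(),
        "failed": [r["repositoryInput"] for r in repos if not r.get("success")],
        # Assumption: rows read minus rows kept equals commits filtered out.
        "filteredOut": {
            r["repositoryName"]:
                r["totalCommitRowsRead"] - r["totalCommitsAfterFilters"]
            for r in repos
            if r.get("success")
        },
    }
```

A non-empty failed list is a cue to inspect the run log for those repositories before relying on the dataset rows.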

Notes

  • If contributors use masked emails (for example GitHub noreply addresses), the actor cannot recover their real addresses.
  • Large repositories can produce large datasets.

Disclaimer

This actor extracts commit metadata that may include personal email addresses. Its purpose is to raise awareness about how data such as email addresses can be exposed in Git repositories. Use the output only for legitimate and compliant purposes, and always follow applicable privacy laws, platform terms, and anti-spam rules. You are responsible for how extracted data is stored, shared, and used.