Pricing

Pay per usage

GitHub Org AI Tool Fingerprinter

Fingerprint which AI dev tools a GitHub organization is using. Scans public repos for CLAUDE.md, AGENTS.md, .cursorrules, Copilot instructions, Continue, Aider, Windsurf and reports adoption rate + per-repo signals.

Developer

Yanlong Mu

Last modified

2 months ago

What does GitHub Org AI Tool Fingerprinter do?

GitHub Org AI Tool Fingerprinter scans a GitHub organization's public repositories and reports which AI dev tools the team has adopted — including Claude Code (CLAUDE.md), Cursor (.cursorrules, .cursor/), GitHub Copilot custom instructions (.github/copilot-instructions.md), Aider, Continue, Windsurf, and the emerging AGENTS.md spec. For each hit you get the file path, file size, last-modified date, repo stars, and primary language, plus a final summary row with adoption percentage across the whole org. Run it ad-hoc, schedule it via the Apify platform, integrate via the API, or pipe it into Zapier / Make / your CRM.

Built by Ian Mu as Actor #9 in his 100-actor portfolio. Code style and verification patterns follow the claude-verify-before-stop playbook — short scripts, immutable data, explicit error handling, and silent 404s.

Why use GitHub Org AI Tool Fingerprinter?

Sales intelligence: find orgs that already use Claude Code / Cursor and target them with relevant outreach.
Investor research: gauge AI-tool adoption signal across portfolio companies or a competitor's eng team.
Recruiting: surface AI-forward engineering orgs to source from.
Internal audit: scan your own org to see which repos still need a CLAUDE.md or AGENTS.md.
Ecosystem analysis: track adoption of new dev-tool standards over time by re-running on a schedule.

How to use GitHub Org AI Tool Fingerprinter

Open the Actor in Apify Console and click Run.
Enter a GitHub org or user slug in the Input tab (e.g. vercel, shopify, anthropics).
Optionally tune maxReposToScan (default 30) and signalsToCheck (default: all 7 signals).
Hit Start and wait — typical run is 10–60 seconds for 30 repos.
Open the Dataset tab to see signal-hits and the final _summary row with adoption percentage.
Export to JSON / CSV / Excel, or call the dataset over the Apify API.
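
If you read the dataset over the Apify API instead of exporting it by hand, the call could look roughly like the sketch below. It only builds the URL of the Apify dataset items endpoint and splits the returned rows; the helper names and the `abc123` dataset ID are hypothetical, and you would still send the request with your own Apify API token.

```python
from urllib.parse import urlencode

APIFY_API = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="json"):
    """Build the URL of the dataset items endpoint for a run's dataset."""
    query = urlencode({"format": fmt})
    return f"{APIFY_API}/datasets/{dataset_id}/items?{query}"


def split_rows(items):
    """Separate per-hit rows from the final _summary row (None if absent)."""
    hits = [row for row in items if not row.get("_summary")]
    summary = next((row for row in items if row.get("_summary")), None)
    return hits, summary


# Hypothetical dataset ID, for illustration only.
print(dataset_items_url("abc123", "csv"))
```

Splitting on the `_summary` flag keeps the adoption percentage out of any per-hit aggregation you run downstream.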

If you have a personal GitHub token, set it as the GITHUB_TOKEN env var on the Actor to lift the 60 req/hour anonymous rate limit to 5000/hour.
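
As a rough illustration of what the token changes, the sketch below builds GitHub request headers and adds an `Authorization` header only when `GITHUB_TOKEN` is set. This is an assumption about how such a client could be written, not the Actor's actual code; the function name and the User-Agent string are made up.

```python
import os


def github_headers(env=None):
    """Build headers for api.github.com; authenticate only if a token is set."""
    env = os.environ if env is None else env
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "ai-tool-fingerprinter-sketch",  # hypothetical client name
    }
    token = env.get("GITHUB_TOKEN", "").strip()
    if token:
        # Authenticated requests get the higher hourly limit.
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Without the variable the request stays anonymous, which is the 60 req/hour case described above.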

Input

The Actor accepts three input fields (see the Input tab for the full form):

{
  "githubOrg": "vercel",
  "maxReposToScan": 30,
  "signalsToCheck": [
    "claude-md",
    "agents-md",
    "cursor",
    "continue",
    "copilot-instructions",
    "aider",
    "windsurf"
  ]
}

githubOrg (string, required) — org or user slug to scan.
maxReposToScan (integer, default 30, max 100) — repo cap. The Actor lists repos sorted by most-recently-pushed and skips forks / archived / private.
signalsToCheck (array of strings, default all) — which AI tool fingerprints to look for.
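
The field rules above can be sketched as a small normalization step plus a repo filter. This is a minimal sketch under assumptions: the function names are hypothetical, the lower bound of 1 repo is assumed (only the maximum of 100 is documented), and the repo dictionaries use the `fork`, `archived`, `private`, and `pushed_at` fields that the GitHub REST API returns for repositories.

```python
ALL_SIGNALS = ("claude-md", "agents-md", "cursor", "continue",
               "copilot-instructions", "aider", "windsurf")


def normalize_input(raw):
    """Apply the documented defaults and limits to a raw input dict."""
    org = str(raw.get("githubOrg") or "").strip()
    if not org:
        raise ValueError("githubOrg is required")
    max_repos = int(raw.get("maxReposToScan") or 30)  # default 30
    max_repos = max(1, min(max_repos, 100))           # cap at 100; floor of 1 is assumed
    requested = raw.get("signalsToCheck") or ALL_SIGNALS
    unknown = [s for s in requested if s not in ALL_SIGNALS]
    if unknown:
        raise ValueError(f"unknown signals: {unknown}")
    # Keep the canonical order and drop duplicates.
    signals = tuple(s for s in ALL_SIGNALS if s in requested)
    return {"githubOrg": org, "maxReposToScan": max_repos, "signalsToCheck": signals}


def select_repos(repos, max_repos):
    """Skip forks, archived and private repos, then take the most recently pushed."""
    eligible = [r for r in repos
                if not (r.get("fork") or r.get("archived") or r.get("private"))]
    # ISO-8601 timestamps sort correctly as plain strings.
    eligible.sort(key=lambda r: r.get("pushed_at") or "", reverse=True)
    return eligible[:max_repos]
```

For example, `normalize_input({"githubOrg": "vercel", "maxReposToScan": 500})` would clamp the cap to 100 and fall back to all seven signals.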

Output

The Actor pushes one row to the dataset per signal-hit, plus a final _summary row. Sample output:

[
  {
    "org": "vercel",
    "repo": "next.js",
    "repoUrl": "https://github.com/vercel/next.js",
    "signal": "claude-md",
    "filePath": "CLAUDE.md",
    "fileSize": 4231,
    "lastModified": "2026-04-12T14:23:00Z",
    "repoStars": 128000,
    "repoLanguage": "JavaScript"
  },
  {
    "_summary": true,
    "org": "vercel",
    "reposScanned": 30,
    "signalsFound": {
      "claude-md": 3,
      "agents-md": 1,
      "cursor": 8,
      "continue": 0,
      "copilot-instructions": 2,
      "aider": 0,
      "windsurf": 0
    },
    "totalReposWithAnyAiSignal": 12,
    "adoptionPct": 40
  }
]

You can download the dataset in various formats such as JSON, HTML, CSV, or Excel.

Data table

| Field | Type | Description |
| --- | --- | --- |
| `org` | string | The org you scanned |
| `repo` | string | Repo name (short, no owner prefix) |
| `repoUrl` | string | Full GitHub URL |
| `signal` | string | One of: claude-md, agents-md, cursor, continue, copilot-instructions, aider, windsurf |
| `filePath` | string | Path to the file/dir that matched |
| `fileSize` | number | File size in bytes (0 for directories) |
| `lastModified` | string | ISO-8601 date of the latest commit touching the path |
| `repoStars` | number | Stargazer count |
| `repoLanguage` | string | Primary language of the repo |
| `_summary` | boolean | True on the final summary row only |
| `reposScanned` | number | Number of repos scanned (only on summary row) |
| `signalsFound` | object | Per-signal count (only on summary row) |
| `totalReposWithAnyAiSignal` | number | Number of scanned repos with at least one AI signal (only on summary row) |
| `adoptionPct` | number | % of scanned repos with at least one AI signal |
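
To make the summary fields concrete, here is a minimal sketch of how a summary row could be derived from the hit rows. It assumes that `signalsFound` counts hit rows per signal and that `adoptionPct` is rounded to a whole number; the Actor's exact counting and rounding rules are not documented here, and `build_summary` is a hypothetical name.

```python
from collections import Counter


def build_summary(org, repos_scanned, hits, signals):
    """Derive a _summary row from per-hit rows (counting rules are assumed)."""
    counts = Counter(h["signal"] for h in hits)
    repos_with_signal = {h["repo"] for h in hits}  # distinct repos with any hit
    pct = round(100 * len(repos_with_signal) / repos_scanned) if repos_scanned else 0
    return {
        "_summary": True,
        "org": org,
        "reposScanned": repos_scanned,
        "signalsFound": {s: counts.get(s, 0) for s in signals},
        "totalReposWithAnyAiSignal": len(repos_with_signal),
        "adoptionPct": pct,
    }
```

With 12 distinct repos showing a signal out of 30 scanned, this gives an `adoptionPct` of 40, matching the sample output.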

Pricing / Cost estimation

How much does it cost to fingerprint a GitHub org? This is a lightweight HTTP-only Actor with no headless browser. A 30-repo scan typically costs a few hundredths of a compute unit and finishes inside a minute, so on a free Apify account you can run it comfortably under the monthly free tier. Large scans (100 repos × 7 signals, up to ~800 API calls) still cost pennies of compute; the bottleneck is GitHub's rate limit, not Apify compute. Set the GITHUB_TOKEN env var to lift that limit.
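
For a rough feel of the rate-limit budget, the arithmetic can be sketched as follows. The model is a simplification and an assumption: one repo-listing call, one contents probe per repo per signal, and one commit lookup per hit. The Actor may issue extra probes (for example several paths for one signal), so treat the result as a lower bound.

```python
def estimate_calls(repos, signals, expected_hits=0, listing_calls=1):
    """Lower-bound estimate of GitHub API calls for one run (simplified model)."""
    return listing_calls + repos * signals + expected_hits


def repos_within_budget(hourly_limit, signals, listing_calls=1):
    """How many repos can be fully probed before the hourly limit is reached."""
    return max(0, (hourly_limit - listing_calls) // signals)


print(estimate_calls(100, 7))        # 701
print(repos_within_budget(60, 7))    # 8
print(repos_within_budget(5000, 7))  # 714
```

Under this model an anonymous run covers about 8 repos with all 7 signals, while a token-backed run is no longer limited by the 100-repo cap.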

Tips and advanced options

Set GITHUB_TOKEN in the Actor's env vars to get 5000 req/hour instead of 60 (anonymous). Use a fine-grained token with read-only public-repo access.
Schedule it monthly via the Apify schedule feature on a list of competitor / portfolio orgs to track adoption trends.
Pipe to Slack / Notion / Sheets via Apify integrations to keep a live "who's using Claude Code" dashboard.
Narrow signals to one or two (["claude-md"]) to scan more repos within the rate-limit budget.
Pair with Apify MCP Server Catalog (Actor #1 in the portfolio) to also surface MCP server adoption inside the same org.

FAQ, disclaimers, and support

Is this legal? Yes — the Actor only calls the public GitHub REST API (api.github.com) at endpoints /orgs/{org}/repos, /repos/{owner}/{repo}/contents/{path}, and /repos/{owner}/{repo}/commits. It respects GitHub's anonymous rate limit (60 req/hr) by stopping early and saving partial results. No scraping of HTML pages, no auth bypass, no ToS issues.
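
The three endpoints named above can be sketched as URL builders, together with one possible way to classify responses. The `sort`, `per_page`, and `path` query parameters are standard GitHub REST API parameters, but whether the Actor uses them is an assumption, as are the function names and the status handling.

```python
from urllib.parse import quote

API = "https://api.github.com"


def repos_url(org, per_page=100):
    """List an org's repos, most recently pushed first."""
    return f"{API}/orgs/{quote(org, safe='')}/repos?sort=pushed&per_page={per_page}"


def contents_url(owner, repo, path):
    """Probe a single file or directory path in a repo."""
    return f"{API}/repos/{quote(owner, safe='')}/{quote(repo, safe='')}/contents/{quote(path)}"


def commits_url(owner, repo, path):
    """Fetch only the latest commit that touched the given path."""
    base = f"{API}/repos/{quote(owner, safe='')}/{quote(repo, safe='')}/commits"
    return f"{base}?path={quote(path, safe='')}&per_page=1"


def classify(status, remaining=None):
    """Map an HTTP status (and remaining quota, if known) to a probe outcome."""
    if status == 200:
        return "hit"
    if status == 404:
        return "miss"  # path absent: not an error, just no signal
    if status == 429 or (status == 403 and remaining == 0):
        return "rate_limited"  # stop early and keep partial results
    return "error"
```

Treating 404 as a plain miss keeps the log quiet for the common case where a repo simply lacks the file.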

What about private repos? Anonymous scans skip them (they don't appear in the response). If you set GITHUB_TOKEN on the Actor env, private repos your token can read will also be included.

Why is adoptionPct 0%? Either the org doesn't use these tools (yet), or you hit the rate limit before any signal was found. Check the rateLimitHit field on the _summary row and the Actor log.

Found a bug or want a new signal added (e.g. .zed/, roo-cline, cline)? Open an issue on the GitHub repo or contact Ian Mu directly. Custom variants of this Actor (e.g. scanning an entire user's starred-repo network) are available on request.

Limitations:

Anonymous rate limit (60/hr) restricts you to ~7–8 repos worth of probes per run unless you set GITHUB_TOKEN.
Detection is filename-based only — a repo with CLAUDE.md content that's just "TODO" still counts as a hit. Combine with file-size to filter.
The cursor signal matches .cursorrules OR .cursor/ OR .cursor/rules; you can refine post-hoc by filePath.
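
The post-hoc filtering suggested in the last two limitations could look like the sketch below. The 200-byte threshold, the helper names, and the variant labels are hypothetical; directory hits are kept because they are reported with a fileSize of 0.

```python
def substantive_hits(rows, min_bytes=200):
    """Drop the summary row and near-empty files; keep directory hits (size 0)."""
    kept = []
    for row in rows:
        if row.get("_summary"):
            continue
        size = row.get("fileSize", 0)
        if size == 0 or size >= min_bytes:  # size 0 is how directories are reported
            kept.append(row)
    return kept


def cursor_variant(row):
    """Label which Cursor path matched, based on filePath alone."""
    if row.get("signal") != "cursor":
        return None
    path = row.get("filePath", "").rstrip("/")
    if path == ".cursorrules":
        return "rules-file"
    if path == ".cursor/rules":
        return "rules-dir"
    if path == ".cursor":
        return "cursor-dir"
    return "other"
```

Note that an empty file would also pass the size check, so pick the threshold and the zero-size rule to suit your own data.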

MIT license. See github.com/ianymu/claude-verify-before-stop for the broader Ian Mu Actor portfolio playbook.
