# GitHub Org AI Tool Fingerprinter (`ianymu/gh-org-ai-fingerprinter`) Actor

Fingerprint which AI dev tools a GitHub organization is using. Scans public repos for CLAUDE.md, AGENTS.md, .cursorrules, Copilot instructions, Continue, Aider, Windsurf and reports adoption rate + per-repo signals.

- **URL**: https://apify.com/ianymu/gh-org-ai-fingerprinter.md
- **Developed by:** [Yanlong Mu](https://apify.com/ianymu) (community)
- **Categories:** AI, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is free to use; you pay only for Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### What does GitHub Org AI Tool Fingerprinter do?

**GitHub Org AI Tool Fingerprinter** scans a GitHub organization's public repositories and reports which **AI dev tools** the team has adopted — including Claude Code (`CLAUDE.md`), Cursor (`.cursorrules`, `.cursor/`), GitHub Copilot custom instructions (`.github/copilot-instructions.md`), Aider, Continue, Windsurf, and the emerging `AGENTS.md` spec. For each hit you get the file path, file size, last-modified date, repo stars, and primary language, plus a final summary row with **adoption percentage** across the whole org. Run it ad-hoc, schedule it via the Apify platform, integrate via the API, or pipe it into Zapier / Make / your CRM.

Built by **Ian Mu** as Actor #9 in his [100-actor portfolio](https://github.com/ianymu). Code style and verification patterns follow the [claude-verify-before-stop](https://github.com/ianymu/claude-verify-before-stop) playbook — short scripts, immutable data, explicit error handling, and silent 404s.

### Why use GitHub Org AI Tool Fingerprinter?

- **Sales intelligence**: find orgs that already use Claude Code / Cursor and target them with relevant outreach.
- **Investor research**: gauge AI-tool adoption signal across portfolio companies or a competitor's eng team.
- **Recruiting**: surface AI-forward engineering orgs to source from.
- **Internal audit**: scan your own org to see which repos still need a `CLAUDE.md` or `AGENTS.md`.
- **Ecosystem analysis**: track adoption of new dev-tool standards over time by re-running on a schedule.

### How to use GitHub Org AI Tool Fingerprinter

1. Open the Actor in Apify Console and click **Run**.
2. Enter a GitHub org or user slug in the **Input** tab (e.g. `vercel`, `shopify`, `anthropics`).
3. Optionally tune **maxReposToScan** (default 30) and **signalsToCheck** (default: all 7 signals).
4. Hit **Start** and wait; a typical run takes 10–60 seconds for 30 repos.
5. Open the **Dataset** tab to see signal-hits and the final `_summary` row with adoption percentage.
6. Export to JSON / CSV / Excel, or call the dataset over the Apify API.

If you have a personal GitHub token, set it as the `GITHUB_TOKEN` env var on the Actor to raise the rate limit from 60 requests/hour (anonymous) to 5000 requests/hour.
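To make the effect of the token concrete, here is a minimal sketch of how request headers for `api.github.com` might be built with and without it. The helper name `github_headers` and the `User-Agent` string are hypothetical and not taken from the Actor's source; only the `GITHUB_TOKEN` variable name comes from this README.

```python
import os


def github_headers(token=None):
    """Build request headers for api.github.com (hypothetical helper)."""
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "gh-org-ai-fingerprinter-example",
    }
    # Without a token the request is anonymous and falls under the lower limit.
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


if __name__ == "__main__":
    # Read the token from the same env var name the Actor uses.
    headers = github_headers(os.environ.get("GITHUB_TOKEN"))
    print("authenticated" if "Authorization" in headers else "anonymous")
```

The script prints only whether a token was picked up, so the token itself never lands in logs.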

### Input

The Actor accepts three input fields (see the **Input** tab for the full form):

```json
{
  "githubOrg": "vercel",
  "maxReposToScan": 30,
  "signalsToCheck": [
    "claude-md",
    "agents-md",
    "cursor",
    "continue",
    "copilot-instructions",
    "aider",
    "windsurf"
  ]
}
````

- `githubOrg` (string, required) — org or user slug to scan.
- `maxReposToScan` (integer, default 30, max 100) — repo cap. The Actor lists repos sorted by most-recently-pushed and skips forks / archived / private.
- `signalsToCheck` (array of strings, default all) — which AI tool fingerprints to look for.
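If you build the input programmatically, it can help to validate it before starting a run. The sketch below applies the constraints documented here (required `githubOrg`, `maxReposToScan` between 1 and 100 with a default of 30, and the seven known signal names). The function name `normalize_input` is hypothetical, and rejecting unknown signal names is an assumption: the Actor itself may handle them differently.

```python
VALID_SIGNALS = (
    "claude-md", "agents-md", "cursor", "continue",
    "copilot-instructions", "aider", "windsurf",
)


def normalize_input(raw):
    """Validate an input dict and fill in the documented defaults."""
    org = raw.get("githubOrg")
    if not isinstance(org, str) or not org.strip():
        raise ValueError("githubOrg is required")

    max_repos = raw.get("maxReposToScan", 30)
    is_int = isinstance(max_repos, int) and not isinstance(max_repos, bool)
    if not is_int or not 1 <= max_repos <= 100:
        raise ValueError("maxReposToScan must be an integer from 1 to 100")

    signals = list(raw.get("signalsToCheck", VALID_SIGNALS))
    unknown = [s for s in signals if s not in VALID_SIGNALS]
    if unknown:
        # Assumption: treat unknown names as an error rather than ignore them.
        raise ValueError(f"unknown signals: {unknown}")

    return {
        "githubOrg": org.strip(),
        "maxReposToScan": max_repos,
        "signalsToCheck": signals,
    }


if __name__ == "__main__":
    print(normalize_input({"githubOrg": "vercel"}))
```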

### Output

The Actor pushes one row to the dataset per **signal-hit**, plus a final `_summary` row. Sample output:

```json
[
  {
    "org": "vercel",
    "repo": "next.js",
    "repoUrl": "https://github.com/vercel/next.js",
    "signal": "claude-md",
    "filePath": "CLAUDE.md",
    "fileSize": 4231,
    "lastModified": "2026-04-12T14:23:00Z",
    "repoStars": 128000,
    "repoLanguage": "JavaScript"
  },
  {
    "_summary": true,
    "org": "vercel",
    "reposScanned": 30,
    "signalsFound": {
      "claude-md": 3,
      "agents-md": 1,
      "cursor": 8,
      "continue": 0,
      "copilot-instructions": 2,
      "aider": 0,
      "windsurf": 0
    },
    "totalReposWithAnyAiSignal": 12,
    "adoptionPct": 40
  }
]
```

You can download the dataset in various formats such as **JSON, HTML, CSV, or Excel**.
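Because hit rows and the `_summary` row share one dataset, a consumer usually needs to separate them first. The following is a minimal sketch of that step, assuming the items have already been fetched as a list of dicts shaped like the sample output; `split_items` and `signals_by_repo` are hypothetical helper names.

```python
from collections import defaultdict


def split_items(items):
    """Separate signal-hit rows from the final _summary row."""
    hits = [item for item in items if not item.get("_summary")]
    summary = next((item for item in items if item.get("_summary")), None)
    return hits, summary


def signals_by_repo(hits):
    """Map each repo name to the sorted list of signals found in it."""
    grouped = defaultdict(set)
    for hit in hits:
        grouped[hit["repo"]].add(hit["signal"])
    return {repo: sorted(signals) for repo, signals in grouped.items()}


if __name__ == "__main__":
    sample = [
        {"org": "vercel", "repo": "next.js", "signal": "claude-md", "filePath": "CLAUDE.md"},
        {"_summary": True, "org": "vercel", "reposScanned": 30, "adoptionPct": 40},
    ]
    hits, summary = split_items(sample)
    print(signals_by_repo(hits))
    print(summary["adoptionPct"])
```

`summary` is `None` when no summary row is present, for example if the run was aborted before it could be written.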

### Data table

| Field | Type | Description |
|---|---|---|
| `org` | string | The org you scanned |
| `repo` | string | Repo name (short, no owner prefix) |
| `repoUrl` | string | Full GitHub URL |
| `signal` | string | One of: claude-md, agents-md, cursor, continue, copilot-instructions, aider, windsurf |
| `filePath` | string | Path to the file/dir that matched |
| `fileSize` | number | File size in bytes (0 for directories) |
| `lastModified` | string | ISO-8601 date of the latest commit touching the path |
| `repoStars` | number | Stargazer count |
| `repoLanguage` | string | Primary language of the repo |
| `_summary` | boolean | True on the final summary row only |
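| `reposScanned` | number | Number of repos scanned (only on summary row) |
| `signalsFound` | object | Per-signal count (only on summary row) |
| `totalReposWithAnyAiSignal` | number | Number of scanned repos with at least one AI signal (only on summary row) |
| `adoptionPct` | number | % of scanned repos with at least one AI signal (only on summary row) |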
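As a sanity check, `adoptionPct` can be recomputed from the other summary fields. The sketch below uses the values from the sample summary row; rounding to the nearest integer is an assumption, since the README does not say how the Actor rounds.

```python
def adoption_pct(repos_with_signal, repos_scanned):
    """Percentage of scanned repos with at least one AI signal."""
    if repos_scanned <= 0:
        return 0
    # Assumption: round to the nearest whole percent.
    return round(100 * repos_with_signal / repos_scanned)


# Values copied from the sample summary row in the Output section.
summary = {"reposScanned": 30, "totalReposWithAnyAiSignal": 12, "adoptionPct": 40}
computed = adoption_pct(summary["totalReposWithAnyAiSignal"], summary["reposScanned"])
print(computed, computed == summary["adoptionPct"])
```

For the sample row this gives 12 / 30 = 40%, matching the reported value.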

### Pricing / Cost estimation

This is a lightweight HTTP-only Actor with no headless browser. A 30-repo scan typically costs a few hundredths of a compute unit and finishes inside a minute. **How much does it cost to fingerprint a GitHub org?** On a free Apify account you can run this comfortably under the monthly free tier. Large scans (100 repos × 7 signals = 700 path probes, up to ~800 API calls once repo listing and commit lookups for hits are counted) still cost pennies of compute; the bottleneck is GitHub's rate limit, not Apify compute. Set the `GITHUB_TOKEN` env var to raise that limit.
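The rate-limit arithmetic can be sketched as follows. This is a rough lower-bound model, not the Actor's actual accounting: it assumes one repo-listing request plus one contents probe per repo per signal, and ignores extra commit lookups for hits and signals that check more than one path.

```python
def estimate_probe_requests(repos, signals, listing_requests=1):
    """Lower-bound request count: listing calls plus one probe per repo/signal."""
    return listing_requests + repos * signals


def repos_within_budget(rate_limit, signals, listing_requests=1):
    """How many repos can be fully probed before the hourly limit is used up."""
    if signals <= 0:
        raise ValueError("signals must be positive")
    return max(0, (rate_limit - listing_requests) // signals)


print(estimate_probe_requests(100, 7))  # 701
print(repos_within_budget(60, 7))       # 8
print(repos_within_budget(60, 1))       # 59
print(repos_within_budget(5000, 7))     # 714
```

Under these assumptions an anonymous run covers about 8 repos with all seven signals, while narrowing to a single signal stretches the same 60-request budget to 59 repos.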

### Tips and advanced options

- **Set `GITHUB_TOKEN`** in the Actor's env vars to get 5000 req/hour instead of 60 (anonymous). Use a fine-grained token with read-only public-repo access.
- **Schedule it monthly** via the Apify schedule feature on a list of competitor / portfolio orgs to track adoption trends.
- **Pipe to Slack / Notion / Sheets** via Apify integrations to keep a live "who's using Claude Code" dashboard.
- **Narrow signals** to one or two (`["claude-md"]`) to scan more repos within the rate-limit budget.
- **Pair with [Apify MCP Server Catalog](https://github.com/ianymu)** (Actor #1 in the portfolio) to also surface MCP server adoption inside the same org.

### FAQ, disclaimers, and support

**Is this legal?** Yes — the Actor only calls the public GitHub REST API (`api.github.com`) at endpoints `/orgs/{org}/repos`, `/repos/{owner}/{repo}/contents/{path}`, and `/repos/{owner}/{repo}/commits`. It respects GitHub's anonymous rate limit (60 req/hr) by stopping early and saving partial results. No scraping of HTML pages, no auth bypass, no ToS issues.
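For readers who want to see what such a probe looks like, here is a minimal sketch of a single existence check against the contents endpoint, using only the Python standard library. It is an illustration of the approach, not the Actor's code: `contents_url` and `path_exists` are hypothetical names, and treating a 404 as "no signal" while letting other errors propagate is an assumption based on the "silent 404s" note above.

```python
import urllib.error
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def contents_url(owner, repo, path):
    """URL of the contents endpoint for one path in one repo."""
    quoted = urllib.parse.quote(path.strip("/"))  # keeps "/" between segments
    return f"{API_ROOT}/repos/{owner}/{repo}/contents/{quoted}"


def path_exists(owner, repo, path, token=None):
    """Return True if the path exists, False on 404; other errors propagate."""
    request = urllib.request.Request(
        contents_url(owner, repo, path),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "fingerprint-probe-example",
        },
    )
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False  # a missing file simply means "no signal"
        raise  # e.g. 403 once the rate limit is exhausted
```

Calling `path_exists("vercel", "next.js", "CLAUDE.md")` performs one real API request, so it counts against the same rate limit described above.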

**What about private repos?** Anonymous scans skip them (they don't appear in the response). If you set `GITHUB_TOKEN` on the Actor env, private repos your token can read will also be included.

**Why is `adoptionPct` 0%?** Either the org doesn't use these tools (yet), or you hit the rate limit before any signal was found — check the `_summary.rateLimitHit` field and the Actor log.

**Found a bug or want a new signal added (e.g. `.zed/`, `roo-cline`, `cline`)?** Open an issue on the [GitHub repo](https://github.com/ianymu/claude-verify-before-stop) or contact Ian Mu directly. Custom variants of this Actor (e.g. scanning an entire user's starred-repo network) are available on request.

**Limitations:**

- Anonymous rate limit (60/hr) restricts you to ~7–8 repos worth of probes per run unless you set `GITHUB_TOKEN`.
- Detection is filename-based only — a repo with `CLAUDE.md` content that's just "TODO" still counts as a hit. Combine with file-size to filter.
- The `cursor` signal matches `.cursorrules` OR `.cursor/` OR `.cursor/rules`; you can refine post-hoc by `filePath`.
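Both post-hoc refinements mentioned in the list above can be done on the exported rows. The sketch below is one way to do it; the 200-byte threshold and the variant labels are arbitrary illustrations, and the helper names are hypothetical.

```python
def substantive_hits(hits, min_bytes=200):
    """Keep hits whose file is at least min_bytes long.

    Directory hits report fileSize 0, so this filter drops them as well.
    """
    return [hit for hit in hits if hit.get("fileSize", 0) >= min_bytes]


def cursor_variant(hit):
    """Classify a `cursor` hit by the path that matched."""
    if hit.get("signal") != "cursor":
        raise ValueError("not a cursor hit")
    path = hit["filePath"].rstrip("/")
    if path == ".cursorrules":
        return "cursorrules-file"
    if path == ".cursor/rules":
        return "cursor-rules-dir"
    return "cursor-dir"


if __name__ == "__main__":
    hits = [
        {"signal": "claude-md", "filePath": "CLAUDE.md", "fileSize": 4231},
        {"signal": "claude-md", "filePath": "CLAUDE.md", "fileSize": 5},
    ]
    print(len(substantive_hits(hits)))
    print(cursor_variant({"signal": "cursor", "filePath": ".cursor/rules"}))
```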

MIT license. See [github.com/ianymu/claude-verify-before-stop](https://github.com/ianymu/claude-verify-before-stop) for the broader Ian Mu Actor portfolio playbook.

# Actor input Schema

## `githubOrg` (type: `string`):

Org/user slug to scan, e.g. 'vercel', 'shopify', 'anthropics'.

## `maxReposToScan` (type: `integer`):

Cap on number of public, non-fork, non-archived repos to probe. Anonymous GitHub API allows ~60 requests/hour, so keep this low unless you set `GITHUB_TOKEN`.

## `signalsToCheck` (type: `array`):

Which AI dev-tool fingerprints to probe in each repo. Valid values: claude-md, agents-md, cursor, continue, copilot-instructions, aider, windsurf.

## Actor input object example

```json
{
  "githubOrg": "vercel",
  "maxReposToScan": 30,
  "signalsToCheck": [
    "claude-md",
    "agents-md",
    "cursor",
    "continue",
    "copilot-instructions",
    "aider",
    "windsurf"
  ]
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "githubOrg": "vercel"
};

// Run the Actor and wait for it to finish
const run = await client.actor("ianymu/gh-org-ai-fingerprinter").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "githubOrg": "vercel" }

# Run the Actor and wait for it to finish
run = client.actor("ianymu/gh-org-ai-fingerprinter").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "githubOrg": "vercel"
}' |
apify call ianymu/gh-org-ai-fingerprinter --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=ianymu/gh-org-ai-fingerprinter",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "GitHub Org AI Tool Fingerprinter",
        "description": "Fingerprint which AI dev tools a GitHub organization is using. Scans public repos for CLAUDE.md, AGENTS.md, .cursorrules, Copilot instructions, Continue, Aider, Windsurf and reports adoption rate + per-repo signals.",
        "version": "0.1",
        "x-build-id": "nwFh8HsDvic3jZIfs"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/ianymu~gh-org-ai-fingerprinter/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-ianymu-gh-org-ai-fingerprinter",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/ianymu~gh-org-ai-fingerprinter/runs": {
            "post": {
                "operationId": "runs-sync-ianymu-gh-org-ai-fingerprinter",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/ianymu~gh-org-ai-fingerprinter/run-sync": {
            "post": {
                "operationId": "run-sync-ianymu-gh-org-ai-fingerprinter",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "githubOrg"
                ],
                "properties": {
                    "githubOrg": {
                        "title": "GitHub organization (or user) name",
                        "type": "string",
                        "description": "Org/user slug to scan, e.g. 'vercel', 'shopify', 'anthropics'."
                    },
                    "maxReposToScan": {
                        "title": "Max repos to scan",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Cap on number of public, non-fork, non-archived repos to probe. Anonymous GitHub API allows ~60 requests/hour, so keep this low unless you set GITHUB_TOKEN.",
                        "default": 30
                    },
                    "signalsToCheck": {
                        "title": "AI tool signals to look for",
                        "type": "array",
                        "description": "Which AI dev-tool fingerprints to probe in each repo. Valid values: claude-md, agents-md, cursor, continue, copilot-instructions, aider, windsurf.",
                        "default": [
                            "claude-md",
                            "agents-md",
                            "cursor",
                            "continue",
                            "copilot-instructions",
                            "aider",
                            "windsurf"
                        ],
                        "items": {
                            "type": "string"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
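For projects that cannot use a client library, the `run-sync-get-dataset-items` path from the specification can be called directly over HTTP. The sketch below uses only the Python standard library; the URL, the `token` query parameter, and the JSON body come from the specification, while the helper names are hypothetical and the assumption that the response body is a JSON array of dataset items rests on the endpoint summary rather than on a response schema.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = (
    "https://api.apify.com/v2/acts/ianymu~gh-org-ai-fingerprinter"
    "/run-sync-get-dataset-items"
)


def build_request(token, actor_input):
    """Prepare (but do not send) the POST request described in the spec."""
    url = f"{ENDPOINT}?{urllib.parse.urlencode({'token': token})}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(token, actor_input, timeout=300):
    """Send the request and decode the returned dataset items."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the body is JSON, as the endpoint summary suggests.
        return json.loads(response.read().decode("utf-8"))
```

A call such as `run_and_fetch("<YOUR_API_TOKEN>", {"githubOrg": "vercel"})` starts a real run and blocks until it finishes, so it consumes platform usage like any other run.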
