# Betterleaks Cloud - GitHub & S3 Secret Scanner (`anshumanatrey/betterleaks-cloud`) Actor

Cloud-hosted Betterleaks v1.3.1 (the successor to Gitleaks, by the original author). Scan GitHub orgs/users/repos/PRs/issues/actions/gists, single git URLs, or S3/R2 buckets. Optional live validation probes each secret against the vendor API to flag which ones are still active.

- **URL**: https://apify.com/anshumanatrey/betterleaks-cloud.md
- **Developed by:** [Anshuman Atrey](https://apify.com/anshumanatrey) (community)
- **Categories:** Developer tools, Other
- **Stats:** 2 total users, 1 monthly user, 0.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Betterleaks Cloud - GitHub & S3 Secret Scanner

Cloud-hosted [Betterleaks](https://github.com/betterleaks/betterleaks) v1.3.1 for hunting leaked API keys, wallet private keys, and credentials anywhere on GitHub, in cloned git repos, or in S3 / R2 / MinIO buckets. Optional live validation probes each detected secret against the vendor API to flag which ones are still active.

Available as an [Apify Actor](https://apify.com/anshumanatrey/betterleaks-cloud). $0.05 start + $0.05 per repo scanned + $0.01 per finding + $0.30 per live-validated key. No install, no CLI, flat JSON output.

---

### What does it do?

Scans a GitHub org, user, repo, pull request, issue, gist, Actions workflow, or S3 bucket for leaked credentials. Built on Betterleaks (the Gitleaks successor, maintained by the original Gitleaks author Zachary Rice and co-maintainers from Red Hat, Amazon, and RBC). Every Betterleaks CLI flag is exposed as an input field. When you enable validation, each finding whose rule ships a validator is probed live against the vendor API and tagged `valid`, `invalid`, `revoked`, `unknown`, or `error`.

### How is it different from running Betterleaks locally?

| | Local Betterleaks CLI | This actor |
|---|---|---|
| Setup | Install Go binary, set up GitHub token, configure shell | Paste a JSON input |
| Scope | One target at a time | github / git / s3 modes + our `global_github` mode (Code Search picks repos) |
| Validation cost | Your machine's HTTP egress | Apify-managed, parallel workers |
| Output handling | stdout JSON, parse yourself | Dataset records, filterable views |
| Schedule | cron + DIY | Apify Schedules built-in |
| Cost | Your laptop's electricity | $0.05 actor start + $0.05 per repo scanned |

### How is it different from our gitleaks-github-secret-scanner actor?

| | gitleaks-cloud | betterleaks-cloud (this) |
|---|---|---|
| Engine | Gitleaks 8.x | Betterleaks 1.3.1 (Gitleaks successor) |
| Live validation | not supported | yes (opt-in) |
| Source types | github + git only | github, git, s3, dir, stdin, global_github |
| Pricing | $0.01 + $0.02 per repo | $0.05 + $0.05 per repo + $0.30 per live-validated key |
| Best for | Volume scanning, low cost per scan | Premium scans where "is the secret still live?" matters |

### When should I use it?

- Auditing your own GitHub org for secrets that slipped into commits
- Bug bounty hunting where you want to validate keys live before reporting
- Pre-acquisition security check on a target company's GitHub footprint
- Continuous monitoring on a schedule, alerting on any new live-validated finding
- Forensics: tracing which dangling commit on which PR ref leaked a credential
- Scanning a Cloudflare R2 / AWS S3 bucket someone shared publicly

### What does it cost?

Pay-per-event:
- $0.05 per actor start
- $0.05 per repo scanned (in github / git / global_github modes)
- $0.01 per finding pushed to the dataset
- $0.30 per **live-validated** secret (only when validation is on AND status returns `valid`)
- $0.005 per Code Search query (only in `global_github` mode)

Typical scans:

| Scenario | Cost |
|---|---|
| Scan 1 small repo, 1 finding, no validation | $0.11 |
| Scan 1 big repo, 30 findings, no validation | $0.40 |
| Scan an org with 25 repos, 50 findings, no validation | $1.80 |
| 25-repo global search for `rzp_live_` | $1.93 |
| 100-repo scan, 200 findings, 10 live-validated | $10.05 |
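The per-event prices add up linearly, so the cost of a run can be estimated before launching it. The sketch below is illustrative only: `estimate_cost` and its parameter names are hypothetical helpers, not part of the Actor, and the prices are copied from the pay-per-event list in this section.

```python
from decimal import Decimal

# Per-event prices in USD, copied from the pricing list in this section.
PRICE_START = Decimal("0.05")
PRICE_REPO = Decimal("0.05")
PRICE_FINDING = Decimal("0.01")
PRICE_LIVE_VALIDATED = Decimal("0.30")
PRICE_CODE_SEARCH_QUERY = Decimal("0.005")


def estimate_cost(repos=0, findings=0, live_validated=0, code_search_queries=0):
    """Return the estimated cost of one run in USD (hypothetical helper)."""
    return (
        PRICE_START
        + PRICE_REPO * repos
        + PRICE_FINDING * findings
        + PRICE_LIVE_VALIDATED * live_validated
        + PRICE_CODE_SEARCH_QUERY * code_search_queries
    )


# Reproduce three rows of the "Typical scans" table.
print(f"{estimate_cost(repos=1, findings=1):.2f}")                         # 0.11
print(f"{estimate_cost(repos=25, findings=50):.2f}")                       # 1.80
print(f"{estimate_cost(repos=100, findings=200, live_validated=10):.2f}")  # 10.05
```

`Decimal` is used instead of `float` so that sub-cent prices such as the $0.005 Code Search query sum exactly.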

Compare to GitGuardian Enterprise at ~$5000/year flat: a single comprehensive scan with this actor costs roughly 300-500x less than that annual fee.

### Which modes can I scan in?

| Mode | What it scans | Required input |
|---|---|---|
| `github` | A specific GitHub org / user / repo / PR / issue / gist (you choose) | `target_url`, `github_token` recommended |
| `global_github` | All of GitHub by keyword - we add this layer via Code Search | `global_search_query`, `github_token` required |
| `git` | A single git clone URL with full history | `target_url` |
| `s3` | AWS S3, Cloudflare R2, MinIO, or any S3-compatible bucket | `target_url`, S3 creds (or `s3_anonymous` for public) |
| `dir` | A tarball / zip URL - we download, extract, scan filesystem | `dir_source_url` |
| `stdin` | Raw text content you paste in | `stdin_content` |
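The required inputs in the table above can be checked on the client side before starting a paid run. This is a minimal sketch, not the Actor's own validation: `missing_inputs` is a hypothetical helper, and treating "S3 creds" as `s3_access_key` plus `s3_secret_key` is an assumption based on the input schema.

```python
# Required inputs per mode, transcribed from the modes table.
REQUIRED_BY_MODE = {
    "github": ["target_url"],
    "global_github": ["global_search_query", "github_token"],
    "git": ["target_url"],
    "s3": ["target_url"],
    "dir": ["dir_source_url"],
    "stdin": ["stdin_content"],
}


def missing_inputs(run_input):
    """Return the names of required fields that are absent or empty."""
    mode = run_input.get("mode", "github")  # "github" is the schema default
    if mode not in REQUIRED_BY_MODE:
        raise ValueError(f"unknown mode: {mode!r}")
    missing = [name for name in REQUIRED_BY_MODE[mode] if not run_input.get(name)]
    # s3 mode needs either credentials or anonymous access (assumption:
    # "S3 creds" means access key + secret key).
    if mode == "s3" and not run_input.get("s3_anonymous"):
        for name in ("s3_access_key", "s3_secret_key"):
            if not run_input.get(name):
                missing.append(name)
    return missing


print(missing_inputs({"mode": "global_github", "global_search_query": "rzp_live_"}))
# ['github_token']
```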

### Which GitHub resources can it scan in `github` mode?

All 12 surfaces betterleaks supports, selectable via `include` / `exclude`:

- `repos` (default), `forks`
- `prs`, `pr-comments`
- `issues`, `issue-comments`
- `actions`, `action-artifacts`
- `discussions`
- `releases`, `release-assets`
- `gists`

You can also filter by `since` / `until` dates, exclude repos by glob pattern, or target specific GitHub Actions workflow files.
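As an illustration of how `include`, `exclude`, and the date filters combine, the following sketch builds a `github`-mode input and rejects surface names that are not in the list above. `build_github_input` is a hypothetical helper; only the field names and surface names come from this document.

```python
# The 12 surfaces listed above.
GITHUB_SURFACES = {
    "repos", "forks", "prs", "pr-comments", "issues", "issue-comments",
    "actions", "action-artifacts", "discussions", "releases",
    "release-assets", "gists",
}


def build_github_input(target_url, include=("repos",), exclude=(),
                       since=None, until=None, github_token=None):
    """Build an input object for `github` mode (hypothetical helper)."""
    unknown = (set(include) | set(exclude)) - GITHUB_SURFACES
    if unknown:
        raise ValueError(f"unknown surfaces: {sorted(unknown)}")
    run_input = {
        "mode": "github",
        "target_url": target_url,
        "include": list(include),
    }
    # Optional fields are only added when set, so Actor defaults apply otherwise.
    if exclude:
        run_input["exclude"] = list(exclude)
    if since:
        run_input["since"] = since
    if until:
        run_input["until"] = until
    if github_token:
        run_input["github_token"] = github_token
    return run_input


# Scan only PRs and issues of one repo, starting from a given date.
print(build_github_input(
    "https://github.com/owner/repo",
    include=["prs", "issues"],
    since="2025-01-01",
))
```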

### Does it find leaks that were deleted from current files?

Yes. Betterleaks scans the full git history. A secret that was committed and then deleted in a later commit still appears in the history and gets detected. The `Commit` field on each finding tells you which commit introduced the leak.

You can also enable advanced `--log-opts` pass-through to use git pickaxe (`-S secret_string`) for targeted historical searches.

### Is the secret validated against the live vendor API?

Optional - off by default to keep scans fast. Set `validation: true` and Betterleaks fires HTTP requests against the vendor API for each finding using the rule's built-in CEL validator.

Output fields:
- `ValidationStatus`: `valid` (still active), `invalid` (rejected), `revoked` (explicitly disabled), `unknown` (couldn't determine), or `error`
- `ValidationReason`: human-readable why
- `ValidationMeta`: vendor-specific metadata. For a GitHub PAT, this includes `username`, `name`, and granted `scopes`. For Stripe, the test/live mode + account info.

Note: validation only fires for rules that ship with a `validate` CEL clause in their TOML. Not every detected secret type has one (the upstream rule set is expanding over time).

You can also tune validation behavior:
- `validation_timeout` (per-request HTTP timeout)
- `validation_workers` (parallel HTTP workers)
- `validation_status` (filter output to only certain statuses)
- `validation_debug` (include raw HTTP response in `ValidationMeta`)
- `validation_extract_empty` (include validator outputs even when empty)
- `validation_env_vars` (expose extra env vars to CEL validators)
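A validated scan might be started from Python roughly as follows. This is a sketch under assumptions: `run_validated_scan` is a hypothetical wrapper, `client` is expected to be an `apify_client.ApifyClient` instance (the same client used in the API section), and the timeout and worker values simply repeat the defaults from the input schema.

```python
ACTOR_ID = "anshumanatrey/betterleaks-cloud"


def run_validated_scan(client, target_url, github_token,
                       statuses=("valid", "revoked")):
    """Run a github-mode scan with live validation and return the findings."""
    run_input = {
        "mode": "github",
        "target_url": target_url,
        "github_token": github_token,
        "validation": True,
        # The schema takes a comma-separated string, not a list.
        "validation_status": ",".join(statuses),
        "validation_timeout": 10,
        "validation_workers": 10,
    }
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Usage (requires `pip install apify-client` and real tokens):
#   from apify_client import ApifyClient
#   findings = run_validated_scan(
#       ApifyClient("<YOUR_API_TOKEN>"),
#       "https://github.com/owner/repo",
#       "<YOUR_GITHUB_TOKEN>",
#   )
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without network access.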

### Can I use a custom detection rule?

Yes - paste a full betterleaks TOML config into `custom_config_toml`. The config either replaces the built-in rules or adds to them, depending on how you write the TOML.

You can also use `enable_rule` to run only a whitelist of rule IDs (e.g. `["github-fine-grained-pat", "stripe-secret-key"]`) from the built-in set without writing your own config.

Reference format: https://github.com/betterleaks/betterleaks/blob/main/.betterleaks.toml

### What does the output look like?

Each dataset record is the raw `Finding` struct from betterleaks - no transformation, no field renaming:

```json
{
  "RuleID": "github-fine-grained-pat",
  "Description": "GitHub Fine-Grained Personal Access Token, risking unauthorized repo access.",
  "Match": "github_pat_11AABBCC...",
  "Secret": "github_pat_11AABBCC...",
  "StartLine": 12,
  "EndLine": 12,
  "StartColumn": 18,
  "EndColumn": 105,
  "MatchContext": "...optional surrounding lines if match_context is set...",
  "CaptureGroups": {},
  "Fragment": {
    "FilePath": "config/secrets.env",
    "Url": "https://github.com/owner/repo/blob/<sha>/config/secrets.env#L12"
  },
  "Attributes": {
    "path": "config/secrets.env",
    "resource": "git.patch_content",
    "url": "https://github.com/owner/repo/blob/<sha>/config/secrets.env#L12",
    "git.author_name": "alice",
    "git.author_email": "alice@example.com",
    "git.commit": "abcd1234",
    "git.date": "2025-06-12T10:23:14Z",
    "git.message": "Add config",
    "github.owner": "owner",
    "github.repo": "repo",
    "github.visibility": "public"
  },
  "Tags": [],
  "Fingerprint": "abcd1234:config/secrets.env:github-fine-grained-pat:12",
  "ValidationStatus": "valid",
  "ValidationReason": "",
  "ValidationMeta": {
    "username": "alice",
    "name": "Alice Doe",
    "scopes": "repo, workflow"
  }
}
````

The dataset has two pre-configured views:

- **Findings**: all records
- **Live secrets only**: filter to `ValidationStatus == valid`, the high-signal triage list
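The same triage can be done client-side on downloaded dataset items. The sketch below mirrors the "Live secrets only" view and then counts live findings per rule; `live_findings` and `count_by_rule` are hypothetical helper names, and the sample records are made up for illustration, using only field names from the output example above.

```python
from collections import Counter


def live_findings(items):
    """Keep only records whose ValidationStatus is 'valid'."""
    return [item for item in items if item.get("ValidationStatus") == "valid"]


def count_by_rule(items):
    """Count findings per RuleID."""
    return Counter(item.get("RuleID", "unknown") for item in items)


# Made-up sample records for illustration.
sample = [
    {"RuleID": "github-fine-grained-pat", "ValidationStatus": "valid"},
    {"RuleID": "stripe-secret-key", "ValidationStatus": "invalid"},
    {"RuleID": "github-fine-grained-pat", "ValidationStatus": "valid"},
]

print(count_by_rule(live_findings(sample)))
# Counter({'github-fine-grained-pat': 2})
```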

### How accurate is it?

Accuracy comes from the underlying Betterleaks 1.3.1 rule set (200+ detectors), plus upstream's BPE-tokenization-based false-positive filter and CEL-based contextual filters. Compute usage observed in real test runs:

- Single small repo scan: ~7 seconds, 0.0084 compute units
- Scan of `anshumanatrey/*` (~30 repos): ~134 seconds, 0.1491 compute units, 735 findings
- 3-repo `global_github` scan with query `rzp_live_`: ~8.5 seconds, 0.0095 compute units, 14 findings

### Common questions

**Q: Can I scan a private repo?** Yes - provide a `github_token` with `repo` scope.

**Q: Does it scan PR branches and dangling commits?** PR refs: yes, via `--include prs`. Dangling commits aren't currently exposed as an input, although the underlying tool supports them.

**Q: How does it handle rate limits?** Without a `github_token`, GitHub allows only ~60 requests per hour, a quota shared per IP address. With a token, you get 5000 requests per hour on your own account. Always provide a token for any real scan.

**Q: Can I run it on a schedule?** Yes via Apify Schedules. Recommended for continuous monitoring use cases.

**Q: Can I use my own betterleaks rule TOML?** Yes via `custom_config_toml`. Or whitelist specific built-in rules via `enable_rule`.

**Q: What's `global_github` mode?** Our addition on top of upstream betterleaks. You provide a search query, we use GitHub Code Search to find unique candidate repos, then run `betterleaks git` against each in parallel. Useful for hunting a specific secret pattern across all of GitHub.

**Q: Why not use the cheaper gitleaks-github-secret-scanner actor instead?** Use that one for high-volume cost-sensitive scans. Use this one when live validation matters (bug bounty triage, incident response, knowing which secrets to actually rotate).

**Q: What happens to detected secrets?** Written to the run's dataset, accessible only to the user who ran the actor. We never store them server-side.

**Q: Can I scan an S3 bucket I don't own?** Only if it's public (anonymous mode) or you have credentials.

**Q: What's the maximum scan size?** No hard cap on repos. `max_target_megabytes` (default 100) skips files larger than that.
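The repo-selection step of `global_github` mode described earlier in this Q&A (Code Search hits reduced to unique candidate repos, capped by `global_max_repos`) can be pictured with the sketch below. It is an illustration only: the `hits` structure with a `repo` key is a hypothetical stand-in for Code Search results, and the Actor's real implementation may differ.

```python
def unique_repos(hits, max_repos=25):
    """Return unique repo names in first-seen order, capped at max_repos."""
    seen = set()
    repos = []
    for hit in hits:
        if len(repos) >= max_repos:
            break
        repo = hit["repo"]  # hypothetical field name
        if repo in seen:
            continue  # several matching files in one repo count once
        seen.add(repo)
        repos.append(repo)
    return repos


# Made-up hits: two files match in the same repo.
hits = [
    {"repo": "alice/shop", "path": "config.py"},
    {"repo": "alice/shop", "path": "settings.py"},
    {"repo": "bob/api", "path": ".env"},
    {"repo": "carol/site", "path": "pay.js"},
]
print(unique_repos(hits, max_repos=2))
# ['alice/shop', 'bob/api']
```

Because each unique repo is scanned and billed separately, the cap bounds the per-repo part of the run's cost.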

### Limitations

- Per-CEL-rule validation: only rules with a `validate` clause actually probe live. Not every secret type has one yet.
- `global_github` mode requires a GitHub PAT (Code Search is auth-only).
- `dir` mode requires the source to be a downloadable tarball URL, not a local file.
- Apify run timeout caps individual scans at 30 minutes per subprocess. For org scans larger than 200 repos, split into multiple runs.
- The `stdin` mode is limited by Apify's input size cap (~5 MB).

### Ethical use

For authorized security testing only. This actor scans public GitHub content (the same content any logged-in GitHub user can see), or private repos that your PAT explicitly grants access to.

Findings should be reported through responsible disclosure channels:

- The leaking developer (commit author email)
- The vendor's `security@` address
- A bug-bounty program if one exists for the vendor

Using leaked credentials without owner authorization is illegal in most jurisdictions.

### Other actors in my portfolio

- [gitleaks-github-secret-scanner](https://github.com/AnshumanAtrey/gitleaks-github-secret-scanner) - the cheaper Gitleaks-based scanner without validation
- [holehe-email-osint](https://github.com/AnshumanAtrey/holehe-email-osint) - email to registered accounts (120+ sites)
- [phoneinfoga-phone-osint](https://github.com/AnshumanAtrey/phoneinfoga-phone-osint) - phone OSINT
- [theharvester-osint](https://github.com/AnshumanAtrey/theharvester-osint) - emails / subdomains / hosts via public sources
- [social-analyzer](https://github.com/AnshumanAtrey/social-analyzer) - username across 300+ social platforms
- [nmap-scanner](https://github.com/AnshumanAtrey/nmap-scanner) - Nmap port scanner
- [netintel](https://github.com/AnshumanAtrey/netintel) - DNS / WHOIS / IP geo / port scan / SSL / tech stack
- [instagram-profile-intel-no-login](https://github.com/AnshumanAtrey/instagram-profile-intel-no-login) - Instagram profile intel without login
- [bug-bounty-finder](https://github.com/AnshumanAtrey/bug-bounty-finder) - search HackerOne, Bugcrowd, Intigriti

### License

MIT. The upstream Betterleaks binary is also MIT, maintained by [betterleaks/betterleaks](https://github.com/betterleaks/betterleaks) (Zachary Rice and co-maintainers from Red Hat, Amazon, RBC).

# Actor input Schema

## `mode` (type: `string`):

Which betterleaks command to run. 'global\_github' is our addition - GitHub Code Search picks candidate repos then runs betterleaks on each.

## `target_url` (type: `string`):

github: https://github.com/owner | https://github.com/owner/repo | https://github.com/owner/repo/pull/123. git: any clone URL. s3: https://bucket.s3.amazonaws.com/prefix/ or https://account.r2.cloudflarestorage.com/...

## `global_search_query` (type: `string`):

Required for global\_github mode. Used as the GitHub Code Search query to discover candidate repos. Examples: 'CASHFREE\_APP\_ID', 'rzp\_live\_', '"sk-ant-api03-"'.

## `global_max_repos` (type: `integer`):

Cap on how many unique repos to scan. Each repo runs betterleaks separately, so cost scales linearly. Default 25.

## `github_token` (type: `string`):

Recommended for github mode and required for global\_github mode (Code Search is auth-only). Generate at github.com/settings/tokens. 'public\_repo' scope is enough for public repos.

## `include` (type: `array`):

Pick which GitHub surfaces to scan. Default = repos.

## `exclude` (type: `array`):

Subtracts from the include list. Lets you scan everything except certain resource types.

## `exclude_repo` (type: `array`):

github mode. Glob patterns of repos to exclude. Examples: `archived-*`, `docs-*`.

## `actions_workflow` (type: `array`):

Workflow filenames like 'ci.yml'. Empty = scan all workflows.

## `since` (type: `string`):

github mode. Start of the date window. YYYY-MM-DD or RFC3339.

## `until` (type: `string`):

github mode. Pair with 'since' for a date window.

## `git_workers` (type: `integer`):

git / github modes. 0 = single process (lowest memory). Higher = faster, more memory.

## `log_opts` (type: `string`):

git / github modes. Passed verbatim to 'git log'. Examples: '--all --since=2024-01-01', '--author=alice', '-S secret\_string'.

## `git_platform` (type: `string`):

git mode. Used to generate permalinks. 'github' or 'gitlab'.

## `validation` (type: `boolean`):

Probe each detected secret against its vendor API. Off by default. Only rules that ship with a 'validate' CEL clause will actually probe.

## `validation_status` (type: `string`):

Comma-separated list. Options: valid, invalid, revoked, unknown, error. Empty = include all.

## `validation_timeout` (type: `integer`):

Per-request timeout for vendor API probes. Default 10s.

## `validation_workers` (type: `integer`):

Parallel validation HTTP requests. Default 10.

## `validation_debug` (type: `boolean`):

Useful for debugging CEL validators. Verbose.

## `validation_extract_empty` (type: `boolean`):

Include validation extractor results even when they are empty strings.

## `validation_env_vars` (type: `string`):

Comma-separated env var names. Some rules' validators need extra credentials.

## `s3_access_key` (type: `string`):

s3 mode. Access key for AWS / S3-compatible authentication. Leave blank and enable `s3_anonymous` for public buckets.

## `s3_secret_key` (type: `string`):

s3 mode. Paired with access key for AWS / S3-compatible authentication.

## `s3_session_token` (type: `string`):

For IAM role / temporary credentials.

## `s3_anonymous` (type: `boolean`):

s3 mode. Skip credentials entirely - scans only public buckets/objects.

## `s3_region` (type: `string`):

Required for non-AWS endpoints. 'auto' for R2, 'us-east-1' for AWS, etc.

## `s3_max_object_size` (type: `integer`):

Skip objects larger than this. 0 = default 250 MiB.

## `s3_workers` (type: `integer`):

s3 mode. Number of parallel workers. 0 = default 16.

## `dir_source_url` (type: `string`):

Public HTTPS URL of a .tar.gz or .zip archive. We download, extract, then run 'betterleaks dir' on it.

## `dir_follow_symlinks` (type: `boolean`):

dir mode. Scan files that are symbolic links to other files.

## `stdin_content` (type: `string`):

Paste the full text you want scanned. Limited by Apify input size cap (~5MB).

## `redact` (type: `integer`):

Percentage of secret to MASK in output (0 = show all, 75 = show 25% / mask 75%, 100 = fully redacted). Useful when sharing findings externally.

## `match_context` (type: `string`):

Format: 'NL' for N lines (e.g. '5L') or 'NC' for N characters (e.g. '100C'). Empty = no context.

## `verbose` (type: `boolean`):

Enable detailed scan progress in the actor log.

## `log_level` (type: `string`):

trace / debug / info / warn / error / fatal.

## `legacy_print` (type: `boolean`):

Use the legacy gitleaks-style verbose output format (key/value pairs).

## `no_color` (type: `boolean`):

Disable ANSI color codes in the verbose output. Default ON for cloud logs.

## `no_banner` (type: `boolean`):

Suppress the betterleaks ASCII banner at start of run. Default ON for cleaner logs.

## `enable_rule` (type: `array`):

Whitelist of built-in rule IDs to run. Empty = run all built-in rules.

## `custom_config_toml` (type: `string`):

Paste a full TOML config to override built-in rules. See https://github.com/betterleaks/betterleaks/blob/main/.betterleaks.toml

## `ignore_gitleaks_allow` (type: `boolean`):

When ON, allowlist comments in source code are ignored.

## `gitleaks_ignore_path` (type: `string`):

Default '.'.

## `baseline_url` (type: `string`):

Public HTTPS URL of a previous report JSON. Findings present in baseline will be filtered out (only NEW leaks are reported).

## `max_target_megabytes` (type: `integer`):

Skip files larger than this. Default 100.

## `max_archive_depth` (type: `integer`):

0 = do not extract archives. Higher values let scanner reach into zip-in-zip-in-jar.

## `max_decode_depth` (type: `integer`):

Default 5. Controls how deep base64/url-decoding goes.

## `timeout_seconds` (type: `integer`):

Hard cap on the whole scan duration.

## `regex_engine` (type: `string`):

Default re2 (fast, no backtracking). stdlib supports more features.

## `experiments` (type: `string`):

Upstream feature flags. Refer to betterleaks docs.

## `report_template` (type: `string`):

Go text/template content for custom report format. Advanced.

## `diagnostics` (type: `string`):

Comma-separated: cpu, mem, trace, http. Outputs perf profiles to the actor log.

## `exit_on_findings` (type: `boolean`):

Default ON. Turn OFF for monitoring (run never fails on leaks).

## Actor input object example

```json
{
  "mode": "github",
  "target_url": "https://github.com/octocat",
  "global_search_query": "rzp_live_",
  "global_max_repos": 25,
  "git_workers": 0,
  "validation": false,
  "validation_status": "valid,revoked",
  "validation_timeout": 10,
  "validation_workers": 10,
  "validation_debug": false,
  "validation_extract_empty": false,
  "s3_anonymous": false,
  "s3_max_object_size": 0,
  "s3_workers": 0,
  "dir_source_url": "https://example.com/codebase.tar.gz",
  "dir_follow_symlinks": false,
  "redact": 0,
  "match_context": "5L",
  "verbose": false,
  "log_level": "",
  "legacy_print": false,
  "no_color": true,
  "no_banner": true,
  "ignore_gitleaks_allow": false,
  "max_target_megabytes": 100,
  "max_archive_depth": 0,
  "max_decode_depth": 5,
  "timeout_seconds": 0,
  "regex_engine": "",
  "exit_on_findings": true
}
```

# Actor output Schema

## `findings` (type: `string`):

Default dataset. One record per detected secret. Filter by ValidationStatus=valid to see only live, exploitable credentials when validation was enabled.

## `log` (type: `string`):

Full betterleaks scan log including subprocess stdout/stderr.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

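// Prepare Actor input (example values; see the input schema for all fields)
const input = {
    mode: 'github',
    target_url: 'https://github.com/octocat',
};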

// Run the Actor and wait for it to finish
const run = await client.actor("anshumanatrey/betterleaks-cloud").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

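# Prepare the Actor input (example values; see the input schema for all fields)
run_input = {
    "mode": "github",
    "target_url": "https://github.com/octocat",
}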

# Run the Actor and wait for it to finish
run = client.actor("anshumanatrey/betterleaks-cloud").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{"mode": "github", "target_url": "https://github.com/octocat"}' |
apify call anshumanatrey/betterleaks-cloud --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=anshumanatrey/betterleaks-cloud",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Betterleaks Cloud - GitHub & S3 Secret Scanner",
        "description": "Cloud-hosted Betterleaks v1.3.1 (the successor to Gitleaks, by the original author). Scan GitHub orgs/users/repos/PRs/issues/actions/gists, single git URLs, or S3/R2 buckets. Optional live validation probes each secret against the vendor API to flag which ones are still active.",
        "version": "0.3",
        "x-build-id": "ecAnke1ZVQibnWFAo"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/anshumanatrey~betterleaks-cloud/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-anshumanatrey-betterleaks-cloud",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/anshumanatrey~betterleaks-cloud/runs": {
            "post": {
                "operationId": "runs-sync-anshumanatrey-betterleaks-cloud",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/anshumanatrey~betterleaks-cloud/run-sync": {
            "post": {
                "operationId": "run-sync-anshumanatrey-betterleaks-cloud",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "mode"
                ],
                "properties": {
                    "mode": {
                        "title": "Scan mode",
                        "enum": [
                            "github",
                            "global_github",
                            "git",
                            "s3",
                            "dir",
                            "stdin"
                        ],
                        "type": "string",
                        "description": "Which betterleaks command to run. 'global_github' is our addition - GitHub Code Search picks candidate repos then runs betterleaks on each.",
                        "default": "github"
                    },
                    "target_url": {
                        "title": "Target URL (github / git / s3 modes)",
                        "type": "string",
                        "description": "github: https://github.com/owner | https://github.com/owner/repo | https://github.com/owner/repo/pull/123. git: any clone URL. s3: https://bucket.s3.amazonaws.com/prefix/ or https://account.r2.cloudflarestorage.com/..."
                    },
                    "global_search_query": {
                        "title": "Search query (global_github mode)",
                        "type": "string",
                        "description": "Required for global_github mode. Used as the GitHub Code Search query to discover candidate repos. Examples: 'CASHFREE_APP_ID', 'rzp_live_', '\"sk-ant-api03-\"'."
                    },
                    "global_max_repos": {
                        "title": "Max unique repos to scan (global_github mode)",
                        "minimum": 1,
                        "maximum": 200,
                        "type": "integer",
                        "description": "Cap on how many unique repos to scan. Each repo runs betterleaks separately, so cost scales linearly. Default 25.",
                        "default": 25
                    },
                    "github_token": {
                        "title": "GitHub Personal Access Token",
                        "type": "string",
                        "description": "Recommended for github mode and required for global_github mode (Code Search is auth-only). Generate at github.com/settings/tokens. 'public_repo' scope is enough for public repos."
                    },
                    "include": {
                        "title": "Include these resource types (github mode)",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "Pick which GitHub surfaces to scan. Default = repos.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "repos",
                                "forks",
                                "prs",
                                "pr-comments",
                                "issues",
                                "issue-comments",
                                "actions",
                                "action-artifacts",
                                "discussions",
                                "releases",
                                "release-assets",
                                "gists"
                            ]
                        }
                    },
                    "exclude": {
                        "title": "Exclude these resource types (github mode)",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "Subtracts from the include list. Lets you scan everything except certain resource types.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "repos",
                                "forks",
                                "prs",
                                "pr-comments",
                                "issues",
                                "issue-comments",
                                "actions",
                                "action-artifacts",
                                "discussions",
                                "releases",
                                "release-assets",
                                "gists"
                            ]
                        }
                    },
                    "exclude_repo": {
                        "title": "Exclude repos matching these glob patterns (github mode)",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "Examples: 'archived-*', 'docs-*'.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "actions_workflow": {
                        "title": "Only scan these GitHub Actions workflow files (github mode)",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "Workflow filenames like 'ci.yml'. Empty = scan all workflows.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "since": {
                        "title": "Only scan API items created after this date (github mode)",
                        "type": "string",
                        "description": "YYYY-MM-DD or RFC3339."
                    },
                    "until": {
                        "title": "Only scan API items created before this date (github mode)",
                        "type": "string",
                        "description": "github mode. Pair with 'since' for a date window."
                    },
                    "git_workers": {
                        "title": "Parallel git workers per repo",
                        "minimum": 0,
                        "maximum": 64,
                        "type": "integer",
                        "description": "git / github modes. 0 = single process (lowest memory). Higher = faster, more memory.",
                        "default": 0
                    },
                    "log_opts": {
                        "title": "Git log options pass-through",
                        "type": "string",
                        "description": "git / github modes. Passed verbatim to 'git log'. Examples: '--all --since=2024-01-01', '--author=alice', '-S secret_string'."
                    },
                    "git_platform": {
                        "title": "Git platform (link format)",
                        "enum": [
                            "",
                            "github",
                            "gitlab"
                        ],
                        "type": "string",
                        "description": "git mode. Used to generate permalinks. 'github' or 'gitlab'."
                    },
                    "validation": {
                        "title": "Enable live secret validation",
                        "type": "boolean",
                        "description": "Probe each detected secret against its vendor API. Off by default. Only rules that ship with a 'validate' CEL clause will actually probe.",
                        "default": false
                    },
                    "validation_status": {
                        "title": "Only include findings with these validation statuses",
                        "type": "string",
                        "description": "Comma-separated list. Options: valid, invalid, revoked, unknown, error. Empty = include all."
                    },
                    "validation_timeout": {
                        "title": "Validation HTTP request timeout (seconds)",
                        "minimum": 1,
                        "maximum": 300,
                        "type": "integer",
                        "description": "Per-request timeout for vendor API probes. Default 10s.",
                        "default": 10
                    },
                    "validation_workers": {
                        "title": "Concurrent validation workers",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Parallel validation HTTP requests. Default 10.",
                        "default": 10
                    },
                    "validation_debug": {
                        "title": "Include raw HTTP responses in validation output",
                        "type": "boolean",
                        "description": "Useful for debugging CEL validators. Verbose.",
                        "default": false
                    },
                    "validation_extract_empty": {
                        "title": "Include empty extractor values in ValidationMeta",
                        "type": "boolean",
                        "description": "Include validation extractor results even when they are empty strings.",
                        "default": false
                    },
                    "validation_env_vars": {
                        "title": "Env vars accessible to validation CEL",
                        "type": "string",
                        "description": "Comma-separated env var names. Some rules' validators need extra credentials."
                    },
                    "s3_access_key": {
                        "title": "S3 access key (s3 mode)",
                        "type": "string",
                        "description": "Leave blank and enable 'Anonymous' for public buckets."
                    },
                    "s3_secret_key": {
                        "title": "S3 secret key (s3 mode)",
                        "type": "string",
                        "description": "s3 mode. Paired with access key for AWS / S3-compatible authentication."
                    },
                    "s3_session_token": {
                        "title": "S3 session token (s3 mode)",
                        "type": "string",
                        "description": "For IAM role / temporary credentials."
                    },
                    "s3_anonymous": {
                        "title": "Anonymous (s3 mode - public buckets only)",
                        "type": "boolean",
                        "description": "s3 mode. Skip credentials entirely - scans only public buckets/objects.",
                        "default": false
                    },
                    "s3_region": {
                        "title": "S3 region (s3 mode)",
                        "type": "string",
                        "description": "Required for non-AWS endpoints. 'auto' for R2, 'us-east-1' for AWS, etc."
                    },
                    "s3_max_object_size": {
                        "title": "S3 max object size (bytes)",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Skip objects larger than this. 0 = default 250 MiB.",
                        "default": 0
                    },
                    "s3_workers": {
                        "title": "S3 concurrent object fetches",
                        "minimum": 0,
                        "maximum": 256,
                        "type": "integer",
                        "description": "0 = default 16.",
                        "default": 0
                    },
                    "dir_source_url": {
                        "title": "Tarball / zip URL to download (dir mode)",
                        "type": "string",
                        "description": "Public HTTPS URL of a .tar.gz or .zip archive. We download, extract, then run 'betterleaks dir' on it."
                    },
                    "dir_follow_symlinks": {
                        "title": "Follow symlinks (dir mode)",
                        "type": "boolean",
                        "description": "dir mode. Scan files that are symbolic links to other files.",
                        "default": false
                    },
                    "stdin_content": {
                        "title": "Raw text content to scan (stdin mode)",
                        "type": "string",
                        "description": "Paste the full text you want scanned. Limited by Apify input size cap (~5MB)."
                    },
                    "redact": {
                        "title": "Redact secrets in output (0-100)",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Percentage of secret to MASK in output (0 = show all, 75 = show 25% / mask 75%, 100 = fully redacted). Useful when sharing findings externally.",
                        "default": 0
                    },
                    "match_context": {
                        "title": "Context around each match",
                        "type": "string",
                        "description": "Format: 'NL' for N lines (e.g. '5L') or 'NC' for N characters (e.g. '100C'). Empty = no context."
                    },
                    "verbose": {
                        "title": "Verbose scan log",
                        "type": "boolean",
                        "description": "Enable detailed scan progress in the Actor log.",
                        "default": false
                    },
                    "log_level": {
                        "title": "Log level",
                        "enum": [
                            "",
                            "trace",
                            "debug",
                            "info",
                            "warn",
                            "error",
                            "fatal"
                        ],
                        "type": "string",
                        "description": "trace / debug / info / warn / error / fatal.",
                        "default": ""
                    },
                    "legacy_print": {
                        "title": "Use legacy key-value verbose format",
                        "type": "boolean",
                        "description": "Use the legacy gitleaks-style verbose output format (key/value pairs).",
                        "default": false
                    },
                    "no_color": {
                        "title": "Disable color in output",
                        "type": "boolean",
                        "description": "Disable ANSI color codes in the verbose output. Default ON for cloud logs.",
                        "default": true
                    },
                    "no_banner": {
                        "title": "Suppress betterleaks banner",
                        "type": "boolean",
                        "description": "Suppress the betterleaks ASCII banner at start of run. Default ON for cleaner logs.",
                        "default": true
                    },
                    "enable_rule": {
                        "title": "Only run these rule IDs",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "Whitelist. Empty = run all built-in rules.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "custom_config_toml": {
                        "title": "Custom betterleaks config (TOML)",
                        "type": "string",
                        "description": "Paste a full TOML config to override built-in rules. See https://github.com/betterleaks/betterleaks/blob/main/.betterleaks.toml"
                    },
                    "ignore_gitleaks_allow": {
                        "title": "Ignore 'gitleaks:allow' comments in code",
                        "type": "boolean",
                        "description": "When ON, allowlist comments in source code are ignored.",
                        "default": false
                    },
                    "gitleaks_ignore_path": {
                        "title": "Path to gitleaks ignore file",
                        "type": "string",
                        "description": "Default '.'."
                    },
                    "baseline_url": {
                        "title": "Baseline report URL",
                        "type": "string",
                        "description": "Public HTTPS URL of a previous report JSON. Findings present in baseline will be filtered out (only NEW leaks are reported)."
                    },
                    "max_target_megabytes": {
                        "title": "Max file size to scan (MB)",
                        "minimum": 0,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Skip files larger than this. Default 100.",
                        "default": 100
                    },
                    "max_archive_depth": {
                        "title": "Max nested archive scanning depth",
                        "minimum": 0,
                        "maximum": 10,
                        "type": "integer",
                        "description": "0 = do not extract archives. Higher values let the scanner reach into nested archives (e.g. zip-in-zip-in-jar).",
                        "default": 0
                    },
                    "max_decode_depth": {
                        "title": "Recursive decoding depth limit",
                        "minimum": 0,
                        "maximum": 20,
                        "type": "integer",
                        "description": "Default 5. Controls how deep base64/url-decoding goes.",
                        "default": 5
                    },
                    "timeout_seconds": {
                        "title": "Command timeout (seconds, 0 = no limit)",
                        "minimum": 0,
                        "maximum": 7200,
                        "type": "integer",
                        "description": "Hard cap on the whole scan duration.",
                        "default": 0
                    },
                    "regex_engine": {
                        "title": "Regex engine",
                        "enum": [
                            "",
                            "re2",
                            "stdlib"
                        ],
                        "type": "string",
                        "description": "Default re2 (fast, no backtracking). stdlib supports more features.",
                        "default": ""
                    },
                    "experiments": {
                        "title": "Comma-separated experimental features",
                        "type": "string",
                        "description": "Upstream feature flags. Refer to betterleaks docs."
                    },
                    "report_template": {
                        "title": "Report template content",
                        "type": "string",
                        "description": "Go text/template content for custom report format. Advanced."
                    },
                    "diagnostics": {
                        "title": "Enable diagnostics",
                        "type": "string",
                        "description": "Comma-separated: cpu, mem, trace, http. Outputs perf profiles to the Actor log."
                    },
                    "exit_on_findings": {
                        "title": "Exit with code 1 when findings are present",
                        "type": "boolean",
                        "description": "Default ON. Turn OFF for monitoring (run never fails on leaks).",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
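
The input schema above states numeric bounds and mode-specific requirements that can be checked locally before starting a run. The following is a minimal sketch of such a pre-flight check, not part of the Actor or the Apify client. The names `check_input`, `INTEGER_BOUNDS`, and `SCAN_MODES` are hypothetical, and the scan mode is passed as a separate argument rather than read from the input object. Only rules stated in the schema descriptions are checked; the platform still performs its own validation.

```python
from typing import Any

# (minimum, maximum) pairs copied from the input schema above.
INTEGER_BOUNDS = {
    "global_max_repos": (1, 200),
    "git_workers": (0, 64),
    "validation_timeout": (1, 300),
    "validation_workers": (1, 100),
    "s3_workers": (0, 256),
    "redact": (0, 100),
    "max_target_megabytes": (0, 5000),
    "max_archive_depth": (0, 10),
    "max_decode_depth": (0, 20),
    "timeout_seconds": (0, 7200),
}

# Values of the "Scan mode" enum in the schema.
SCAN_MODES = {"github", "global_github", "git", "s3", "dir", "stdin"}


def check_input(mode: str, run_input: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems: list[str] = []

    if mode not in SCAN_MODES:
        problems.append(f"unknown scan mode: {mode!r}")

    # The schema marks both fields as required for global_github mode.
    if mode == "global_github":
        for key in ("global_search_query", "github_token"):
            if not run_input.get(key):
                problems.append(f"{key} is required for global_github mode")

    # Range-check only the integer fields that are actually present.
    for key, (low, high) in INTEGER_BOUNDS.items():
        if key in run_input:
            value = run_input[key]
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or not low <= value <= high:
                problems.append(f"{key} must be an integer in [{low}, {high}]")

    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    candidate = {"global_search_query": "rzp_live_", "global_max_repos": 500}
    for problem in check_input("global_github", candidate):
        print(problem)
```

Keeping the bounds in one table mirrors the schema and makes it easy to update the check if the Actor's limits change.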