# Docs-to-RAG Optimizer (`vamsi-krishna/docs-to-rag-optimizer`) Actor

Convert public developer documentation into clean Markdown, semantic RAG chunks, token counts, duplicate hashes, JSONL exports, and quality warnings for AI assistants.

- **URL**: https://apify.com/vamsi-krishna/docs-to-rag-optimizer.md
- **Developed by:** [Vamsi Krishna](https://apify.com/vamsi-krishna) (community)
- **Categories:** AI, Developer tools, Automation
- **Stats:** 2 total users, 1 monthly user, 33.3% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $0.50 / 1,000 processed pages

This Actor is paid per event and usage: you are charged a fixed price for specific events plus Apify platform usage.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Docs to RAG - Documentation to Markdown, JSONL & AI Chunks

Turn public developer documentation into clean, LLM-ready data for RAG pipelines.

This Apify Actor crawls docs websites, removes navigation/sidebar/footer noise, converts pages to Markdown, splits content into semantic chunks, counts tokens, detects duplicates, and exports JSONL files that are easy to load into vector databases and AI search systems.

### Best For

- Building AI assistants over product or developer documentation
- Preparing docs for OpenAI vector stores, Pinecone, Supabase Vector, Weaviate, Qdrant, Chroma, LangChain, and LlamaIndex
- Converting Docusaurus, GitBook, MkDocs/Material, MDN-style, and custom docs pages into clean Markdown
- Creating stable page and chunk records with content hashes for incremental RAG ingestion

### What You Get

- Clean Markdown for every processed page
- Page JSON records in the `pages` dataset
- Chunk JSON records in the `chunks` dataset
- Default dataset records for easy Apify Console/API export
- Consolidated `pages.jsonl` and `chunks.jsonl` exports in the key-value store
- Token counts for pages and chunks using OpenAI-style tokenization
- Header-aware RAG chunks with heading paths and previous/next chunk IDs
- SHA-256 content hashes for pages and chunks
- Exact duplicate detection with `duplicateOf`
- RAG quality score, warnings, and `recommendedAction`
- Optional per-page Markdown files in the key-value store

### Why Use This Instead of a Generic Web Scraper?

Generic website scrapers are useful when you need broad website crawling. This Actor is built specifically for documentation-to-RAG workflows:

- Docs-specific cleanup for Docusaurus, GitBook, and MkDocs/Material
- Header-aware chunks instead of fixed character splitting
- `embeddingText` on every chunk for direct vector database ingestion
- Page and chunk JSONL exports for batch pipelines
- Duplicate detection to avoid embedding the same page twice
- Quality warnings so bad extractions are visible before you embed them
- Page-based pricing at `$1.00 / 1,000 pages`, not per generated chunk

### Supported Documentation Platforms

The Actor is optimized for:

- Docusaurus
- GitBook
- MkDocs / Material for MkDocs

Unknown or custom documentation sites use a Readability-based fallback extractor.

### Example Use Cases

- Crawl `https://docusaurus.io/docs` and create JSONL chunks for a docs chatbot
- Convert GitBook docs into Markdown files for an internal knowledge base
- Extract MkDocs/Material documentation into chunk records for Supabase Vector
- Deduplicate repeated docs pages before embedding to reduce vector database cost
- Build an AI search index from public developer documentation

### Example Input

```json
{
  "startUrls": [{ "url": "https://docusaurus.io/docs" }],
  "maxPages": 50,
  "maxDepth": 3,
  "includePatterns": ["^https://docusaurus\\.io/docs"],
  "excludePatterns": ["/blog/"],
  "outputFormats": ["json", "markdown"],
  "chunkingEnabled": true,
  "chunkStrategy": "header-aware",
  "chunkSize": 800,
  "chunkOverlap": 100,
  "deduplicateContent": true,
  "respectRobotsTxt": true,
  "maxConcurrency": 5
}
````

### Example Page Output

```json
{
  "recordType": "page",
  "url": "https://docusaurus.io/docs",
  "canonicalUrl": "https://docusaurus.io/docs",
  "title": "Introduction | Docusaurus",
  "metadata": {
    "docsPlatform": "docusaurus",
    "language": "en"
  },
  "tokenCount": 2189,
  "contentHash": "sha256:...",
  "duplicateOf": null,
  "qualityScore": 95,
  "recommendedAction": "use"
}
```

Page records also include `cleanMarkdown`, `textContent`, `headings`, `codeBlocks`, `tables`, `links`, `qualityWarnings`, and `crawledAt`.

### Example Chunk Output

```json
{
  "recordType": "chunk",
  "chunkId": "chunk_abc123_000",
  "sourceUrl": "https://docusaurus.io/docs",
  "pageTitle": "Introduction | Docusaurus",
  "sectionTitle": "Getting started",
  "headingPath": ["Introduction", "Getting started"],
  "embeddingText": "Install Docusaurus and create your first docs site...",
  "tokenCount": 392,
  "chunkIndex": 0,
  "previousChunkId": null,
  "nextChunkId": "chunk_abc123_001",
  "contentHash": "..."
}
```

Chunk records also include `chunkMarkdown`, `chunkText`, and metadata such as `docsPlatform`, `hasCodeBlock`, `hasTable`, `sourceLastModified`, and `sourceContentHash`.
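Because each chunk carries `previousChunkId` and `nextChunkId`, a retrieval hit can be widened with its neighbours before it is passed to a model. The following is a minimal sketch, assuming the chunk records have already been loaded into a dictionary keyed by `chunkId`; the helper name `expand_with_neighbors` is hypothetical and not part of the Actor.

```python
def expand_with_neighbors(chunk_id: str, chunks_by_id: dict[str, dict]) -> str:
    """Join a chunk's text with its previous and next chunks, in reading order."""
    chunk = chunks_by_id[chunk_id]
    ordered_ids = [chunk.get("previousChunkId"), chunk_id, chunk.get("nextChunkId")]
    parts = []
    for cid in ordered_ids:
        # Skip missing links (None at page boundaries) and IDs that were not loaded.
        if cid is not None and cid in chunks_by_id:
            parts.append(chunks_by_id[cid]["embeddingText"])
    return "\n\n".join(parts)
```

Expanding by one chunk in each direction is an arbitrary choice for illustration; with `chunkOverlap` enabled, adjacent chunks may repeat some text.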

### Copy-Paste API Example

```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

const run = await client.actor('vamsi-krishna/docs-to-rag-optimizer').call({
  startUrls: [{ url: 'https://docusaurus.io/docs' }],
  maxPages: 50,
  includePatterns: ['^https://docusaurus\\.io/docs'],
  outputFormats: ['json', 'markdown'],
  chunkingEnabled: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
const chunks = items.filter((item) => item.recordType === 'chunk');
```

For embeddings, use `embeddingText` from chunk records and store `sourceUrl`, `pageTitle`, `headingPath`, `contentHash`, and `metadata` as vector metadata.
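The mapping described above could be sketched in Python as follows. The output shape (`id`, `text`, `metadata`) is a hypothetical, store-agnostic layout rather than the format of any particular vector database, and joining `headingPath` into one string is an assumption made because many stores accept only scalar metadata values.

```python
def to_vector_record(chunk: dict) -> dict:
    """Map one chunk record to a generic record ready for embedding and upsert."""
    return {
        "id": chunk["chunkId"],
        # Text that gets embedded.
        "text": chunk["embeddingText"],
        "metadata": {
            "sourceUrl": chunk["sourceUrl"],
            "pageTitle": chunk.get("pageTitle"),
            # Assumption: flatten the heading path to a single string.
            "headingPath": " > ".join(chunk.get("headingPath", [])),
            "contentHash": chunk.get("contentHash"),
        },
    }
```

Any additional fields from the chunk's `metadata` object can be merged into the `metadata` dictionary in the same way.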

### Output Locations

- Named dataset `pages`: one page record per successfully processed page
- Named dataset `chunks`: one chunk record per generated chunk
- Key-value store `pages.jsonl`: consolidated page export
- Key-value store `chunks.jsonl`: consolidated chunk export
- Key-value store `OUTPUT.json`: run summary with counts and export keys
- Key-value store `pages_<sha256>.md`: optional per-page Markdown when `outputFormats` includes `markdown`
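Once `pages.jsonl` or `chunks.jsonl` has been downloaded, each non-empty line is one JSON record. A minimal parsing sketch, assuming the export has already been read into a string (how you fetch it from the key-value store is left out here); `parse_jsonl` is a hypothetical helper name.

```python
import json

def parse_jsonl(payload: str) -> list[dict]:
    """Parse a JSONL export into a list of records, ignoring blank lines."""
    records = []
    # Split on "\n" only, so unusual Unicode line separators inside JSON strings are left alone.
    for line in payload.split("\n"):
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records
```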

### Pricing

Pricing is based on successfully processed pages:

- Base price: `$1.00 / 1,000 pages`
- Starter discount: `$0.90 / 1,000 pages`
- Scale discount: `$0.75 / 1,000 pages`
- Business discount: `$0.50 / 1,000 pages`

The Actor charges the `page-processed` event only after a page has been crawled, extracted, converted, saved, and chunked when chunking is enabled.

It does not charge per chunk. Large pages may produce many chunks, but billing remains page-based.
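As a rough worked example of the page-based pricing, the sketch below estimates the `page-processed` charge for a run. The plan keys (`base`, `starter`, `scale`, `business`) are hypothetical labels for the tiers listed above, and the estimate excludes Apify platform usage, which is billed separately.

```python
# Price in USD per 1,000 successfully processed pages, taken from the list above.
PRICE_PER_1000_PAGES = {
    "base": 1.00,
    "starter": 0.90,
    "scale": 0.75,
    "business": 0.50,
}

def estimate_event_cost(pages_processed: int, plan: str = "base") -> float:
    """Estimate the page-processed event cost in USD (platform usage not included)."""
    return round(pages_processed * PRICE_PER_1000_PAGES[plan] / 1000, 4)

print(estimate_event_cost(50))             # 50 pages at the base price -> 0.05
print(estimate_event_cost(2000, "scale"))  # 2,000 pages at the Scale price -> 1.5
```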

### Known Limits

- Private docs behind login are not supported in v1.
- PDF/DOCX extraction is not included in v1.
- JavaScript-heavy docs use a Playwright fallback, but static docs are faster and cheaper.
- Exact duplicate detection uses normalized text hashes; near-duplicate detection is not included yet.
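To illustrate what exact duplicate detection over normalized text hashes can look like, here is a minimal sketch. The normalization steps (collapse whitespace, trim, lowercase), the `sha256:` prefix, and the use of the page URL as the duplicate reference are all assumptions for illustration; the Actor's actual rules are not documented here.

```python
import hashlib
import re
from typing import Optional

def normalized_hash(text: str) -> str:
    """Hash text after an assumed normalization: collapse whitespace, trim, lowercase."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return "sha256:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_exact_duplicates(pages: list[dict]) -> dict[str, Optional[str]]:
    """Map each page URL to the URL of the first page with the same hash, or None."""
    first_seen = {}
    duplicate_of = {}
    for page in pages:
        digest = normalized_hash(page["textContent"])
        # None for the first occurrence, otherwise the URL of the earlier page.
        duplicate_of[page["url"]] = first_seen.get(digest)
        first_seen.setdefault(digest, page["url"])
    return duplicate_of
```

Because the comparison is on exact hashes, two pages that differ by a single word are not treated as duplicates, which matches the near-duplicate limitation noted above.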

### Input Fields

- `startUrls`: documentation URLs to start crawling
- `sitemapUrls`: optional XML sitemap URLs
- `maxPages`: maximum successfully processed pages
- `maxDepth`: maximum crawl depth
- `includePatterns`: JavaScript regex strings for allowed URLs
- `excludePatterns`: JavaScript regex strings for blocked URLs
- `crawlOnlyDocs`: skip obvious non-doc paths such as blog, pricing, login, legal
- `outputFormats`: `json`, `markdown`, or both
- `removeSelectors`: CSS selectors to remove before extraction
- `keepSelectors`: CSS selectors to restrict extraction to specific areas
- `preserveCodeBlocks`: keep fenced code blocks
- `preserveTables`: keep GitHub-Flavored Markdown tables
- `preserveLinks`: keep links in Markdown and JSON
- `chunkingEnabled`: generate RAG chunks
- `chunkStrategy`: `header-aware`
- `chunkSize`: target chunk size in tokens
- `chunkOverlap`: approximate chunk overlap in tokens
- `deduplicateContent`: mark exact duplicate pages and skip duplicate chunking
- `respectRobotsTxt`: respect robots.txt rules
- `maxConcurrency`: maximum concurrent requests
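To give a feel for what the `header-aware` strategy means, the sketch below splits Markdown at headings and records a heading path for each section. It is an illustration of the general technique only, not the Actor's implementation: it ignores `chunkSize` and `chunkOverlap`, handles only ATX (`#`) headings, and the function name `split_by_headings` is hypothetical.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")

def split_by_headings(markdown: str) -> list[dict]:
    """Split Markdown into sections, each tagged with its heading path."""
    sections = []
    path = []   # stack of (level, title) for the current heading path
    body = []   # lines collected for the current section
    in_fence = False

    def flush():
        text = "\n".join(body).strip()
        if text:
            sections.append({"headingPath": [title for _, title in path], "text": text})
        body.clear()

    for line in markdown.split("\n"):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        # Lines starting with "#" inside a fenced code block are not headings.
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before pushing the new one.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return sections
```

A size-aware chunker would additionally split or merge these sections to approach the target token count.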

### URL Pattern Policy

`includePatterns` and `excludePatterns` are treated as JavaScript regular expression strings and compiled with `new RegExp(pattern)`.

Example:

```json
{
  "includePatterns": ["^https://developer\\.mozilla\\.org/en-US/docs/Web/JavaScript"],
  "excludePatterns": ["/contributors\\.txt$", "/blog/"]
}
```
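The filtering rule can be previewed locally before starting a run. The sketch below mirrors it in Python: a URL is crawled when it matches no exclude pattern and, if include patterns are given, at least one of them. It uses Python's `re.search` as a stand-in for JavaScript's unanchored `RegExp` matching; the two engines agree for simple patterns like these but differ in some advanced syntax. The function name `url_allowed` is hypothetical.

```python
import re

def url_allowed(url: str, include_patterns: list[str], exclude_patterns: list[str]) -> bool:
    """Return True if the URL passes the include/exclude regex filters."""
    if any(re.search(pattern, url) for pattern in exclude_patterns):
        return False
    # An empty include list means "include everything".
    if not include_patterns:
        return True
    return any(re.search(pattern, url) for pattern in include_patterns)

include = [r"^https://developer\.mozilla\.org/en-US/docs/Web/JavaScript"]
exclude = [r"/contributors\.txt$", r"/blog/"]
print(url_allowed("https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide", include, exclude))
```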

### Quality Signals

Each page includes:

- `qualityScore`: deterministic 0-100 score
- `qualityWarnings`: extraction/chunking warnings
- `recommendedAction`: `use`, `review`, or `skip`

These fields help identify pages that are ready for embedding versus pages that need manual review.
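One way to act on these signals is to bucket pages by `recommendedAction` before embedding. This is a minimal sketch, assuming page records have been loaded as dictionaries; routing missing or unexpected values to `review` is an assumption, and `bucket_by_recommendation` is a hypothetical helper name.

```python
def bucket_by_recommendation(pages: list[dict]) -> dict[str, list[str]]:
    """Group page URLs by their recommendedAction value."""
    buckets = {"use": [], "review": [], "skip": []}
    for page in pages:
        action = page.get("recommendedAction")
        # Assumption: anything missing or unexpected is routed to manual review.
        if action not in buckets:
            action = "review"
        buckets[action].append(page["url"])
    return buckets
```

Pages in the `use` bucket can go straight to embedding, while the `review` bucket is worth checking against `qualityWarnings` first.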

### Local Development

```bash
pnpm install
pnpm run build
pnpm start
```

Run locally with Apify CLI:

```bash
apify run --purge --input-file INPUT.example.json
```

### Search Keywords

RAG, LLM, AI assistant, documentation scraper, docs scraper, Markdown scraper, JSONL export, vector database, embeddings, chunks, semantic chunking, Docusaurus scraper, GitBook scraper, MkDocs scraper, Material for MkDocs, developer docs, AI search, LangChain, LlamaIndex, OpenAI, Pinecone, Supabase Vector.

# Actor input Schema

## `startUrls` (type: `array`):

One or more public documentation URLs to crawl and convert into Markdown, JSONL, and RAG chunks.

## `sitemapUrls` (type: `array`):

Optional XML sitemap URLs to seed additional pages.

## `maxPages` (type: `integer`):

Maximum number of pages to successfully process (stored + charged).

## `maxDepth` (type: `integer`):

Maximum crawl depth from the start URLs.

## `includePatterns` (type: `array`):

Only crawl URLs matching at least one regex. If empty, include all (subject to other filters).

## `excludePatterns` (type: `array`):

Skip URLs matching any regex.

## `crawlOnlyDocs` (type: `boolean`):

Skip obvious non-docs pages (blog, pricing, login, etc.).

## `outputFormats` (type: `array`):

Output formats to produce.

## `removeSelectors` (type: `array`):

CSS selectors to remove before extraction/markdown conversion.

## `keepSelectors` (type: `array`):

If provided, restrict extraction to these selectors.

## `preserveCodeBlocks` (type: `boolean`):

Preserve fenced code blocks in the generated Markdown and extract code blocks into JSON.

## `preserveTables` (type: `boolean`):

Convert HTML tables into GitHub-Flavored Markdown tables and include table HTML in JSON.

## `preserveLinks` (type: `boolean`):

Keep links in Markdown and include normalized links in JSON.

## `chunkingEnabled` (type: `boolean`):

Generate chunk-level records suitable for RAG ingestion.

## `chunkStrategy` (type: `string`):

Chunking strategy used to split pages into semantic chunks.

## `chunkSize` (type: `integer`):

Target maximum tokens per chunk (approximate).

## `chunkOverlap` (type: `integer`):

Approximate overlap between consecutive chunks to improve retrieval continuity.

## `deduplicateContent` (type: `boolean`):

Detect exact-duplicate pages by content hash and mark them via `duplicateOf`.

## `respectRobotsTxt` (type: `boolean`):

Respect robots.txt disallow rules when crawling.

## `maxConcurrency` (type: `integer`):

Maximum concurrent page requests.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://docusaurus.io/docs"
    }
  ],
  "sitemapUrls": [],
  "maxPages": 100,
  "maxDepth": 5,
  "includePatterns": [],
  "excludePatterns": [],
  "crawlOnlyDocs": true,
  "outputFormats": [
    "json",
    "markdown"
  ],
  "removeSelectors": [],
  "keepSelectors": [],
  "preserveCodeBlocks": true,
  "preserveTables": true,
  "preserveLinks": true,
  "chunkingEnabled": true,
  "chunkStrategy": "header-aware",
  "chunkSize": 800,
  "chunkOverlap": 100,
  "deduplicateContent": true,
  "respectRobotsTxt": true,
  "maxConcurrency": 5
}
```

# Actor output Schema

## `dataset` (type: `string`):

No description

## `exports` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://docusaurus.io/docs"
        }
    ],
    "sitemapUrls": [],
    "outputFormats": [
        "json",
        "markdown"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("vamsi-krishna/docs-to-rag-optimizer").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://docusaurus.io/docs" }],
    "sitemapUrls": [],
    "outputFormats": [
        "json",
        "markdown",
    ],
}

# Run the Actor and wait for it to finish
run = client.actor("vamsi-krishna/docs-to-rag-optimizer").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://docusaurus.io/docs"
    }
  ],
  "sitemapUrls": [],
  "outputFormats": [
    "json",
    "markdown"
  ]
}' |
apify call vamsi-krishna/docs-to-rag-optimizer --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=vamsi-krishna/docs-to-rag-optimizer",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Docs-to-RAG Optimizer",
        "description": "Convert public developer documentation into clean Markdown, semantic RAG chunks, token counts, duplicate hashes, JSONL exports, and quality warnings for AI assistants.",
        "version": "0.1",
        "x-build-id": "ceR4NC1byF0zmjUFg"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/vamsi-krishna~docs-to-rag-optimizer/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-vamsi-krishna-docs-to-rag-optimizer",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/vamsi-krishna~docs-to-rag-optimizer/runs": {
            "post": {
                "operationId": "runs-sync-vamsi-krishna-docs-to-rag-optimizer",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/vamsi-krishna~docs-to-rag-optimizer/run-sync": {
            "post": {
                "operationId": "run-sync-vamsi-krishna-docs-to-rag-optimizer",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "One or more public documentation URLs to crawl and convert into Markdown, JSONL, and RAG chunks.",
                        "default": [
                            {
                                "url": "https://docusaurus.io/docs"
                            }
                        ],
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "sitemapUrls": {
                        "title": "Sitemap URLs",
                        "type": "array",
                        "description": "Optional XML sitemap URLs to seed additional pages.",
                        "items": {
                            "type": "string"
                        },
                        "default": []
                    },
                    "maxPages": {
                        "title": "Max pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of pages to successfully process (stored + charged).",
                        "default": 100
                    },
                    "maxDepth": {
                        "title": "Max depth",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Maximum crawl depth from the start URLs.",
                        "default": 5
                    },
                    "includePatterns": {
                        "title": "Include patterns (regex)",
                        "type": "array",
                        "description": "Only crawl URLs matching at least one regex. If empty, include all (subject to other filters).",
                        "items": {
                            "type": "string"
                        },
                        "default": []
                    },
                    "excludePatterns": {
                        "title": "Exclude patterns (regex)",
                        "type": "array",
                        "description": "Skip URLs matching any regex.",
                        "items": {
                            "type": "string"
                        },
                        "default": []
                    },
                    "crawlOnlyDocs": {
                        "title": "Crawl only docs pages",
                        "type": "boolean",
                        "description": "Skip obvious non-docs pages (blog, pricing, login, etc.).",
                        "default": true
                    },
                    "outputFormats": {
                        "title": "Output formats",
                        "type": "array",
                        "description": "Output formats to produce.",
                        "items": {
                            "type": "string",
                            "enum": [
                                "json",
                                "markdown"
                            ]
                        },
                        "default": [
                            "json",
                            "markdown"
                        ]
                    },
                    "removeSelectors": {
                        "title": "Remove selectors",
                        "type": "array",
                        "description": "CSS selectors to remove before extraction/markdown conversion.",
                        "items": {
                            "type": "string"
                        },
                        "default": []
                    },
                    "keepSelectors": {
                        "title": "Keep selectors",
                        "type": "array",
                        "description": "If provided, restrict extraction to these selectors.",
                        "items": {
                            "type": "string"
                        },
                        "default": []
                    },
                    "preserveCodeBlocks": {
                        "title": "Preserve code blocks",
                        "type": "boolean",
                        "description": "Preserve fenced code blocks in the generated Markdown and extract code blocks into JSON.",
                        "default": true
                    },
                    "preserveTables": {
                        "title": "Preserve tables",
                        "type": "boolean",
                        "description": "Convert HTML tables into GitHub-Flavored Markdown tables and include table HTML in JSON.",
                        "default": true
                    },
                    "preserveLinks": {
                        "title": "Preserve links",
                        "type": "boolean",
                        "description": "Keep links in Markdown and include normalized links in JSON.",
                        "default": true
                    },
                    "chunkingEnabled": {
                        "title": "Chunking enabled",
                        "type": "boolean",
                        "description": "Generate chunk-level records suitable for RAG ingestion.",
                        "default": true
                    },
                    "chunkStrategy": {
                        "title": "Chunk strategy",
                        "enum": [
                            "header-aware"
                        ],
                        "type": "string",
                        "description": "Chunking strategy used to split pages into semantic chunks.",
                        "default": "header-aware"
                    },
                    "chunkSize": {
                        "title": "Chunk size (tokens)",
                        "minimum": 50,
                        "type": "integer",
                        "description": "Target maximum tokens per chunk (approximate).",
                        "default": 800
                    },
                    "chunkOverlap": {
                        "title": "Chunk overlap (tokens)",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Approximate overlap between consecutive chunks to improve retrieval continuity.",
                        "default": 100
                    },
                    "deduplicateContent": {
                        "title": "Deduplicate content",
                        "type": "boolean",
                        "description": "Detect exact-duplicate pages by content hash and mark them via duplicateOf.",
                        "default": true
                    },
                    "respectRobotsTxt": {
                        "title": "Respect robots.txt",
                        "type": "boolean",
                        "description": "Respect robots.txt disallow rules when crawling.",
                        "default": true
                    },
                    "maxConcurrency": {
                        "title": "Max concurrency",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum concurrent page requests.",
                        "default": 5
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
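
The run object described by the schema above can be read with a few lines of Python. The following is a minimal sketch, not part of the Actor itself: it assumes the run object has already been parsed into a `dict` (for example, from an API response body), and the names `parse_timestamp`, `summarize_run`, and `SAMPLE_RUN` are hypothetical helpers introduced only for illustration. The sample values are the `example` values from the schema.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp such as 2025-01-08T00:00:00.000Z."""
    if not value:
        return None
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_run(run: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the fields an integration typically needs (hypothetical helper)."""
    started = parse_timestamp(run.get("startedAt"))
    finished = parse_timestamp(run.get("finishedAt"))
    duration = (finished - started).total_seconds() if started and finished else None

    # Sum the per-resource costs listed under usageUsd.
    itemized_usd = sum(float(v) for v in run.get("usageUsd", {}).values())

    return {
        "status": run.get("status"),
        "datasetId": run.get("defaultDatasetId"),
        "durationSecs": duration,
        "usageTotalUsd": run.get("usageTotalUsd"),
        "itemizedUsd": itemized_usd,
    }


# Sample run object built from the schema's example values (assumed shape).
SAMPLE_RUN = {
    "status": "READY",
    "startedAt": "2025-01-08T00:00:00.000Z",
    "finishedAt": "2025-01-08T00:00:00.000Z",
    "defaultDatasetId": "example-dataset-id",  # placeholder, not a real ID
    "usageTotalUsd": 0.00005,
    "usageUsd": {"KEY_VALUE_STORE_WRITES": 0.00005, "DATASET_WRITES": 0},
}

if __name__ == "__main__":
    print(summarize_run(SAMPLE_RUN))
```

Comparing `itemizedUsd` against `usageTotalUsd` is a cheap sanity check when logging run costs. `defaultDatasetId` is the field to keep if the results are fetched later.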
