# Website Content Extractor for RAG: Markdown, HTML, Text (`nezha/website-content-crawler`) Actor

Turn docs sites, help centers, blogs, and websites into clean markdown, text, or HTML for RAG, AI knowledge bases, and internal search. Crawl from start URLs or sitemaps and keep the crawl in scope.

- **URL**: https://apify.com/nezha/website-content-crawler.md
- **Developed by:** [nezha](https://apify.com/nezha) (community)
- **Categories:** AI, Developer tools, SEO tools
- **Stats:** 17 total users, 4 monthly users, 95.3% runs succeeded, 1 bookmark
- **User rating**: 5.00 out of 5 stars

## Pricing

from $0.01 / result

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
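As a rough illustration of per-result pricing, the sketch below estimates the cost of a run. It assumes each extracted page is billed as one result at the listed starting price of $0.01; the actual events and prices are defined on the Actor's pricing page and may differ. `estimate_cost` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Assumption: one billed "result" per extracted page, at the listed
# starting price. The real event prices may be different.
PRICE_PER_RESULT_USD = Decimal("0.01")


def estimate_cost(result_count: int, price: Decimal = PRICE_PER_RESULT_USD) -> Decimal:
    """Return the estimated cost in USD for a given number of results."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    return price * result_count


# Example: a crawl limited to 50 pages.
print(f"Estimated cost: ${estimate_cost(50)}")
```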

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in the key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Website Content Extractor for RAG: Markdown, HTML, Text

Turn docs sites, help centers, blogs, and websites into clean markdown, text, or HTML for RAG, internal search, and AI knowledge bases.

### What this Actor does

Most teams do not need "a crawler." They need a fast way to turn a website into usable content for:

- embeddings and chunking pipelines
- internal search and AI assistants
- help center or docs ingestion
- markdown, text, or HTML exports without manual copy-paste

This Actor turns website pages into a structured content dataset with cleaned page text, markdown, HTML, headings, and crawl metadata, plus optional clean HTML records in the key-value store.

### Quick start

1. Paste a docs site, help center, or website URL into **Website or Docs URLs**.
2. Keep `crawlMode: sitemap`, `maxPages: 3`, and `outputFormat: markdown` for the first run.
3. Click **Run**.
4. Download the dataset or use the API output directly.

### Use cases

**Docs site to RAG**  
Crawl developer docs, product docs, or API docs, then export markdown or clean HTML ready for chunking, embeddings, and retrieval.

**Help center to AI support**  
Extract support articles as clean text or markdown for internal search, support copilots, and FAQ assistants.

**Website to knowledge base**  
Capture blog posts, product pages, and guide content as structured text with titles, headings, canonical URLs, and crawl metadata.

### Output preview

Here is a simplified preview of the extracted dataset:

| URL | Title | Format | Words | Language | Depth |
| --- | --- | --- | --- | --- | --- |
| `/academy/web-scraping-for-beginners` | Web scraping for beginners | markdown | 1842 | en | 1 |
| `/academy/api-integration-guide` | API integration guide | markdown | 1267 | en | 1 |
| `/academy/rag-pipeline-basics` | RAG pipeline basics | markdown | 2135 | en | 1 |

The same record can also include:

| Extra field group | Example value |
| --- | --- |
| Content outputs | `content`, `markdown`, `text`, `html` |
| Structure signals | `title`, `description`, `headings`, `canonicalUrl` |
| Crawl metadata | `depth`, `httpStatusCode`, `language`, `wordCount`, `crawledAt` |
| Clean HTML storage | `CLEAN_HTML_INDEX` plus separate clean HTML records |
| Run diagnostics | `OUTPUT_SUMMARY`, `FAILED_PAGES`, `SKIPPED_PAGES` |

Typical fields include:

- page identity: `url`, `title`, `description`, `canonicalUrl`
- main content outputs: `content`, `markdown`, `text`, `html`, `cleanHtml`
- page structure: `headings`
- crawl metadata: `contentFormat`, `wordCount`, `language`, `depth`, `httpStatusCode`, `crawledAt`
- run-level outputs: `OUTPUT_SUMMARY`, `FAILED_PAGES`, `SKIPPED_PAGES`, `CLEAN_HTML_INDEX`
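To show how these fields might feed a chunking pipeline, here is a minimal sketch that splits a record's `markdown` into heading-delimited chunks and carries `url` and `title` along as source metadata. `chunk_markdown` and its `max_chars` parameter are hypothetical names for illustration; the Actor itself does not chunk content.

```python
import re


def chunk_markdown(record: dict, max_chars: int = 1000) -> list[dict]:
    """Split a record's markdown into heading-delimited chunks with source metadata."""
    markdown = record.get("markdown") or record.get("content") or ""
    # Split before every ATX heading line (#, ##, ...), so each heading
    # stays attached to the text that follows it.
    sections = re.split(r"(?m)^(?=#{1,6}\s)", markdown)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        # Cut long sections into fixed-size pieces so no chunk exceeds max_chars.
        for start in range(0, len(section), max_chars):
            chunks.append({
                "url": record.get("url"),
                "title": record.get("title"),
                "text": section[start:start + max_chars],
            })
    return chunks


record = {
    "url": "https://docs.apify.com/api",
    "title": "Apify Documentation",
    "markdown": "### REST API\nThe Apify API is built around HTTP REST.\n### API clients\nOfficial clients exist.",
}
for chunk in chunk_markdown(record):
    print(chunk["url"], "|", chunk["text"].splitlines()[0])
```

The fixed-size cut can split a sentence in the middle, and a `#` comment at the start of a line inside a fenced code block would be treated as a heading. A production pipeline would likely use a token-aware splitter instead.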

#### Full JSON preview

If you want to inspect a more complete example record, open the preview below.

<details>
<summary>Show full example JSON record</summary>

```json
{
  "url": "https://docs.apify.com/api",
  "title": "Apify Documentation",
  "description": "",
  "content": "### REST API\nThe Apify API is built around HTTP REST and returns JSON-encoded responses...",
  "contentFormat": "markdown",
  "markdown": "### REST API\nThe Apify API is built around HTTP REST and returns JSON-encoded responses...",
  "text": "REST API The Apify API is built around HTTP REST and returns JSON-encoded responses...",
  "html": "<h2>REST API</h2><p>The Apify API is built around HTTP REST...</p>",
  "cleanHtml": "<h2>REST API</h2><p>The Apify API is built around HTTP REST...</p>",
  "canonicalUrl": "https://docs.apify.com/api",
  "headings": [
    "REST API",
    "API clients",
    "JavaScript API client"
  ],
  "wordCount": 654,
  "language": "en",
  "depth": 1,
  "httpStatusCode": 200,
  "crawledAt": "2026-04-24T12:34:56.000Z"
}
````

</details>

### Examples

#### Option 1: Crawl directly from website pages

Best when you want to start from one section and follow links recursively.

```json
{
  "startUrls": [
    {
      "url": "https://docs.apify.com/academy"
    }
  ],
  "crawlMode": "website",
  "outputFormat": "markdown",
  "maxPages": 20,
  "maxDepth": 2,
  "sameDomainOnly": true,
  "saveCleanHtml": true
}
```

#### Option 2: Crawl from sitemap URLs

Best when the target site already has a sitemap and you want broader coverage with cleaner URL discovery.

```json
{
  "startUrls": [
    {
      "url": "https://docs.apify.com/academy"
    }
  ],
  "crawlMode": "sitemap",
  "sitemapUrls": [
    "https://docs.apify.com/sitemap.xml"
  ],
  "maxPages": 50,
  "maxDepth": 0,
  "outputFormat": "markdown",
  "sameDomainOnly": true,
  "saveCleanHtml": true
}
```

### Best practices

This Actor does more than return a list of URLs.

- You get the main content in markdown, text, and HTML.
- You get structure signals such as titles, headings, descriptions, and canonical URLs.
- You get crawl metadata such as word count, depth, language, status code, and crawl time.
- You can store clean HTML separately for downstream parsing or chunking.
- You also get run diagnostics for failed pages, skipped pages, and summary totals.

That combination makes the output useful not just for scraping, but for ingestion, QA, chunking, embeddings, search, and AI application pipelines.

### API access

Developers can run this Actor programmatically through the Apify API or the Apify Python and JavaScript clients.

- API reference: [Apify API](https://docs.apify.com/api/v2)
- Client docs: [Apify clients](https://docs.apify.com/api/client)
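For projects that call the REST API directly, the sketch below builds a `POST` request to the `run-sync-get-dataset-items` endpoint listed in the OpenAPI specification on this page, using only the Python standard library. `build_run_sync_request` is a hypothetical helper; the request is constructed but not sent, since sending it needs a valid token and network access.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
# In API paths the Actor name uses "~" in place of "/".
ACTOR_PATH = "nezha~website-content-crawler"


def build_run_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_sync_request(
    "<YOUR_API_TOKEN>",
    {"startUrls": [{"url": "https://docs.apify.com/academy"}], "maxPages": 3},
)
print(request.get_method(), request.full_url.split("?")[0])

# To actually run the Actor (requires a real token):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```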

# Actor input Schema

## `startUrls` (type: `array`):

One or more docs, help center, blog, or website URLs to crawl.

## `crawlMode` (type: `string`):

Choose whether to follow links from website pages or load URLs from sitemaps. Sitemap mode is usually the fastest and most stable choice for a first run on docs sites.

## `sitemapUrls` (type: `array`):

Optional sitemap URLs. Used in sitemap mode. If empty, the Actor tries `/sitemap.xml` at the origin of each start URL.
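The fallback described above can be pictured with a short sketch that derives one `/sitemap.xml` URL per distinct start URL origin. This illustrates the documented behavior and is not the Actor's actual code; `default_sitemap_urls` is a hypothetical name.

```python
from urllib.parse import urlsplit


def default_sitemap_urls(start_urls: list[str]) -> list[str]:
    """Derive one /sitemap.xml URL per distinct origin, preserving input order."""
    sitemaps: list[str] = []
    for url in start_urls:
        parts = urlsplit(url)
        # The origin is scheme + host (+ port); the path of the start URL is dropped.
        candidate = f"{parts.scheme}://{parts.netloc}/sitemap.xml"
        if candidate not in sitemaps:
            sitemaps.append(candidate)
    return sitemaps


print(default_sitemap_urls(["https://docs.apify.com/academy"]))
```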

## `maxPages` (type: `integer`):

Maximum number of pages to extract in one run. Keep it at 3 for a fast preview and increase it once the output looks right.

## `maxDepth` (type: `integer`):

How deep the crawler can follow links from the start URLs.

## `outputFormat` (type: `string`):

Choose what the main content field should store in the dataset.

## `sameDomainOnly` (type: `boolean`):

When enabled, only crawl links within the same origin and path scope as the start URLs.
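One plausible reading of "same origin and path scope" is sketched below: a link is in scope when it shares the start URL's scheme and host and its path sits at or under the start URL's path. This is an assumption about the rule, not the Actor's confirmed implementation, and `in_scope` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def in_scope(candidate_url: str, start_url: str) -> bool:
    """Return True if candidate_url shares the start URL's origin and path prefix."""
    start, candidate = urlsplit(start_url), urlsplit(candidate_url)
    if (start.scheme, start.netloc) != (candidate.scheme, candidate.netloc):
        return False
    prefix = start.path.rstrip("/")
    # Match the start path itself or anything below it, but not siblings
    # that merely share a string prefix (e.g. /academy-old).
    return candidate.path == prefix or candidate.path.startswith(prefix + "/")


start = "https://docs.apify.com/academy"
for link in [
    "https://docs.apify.com/academy/web-scraping-for-beginners",
    "https://docs.apify.com/api",
    "https://apify.com/academy",
]:
    print(link, in_scope(link, start))
```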

## `contentSelector` (type: `string`):

Optional CSS selector for the content root. Falls back to main/article/body.

## `removeSelectors` (type: `array`):

Optional CSS selectors to remove before extracting content.

## `includeUrlGlobs` (type: `array`):

Optional glob patterns for links you want to keep.

## `excludeUrlGlobs` (type: `array`):

Optional glob patterns for links you want to skip.
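To illustrate how include and exclude globs typically combine, here is a minimal sketch using Python's `fnmatch`. It assumes exclusions win over inclusions and that an empty include list keeps everything; the Actor's own glob syntax and precedence are not documented here and may differ. Note that `fnmatch`'s `*` also matches `/`, unlike some URL glob implementations.

```python
from fnmatch import fnmatchcase


def url_allowed(url: str, include_globs: list[str], exclude_globs: list[str]) -> bool:
    """Apply exclude globs first, then include globs (if any were given)."""
    if any(fnmatchcase(url, pattern) for pattern in exclude_globs):
        return False
    if include_globs:
        return any(fnmatchcase(url, pattern) for pattern in include_globs)
    # No include globs: every non-excluded URL is kept.
    return True


include = ["https://docs.apify.com/academy/*"]
exclude = ["*/changelog/*"]
print(url_allowed("https://docs.apify.com/academy/intro", include, exclude))
print(url_allowed("https://docs.apify.com/academy/changelog/v1", include, exclude))
```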

## `excludeFileExtensions` (type: `array`):

File extensions to skip, for example pdf, jpg, png, or zip.
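A file-extension filter of this kind can be sketched as follows. The helper compares the extension of the URL path, ignoring case, query strings, and a leading dot in the configured values; these normalization choices are assumptions, and `has_excluded_extension` is a hypothetical name.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def has_excluded_extension(url: str, excluded: list[str]) -> bool:
    """Return True if the URL path ends with one of the excluded extensions."""
    # Only the path is inspected, so "?download=1" does not hide the extension.
    suffix = PurePosixPath(urlsplit(url).path).suffix.lower().lstrip(".")
    normalized = {ext.lower().lstrip(".") for ext in excluded}
    return bool(suffix) and suffix in normalized


skip = ["pdf", "jpg", "png", "zip"]
print(has_excluded_extension("https://example.com/files/report.PDF?download=1", skip))
print(has_excluded_extension("https://example.com/docs/getting-started", skip))
```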

## `minTextLength` (type: `integer`):

Skip thin pages when extracted text is shorter than this number of characters.

## `waitForSelector` (type: `string`):

Optional CSS selector to wait for before extracting content.

## `navigationTimeoutSecs` (type: `integer`):

Timeout for page navigation and optional selector waiting.

## `saveCleanHtml` (type: `boolean`):

Store cleaned HTML separately in key-value store records and index them in `CLEAN_HTML_INDEX` for downstream chunking or parsing. Disable for the fastest preview run.

## `proxyConfiguration` (type: `object`):

Optional Apify proxy configuration.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://docs.apify.com/academy"
    }
  ],
  "crawlMode": "sitemap",
  "sitemapUrls": [
    "https://docs.apify.com/sitemap.xml"
  ],
  "maxPages": 3,
  "maxDepth": 2,
  "outputFormat": "markdown",
  "sameDomainOnly": true,
  "minTextLength": 0,
  "navigationTimeoutSecs": 25,
  "saveCleanHtml": false
}
```
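Before starting a run, it can help to check an input object against the constraints in the OpenAPI specification on this page (required `startUrls`, the `crawlMode` and `outputFormat` enums, and the integer minimums). The sketch below is a hand-written check for those constraints only; `validate_input` is a hypothetical helper, and the platform performs its own validation.

```python
# Constraints copied from the input schema in the OpenAPI specification.
ENUMS = {
    "crawlMode": {"website", "sitemap"},
    "outputFormat": {"markdown", "text", "html"},
}
MINIMUMS = {"maxPages": 1, "maxDepth": 0, "minTextLength": 0, "navigationTimeoutSecs": 15}


def validate_input(actor_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    start_urls = actor_input.get("startUrls")
    if not isinstance(start_urls, list):
        errors.append("startUrls is required and must be an array")
    else:
        for index, item in enumerate(start_urls):
            if not isinstance(item, dict) or not isinstance(item.get("url"), str):
                errors.append(f"startUrls[{index}] must be an object with a string 'url'")
    for field, allowed in ENUMS.items():
        if field in actor_input and actor_input[field] not in allowed:
            errors.append(f"{field} must be one of {sorted(allowed)}")
    for field, minimum in MINIMUMS.items():
        if field in actor_input:
            value = actor_input[field]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value < minimum:
                errors.append(f"{field} must be an integer >= {minimum}")
    return errors


print(validate_input({"startUrls": [{"url": "https://docs.apify.com/academy"}], "maxPages": 0}))
```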

# Actor output Schema

## `dataset` (type: `string`):

Extracted pages with content in markdown, text, or HTML.

## `outputSummary` (type: `string`):

Run totals, crawl settings, and extraction summary.

## `failedPages` (type: `string`):

Pages that failed during crawling.

## `skippedPages` (type: `string`):

Pages skipped because of scope or content rules.

## `cleanHtmlIndex` (type: `string`):

Index of separate clean HTML records stored in the key-value store.
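The run-level outputs can be read after a run finishes. The sketch below assumes they are stored as records named `OUTPUT_SUMMARY`, `FAILED_PAGES`, `SKIPPED_PAGES`, and `CLEAN_HTML_INDEX` in the run's default key-value store; that mapping is inferred from the README and is not confirmed. `read_run_diagnostics` is a hypothetical helper that takes an `ApifyClient` instance from the Python client.

```python
# Assumption: these names are the record keys in the run's default key-value store.
DIAGNOSTIC_KEYS = ["OUTPUT_SUMMARY", "FAILED_PAGES", "SKIPPED_PAGES", "CLEAN_HTML_INDEX"]


def read_run_diagnostics(client, store_id: str) -> dict:
    """Fetch the diagnostic records from a key-value store.

    `client` is an apify_client.ApifyClient, or any object exposing the
    same key_value_store(...).get_record(...) interface.
    """
    store = client.key_value_store(store_id)
    diagnostics = {}
    for key in DIAGNOSTIC_KEYS:
        record = store.get_record(key)  # None when the record does not exist
        diagnostics[key] = record["value"] if record else None
    return diagnostics


# Usage after a finished run (requires `pip install apify-client` and a token):
# from apify_client import ApifyClient
# client = ApifyClient("<YOUR_API_TOKEN>")
# run = client.actor("nezha/website-content-crawler").call(run_input={...})
# diagnostics = read_run_diagnostics(client, run["defaultKeyValueStoreId"])
```

Missing records come back as `None`, so a run with `saveCleanHtml` disabled would not raise here even if no `CLEAN_HTML_INDEX` record exists.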

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://docs.apify.com/academy"
        }
    ],
    "sitemapUrls": [
        "https://docs.apify.com/sitemap.xml"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("nezha/website-content-crawler").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://docs.apify.com/academy" }],
    "sitemapUrls": ["https://docs.apify.com/sitemap.xml"],
}

# Run the Actor and wait for it to finish
run = client.actor("nezha/website-content-crawler").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://docs.apify.com/academy"
    }
  ],
  "sitemapUrls": [
    "https://docs.apify.com/sitemap.xml"
  ]
}' |
apify call nezha/website-content-crawler --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=nezha/website-content-crawler",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Website Content Extractor for RAG: Markdown, HTML, Text",
        "description": "Turn docs sites, help centers, blogs, and websites into clean markdown, text, or HTML for RAG, AI knowledge bases, and internal search. Crawl from start URLs or sitemaps and keep the crawl in scope.",
        "version": "0.1",
        "x-build-id": "D3m79KEDVJDrqhjFs"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/nezha~website-content-crawler/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-nezha-website-content-crawler",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/nezha~website-content-crawler/runs": {
            "post": {
                "operationId": "runs-sync-nezha-website-content-crawler",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/nezha~website-content-crawler/run-sync": {
            "post": {
                "operationId": "run-sync-nezha-website-content-crawler",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Website or Docs URLs",
                        "type": "array",
                        "description": "One or more docs, help center, blog, or website URLs to crawl.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "crawlMode": {
                        "title": "How To Discover Pages",
                        "enum": [
                            "website",
                            "sitemap"
                        ],
                        "type": "string",
                        "description": "Choose whether to follow links from website pages or load URLs from sitemaps. Sitemap mode is the fastest and most stable first run for docs sites.",
                        "default": "sitemap"
                    },
                    "sitemapUrls": {
                        "title": "Sitemap URLs",
                        "type": "array",
                        "description": "Optional sitemap URLs. Used in sitemap mode. If empty, the actor tries /sitemap.xml from each start URL origin.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxPages": {
                        "title": "Max Pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of pages to extract in one run. Keep 3 for a fast preview and increase it after validation.",
                        "default": 3
                    },
                    "maxDepth": {
                        "title": "Max Depth",
                        "minimum": 0,
                        "type": "integer",
                        "description": "How deep the crawler can follow links from the start URLs.",
                        "default": 2
                    },
                    "outputFormat": {
                        "title": "Main Content Format",
                        "enum": [
                            "markdown",
                            "text",
                            "html"
                        ],
                        "type": "string",
                        "description": "Choose what the main content field should store in the dataset.",
                        "default": "markdown"
                    },
                    "sameDomainOnly": {
                        "title": "Stay In Target Site Scope",
                        "type": "boolean",
                        "description": "When enabled, only crawl links within the same origin and path scope as the start URLs.",
                        "default": true
                    },
                    "contentSelector": {
                        "title": "Content Selector",
                        "type": "string",
                        "description": "Optional CSS selector for the content root. Falls back to main/article/body."
                    },
                    "removeSelectors": {
                        "title": "Remove Selectors",
                        "type": "array",
                        "description": "Optional CSS selectors to remove before extracting content.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "includeUrlGlobs": {
                        "title": "Include URL Globs",
                        "type": "array",
                        "description": "Optional glob patterns for links you want to keep.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "excludeUrlGlobs": {
                        "title": "Exclude URL Globs",
                        "type": "array",
                        "description": "Optional glob patterns for links you want to skip.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "excludeFileExtensions": {
                        "title": "Exclude File Extensions",
                        "type": "array",
                        "description": "File extensions to skip, for example pdf, jpg, png, or zip.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "minTextLength": {
                        "title": "Min Text Length",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Skip thin pages when extracted text is shorter than this number of characters.",
                        "default": 0
                    },
                    "waitForSelector": {
                        "title": "Wait For Selector",
                        "type": "string",
                        "description": "Optional CSS selector to wait for before extracting content."
                    },
                    "navigationTimeoutSecs": {
                        "title": "Navigation Timeout (secs)",
                        "minimum": 15,
                        "type": "integer",
                        "description": "Timeout for page navigation and optional selector waiting.",
                        "default": 25
                    },
                    "saveCleanHtml": {
                        "title": "Store Clean HTML Separately",
                        "type": "boolean",
                        "description": "Store cleaned HTML separately in key-value store records and index them in CLEAN_HTML_INDEX for downstream chunking or parsing. Disable for the fastest preview run.",
                        "default": false
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Optional Apify proxy configuration."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
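
The `usage`, `usageUsd`, and `usageTotalUsd` fields that close the run schema can be read together to see what a run consumed and what each category cost. The following is a minimal sketch, assuming the run object has already been fetched and parsed into a Python dictionary. `summarize_usage` and `SAMPLE_RUN` are hypothetical names introduced for illustration, and the sample values only mirror the schema's `example` fields rather than a real run.

```python
from typing import Any

# Hypothetical sample shaped like the run object in the schema above.
# Values mirror the schema's `example` fields; they are not real run output.
SAMPLE_RUN: dict[str, Any] = {
    "usageTotalUsd": 0.00005,
    "usage": {"KEY_VALUE_STORE_WRITES": 1, "DATASET_WRITES": 0},
    "usageUsd": {"KEY_VALUE_STORE_WRITES": 0.00005, "DATASET_WRITES": 0},
}


def summarize_usage(run: dict[str, Any]) -> dict[str, Any]:
    """Return the total cost plus only the usage categories with a non-zero cost."""
    usage = run.get("usage") or {}
    usage_usd = run.get("usageUsd") or {}
    billed = {
        name: {"units": usage.get(name, 0), "usd": cost}
        for name, cost in usage_usd.items()
        if cost  # skip categories whose cost is zero or missing
    }
    return {"total_usd": run.get("usageTotalUsd", 0.0), "billed": billed}


summary = summarize_usage(SAMPLE_RUN)
print(summary)
```

The helper uses `.get()` with defaults throughout, so a run object that lacks one of the three fields still yields a summary instead of raising `KeyError`.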
