# Docs-to-RAG AI Crawler (`charitable_jeopardy/webscraperap`) Actor

Stop wasting space on website headers, footers, cookie banners, and navigation menus.

Extract clean body text, chunk it for RAG, and detect page changes across runs when crawling public docs, blogs, and knowledge bases.

- **URL**: https://apify.com/charitable\_jeopardy/webscraperap.md
- **Developed by:** [charitable\_jeopardy](https://apify.com/charitable_jeopardy) (community)
- **Categories:** AI, Developer tools, Automation
- **Stats:** 1 total users, 0 monthly users, 0.0% runs succeeded, NaN bookmarks
- **User rating**: No ratings yet

## Pricing

From $0.20 per 1,000 pages scraped

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases as your subscription plan tier increases.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## AI & RAG Documentation Ingester (Pre-Chunked Web Crawler)

Stop wasting LLM tokens and vector DB space on website headers, footers, cookie banners, and navigation menus. 

This Actor crawls public documentation sites, blogs, and knowledge bases, extracts **only the core body content**, and outputs clean, pre-chunked text records mapped to their nearest headings. Incremental change detection flags what changed between runs, so you can keep your vector database in sync efficiently.

---

### 🎯 Best For

*   **RAG & LLM Developers** looking to ingest clean documentation, guides, or manuals into vector databases (Pinecone, Qdrant, PGVector, etc.).
*   **AI Product Teams** building custom customer support agents or search engines over vertical/niche websites.
*   **Knowledge Engineers** who need to monitor specific websites and ingest only new or updated pages.

---

### Why this is better than a generic crawler

1.  **Zero Noise**: Automatically strips out navigation links, scripts, CSS, sidebars, newsletter boxes, and cookie overlays before parsing.
2.  **Context-Aware Chunking**: Instead of naive character splitting, it generates overlapping text blocks and attaches the relevant heading hierarchy (`h1`–`h6`) to every single chunk.
3.  **Stateful Incremental Ingestion**: Uses a persistent Key-Value Store across runs to compare page content hashes. It flags pages as `new`, `changed`, or `unchanged` so you only update changed chunks in your database.
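
To make the second point concrete, here is a minimal sketch of overlapping, heading-aware chunking. It is not the Actor's actual implementation: the function name `chunk_with_headings` and the `(level, text, offset)` heading tuples are assumptions chosen for illustration, and only the output field names are borrowed from the chunk record this README documents.

```python
from typing import Dict, List, Tuple


def chunk_with_headings(
    text: str,
    headings: List[Tuple[int, str, int]],
    chunk_size: int = 1000,
    chunk_overlap: int = 150,
) -> List[Dict]:
    """Split text into overlapping chunks and attach the heading trail.

    `headings` is a hypothetical list of (level, heading_text, char_offset)
    tuples sorted by offset; the real Actor may track headings differently.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks: List[Dict] = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        # Build the heading trail in effect at the start of this chunk.
        trail: List[Tuple[int, str]] = []
        for level, heading_text, offset in headings:
            if offset > start:
                break
            # A new heading replaces any heading of the same or deeper level.
            trail = [h for h in trail if h[0] < level]
            trail.append((level, heading_text))
        chunks.append({
            "chunkIndex": len(chunks),
            "chunkText": text[start:end],
            "chunkCharStart": start,
            "chunkCharEnd": end,
            "headingsContext": [{"level": lvl, "text": txt} for lvl, txt in trail],
        })
        if end == len(text):
            break
        start += step
    return chunks
```

With the default values, consecutive chunks start 850 characters apart, so the last 150 characters of one chunk reappear at the start of the next.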

---

### 💡 Example Workflow: Ingesting a Blog to Pinecone

1.  **Configure Target**: Input the seed URL or sitemap (e.g., `https://example.com/sitemap.xml`).
2.  **Filter Blog Posts**: Add `https://example.com/blog/**` to **Include patterns** and exclude tags/authors.
3.  **Enable Chunking & Change Detection**: Set `chunkText: true` and `detectChanges: true`.
4.  **Configure Output**: Set format to `chunks` or `pagesAndChunks`.
5.  **Sync**: Run the Actor, retrieve only the `new` or `changed` chunks from the dataset, and upsert them to your vector database.
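
Step 5 can be sketched as a small filter over the dataset items. The README does not name the field that carries the `new` / `changed` / `unchanged` flag, so `changeStatus` below is a hypothetical field name; inspect a real dataset item and adjust it before relying on this.

```python
from typing import Dict, Iterable, List


def select_chunks_to_upsert(
    items: Iterable[Dict],
    status_field: str = "changeStatus",  # hypothetical field name
) -> List[Dict]:
    """Keep only chunk records flagged as new or changed."""
    return [
        item
        for item in items
        if item.get("recordType") == "chunk"
        and item.get(status_field) in {"new", "changed"}
    ]
```

The `items` argument could be fed from `client.dataset(run["defaultDatasetId"]).iterate_items()`, as in the Python example in the API section.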

---

### 📄 Example Output: Chunk Record

Each chunk is a self-contained record ready for embedding generation:

```json
{
  "recordType": "chunk",
  "chunkId": "a8f9c118bc28a192c73d9059f0f9bde0",
  "pageUrl": "https://example.com/docs/getting-started",
  "canonicalUrl": "https://example.com/docs/getting-started",
  "site": "example.com",
  "title": "Getting Started Guide | Documentation",
  "chunkIndex": 0,
  "chunkText": "To install the library, run 'npm install @sdk/core'. Make sure you have Node.js version 20 or higher installed in your environment before initiating setup...",
  "chunkCharStart": 0,
  "chunkCharEnd": 150,
  "chunkSize": 1000,
  "chunkOverlap": 150,
  "headingsContext": [
    { "level": 1, "text": "Getting Started" },
    { "level": 2, "text": "Installation" }
  ],
  "language": "en",
  "contentHash": "8f3c9e...",
  "timestamp": "2026-06-06T12:00:00.000Z"
}
````
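
As one way to consume such a record, the sketch below turns it into an ID, an embedding input, and metadata. Prefixing the text with a heading breadcrumb is a design choice assumed here, not something the Actor does for you, and `to_embedding_payload` is a hypothetical helper.

```python
from typing import Dict


def to_embedding_payload(chunk: Dict) -> Dict:
    """Build a vector-store-ready payload from one chunk record."""
    breadcrumb = " > ".join(h["text"] for h in chunk.get("headingsContext", []))
    text = chunk["chunkText"]
    return {
        "id": chunk["chunkId"],
        # The heading breadcrumb gives the embedding model section context.
        "text": f"{breadcrumb}\n\n{text}" if breadcrumb else text,
        "metadata": {
            "url": chunk.get("canonicalUrl") or chunk["pageUrl"],
            "title": chunk.get("title"),
            "language": chunk.get("language"),
            "contentHash": chunk.get("contentHash"),
        },
    }
```

Using `chunkId` as the vector ID means a re-run that regenerates the same chunk overwrites the old vector instead of duplicating it, provided the IDs are stable across runs (worth verifying on your own data).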

---

### ⚙️ Quick Start

1. **Start URLs / Sitemap URLs**: Provide at least one URL. The default input uses `https://example.com/` so the Actor produces a small dataset item without setup.
2. **Use Browser Rendering**: Toggle on if the page relies heavily on client-side JavaScript (React, Vue, etc.) to render body text.
3. **Max Pages Per Site**: Caps the number of pages crawled per site (default `1`), which keeps the prefilled run fast and prevents uncontrolled resource use.
4. **Chunk Size & Overlap**: Match this to your LLM's context window guidelines (e.g., size `1000` chars, overlap `150` chars).

### Example Input

```json
{
  "startUrls": [{ "url": "https://example.com/" }],
  "sitemapUrls": [],
  "maxPagesPerSite": 1,
  "includePatterns": [],
  "excludePatterns": [],
  "crawlDepth": 0,
  "maxCrawlRetries": 1,
  "useBrowserRendering": false,
  "languageDetection": true,
  "chunkText": false,
  "chunkSize": 1000,
  "chunkOverlap": 150,
  "outputFormat": "pages",
  "detectChanges": false,
  "storeRawHtml": false,
  "storeCleanText": true
}
```

# Actor input Schema

## `startUrls` (type: `array`):

Public seed URLs to fetch and crawl from.

## `sitemapUrls` (type: `array`):

Sitemap XML, sitemap index, or plain text URL list sources.
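
For illustration, the three accepted source types could be told apart roughly as follows. This is a sketch under the assumption that XML sources follow the sitemaps.org format; `extract_urls` is a hypothetical helper, not the Actor's parser.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def extract_urls(body: str) -> Dict[str, List[str]]:
    """Classify a sitemap source and return the URLs it lists.

    Returns {"pages": [...], "sitemaps": [...]}; nested sitemaps from a
    sitemap index would still need to be fetched and parsed in turn.
    """
    result: Dict[str, List[str]] = {"pages": [], "sitemaps": []}
    try:
        root = ET.fromstring(body.strip())
    except ET.ParseError:
        # Not XML: treat it as a plain text list, one URL per line.
        result["pages"] = [
            line.strip()
            for line in body.splitlines()
            if line.strip().startswith(("http://", "https://"))
        ]
        return result

    # Strip the XML namespace, e.g. "{http://...}urlset" -> "urlset".
    root_tag = root.tag.rsplit("}", 1)[-1]
    locs = [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    ]
    key = "sitemaps" if root_tag == "sitemapindex" else "pages"
    result[key] = locs
    return result
```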

## `maxPagesPerSite` (type: `integer`):

The maximum number of successfully crawled pages per hostname/site.

## `includePatterns` (type: `array`):

Glob patterns for URLs to include. An empty list allows all in-scope URLs unless they are excluded.

## `excludePatterns` (type: `array`):

Glob patterns for URLs to exclude. Exclusions take precedence over include patterns.
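
The interaction between the include and exclude lists can be sketched with Python's `fnmatch` module. The Actor's exact glob dialect is not documented here, so treat this as an approximation: in `fnmatch`, `*` also matches `/`, which makes `**` behave the same as `*`. The helper name `url_allowed` is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Sequence


def url_allowed(
    url: str,
    include_patterns: Sequence[str] = (),
    exclude_patterns: Sequence[str] = (),
) -> bool:
    """Apply include/exclude globs; exclusions take precedence."""
    if any(fnmatchcase(url, pattern) for pattern in exclude_patterns):
        return False
    # An empty include list allows every URL that was not excluded.
    if not include_patterns:
        return True
    return any(fnmatchcase(url, pattern) for pattern in include_patterns)
```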

## `crawlDepth` (type: `integer`):

The maximum depth of links to traverse from seed URLs (sitemaps are depth 0).

## `maxCrawlRetries` (type: `integer`):

The maximum number of retry attempts for failed requests.

## `useBrowserRendering` (type: `boolean`):

Enable to render pages using a headless browser (Playwright/Chrome) for JS-heavy sites.

## `languageDetection` (type: `boolean`):

Enable to detect the primary language of the clean text.

## `chunkText` (type: `boolean`):

Enable to split extracted text into smaller chunk records for RAG.

## `chunkSize` (type: `integer`):

Target character length of each text chunk.

## `chunkOverlap` (type: `integer`):

The number of overlapping characters between consecutive chunks.

## `outputFormat` (type: `string`):

Determines which records are written to the default Dataset: `pages`, `chunks`, or `pagesAndChunks`.

## `detectChanges` (type: `boolean`):

Enable to compare content hashes against prior runs using a persistent store.
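
Hash-based change detection of this kind can be sketched in a few lines. The dictionary below stands in for the persistent store, and SHA-256 is an assumption; the Actor's actual hash function and storage layout are not documented here.

```python
import hashlib
from typing import Dict


def classify_page(url: str, clean_text: str, previous_hashes: Dict[str, str]) -> str:
    """Return 'new', 'changed', or 'unchanged' and record the latest hash.

    `previous_hashes` is a stand-in for the persistent store (URL -> hash).
    """
    digest = hashlib.sha256(clean_text.encode("utf-8")).hexdigest()
    old = previous_hashes.get(url)
    previous_hashes[url] = digest
    if old is None:
        return "new"
    return "unchanged" if old == digest else "changed"
```

Hashing the cleaned text rather than the raw HTML means cosmetic markup changes outside the body content do not trigger a re-embed.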

## `storeRawHtml` (type: `boolean`):

Enable to store raw fetched/rendered HTML in the default Key-Value Store.

## `storeCleanText` (type: `boolean`):

Include `cleanText` in page records. The clean text is always used internally for chunking and hashing, regardless of this setting.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://example.com/"
    }
  ],
  "sitemapUrls": [],
  "maxPagesPerSite": 1,
  "includePatterns": [],
  "excludePatterns": [],
  "crawlDepth": 0,
  "maxCrawlRetries": 1,
  "useBrowserRendering": false,
  "languageDetection": true,
  "chunkText": false,
  "chunkSize": 1000,
  "chunkOverlap": 150,
  "outputFormat": "pages",
  "detectChanges": false,
  "storeRawHtml": false,
  "storeCleanText": true
}
```
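
Before starting a run, it can help to check an input object against the limits published in the OpenAPI specification at the end of this document. The sketch below hard-codes those ranges in a hypothetical `validate_input` helper. The rule that `chunkOverlap` must be smaller than `chunkSize` is an assumption added here, not something the schema states.

```python
from typing import Dict, List

# (minimum, maximum) pairs copied from the Actor's OpenAPI input schema.
INTEGER_RANGES = {
    "maxPagesPerSite": (1, 10000),
    "crawlDepth": (0, 10),
    "maxCrawlRetries": (0, 5),
    "chunkSize": (200, 5000),
    "chunkOverlap": (0, 1000),
}
OUTPUT_FORMATS = {"pages", "chunks", "pagesAndChunks"}


def validate_input(run_input: Dict) -> List[str]:
    """Return a list of problems; an empty list means the input looks fine."""
    problems: List[str] = []
    for field, (low, high) in INTEGER_RANGES.items():
        value = run_input.get(field)
        if value is None:
            continue  # omitted fields fall back to the schema default
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            problems.append(f"{field} must be an integer in [{low}, {high}]")
    if run_input.get("outputFormat", "pages") not in OUTPUT_FORMATS:
        problems.append("outputFormat must be one of " + ", ".join(sorted(OUTPUT_FORMATS)))
    if not run_input.get("startUrls") and not run_input.get("sitemapUrls"):
        problems.append("provide at least one start URL or sitemap URL")
    # Assumption: an overlap at or above the chunk size would never advance.
    size = run_input.get("chunkSize", 1000)
    overlap = run_input.get("chunkOverlap", 150)
    if isinstance(size, int) and isinstance(overlap, int) and overlap >= size:
        problems.append("chunkOverlap should be smaller than chunkSize")
    return problems
```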

# Actor output Schema

## `CRAWL_SUMMARY` (type: `string`):

Detailed execution metrics (pages fetched, crawled, duplicates, changes, timing) stored as a JSON object in the default Key-Value Store.
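
Reading that record after a run might look like the sketch below. It assumes the Python `apify-client` package, where `get_record` returns a dictionary with a `value` entry (or `None` if the key is missing) in the releases this was written against. The helper name `read_crawl_summary` is hypothetical, and no particular keys inside the summary are assumed.

```python
from typing import Any, Dict, Optional


def read_crawl_summary(client: Any, run: Dict) -> Optional[Dict]:
    """Fetch the CRAWL_SUMMARY record from the run's default key-value store.

    `client` is expected to be an `apify_client.ApifyClient`; `run` is the
    dictionary returned by `client.actor(...).call(...)`.
    """
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("CRAWL_SUMMARY")
    # get_record returns None when the key does not exist.
    return record["value"] if record else None
```

Typical usage would be `summary = read_crawl_summary(client, run)` right after the `call(...)` shown in the Python example in the API section.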

## `scrapedResults` (type: `string`):

Cleaned page content and text chunk records pushed to the default Dataset.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://example.com/"
        }
    ],
    "sitemapUrls": [],
    "maxPagesPerSite": 1,
    "includePatterns": [],
    "excludePatterns": [],
    "crawlDepth": 0,
    "maxCrawlRetries": 1,
    "useBrowserRendering": false,
    "languageDetection": true,
    "chunkText": false,
    "chunkSize": 1000,
    "chunkOverlap": 150,
    "outputFormat": "pages",
    "detectChanges": false,
    "storeRawHtml": false,
    "storeCleanText": true
};

// Run the Actor and wait for it to finish
const run = await client.actor("charitable_jeopardy/webscraperap").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://example.com/" }],
    "sitemapUrls": [],
    "maxPagesPerSite": 1,
    "includePatterns": [],
    "excludePatterns": [],
    "crawlDepth": 0,
    "maxCrawlRetries": 1,
    "useBrowserRendering": False,
    "languageDetection": True,
    "chunkText": False,
    "chunkSize": 1000,
    "chunkOverlap": 150,
    "outputFormat": "pages",
    "detectChanges": False,
    "storeRawHtml": False,
    "storeCleanText": True,
}

# Run the Actor and wait for it to finish
run = client.actor("charitable_jeopardy/webscraperap").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://example.com/"
    }
  ],
  "sitemapUrls": [],
  "maxPagesPerSite": 1,
  "includePatterns": [],
  "excludePatterns": [],
  "crawlDepth": 0,
  "maxCrawlRetries": 1,
  "useBrowserRendering": false,
  "languageDetection": true,
  "chunkText": false,
  "chunkSize": 1000,
  "chunkOverlap": 150,
  "outputFormat": "pages",
  "detectChanges": false,
  "storeRawHtml": false,
  "storeCleanText": true
}' |
apify call charitable_jeopardy/webscraperap --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=charitable_jeopardy/webscraperap",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Docs-to-RAG AI Crawler",
        "description": "Stop wasting space on website headers, footers, cookie banners, and navigation menus.\n\nExtract clean body text, chunk it for RAG, and detect page changes across runs crawling public docs, blogs, and knowledge bases,",
        "version": "0.0",
        "x-build-id": "z0fvOAl49PZ0wGc17"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/charitable_jeopardy~webscraperap/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-charitable_jeopardy-webscraperap",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/charitable_jeopardy~webscraperap/runs": {
            "post": {
                "operationId": "runs-sync-charitable_jeopardy-webscraperap",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/charitable_jeopardy~webscraperap/run-sync": {
            "post": {
                "operationId": "run-sync-charitable_jeopardy-webscraperap",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Public seed URLs to fetch and crawl from.",
                        "default": [
                            {
                                "url": "https://example.com/"
                            }
                        ],
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "sitemapUrls": {
                        "title": "Sitemap URLs",
                        "type": "array",
                        "description": "Sitemap XML, sitemap index, or plain text URL list sources.",
                        "default": [],
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "maxPagesPerSite": {
                        "title": "Max pages per site",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "The maximum number of successfully crawled pages per hostname/site.",
                        "default": 1
                    },
                    "includePatterns": {
                        "title": "Include URL patterns",
                        "type": "array",
                        "description": "Glob patterns. Empty allows all in-scope URLs unless excluded.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "excludePatterns": {
                        "title": "Exclude URL patterns",
                        "type": "array",
                        "description": "Glob patterns. Exclusions take precedence.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "crawlDepth": {
                        "title": "Crawl depth",
                        "minimum": 0,
                        "maximum": 10,
                        "type": "integer",
                        "description": "The maximum depth of links to traverse from seed URLs (sitemaps are depth 0).",
                        "default": 0
                    },
                    "maxCrawlRetries": {
                        "title": "Max crawl retries",
                        "minimum": 0,
                        "maximum": 5,
                        "type": "integer",
                        "description": "The maximum number of retry attempts for failed requests.",
                        "default": 1
                    },
                    "useBrowserRendering": {
                        "title": "Use browser rendering",
                        "type": "boolean",
                        "description": "Enable to render pages using a headless browser (Playwright/Chrome) for JS-heavy sites.",
                        "default": false
                    },
                    "languageDetection": {
                        "title": "Language detection",
                        "type": "boolean",
                        "description": "Enable to detect primary language of clean text.",
                        "default": true
                    },
                    "chunkText": {
                        "title": "Chunk text",
                        "type": "boolean",
                        "description": "Enable to split extracted text into smaller chunk records for RAG.",
                        "default": false
                    },
                    "chunkSize": {
                        "title": "Chunk size",
                        "minimum": 200,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Target character length of each text chunk.",
                        "default": 1000
                    },
                    "chunkOverlap": {
                        "title": "Chunk overlap",
                        "minimum": 0,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "The number of overlapping characters between consecutive chunks.",
                        "default": 150
                    },
                    "outputFormat": {
                        "title": "Output format",
                        "enum": [
                            "pages",
                            "chunks",
                            "pagesAndChunks"
                        ],
                        "type": "string",
                        "description": "Determines records written to the default Dataset.",
                        "default": "pages"
                    },
                    "detectChanges": {
                        "title": "Detect changes",
                        "type": "boolean",
                        "description": "Enable to compare content hashes against prior runs using a persistent store.",
                        "default": false
                    },
                    "storeRawHtml": {
                        "title": "Store raw HTML",
                        "type": "boolean",
                        "description": "Enable to store raw fetched/rendered HTML in the default Key-Value Store.",
                        "default": false
                    },
                    "storeCleanText": {
                        "title": "Store clean text",
                        "type": "boolean",
                        "description": "Include cleanText in page records (always used internally for chunking/hashing).",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
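
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be exercised with the Python standard library alone. This is a sketch, not an official client: the helper names and the 300-second timeout are assumptions, and the response is assumed to be a JSON array of dataset items (the endpoint's default output format).

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/charitable_jeopardy~webscraperap/run-sync-get-dataset-items"


def build_run_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request described by the spec."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch_items(token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the dataset items from the JSON response."""
    request = build_run_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because the synchronous endpoint holds the connection open until the run finishes, longer crawls are better served by the `/runs` endpoint, which returns run information first and lets you fetch the dataset afterwards.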