# Modern Web Crawler — Adaptive + Stealth + Analytics (`brilliant_gum/phantom-reborn-crawler`) Actor

Modern replacement for the Legacy PhantomJS Crawler. Auto HTTP/Browser detection, basic anti-bot stealth, built-in analytics, data quality scoring, captcha solver integration. Modern Chrome + Cheerio engine — no PhantomJS, no abandoned tech. Proxies included.

- **URL**: https://apify.com/brilliant\_gum/phantom-reborn-crawler.md
- **Developed by:** [Yuliia Kulakova](https://apify.com/brilliant_gum) (community)
- **Categories:** Developer tools, SEO tools, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $10.00 per 1,000 pages scraped

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Modern Web Crawler — Adaptive + Stealth + Analytics

> Auto HTTP/Browser detection. Anti-bot stealth. Captcha solver. Proxies included. Production-grade replacement for the Legacy PhantomJS Crawler — no PhantomJS, no abandoned browser engine.

![Modern Web Crawler — Adaptive · Stealth · Analytics](https://i.imgur.com/mFQBVQs.png)

---

### 🔥 Why migrate from PhantomJS Crawler?

The [Legacy PhantomJS Crawler](https://apify.com/apify/legacy-phantomjs-crawler) ships with a browser engine **abandoned in 2018**. It only speaks ES5.1, can't render modern HTML5/CSS3, is trivially fingerprinted by every anti-bot system, and receives no security updates.

This Actor is what PhantomJS Crawler users have been asking for: a modern crawler with a familiar input shape (`startUrls`, `pageFunction`, `globs`/`excludeGlobs`), backed by real Chrome and Cheerio under the hood.

---

### ✨ What you get

#### 🎯 Adaptive crawler mode

Pick **HTTP**, **Browser**, or let the crawler decide:

- **HTTP (Cheerio)** — fast and cheap, ~10–20× lighter than browser mode. Great for static pages, news sites, blogs, REST APIs.
- **Browser (Playwright + Chromium)** — full JavaScript rendering. Required for SPAs, React/Vue/Next.js, infinite scroll, sites that fetch content via XHR.
- **Adaptive (default)** — fetches every page with HTTP first. If the page is detected as JavaScript-rendered (empty content, SPA shell, inline `var data=[...]` patterns, fetch/XHR-driven rendering), the crawler remembers the domain and switches it to Browser mode for the rest of the crawl. Best of both worlds — no wasted browser launches on static content.
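
The adaptive behaviour described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the Actor's actual detection code: the marker patterns, the 200-character threshold, and the names `looks_js_rendered` and `AdaptiveRouter` are all hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical signs of a client-side-rendered page; the Actor's real rules are not documented here.
SPA_MARKERS = (
    re.compile(r'<div[^>]+id=["\'](?:root|app|__next)["\'][^>]*>\s*</div>', re.I),
    re.compile(r"enable javascript", re.I),
)


def looks_js_rendered(html: str, min_text_chars: int = 200) -> bool:
    """Guess whether a page needs a browser: SPA shell markers or almost no visible text."""
    if any(marker.search(html) for marker in SPA_MARKERS):
        return True
    without_scripts = re.sub(r"<(script|style)\b.*?</\1>", " ", html, flags=re.I | re.S)
    visible_text = re.sub(r"<[^>]+>", " ", without_scripts)
    return len(" ".join(visible_text.split())) < min_text_chars


class AdaptiveRouter:
    """Remembers which domains needed a browser so later pages skip the HTTP attempt."""

    def __init__(self) -> None:
        self.browser_domains: set[str] = set()

    def mode_for(self, url: str) -> str:
        """Return the mode to try first for this URL."""
        return "browser" if urlparse(url).netloc in self.browser_domains else "http"

    def record_http_result(self, url: str, html: str) -> str:
        """Inspect an HTTP response and return the mode this page should be processed in."""
        if looks_js_rendered(html):
            self.browser_domains.add(urlparse(url).netloc)
            return "browser"
        return "http"
```

The point of the per-domain memory is that one detection pays for the rest of the crawl: once a domain is flagged, `mode_for` sends every later URL on it straight to the browser.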

#### 🥷 Anti-bot stealth (for sites with basic protection)

Built-in patches that defeat common detection patterns:

- `navigator.webdriver` flag masking
- Realistic plugin / language / permission fingerprints
- Rotating, browser-realistic User-Agent + full `Sec-Fetch-*` header set
- Session pool with cookie persistence across requests
- Per-request random delays with human-like jitter

**What this defeats:** sites with basic to moderate detection — most corporate sites, news portals, documentation pages, marketplaces without dedicated bot defence.

**What this does *not* defeat:** premium anti-bot services like **Cloudflare Pro Challenge**, **PerimeterX / HUMAN**, **DataDome**, **Akamai Bot Manager**, **Imperva Incapsula**. For those sites, configure the **captcha solver** below or use commercial unblocking services.

#### 🧠 Captcha solver integration

Plug in your **2Captcha / Anti-Captcha / CapMonster** API key in `captchaSolverApiKey` and the crawler will automatically detect and solve:

- Cloudflare Turnstile
- reCAPTCHA v2 / v3
- hCaptcha
- Image / text challenges

Captcha-protected pages get a second pass after the solver returns the token.

#### 📊 Built-in analytics

Every crawl produces a comprehensive report saved to the run's Key-Value store as `ANALYTICS_REPORT`:

- Success rate, error breakdown by category
- Response time percentiles (avg, median, p95, p99)
- HTTP vs Browser distribution + adaptive switch count
- Per-domain crawl speed
- Total bandwidth consumed
- Top 10 errors with URLs (for debugging)
- Depth distribution histogram

No extra setup — just one toggle (`enableAnalytics`, default on).
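
To make the response-time figures concrete, here is a minimal sketch of how avg, median, p95, and p99 could be derived from per-page `loadTimeMs` values. The nearest-rank percentile method and the function names are assumptions; the report may use a different percentile definition.

```python
import statistics


def nearest_rank(sorted_values: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    rank = (pct * len(sorted_values) + 99) // 100  # integer ceiling of pct * n / 100
    return sorted_values[max(rank, 1) - 1]


def summarize_load_times(load_times_ms: list[float]) -> dict[str, float]:
    """Summarize page load times the way the report's percentile line lists them."""
    ordered = sorted(load_times_ms)
    return {
        "avg": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p95": nearest_rank(ordered, 95),
        "p99": nearest_rank(ordered, 99),
    }
```

An empty input raises `statistics.StatisticsError`, so a caller would need to guard against crawls with no successful pages.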

#### ✅ Data quality score

Every item gets a `dataQuality` block — a 0–100 score plus a breakdown of what's missing:

```json
{
  "isValid": true,
  "score": 100,
  "missingFields": [],
  "emptyFields": [],
  "warnings": []
}
````

- `score` — 0–100 completeness rating. With `requiredFields` set, it's the percentage of those fields that are present and non-empty. Without `requiredFields`, the score reflects general data issues (encoding, whitespace-only values, etc.).
- `isValid` — `true` when all required fields are present and non-empty.
- `missingFields` / `emptyFields` — which required fields didn't make it.
- `warnings` — encoding issues, replacement characters, whitespace-only values.

Validate against any field from the merged item (top-level **and** your page function's return value). Filter or sort your dataset by `dataQuality.score` to surface the best results first.
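
The `requiredFields` scoring rule above can be sketched as follows. It is a simplified illustration, not the Actor's implementation: the helper names are hypothetical, `warnings` is left empty, and the no-`requiredFields` case simply returns 100 instead of checking for encoding or whitespace issues.

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    """Treat None, whitespace-only strings, and empty containers as empty."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, dict, tuple, set)):
        return len(value) == 0
    return False


def score_item(item: dict[str, Any], required_fields: list[str]) -> dict[str, Any]:
    """Build a dataQuality-style block: score = % of required fields present and non-empty."""
    missing = [f for f in required_fields if f not in item]
    empty = [f for f in required_fields if f in item and _is_empty(item[f])]
    ok_count = len(required_fields) - len(missing) - len(empty)
    score = round(100 * ok_count / len(required_fields)) if required_fields else 100
    return {
        "isValid": not missing and not empty,
        "score": score,
        "missingFields": missing,
        "emptyFields": empty,
        "warnings": [],  # encoding and whitespace warnings are out of scope for this sketch
    }
```

For example, an item with `price` and `title` filled, `sku` set to whitespace, and no `availability` key scores 50 against those four required fields.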

#### 🪲 Error debugging

When a page fails:

- Categorized error type (Timeout, Blocked, RateLimited, Navigation, JsError, …)
- Optional HTML snapshot saved to Key-Value store (`saveSnapshots: true`, default on)
- Optional screenshot in browser mode (`saveScreenshotPerPage: true`)

Makes "why did this page fail?" answerable in seconds.

***

### 💡 Use cases

| Audience | Example use |
|---|---|
| Migration from PhantomJS Crawler | Drop-in replacement — same `pageFunction` shape, same `startUrls`/`globs` filters |
| SEO research | Crawl a domain, extract Open Graph + Twitter + JSON-LD metadata at scale |
| Lead generation | Extract contacts (emails, phones, social profiles) from company sites |
| Content monitoring | Watch a list of URLs for changes — adaptive rendering catches client-side updates |
| AI training data | Crawl reference sites, output clean text per page for embeddings |
| QA / compliance | Validate every page in a site against a schema (e.g. "every product page must have `price`, `availability`, `sku`") |

***

### 🚀 Quick start

Minimum input — start from a URL, accept all defaults:

```json
{
  "startUrls": [{"url": "https://example.com/"}]
}
```

This crawls `example.com` in adaptive mode with the defaults: up to 100 pages, maximum depth 10, metadata extraction on, and data quality checks on. The default page function returns `{ url, title, description, h1, canonicalUrl }`.

***

### 📥 Common inputs

#### Limit the crawl

```json
{
  "startUrls": [{"url": "https://example.com/"}],
  "maxCrawlPages": 50,
  "maxCrawlDepth": 3,
  "maxConcurrency": 10
}
```

#### Filter URLs with globs

```json
{
  "startUrls": [{"url": "https://example.com/"}],
  "globs": [{"glob": "https://example.com/products/**"}],
  "excludeGlobs": [
    {"glob": "**/login**"},
    {"glob": "**/account/**"}
  ]
}
```
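
To check a pattern set before spending a run on it, the filter above can be approximated locally. This sketch uses Python's `fnmatch`, where `*` also matches `/`, so it only approximates the crawler's real glob engine; it also assumes that `excludeGlobs` take precedence over `globs`. `should_enqueue` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase


def should_enqueue(url: str, globs: list[str], exclude_globs: list[str]) -> bool:
    """Return True if the URL passes the include patterns and hits no exclude pattern."""
    if any(fnmatchcase(url, pattern) for pattern in exclude_globs):
        return False
    # An empty include list is treated as "allow everything" in this sketch.
    return not globs or any(fnmatchcase(url, pattern) for pattern in globs)


globs = ["https://example.com/products/**"]
exclude_globs = ["**/login**", "**/account/**"]

for candidate in (
    "https://example.com/products/shoes/42",
    "https://example.com/blog/post",
    "https://example.com/products/login?next=1",
):
    print(candidate, should_enqueue(candidate, globs, exclude_globs))
```

Only the first URL passes: the second misses the include pattern, and the third is caught by `**/login**`.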

#### Custom page function (extract structured data)

```json
{
  "startUrls": [{"url": "https://quotes.toscrape.com/"}],
  "maxCrawlPages": 10,
  "pageFunction": "async function pageFunction(context) {\n  const { $, request } = context;\n  const quotes = [];\n  $('div.quote').each((i, el) => {\n    quotes.push({\n      text: $(el).find('span.text').text(),\n      author: $(el).find('small.author').text()\n    });\n  });\n  return { url: request.loadedUrl, quotes };\n}"
}
```

#### Force browser mode (for JS-heavy sites)

```json
{
  "startUrls": [{"url": "https://app.example.com/"}],
  "crawlerMode": "browser",
  "waitUntil": "networkidle"
}
```

#### Premium anti-bot site (captcha solver)

```json
{
  "startUrls": [{"url": "https://protected-site.com/"}],
  "crawlerMode": "browser",
  "stealthMode": true,
  "captchaSolverApiKey": "YOUR_2CAPTCHA_KEY"
}
```

***

### 📤 Output

One item per crawled page. Top-level fields (those marked with an `if` condition appear only when the corresponding option is enabled):

| Field | Type | Description |
|---|---|---|
| `url` | string | Original request URL |
| `loadedUrl` | string | Final URL after redirects |
| `httpStatus` | number | Final HTTP status code |
| `title` | string | `<title>` of the page |
| `type` | string | `"StartUrl"` or `"FoundLink"` |
| `referrerUrl` | string or null | Page that linked to this one (`null` for start URLs) |
| `crawlDepth` | number | Link distance from a start URL |
| `crawlerMode` | string | `"http"` or `"browser"` for this specific page |
| `loadTimeMs` | number | Page load time |
| `downloadedBytes` | number | Response body size |
| `requestedAt` / `loadingFinishedAt` / `timestamp` | ISO date | Per-request timing |
| `requestId` | string | Unique ID for cross-reference |
| `retryCount` | number | Retries before success |
| `method` | string | HTTP method used |
| `responseHeaders` | object | All response headers |
| `pageFunctionResult` | object | Whatever your `pageFunction` returned |
| `metadata` | object | `og`, `twitter`, `jsonLd`, `meta` (if `extractMetadata: true`) |
| `contacts` | object | `emails`, `phones`, `socialLinks` (if `extractContacts: true`) |
| `cleanText` | string | Boilerplate-free body text (if `extractCleanText: true`) |
| `dataQuality` | object | `{isValid, score, missingFields, emptyFields, warnings}` (if `dataQualityChecks: true`) |
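
Once items are downloaded, the `dataQuality` block makes it easy to keep only the most complete records. A minimal sketch, assuming the items are plain dicts shaped like the table above; `best_items` and the threshold of 80 are illustrative choices.

```python
from typing import Any


def best_items(items: list[dict[str, Any]], min_score: int = 80) -> list[dict[str, Any]]:
    """Keep items whose dataQuality.score meets the threshold, best first."""
    scored = [
        item for item in items
        if item.get("dataQuality", {}).get("score", 0) >= min_score
    ]
    return sorted(scored, key=lambda item: item["dataQuality"]["score"], reverse=True)
```

Items without a `dataQuality` block (for example when `dataQualityChecks` is off) are treated as score 0 and dropped, which may or may not be what a given pipeline wants.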

***

### 💰 Pricing

Pay only for what you use:

| Event | Price |
|---|---|
| Actor start | $0.01 per run |
| Page scraped | $0.01 per page saved to the dataset |

**Example runs:**

- 50 pages → $0.51
- 500 pages → $5.01
- 5,000 pages → $50.01
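
The example figures follow directly from the two event prices in the table. A small sketch of the arithmetic, using `Decimal` to avoid floating-point rounding; the function name is illustrative, and platform compute and storage are not included.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.01")   # charged once per run
PAGE_SCRAPED_USD = Decimal("0.01")  # charged per page saved to the dataset


def estimate_event_cost(pages_saved: int, runs: int = 1) -> Decimal:
    """Estimate pay-per-event charges only (platform usage is billed on top)."""
    if pages_saved < 0 or runs < 0:
        raise ValueError("pages_saved and runs must be non-negative")
    return runs * ACTOR_START_USD + pages_saved * PAGE_SCRAPED_USD


for pages in (50, 500, 5000):
    print(pages, estimate_event_cost(pages))
```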

**Proxies included.** No configuration required: the Actor handles proxy setup automatically, so the scraper works out of the box. You can supply your own proxy URLs in `proxyConfiguration` if you need a specific country, ISP, or your own provider.

Platform compute and storage are billed separately by Apify at standard rates.

***

### ❓ FAQ

**Is this a drop-in replacement for the Legacy PhantomJS Crawler?**
Input shape is intentionally close — `startUrls`, `pageFunction`, `globs`, `excludeGlobs`, `pseudoUrls`, `linkSelector` all map directly. The main difference is that `pageFunction` now runs in a Node.js / Playwright context with full ES2022+ support instead of PhantomJS's ES5.1.

**Does adaptive mode always pick the right renderer?**
It catches the vast majority of cases (pure SPAs, partial-SPAs with inline data, framework root mount points, sites with `enable JavaScript` notices). Rare edge cases that look static but render content via deep XHR chains may still slip through — explicitly set `crawlerMode: "browser"` for those.

**Does the stealth mode bypass Cloudflare / DataDome / PerimeterX?**
No. The built-in stealth defeats *basic* detection (webdriver flag, fingerprint patterns, header analysis). Premium anti-bot services use TLS fingerprinting, behavioural analysis, and challenge pages that need either a captcha solver (configure `captchaSolverApiKey`) or commercial unblocking services. This scope is stated deliberately; most crawlers in this price range have the same limitation.

**Why are there two ways to define link patterns (`globs` vs `pseudoUrls`)?**
`globs` is the modern, recommended format. `pseudoUrls` exists for backward compatibility with the Legacy PhantomJS Crawler so existing input templates still work.

**Can I send POST requests?**
Yes — give each start URL a `method` and `payload`:

```json
{
  "startUrls": [
    {"url": "https://api.example.com/search", "method": "POST", "payload": "{\"query\": \"x\"}"}
  ]
}
```
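
In the example above, `payload` is a string that itself contains JSON, which is why the inner quotes are escaped. When building the input in code, serializing the body with `json.dumps` avoids escaping by hand. A minimal sketch; `post_start_url` is a hypothetical helper.

```python
import json
from typing import Any


def post_start_url(url: str, body: dict[str, Any]) -> dict[str, str]:
    """Build a start URL entry whose payload is the JSON-encoded request body."""
    return {"url": url, "method": "POST", "payload": json.dumps(body)}


run_input = {"startUrls": [post_start_url("https://api.example.com/search", {"query": "x"})]}
print(json.dumps(run_input, indent=2))
```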

**Does it respect robots.txt?**
Yes by default. Set `ignoreRobotsTxt: true` to override (you take responsibility for compliance with the target site's terms of service).

**Can I run scheduled crawls?**
Yes — use Apify's built-in Schedules. Great for daily site monitoring or weekly SEO audits.

**Where do screenshots and snapshots go?**
To the run's default Key-Value store. Keys are `snapshot-<requestId>.html` and `screenshot-<requestId>.png`. Open the run in Apify Console → Storage → Key-Value store to view them.
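
Because the keys embed the `requestId` that also appears on each dataset item, the artifacts for a given item can be located by name. A small sketch of that mapping; the function name is illustrative.

```python
def debug_artifact_keys(item: dict) -> dict[str, str]:
    """Derive the Key-Value store keys for an item's HTML snapshot and screenshot."""
    request_id = item["requestId"]
    return {
        "snapshot": f"snapshot-{request_id}.html",
        "screenshot": f"screenshot-{request_id}.png",
    }
```

With the Python client, a key produced this way could then be passed to `client.key_value_store(run["defaultKeyValueStoreId"]).get_record(key)`; whether a record exists depends on `saveSnapshots` and `saveScreenshotPerPage`.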

***

### 🛠️ Maintained by

**brilliant\_gum** — modernising legacy Apify infrastructure since 2025.

Bug reports, feature requests, custom scraping needs: open an issue on this Actor's page.

If this Actor saved you a migration headache, leave a ⭐. It helps other ex-PhantomJS users find it.

# Actor input Schema

## `startUrls` (type: `array`):

URLs to start crawling from. Supports GET/POST, custom headers, labels, and userData per URL.

## `linkSelector` (type: `string`):

CSS selector to find links on the page. Empty string = scrape only start URLs (no link following).

## `globs` (type: `array`):

Glob patterns to filter which discovered URLs to crawl. Only matching URLs will be enqueued. Example: https://example.com/products/\*\*

## `excludeGlobs` (type: `array`):

Glob patterns for URLs to exclude from crawling.

## `pseudoUrls` (type: `array`):

Legacy pseudo-URL patterns with regex in square brackets. For backward compatibility — prefer Globs for new projects.
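
For readers unfamiliar with the format, a pseudo-URL is a literal URL in which each `[...]` segment holds a regular expression. The sketch below shows one way such a pattern could be turned into a regex. It is a simplified illustration that does not handle nested brackets (for example a character class inside the brackets), and `pseudo_url_to_regex` is a hypothetical name, not the crawler's own parser.

```python
import re


def pseudo_url_to_regex(pseudo_url: str) -> re.Pattern:
    """Escape the literal parts of a pseudo-URL and keep bracketed parts as regex."""
    parts = []
    position = 0
    for match in re.finditer(r"\[([^\]]*)\]", pseudo_url):
        parts.append(re.escape(pseudo_url[position:match.start()]))  # literal text
        parts.append("(?:" + match.group(1) + ")")                   # embedded regex
        position = match.end()
    parts.append(re.escape(pseudo_url[position:]))
    return re.compile("^" + "".join(parts) + "$")


pattern = pseudo_url_to_regex(r"https://example.com/products/[\d+]")
print(bool(pattern.match("https://example.com/products/123")))
print(bool(pattern.match("https://example.com/products/abc")))
```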

## `keepUrlFragments` (type: `boolean`):

Treat URL #fragments as unique pages (important for SPAs with hash-based routing).

## `pageFunction` (type: `string`):

Custom JavaScript function executed on each page. Receives context with: { page, $, body, json, request, response, log, enqueueLinks, pushData, getValue, setValue, customData, crawler }. Return extracted data or use pushData().

## `preNavigationHooks` (type: `string`):

JavaScript code executed BEFORE each page navigation. Use to set cookies, modify headers, or prepare the page. Available variables: { request, page, session, proxyInfo, log }.

## `postNavigationHooks` (type: `string`):

JavaScript code executed AFTER each page navigation, before the page function. Use to check page state, dismiss popups, or handle auth flows.

## `crawlerMode` (type: `string`):

'adaptive' auto-detects JS-heavy pages. 'browser' always uses Playwright. 'http' uses fast HTTP (Cheerio, no JS rendering).

## `headless` (type: `boolean`):

Run browser in headless mode. Disable for debugging or sites that detect headless browsers.

## `useChrome` (type: `boolean`):

Use the real Google Chrome browser instead of Chromium. Better anti-detection but slower startup.

## `waitUntil` (type: `string`):

When to consider the page loaded. 'networkidle' waits for no network activity (slowest but safest). 'domcontentloaded' is faster.

## `waitForSelectorOrTimeout` (type: `string`):

Extra wait after page load: CSS selector to appear OR milliseconds. Examples: '.product-card', '3000'
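
Since one string field carries either a selector or a number of milliseconds, the value has to be interpreted before use. A sketch of one plausible rule (all digits means a timeout, anything else is a selector); the Actor's actual parsing is not documented here, and `parse_wait_setting` is a hypothetical name.

```python
from typing import Optional, Tuple, Union


def parse_wait_setting(value: str) -> Optional[Tuple[str, Union[int, str]]]:
    """Interpret a waitForSelectorOrTimeout value as a timeout, a selector, or nothing."""
    value = value.strip()
    if not value:
        return None  # empty string: no extra wait
    if value.isdigit():
        return ("timeout_ms", int(value))
    return ("selector", value)
```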

## `maxCrawlDepth` (type: `integer`):

Max link distance from start URLs. 0 = only start URLs.

## `maxCrawlPages` (type: `integer`):

Maximum total pages to crawl (including pages that don't produce output).

## `maxResultsPerCrawl` (type: `integer`):

Maximum records to save to the dataset. 0 = unlimited. Separate from max pages — crawl 1000 pages but output only 100 results.

## `maxConcurrency` (type: `integer`):

Maximum parallel pages. Lower = safer for anti-bot, higher = faster.

## `maxRequestRetries` (type: `integer`):

How many times to retry a failed page before giving up.

## `pageLoadTimeoutSecs` (type: `integer`):

Maximum time to wait for a page to load.

## `pageFunctionTimeoutSecs` (type: `integer`):

Maximum time for the page function to complete. Separate from page load timeout. Increase for heavy extraction.

## `maxInfiniteScrollHeight` (type: `integer`):

Scroll down this many pixels for infinite scroll pages. 0 = no scrolling. Browser mode only.

## `closeCookieModals` (type: `boolean`):

Automatically dismiss GDPR cookie consent banners and popups. Browser mode only.

## `delayBetweenRequestsMs` (type: `integer`):

Minimum delay between requests. Randomized by ±25% for human-like behavior.
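
As an illustration of the ±25% randomization, the sketch below draws each delay uniformly between 75% and 125% of the configured value. The uniform distribution and the function name are assumptions; the Actor may shape its jitter differently.

```python
import random


def jittered_delay_ms(base_ms: float, spread: float = 0.25, rng=random) -> float:
    """Return base_ms scaled by a random factor in [1 - spread, 1 + spread]."""
    return base_ms * rng.uniform(1 - spread, 1 + spread)


# With the default delayBetweenRequestsMs of 500, delays fall between 375 and 625 ms.
print([round(jittered_delay_ms(500)) for _ in range(5)])
```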

## `blockResources` (type: `array`):

Block unnecessary resources to speed up crawling (browser mode). Valid values: image, stylesheet, font, media, script, other.

## `ignoreSslErrors` (type: `boolean`):

Continue crawling when SSL certificate verification fails.

## `ignoreRobotsTxt` (type: `boolean`):

Ignore robots.txt rules. Use responsibly and check legal requirements.

## `ignoreCorsAndCsp` (type: `boolean`):

Bypass Cross-Origin Resource Sharing and Content Security Policy restrictions. Allows cross-domain XHR from page function.

## `stealthMode` (type: `boolean`):

Anti-detection: fingerprint randomization, realistic headers, webdriver flag masking. Recommended for sites with bot protection.

## `proxyConfiguration` (type: `object`):

Optional. Proxies are included and configured automatically — leave this empty. Override only if you want to use your own proxy URLs or a specific country.

## `proxyRotation` (type: `string`):

'recommended' rotates automatically. 'per\_request' uses new proxy per request. 'until\_failure' keeps proxy until it fails.

## `sessionPoolOptions` (type: `boolean`):

Maintain persistent sessions (cookies + proxy pairs) for consistent browsing.

## `sessionPoolName` (type: `string`):

Name for the session pool. Named pools can be shared across actor runs for persistent sessions.

## `maxSessionUsageCount` (type: `integer`):

Retire a session after this many requests. Lower values reduce detection risk.

## `captchaSolverApiKey` (type: `string`):

Your 2captcha.com API key. When set, the crawler automatically detects and solves Cloudflare Turnstile and DataDome captchas. Leave empty to disable captcha solving.

## `captchaSolverTimeoutSecs` (type: `integer`):

Maximum time to wait for captcha solution from 2captcha.

## `captchaMaxRetries` (type: `integer`):

Maximum captcha solve attempts per page before giving up.

## `customHeaders` (type: `object`):

Custom HTTP headers sent with every request. Example: {"Accept-Language": "de-DE"}

## `initialCookies` (type: `array`):

Cookies to set before crawling. Array of {name, value, domain} objects.

## `customData` (type: `object`):

Arbitrary JSON passed to page function as context.customData.

## `extractMetadata` (type: `boolean`):

Automatically extract Open Graph, Twitter Cards, JSON-LD / Schema.org structured data, canonical URL, and SEO metadata from every page.

## `extractContacts` (type: `boolean`):

Automatically find and extract email addresses, phone numbers, and social media links (Twitter, LinkedIn, Facebook, Instagram, YouTube, GitHub) from page content.

## `extractCleanText` (type: `boolean`):

Extract the main content text from each page, stripping navigation, ads, sidebars, and boilerplate. Like 'Reader Mode' in browsers. Great for LLM/RAG pipelines.

## `saveScreenshotPerPage` (type: `boolean`):

Capture a screenshot of every crawled page (browser mode only). Saved to key-value store. Useful for visual monitoring, archiving, and debugging.

## `enableAnalytics` (type: `boolean`):

Generate detailed crawl report: success rates, response time percentiles, error breakdown, data quality metrics, crawl speed. Saved to key-value store as ANALYTICS\_REPORT.

## `dataQualityChecks` (type: `boolean`):

Validate extracted data quality. Warns about empty fields, encoding issues, suspiciously short content.

## `requiredFields` (type: `array`):

Fields that must be non-empty in pageFunctionResult. Items failing validation are flagged in analytics.

## `saveSnapshots` (type: `boolean`):

Save HTML snapshots and screenshots when errors occur. Useful for debugging anti-bot blocks.

## `debugLog` (type: `boolean`):

Enable verbose debug logging. Warning: significantly increases log volume.

## `browserLog` (type: `boolean`):

Include browser console.log messages in the actor log. Useful for debugging page function.

## `datasetName` (type: `string`):

Store output in a named dataset instead of the default one. Useful for multi-step workflows.

## `keyValueStoreName` (type: `string`):

Use a named key-value store for screenshots and analytics. Useful for multi-step workflows.

## `requestQueueName` (type: `string`):

Use a named request queue. Useful for resuming crawls or sharing URLs between runs.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.example.com/"
    }
  ],
  "linkSelector": "a[href]",
  "globs": [],
  "excludeGlobs": [],
  "pseudoUrls": [],
  "keepUrlFragments": false,
  "pageFunction": "async function pageFunction(context) {\n    const { request, log, page, $ } = context;\n\n    // page (Playwright) in browser mode\n    // $ (Cheerio) in HTTP mode\n    const title = page\n        ? await page.title()\n        : $('title').text();\n\n    log.info(`Scraping: ${title}`, { url: request.loadedUrl });\n\n    return {\n        url: request.loadedUrl,\n        title,\n    };\n}",
  "preNavigationHooks": "// Example: set a cookie before navigation\n// async ({ request, page, session, log }) => {\n//     if (page) await page.context().addCookies([{ name: 'auth', value: 'token123', domain: '.example.com' }]);\n// }",
  "postNavigationHooks": "// Example: dismiss a popup after page loads\n// async ({ request, page, response, log }) => {\n//     if (page) {\n//         const popup = page.locator('.popup-close');\n//         if (await popup.isVisible()) await popup.click();\n//     }\n// }",
  "crawlerMode": "adaptive",
  "headless": true,
  "useChrome": false,
  "waitUntil": "domcontentloaded",
  "waitForSelectorOrTimeout": "",
  "maxCrawlDepth": 10,
  "maxCrawlPages": 100,
  "maxResultsPerCrawl": 0,
  "maxConcurrency": 10,
  "maxRequestRetries": 3,
  "pageLoadTimeoutSecs": 60,
  "pageFunctionTimeoutSecs": 60,
  "maxInfiniteScrollHeight": 0,
  "closeCookieModals": false,
  "delayBetweenRequestsMs": 500,
  "blockResources": [],
  "ignoreSslErrors": false,
  "ignoreRobotsTxt": false,
  "ignoreCorsAndCsp": false,
  "stealthMode": true,
  "proxyRotation": "recommended",
  "sessionPoolOptions": true,
  "sessionPoolName": "",
  "maxSessionUsageCount": 50,
  "captchaSolverTimeoutSecs": 120,
  "captchaMaxRetries": 2,
  "customHeaders": {},
  "initialCookies": [],
  "customData": {},
  "extractMetadata": true,
  "extractContacts": false,
  "extractCleanText": false,
  "saveScreenshotPerPage": false,
  "enableAnalytics": true,
  "dataQualityChecks": true,
  "requiredFields": [],
  "saveSnapshots": true,
  "debugLog": false,
  "browserLog": false,
  "datasetName": "",
  "keyValueStoreName": "",
  "requestQueueName": ""
}
```

# Actor output Schema

## `results` (type: `string`):

Dataset containing all extracted page data

## `analytics` (type: `string`):

Detailed crawl analytics and data quality report

## `errorSnapshots` (type: `string`):

HTML snapshots and screenshots from failed pages

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.example.com/"
        }
    ],
    "pageFunction": async function pageFunction(context) {
        const { request, log, page, $ } = context;
    
        // page (Playwright) in browser mode
        // $ (Cheerio) in HTTP mode
        const title = page
            ? await page.title()
            : $('title').text();
    
        log.info(`Scraping: ${title}`, { url: request.loadedUrl });
    
        return {
            url: request.loadedUrl,
            title,
        };
    },
    "preNavigationHooks": `// Example: set a cookie before navigation
// async ({ request, page, session, log }) => {
//     if (page) await page.context().addCookies([{ name: 'auth', value: 'token123', domain: '.example.com' }]);
// }`,
    "postNavigationHooks": `// Example: dismiss a popup after page loads
// async ({ request, page, response, log }) => {
//     if (page) {
//         const popup = page.locator('.popup-close');
//         if (await popup.isVisible()) await popup.click();
//     }
// }`
};

// Run the Actor and wait for it to finish
const run = await client.actor("brilliant_gum/phantom-reborn-crawler").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://www.example.com/" }],
    "pageFunction": """async function pageFunction(context) {
    const { request, log, page, $ } = context;

    // page (Playwright) in browser mode
    // $ (Cheerio) in HTTP mode
    const title = page
        ? await page.title()
        : $('title').text();

    log.info(`Scraping: ${title}`, { url: request.loadedUrl });

    return {
        url: request.loadedUrl,
        title,
    };
}""",
    "preNavigationHooks": """// Example: set a cookie before navigation
// async ({ request, page, session, log }) => {
//     if (page) await page.context().addCookies([{ name: 'auth', value: 'token123', domain: '.example.com' }]);
// }""",
    "postNavigationHooks": """// Example: dismiss a popup after page loads
// async ({ request, page, response, log }) => {
//     if (page) {
//         const popup = page.locator('.popup-close');
//         if (await popup.isVisible()) await popup.click();
//     }
// }""",
}

# Run the Actor and wait for it to finish
run = client.actor("brilliant_gum/phantom-reborn-crawler").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.example.com/"
    }
  ],
  "pageFunction": "async function pageFunction(context) {\\n    const { request, log, page, $ } = context;\\n\\n    // page (Playwright) in browser mode\\n    // $ (Cheerio) in HTTP mode\\n    const title = page\\n        ? await page.title()\\n        : $('\''title'\'').text();\\n\\n    log.info(`Scraping: ${title}`, { url: request.loadedUrl });\\n\\n    return {\\n        url: request.loadedUrl,\\n        title,\\n    };\\n}",
  "preNavigationHooks": "// Example: set a cookie before navigation\\n// async ({ request, page, session, log }) => {\\n//     if (page) await page.context().addCookies([{ name: '\''auth'\'', value: '\''token123'\'', domain: '\''.example.com'\'' }]);\\n// }",
  "postNavigationHooks": "// Example: dismiss a popup after page loads\\n// async ({ request, page, response, log }) => {\\n//     if (page) {\\n//         const popup = page.locator('\''.popup-close'\'');\\n//         if (await popup.isVisible()) await popup.click();\\n//     }\\n// }"
}' |
apify call brilliant_gum/phantom-reborn-crawler --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=brilliant_gum/phantom-reborn-crawler",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Modern Web Crawler — Adaptive + Stealth + Analytics",
        "description": "Modern replacement for the Legacy PhantomJS Crawler. Auto HTTP/Browser detection, basic anti-bot stealth, built-in analytics, data quality scoring, captcha solver integration. Modern Chrome + Cheerio engine — no PhantomJS, no abandoned tech. Proxies included.",
        "version": "1.0",
        "x-build-id": "Po3YsV8GFzk60uHkx"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/brilliant_gum~phantom-reborn-crawler/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-brilliant_gum-phantom-reborn-crawler",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/brilliant_gum~phantom-reborn-crawler/runs": {
            "post": {
                "operationId": "runs-sync-brilliant_gum-phantom-reborn-crawler",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/brilliant_gum~phantom-reborn-crawler/run-sync": {
            "post": {
                "operationId": "run-sync-brilliant_gum-phantom-reborn-crawler",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "URLs to start crawling from. Supports GET/POST, custom headers, labels, and userData per URL.",
                        "items": {
                            "type": "object",
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL",
                                    "description": "Page URL"
                                }
                            }
                        }
                    },
                    "linkSelector": {
                        "title": "Link Selector",
                        "type": "string",
                        "description": "CSS selector to find links on the page. Empty string = scrape only start URLs (no link following).",
                        "default": "a[href]"
                    },
                    "globs": {
                        "title": "URL Patterns (Globs)",
                        "type": "array",
                        "description": "Glob patterns to filter which discovered URLs to crawl. Only matching URLs will be enqueued. Example: https://example.com/products/**",
                        "items": {
                            "type": "object"
                        },
                        "default": []
                    },
                    "excludeGlobs": {
                        "title": "Exclude URL Patterns",
                        "type": "array",
                        "description": "Glob patterns for URLs to exclude from crawling.",
                        "items": {
                            "type": "object"
                        },
                        "default": []
                    },
                    "pseudoUrls": {
                        "title": "Pseudo-URLs (Legacy)",
                        "type": "array",
                        "description": "Legacy pseudo-URL patterns with regex in square brackets. For backward compatibility — prefer Globs for new projects.",
                        "items": {
                            "type": "object"
                        },
                        "default": []
                    },
                    "keepUrlFragments": {
                        "title": "Keep URL Fragments",
                        "type": "boolean",
                        "description": "Treat URL #fragments as unique pages (important for SPAs with hash-based routing).",
                        "default": false
                    },
                    "pageFunction": {
                        "title": "Page Function",
                        "type": "string",
                        "description": "Custom JavaScript function executed on each page. Receives context with: { page, $, body, json, request, response, log, enqueueLinks, pushData, getValue, setValue, customData, crawler }. Return extracted data or use pushData()."
                    },
                    "preNavigationHooks": {
                        "title": "Pre-Navigation Hooks",
                        "type": "string",
                        "description": "JavaScript code executed BEFORE each page navigation. Use to set cookies, modify headers, or prepare the page. Available variables: { request, page, session, proxyInfo, log }.",
                        "default": ""
                    },
                    "postNavigationHooks": {
                        "title": "Post-Navigation Hooks",
                        "type": "string",
                        "description": "JavaScript code executed AFTER each page navigation, before the page function. Use to check page state, dismiss popups, or handle auth flows.",
                        "default": ""
                    },
                    "crawlerMode": {
                        "title": "Crawler Mode",
                        "enum": [
                            "adaptive",
                            "browser",
                            "http"
                        ],
                        "type": "string",
                        "description": "'adaptive' auto-detects JS-heavy pages. 'browser' always uses Playwright. 'http' uses fast HTTP (Cheerio, no JS rendering).",
                        "default": "adaptive"
                    },
                    "headless": {
                        "title": "Headless Mode",
                        "type": "boolean",
                        "description": "Run browser in headless mode. Disable for debugging or sites that detect headless browsers.",
                        "default": true
                    },
                    "useChrome": {
                        "title": "Use Chrome (not Chromium)",
                        "type": "boolean",
                        "description": "Use the real Google Chrome browser instead of Chromium. Better anti-detection but slower startup.",
                        "default": false
                    },
                    "waitUntil": {
                        "title": "Wait Until",
                        "enum": [
                            "domcontentloaded",
                            "load",
                            "networkidle"
                        ],
                        "type": "string",
                        "description": "When to consider the page loaded. 'networkidle' waits for no network activity (slowest but safest). 'domcontentloaded' is faster.",
                        "default": "domcontentloaded"
                    },
                    "waitForSelectorOrTimeout": {
                        "title": "Additional Wait (Selector or ms)",
                        "type": "string",
                        "description": "Extra wait after page load: either a CSS selector to wait for or a duration in milliseconds. Examples: '.product-card', '3000'",
                        "default": ""
                    },
                    "maxCrawlDepth": {
                        "title": "Max Crawl Depth",
                        "minimum": 0,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Max link distance from start URLs. 0 = only start URLs.",
                        "default": 10
                    },
                    "maxCrawlPages": {
                        "title": "Max Pages to Crawl",
                        "minimum": 1,
                        "maximum": 10000000,
                        "type": "integer",
                        "description": "Maximum total pages to crawl (including pages that don't produce output).",
                        "default": 100
                    },
                    "maxResultsPerCrawl": {
                        "title": "Max Output Results",
                        "minimum": 0,
                        "maximum": 10000000,
                        "type": "integer",
                        "description": "Maximum records to save to the dataset. 0 = unlimited. Separate from max pages — crawl 1000 pages but output only 100 results.",
                        "default": 0
                    },
                    "maxConcurrency": {
                        "title": "Max Concurrency",
                        "minimum": 1,
                        "maximum": 200,
                        "type": "integer",
                        "description": "Maximum number of pages processed in parallel. Lower values are less likely to trigger anti-bot protection; higher values crawl faster.",
                        "default": 10
                    },
                    "maxRequestRetries": {
                        "title": "Max Retries",
                        "minimum": 0,
                        "maximum": 10,
                        "type": "integer",
                        "description": "How many times to retry a failed page before giving up.",
                        "default": 3
                    },
                    "pageLoadTimeoutSecs": {
                        "title": "Page Load Timeout (sec)",
                        "minimum": 1,
                        "maximum": 300,
                        "type": "integer",
                        "description": "Maximum time to wait for a page to load.",
                        "default": 60
                    },
                    "pageFunctionTimeoutSecs": {
                        "title": "Page Function Timeout (sec)",
                        "minimum": 1,
                        "maximum": 3600,
                        "type": "integer",
                        "description": "Maximum time for the page function to complete. Separate from page load timeout. Increase for heavy extraction.",
                        "default": 60
                    },
                    "maxInfiniteScrollHeight": {
                        "title": "Max Scroll Height (px)",
                        "minimum": 0,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Scroll down this many pixels for infinite scroll pages. 0 = no scrolling. Browser mode only.",
                        "default": 0
                    },
                    "closeCookieModals": {
                        "title": "Close Cookie Modals",
                        "type": "boolean",
                        "description": "Automatically dismiss GDPR cookie consent banners and popups. Browser mode only.",
                        "default": false
                    },
                    "delayBetweenRequestsMs": {
                        "title": "Delay Between Requests (ms)",
                        "minimum": 0,
                        "maximum": 60000,
                        "type": "integer",
                        "description": "Minimum delay between requests. Randomized by ±25% for human-like behavior.",
                        "default": 500
                    },
                    "blockResources": {
                        "title": "Block Resources",
                        "type": "array",
                        "description": "Block unnecessary resources to speed up crawling (browser mode). Valid values: image, stylesheet, font, media, script, other.",
                        "items": {
                            "type": "string"
                        },
                        "default": []
                    },
                    "ignoreSslErrors": {
                        "title": "Ignore SSL Errors",
                        "type": "boolean",
                        "description": "Continue crawling when SSL certificate verification fails.",
                        "default": false
                    },
                    "ignoreRobotsTxt": {
                        "title": "Ignore robots.txt",
                        "type": "boolean",
                        "description": "Ignore robots.txt rules. Use responsibly and check legal requirements.",
                        "default": false
                    },
                    "ignoreCorsAndCsp": {
                        "title": "Ignore CORS and CSP",
                        "type": "boolean",
                        "description": "Bypass Cross-Origin Resource Sharing and Content Security Policy restrictions. Allows cross-domain XHR from the page function.",
                        "default": false
                    },
                    "stealthMode": {
                        "title": "Stealth Mode",
                        "type": "boolean",
                        "description": "Anti-detection: fingerprint randomization, realistic headers, webdriver flag masking. Recommended for sites with bot protection.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration (optional)",
                        "type": "object",
                        "description": "Optional. Proxies are included and configured automatically — leave this empty. Override only if you want to use your own proxy URLs or a specific country."
                    },
                    "proxyRotation": {
                        "title": "Proxy Rotation Strategy",
                        "enum": [
                            "recommended",
                            "per_request",
                            "until_failure"
                        ],
                        "type": "string",
                        "description": "'recommended' rotates automatically. 'per_request' uses a new proxy for each request. 'until_failure' keeps the same proxy until it fails.",
                        "default": "recommended"
                    },
                    "sessionPoolOptions": {
                        "title": "Use Session Pool",
                        "type": "boolean",
                        "description": "Maintain persistent sessions (cookies + proxy pairs) for consistent browsing.",
                        "default": true
                    },
                    "sessionPoolName": {
                        "title": "Session Pool Name",
                        "type": "string",
                        "description": "Name for the session pool. Named pools can be shared across Actor runs for persistent sessions.",
                        "default": ""
                    },
                    "maxSessionUsageCount": {
                        "title": "Max Uses Per Session",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Retire a session after this many requests. Lower values reduce detection risk.",
                        "default": 50
                    },
                    "captchaSolverApiKey": {
                        "title": "Captcha Solver API Key (2captcha)",
                        "type": "string",
                        "description": "Your 2captcha.com API key. When set, the crawler automatically detects and solves Cloudflare Turnstile and DataDome captchas. Leave empty to disable captcha solving."
                    },
                    "captchaSolverTimeoutSecs": {
                        "title": "Captcha Solver Timeout (sec)",
                        "minimum": 30,
                        "maximum": 300,
                        "type": "integer",
                        "description": "Maximum time to wait for captcha solution from 2captcha.",
                        "default": 120
                    },
                    "captchaMaxRetries": {
                        "title": "Captcha Max Retries",
                        "minimum": 1,
                        "maximum": 5,
                        "type": "integer",
                        "description": "Maximum captcha solve attempts per page before giving up.",
                        "default": 2
                    },
                    "customHeaders": {
                        "title": "Custom HTTP Headers",
                        "type": "object",
                        "description": "Custom HTTP headers sent with every request. Example: {\"Accept-Language\": \"de-DE\"}",
                        "default": {}
                    },
                    "initialCookies": {
                        "title": "Initial Cookies",
                        "type": "array",
                        "description": "Cookies to set before crawling. Array of {name, value, domain} objects.",
                        "items": {
                            "type": "object"
                        },
                        "default": []
                    },
                    "customData": {
                        "title": "Custom Data",
                        "type": "object",
                        "description": "Arbitrary JSON passed to the page function as context.customData.",
                        "default": {}
                    },
                    "extractMetadata": {
                        "title": "Auto-Extract Page Metadata",
                        "type": "boolean",
                        "description": "Automatically extract Open Graph, Twitter Cards, JSON-LD / Schema.org structured data, canonical URL, and SEO metadata from every page.",
                        "default": true
                    },
                    "extractContacts": {
                        "title": "Auto-Extract Contact Info",
                        "type": "boolean",
                        "description": "Automatically find and extract email addresses, phone numbers, and social media links (Twitter, LinkedIn, Facebook, Instagram, YouTube, GitHub) from page content.",
                        "default": false
                    },
                    "extractCleanText": {
                        "title": "Auto-Extract Clean Text",
                        "type": "boolean",
                        "description": "Extract the main content text from each page, stripping navigation, ads, sidebars, and boilerplate. Like 'Reader Mode' in browsers. Great for LLM/RAG pipelines.",
                        "default": false
                    },
                    "saveScreenshotPerPage": {
                        "title": "Save Screenshot Per Page",
                        "type": "boolean",
                        "description": "Capture a screenshot of every crawled page (browser mode only). Saved to key-value store. Useful for visual monitoring, archiving, and debugging.",
                        "default": false
                    },
                    "enableAnalytics": {
                        "title": "Enable Analytics Report",
                        "type": "boolean",
                        "description": "Generate detailed crawl report: success rates, response time percentiles, error breakdown, data quality metrics, crawl speed. Saved to key-value store as ANALYTICS_REPORT.",
                        "default": true
                    },
                    "dataQualityChecks": {
                        "title": "Data Quality Checks",
                        "type": "boolean",
                        "description": "Validate extracted data quality. Warns about empty fields, encoding issues, suspiciously short content.",
                        "default": true
                    },
                    "requiredFields": {
                        "title": "Required Fields",
                        "type": "array",
                        "description": "Fields that must be non-empty in pageFunctionResult. Items failing validation are flagged in analytics.",
                        "items": {
                            "type": "string"
                        },
                        "default": []
                    },
                    "saveSnapshots": {
                        "title": "Save Snapshots on Error",
                        "type": "boolean",
                        "description": "Save HTML snapshots and screenshots when errors occur. Useful for debugging anti-bot blocks.",
                        "default": true
                    },
                    "debugLog": {
                        "title": "Debug Log",
                        "type": "boolean",
                        "description": "Enable verbose debug logging. Warning: significantly increases log volume.",
                        "default": false
                    },
                    "browserLog": {
                        "title": "Browser Console Log",
                        "type": "boolean",
                        "description": "Include browser console.log messages in the Actor log. Useful for debugging the page function.",
                        "default": false
                    },
                    "datasetName": {
                        "title": "Dataset Name",
                        "type": "string",
                        "description": "Store output in a named dataset instead of the default one. Useful for multi-step workflows.",
                        "default": ""
                    },
                    "keyValueStoreName": {
                        "title": "Key-Value Store Name",
                        "type": "string",
                        "description": "Use a named key-value store for screenshots and analytics. Useful for multi-step workflows.",
                        "default": ""
                    },
                    "requestQueueName": {
                        "title": "Request Queue Name",
                        "type": "string",
                        "description": "Use a named request queue. Useful for resuming crawls or sharing URLs between runs.",
                        "default": ""
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
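
The `usageTotalUsd` and `usageUsd` fields in the schema above can be used to inspect what a run cost. The following is a minimal Python sketch, not part of the Actor or the Apify client: the helper name `summarize_usage_cost` and the `sample_run` dictionary are hypothetical, the sample values are copied from the schema's `example` entries, and it assumes the run object has already been fetched and unwrapped into a plain dictionary. Whether `usageTotalUsd` always equals the sum of the `usageUsd` items is also an assumption, so the sketch reports the comparison rather than relying on it.

```python
from typing import Any, Mapping


def summarize_usage_cost(run: Mapping[str, Any], tolerance: float = 1e-9) -> dict:
    """Summarize the USD cost fields of a run object (hypothetical helper).

    `run` is assumed to be a dict containing `usageTotalUsd` and `usageUsd`
    as described by the schema; missing fields are tolerated.
    """
    breakdown = run.get("usageUsd") or {}
    reported_total = run.get("usageTotalUsd")

    # Sum the per-item costs; every value is treated as a float USD amount.
    computed_total = sum(float(value) for value in breakdown.values())

    # Keep only the items that actually cost something.
    nonzero_items = {name: value for name, value in breakdown.items() if value}

    # Compare the reported total with the computed sum (assumption: they match).
    consistent = (
        reported_total is not None
        and abs(reported_total - computed_total) <= tolerance
    )

    return {
        "reported_total_usd": reported_total,
        "computed_total_usd": computed_total,
        "nonzero_items": nonzero_items,
        "consistent": consistent,
    }


# Sample data built from the `example` values in the schema (illustrative only).
sample_run = {
    "usageTotalUsd": 0.00005,
    "usageUsd": {
        "ACTOR_COMPUTE_UNITS": 0,
        "DATASET_READS": 0,
        "DATASET_WRITES": 0,
        "KEY_VALUE_STORE_READS": 0,
        "KEY_VALUE_STORE_WRITES": 0.00005,
        "KEY_VALUE_STORE_LISTS": 0,
        "REQUEST_QUEUE_READS": 0,
        "REQUEST_QUEUE_WRITES": 0,
        "DATA_TRANSFER_INTERNAL_GBYTES": 0,
        "DATA_TRANSFER_EXTERNAL_GBYTES": 0,
        "PROXY_RESIDENTIAL_TRANSFER_GBYTES": 0,
        "PROXY_SERPS": 0,
    },
}

summary = summarize_usage_cost(sample_run)
print(summary)
```

Note that these fields cover Apify platform usage only. Per the Pricing section, this Actor is also charged per event, and those charges are not part of the `usageUsd` breakdown shown here.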