# Sitemap URL Status Auditor (`automation-lab/sitemap-url-status-auditor`) Actor

Audit XML sitemaps for broken URLs, redirects, HTTP status codes, response timing, content type, canonical tags, and robots metadata.

- **URL**: https://apify.com/automation-lab/sitemap-url-status-auditor.md
- **Developed by:** [Stas Persiianenko](https://apify.com/automation-lab) (community)
- **Categories:** SEO tools, Developer tools
- **Stats:** 2 total users, 1 monthly users, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases as your subscription plan tier increases.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Sitemap URL Status Auditor

Audit every URL listed in XML sitemaps and sitemap indexes for HTTP status, redirects, response timing, content type, and optional SEO metadata.

Use this Actor when you need a repeatable sitemap checker for migrations, releases, content QA, and technical SEO monitoring.

### What does Sitemap URL Status Auditor do?

Sitemap URL Status Auditor starts from one or more XML sitemap URLs, sitemap indexes, website roots, or domains.

It downloads sitemap XML files, follows nested sitemap indexes, extracts `<loc>` URLs, deduplicates them, and checks each listed page URL.

For each URL, it records status code, final URL, redirect count, redirect chain, response time, content type, content length, and a normalized error category.

Optionally, it can fetch page HTML to extract canonical URLs and robots meta tags.
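The extraction and deduplication step can be pictured with a short sketch. This is not the Actor's source code; it is a minimal illustration using only the Python standard library, and the function name `extract_locs` is hypothetical. It assumes a standard sitemap layout (`urlset > url > loc` or `sitemapindex > sitemap > loc`).

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the XML namespace prefix, e.g. '{http://...}loc' -> 'loc'."""
    return tag.rsplit("}", 1)[-1]


def extract_locs(xml_text):
    """Return (kind, urls) for one sitemap document.

    kind is 'index' for a <sitemapindex> root, otherwise 'urlset'.
    urls holds the de-duplicated <loc> values in first-seen order.
    """
    root = ET.fromstring(xml_text)
    kind = "index" if local_name(root.tag) == "sitemapindex" else "urlset"
    seen = set()
    urls = []
    for entry in root:          # <url> or <sitemap> elements
        for child in entry:     # only direct children, so nested image <loc> tags are skipped
            if local_name(child.tag) != "loc":
                continue
            loc = (child.text or "").strip()
            if loc and loc not in seen:
                seen.add(loc)
                urls.append(loc)
    return kind, urls
```

A caller would pass the downloaded XML text and then either queue the returned URLs as nested sitemaps (`kind == "index"`) or check them as pages.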

### Who is it for?

SEO specialists use it to catch broken URLs in sitemaps before search engines waste crawl budget.

Web QA teams use it after deployments to confirm that sitemap URLs still resolve.

Migration teams use it to check final URLs and redirect counts after domain, CMS, or URL-structure changes.

Agencies use it for recurring client health checks and exportable audit evidence.

Developers use it as a fast HTTP-only smoke test for public sitemap quality.

### Why use this Actor?

🗺️ It is sitemap-first, not just a generic URL checker.

🔁 It recursively expands sitemap indexes.

🧹 It deduplicates URLs before checking them.

🚦 It uses HEAD first with GET fallback for servers that reject HEAD.

📊 It returns one clean dataset table for export, dashboards, and alerts.

⚙️ It includes concurrency, caps, timeout, retry, and polite User-Agent controls.

### What data can you extract?

| Field | Description |
| --- | --- |
| `url` | Page URL discovered in the sitemap. |
| `sourceSitemap` | Sitemap XML file where the URL was found. |
| `sitemapDepth` | Recursion depth inside sitemap indexes. |
| `statusCode` | HTTP status code or null for network failures. |
| `ok` | True for 2xx and 3xx responses. |
| `method` | HEAD, GET, SITEMAP, or NONE. |
| `finalUrl` | Final URL after redirects. |
| `redirectCount` | Number of redirects followed. |
| `redirectChain` | Redirect URLs exposed by the HTTP client. |
| `contentType` | Content-Type response header. |
| `contentLength` | Content-Length header when available. |
| `responseTimeMs` | Request duration in milliseconds. |
| `errorCategory` | none, http_error, timeout, dns_error, tls_error, network_error, parse_error, blocked, or not_checked. |
| `errorMessage` | Human-readable error message. |
| `canonicalUrl` | Canonical link when metadata extraction is enabled. |
| `robotsMeta` | Robots meta tag when metadata extraction is enabled. |
| `xRobotsTag` | X-Robots-Tag response header. |
| `checkedAt` | ISO timestamp of the check. |

### How much does it cost to audit sitemap URL status?

This Actor uses pay-per-event pricing.

There is a $0.005 start event for each run and a per-URL result event for each dataset row produced.

| Plan tier | Per URL result |
| --- | ---: |
| Free | $0.000029952 |
| Starter / Bronze | $0.000026046 |
| Scale / Silver | $0.000020316 |
| Business / Gold | $0.000015627 |
| Platinum | $0.000010418 |
| Diamond | $0.000010000 |

For most users, cost scales with the number of sitemap URLs checked.

Use `maxUrls` for small first tests and increase it after you confirm the sitemap source is correct.
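To budget a run before launching it, the pricing above can be turned into simple arithmetic: one start event plus one result event per dataset row. The sketch below copies the prices from the table; the function name `estimate_run_cost` is hypothetical, and actual billing is determined by the Apify platform.

```python
START_EVENT_USD = 0.005

# Per-URL result prices, copied from the pricing table above.
PER_URL_USD = {
    "Free": 0.000029952,
    "Starter / Bronze": 0.000026046,
    "Scale / Silver": 0.000020316,
    "Business / Gold": 0.000015627,
    "Platinum": 0.000010418,
    "Diamond": 0.000010000,
}


def estimate_run_cost(result_rows, plan="Free"):
    """Estimated cost in USD: one start event plus one event per dataset row."""
    return START_EVENT_USD + result_rows * PER_URL_USD[plan]


print(f"{estimate_run_cost(1000):.6f}")  # 0.034952
```

On the Free tier, 1,000 result rows therefore cost roughly $0.035, most of which is the per-URL charge.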

### How to use it

1. Open the Actor on Apify.
2. Add one or more sitemap URLs, website roots, or domains.
3. Set `maxUrls` to the number of URLs you want to audit.
4. Keep `headFirst` enabled for faster status checks.
5. Enable `includePageMetadata` only when you need canonical and robots meta extraction.
6. Run the Actor.
7. Export the dataset as JSON, CSV, Excel, or connect it to your workflow.

### Input settings

#### Sitemap URLs or websites

Use `startUrls` for XML sitemap URLs, sitemap index URLs, website roots, or domains.

Examples:

- `https://example.com/sitemap.xml`
- `https://example.com/sitemap_index.xml`
- `https://example.com/`
- `example.com`

Website roots and bare domains automatically resolve to `/sitemap.xml`.
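The resolution rule can be sketched as follows. This is an illustration, not the Actor's implementation: the helper name `to_sitemap_url` is hypothetical, and defaulting bare domains to `https://` is an assumption.

```python
from urllib.parse import urlsplit


def to_sitemap_url(entry):
    """Map a sitemap URL, website root, or bare domain to a sitemap URL."""
    entry = entry.strip()
    if "://" not in entry:
        # Assumption: bare domains are treated as HTTPS.
        entry = "https://" + entry
    parts = urlsplit(entry)
    if parts.path in ("", "/"):
        # Website root: fall back to the conventional sitemap location.
        return f"{parts.scheme}://{parts.netloc}/sitemap.xml"
    # Anything with a path is taken to be an explicit sitemap URL.
    return entry
```

Passing an explicit sitemap URL avoids any guessing, which is why exact URLs are preferable when the site keeps its sitemap somewhere other than `/sitemap.xml`.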

#### Additional domains

Use `domains` when you want a simple list of extra domains in addition to `startUrls`.

Each domain is converted to a sitemap URL.

#### Maximum URLs

`maxUrls` controls the maximum unique page URLs audited.

Start with 100 for a cheap test.

Increase to 1,000 or more for full-site checks.

#### Maximum sitemap files

`maxSitemaps` prevents very large sitemap indexes from expanding forever.

Large ecommerce sites can have hundreds of sitemap files.

#### Maximum sitemap index depth

`maxDepth` controls how deeply nested sitemap indexes are followed.

The default of 3 is enough for normal sitemap structures.

#### Concurrency

`concurrency` controls how many URL checks run in parallel.

Use lower values for small or fragile sites.

Use higher values for durable sites and faster audits.

#### Request timeout and retries

`requestTimeoutSecs` and `maxRetries` balance speed and reliability.

Timeouts are recorded as dataset rows rather than crashing the whole run.
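The "record, don't crash" behavior can be pictured as a small retry loop that always returns a row. This is a sketch under stated assumptions: `check_with_retries` and the `send` callable are hypothetical names, `send(url)` is expected to return a status code or raise `TimeoutError`, and mapping every status of 400 or above to `http_error` is a simplification.

```python
def check_with_retries(url, send, max_retries=1):
    """Try `send(url)` up to max_retries + 1 times and always return a row."""
    for attempt in range(max_retries + 1):
        try:
            status = send(url)
        except TimeoutError as exc:
            if attempt < max_retries:
                continue  # transient failure: try again
            # Retries exhausted: emit an error row instead of raising.
            return {
                "url": url,
                "statusCode": None,
                "ok": False,
                "errorCategory": "timeout",
                "errorMessage": str(exc) or "request timed out",
            }
        ok = 200 <= status < 400
        return {
            "url": url,
            "statusCode": status,
            "ok": ok,
            # Assumption: any non-OK status maps to 'http_error'.
            "errorCategory": "none" if ok else "http_error",
            "errorMessage": None,
        }
```

Because failures become rows, one slow URL cannot abort an audit of thousands of others.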

#### Use HEAD before GET

`headFirst` checks URLs with HEAD first and falls back to GET when needed.

This keeps most audits fast and lightweight.
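A minimal sketch of the HEAD-then-GET pattern is shown below. The set of HEAD statuses that trigger a fallback is an assumption, since the README does not list them, and `check_url` and `send` are hypothetical names; `send(method, url)` stands in for any HTTP client call that returns a status code.

```python
# Assumption: statuses that suggest the server rejects or mishandles HEAD.
HEAD_FALLBACK_STATUSES = {403, 405, 501}


def check_url(url, send, head_first=True):
    """Return (method, status) for the request whose result is reported."""
    if head_first:
        status = send("HEAD", url)
        if status not in HEAD_FALLBACK_STATUSES:
            return "HEAD", status
        # HEAD looked blocked or unsupported: retry the same URL with GET.
    return "GET", send("GET", url)
```

HEAD responses carry no body, so a run that rarely needs the fallback transfers far less data than one that issues GET for every URL.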

#### Follow redirects

`followRedirects` records the final URL and redirect count.

This is useful for migrations and canonicalization checks.

#### Include page metadata

`includePageMetadata` fetches full HTML pages with GET and extracts canonical and robots meta tags.

Enable it for deeper SEO audits.

Leave it off for faster pure status checks.

#### User-Agent

The default User-Agent identifies the Actor politely.

You can override it for internal policies or target-specific requirements.

### Output example

```json
{
  "url": "https://example.com/about/",
  "sourceSitemap": "https://example.com/sitemap.xml",
  "sitemapDepth": 0,
  "statusCode": 200,
  "ok": true,
  "method": "HEAD",
  "finalUrl": "https://example.com/about/",
  "redirectCount": 0,
  "redirectChain": [],
  "contentType": "text/html; charset=utf-8",
  "contentLength": 12345,
  "responseTimeMs": 184,
  "errorCategory": "none",
  "errorMessage": null,
  "canonicalUrl": null,
  "robotsMeta": null,
  "xRobotsTag": null,
  "checkedAt": "2026-06-27T00:00:00.000Z"
}
````

### Common workflows

#### Broken sitemap URL audit

Run the Actor with `maxUrls` set to your sitemap size.

Filter output where `ok` is false.

Review `errorCategory` and `errorMessage`.
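Once the dataset items are loaded as a list of dictionaries (for example from a JSON export), the filtering step is a few lines of Python. The helper name `summarize_failures` is hypothetical; the field names come from the output table above.

```python
from collections import Counter


def summarize_failures(rows):
    """Return (failed_rows, counts_by_error_category) for dataset rows."""
    failed = [row for row in rows if not row.get("ok")]
    counts = Counter(row.get("errorCategory", "unknown") for row in failed)
    return failed, dict(counts)
```

Sorting the failed rows by `errorCategory` is usually enough to separate real 404s from timeouts or blocked requests that deserve a rerun at lower concurrency.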

#### Redirect migration QA

Run before and after a migration.

Compare `finalUrl`, `redirectCount`, and `statusCode`.

Flag URLs with long redirect chains or unexpected final domains.
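The flagging rule can be sketched as a small filter over dataset rows. The function name `flag_redirect_issues`, the `allowed_hosts` parameter, and the default threshold of one redirect are illustrative assumptions; pick limits that match your migration plan.

```python
from urllib.parse import urlsplit


def flag_redirect_issues(rows, allowed_hosts, max_redirects=1):
    """Return (url, reasons) pairs for rows that need a closer look."""
    flagged = []
    for row in rows:
        reasons = []
        if (row.get("redirectCount") or 0) > max_redirects:
            reasons.append("long_redirect_chain")
        final_url = row.get("finalUrl")
        if final_url and urlsplit(final_url).hostname not in allowed_hosts:
            reasons.append("unexpected_final_host")
        if reasons:
            flagged.append((row["url"], reasons))
    return flagged
```

Running the same filter on the pre-migration and post-migration datasets makes regressions easy to spot, because only newly flagged URLs need review.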

#### Canonical and robots review

Enable `includePageMetadata`.

Filter rows where canonical URLs are missing, unexpected, or off-domain.

Review `robotsMeta` and `xRobotsTag` for accidental noindex directives.
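A review pass over the metadata fields might look like the sketch below. It assumes `robotsMeta` and `xRobotsTag` are plain strings or null, and it treats a canonical on a different hostname as off-domain; the helper name `review_metadata` is hypothetical.

```python
from urllib.parse import urlsplit


def review_metadata(row):
    """Return a list of SEO issues found in one dataset row."""
    issues = []
    canonical = row.get("canonicalUrl")
    if not canonical:
        issues.append("missing_canonical")
    elif urlsplit(canonical).hostname != urlsplit(row["url"]).hostname:
        issues.append("off_domain_canonical")
    # Combine the meta tag and the response header, ignoring null values.
    directives = " ".join(
        filter(None, [row.get("robotsMeta"), row.get("xRobotsTag")])
    ).lower()
    if "noindex" in directives:
        issues.append("noindex")
    return issues
```

A URL that is listed in the sitemap but carries `noindex` sends search engines conflicting signals, so those rows are usually worth fixing first.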

#### Release smoke test

Schedule a small run after deployment.

Use `maxUrls` to audit the most important sitemap subset.

Send failures into Slack, email, or a dashboard with Apify integrations.

### Integrations

Connect the dataset to Google Sheets for SEO reports.

Use Apify webhooks to send failed URL rows to monitoring systems.

Pull results with the Apify API into Looker Studio, BigQuery, Snowflake, or your internal QA tools.

Run it from CI/CD after a website deployment.

Use recurring tasks for daily or weekly sitemap status monitoring.

### API usage

#### Node.js

```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('automation-lab/sitemap-url-status-auditor').call({
  startUrls: [{ url: 'https://apify.com/sitemap.xml' }],
  maxUrls: 100,
});
console.log(run.defaultDatasetId);
```

#### Python

```python
from apify_client import ApifyClient
import os

client = ApifyClient(os.environ['APIFY_TOKEN'])
run = client.actor('automation-lab/sitemap-url-status-auditor').call(run_input={
    'startUrls': [{'url': 'https://apify.com/sitemap.xml'}],
    'maxUrls': 100,
})
print(run['defaultDatasetId'])
```

#### cURL

```bash
curl -X POST 'https://api.apify.com/v2/acts/automation-lab~sitemap-url-status-auditor/runs?token=YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"startUrls":[{"url":"https://apify.com/sitemap.xml"}],"maxUrls":100}'
```

### MCP usage

Use this Actor from MCP-compatible clients through the Apify MCP Server.

Claude Desktop MCP URL:

`https://mcp.apify.com/?tools=automation-lab/sitemap-url-status-auditor`

Claude Code MCP URL:

`https://mcp.apify.com/?tools=automation-lab/sitemap-url-status-auditor`

Claude Code setup command:

```bash
claude mcp add --transport http apify-sitemap-auditor "https://mcp.apify.com/?tools=automation-lab/sitemap-url-status-auditor"
```

Claude Desktop JSON config example:

```json
{
  "mcpServers": {
    "apify-sitemap-auditor": {
      "url": "https://mcp.apify.com/?tools=automation-lab/sitemap-url-status-auditor"
    }
  }
}
```

Example prompts:

- "Audit this sitemap and summarize broken URLs."
- "Check redirect counts for URLs in this sitemap index."
- "Find sitemap URLs that return 404, 500, timeout, or blocked responses."
- "Run a canonical and robots metadata audit for this sitemap."

### Tips for best results

Start with a small `maxUrls` value.

Use exact sitemap XML URLs when you know them.

Reduce concurrency when a site returns 429 or intermittent errors.

Enable metadata extraction only when canonical or robots tags matter.

Keep sitemap and URL caps aligned with your budget.

Review sitemap error rows; they often reveal invalid sitemap indexes or blocked XML files.

### Troubleshooting

#### The Actor says no `<loc>` URLs were found

Check that the input URL points to XML sitemap content, not an HTML page.

If you entered a website root, confirm `/sitemap.xml` exists.

#### Many URLs show blocked or 403

Lower concurrency and use a clear User-Agent.

Some sites block automated HEAD requests; the Actor falls back to GET for common blocked HEAD statuses.

#### Metadata fields are empty

Canonical and robots meta fields require `includePageMetadata` to be enabled.

Headers such as `xRobotsTag` can still appear without metadata mode.

#### The run is slower than expected

Large sitemap indexes, slow target servers, metadata extraction, redirects, and retries increase runtime.

Lower `maxUrls` or increase concurrency carefully.

### Data quality notes

HTTP status checks reflect the response seen during the run.

Target websites can rate-limit, geo-vary, or serve different responses to different clients.

The Actor records those outcomes instead of hiding them.

Redirect chains depend on what the HTTP client exposes after following redirects.

### Legality and ethics

This Actor is designed for public XML sitemaps and public URLs.

Use it on websites you own, manage, audit, or are otherwise authorized to check.

Respect target website terms, rate limits, and robots guidance.

Reduce concurrency if a site appears stressed or rate-limited.

### Related scrapers and tools

- [Bulk URL Status Checker](https://apify.com/automation-lab/bulk-url-status-checker)
- [Website Contact Finder](https://apify.com/automation-lab/website-contact-finder)
- [Robots.txt Validator](https://apify.com/automation-lab/robots-txt-validator)

### FAQ

#### Can this Actor audit sitemap indexes?

Yes. It detects sitemap indexes and recursively expands nested sitemap files up to `maxDepth` and `maxSitemaps`.
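How the two caps interact can be illustrated with a breadth-first walk. This is a sketch, not the Actor's code: `expand_sitemaps` and the caller-supplied `fetch(url)` (returning a `(kind, locs)` pair, where `kind` is `"index"` or `"urlset"`) are hypothetical, and counting the starting sitemap as depth 0 is an assumption.

```python
from collections import deque


def expand_sitemaps(start_url, fetch, max_depth=3, max_sitemaps=20):
    """Walk nested sitemap indexes and collect de-duplicated page URLs.

    Returns a dict mapping page URL -> (source sitemap, sitemap depth).
    """
    queue = deque([(start_url, 0)])
    visited = set()
    pages = {}  # dicts keep insertion order, so first-seen order is preserved
    while queue and len(visited) < max_sitemaps:
        sitemap_url, depth = queue.popleft()
        if sitemap_url in visited:
            continue
        visited.add(sitemap_url)
        kind, locs = fetch(sitemap_url)
        if kind == "index":
            # Only descend while the depth cap allows another level.
            if depth < max_depth:
                queue.extend((loc, depth + 1) for loc in locs)
        else:
            for loc in locs:
                # Keep the first sitemap in which each page URL was seen.
                pages.setdefault(loc, (sitemap_url, depth))
    return pages
```

With this reading, `max_sitemaps` bounds the number of XML documents fetched, while `max_depth` bounds how far index-of-index nesting is followed, so a very wide but shallow index is limited by the first cap and a deeply nested one by the second.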

#### Does it need a browser?

No. It is an HTTP-only Actor for public XML and URL checks.

#### Can it audit password-protected staging sites?

Not in the default workflow. Public unauthenticated URLs are the intended use case.

#### Does it use proxies?

No proxy is required by default. It performs plain HTTP requests from the Actor runtime.

#### Can I schedule recurring audits?

Yes. Use Apify tasks and schedules to run the same input daily, weekly, or after deployments.

#### Can I export results?

Yes. Apify datasets export to JSON, CSV, Excel, XML, RSS, and HTML table formats.

#### How do I audit more than one domain?

Add multiple sitemap URLs to `startUrls` or put additional domains in `domains`.

#### What counts as OK?

The `ok` field is true for HTTP 2xx and 3xx responses.

#### What happens when a sitemap URL is broken?

The Actor emits an error row for the sitemap itself with method `SITEMAP` and a normalized error category.

#### What happens when a page URL times out?

The Actor emits a row for that URL with `statusCode` null, `ok` false, and `errorCategory` set to `timeout`.

# Actor input Schema

## `startUrls` (type: `array`):

XML sitemap URLs, sitemap index URLs, or website roots/domains. Website roots automatically use /sitemap.xml.

## `domains` (type: `array`):

Optional domains or website URLs to check via /sitemap.xml, one per item.

## `maxUrls` (type: `integer`):

Maximum unique page URLs checked across all sitemap sources.

## `maxSitemaps` (type: `integer`):

Maximum XML sitemap documents to fetch while expanding sitemap indexes.

## `maxDepth` (type: `integer`):

How many nested sitemap-index levels to follow.

## `concurrency` (type: `integer`):

Parallel page URL checks. Lower this for fragile websites.

## `requestTimeoutSecs` (type: `integer`):

Timeout for each sitemap or page request.

## `maxRetries` (type: `integer`):

Retry transient network failures before recording an error row.

## `headFirst` (type: `boolean`):

Check URLs with HEAD first, then fallback to GET for servers that block or mishandle HEAD.

## `followRedirects` (type: `boolean`):

Follow HTTP redirects and report the final URL and redirect count.

## `includePageMetadata` (type: `boolean`):

Fetch each page with GET to extract canonical URL and robots meta. This is slower but useful for SEO audits.

## `userAgent` (type: `string`):

Polite User-Agent sent to target websites.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://apify.com/sitemap.xml"
    }
  ],
  "domains": [],
  "maxUrls": 20,
  "maxSitemaps": 20,
  "maxDepth": 3,
  "concurrency": 10,
  "requestTimeoutSecs": 20,
  "maxRetries": 1,
  "headFirst": true,
  "followRedirects": true,
  "includePageMetadata": false,
  "userAgent": "Automation-Lab-Sitemap-URL-Status-Auditor/1.0 (+https://apify.com/automation-lab/sitemap-url-status-auditor)"
}
```

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://apify.com/sitemap.xml"
        }
    ],
    "domains": [],
    "maxUrls": 20,
    "maxSitemaps": 20,
    "maxDepth": 3,
    "concurrency": 10,
    "requestTimeoutSecs": 20,
    "maxRetries": 1,
    "headFirst": true,
    "followRedirects": true,
    "includePageMetadata": false,
    "userAgent": "Automation-Lab-Sitemap-URL-Status-Auditor/1.0 (+https://apify.com/automation-lab/sitemap-url-status-auditor)"
};

// Run the Actor and wait for it to finish
const run = await client.actor("automation-lab/sitemap-url-status-auditor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://apify.com/sitemap.xml" }],
    "domains": [],
    "maxUrls": 20,
    "maxSitemaps": 20,
    "maxDepth": 3,
    "concurrency": 10,
    "requestTimeoutSecs": 20,
    "maxRetries": 1,
    "headFirst": True,
    "followRedirects": True,
    "includePageMetadata": False,
    "userAgent": "Automation-Lab-Sitemap-URL-Status-Auditor/1.0 (+https://apify.com/automation-lab/sitemap-url-status-auditor)",
}

# Run the Actor and wait for it to finish
run = client.actor("automation-lab/sitemap-url-status-auditor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://apify.com/sitemap.xml"
    }
  ],
  "domains": [],
  "maxUrls": 20,
  "maxSitemaps": 20,
  "maxDepth": 3,
  "concurrency": 10,
  "requestTimeoutSecs": 20,
  "maxRetries": 1,
  "headFirst": true,
  "followRedirects": true,
  "includePageMetadata": false,
  "userAgent": "Automation-Lab-Sitemap-URL-Status-Auditor/1.0 (+https://apify.com/automation-lab/sitemap-url-status-auditor)"
}' |
apify call automation-lab/sitemap-url-status-auditor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=automation-lab/sitemap-url-status-auditor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Sitemap URL Status Auditor",
        "description": "Audit XML sitemaps for broken URLs, redirects, HTTP status codes, response timing, content type, canonical tags, and robots metadata.",
        "version": "0.1",
        "x-build-id": "ZxH9h69fX9EM382Uv"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/automation-lab~sitemap-url-status-auditor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-automation-lab-sitemap-url-status-auditor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/automation-lab~sitemap-url-status-auditor/runs": {
            "post": {
                "operationId": "runs-sync-automation-lab-sitemap-url-status-auditor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/automation-lab~sitemap-url-status-auditor/run-sync": {
            "post": {
                "operationId": "run-sync-automation-lab-sitemap-url-status-auditor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "startUrls"
                ],
                "properties": {
                    "startUrls": {
                        "title": "Sitemap URLs or websites",
                        "type": "array",
                        "description": "XML sitemap URLs, sitemap index URLs, or website roots/domains. Website roots automatically use /sitemap.xml.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "domains": {
                        "title": "Additional domains",
                        "type": "array",
                        "description": "Optional domains or website URLs to check via /sitemap.xml, one per item.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxUrls": {
                        "title": "Maximum URLs to audit",
                        "minimum": 1,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Maximum unique page URLs checked across all sitemap sources.",
                        "default": 20
                    },
                    "maxSitemaps": {
                        "title": "Maximum sitemap files",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum XML sitemap documents to fetch while expanding sitemap indexes.",
                        "default": 20
                    },
                    "maxDepth": {
                        "title": "Maximum sitemap index depth",
                        "minimum": 0,
                        "maximum": 10,
                        "type": "integer",
                        "description": "How many nested sitemap-index levels to follow.",
                        "default": 3
                    },
                    "concurrency": {
                        "title": "URL check concurrency",
                        "minimum": 1,
                        "maximum": 50,
                        "type": "integer",
                        "description": "Parallel page URL checks. Lower this for fragile websites.",
                        "default": 10
                    },
                    "requestTimeoutSecs": {
                        "title": "Request timeout seconds",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Timeout for each sitemap or page request.",
                        "default": 20
                    },
                    "maxRetries": {
                        "title": "Retries per request",
                        "minimum": 0,
                        "maximum": 5,
                        "type": "integer",
                        "description": "Retry transient network failures before recording an error row.",
                        "default": 1
                    },
                    "headFirst": {
                        "title": "Use HEAD before GET",
                        "type": "boolean",
                        "description": "Check URLs with HEAD first, then fallback to GET for servers that block or mishandle HEAD.",
                        "default": true
                    },
                    "followRedirects": {
                        "title": "Follow redirects",
                        "type": "boolean",
                        "description": "Follow HTTP redirects and report the final URL and redirect count.",
                        "default": true
                    },
                    "includePageMetadata": {
                        "title": "Extract canonical and robots meta",
                        "type": "boolean",
                        "description": "Fetch each page with GET to extract canonical URL and robots meta. This is slower but useful for SEO audits.",
                        "default": false
                    },
                    "userAgent": {
                        "title": "User-Agent",
                        "type": "string",
                        "description": "Polite User-Agent sent to target websites.",
                        "default": "Automation-Lab-Sitemap-URL-Status-Auditor/1.0 (+https://apify.com/automation-lab/sitemap-url-status-auditor)"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
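
The `runsResponseSchema` above wraps every run field in a top-level `data` object. As a minimal sketch of how a caller might read that response, the Python snippet below pulls out the run ID, status, default dataset ID, and total cost from an already-decoded JSON body. The `RunSummary` class, the `summarize_run` helper, and the `TERMINAL_STATUSES` set are hypothetical names introduced for illustration; the schema itself only shows `READY` as an example status, so the list of terminal statuses is an assumption you should check against the Apify API documentation.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Assumption: statuses after which a run no longer changes.
# The schema above only lists "READY" as an example value.
TERMINAL_STATUSES = frozenset({"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"})


@dataclass(frozen=True)
class RunSummary:
    """Hypothetical container for the run fields most callers need."""
    run_id: str
    status: str
    dataset_id: Optional[str]
    usage_total_usd: float
    finished: bool


def summarize_run(response: Mapping[str, Any]) -> RunSummary:
    """Extract key fields from a decoded run response shaped like runsResponseSchema."""
    data = response.get("data")
    if not isinstance(data, Mapping):
        raise ValueError("run response has no 'data' object")
    if "id" not in data:
        raise ValueError("run response 'data' has no 'id'")

    status = str(data.get("status", "UNKNOWN"))
    return RunSummary(
        run_id=str(data["id"]),
        status=status,
        # Results are stored in the run's default dataset, if one was created.
        dataset_id=data.get("defaultDatasetId"),
        # The field may be absent early in a run; treat that as zero cost so far.
        usage_total_usd=float(data.get("usageTotalUsd", 0.0)),
        finished=status in TERMINAL_STATUSES,
    )


if __name__ == "__main__":
    # Placeholder IDs, with status and cost taken from the schema's example values.
    sample = {
        "data": {
            "id": "example-run-id",
            "status": "READY",
            "defaultDatasetId": "example-dataset-id",
            "usageTotalUsd": 0.00005,
        }
    }
    print(summarize_run(sample))
```

Checking `finished` before reading `dataset_id` avoids fetching results from a run that is still in progress.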
