# Universal Data Structure Converter (`moving_beacon-owner1/my-actor-63`) Actor

A production-grade Apify Actor that converts between HTML, XML, CSV, YAML, and JSON formats. It supports nine conversion types plus automatic input-format detection, nested JSON flattening, HTML table scraping, batch URL processing, and configurable output formatting.

- **URL**: https://apify.com/moving_beacon-owner1/my-actor-63.md
- **Developed by:** [Jamshaid Arif](https://apify.com/moving_beacon-owner1) (community)
- **Categories:** Developer tools, Automation, Integrations
- **Stats:** 1 total users, 0 monthly users, 0.0% runs succeeded, NaN bookmarks
- **User rating**: No ratings yet

## Pricing

from $10.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 🔄 Universal Data Structure Converter — Apify Actor

A production-grade Apify Actor that converts between **HTML, XML, CSV, YAML, and JSON** formats. It supports nine conversion types plus automatic input-format detection, nested JSON flattening, HTML table scraping, batch URL processing, and configurable output formatting.

### 🌐 Supported Conversions

| # | Conversion    | Description                                |
|---|---------------|--------------------------------------------|
| 1 | HTML → JSON   | Parse DOM tree or extract `<table>` data   |
| 2 | XML  → JSON   | Full tree with attributes & namespaces     |
| 3 | CSV  → JSON   | With auto type-casting (int/float/bool)    |
| 4 | YAML → JSON   | Single or multi-document streams           |
| 5 | JSON → XML    | Custom root/item tags, XML declaration     |
| 6 | JSON → CSV    | Nested object flattening to dot-columns    |
| 7 | JSON → YAML   | Block or flow style output                 |
| 8 | YAML → XML    | Chained (YAML → JSON → XML)               |
| 9 | CSV  → XML    | Chained (CSV → JSON → XML)                |

### ✨ Key Features

- **Auto-Detection** — Set conversion to `auto` and the actor detects whether input is HTML, XML, JSON, YAML, or CSV
- **URL Fetching** — Provide a list of URLs to fetch and convert in batch
- **HTML Table Scraping** — Extract `<table>` elements directly into structured JSON arrays
- **Smart Type-Casting** — CSV values like `"30"`, `"true"`, `"99.5"` auto-cast to `int`, `bool`, `float`
- **Nested Flattening** — `{"a": {"b": 1}}` becomes CSV column `a.b` when exporting JSON → CSV
- **Proxy Support** — Use Apify Proxy for fetching URLs behind firewalls
- **Custom Delimiters** — Comma, tab, semicolon, pipe for CSV input/output
- **Pretty-Print or Minify** — Configurable indentation or compact output
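
The README does not describe how auto-detection decides on a format. The sketch below shows one plausible heuristic, purely as an illustration: `detect_format` is a hypothetical name, and the rules (try JSON first, then look for HTML markers, then a leading `<`, then YAML-looking first lines, with CSV as the fallback) are assumptions rather than the Actor's actual logic.

```python
import json


def detect_format(text: str) -> str:
    """Guess the input format. A heuristic sketch, not the Actor's real detector."""
    stripped = text.strip()

    # JSON: the whole payload must parse as an object or array.
    if stripped[:1] in ("{", "["):
        try:
            json.loads(stripped)
            return "json"
        except ValueError:
            pass

    # HTML: look for typical markers near the start of the document.
    head = stripped[:200].lower()
    if head.startswith("<!doctype html") or "<html" in head or "<table" in head:
        return "html"

    # Any other markup that starts with an angle bracket is treated as XML.
    if stripped.startswith("<"):
        return "xml"

    # YAML: a document marker, or a "key: value" first line without commas.
    lines = stripped.splitlines()
    first_line = lines[0] if lines else ""
    if first_line.startswith("---") or (":" in first_line and "," not in first_line):
        return "yaml"

    # Everything else falls back to CSV.
    return "csv"
```

A heuristic like this can misclassify edge cases (for example, a single-column CSV whose header contains a colon), so choosing an explicit `conversionType` remains the safer option when the format is known.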

### 📋 Input Schema

| Parameter            | Type    | Default    | Description                                        |
|----------------------|---------|------------|----------------------------------------------------|
| `conversionType`     | string  | `auto`     | Conversion to perform (or `auto` to detect)        |
| `outputFormat`       | string  | `json`     | Target format when using auto-detect               |
| `inputData`          | string  | *(sample)* | Raw data to convert (paste directly)               |
| `sourceUrls`         | array   | `[]`       | URLs to fetch and convert in batch                 |
| `csvDelimiter`       | string  | `,`        | CSV column separator                               |
| `csvHasHeader`       | boolean | `true`     | Treat first CSV row as column names                |
| `typeCast`           | boolean | `true`     | Auto-cast CSV strings to native types              |
| `flattenNested`      | boolean | `true`     | Flatten nested JSON for CSV export                 |
| `flattenSeparator`   | string  | `.`        | Separator for flattened key names                  |
| `xmlRootTag`         | string  | `root`     | Root element name for XML output                   |
| `xmlListItemTag`     | string  | `item`     | Tag for array items in XML output                  |
| `xmlDeclaration`     | boolean | `true`     | Include XML `<?xml?>` header                       |
| `xmlStripNamespaces` | boolean | `true`     | Remove namespace prefixes from XML tags            |
| `htmlExtractTables`  | boolean | `false`    | Extract only `<table>` elements from HTML          |
| `htmlParser`         | string  | `lxml`     | BeautifulSoup parser engine                        |
| `yamlMultiDoc`       | boolean | `false`    | Parse multi-document YAML streams                  |
| `indent`             | integer | `2`        | Spaces for pretty-printing (0-8)                   |
| `minify`             | boolean | `false`    | Compact output (overrides indent)                  |
| `outputAsString`     | boolean | `false`    | Store result as raw string instead of parsed JSON  |
| `proxyConfiguration` | object  | *disabled* | Proxy settings for URL fetching                    |

### 🚀 Usage Examples

#### Example 1: Convert CSV → JSON (default)

Run the Actor with its defaults. It ships with sample CSV data and auto-detects the input format:

```json
{
    "conversionType": "auto",
    "outputFormat": "json"
}
````

#### Example 2: HTML Table Scraping

```json
{
    "conversionType": "html2json",
    "inputData": "<table><tr><th>Name</th><th>Age</th></tr><tr><td>Alice</td><td>30</td></tr></table>",
    "htmlExtractTables": true
}
```

#### Example 3: Batch URL Processing

```json
{
    "conversionType": "auto",
    "outputFormat": "json",
    "sourceUrls": [
        { "url": "https://example.com/data.csv" },
        { "url": "https://api.example.com/config.yaml" }
    ]
}
```
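
Note that each `sourceUrls` entry is an object with a `url` key, not a bare string. A small helper can build such an input from a plain list of URLs. This is a sketch: `build_batch_input` is a hypothetical name, and the allowed output formats are copied from the `outputFormat` enum in the input schema.

```python
# Allowed values of `outputFormat`, taken from the input schema enum.
ALLOWED_OUTPUT_FORMATS = {"json", "xml", "csv", "yaml"}


def build_batch_input(urls, output_format="json"):
    """Build an Actor input that fetches each URL and auto-detects its format."""
    if output_format not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError(f"unsupported outputFormat: {output_format!r}")
    return {
        "conversionType": "auto",
        "outputFormat": output_format,
        # Left empty on purpose, following the schema's advice for URL-based runs.
        "inputData": "",
        # Each entry must be an object with a `url` key.
        "sourceUrls": [{"url": url} for url in urls],
    }
```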

#### Example 4: JSON → CSV with Flattening

```json
{
    "conversionType": "json2csv",
    "inputData": "[{\"id\":1,\"name\":\"Alice\",\"address\":{\"city\":\"NYC\",\"zip\":\"10001\"}}]",
    "flattenNested": true,
    "flattenSeparator": "."
}
```

#### Example 5: XML → JSON (Strip Namespaces)

```json
{
    "conversionType": "xml2json",
    "inputData": "<?xml version='1.0'?><catalog><book id='1'><title>Hello</title></book></catalog>",
    "xmlStripNamespaces": true
}
```

### 📤 Output Format

Each converted item is stored in the dataset with this structure:

```json
{
    "source": "inline_input",
    "conversion": "csv2json",
    "inputFormat": "csv",
    "outputFormat": "json",
    "timestamp": "2026-04-01T17:30:00.000Z",
    "status": "success",
    "error": null,
    "data": [ ... ]
}
```

- **`data`** — Parsed result (for JSON outputs)
- **`rawOutput`** — Raw string result (for XML/CSV/YAML outputs, or when `outputAsString` is true)
- **`status`** — `"success"` or `"failed"`
- **`error`** — Error message if conversion failed

Run statistics are stored in the Key-Value Store under the key `RUN_STATS`.
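
When consuming the dataset, it helps to separate successful items from failed ones and to pick whichever payload field is populated. The sketch below works on plain dictionaries shaped like the structure above. `split_results` is a hypothetical helper, and it assumes that only one of `data` and `rawOutput` is filled in for a given item.

```python
def split_results(items):
    """Split dataset items into (source, payload) results and (source, error) failures."""
    results, failures = [], []
    for item in items:
        if item.get("status") == "success":
            # Prefer the parsed `data`; fall back to the raw string output.
            payload = item["data"] if item.get("data") is not None else item.get("rawOutput")
            results.append((item.get("source"), payload))
        else:
            failures.append((item.get("source"), item.get("error")))
    return results, failures
```

The `items` argument can be the list returned by the API client for the run's default dataset.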

### 🛠 Local Development

```bash
# Clone and install
cd apify-data-converter
pip install -r requirements.txt

# Run locally with the Apify CLI
apify run --input-file=input.json
```

### 📦 Dependencies

- `apify` — Apify SDK for Python
- `httpx` — Async HTTP client for URL fetching
- `pyyaml` — YAML parsing and serialization
- `beautifulsoup4` + `lxml` — HTML parsing
- `html5lib` — Lenient HTML parser for broken markup

# Actor input Schema

## `conversionType` (type: `string`):

The source→target format conversion to perform. Use 'auto' to auto-detect input format (output inferred from outputFormat).

## `outputFormat` (type: `string`):

When conversionType is 'auto', specify the desired output format. Ignored if a specific conversion is chosen.

## `inputData` (type: `string`):

Paste raw data here (HTML, XML, CSV, YAML, or JSON). Leave empty if using sourceUrls to fetch data from URLs instead.

## `sourceUrls` (type: `array`):

List of URLs to fetch data from. Each URL is fetched, converted, and stored as a separate result. Leave empty if pasting data directly in inputData.

## `csvDelimiter` (type: `string`):

Column separator for CSV input/output. Common: ',' (comma), '\t' (tab), ';' (semicolon), '|' (pipe).

## `csvHasHeader` (type: `boolean`):

If enabled, the first row of CSV data is treated as column names. If disabled, data is returned as arrays of arrays.
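
The effect of `csvDelimiter` and `csvHasHeader` can be illustrated with Python's standard `csv` module. This is a minimal sketch of the described behavior, not the Actor's code; `parse_csv` is a hypothetical name.

```python
import csv
import io


def parse_csv(text, delimiter=",", has_header=True):
    """Return a list of dicts (header mode) or a list of lists (no header)."""
    stream = io.StringIO(text)
    if has_header:
        # The first row supplies the keys for every following row.
        return [dict(row) for row in csv.DictReader(stream, delimiter=delimiter)]
    # Without a header, every row is returned as a plain list of strings.
    return list(csv.reader(stream, delimiter=delimiter))
```

All values are still strings at this stage; type-casting is a separate step.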

## `typeCast` (type: `boolean`):

Automatically convert CSV string values to appropriate types: numbers become int/float, 'true'/'false' become booleans, 'null'/'none' become null.
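
The casting rules above can be approximated as follows. This is a sketch under assumptions: `cast_value` is a hypothetical name, matching is case-insensitive, and any value that fits no rule (including an empty string) is returned unchanged.

```python
def cast_value(value: str):
    """Cast a CSV string to bool, None, int, or float; otherwise return it unchanged."""
    text = value.strip()
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if lowered in ("null", "none"):
        return None
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return value
```

One caveat of relying on Python's built-in `float()`: it also accepts strings such as `nan` and `inf`, which a stricter implementation might want to keep as text.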

## `flattenNested` (type: `boolean`):

When converting JSON → CSV, flatten nested objects into dot-separated column names. For example, `{"address": {"city": "NYC"}}` becomes column `address.city`.

## `flattenSeparator` (type: `string`):

Character(s) used to join nested key names when flattening. e.g., '.' gives 'parent.child', '\_' gives 'parent\_child'.
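
Flattening with a configurable separator can be sketched with a short recursive function. `flatten` is a hypothetical name, and leaving lists untouched is an assumption, since the description only covers nested objects.

```python
def flatten(obj: dict, separator: str = ".", prefix: str = "") -> dict:
    """Flatten nested dicts into a single level with joined key names."""
    flat = {}
    for key, value in obj.items():
        column = f"{prefix}{separator}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the joined key as the prefix.
            flat.update(flatten(value, separator, column))
        else:
            # Scalars (and lists, in this sketch) are stored as-is.
            flat[column] = value
    return flat
```

For the record from Example 4, this yields the columns `id`, `name`, `address.city`, and `address.zip`.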

## `xmlRootTag` (type: `string`):

Name for the root XML element when converting to XML format.

## `xmlListItemTag` (type: `string`):

Tag name used to wrap each array item when converting JSON arrays to XML.

## `xmlDeclaration` (type: `boolean`):

Add the `<?xml version='1.0' encoding='utf-8'?>` header to XML output.
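
How `xmlRootTag`, `xmlListItemTag`, and `xmlDeclaration` shape the XML output can be sketched with the standard library. This is an illustration only: `json_to_xml` and `_build` are hypothetical names, and the mapping (object keys become child elements, array entries are wrapped in the item tag) is an assumption about the output layout.

```python
import xml.etree.ElementTree as ET


def _build(parent, value, item_tag):
    """Recursively attach a JSON-like value under an XML element."""
    if isinstance(value, dict):
        for key, child in value.items():
            _build(ET.SubElement(parent, str(key)), child, item_tag)
    elif isinstance(value, list):
        # Each array entry is wrapped in the configured item tag.
        for child in value:
            _build(ET.SubElement(parent, item_tag), child, item_tag)
    elif value is not None:
        parent.text = str(value)


def json_to_xml(data, root_tag="root", item_tag="item", declaration=True):
    """Serialize JSON-like data to an XML string (unindented)."""
    root = ET.Element(root_tag)
    _build(root, data, item_tag)
    body = ET.tostring(root, encoding="unicode")
    header = "<?xml version='1.0' encoding='utf-8'?>\n" if declaration else ""
    return header + body
```

The sketch assumes that object keys are valid XML element names; keys with spaces or leading digits would need sanitizing first.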

## `xmlStripNamespaces` (type: `boolean`):

Remove namespace prefixes from XML tag names when converting XML → JSON for cleaner output.
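
As an illustration of namespace stripping, the sketch below uses `xml.etree.ElementTree`, which expands a prefixed tag such as `c:book` into `{namespace-uri}book`. Removing everything up to the closing brace leaves the bare tag name. `strip_namespaces` is a hypothetical helper, and the Actor may implement this differently.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Remove the '{uri}' part from every element tag in the tree, in place."""
    for element in root.iter():
        if isinstance(element.tag, str) and "}" in element.tag:
            element.tag = element.tag.split("}", 1)[1]
    return root
```

This sketch only rewrites element tags. Namespaced attribute names are left untouched.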

## `htmlExtractTables` (type: `boolean`):

Instead of parsing the full DOM tree, extract only `<table>` elements into structured JSON arrays. Ideal for scraping tabular data from web pages.
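
Table extraction can be sketched with the standard-library `html.parser` so that the example stays dependency-free. This is not the Actor's BeautifulSoup-based implementation: `TableExtractor` and `extract_tables` are hypothetical names, the first row is assumed to be the header, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every table as a list of rows."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table>
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Return one list of row dicts per table, keyed by the first row's cells."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    records = []
    for rows in parser.tables:
        if not rows:
            continue
        header, *body = rows
        records.append([dict(zip(header, row)) for row in body])
    return records
```

Cell values stay as strings here, so `"30"` is not converted to a number.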

## `htmlParser` (type: `string`):

BeautifulSoup parser backend. 'lxml' is fastest, 'html.parser' needs no extra deps, 'html5lib' handles broken HTML best.

## `yamlMultiDoc` (type: `boolean`):

Parse all documents in a multi-document YAML stream (separated by ---). Returns a JSON array of documents.

## `indent` (type: `integer`):

Number of spaces for pretty-printing JSON/XML/YAML output. Set to 0 for compact output.

## `minify` (type: `boolean`):

Produce compact output with no indentation or extra whitespace. Overrides the indent setting.
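
For JSON output, the interaction between `indent` and `minify` can be sketched with `json.dumps`. `render_json` is a hypothetical name. Python's `indent=0` still inserts newlines, so this sketch treats `0` as compact to match the description above.

```python
import json


def render_json(data, indent=2, minify=False):
    """Serialize data as pretty-printed or compact JSON; minify wins over indent."""
    if minify or indent == 0:
        # Compact separators drop the spaces after ',' and ':'.
        return json.dumps(data, separators=(",", ":"), ensure_ascii=False)
    return json.dumps(data, indent=indent, ensure_ascii=False)
```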

## `outputAsString` (type: `boolean`):

Store the converted result as a single raw string field instead of parsed objects. Useful for XML, CSV, and YAML outputs that aren't natively JSON.

## `proxyConfiguration` (type: `object`):

Proxy settings for fetching source URLs. Leave empty for direct connections.

## Actor input object example

```json
{
  "conversionType": "auto",
  "outputFormat": "json",
  "inputData": "name,age,city,role,salary\nAlice Johnson,30,New York,Engineer,125000.50\nBob Smith,25,London,Designer,98000.00\nCharlie Chen,35,Tokyo,Manager,135000.75\nDiana Patel,28,Berlin,Analyst,92000.00\nEthan Williams,32,Sydney,Developer,110000.25",
  "sourceUrls": [],
  "csvDelimiter": ",",
  "csvHasHeader": true,
  "typeCast": true,
  "flattenNested": true,
  "flattenSeparator": ".",
  "xmlRootTag": "root",
  "xmlListItemTag": "item",
  "xmlDeclaration": true,
  "xmlStripNamespaces": true,
  "htmlExtractTables": false,
  "htmlParser": "lxml",
  "yamlMultiDoc": false,
  "indent": 2,
  "minify": false,
  "outputAsString": false,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.
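
Without a client library, the `run-sync-get-dataset-items` endpoint from the OpenAPI specification can be called directly over HTTP. The sketch below uses only the Python standard library. `build_request` and `run_actor` are hypothetical names, and parsing the response as JSON is an assumption, since the specification does not define a response schema for this endpoint.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "moving_beacon-owner1~my-actor-63"  # the API path uses '~' instead of '/'


def build_request(actor_input, token):
    """Build the POST request that runs the Actor and returns its dataset items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(actor_input, token):
    """Send the request and decode the response body as JSON."""
    with urllib.request.urlopen(build_request(actor_input, token)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_actor({"conversionType": "auto"}, "<YOUR_API_TOKEN>")` performs a real, billable run, so it is kept separate from the request-building step.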

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "conversionType": "auto",
    "outputFormat": "json",
    "inputData": `name,age,city,role,salary
Alice Johnson,30,New York,Engineer,125000.50
Bob Smith,25,London,Designer,98000.00
Charlie Chen,35,Tokyo,Manager,135000.75
Diana Patel,28,Berlin,Analyst,92000.00
Ethan Williams,32,Sydney,Developer,110000.25`,
    "sourceUrls": [],
    "csvDelimiter": ",",
    "csvHasHeader": true,
    "typeCast": true,
    "flattenNested": true,
    "flattenSeparator": ".",
    "xmlRootTag": "root",
    "xmlListItemTag": "item",
    "xmlDeclaration": true,
    "xmlStripNamespaces": true,
    "htmlExtractTables": false,
    "htmlParser": "lxml",
    "yamlMultiDoc": false,
    "indent": 2,
    "minify": false,
    "outputAsString": false,
    "proxyConfiguration": {
        "useApifyProxy": false
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("moving_beacon-owner1/my-actor-63").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "conversionType": "auto",
    "outputFormat": "json",
    "inputData": """name,age,city,role,salary
Alice Johnson,30,New York,Engineer,125000.50
Bob Smith,25,London,Designer,98000.00
Charlie Chen,35,Tokyo,Manager,135000.75
Diana Patel,28,Berlin,Analyst,92000.00
Ethan Williams,32,Sydney,Developer,110000.25""",
    "sourceUrls": [],
    "csvDelimiter": ",",
    "csvHasHeader": True,
    "typeCast": True,
    "flattenNested": True,
    "flattenSeparator": ".",
    "xmlRootTag": "root",
    "xmlListItemTag": "item",
    "xmlDeclaration": True,
    "xmlStripNamespaces": True,
    "htmlExtractTables": False,
    "htmlParser": "lxml",
    "yamlMultiDoc": False,
    "indent": 2,
    "minify": False,
    "outputAsString": False,
    "proxyConfiguration": { "useApifyProxy": False },
}

# Run the Actor and wait for it to finish
run = client.actor("moving_beacon-owner1/my-actor-63").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
# A quoted heredoc passes the JSON through unchanged in any shell,
# so each \n below reaches the Actor as a JSON newline escape.
cat <<'EOF' | apify call moving_beacon-owner1/my-actor-63 --silent --output-dataset
{
  "conversionType": "auto",
  "outputFormat": "json",
  "inputData": "name,age,city,role,salary\nAlice Johnson,30,New York,Engineer,125000.50\nBob Smith,25,London,Designer,98000.00\nCharlie Chen,35,Tokyo,Manager,135000.75\nDiana Patel,28,Berlin,Analyst,92000.00\nEthan Williams,32,Sydney,Developer,110000.25",
  "sourceUrls": [],
  "csvDelimiter": ",",
  "csvHasHeader": true,
  "typeCast": true,
  "flattenNested": true,
  "flattenSeparator": ".",
  "xmlRootTag": "root",
  "xmlListItemTag": "item",
  "xmlDeclaration": true,
  "xmlStripNamespaces": true,
  "htmlExtractTables": false,
  "htmlParser": "lxml",
  "yamlMultiDoc": false,
  "indent": 2,
  "minify": false,
  "outputAsString": false,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
EOF

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=moving_beacon-owner1/my-actor-63",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Universal Data Structure Converter",
        "description": "A production-grade Apify actor that converts between HTML, XML, CSV, YAML, and JSON formats. Supports 9+ conversion types with smart auto-detection, nested JSON flattening, HTML table scraping, batch URL processing, and full customization.",
        "version": "0.0",
        "x-build-id": "yyQBXmnvzYCFbrEll"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/moving_beacon-owner1~my-actor-63/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-moving_beacon-owner1-my-actor-63",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/moving_beacon-owner1~my-actor-63/runs": {
            "post": {
                "operationId": "runs-sync-moving_beacon-owner1-my-actor-63",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/moving_beacon-owner1~my-actor-63/run-sync": {
            "post": {
                "operationId": "run-sync-moving_beacon-owner1-my-actor-63",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "conversionType"
                ],
                "properties": {
                    "conversionType": {
                        "title": "Conversion Type",
                        "enum": [
                            "auto",
                            "html2json",
                            "xml2json",
                            "csv2json",
                            "yaml2json",
                            "json2xml",
                            "json2csv",
                            "json2yaml",
                            "yaml2xml",
                            "csv2xml"
                        ],
                        "type": "string",
                        "description": "The source→target format conversion to perform. Use 'auto' to auto-detect input format (output inferred from outputFormat).",
                        "default": "auto"
                    },
                    "outputFormat": {
                        "title": "Output Format (for Auto-Detect)",
                        "enum": [
                            "json",
                            "xml",
                            "csv",
                            "yaml"
                        ],
                        "type": "string",
                        "description": "When conversionType is 'auto', specify the desired output format. Ignored if a specific conversion is chosen.",
                        "default": "json"
                    },
                    "inputData": {
                        "title": "Input Data",
                        "type": "string",
                        "description": "Paste raw data here (HTML, XML, CSV, YAML, or JSON). Leave empty if using sourceUrls to fetch data from URLs instead.",
                        "default": "name,age,city,role,salary\nAlice Johnson,30,New York,Engineer,125000.50\nBob Smith,25,London,Designer,98000.00\nCharlie Chen,35,Tokyo,Manager,135000.75\nDiana Patel,28,Berlin,Analyst,92000.00\nEthan Williams,32,Sydney,Developer,110000.25"
                    },
                    "sourceUrls": {
                        "title": "Source URLs",
                        "type": "array",
                        "description": "List of URLs to fetch data from. Each URL is fetched, converted, and stored as a separate result. Leave empty if pasting data directly in inputData.",
                        "default": [],
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "csvDelimiter": {
                        "title": "CSV Delimiter",
                        "enum": [
                            ",",
                            "\t",
                            ";",
                            "|"
                        ],
                        "type": "string",
                        "description": "Column separator for CSV input/output. Common: ',' (comma), '\\t' (tab), ';' (semicolon), '|' (pipe).",
                        "default": ","
                    },
                    "csvHasHeader": {
                        "title": "CSV Has Header Row",
                        "type": "boolean",
                        "description": "If enabled, the first row of CSV data is treated as column names. If disabled, data is returned as arrays of arrays.",
                        "default": true
                    },
                    "typeCast": {
                        "title": "Auto Type-Cast CSV Values",
                        "type": "boolean",
                        "description": "Automatically convert CSV string values to appropriate types: numbers become int/float, 'true'/'false' become booleans, 'null'/'none' become null.",
                        "default": true
                    },
                    "flattenNested": {
                        "title": "Flatten Nested JSON for CSV",
                        "type": "boolean",
                        "description": "When converting JSON → CSV, flatten nested objects into dot-separated column names. e.g., {\"address\": {\"city\": \"NYC\"}} becomes column 'address.city'.",
                        "default": true
                    },
                    "flattenSeparator": {
                        "title": "Flatten Key Separator",
                        "type": "string",
                        "description": "Character(s) used to join nested key names when flattening. e.g., '.' gives 'parent.child', '_' gives 'parent_child'.",
                        "default": "."
                    },
                    "xmlRootTag": {
                        "title": "XML Root Tag",
                        "type": "string",
                        "description": "Name for the root XML element when converting to XML format.",
                        "default": "root"
                    },
                    "xmlListItemTag": {
                        "title": "XML List Item Tag",
                        "type": "string",
                        "description": "Tag name used to wrap each array item when converting JSON arrays to XML.",
                        "default": "item"
                    },
                    "xmlDeclaration": {
                        "title": "Include XML Declaration",
                        "type": "boolean",
                        "description": "Add <?xml version='1.0' encoding='utf-8'?> header to XML output.",
                        "default": true
                    },
                    "xmlStripNamespaces": {
                        "title": "Strip XML Namespaces",
                        "type": "boolean",
                        "description": "Remove namespace prefixes from XML tag names when converting XML → JSON for cleaner output.",
                        "default": true
                    },
                    "htmlExtractTables": {
                        "title": "HTML Table Extraction Mode",
                        "type": "boolean",
                        "description": "Instead of parsing the full DOM tree, extract only <table> elements into structured JSON arrays. Ideal for scraping tabular data from web pages.",
                        "default": false
                    },
                    "htmlParser": {
                        "title": "HTML Parser Engine",
                        "enum": [
                            "lxml",
                            "html.parser",
                            "html5lib"
                        ],
                        "type": "string",
                        "description": "BeautifulSoup parser backend. 'lxml' is fastest, 'html.parser' needs no extra deps, 'html5lib' handles broken HTML best.",
                        "default": "lxml"
                    },
                    "yamlMultiDoc": {
                        "title": "YAML Multi-Document Mode",
                        "type": "boolean",
                        "description": "Parse all documents in a multi-document YAML stream (separated by ---). Returns a JSON array of documents.",
                        "default": false
                    },
                    "indent": {
                        "title": "Output Indentation",
                        "minimum": 0,
                        "maximum": 8,
                        "type": "integer",
                        "description": "Number of spaces for pretty-printing JSON/XML/YAML output. Set to 0 for compact output.",
                        "default": 2
                    },
                    "minify": {
                        "title": "Minify Output",
                        "type": "boolean",
                        "description": "Produce compact output with no indentation or extra whitespace. Overrides the indent setting.",
                        "default": false
                    },
                    "outputAsString": {
                        "title": "Output as Raw String",
                        "type": "boolean",
                        "description": "Store the converted result as a single raw string field instead of parsed objects. Useful for XML, CSV, and YAML outputs that aren't natively JSON.",
                        "default": false
                    },
                    "proxyConfiguration": {
                        "title": "Proxy Configuration",
                        "type": "object",
                        "description": "Proxy settings for fetching source URLs. Leave empty for direct connections.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
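The output-related input fields in the schema above interact: the indentation field accepts an integer from 0 to 8 (default 2), and `minify` is described as overriding it. The following is a minimal sketch of how a caller might assemble and sanity-check these fields before starting a run. The helper names `build_output_options` and `effective_indent` are hypothetical. The key name `indent` is an assumption, because the property key for "Output Indentation" is not visible in this part of the schema. How the Actor resolves these options internally is not documented here, so treat `effective_indent` as an interpretation of the field descriptions only.

```python
from typing import Any, Dict, Optional

# Bounds and defaults copied from the input schema above.
INDENT_MIN = 0
INDENT_MAX = 8


def build_output_options(
    indent: int = 2,
    minify: bool = False,
    output_as_string: bool = False,
    proxy_configuration: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the output-related part of the Actor input (hypothetical helper).

    The key "indent" is assumed; the other keys appear in the schema.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(indent, bool) or not isinstance(indent, int):
        raise TypeError("indent must be an integer")
    if not INDENT_MIN <= indent <= INDENT_MAX:
        raise ValueError(f"indent must be between {INDENT_MIN} and {INDENT_MAX}")
    if proxy_configuration is None:
        # Schema default: direct connections, no Apify Proxy.
        proxy_configuration = {"useApifyProxy": False}
    return {
        "indent": indent,
        "minify": minify,
        "outputAsString": output_as_string,
        "proxyConfiguration": proxy_configuration,
    }


def effective_indent(options: Dict[str, Any]) -> int:
    """Indentation one would expect given that minify overrides indent."""
    return 0 if options["minify"] else options["indent"]


if __name__ == "__main__":
    opts = build_output_options(indent=4, minify=True)
    print(opts)
    print("expected indentation:", effective_indent(opts))
```

The returned dictionary could be merged with the source and conversion fields described earlier in the schema and passed as the run input through the Apify client. Validating the range locally simply avoids submitting a run whose input the platform would reject against the schema.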
