Universal Data Structure Converter
Pricing
from $10.00 / 1,000 results
A production-grade Apify actor that converts between HTML, XML, CSV, YAML, and JSON formats. Supports 9+ conversion types with smart auto-detection, nested JSON flattening, HTML table scraping, batch URL processing, and full customization.
Developer
Jamshaid Arif
Universal Data Structure Converter: Apify Actor
Supported Conversions
| # | Conversion | Description |
|---|---|---|
| 1 | HTML → JSON | Parse the DOM tree or extract `<table>` data |
| 2 | XML → JSON | Full tree with attributes & namespaces |
| 3 | CSV → JSON | With auto type-casting (int/float/bool) |
| 4 | YAML → JSON | Single or multi-document streams |
| 5 | JSON → XML | Custom root/item tags, XML declaration |
| 6 | JSON → CSV | Nested object flattening to dot-columns |
| 7 | JSON → YAML | Block or flow style output |
| 8 | YAML → XML | Chained (YAML → JSON → XML) |
| 9 | CSV → XML | Chained (CSV → JSON → XML) |
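The chained conversions in rows 8 and 9 can be pictured as two steps through an intermediate list of records. The following is a minimal sketch using only the Python standard library, not the actor's actual code. The function names `csv_to_records` and `records_to_xml` are hypothetical, and the `root` and `item` tags simply mirror the documented defaults.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_records(text, delimiter=","):
    """Step 1: parse CSV text into a list of dicts (the JSON-like intermediate)."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def records_to_xml(records, root_tag="root", item_tag="item"):
    """Step 2: serialize the intermediate records as an XML string."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for key, value in record.items():
            # Assumes column names are valid XML element names.
            ET.SubElement(item, key).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = records_to_xml(csv_to_records("name,age\nAlice,30\n"))
print(xml_text)
# <root><item><name>Alice</name><age>30</age></item></root>
```

Going through a shared intermediate means each source format only needs one parser and each target format only one serializer, which is presumably why the actor chains these conversions rather than implementing them directly.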
Key Features
- Auto-Detection: Set `conversionType` to `auto` and the actor detects whether the input is HTML, XML, JSON, YAML, or CSV
- URL Fetching: Provide a list of URLs to fetch and convert in batch
- HTML Table Scraping: Extract `<table>` elements directly into structured JSON arrays
- Smart Type-Casting: CSV values like `"30"`, `"true"`, `"99.5"` are auto-cast to `int`, `bool`, `float`
- Nested Flattening: `{"a": {"b": 1}}` becomes CSV column `a.b` when exporting JSON → CSV
- Proxy Support: Use Apify Proxy for fetching URLs behind firewalls
- Custom Delimiters: Comma, tab, semicolon, or pipe for CSV input/output
- Pretty-Print or Minify: Configurable indentation or compact output
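How the actor's auto-detection works internally is not documented here. As an illustration only, a detector of this kind could apply a few ordered heuristics, as in the sketch below. The function name `detect_format`, the list of HTML markers, and the order of the checks are all assumptions.

```python
import json


def detect_format(text):
    """Guess the format of `text` using simple, ordered heuristics."""
    stripped = text.strip()
    if not stripped:
        return "unknown"
    # JSON: starts like an object or array and parses cleanly.
    if stripped[0] in "{[":
        try:
            json.loads(stripped)
            return "json"
        except ValueError:
            pass
    # Markup: HTML is recognized by a few common tags, anything else is XML.
    if stripped.startswith("<"):
        lowered = stripped[:200].lower()
        html_markers = ("<!doctype html", "<html", "<body", "<table", "<div")
        if any(marker in lowered for marker in html_markers):
            return "html"
        return "xml"
    # CSV: every line contains the same non-zero number of one delimiter.
    lines = stripped.splitlines()
    for delimiter in (",", "\t", ";", "|"):
        counts = {line.count(delimiter) for line in lines}
        if len(counts) == 1 and counts.pop() > 0:
            return "csv"
    # Fall back to YAML, which accepts almost any plain text.
    return "yaml"


print(detect_format("name,age\nAlice,30"))    # csv
print(detect_format("name: Alice\nage: 30"))  # yaml
```

Heuristics like these can misfire on ambiguous input (for example, CSV with quoted commas), so setting `conversionType` explicitly is the safer choice when the source format is known.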
Input Schema
| Parameter | Type | Default | Description |
|---|---|---|---|
| `conversionType` | string | `auto` | Conversion to perform (or `auto` to detect) |
| `outputFormat` | string | `json` | Target format when using auto-detect |
| `inputData` | string | (sample) | Raw data to convert (paste directly) |
| `sourceUrls` | array | `[]` | URLs to fetch and convert in batch |
| `csvDelimiter` | string | `,` | CSV column separator |
| `csvHasHeader` | boolean | `true` | Treat first CSV row as column names |
| `typeCast` | boolean | `true` | Auto-cast CSV strings to native types |
| `flattenNested` | boolean | `true` | Flatten nested JSON for CSV export |
| `flattenSeparator` | string | `.` | Separator for flattened key names |
| `xmlRootTag` | string | `root` | Root element name for XML output |
| `xmlListItemTag` | string | `item` | Tag for array items in XML output |
| `xmlDeclaration` | boolean | `true` | Include the XML `<?xml?>` header |
| `xmlStripNamespaces` | boolean | `true` | Remove namespace prefixes from XML tags |
| `htmlExtractTables` | boolean | `false` | Extract only `<table>` elements from HTML |
| `htmlParser` | string | `lxml` | BeautifulSoup parser engine |
| `yamlMultiDoc` | boolean | `false` | Parse multi-document YAML streams |
| `indent` | integer | `2` | Spaces for pretty-printing (0-8) |
| `minify` | boolean | `false` | Compact output (overrides `indent`) |
| `outputAsString` | boolean | `false` | Store result as raw string instead of parsed JSON |
| `proxyConfiguration` | object | disabled | Proxy settings for URL fetching |
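To make the effect of `typeCast` concrete, here is a minimal sketch of how CSV strings might be cast to native types. It is not the actor's implementation. The helper name `cast_value` and the exact rules (case-insensitive `true`/`false`, then `int`, then `float`) are assumptions.

```python
def cast_value(value):
    """Cast a CSV string to bool, int, or float when it parses cleanly."""
    text = value.strip()
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    # Try the narrower numeric type first so "30" stays an int.
    for caster in (int, float):
        try:
            return caster(text)
        except ValueError:
            continue
    return value  # anything else is left as the original string


row = {"age": "30", "active": "true", "score": "99.5", "name": "Alice"}
print({key: cast_value(val) for key, val in row.items()})
# {'age': 30, 'active': True, 'score': 99.5, 'name': 'Alice'}
```

Casting of this kind turns identifier-like values such as `"01234"` into the integer `1234`, which is one reason to set `typeCast` to `false` for columns like ZIP codes.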
Usage Examples
Example 1: Convert CSV → JSON (default)
Run the actor with its defaults. It ships with sample CSV data and auto-detects the conversion:

    {
      "conversionType": "auto",
      "outputFormat": "json"
    }
Example 2: HTML Table Scraping

    {
      "conversionType": "html2json",
      "inputData": "<table><tr><th>Name</th><th>Age</th></tr><tr><td>Alice</td><td>30</td></tr></table>",
      "htmlExtractTables": true
    }
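
The table in Example 2 can be turned into records with a small parser. The actor uses BeautifulSoup according to its dependency list. The sketch below deliberately swaps in the standard-library `html.parser` so it runs without extra packages. The class `TableParser` and function `table_to_records` are hypothetical names, and the exact output shape the actor emits may differ.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> in the document."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left open by malformed markup


def table_to_records(html):
    """Treat the first row as headers and return one dict per remaining row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


html = "<table><tr><th>Name</th><th>Age</th></tr><tr><td>Alice</td><td>30</td></tr></table>"
print(table_to_records(html))
# [{'Name': 'Alice', 'Age': '30'}]
```

Cell values stay strings in this sketch. Nested tables and `rowspan`/`colspan` are not handled.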
Example 3: Batch URL Processing

    {
      "conversionType": "auto",
      "outputFormat": "json",
      "sourceUrls": [
        { "url": "https://example.com/data.csv" },
        { "url": "https://api.example.com/config.yaml" }
      ]
    }
Example 4: JSON → CSV with Flattening

    {
      "conversionType": "json2csv",
      "inputData": "[{\"id\":1,\"name\":\"Alice\",\"address\":{\"city\":\"NYC\",\"zip\":\"10001\"}}]",
      "flattenNested": true,
      "flattenSeparator": "."
    }
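
For the input in Example 4, flattening with a `.` separator would be expected to produce the columns `id`, `name`, `address.city`, and `address.zip`. The following is a sketch of that idea with the standard library. The helper names `flatten` and `records_to_csv` are hypothetical, and how the actor treats arrays inside objects is not covered.

```python
import csv
import io
import json


def flatten(obj, separator=".", prefix=""):
    """Recursively flatten nested dicts into a single-level dict."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{separator}{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, separator, name))
        else:
            flat[name] = value
    return flat


def records_to_csv(records, separator="."):
    """Flatten each record and write all of them as CSV text."""
    rows = [flatten(record, separator) for record in records]
    # The header is the union of all keys, in first-seen order.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()


records = json.loads('[{"id":1,"name":"Alice","address":{"city":"NYC","zip":"10001"}}]')
print(records_to_csv(records))
# id,name,address.city,address.zip
# 1,Alice,NYC,10001
```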
Example 5: XML → JSON (Strip Namespaces)

    {
      "conversionType": "xml2json",
      "inputData": "<?xml version='1.0'?><catalog><book id='1'><title>Hello</title></book></catalog>",
      "xmlStripNamespaces": true
    }
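
One common way to map an XML tree to JSON while stripping namespaces is sketched below with `xml.etree.ElementTree`. This is an illustration, not the actor's code. The `@` prefix for attributes and the `#text` key are conventions assumed here, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def strip_namespace(name):
    """ElementTree spells namespaced names as '{uri}local'; keep only 'local'."""
    return name.rsplit("}", 1)[-1]


def element_to_dict(element):
    """Convert one element (and its subtree) to plain Python data."""
    node = {f"@{strip_namespace(k)}": v for k, v in element.attrib.items()}
    children = list(element)
    text = (element.text or "").strip()
    if not children:
        if not node:
            return text  # leaf without attributes: just its text
        if text:
            node["#text"] = text
        return node
    for child in children:
        key = strip_namespace(child.tag)
        value = element_to_dict(child)
        if key in node:
            # Repeated sibling tags are collected into a list.
            if not isinstance(node[key], list):
                node[key] = [node[key]]
            node[key].append(value)
        else:
            node[key] = value
    return node


def xml_to_dict(xml_text):
    root = ET.fromstring(xml_text)
    return {strip_namespace(root.tag): element_to_dict(root)}


sample = "<?xml version='1.0'?><catalog><book id='1'><title>Hello</title></book></catalog>"
print(xml_to_dict(sample))
# {'catalog': {'book': {'@id': '1', 'title': 'Hello'}}}
```

Stripping namespaces keeps keys short, but two elements with the same local name in different namespaces then collide, so it is worth disabling `xmlStripNamespaces` for documents that rely on that distinction.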
Output Format
Each converted item is stored in the dataset with this structure:

    {
      "source": "inline_input",
      "conversion": "csv2json",
      "inputFormat": "csv",
      "outputFormat": "json",
      "timestamp": "2026-04-01T17:30:00.000Z",
      "status": "success",
      "error": null,
      "data": [ ... ]
    }

- `data`: Parsed result (for JSON outputs)
- `rawOutput`: Raw string result (for XML/CSV/YAML outputs, or when `outputAsString` is true)
- `status`: `"success"` or `"failed"`
- `error`: Error message if conversion failed
Run statistics are stored in the Key-Value Store under the key RUN_STATS.
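When consuming dataset items downstream, the `status`, `data`, and `rawOutput` fields described above suggest a small guard like the one below. This is a sketch only. The helper name `extract_result` is hypothetical, and it assumes `data` is absent or `null` whenever the result is delivered in `rawOutput`, which the description above implies but does not state outright.

```python
def extract_result(item):
    """Return the converted payload of a dataset item, or raise if it failed."""
    if item.get("status") != "success":
        raise ValueError(
            f"conversion failed for {item.get('source')}: {item.get('error')}"
        )
    # Prefer the parsed result; fall back to the raw string output.
    if item.get("data") is not None:
        return item["data"]
    return item.get("rawOutput")


parsed_item = {"source": "inline_input", "status": "success", "error": None,
               "data": [{"name": "Alice"}]}
raw_item = {"source": "inline_input", "status": "success", "error": None,
            "rawOutput": "<root/>"}

print(extract_result(parsed_item))  # [{'name': 'Alice'}]
print(extract_result(raw_item))     # <root/>
```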
Local Development

    # Clone and install
    cd apify-data-converter
    pip install -r requirements.txt

    # Run locally with Apify CLI
    apify run --input-file=input.json
Dependencies
- `apify`: Apify SDK for Python
- `httpx`: Async HTTP client for URL fetching
- `pyyaml`: YAML parsing and serialization
- `beautifulsoup4` + `lxml`: HTML parsing
- `html5lib`: Lenient HTML parser for broken markup