Universal Data Structure Converter

Pricing

from $10.00 / 1,000 results

A production-grade Apify actor that converts between HTML, XML, CSV, YAML, and JSON formats. Supports 9+ conversion types with smart auto-detection, nested JSON flattening, HTML table scraping, batch URL processing, and full customization.

Rating

0.0

(0)

Developer

Jamshaid Arif

Maintained by Community

Actor stats

Bookmarked: 0
Total users: 1
Monthly active users: 0
Last modified: 3 days ago

🔄 Universal Data Structure Converter: Apify Actor

🌐 Supported Conversions

| # | Conversion | Description |
| --- | --- | --- |
| 1 | HTML → JSON | Parse DOM tree or extract <table> data |
| 2 | XML → JSON | Full tree with attributes & namespaces |
| 3 | CSV → JSON | With auto type-casting (int/float/bool) |
| 4 | YAML → JSON | Single or multi-document streams |
| 5 | JSON → XML | Custom root/item tags, XML declaration |
| 6 | JSON → CSV | Nested object flattening to dot-columns |
| 7 | JSON → YAML | Block or flow style output |
| 8 | YAML → XML | Chained (YAML → JSON → XML) |
| 9 | CSV → XML | Chained (CSV → JSON → XML) |
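
The chained conversions in rows 8 and 9 can be pictured as two stages joined by an intermediate list of records. The following is a minimal sketch of that idea using only the Python standard library; the function names are hypothetical and this is not the actor's own code. It assumes the CSV has a header row and that column names are valid XML tag names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_records(text, delimiter=","):
    """Stage 1: parse CSV text into a list of dicts (the JSON-like intermediate)."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def records_to_xml(records, root_tag="root", item_tag="item"):
    """Stage 2: serialize the intermediate records as XML.

    Assumption: every key is a valid XML tag name; ElementTree does not check this.
    """
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for key, value in record.items():
            ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="unicode")


def csv_to_xml(text):
    """Chain the two stages: CSV -> records -> XML."""
    return records_to_xml(csv_to_records(text))


print(csv_to_xml("name,age\nAlice,30\n"))
# <root><item><name>Alice</name><age>30</age></item></root>
```

The default tag names mirror the xmlRootTag and xmlListItemTag defaults; no XML declaration is emitted in this sketch.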

✨ Key Features

  • Auto-Detection: Set conversion to auto and the actor detects whether input is HTML, XML, JSON, YAML, or CSV
  • URL Fetching: Provide a list of URLs to fetch and convert in batch
  • HTML Table Scraping: Extract <table> elements directly into structured JSON arrays
  • Smart Type-Casting: CSV values like "30", "true", "99.5" auto-cast to int, bool, float
  • Nested Flattening: {"a": {"b": 1}} becomes CSV column a.b when exporting JSON → CSV
  • Proxy Support: Use Apify Proxy for fetching URLs behind firewalls
  • Custom Delimiters: Comma, tab, semicolon, pipe for CSV input/output
  • Pretty-Print or Minify: Configurable indentation or compact output
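
To make the type-casting feature concrete, here is one way such a cast could work. This is a sketch under stated assumptions, not the actor's implementation: the helper name cast_value is hypothetical, and the order of checks (bool, then int, then float) is a design guess.

```python
def cast_value(raw):
    """Cast a CSV string to bool, int, or float when it parses cleanly.

    Anything that does not parse is returned unchanged as a string.
    """
    text = raw.strip()
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    # Try int before float so "30" stays an integer rather than becoming 30.0.
    for converter in (int, float):
        try:
            return converter(text)
        except ValueError:
            continue
    return raw


row = {"name": "Alice", "age": "30", "active": "true", "score": "99.5"}
print({key: cast_value(value) for key, value in row.items()})
# {'name': 'Alice', 'age': 30, 'active': True, 'score': 99.5}
```

Python's float() also accepts strings such as "nan" and "inf", so a stricter version may want to exclude those explicitly.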

📋 Input Schema

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| conversionType | string | auto | Conversion to perform (or auto to detect) |
| outputFormat | string | json | Target format when using auto-detect |
| inputData | string | (sample) | Raw data to convert (paste directly) |
| sourceUrls | array | [] | URLs to fetch and convert in batch |
| csvDelimiter | string | , | CSV column separator |
| csvHasHeader | boolean | true | Treat first CSV row as column names |
| typeCast | boolean | true | Auto-cast CSV strings to native types |
| flattenNested | boolean | true | Flatten nested JSON for CSV export |
| flattenSeparator | string | . | Separator for flattened key names |
| xmlRootTag | string | root | Root element name for XML output |
| xmlListItemTag | string | item | Tag for array items in XML output |
| xmlDeclaration | boolean | true | Include XML <?xml?> header |
| xmlStripNamespaces | boolean | true | Remove namespace prefixes from XML tags |
| htmlExtractTables | boolean | false | Extract only <table> elements from HTML |
| htmlParser | string | lxml | BeautifulSoup parser engine |
| yamlMultiDoc | boolean | false | Parse multi-document YAML streams |
| indent | integer | 2 | Spaces for pretty-printing (0-8) |
| minify | boolean | false | Compact output (overrides indent) |
| outputAsString | boolean | false | Store result as raw string instead of parsed JSON |
| proxyConfiguration | object | disabled | Proxy settings for URL fetching |

🚀 Usage Examples

Example 1: Convert CSV → JSON (default)

Just run the actor with defaults: it ships with sample CSV data and auto-detects the conversion.

{
  "conversionType": "auto",
  "outputFormat": "json"
}
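
The README does not describe how auto-detection decides on a format. As an illustration only, a simple heuristic could inspect the first characters of the input; the function name detect_format and every rule in it are assumptions, not the actor's actual logic.

```python
import json


def detect_format(text):
    """Guess the input format from leading characters (heuristic sketch)."""
    stripped = text.lstrip()
    head = stripped[:200].lower()
    # Markup: treat obvious HTML markers as HTML, any other leading "<" as XML.
    if head.startswith("<!doctype html") or "<html" in head or "<table" in head:
        return "html"
    if stripped.startswith("<"):
        return "xml"
    # JSON: must start with an object or array and parse without error.
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass
    # CSV: a delimited first line that does not look like a YAML "key: value" pair.
    first_line = stripped.splitlines()[0] if stripped else ""
    if "," in first_line and ": " not in first_line:
        return "csv"
    # Fall back to YAML, which accepts most remaining plain text.
    return "yaml"


print(detect_format("name,age\nAlice,30"))  # csv
print(detect_format("key: value"))          # yaml
```

A heuristic like this only checks for commas, so it would misclassify tab- or semicolon-delimited CSV; a sturdier detector would try each candidate parser in turn.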

Example 2: HTML Table Scraping

{
  "conversionType": "html2json",
  "inputData": "<table><tr><th>Name</th><th>Age</th></tr><tr><td>Alice</td><td>30</td></tr></table>",
  "htmlExtractTables": true
}
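
To show what table extraction amounts to, the sketch below turns the table from Example 2 into a list of records. The actor lists BeautifulSoup among its dependencies; this sketch deliberately swaps in the standard library's html.parser so it runs without extra packages. The class and function names are hypothetical, and it assumes a single table whose first row holds the column names.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_records(html):
    """Use the first row as the header and map each later row onto it."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


html = "<table><tr><th>Name</th><th>Age</th></tr><tr><td>Alice</td><td>30</td></tr></table>"
print(table_to_records(html))
# [{'Name': 'Alice', 'Age': '30'}]
```

Cell values stay as strings here; whether the actor applies type-casting to HTML tables is not stated in the README.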

Example 3: Batch URL Processing

{
  "conversionType": "auto",
  "outputFormat": "json",
  "sourceUrls": [
    { "url": "https://example.com/data.csv" },
    { "url": "https://api.example.com/config.yaml" }
  ]
}

Example 4: JSON → CSV with Flattening

{
  "conversionType": "json2csv",
  "inputData": "[{\"id\":1,\"name\":\"Alice\",\"address\":{\"city\":\"NYC\",\"zip\":\"10001\"}}]",
  "flattenNested": true,
  "flattenSeparator": "."
}
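
For the input in Example 4, flattening would be expected to produce the columns id, name, address.city, and address.zip. The sketch below shows one way to get there with the standard library. The function names are hypothetical, and the handling of lists (left unflattened) is an assumption, since the README does not say how arrays inside objects are treated.

```python
import csv
import io
import json


def flatten(obj, separator=".", prefix=""):
    """Recursively collapse nested dicts into one level of joined keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{separator}{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, separator, name))
        else:
            flat[name] = value  # lists and scalars are kept as-is
    return flat


def json_to_csv(records, separator="."):
    """Flatten each record, then write all rows under the union of columns."""
    rows = [flatten(record, separator) for record in records]
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)  # preserve first-seen column order
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells
    return buffer.getvalue()


data = json.loads('[{"id":1,"name":"Alice","address":{"city":"NYC","zip":"10001"}}]')
print(json_to_csv(data))
# id,name,address.city,address.zip
# 1,Alice,NYC,10001
```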

Example 5: XML → JSON (Strip Namespaces)

{
  "conversionType": "xml2json",
  "inputData": "<?xml version='1.0'?><catalog><book id='1'><title>Hello</title></book></catalog>",
  "xmlStripNamespaces": true
}
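
The README does not show the JSON shape produced for XML input, so the following is only a sketch of one common mapping, built on xml.etree.ElementTree. The "@" prefix for attributes and the "#text" key for mixed content are assumed conventions, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def strip_namespace(tag):
    """Drop the '{uri}' prefix ElementTree puts on namespaced names."""
    return tag.rsplit("}", 1)[-1]


def element_to_dict(element):
    """Convert an element to a dict, or to a plain string for text-only leaves."""
    node = {f"@{strip_namespace(name)}": value for name, value in element.attrib.items()}
    for child in element:
        key = strip_namespace(child.tag)
        value = element_to_dict(child)
        if key in node:
            # Repeated child tags are collected into a list.
            if not isinstance(node[key], list):
                node[key] = [node[key]]
            node[key].append(value)
        else:
            node[key] = value
    text = (element.text or "").strip()
    if not node:
        return text
    if text:
        node["#text"] = text
    return node


def xml_to_json(text):
    root = ET.fromstring(text)
    return {strip_namespace(root.tag): element_to_dict(root)}


xml = "<?xml version='1.0'?><catalog><book id='1'><title>Hello</title></book></catalog>"
print(xml_to_json(xml))
# {'catalog': {'book': {'@id': '1', 'title': 'Hello'}}}
```

One consequence of this mapping is that a single child becomes a dict while repeated children become a list, so consumers need to handle both shapes.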

📤 Output Format

Each converted item is stored in the dataset with this structure:

{
  "source": "inline_input",
  "conversion": "csv2json",
  "inputFormat": "csv",
  "outputFormat": "json",
  "timestamp": "2026-04-01T17:30:00.000Z",
  "status": "success",
  "error": null,
  "data": [ ... ]
}

  • data: Parsed result (for JSON outputs)
  • rawOutput: Raw string result (for XML/CSV/YAML outputs, or when outputAsString is true)
  • status: "success" or "failed"
  • error: Error message if conversion failed

Run statistics are stored in the Key-Value Store under the key RUN_STATS.
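
When reading results back, a consumer typically has to check status first and then pick between data and rawOutput. The helper below sketches that logic against the item structure described above. The name extract_result is hypothetical, and preferring data over rawOutput when both are present is an assumption.

```python
def extract_result(item):
    """Return the converted payload from one dataset item, or raise on failure."""
    if item.get("status") != "success":
        raise ValueError(f"{item.get('source')}: {item.get('error')}")
    # Assumption: parsed JSON lives in "data"; string outputs live in "rawOutput".
    if item.get("data") is not None:
        return item["data"]
    return item.get("rawOutput")


ok_item = {
    "source": "inline_input",
    "conversion": "csv2json",
    "status": "success",
    "error": None,
    "data": [{"name": "Alice", "age": 30}],
}
print(extract_result(ok_item))
# [{'name': 'Alice', 'age': 30}]
```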

🛠 Local Development

# From the cloned project directory, install dependencies
cd apify-data-converter
pip install -r requirements.txt

# Run locally with the Apify CLI
apify run --input-file=input.json

📦 Dependencies

  • apify: Apify SDK for Python
  • httpx: Async HTTP client for URL fetching
  • pyyaml: YAML parsing and serialization
  • beautifulsoup4 + lxml: HTML parsing
  • html5lib: Lenient HTML parser for broken markup