JSON Validator & Formatter — Bulk Parse, Pretty-Print, Stats
Pricing
Pay per usage
26 runs / 1 paying user this week. Bulk JSON QA: paste strings or URLs. Output: isValid, formatted, minified, error, errorPosition, stats. Use for ETL validation and API contract testing. Trustpilot scraper 963 runs + 32-actor portfolio (2,190 lifetime runs). spinov001@gmail.com · blog.spinov.online · t.me/scraping_ai
Developer: Alex
JSON Validator & Formatter — Bulk JSON Parsing, Pretty-Print, Minify
Validate and format JSON from inline strings or URLs in a single Apify run. Catch syntax errors with byte-position context, get pretty-printed and minified output, and compute structural stats (object/array/depth/key counts, sizes, compression ratio) per input.
This actor uses Node's built-in JSON.parse — strict spec compliance, no schema validation, no auto-fix transforms. If you need JSON Schema validation, comment-stripping, or NDJSON streaming, request a custom build.
Use cases
- API QA — validate batches of API responses in one run, surface broken endpoints
- ETL gate-check — verify upstream JSON before downstream stages corrupt warehouse loads
- Bulk formatter — pretty-print or minify many payloads with consistent indent
- Spec drift detection — diff stats (key counts, depth) across runs to catch schema changes
Inputs
| Field | Type | Description |
|---|---|---|
| jsonStrings | string[] | Inline JSON strings to validate |
| urls | string[] | URLs that return JSON. Strings without a scheme get `https://` auto-prepended. Fetched via `fetch`: single attempt, no retries, no timeout (a long-hanging URL will block the run) |
| indent | integer | Spaces for pretty-printing (1–8, default 2) |
Either or both arrays can be populated. Empty inputs → no work done.
Output (per input)
| Field | Type | Notes |
|---|---|---|
| source | string | URL, or "input" for inline strings |
| isValid | boolean | `JSON.parse` success |
| formatted | string \| null | Pretty-printed (null if invalid) |
| minified | string \| null | Whitespace-stripped (null if invalid) |
| error | string \| null | Parse error message |
| errorPosition | number \| null | Byte offset of the first parse error, extracted via the regex `/position (\d+)/` against `JSON.parse`'s error message. The format is Node-version-dependent (v17+ reliably emits "at position N"); older Node versions may use a different format, in which case errorPosition stays null even though error carries useful text |
| errorContext | string \| null | ±20 chars around errorPosition (max 40 chars) |
| stats | object \| null | See below; present only when isValid: true |
| scrapedAt | string | ISO timestamp |
For URL-fetch failures (network error, DNS fail, refused connection), the record is abbreviated: only source, isValid: false, error, scrapedAt — no formatted/minified/stats/errorPosition/errorContext keys at all (asymmetric vs. parse-failure records). Normalize explicitly if you join.
stats object (always emitted with all 12 fields — leaf-type counters default to 0 even when none present):
| Field | Type | Description |
|---|---|---|
| objects | number | Total `{}` nodes |
| arrays | number | Total `[]` nodes |
| strings / numbers / booleans / nulls | number | Leaf-type counts (0 if none; explicitly emitted) |
| totalKeys | number | Sum of object keys across all levels |
| maxDepth | number | Deepest nesting level (0 = root scalar, N for N-deep nesting) |
| originalSize | number | Input length (chars) |
| formattedSize | number | Pretty-printed length |
| minifiedSize | number | Minified length |
| compressionRatio | number | minified / formatted (typically 0–1; small docs may exceed 1 if formatted is artificially compact) |
Example input
```json
{
  "jsonStrings": [
    "{\"name\":\"test\",\"items\":[1,2,3]}",
    "{ broken: 'json',"
  ],
  "urls": [
    "https://api.github.com/repos/vercel/next.js",
    "https://httpbin.org/json"
  ],
  "indent": 2
}
```
Example output (one valid record)
```json
{
  "source": "input",
  "isValid": true,
  "formatted": "{\n  \"name\": \"test\",\n  \"items\": [1, 2, 3]\n}",
  "minified": "{\"name\":\"test\",\"items\":[1,2,3]}",
  "error": null,
  "stats": {
    "objects": 1, "arrays": 1, "strings": 1, "numbers": 3,
    "booleans": 0, "nulls": 0, "totalKeys": 2, "maxDepth": 2,
    "originalSize": 31, "formattedSize": 41, "minifiedSize": 31,
    "compressionRatio": 0.756
  },
  "scrapedAt": "2026-04-29T10:30:00.000Z"
}
```
Limits & honest caveats
- No JSON Schema validation; only `JSON.parse` strictness.
- No auto-fix. Comments, trailing commas, single quotes, unquoted keys → invalid.
- URL fetch is single-attempt with no timeout. Node's global `fetch` has no default timeout, so a hanging URL will block the run until Apify's wall-clock limit kills it. No retries on transient errors.
- No proxy. URL fetches go out directly from Apify's compute IP.
- No HTTP status check before parse. A 404 HTML error page is fetched as text and fed to `JSON.parse`, which fails with a parse error rather than a network error. This is correct behavior for "is this URL returning valid JSON?" questions but counter-intuitive when triaging; check the `error` content carefully.
- No NDJSON / JSON-Lines mode. Each input must be one complete JSON document.
- Big-int precision loss. `JSON.parse` returns JavaScript numbers; integers larger than 2^53 (9,007,199,254,740,992) silently lose precision, which is common with database PKs serialized as raw integers in some APIs. Pre-process such inputs to strings if precision matters.
- Apify pushData record cap (~9 MB per record). Very large `formatted`/`minified` outputs from a single huge JSON input may be truncated by the platform; split documents larger than ~9 MB into chunks.
- `errorPosition` is best-effort. It depends on the Node v17+ error-message format and will be null on older runtimes.
Apify-as-a-Service — when you need data, not infrastructure
| Tier | Price | What you get |
|---|---|---|
| Pilot | $97 | 1 custom actor, basic config, 7-day support |
| Standard | $297 | Custom actor + Slack/email alerts, 30-day support |
| Premium | $797 | Custom actor + dashboard + 90-day support + 1 modification round |
A custom build can deliver: JSON Schema (Ajv) validation with full error reporting, NDJSON / JSON-Lines streaming, comment / trailing-comma tolerant parsing (json5), big-int support via JSON.parse reviver, fetch retry+timeout+proxy, recursive URL crawl with auth headers.
Email: spinov001@gmail.com — drop the validation rules, expected schema, or sample malformed inputs.
Proof of work: 31 published Apify scrapers (78 total in portfolio) — Trustpilot 949 runs, Reddit 80+, Google News 43, Glassdoor 37, Email Extractor 36+. Recently delivered a paid 3-article series for a client in the proxy industry ($150).
More tips: t.me/scraping_ai · blog.spinov.online
Related actors
| Actor | Returns |
|---|---|
| Website Tech Stack Detector | Technology fingerprints |
| SSL Certificate Checker | TLS data |
| WHOIS Domain Lookup | RDAP metadata |
| GitHub Profile Scraper | Profile + repo metadata |
All 31 published actors → apify.com/knotless_cadence
Honest disclosure: 12-field stats always emitted (leaf counters default to 0). errorPosition depends on Node v17+ message format. Single-attempt fetch, no timeout, no retry, no proxy. URL records on fetch failure are abbreviated (no formatted/minified/stats keys). Big-int precision loss above 2^53. Apify pushData ~9 MB cap per record.