JSON Schema Generator

Pricing

Pay per event

Generate JSON Schema (draft-07) from sample JSON instantly. Auto-detects types, required fields, nullable values, and nested structures. Export as JSON or YAML.

Developer

Stas Persiianenko

Maintained by Community

What does JSON Schema Generator do?

JSON Schema Generator instantly converts sample JSON objects into a valid JSON Schema (draft-07) — no coding required. Paste one or more JSON examples and get a complete, ready-to-use schema with correct types, required fields, nullable values, and nested structures.

It handles everything automatically: objects, arrays, nested structures, mixed types, null values, optional fields, and common string formats like email addresses, URLs, dates, and ISO timestamps. You can merge multiple samples into one unified schema (so optional fields are properly detected), or generate a separate schema for each sample.

The easiest way to try it: open the actor on Apify Store, paste your JSON into the input form, and click Run.

Who is JSON Schema Generator for?

Backend developers validating API responses — your service returns JSON from a third-party API you don't control. Paste a few example responses and get a schema to validate future payloads, catch breaking changes, and document the contract.

Frontend developers documenting data shapes — you're working with a new dataset or API and need to understand its structure. Feed the actor 10-20 real samples and instantly see which fields are always present vs optional, what types they are, and how nested objects are shaped.

Data engineers building pipelines — you receive raw JSON from event streams or webhooks and need to schema-validate before loading into a database. Generate a draft-07 schema in one run, then use it with ajv, jsonschema, or any validator.

QA engineers writing test fixtures — use the generated schema to verify test data is structurally correct, catch regressions when upstream data changes shape, and document expected output formats in CI.

No-code users exploring JSON datasets — if someone sent you a JSON file and you want to understand its structure without writing code, this actor produces a human-readable summary of every field, its type, and whether it's required.

Why use JSON Schema Generator?

  • 🚀 Instant results: schemas are typically generated in under 5 seconds
  • 🔀 Multi-sample merging: feed multiple JSON examples and optional fields are correctly detected (a field missing from some samples gets a nullable type such as ["string", "null"])
  • 🏷️ Auto format detection: strings are automatically tagged with format: email, format: uri, format: date, or format: date-time where applicable
  • 🎯 Correct required inference: a field is required only if it appears in ALL your samples AND is never null
  • 📦 Nested object and array support: recursively infers types for objects inside objects, arrays of primitives, and arrays of objects
  • 📝 JSON + YAML output: get the schema in JSON (default) or YAML format
  • 🔒 additionalProperties control: optionally generate strict schemas that reject unknown keys
  • 🌐 Runs via API: automate schema generation in your CI/CD pipeline or data validation workflow
  • ☁️ No software to install: runs on Apify cloud, works from any device
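
The required-field rule in the list above can be sketched in a few lines of Python. This is an illustrative sketch only, not the actor's source code: the function name infer_required is hypothetical, and it looks at top-level keys only.

```python
def infer_required(samples):
    """Return the top-level keys that are present and non-null in every sample.

    Hypothetical helper for illustration; the actor's real implementation
    is not shown in this document and may differ.
    """
    if not samples:
        return []
    required = None
    for sample in samples:
        # Keys that exist in this sample with a non-null value.
        present = {key for key, value in sample.items() if value is not None}
        required = present if required is None else required & present
    return sorted(required)


samples = [
    {"id": 1, "name": "Alice", "email": "alice@example.com", "age": 30, "isActive": True},
    {"id": 2, "name": "Bob", "age": None, "isActive": False},
]
print(infer_required(samples))
```

Here email is dropped because it is missing from the second sample, and age is dropped because it is null there.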

What data can you extract?

Each schema result in the dataset includes:

| Field | Type | Description |
| --- | --- | --- |
| sampleIndex | integer | Schema index (0 for merged, 1-N for per-sample) |
| schemaTitle | string | Title embedded in the schema (from your input or auto-generated) |
| fieldCount | integer | Number of top-level properties in the schema |
| requiredCount | integer | Number of fields marked as required |
| outputFormat | string | json or yaml |
| schema | string | The generated schema as a formatted string (JSON or YAML) |
| schemaObject | object | The schema as a raw JSON object (useful for downstream processing) |

Supported schema features:

  • 🔤 Type inference: string, integer, number, boolean, array, object, null
  • 📋 Required field detection from multi-sample overlap
  • 🔗 String formats: email, uri, date, date-time
  • 🏗️ Nested objects and arrays (recursive, any depth)
  • ⚙️ additionalProperties: false for strict schemas
  • 🎛️ Custom title and description in the schema root

How much does it cost to generate JSON schemas?

JSON Schema Generator uses Pay-Per-Event pricing — you only pay for what you generate. Prices scale down at higher Apify subscription tiers.

| Event | Free plan | Paid plans (from) |
| --- | --- | --- |
| Run start | $0.005 (one-time per run) | $0.005 |
| Schema generated | $0.0023 per schema | from $0.00056 |

Real-world cost examples (Free plan):

| Scenario | Schemas | Cost |
| --- | --- | --- |
| Single sample → 1 merged schema | 1 | ~$0.0073 |
| 10 samples merged | 1 | ~$0.0073 |
| 50 samples, per-sample mode | 50 | ~$0.1200 |
| 1000 samples, per-sample mode | 1000 | ~$2.305 |

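Free plan estimate: Apify gives new users $5 in free credits. That is enough for roughly 2,150 schemas on the free plan when they are generated in a single per-sample run (every additional run adds the $0.005 start fee), which is more than enough to evaluate the actor and set up your workflow.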
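
The arithmetic behind these figures is simple enough to script. The sketch below uses only the free-plan prices quoted above; the function names estimate_cost and max_schemas are hypothetical, and Decimal is used to avoid floating-point rounding noise.

```python
from decimal import Decimal

# Free-plan prices quoted in the pricing table above.
RUN_START = Decimal("0.005")
PER_SCHEMA = Decimal("0.0023")


def estimate_cost(schemas, runs=1):
    """Total cost in USD for a number of generated schemas spread over `runs` runs."""
    return runs * RUN_START + schemas * PER_SCHEMA


def max_schemas(budget, runs=1):
    """Largest number of schemas that fits in `budget` USD after paying the run-start fees."""
    remaining = Decimal(str(budget)) - runs * RUN_START
    return max(0, int(remaining // PER_SCHEMA))


print(estimate_cost(1))     # single merged schema
print(estimate_cost(50))    # 50 samples, per-sample mode
print(estimate_cost(1000))  # 1000 samples, per-sample mode
print(max_schemas(5))       # schemas covered by $5 in a single run
```

With one run, max_schemas(5) comes out at 2171, which is in line with the rough estimate above. Splitting the same work across many runs lowers that number, because each run pays the start fee again.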

Pricing is the same whether you run via the Apify Console or the API.

How to generate a JSON Schema from sample data

  1. Go to JSON Schema Generator on Apify Store
  2. Click Try for free (no credit card required for the free tier)
  3. In the JSON samples field, paste one or more JSON objects (as a JSON array)
  4. Optionally set a Schema title and choose JSON or YAML output format
  5. Configure options: toggle Infer required fields, Allow additional properties, and Merge all samples
  6. Click Start — the run completes in a few seconds
  7. Open the Dataset tab to download your schema as JSON, CSV, or Excel

Input JSON example (two user objects):

[
  {"id": 1, "name": "Alice", "email": "alice@example.com", "age": 30, "isActive": true},
  {"id": 2, "name": "Bob", "age": null, "isActive": false}
]

Generated schema output:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "User",
  "type": "object",
  "properties": {
    "id": { "type": "integer" },
    "name": { "type": "string" },
    "email": { "type": ["string", "null"], "format": "email" },
    "age": { "type": ["integer", "null"] },
    "isActive": { "type": "boolean" }
  },
  "required": ["id", "name", "isActive"]
}

Note that email is optional (present only in sample 1) and age is nullable (null in sample 2). Both are reflected as nullable types, ["string", "null"] and ["integer", "null"] respectively, and both are left out of required.
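
The nullable types in this example can be reproduced with a small merging sketch. It is an assumption-laden illustration, not the actor's code: the helper names json_type and merge_property_types are hypothetical, only top-level fields are handled, and format detection and integer/number widening are left out.

```python
def json_type(value):
    """Map a parsed JSON value to its JSON Schema type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must come before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def merge_property_types(samples):
    """Merge top-level field types; a missing or null field adds "null" to the type."""
    keys = []
    for sample in samples:
        for key in sample:
            if key not in keys:
                keys.append(key)
    merged = {}
    for key in keys:
        seen = []
        for sample in samples:
            type_name = json_type(sample[key]) if key in sample else "null"
            if type_name not in seen:
                seen.append(type_name)
        # Keep "null" last, as in the generated schema above.
        ordered = [t for t in seen if t != "null"] + (["null"] if "null" in seen else [])
        merged[key] = ordered[0] if len(ordered) == 1 else ordered
    return merged


samples = [
    {"id": 1, "name": "Alice", "email": "alice@example.com", "age": 30, "isActive": True},
    {"id": 2, "name": "Bob", "age": None, "isActive": False},
]
print(merge_property_types(samples))
```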

Input parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| samples | array | | Required. One or more JSON objects to generate a schema from |
| schemaTitle | string | "" | Optional title for the "title" field in the generated schema |
| schemaDescription | string | "" | Optional description for the "description" field |
| outputFormat | string | "json" | Output format: "json" or "yaml" |
| inferRequired | boolean | true | Mark fields as required if they appear in ALL samples and are never null |
| allowAdditionalProperties | boolean | true | When false, adds "additionalProperties": false to object schemas |
| mergeAllSamples | boolean | true | Merge all samples into one schema (recommended); when false, generates one schema per sample |

Multiple samples input format:

{
  "samples": [
    {"field1": "value1", "field2": 42},
    {"field1": "value2", "field3": true}
  ],
  "schemaTitle": "MySchema",
  "inferRequired": true,
  "mergeAllSamples": true
}
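
When building this input from code, a small helper can apply the documented defaults and catch typos in parameter names before a run is started. The helper below is a hypothetical convenience written for this guide (build_run_input is not part of any Apify SDK); the parameter names and defaults are taken from the table above.

```python
# Defaults copied from the input parameter table above.
DEFAULTS = {
    "schemaTitle": "",
    "schemaDescription": "",
    "outputFormat": "json",
    "inferRequired": True,
    "allowAdditionalProperties": True,
    "mergeAllSamples": True,
}


def build_run_input(samples, **overrides):
    """Return a complete actor input dict, rejecting unknown or invalid options."""
    if not isinstance(samples, list) or not samples:
        raise ValueError("samples must be a non-empty list of JSON objects")
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"unknown parameters: {', '.join(unknown)}")
    run_input = {"samples": samples, **DEFAULTS, **overrides}
    if run_input["outputFormat"] not in ("json", "yaml"):
        raise ValueError('outputFormat must be "json" or "yaml"')
    return run_input


run_input = build_run_input(
    [{"field1": "value1", "field2": 42}, {"field1": "value2", "field3": True}],
    schemaTitle="MySchema",
)
print(run_input["outputFormat"], run_input["mergeAllSamples"])
```

The resulting dictionary can be passed as run_input to the Python client call shown in the API section.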

Output examples

JSON output (default):

{
  "sampleIndex": 0,
  "schemaTitle": "Product",
  "fieldCount": 5,
  "requiredCount": 4,
  "outputFormat": "json",
  "schema": "{\n \"$schema\": \"http://json-schema.org/draft-07/schema#\",\n ...\n}",
  "schemaObject": {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Product",
    "type": "object",
    "properties": {
      "url": { "type": "string", "format": "uri" },
      "price": { "type": "number" },
      "inStock": { "type": "boolean" },
      "tags": { "type": "array", "items": { "type": "string" } },
      "sku": { "type": ["string", "null"] }
    },
    "required": ["url", "price", "inStock", "tags"]
  }
}

YAML output:

$schema: "http://json-schema.org/draft-07/schema#"
title: Product
type: object
properties:
  url:
    type: string
    format: uri
  price:
    type: number
  inStock:
    type: boolean
required:
  - url
  - price
  - inStock

Tips for best results

  • 🔢 Use at least 2-3 samples — a single sample won't distinguish required from optional fields. The more samples you provide, the more accurate the "required" inference.
  • 🎯 Include edge cases — if some records have null values, missing fields, or different types for the same field, include those samples. The schema will reflect the full range of observed data.
  • 🔗 Real URLs get format: uri — if your string field contains URLs in the samples, the generated schema will automatically include "format": "uri".
  • ⚠️ Large array items — when arrays contain many different object structures, the actor merges all item schemas. This is usually what you want, but if you have highly polymorphic arrays, consider separating them.
  • 🔒 Strict schemas — set allowAdditionalProperties: false when you want to reject unknown keys in validated data (e.g., API contract validation). Leave it true for lenient validation.
  • 📊 Per-sample mode — when you want to see differences across sample versions (e.g., comparing API v1 vs v2 responses), use mergeAllSamples: false to get one schema per record.
  • ⬇️ Download as JSON — the schemaObject field in each result is the raw JSON you can copy directly into your project.

Integrations

JSON Schema Generator → GitHub Actions (CI schema validation)

Generate schemas from production data samples, commit them to your repo, and run validation in CI. Every time your upstream API changes shape, your tests catch it:

  1. Run JSON Schema Generator on a set of real API response samples
  2. Download the schemaObject from the dataset
  3. Commit it to your repo as schemas/api-response.json
  4. In CI, validate new responses against the stored schema with ajv
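
Steps 2 and 3 can be automated with a short script. The sketch below assumes you already have a dataset item (a dict with a schemaObject key, as in the output example earlier); the function name save_schema is hypothetical, and the default path simply mirrors the one used in step 3.

```python
import json
from pathlib import Path


def save_schema(item, path="schemas/api-response.json"):
    """Write item["schemaObject"] to `path`; return True if the file changed.

    Keys are sorted so that re-running on identical data produces an
    identical file and no spurious diff in version control.
    """
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    text = json.dumps(item["schemaObject"], indent=2, sort_keys=True) + "\n"
    changed = not target.exists() or target.read_text(encoding="utf-8") != text
    if changed:
        target.write_text(text, encoding="utf-8")
    return changed
```

A CI job could call save_schema and fail (or open a pull request) whenever it returns True, since that means the upstream data changed shape.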

JSON Schema Generator → Google Sheets (documentation workflow)

Teams maintaining API documentation use this actor to auto-generate field documentation:

  1. Paste 10-20 API response examples into the actor
  2. Export the dataset to Google Sheets via Apify's Google Sheets integration
  3. The fieldCount, requiredCount, and formatted schema columns give a clear table of what changed across schema versions

JSON Schema Generator → Make / Zapier (automated schema monitoring)

Monitor a webhook payload for schema drift:

  1. Schedule the actor to run daily with fresh samples
  2. Connect the Apify trigger in Make or Zapier
  3. Compare fieldCount and requiredCount against baseline — alert via Slack when they change
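
The comparison in step 3 could look like the sketch below. It assumes you have stored one earlier dataset item as a baseline; detect_drift and drift_message are hypothetical names, and sending the message to Slack is left to your Make or Zapier scenario.

```python
def detect_drift(baseline, current, fields=("fieldCount", "requiredCount")):
    """Return {field: (old, new)} for every watched field whose value changed."""
    return {
        name: (baseline.get(name), current.get(name))
        for name in fields
        if baseline.get(name) != current.get(name)
    }


def drift_message(drift):
    """Format a one-line alert, or return None when nothing changed."""
    if not drift:
        return None
    parts = [f"{name}: {old} -> {new}" for name, (old, new) in sorted(drift.items())]
    return "Schema drift detected: " + ", ".join(parts)


baseline = {"fieldCount": 5, "requiredCount": 4}
current = {"fieldCount": 6, "requiredCount": 4}
print(drift_message(detect_drift(baseline, current)))
```

Counts are a coarse signal: a renamed field leaves both numbers unchanged, so comparing the full schemaObject is more reliable when that matters.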

JSON Schema Generator → Apify Dataset → Your database

Use the schemaObject field directly from the Apify dataset API to automate schema ingestion into your app:

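// Assumes `client` is an ApifyClient instance, `datasetId` is the run's
// default dataset ID, and `db` is your own database handle.
const dataset = await client.dataset(datasetId).listItems();
const schema = dataset.items[0].schemaObject;
await db.schemas.insert({ name: 'my-schema', schema });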

Using the Apify API

Run JSON Schema Generator programmatically from your code — no browser needed.

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_APIFY_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/json-schema-generator').call({
    samples: [
        { id: 1, name: 'Alice', email: 'alice@example.com', age: 30 },
        { id: 2, name: 'Bob', age: null },
    ],
    schemaTitle: 'User',
    outputFormat: 'json',
    inferRequired: true,
});

// Read the generated schema from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items[0].schemaObject);

Python

from apify_client import ApifyClient

client = ApifyClient(token='YOUR_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/json-schema-generator').call(run_input={
    'samples': [
        {'id': 1, 'name': 'Alice', 'email': 'alice@example.com', 'age': 30},
        {'id': 2, 'name': 'Bob', 'age': None},
    ],
    'schemaTitle': 'User',
    'outputFormat': 'json',
    'inferRequired': True,
})

# Read the generated schema from the run's default dataset.
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items[0]['schemaObject'])

cURL

curl -s -X POST \
  "https://api.apify.com/v2/acts/automation-lab~json-schema-generator/runs?token=YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "samples": [
      {"id": 1, "name": "Alice", "email": "alice@example.com"},
      {"id": 2, "name": "Bob"}
    ],
    "schemaTitle": "User",
    "outputFormat": "json"
  }'

Then poll GET /v2/actor-runs/{runId} for status and fetch results from the default dataset.
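
The polling step could be sketched as follows using only the Python standard library. Treat it as an outline under stated assumptions rather than a reference client: the response shape (a run object under "data" with status and defaultDatasetId), the Bearer token header, the terminal status names, and the /v2/datasets/{datasetId}/items path reflect the Apify API as commonly documented and should be checked against the current API reference. The helper names are hypothetical.

```python
import json
import time
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Assumed terminal run statuses; verify against the Apify API reference.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def fetch_json(url, token):
    """GET a URL with a Bearer token and decode the JSON response body."""
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_run(run_id, token, fetch=fetch_json, sleep=time.sleep,
                 interval=2.0, max_polls=60):
    """Poll the run until it reaches a terminal status, then return the run object."""
    url = f"{API_BASE}/actor-runs/{run_id}"
    for _ in range(max_polls):
        run = fetch(url, token)["data"]
        if run["status"] in TERMINAL_STATUSES:
            return run
        sleep(interval)
    raise TimeoutError(f"run {run_id} did not finish after {max_polls} polls")


def fetch_items(run, token, fetch=fetch_json):
    """Fetch all items from the run's default dataset."""
    return fetch(f"{API_BASE}/datasets/{run['defaultDatasetId']}/items", token)
```

The fetch and sleep parameters are injectable so the loop can be exercised in tests without network access. In practice the official apify-client packages shown above handle this waiting for you.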

Use with AI agents via MCP

JSON Schema Generator is available as a tool for AI assistants that support the Model Context Protocol (MCP).

Add the Apify MCP server to your AI client — this gives you access to all Apify actors, including this one:

Setup for Claude Code

claude mcp add --transport http apify "https://mcp.apify.com"

Setup for Claude Desktop, Cursor, or VS Code

Add this to your MCP config file:

{
  "mcpServers": {
    "apify": {
      "url": "https://mcp.apify.com"
    }
  }
}

Your AI assistant will use OAuth to authenticate with your Apify account on first use.

Example prompts

Once connected, try asking your AI assistant:

  • "Use automation-lab/json-schema-generator to generate a draft-07 schema from these API response samples: [paste JSON]"
  • "I have these webhook payloads — generate a JSON Schema so I can validate future payloads against it"
  • "Generate a strict schema (no additional properties) from these 5 sample records and show me which fields are required"

Learn more in the Apify MCP documentation.

Is it legal to use JSON Schema Generator?

Yes. JSON Schema Generator performs pure local computation: it does not scrape any website, call any external API, or access any third-party data. It takes JSON data you already have and derives a schema from it. There are no Terms of Service concerns, no GDPR issues with third-party data, and no ethical considerations around scraping.

You are responsible for ensuring the JSON samples you provide do not contain personal data you're not authorized to process.

FAQ

How fast is JSON Schema Generator? Typically under 5 seconds. The actor performs pure in-memory computation with no network I/O, no browser, and no proxy, so even 1,000 samples complete in seconds.

How much does it cost? $0.005 to start + $0.0023 per schema generated (free plan). Generating a single merged schema from any number of samples costs ~$0.0073 total. The Apify free tier ($5 credits) covers ~2,150 schemas.

How is this different from running a JSON Schema library locally? Running jsonschema-gen or similar locally requires installing Node.js or Python, writing code, and handling I/O. This actor gives you the same capability via a web UI, a REST API, and Zapier/Make integrations — no setup, any language, any device. It's also pre-integrated with the Apify ecosystem for scheduling, webhooks, and data export.

Why are some fields ["string", "null"] instead of just "string"? A field gets a nullable type (["string", "null"]) in two cases: (1) the field is missing from at least one sample, or (2) the field is present in all samples but had a null value in at least one. Both results are valid draft-07. Strictly speaking, an optional field only needs to be left out of required, so also marking missing fields as nullable is a permissive reading of the samples.

Why are my required fields fewer than expected? A field is only marked required if it appears in every sample AND is never null in any sample. If you want more required fields, provide more samples that consistently include those fields with non-null values, or disable inferRequired and manually define requirements.

The schema shows "type": "integer" but I expected "number" — why? Integer detection is automatic: if every observed value for a field has no decimal component (Number.isInteger(value) is true), the type is integer. If any value has a decimal (like 4.5), the type becomes number. To force number, include at least one decimal value in your samples.
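
The integer-versus-number rule described in this answer can be mirrored in Python as a rough sketch. The name numeric_type is hypothetical. One difference worth noting: Python's JSON parser keeps 4.0 as a float, while JavaScript does not distinguish 4.0 from 4, so the sketch uses float.is_integer() to match the Number.isInteger behavior described above.

```python
def numeric_type(values):
    """Return "integer" if every value is a whole number, otherwise "number"."""
    for value in values:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"not a JSON number: {value!r}")
    whole = all(isinstance(value, int) or value.is_integer() for value in values)
    return "integer" if whole else "number"


print(numeric_type([1, 2, 3]))
print(numeric_type([4, 4.5]))
print(numeric_type([4.0]))
```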

Output shows an empty items: {} for an array — what does that mean? This happens when an array field was empty ([]) in all samples, so no item type could be inferred. Provide at least one sample where that array has items and re-run to get a proper items schema.
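
A simplified sketch shows why an always-empty array yields items: {}. It assumes item types are collected across every observed array for the field; the names json_type and infer_items are hypothetical, and merging of nested object schemas is left out.

```python
def json_type(value):
    """Map a parsed JSON value to its JSON Schema type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    return "array" if isinstance(value, list) else "object"


def infer_items(arrays):
    """Infer an "items" schema from every array observed for one field."""
    types = []
    for array in arrays:
        for item in array:
            type_name = json_type(item)
            if type_name not in types:
                types.append(type_name)
    if not types:
        return {}  # no items were ever seen, so nothing can be inferred
    return {"type": types[0] if len(types) == 1 else types}


print(infer_items([[], []]))
print(infer_items([[], ["a", "b"]]))
```

As soon as one sample contains a non-empty array, the second call shows the items schema being filled in.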
