Actor Schema Validator

Pricing

$350.00 / 1,000 schema validations

Actor Schema Validator. Available on the Apify Store with pay-per-event pricing.

Developer

ryan clinton

Maintained by Community

Schema Validator -- Actor Output Schema Compliance Checker

Validate any Apify actor's output against its declared dataset schema. Schema Validator runs the target actor, compares every output field against the schema definition, and produces a detailed compliance report with a 0-100 score. Catch type mismatches, missing fields, undeclared fields, and null handling issues before they trigger maintenance flags or break downstream pipelines.

Schema Validator fetches the dataset schema from the target actor's latest build, executes the actor with your test input, then performs field-by-field comparison: declared types vs. actual types, required fields vs. present fields, nullable declarations vs. null values. The result is a single compliance report with a score, pass/fail verdict, and actionable details for every mismatch found.
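The field-by-field comparison described above can be sketched in Python. This is an illustrative approximation, not the actor's source code: the schema shape (field name mapped to a dict with `type` and an optional `nullable`), and the helpers `schema_type_of` and `compare_items`, are assumptions made for this example.

```python
from typing import Any


def schema_type_of(value: Any) -> str:
    """Name a Python value using the schema's type vocabulary."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


def compare_items(schema: dict, items: list) -> dict:
    """Compare the top-level fields of every item against a declared schema."""
    seen = set()
    for item in items:
        seen.update(item)

    mismatches, nullable_issues = [], []
    for field, spec in schema.items():
        values = [item[field] for item in items if field in item]
        # Null values are only acceptable when the field is declared nullable.
        if any(v is None for v in values) and not spec.get("nullable", False):
            nullable_issues.append(field)
        wrong = set()
        for v in values:
            actual = schema_type_of(v)
            # Assumption: an integer value satisfies a declared "number".
            ok = actual == spec["type"] or (spec["type"] == "number" and actual == "integer")
            if v is not None and not ok:
                wrong.add(actual)
        for actual in sorted(wrong):
            mismatches.append(
                {"path": field, "expected": spec["type"], "actual": actual, "severity": "error"}
            )

    return {
        "mismatches": mismatches,
        "nullableIssues": nullable_issues,
        "missingRequired": sorted(f for f in schema if f not in seen),
        "undeclaredFields": sorted(seen - set(schema)),
    }
```

The returned keys mirror the report fields documented further down, so the result is easy to read alongside a real report.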

What data can you extract?

Data Point | Source | Example
Compliance score | Weighted analysis | 85 (out of 100)
Pass/Fail verdict | Error count | FAIL (1 type mismatch)
Type mismatches | Schema vs. output comparison | field 'price': expected number, got string
Undeclared fields | Output fields not in schema | ['_debug', 'scrapedAt']
Missing required | Schema fields absent from output | ['phoneNumber']
Nullable issues | Null values where schema says non-null | field 'email' has null values
Type inconsistencies | Fields with mixed types across items | field 'rating': string, number
Run duration | Measured start-to-finish | 12.4 seconds

Why use Schema Validator?

An actor can run successfully and produce plausible-looking data while silently violating its own schema. A field declared as number returns "N/A" for some results. A field listed in the schema is missing from 30% of items. An undeclared _internal field leaks into production output. These issues don't crash the actor -- they corrupt your data.

Schema Validator catches these problems automatically. Run it before every publish, after every code change, or on a schedule. The 0-100 score gives you an at-a-glance quality metric. The detailed mismatch report tells you exactly what to fix.

Why an actor instead of a script?

  • Runs in the same environment as your actors -- catches issues that only appear in Apify's cloud (Docker, memory, network)
  • No local setup -- works from the Apify console, API, or any CI/CD pipeline
  • Scheduling -- run nightly validations against all your actors
  • Pay-per-event -- $0.35 per validation, no subscription

Features

  • Schema-aware type checking that maps declared schema types (string, integer, number, boolean, array, object) to JavaScript runtime types and flags every mismatch with the exact field path, expected type, and actual type
  • Nullable detection that identifies fields with null values where the schema does not declare nullable: true -- a common source of downstream errors
  • Undeclared field detection that lists every output field not defined in the dataset schema -- these fields won't appear in the Store's table view and may indicate data leaks
  • Missing field detection that checks every schema-declared field exists in at least one output item -- missing fields indicate broken extractors or changed data sources
  • Type consistency analysis that flags fields with mixed types across items (e.g., rating is sometimes string, sometimes number) regardless of schema -- structural issues that break type-safe consumers
  • 0-100 compliance score weighted by severity: errors (-10 points), warnings (-3 points), undeclared fields (-2 points), type inconsistencies (-5 points)
  • Works without a schema -- if the target actor has no declared dataset schema, Schema Validator still performs structural analysis (type consistency, field presence) and reports findings
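The schema-free structural checks (type consistency and field presence) can be approximated as follows. This is a sketch under stated assumptions rather than the actor's implementation: `runtime_type`, `type_consistency`, and `field_presence` are hypothetical helper names, and integers and floats are both treated as `number` to mimic JavaScript runtime types.

```python
from collections import defaultdict
from typing import Any


def runtime_type(value: Any) -> str:
    """Approximate a JavaScript-style runtime type name for a JSON value."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def type_consistency(items: list) -> list:
    """List fields whose non-null values have more than one type across items."""
    types_by_field = defaultdict(set)
    for item in items:
        for field, value in item.items():
            if value is not None:  # nulls are a nullable concern, not a type
                types_by_field[field].add(runtime_type(value))
    return [
        {"field": field, "types": sorted(types)}
        for field, types in sorted(types_by_field.items())
        if len(types) > 1
    ]


def field_presence(items: list) -> dict:
    """Return the fraction of items in which each field appears."""
    counts = defaultdict(int)
    for item in items:
        for field in item:
            counts[field] += 1
    return {field: count / len(items) for field, count in counts.items()}
```

A presence value below 1.0 points at a field that only some items carry, which is often worth a closer look even when no schema is declared.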

Use cases for schema validation

Pre-publish quality gate

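Run Schema Validator after every apify push and before every Store publish. The validator checks the target actor's latest build, so push first and then validate. A score below 90 means your output deviates from your declared schema in more than minor ways -- fix the mismatches before users see inconsistent data in the Store table view.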
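A quality gate like this can be wired into CI with a few lines of Python. The sketch below assumes you have already fetched the report as a dict (for example with the API client shown later on this page); `gate` and `exit_code` are hypothetical helper names, and the threshold of 90 comes from the guidance above.

```python
def gate(report: dict, min_score: int = 90) -> list:
    """Return the reasons a report fails the gate; an empty list means it passes."""
    reasons = []
    if not report["passed"]:
        reasons.append("validator verdict is FAIL")
    if report["score"] < min_score:
        reasons.append(f"score {report['score']} is below {min_score}")
    return reasons


def exit_code(report: dict, min_score: int = 90) -> int:
    """Map the gate result to a process exit code for CI (0 = success)."""
    return 1 if gate(report, min_score) else 0
```

In a pipeline step, calling `sys.exit(exit_code(report))` makes the job fail whenever the gate does.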

Continuous schema compliance

Schedule Schema Validator to run daily against your critical actors. When an upstream data source changes (website redesign, API update), field types and presence shift. Catch these drifts early.

Dataset schema development

Building a new dataset schema? Run Schema Validator iteratively: add fields to your schema, validate, fix mismatches, repeat. The undeclared fields list shows you exactly which output fields still need schema definitions.

Third-party actor evaluation

Before integrating a Store actor into your pipeline, run Schema Validator against it. Check if the declared schema matches actual output -- some actors have stale or incomplete schemas.

Migration validation

Replacing one actor with another? Run Schema Validator on both with the same input. Compare scores and mismatch lists to verify the replacement produces schema-compliant output.
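Comparing two reports can be automated. The following sketch assumes both reports are dicts with the `score` and `mismatches` fields documented below; `compare_reports` is a hypothetical helper, not part of the actor.

```python
def compare_reports(old: dict, new: dict) -> dict:
    """Summarize how the replacement actor's report differs from the original's."""

    def keys(report: dict) -> set:
        # Identify a mismatch by its path and the expected/actual types.
        return {(m["path"], m["expected"], m["actual"]) for m in report["mismatches"]}

    return {
        "scoreDelta": new["score"] - old["score"],
        "introduced": sorted(keys(new) - keys(old)),
        "resolved": sorted(keys(old) - keys(new)),
    }
```

A positive `scoreDelta` with an empty `introduced` list suggests the replacement is at least as schema-compliant as the actor it replaces.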

How to run a schema validation

  1. Enter the target actor -- Provide the actor ID or username/actor-name slug.
  2. Provide test input -- Enter a minimal JSON input that produces representative output. Keep it small (3-5 results) to minimize compute costs.
  3. Run the validation -- Click "Start" and wait for the target actor to complete.
  4. Review the report -- Check the score, pass/fail verdict, and detailed mismatch list in the Dataset tab.

Input parameters

Parameter | Type | Required | Default | Description
targetActorId | string | Yes | -- | Actor ID or username/actor-name slug to validate (e.g., ryanclinton/website-contact-scraper).
testInput | object | No | {} | JSON input to pass to the target actor. Use a small, representative input.
timeout | integer | No | 120 | Maximum seconds to wait for the target actor run.
memory | integer | No | 512 | Memory in MB for the target actor run.

Input examples

Validate a scraper:

{
  "targetActorId": "ryanclinton/website-contact-scraper",
  "testInput": {
    "urls": ["https://example.com"],
    "maxPagesPerDomain": 3
  }
}

Validate an API wrapper:

{
  "targetActorId": "ryanclinton/fred-economic-data",
  "testInput": {
    "seriesId": "GDP",
    "limit": 5
  },
  "timeout": 60,
  "memory": 256
}

Input tips

  • Use minimal inputs -- 3-5 results are enough for schema validation. Don't scrape 1,000 pages just to check types.
  • Test with inputs that produce diverse output -- if some fields are only present for certain inputs, use an input that exercises those fields.
  • Match memory to the actor -- browser-based scrapers typically need more memory (for example, 4096 MB), while API wrappers often run fine with 256 MB.
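If you build the input programmatically, a small helper keeps the documented defaults in one place. This is a convenience sketch: `build_input` is a hypothetical function, and the positive-value checks are an assumption about what counts as sensible input, not documented validation rules.

```python
from typing import Optional


def build_input(
    target_actor_id: str,
    test_input: Optional[dict] = None,
    timeout: int = 120,   # documented default, in seconds
    memory: int = 512,    # documented default, in MB
) -> dict:
    """Assemble a Schema Validator input object using the documented defaults."""
    if not target_actor_id:
        raise ValueError("targetActorId is required")
    if timeout <= 0 or memory <= 0:
        raise ValueError("timeout and memory must be positive")
    return {
        "targetActorId": target_actor_id,
        "testInput": test_input or {},
        "timeout": timeout,
        "memory": memory,
    }
```

For example, `build_input("ryanclinton/fred-economic-data", {"seriesId": "GDP", "limit": 5}, timeout=60, memory=256)` reproduces the API wrapper input shown above.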

Output example

Each validation produces one report in the dataset:

{
  "actorName": "ryanclinton/website-contact-scraper",
  "actorId": "abc123def456",
  "schemaFound": true,
  "schemaFields": 12,
  "outputFields": 15,
  "totalItems": 3,
  "mismatches": [
    {
      "path": "price",
      "expected": "number",
      "actual": "string",
      "severity": "error"
    },
    {
      "path": "email",
      "expected": "non-null",
      "actual": "null values found",
      "severity": "warning"
    }
  ],
  "undeclaredFields": ["_debug", "scrapedAt", "rawHtml"],
  "missingRequired": ["phoneNumber"],
  "nullableIssues": ["email"],
  "typeConsistency": [
    { "field": "rating", "types": ["string", "number"] }
  ],
  "score": 66,
  "passed": false,
  "runDuration": 12.4,
  "validatedAt": "2026-03-18T14:30:00.000Z"
}

Output fields

Field | Type | Description
actorName | string | Display name of the validated actor (e.g., ryanclinton/actor-name)
actorId | string | Apify actor ID
schemaFound | boolean | Whether the target actor has a declared dataset schema
schemaFields | number | Number of fields defined in the dataset schema (0 if no schema)
outputFields | number | Number of unique fields found across all output items
totalItems | number | Number of items in the output dataset
mismatches | array | List of mismatches with path, expected, actual, and severity (error or warning)
undeclaredFields | array | Field names present in output but not declared in the schema
missingRequired | array | Field names declared in schema but missing from all output items
nullableIssues | array | Field names with null values where schema doesn't declare nullable
typeConsistency | array | Fields with mixed types across items: { field, types[] }
score | number | Compliance score 0-100. Errors: -10, warnings: -3, undeclared: -2, inconsistencies: -5
passed | boolean | true if zero errors (warnings and undeclared fields don't cause failure)
runDuration | number | Total execution time in seconds
validatedAt | string | ISO 8601 timestamp

How the score is calculated

The score starts at 100 and deducts points by severity:

Issue Type | Points Deducted | Example
Type mismatch (error) | -10 per mismatch | price: expected number, got string
Missing schema field (error) | -10 per field | phoneNumber missing from all items
Nullable violation (warning) | -3 per field | email has null values, schema says non-null
Undeclared field | -2 per field | _debug not in schema
Type inconsistency | -5 per field | rating is sometimes string, sometimes number

A score of 90+ means minor issues only. Below 70 indicates serious schema violations that should be fixed before publishing.
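The deduction rules above amount to simple arithmetic, sketched here in Python. The function name `compliance_score` is hypothetical, and clamping the result at 0 is an assumption based on the stated 0-100 range.

```python
# Points deducted per issue, taken from the table above.
PENALTIES = {"error": 10, "warning": 3, "undeclared": 2, "inconsistency": 5}


def compliance_score(errors: int, warnings: int, undeclared: int, inconsistencies: int) -> int:
    """Start at 100 and subtract the documented penalty for each issue."""
    deduction = (
        PENALTIES["error"] * errors
        + PENALTIES["warning"] * warnings
        + PENALTIES["undeclared"] * undeclared
        + PENALTIES["inconsistency"] * inconsistencies
    )
    # Assumption: the score never drops below 0.
    return max(0, 100 - deduction)
```

For instance, one type mismatch plus two undeclared fields gives 100 - 10 - 4 = 86, which falls just short of the 90+ band.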

How much does it cost?

Schema Validator uses pay-per-event pricing at $0.35 per validation. The target actor run is billed separately at its own rate.

Scenario | Validations | Orchestration Cost
One-off check | 1 | $0.35
Weekly checks (4/mo) | 4 | $1.40
Daily CI/CD (30/mo) | 30 | $10.50
Fleet-wide nightly (200 actors, per night) | 200 | $70.00

The Apify Free plan ($5/mo credits) covers approximately 14 validations, not counting the target actors' own run costs.

Tip: Use small test inputs (3-5 results) to minimize target actor costs.
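The cost figures above follow directly from the $0.35 per-validation price. The sketch below reproduces them; the helper names are hypothetical, and target actor run costs are deliberately left out because they depend on the actor being validated.

```python
PRICE_CENTS = 35  # $0.35 per validation, as listed above


def orchestration_cost(validations: int) -> float:
    """Return the Schema Validator charge in dollars, excluding target actor runs."""
    return validations * PRICE_CENTS / 100


def validations_covered(credit_dollars: float) -> int:
    """Return how many whole validations a credit balance pays for."""
    return round(credit_dollars * 100) // PRICE_CENTS
```

With these helpers, `orchestration_cost(30)` gives 10.5 for a month of daily checks, and `validations_covered(5)` gives 14 for the Free plan credit.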

Run schema validation using the API

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the validator and wait for the run to finish.
run = client.actor("ryanclinton/actor-schema-validator").call(run_input={
    "targetActorId": "ryanclinton/website-contact-scraper",
    "testInput": {
        "urls": ["https://example.com"],
        "maxPagesPerDomain": 3,
    },
})

# Each run writes one compliance report to its default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"Score: {item['score']}/100 - {'PASS' if item['passed'] else 'FAIL'}")
    print(f"Schema fields: {item['schemaFields']} | Output fields: {item['outputFields']}")
    if item["mismatches"]:
        for m in item["mismatches"]:
            print(f"  [{m['severity'].upper()}] {m['path']}: expected {m['expected']}, got {m['actual']}")
    if item["undeclaredFields"]:
        print(f"  Undeclared: {', '.join(item['undeclaredFields'])}")

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Start the validator and wait for the run to finish.
const run = await client.actor("ryanclinton/actor-schema-validator").call({
  targetActorId: "ryanclinton/website-contact-scraper",
  testInput: { urls: ["https://example.com"], maxPagesPerDomain: 3 },
});

// The run's default dataset holds a single compliance report.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
const report = items[0];
console.log(`Score: ${report.score}/100 - ${report.passed ? "PASS" : "FAIL"}`);
report.mismatches.forEach((m) => {
  console.log(`  [${m.severity}] ${m.path}: expected ${m.expected}, got ${m.actual}`);
});

cURL

curl -X POST "https://api.apify.com/v2/acts/ryanclinton~actor-schema-validator/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "targetActorId": "ryanclinton/website-contact-scraper",
    "testInput": {"urls": ["https://example.com"], "maxPagesPerDomain": 3}
  }'

FAQ

What's the difference between Schema Validator and Cloud Staging Test? Cloud Staging Test checks overall run health (status, output volume, custom assertions). Schema Validator focuses specifically on schema compliance -- type mismatches, undeclared fields, nullable violations. Use Cloud Staging Test for broad quality checks, Schema Validator for precise schema debugging.

What if my actor doesn't have a dataset schema? Schema Validator still runs structural analysis: type consistency across items, field presence consistency, and mixed-type detection. You'll get a report without schema-specific checks. The score reflects structural quality only.

Does it validate nested objects? Currently, Schema Validator checks top-level fields only. Nested object structures are typed as object without inspecting their internal fields. Deep validation for nested schemas is planned.

What counts as "passed"? Pass means zero errors. Warnings (nullable violations) and undeclared fields lower the score but don't cause failure. Only type mismatches and missing required fields are errors.

Can I validate multiple actors at once? Not in a single run. Run Schema Validator separately for each actor. Use the API to batch validations in parallel.
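One way to batch validations in parallel is a thread pool that starts one validator run per actor. The sketch below is an assumption about how you might structure this, not an official recipe: `run_validation` and `validate_many` are hypothetical helpers, `run_validation` reuses the client calls from the Python example above, and the worker count of 4 is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_validation(target_actor_id: str, token: str) -> dict:
    """Run one validation through the Apify API and return its report."""
    # Imported lazily so the batching helper works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor("ryanclinton/actor-schema-validator").call(
        run_input={"targetActorId": target_actor_id}
    )
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())
    return items[0]


def validate_many(actor_ids: list, validate: Callable[[str], dict], max_workers: int = 4) -> dict:
    """Validate several actors concurrently and return reports keyed by actor ID."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each ID with its report.
        return dict(zip(actor_ids, pool.map(validate, actor_ids)))
```

A typical call would be `validate_many(ids, lambda actor_id: run_validation(actor_id, "YOUR_API_TOKEN"))`. Each validation is still billed as its own event.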

How do I fix undeclared fields? Add the undeclared fields to your actor's .actor/actor.json dataset schema under storages.dataset.fields. Push the updated actor and re-validate.

How do I fix nullable issues? Either add "nullable": true to the field's schema definition, or fix your actor to never return null for that field.

Actor | How to combine
Cloud Staging Test | Broad quality validation with custom assertions. Use Schema Validator for schema-specific debugging, Cloud Staging Test for overall output health.
Actor Test Runner | Run multi-case test suites. Schema Validator checks schema compliance; Test Runner checks functional correctness with multiple inputs.
Actor Regression Suite | Detect regressions between builds. Use Schema Validator after each push, Regression Suite for historical comparison.
Output Completeness Monitor | Track output volume trends. Schema Validator checks structure; Completeness Monitor checks quantity.
Actor Quality Audit | Overall Store listing quality. Schema Validator checks output schema compliance; Quality Audit checks metadata, README, and input schema quality.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page.