Actor Schema Validator
Pricing
$350.00 / 1,000 schema validations
Actor Schema Validator. Available on the Apify Store with pay-per-event pricing.
Developer
ryan clinton
Schema Validator -- Actor Output Schema Compliance Checker
Validate any Apify actor's output against its declared dataset schema. Schema Validator runs the target actor, compares every output field against the schema definition, and produces a detailed compliance report with a 0-100 score. Catch type mismatches, missing fields, undeclared fields, and null handling issues before they trigger maintenance flags or break downstream pipelines.
Schema Validator fetches the dataset schema from the target actor's latest build, executes the actor with your test input, then performs field-by-field comparison: declared types vs. actual types, required fields vs. present fields, nullable declarations vs. null values. The result is a single compliance report with a score, pass/fail verdict, and actionable details for every mismatch found.
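The field-by-field comparison described above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual implementation: the schema shape (`{field: {"type": ..., "nullable": ...}}`) and the helper names `json_type`, `type_matches`, and `compare` are assumptions made for the example, and only top-level fields are inspected.

```python
from typing import Any


def json_type(value: Any) -> str:
    """Name a Python value using the schema's type vocabulary."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


def type_matches(declared: str, actual: str) -> bool:
    # Assumption: an integer value satisfies a field declared as "number".
    return declared == actual or (declared == "number" and actual == "integer")


def compare(schema: dict[str, dict], items: list[dict]) -> dict[str, list]:
    """Compare every top-level field of every item against the schema."""
    mismatches: list[dict] = []
    nullable_issues: list[str] = []
    seen: set[str] = set()
    for item in items:
        for field, value in item.items():
            seen.add(field)
            spec = schema.get(field)
            if spec is None:
                continue  # undeclared field, collected below
            actual = json_type(value)
            if actual == "null":
                # Null is only acceptable when the schema declares nullable.
                if not spec.get("nullable", False) and field not in nullable_issues:
                    nullable_issues.append(field)
            elif not type_matches(spec["type"], actual):
                entry = {"path": field, "expected": spec["type"], "actual": actual}
                if entry not in mismatches:
                    mismatches.append(entry)
    return {
        "mismatches": mismatches,
        "nullableIssues": nullable_issues,
        # Declared fields that never appear in any item.
        "missingRequired": [f for f in schema if f not in seen],
        # Output fields the schema does not declare.
        "undeclaredFields": sorted(seen - set(schema)),
    }
```

Calling `compare(schema, items)` with a handful of items returns the same four lists that appear in the compliance report, which makes it easy to see where each finding comes from.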
What data can you extract?
| Data Point | Source | Example |
|---|---|---|
| Compliance score | Weighted analysis | 85 (out of 100) |
| Pass/Fail verdict | Error count | FAIL (1 type mismatch) |
| Type mismatches | Schema vs. output comparison | field 'price': expected number, got string |
| Undeclared fields | Output fields not in schema | ['_debug', 'scrapedAt'] |
| Missing required | Schema fields absent from output | ['phoneNumber'] |
| Nullable issues | Null values where schema says non-null | field 'email' has null values |
| Type inconsistencies | Fields with mixed types across items | field 'rating': string, number |
| Run duration | Measured start-to-finish | 12.4 seconds |
Why use Schema Validator?
An actor can run successfully and produce plausible-looking data while silently violating its own schema. A field declared as number returns "N/A" for some results. A field listed in the schema is missing from 30% of items. An undeclared _internal field leaks into production output. These issues don't crash the actor -- they corrupt your data.
Schema Validator catches these problems automatically. Run it before every publish, after every code change, or on a schedule. The 0-100 score gives you an at-a-glance quality metric. The detailed mismatch report tells you exactly what to fix.
Why an actor instead of a script?
- Runs in the same environment as your actors -- catches issues that only appear in Apify's cloud (Docker, memory, network)
- No local setup -- works from the Apify console, API, or any CI/CD pipeline
- Scheduling -- run nightly validations against all your actors
- Pay-per-event -- $0.35 per validation, no subscription
Features
- Schema-aware type checking that maps declared schema types (`string`, `integer`, `number`, `boolean`, `array`, `object`) to JavaScript runtime types and flags every mismatch with the exact field path, expected type, and actual type
- Nullable detection that identifies fields with null values where the schema does not declare `nullable: true` -- a common source of downstream errors
- Undeclared field detection that lists every output field not defined in the dataset schema -- these fields won't appear in the Store's table view and may indicate data leaks
- Missing field detection that checks every schema-declared field exists in at least one output item -- missing fields indicate broken extractors or changed data sources
- Type consistency analysis that flags fields with mixed types across items (e.g., `rating` is sometimes `string`, sometimes `number`) regardless of schema -- structural issues that break type-safe consumers
- 0-100 compliance score weighted by severity: errors (-10 points), warnings (-3 points), undeclared fields (-2 points), type inconsistencies (-5 points)
- Works without a schema -- if the target actor has no declared dataset schema, Schema Validator still performs structural analysis (type consistency, field presence) and reports findings
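The type consistency check needs no schema at all, which is why it still works for actors without one. A minimal sketch of the idea is shown below; the function names are hypothetical, and two choices are assumptions of this example rather than documented behavior: null values are skipped (they belong to the nullable check) and integers and floats are both counted as `number`.

```python
from collections import defaultdict
from typing import Any


def type_name(value: Any) -> str:
    """Name a non-null value using the schema's type vocabulary."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


def type_consistency(items: list[dict]) -> list[dict]:
    """Return fields whose non-null values use more than one type."""
    types_by_field: dict[str, set[str]] = defaultdict(set)
    for item in items:
        for field, value in item.items():
            if value is not None:
                types_by_field[field].add(type_name(value))
    return [
        {"field": field, "types": sorted(types)}
        for field, types in sorted(types_by_field.items())
        if len(types) > 1
    ]
```

The result has the same `{ field, types[] }` shape as the `typeConsistency` entries in the report, with the type names sorted alphabetically for stable output.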
Use cases for schema validation
Pre-publish quality gate
Run Schema Validator before every apify push or Store publish. A score below 90 means your output doesn't match your declared schema -- fix the mismatches before users see inconsistent data in the Store table view.
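One way to turn this into an automated gate is to map a report to a process exit code. The sketch below assumes the report is already available as a dictionary with the `score` and `passed` fields; the function name `gate` and the default threshold of 90 (taken from the guidance above) are choices made for this example.

```python
def gate(report: dict, min_score: int = 90) -> int:
    """Return a process exit code: 0 if the report clears the gate, 1 otherwise."""
    if not report.get("passed", False):
        return 1  # at least one error-level finding
    if report.get("score", 0) < min_score:
        return 1  # no errors, but too many warnings or undeclared fields
    return 0
```

In a CI job, passing the returned value to `sys.exit()` fails the pipeline step whenever the output drifts from the declared schema.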
Continuous schema compliance
Schedule Schema Validator to run daily against your critical actors. When an upstream data source changes (website redesign, API update), field types and presence shift. Catch these drifts early.
Dataset schema development
Building a new dataset schema? Run Schema Validator iteratively: add fields to your schema, validate, fix mismatches, repeat. The undeclared fields list shows you exactly which output fields still need schema definitions.
Third-party actor evaluation
Before integrating a Store actor into your pipeline, run Schema Validator against it. Check if the declared schema matches actual output -- some actors have stale or incomplete schemas.
Migration validation
Replacing one actor with another? Run Schema Validator on both with the same input. Compare scores and mismatch lists to verify the replacement produces schema-compliant output.
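Comparing two reports can be as simple as diffing the scores and the mismatch paths. The helper below is a hypothetical sketch that assumes both reports follow the output format documented in this README.

```python
def compare_reports(old: dict, new: dict) -> dict:
    """Summarize how the replacement actor's report differs from the original's."""
    old_paths = {m["path"] for m in old["mismatches"]}
    new_paths = {m["path"] for m in new["mismatches"]}
    return {
        "scoreDelta": new["score"] - old["score"],
        # Fields that mismatch only in the replacement actor.
        "introduced": sorted(new_paths - old_paths),
        # Fields that mismatched in the original but not in the replacement.
        "resolved": sorted(old_paths - new_paths),
    }
```

A positive `scoreDelta` with an empty `introduced` list suggests the replacement is at least as schema-compliant as the actor it replaces.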
How to run a schema validation
- Enter the target actor -- Provide the actor ID or `username/actor-name` slug.
- Provide test input -- Enter a minimal JSON input that produces representative output. Keep it small (3-5 results) to minimize compute costs.
- Run the validation -- Click "Start" and wait for the target actor to complete.
- Review the report -- Check the score, pass/fail verdict, and detailed mismatch list in the Dataset tab.
Input parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `targetActorId` | string | Yes | -- | Actor ID or `username/actor-name` slug to validate (e.g., `ryanclinton/website-contact-scraper`). |
| `testInput` | object | No | `{}` | JSON input to pass to the target actor. Use a small, representative input. |
| `timeout` | integer | No | 120 | Maximum seconds to wait for the target actor run. |
| `memory` | integer | No | 512 | Memory in MB for the target actor run. |
Input examples
Validate a scraper:
```json
{
    "targetActorId": "ryanclinton/website-contact-scraper",
    "testInput": {
        "urls": ["https://example.com"],
        "maxPagesPerDomain": 3
    }
}
```
Validate an API wrapper:
```json
{
    "targetActorId": "ryanclinton/fred-economic-data",
    "testInput": {
        "seriesId": "GDP",
        "limit": 5
    },
    "timeout": 60,
    "memory": 256
}
```
Input tips
- Use minimal inputs -- 3-5 results are enough for schema validation. Don't scrape 1,000 pages just to check types.
- Test with inputs that produce diverse output -- if some fields are only present for certain inputs, use an input that exercises those fields.
- Match memory to the actor -- browser-based scrapers typically need more memory (for example, 4096 MB), while API wrappers often run fine with 256 MB.
Output example
Each validation produces one report in the dataset:
```json
{
    "actorName": "ryanclinton/website-contact-scraper",
    "actorId": "abc123def456",
    "schemaFound": true,
    "schemaFields": 12,
    "outputFields": 15,
    "totalItems": 3,
    "mismatches": [
        {
            "path": "price",
            "expected": "number",
            "actual": "string",
            "severity": "error"
        },
        {
            "path": "email",
            "expected": "non-null",
            "actual": "null values found",
            "severity": "warning"
        }
    ],
    "undeclaredFields": ["_debug", "scrapedAt", "rawHtml"],
    "missingRequired": ["phoneNumber"],
    "nullableIssues": ["email"],
    "typeConsistency": [
        { "field": "rating", "types": ["string", "number"] }
    ],
    "score": 72,
    "passed": false,
    "runDuration": 12.4,
    "validatedAt": "2026-03-18T14:30:00.000Z"
}
```
Output fields
| Field | Type | Description |
|---|---|---|
| `actorName` | string | Display name of the validated actor (e.g., `ryanclinton/actor-name`) |
| `actorId` | string | Apify actor ID |
| `schemaFound` | boolean | Whether the target actor has a declared dataset schema |
| `schemaFields` | number | Number of fields defined in the dataset schema (0 if no schema) |
| `outputFields` | number | Number of unique fields found across all output items |
| `totalItems` | number | Number of items in the output dataset |
| `mismatches` | array | List of mismatches with `path`, `expected`, `actual`, and `severity` (`error` or `warning`) |
| `undeclaredFields` | array | Field names present in output but not declared in the schema |
| `missingRequired` | array | Field names declared in schema but missing from all output items |
| `nullableIssues` | array | Field names with null values where schema doesn't declare nullable |
| `typeConsistency` | array | Fields with mixed types across items: `{ field, types[] }` |
| `score` | number | Compliance score 0-100. Errors: -10, warnings: -3, undeclared: -2, inconsistencies: -5 |
| `passed` | boolean | `true` if zero errors (warnings and undeclared fields don't cause failure) |
| `runDuration` | number | Total execution time in seconds |
| `validatedAt` | string | ISO 8601 timestamp |
How the score is calculated
The score starts at 100 and deducts points by severity:
| Issue Type | Points Deducted | Example |
|---|---|---|
| Type mismatch (error) | -10 per mismatch | price: expected number, got string |
| Missing schema field (error) | -10 per field | phoneNumber missing from all items |
| Nullable violation (warning) | -3 per field | email has null values, schema says non-null |
| Undeclared field | -2 per field | _debug not in schema |
| Type inconsistency | -5 per field | rating is sometimes string, sometimes number |
A score of 90+ means minor issues only. Below 70 indicates serious schema violations that should be fixed before publishing.
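The deduction table translates directly into a few lines of arithmetic. The sketch below uses the documented weights; the function name and the clamp at zero are assumptions of this example (the score is documented as 0-100, but how the actor handles large deductions is not stated).

```python
# Points deducted per finding, as listed in the table above.
WEIGHTS = {"error": 10, "warning": 3, "undeclared": 2, "inconsistency": 5}


def compliance_score(errors: int, warnings: int, undeclared: int, inconsistencies: int) -> int:
    """Start at 100 and deduct points per finding, never going below 0."""
    deduction = (
        errors * WEIGHTS["error"]
        + warnings * WEIGHTS["warning"]
        + undeclared * WEIGHTS["undeclared"]
        + inconsistencies * WEIGHTS["inconsistency"]
    )
    # Assumption: the score is clamped so it stays within the 0-100 range.
    return max(0, 100 - deduction)
```

For instance, one type mismatch, one nullable warning, and two undeclared fields would give 100 - 10 - 3 - 4 = 83 under these weights.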
How much does it cost?
Schema Validator uses pay-per-event pricing at $0.35 per validation. The target actor run is billed separately at its own rate.
| Scenario | Validations | Orchestration Cost |
|---|---|---|
| One-off check | 1 | $0.35 |
| Weekly checks (4/mo) | 4 | $1.40 |
| Daily CI/CD (30/mo) | 30 | $10.50 |
| Fleet-wide nightly (200 actors) | 200 | $70.00 |
The Apify Free plan ($5/mo credits) covers approximately 14 validations ($5 / $0.35), not counting the cost of the target actor runs.
Tip: Use small test inputs (3-5 results) to minimize target actor costs.
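The figures in the table follow from simple multiplication. As a quick sanity check, here is a small sketch using the $0.35 per-validation price; the function names are hypothetical and target actor costs are deliberately left out.

```python
from decimal import Decimal

PRICE_PER_VALIDATION = Decimal("0.35")  # USD, from the pricing above


def orchestration_cost(validations: int) -> Decimal:
    """Cost of the validations themselves, excluding target actor runs."""
    return PRICE_PER_VALIDATION * validations


def validations_covered(credits: Decimal) -> int:
    """Number of whole validations a given credit balance pays for."""
    return int(credits // PRICE_PER_VALIDATION)
```

`Decimal` is used instead of floats so that currency amounts are computed exactly.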
Run schema validation using the API
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

# Start the validator and wait for it to finish.
run = client.actor("ryanclinton/actor-schema-validator").call(
    run_input={
        "targetActorId": "ryanclinton/website-contact-scraper",
        "testInput": {
            "urls": ["https://example.com"],
            "maxPagesPerDomain": 3,
        },
    }
)

# Each validation writes one report to the default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"Score: {item['score']}/100 - {'PASS' if item['passed'] else 'FAIL'}")
    print(f"Schema fields: {item['schemaFields']} | Output fields: {item['outputFields']}")
    if item["mismatches"]:
        for m in item["mismatches"]:
            print(f"  [{m['severity'].upper()}] {m['path']}: expected {m['expected']}, got {m['actual']}")
    if item["undeclaredFields"]:
        print(f"  Undeclared: {', '.join(item['undeclaredFields'])}")
```
JavaScript
```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

// Start the validator and wait for it to finish.
const run = await client.actor("ryanclinton/actor-schema-validator").call({
    targetActorId: "ryanclinton/website-contact-scraper",
    testInput: { urls: ["https://example.com"], maxPagesPerDomain: 3 },
});

// Each validation writes one report to the default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
const report = items[0];

console.log(`Score: ${report.score}/100 - ${report.passed ? "PASS" : "FAIL"}`);
report.mismatches.forEach((m) => {
    console.log(`  [${m.severity}] ${m.path}: expected ${m.expected}, got ${m.actual}`);
});
```
cURL
```bash
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~actor-schema-validator/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"targetActorId": "ryanclinton/website-contact-scraper", "testInput": {"urls": ["https://example.com"], "maxPagesPerDomain": 3}}'
```
FAQ
What's the difference between Schema Validator and Cloud Staging Test? Cloud Staging Test checks overall run health (status, output volume, custom assertions). Schema Validator focuses specifically on schema compliance -- type mismatches, undeclared fields, nullable violations. Use Cloud Staging Test for broad quality checks, Schema Validator for precise schema debugging.
What if my actor doesn't have a dataset schema? Schema Validator still runs structural analysis: type consistency across items, field presence consistency, and mixed-type detection. You'll get a report without schema-specific checks. The score reflects structural quality only.
Does it validate nested objects?
Currently, Schema Validator checks top-level fields only. Nested object structures are typed as object without inspecting their internal fields. Deep validation for nested schemas is planned.
What counts as "passed"? Pass means zero errors. Warnings (nullable violations) and undeclared fields lower the score but don't cause failure. Only type mismatches and missing required fields are errors.
Can I validate multiple actors at once? Not in a single run. Run Schema Validator separately for each actor. Use the API to batch validations in parallel.
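A rough sketch of batching from the client side is shown below. It is only an illustration: `validate_many` and the `validate_one` callable are hypothetical names, and the callable is expected to start one Schema Validator run for a given target actor and return its report (for example by wrapping the `client.actor(...).call(...)` request from the Python API example in this README).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def validate_many(
    actor_ids: list[str],
    validate_one: Callable[[str], dict],
    max_workers: int = 4,
) -> dict[str, dict]:
    """Run one validation per actor concurrently and collect reports by actor ID."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with actor_ids.
        reports = pool.map(validate_one, actor_ids)
        return dict(zip(actor_ids, reports))
```

Threads suit this case because each call spends most of its time waiting on the remote run. Each validation in the batch is still billed as a separate event.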
How do I fix undeclared fields?
Add the undeclared fields to your actor's .actor/actor.json dataset schema under storages.dataset.fields. Push the updated actor and re-validate.
How do I fix nullable issues?
Either add "nullable": true to the field's schema definition, or fix your actor to never return null for that field.
Related actors
| Actor | How to combine |
|---|---|
| Cloud Staging Test | Broad quality validation with custom assertions. Use Schema Validator for schema-specific debugging, Cloud Staging Test for overall output health. |
| Actor Test Runner | Run multi-case test suites. Schema Validator checks schema compliance; Test Runner checks functional correctness with multiple inputs. |
| Actor Regression Suite | Detect regressions between builds. Use Schema Validator after each push, Regression Suite for historical comparison. |
| Output Completeness Monitor | Track output volume trends. Schema Validator checks structure; Completeness Monitor checks quantity. |
| Actor Quality Audit | Overall Store listing quality. Schema Validator checks output schema compliance; Quality Audit checks metadata, README, and input schema quality. |
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page.