Validate Dataset(s) with JSON Schema

Developed by

Jaroslav Hejlek

Maintained by Community

This Actor validates items in one or more datasets against a provided JSON Schema. Use it if you are planning to add a dataset validation schema to your Actor and want to test it.

This Apify Actor validates items in one or more datasets against a provided JSON Schema. It helps identify invalid items and provides detailed validation errors for each item that doesn't match the schema.

Features

  • Validate multiple datasets in a single run
  • Support for both Dataset IDs and Run IDs
  • Detailed validation error reporting
  • Uses JSON Schema Draft-07

Input

The actor accepts the following input parameters:

{
    "datasetIds": ["datasetId1", "datasetId2"], // Array of Dataset IDs or Run IDs
    "schema": {                                  // JSON Schema to validate against
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            // Your schema properties here
        },
        "required": []
    }
}

Input Parameters Details

  • datasetIds (required, array of strings)
    • List of Dataset IDs or Run IDs to validate
    • You can use either a Dataset ID (e.g., "1234567890") or a Run ID (e.g., "yourRunId")
  • schema (required, object)
    • JSON Schema definition that describes the expected structure of items
    • Must be a valid JSON Schema (Draft-07)
    • Provided as an object in the input
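
As a rough illustration of the two parameters, the sketch below assembles an input object and checks the documented types before it is submitted. The helper name build_input is hypothetical and not part of the Actor; it only mirrors the "array of strings" and "object" requirements listed above.

```python
import json


def build_input(dataset_ids, schema):
    """Assemble the Actor input and check the two documented requirements."""
    # datasetIds: required, array of strings (Dataset IDs or Run IDs).
    if not isinstance(dataset_ids, list) or not all(isinstance(i, str) for i in dataset_ids):
        raise TypeError("datasetIds must be a list of strings")
    # schema: required, provided as an object (a dict here), not as a JSON string.
    if not isinstance(schema, dict):
        raise TypeError("schema must be an object")
    return {"datasetIds": dataset_ids, "schema": schema}


actor_input = build_input(
    ["datasetId1", "datasetId2"],
    {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {},
        "required": [],
    },
)

# Serialized form; strict JSON cannot carry the // comments shown in the input example.
print(json.dumps(actor_input, indent=4))
```

The serialized output is plain JSON, so the explanatory comments from the example input have to be dropped before the input is sent anywhere that parses JSON strictly.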

Output

The actor stores validation results in its default dataset. Each record in the output dataset has the following structure:

{
    "datasetId": "string",      // ID of the dataset being validated
    "itemPosition": "number",      // Position of the invalid item in the dataset (0-based)
    "validationErrors": [        // Validation errors from AJV validator
        // Detailed error information for each error
    ]
}

Only invalid items (those that don't match the schema) are included in the output.
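
Because each record carries a datasetId and an itemPosition, the output is easy to summarize once downloaded. The following is a minimal sketch, assuming the records have already been loaded into a Python list; the sample records and the summarize helper are hypothetical, and the error entries are placeholders rather than real validator output.

```python
from collections import defaultdict


def summarize(records):
    """Map each datasetId to the sorted positions of its invalid items."""
    positions = defaultdict(list)
    for record in records:
        positions[record["datasetId"]].append(record["itemPosition"])
    return {dataset_id: sorted(found) for dataset_id, found in positions.items()}


# Hypothetical output records; the error entries are placeholders.
records = [
    {"datasetId": "datasetId1", "itemPosition": 4, "validationErrors": [{"message": "placeholder"}]},
    {"datasetId": "datasetId2", "itemPosition": 0, "validationErrors": [{"message": "placeholder"}]},
    {"datasetId": "datasetId1", "itemPosition": 1, "validationErrors": [{"message": "placeholder"}]},
]

for dataset_id, found in summarize(records).items():
    print(f"{dataset_id}: {len(found)} invalid item(s) at positions {found}")
```

A dataset that does not appear in the summary had no invalid items, since valid items are never written to the output.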

Usage Example

{
    "datasetIds": ["abc123xyz789"],
    "schema": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
            "url": { "type": "string", "format": "uri" },
            "title": { "type": "string" },
            "price": { "type": "number" }
        },
        "required": ["url", "title"]
    }
}
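
To give a feel for which items this schema would flag, here is a toy checker that produces records in the output shape described earlier. It is only a sketch: it covers a small subset of Draft-07 ("type", "required", and per-property "type"), ignores "format", and its error dictionaries are simplified stand-ins, not the error objects Ajv actually emits. The function names and sample items are hypothetical.

```python
def matches_type(value, type_name):
    """Check a value against the few JSON Schema types this sketch knows."""
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "number":
        # bool is a subclass of int in Python but is not a JSON number.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "object":
        return isinstance(value, dict)
    return True  # any other type name is left unchecked here


def check_item(item, schema):
    """Return simplified error dicts for one item; an empty list means valid."""
    if not matches_type(item, schema.get("type")):
        return [{"keyword": "type", "message": f"must be {schema['type']}"}]
    if not isinstance(item, dict):
        return []
    errors = []
    for name in schema.get("required", []):
        if name not in item:
            errors.append({"keyword": "required", "message": f"missing '{name}'"})
    for name, rule in schema.get("properties", {}).items():
        if name in item and not matches_type(item[name], rule.get("type")):
            errors.append({"keyword": "type", "message": f"'{name}' must be {rule['type']}"})
    return errors


def validate_dataset(dataset_id, items, schema):
    """Build one output-style record per invalid item (0-based position)."""
    records = []
    for position, item in enumerate(items):
        errors = check_item(item, schema)
        if errors:
            records.append(
                {"datasetId": dataset_id, "itemPosition": position, "validationErrors": errors}
            )
    return records


schema = {
    "type": "object",
    "properties": {
        "url": {"type": "string", "format": "uri"},  # "format" is ignored by this sketch
        "title": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["url", "title"],
}

# Hypothetical dataset items.
items = [
    {"url": "https://example.com/a", "title": "A", "price": 9.5},
    {"url": "https://example.com/b", "price": "12"},
    {"url": "https://example.com/c", "title": "C"},
]

records = validate_dataset("abc123xyz789", items, schema)
print(records)
```

Only the second item is reported: it lacks the required "title" and carries "price" as a string. The third item omits "price" entirely, which is fine because "price" is not listed under "required".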

Limitations

  • Validation runs sequentially, one item at a time, so very large datasets can take a long time to process
  • At most 1000 validation errors are held in memory before being pushed to the output dataset
  • The actor validates against JSON Schema Draft-07
  • Input schema must be a valid JSON schema
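
The 1000-error limit suggests a buffer that is flushed to the output dataset in batches. The sketch below shows one way such buffering could work; it is an assumption about the general technique, not the Actor's actual code. The ErrorBuffer class and the push callback are hypothetical, and it treats one output record as one buffered entry.

```python
class ErrorBuffer:
    """Collects records and hands them to `push` in batches of at most `limit`."""

    def __init__(self, push, limit=1000):
        self.push = push      # callback that receives a list of records
        self.limit = limit    # assumed batch size, taken from the documented limit
        self.pending = []

    def add(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.limit:
            self.flush()

    def flush(self):
        if self.pending:
            self.push(self.pending)
            self.pending = []


batches = []  # stands in for the output dataset
buffer = ErrorBuffer(push=batches.append, limit=1000)
for position in range(2500):
    buffer.add({"datasetId": "datasetId1", "itemPosition": position, "validationErrors": []})
buffer.flush()  # push whatever is left at the end of the run

print([len(batch) for batch in batches])  # prints [1000, 1000, 500]
```

Flushing in batches keeps memory use bounded regardless of how many invalid items a dataset contains, at the cost of more write calls.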

Dependencies

  • Node.js 20+
  • Ajv for JSON Schema validation
  • Apify SDK for Apify platform integration

Pricing

Pricing model

Pay per usage

This Actor is paid per platform usage: the Actor itself is free to use, and you pay only for the Apify platform usage.