Dataset Toolbox

cyberfly/dataset-toolbox

Perform common actions on datasets: merge, unify, validate, transform, order fields, and more.

Input

Field        Type   Optional  Description
Actor IDs    Array  depends   Load latest default datasets
Dataset IDs  Array  depends   Load the specified datasets

Access 3rd party datasets

To access other users' datasets, define the secret environment variable CUSTOM_SOURCE_APIFY_TOKEN.
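
As a rough illustration of how such a token might be picked up, here is a minimal sketch. The helper name pickSourceToken is hypothetical, and the fallback to APIFY_TOKEN (the platform's standard token variable) is an assumption, not documented behavior of this Actor.

```javascript
// Hypothetical helper: choose the API token used to read source datasets.
// CUSTOM_SOURCE_APIFY_TOKEN is the variable named in this README.
// Falling back to APIFY_TOKEN is an assumption made for this sketch.
function pickSourceToken(env) {
  return env.CUSTOM_SOURCE_APIFY_TOKEN || env.APIFY_TOKEN || null;
}

const token = pickSourceToken(process.env);
console.log(token ? 'Source token found' : 'No token configured');
```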

Features

Dataset unification

Use the features described below to produce a single uniform dataset from several datasets that share a common output schema and structure.

Latest dataset detection

Automatically detects and uses default datasets of the latest actor runs when:

  • Actor ID(s) are specified AND
  • Dataset ID(s) are not specified
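
The rule above can be sketched as a small decision function. The input field names actorIds and datasetIds and the returned mode labels are hypothetical. Throwing when neither is given is also an assumption, based on both fields being marked "depends" in the input table.

```javascript
// Hypothetical sketch of the source-selection rule.
// actorIds / datasetIds are assumed names for the two input fields.
function resolveSourceMode({ actorIds = [], datasetIds = [] } = {}) {
  // Explicit dataset IDs take precedence: detection only applies when none are given.
  if (datasetIds.length > 0) return 'explicit-datasets';
  // Only Actor IDs given: use the default dataset of each Actor's latest run.
  if (actorIds.length > 0) return 'latest-actor-runs';
  // Assumption: at least one of the two inputs is required.
  throw new Error('Provide at least one Actor ID or dataset ID');
}
```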

Output fields management

Produces a download link for the resulting dataset, with top-level fields sorted and filtered according to the list of fields provided on input. The link is stored in the default key-value store under the key:

DATASET_DOWNLOAD-CUSTOM_FIELD_ORDER-{selected file type}
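
A minimal sketch of how the key and such a link could be built. It assumes the link points at the Apify API dataset items endpoint, whose `fields` and `format` query parameters pick fields and set the export format. Whether the Actor builds its link this way, and how it spells the file type in the key, are assumptions. Both function names are hypothetical.

```javascript
// Hypothetical: build the key-value store key for a given file type.
function downloadLinkKey(fileType) {
  return `DATASET_DOWNLOAD-CUSTOM_FIELD_ORDER-${fileType}`;
}

// Hypothetical: build a dataset download URL with selected fields in a given order.
function buildDownloadUrl(datasetId, fields, format) {
  const url = new URL(`https://api.apify.com/v2/datasets/${encodeURIComponent(datasetId)}/items`);
  url.searchParams.set('format', format);           // e.g. 'json', 'csv'
  url.searchParams.set('fields', fields.join(',')); // comma-separated field list
  return url.toString();
}

console.log(downloadLinkKey('csv'));
console.log(buildDownloadUrl('myDatasetId', ['title', 'url'], 'csv'));
```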

Filter fields

Keep only the selected fields from the source dataset(s).
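
For illustration, picking top-level fields from one item could look like the sketch below. pickFields is a hypothetical helper, not part of the Actor's API.

```javascript
// Hypothetical helper: keep only the listed top-level fields of an item.
// Fields missing from the item are skipped rather than set to undefined.
function pickFields(item, fields) {
  const picked = {};
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(item, field)) {
      picked[field] = item[field];
    }
  }
  return picked;
}

console.log(pickFields({ title: 'A', url: 'https://example.com', price: 10 }, ['url', 'title']));
```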

Order fields

Arrange top-level fields in a custom order instead of the default alphabetical order.
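
A minimal sketch of custom ordering, assuming listed fields come first in the given order and any remaining fields follow alphabetically. That handling of unlisted fields is an assumption, and orderFields is a hypothetical name.

```javascript
// Hypothetical helper: reorder the top-level fields of an item.
// Listed fields come first (in the given order); the rest follow alphabetically.
// Note: JavaScript always enumerates integer-like keys ("0", "1", ...) first.
function orderFields(item, order = []) {
  const has = (key) => Object.prototype.hasOwnProperty.call(item, key);
  const listed = order.filter(has);
  const rest = Object.keys(item).filter((key) => !listed.includes(key)).sort();
  const ordered = {};
  for (const key of [...listed, ...rest]) {
    ordered[key] = item[key];
  }
  return ordered;
}

console.log(orderFields({ price: 10, url: 'https://example.com', title: 'A' }, ['title', 'url']));
```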

Dataset post-processing

Apply a custom JavaScript function to every item from the source dataset(s) before saving it to the result dataset.
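
One way such a step could work is sketched below, assuming the function arrives as a source string on input. The postProcess helper and its use of `new Function` are illustrative assumptions, not the Actor's confirmed implementation. Compiling code this way executes arbitrary JavaScript, so it only suits trusted input.

```javascript
// Hypothetical sketch: compile a user-supplied function and apply it to each item.
function postProcess(items, transformSource) {
  // Wrap in parentheses so a function expression evaluates to a function value.
  const transform = new Function(`return (${transformSource});`)();
  if (typeof transform !== 'function') {
    throw new TypeError('The post-processing source must evaluate to a function');
  }
  return items.map((item) => transform(item));
}

// Usage: convert a string price into a number on every item.
const processed = postProcess(
  [{ title: 'A', price: '10' }],
  '(item) => ({ ...item, price: Number(item.price) })',
);
console.log(processed);
```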

Output schema validation

Validate schema

Validate every item against the JSON schema specified on input and filter out invalid items before saving to the result dataset.
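
To show the shape of this step without external dependencies, the sketch below uses a deliberately simplified check that covers only `required` and per-property `type`. It is not a full JSON Schema validator, and in practice a complete one such as Ajv would be a better fit. All function names are hypothetical.

```javascript
// Map a JavaScript value to a JSON Schema type name (simplified: no 'integer').
function jsonType(value) {
  if (value === null) return 'null';
  if (Array.isArray(value)) return 'array';
  return typeof value; // 'string', 'number', 'boolean', 'object'
}

// Simplified stand-in for schema validation: checks `required` and `properties[*].type` only.
function isValid(item, schema) {
  for (const key of schema.required || []) {
    if (!(key in item)) return false;
  }
  for (const [key, rule] of Object.entries(schema.properties || {})) {
    if (key in item && rule.type && jsonType(item[key]) !== rule.type) return false;
  }
  return true;
}

// Split items into those that pass and those that fail the check.
function splitByValidity(items, schema) {
  const valid = [];
  const invalid = [];
  for (const item of items) {
    if (isValid(item, schema)) {
      valid.push(item);
    } else {
      invalid.push(item);
    }
  }
  return { valid, invalid };
}
```

Only the `valid` array would go to the result dataset. The `invalid` array is kept for separate handling.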

Reuse invalid items

Invalid items are captured in a separate requestListSources array saved in the key-value store, so they can be reused.
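
As a sketch of what such a conversion might look like, the helper below turns invalid items into request-list-style sources. It assumes each item carries a `url` field, and the `userData.invalidItem` layout is invented for illustration. The Actor's actual record shape and key name are not documented here.

```javascript
// Hypothetical helper: convert invalid items into request list sources.
// Assumes items have a string `url`; items without one are skipped.
function toRequestListSources(invalidItems) {
  return invalidItems
    .filter((item) => typeof item.url === 'string' && item.url.length > 0)
    .map((item) => ({ url: item.url, userData: { invalidItem: item } }));
}

console.log(toRequestListSources([{ url: 'https://example.com/a', price: 'n/a' }]));
```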

Developer
Maintained by Community
Actor metrics
  • 2 monthly users
  • Created in Mar 2020
  • Modified over 1 year ago