Duplications Checker

lukaskrivka/duplications-checker

Check your dataset for duplications. Accept only the highest quality data!

Dataset ID

datasetId (string, optional)

ID of the dataset where the data is located. To use another input type, such as a key-value store record or raw JSON, see Other data sources.

Check only clean dataset items

checkOnlyCleanItems (boolean, optional)

If the datasetId option is provided, only clean dataset items will be loaded and used for the duplication check.

Default value of this property is false

Fields

fields (array, required)

List of fields in each item that will be checked for duplicates. Each field must be a top-level (not nested) field and should contain only a simple value (string or number). You can prepare your data with preCheckFunction.

Default value of this property is []
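
The requirement that each checked field be a top-level simple value can be illustrated with a short sketch. The Python below only shows the kind of preparation involved; the helper name flatten_field and the sample item are hypothetical, and it does not show the actual format expected by preCheckFunction.

```python
from typing import Any


def flatten_field(item: dict[str, Any], path: list[str], target: str) -> dict[str, Any]:
    """Copy a nested value to a top-level key so it can be checked as a simple field."""
    value: Any = item
    for key in path:
        # Stop early if the path does not exist in this item.
        if not isinstance(value, dict) or key not in value:
            value = None
            break
        value = value[key]
    # Return a new dict; the original item is left unchanged.
    return {**item, target: value}


# Hypothetical item with a nested seller ID.
item = {"name": "Widget", "seller": {"id": 42}}
prepared = flatten_field(item, ["seller", "id"], "sellerId")
print(prepared["sellerId"])
```

After this kind of preparation, "sellerId" could be listed in fields because it is a plain number at the top level of the item.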

Pre-check function

preCheckFunction (string, optional)

You can specify which fields are displayed in the debug OUTPUT to identify bad items. By default, all fields are shown, which may make the output unnecessarily large.

Minimum duplications

minDuplications (integer, optional)

Minimum number of occurrences of a value for it to be included in the report.

Default value of this property is 2
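
To make the threshold concrete, here is a minimal Python sketch of counting occurrences of a field value and keeping only values that reach the minimum. It is an illustration of the idea, not the actor's implementation; the function name find_duplicates and the sample items are hypothetical.

```python
from collections import defaultdict
from typing import Any


def find_duplicates(
    items: list[dict[str, Any]], field: str, min_duplications: int = 2
) -> dict[Any, list[int]]:
    """Map each repeated value of `field` to the indexes of the items holding it."""
    indexes_by_value: dict[Any, list[int]] = defaultdict(list)
    for index, item in enumerate(items):
        value = item.get(field)
        if value is None or value == "":
            continue  # missing values are handled separately in this sketch
        indexes_by_value[value].append(index)
    # Keep only values that occur at least `min_duplications` times.
    return {
        value: indexes
        for value, indexes in indexes_by_value.items()
        if len(indexes) >= min_duplications
    }


items = [{"url": "a"}, {"url": "b"}, {"url": "a"}, {"url": "a"}, {"url": "b"}]
print(find_duplicates(items, "url"))                      # {'a': [0, 2, 3], 'b': [1, 4]}
print(find_duplicates(items, "url", min_duplications=3))  # {'a': [0, 2, 3]}
```

Raising the threshold from 2 to 3 drops "b" from the result because it occurs only twice.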

Show indexes

showIndexes (boolean, optional)

Indexes of the duplicate items will be shown in the OUTPUT report. Set to false if you don't need them.

Default value of this property is true

Show items

showItems (boolean, optional)

Duplicate items will be pushed to a dataset. Set to false if you don't need them.

Default value of this property is true

Show missing fields

showMissing (boolean, optional)

Items where the value of a checked field is missing, null, or '' (an empty string) will be included in the report.

Default value of this property is true
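
The three cases named above (missing, null, and empty string) can be expressed as a small Python sketch. The helper name find_missing is hypothetical and this is not the actor's own code; note that under this reading a value such as 0 is not treated as missing.

```python
from typing import Any


def find_missing(items: list[dict[str, Any]], field: str) -> list[int]:
    """Return indexes of items whose `field` is absent, None, or an empty string."""
    missing: list[int] = []
    for index, item in enumerate(items):
        value = item.get(field)  # absent keys come back as None
        if value is None or value == "":
            missing.append(index)
    return missing


items = [{"id": 1}, {"id": None}, {}, {"id": ""}, {"id": 0}]
print(find_missing(items, "id"))  # [1, 2, 3]
```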

Limit

limit (integer, optional)

Number of items to check. By default, all items are checked.

Offset

offset (integer, optional)

Position of the item at which checking starts. Use with limit to check a specific range of items.
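
As a rough illustration of how limit and offset combine, the Python sketch below selects a window of items. The function name select_window is hypothetical, and treating offset as zero-based is an assumption that the text above does not confirm.

```python
from typing import Any, Optional


def select_window(items: list[Any], offset: int = 0, limit: Optional[int] = None) -> list[Any]:
    """Return `limit` items starting at `offset` (assumed zero-based); all items if limit is None."""
    end = None if limit is None else offset + limit
    return items[offset:end]


items = list(range(10))
print(select_window(items, offset=3, limit=4))  # [3, 4, 5, 6]
print(select_window(items, offset=8))           # [8, 9]
```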

Batch Size

batchSize (integer, optional)

Number of items loaded and processed in each batch. You only need to change this if your items are very large.

Default value of this property is 1000
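
Batch processing of this kind can be sketched in Python as follows. This only illustrates the general idea of working through items in fixed-size chunks so that fewer items are held in memory at once; the generator name iter_batches is hypothetical and the actor's own loading logic may differ.

```python
from typing import Any, Iterator


def iter_batches(items: list[Any], batch_size: int = 1000) -> Iterator[list[Any]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


items = list(range(2500))
sizes = [len(batch) for batch in iter_batches(items)]
print(sizes)  # [1000, 1000, 500]
```

A smaller batch size means more round trips but less data in memory per step, which is the trade-off that matters when individual items are very large.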

Key value store Record

keyValueStoreRecord (string, optional)

ID and record key to use if you want to load data from a key-value store. The format is {keyValueStoreId}+{recordKey}, e.g. s5NJ77qFv8b4osiGR+MY-KEY
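
The {keyValueStoreId}+{recordKey} format can be split into its two parts as in the Python sketch below. The helper name parse_kv_record is hypothetical, and splitting on the first "+" is an assumption about how the value is interpreted.

```python
def parse_kv_record(record: str) -> tuple[str, str]:
    """Split '{keyValueStoreId}+{recordKey}' into (store ID, record key)."""
    # Assumption: the first '+' separates the store ID from the record key.
    store_id, separator, record_key = record.partition("+")
    if not separator or not store_id or not record_key:
        raise ValueError("expected the format {keyValueStoreId}+{recordKey}")
    return store_id, record_key


store_id, record_key = parse_kv_record("s5NJ77qFv8b4osiGR+MY-KEY")
print(store_id)    # s5NJ77qFv8b4osiGR
print(record_key)  # MY-KEY
```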

Raw Data

rawData (array, optional)

Raw JSON array you want to check.
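
Putting the options together, an input that uses rawData might look like the Python sketch below, which builds the input as a dictionary and prints it as JSON. The property names and default values are the ones listed on this page; the sample items and the "url" field are hypothetical.

```python
import json

# Hypothetical sample items; "url" is an illustrative field name.
raw_data = [
    {"url": "https://example.com/a", "title": "A"},
    {"url": "https://example.com/b", "title": "B"},
    {"url": "https://example.com/a", "title": "A again"},
]

run_input = {
    "rawData": raw_data,    # used here instead of datasetId
    "fields": ["url"],      # required: top-level fields to check
    "minDuplications": 2,   # documented default
    "showIndexes": True,    # documented default
    "showItems": True,      # documented default
    "showMissing": True,    # documented default
    "batchSize": 1000,      # documented default
}

print(json.dumps(run_input, indent=2))
```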

Developer
Maintained by Community

Actor Metrics

  • 5 monthly users

  • 11 stars

  • >99% runs succeeded

  • Created in Aug 2019

  • Modified 4 years ago

Categories