Merge, Dedup & Transform Datasets

lukaskrivka/dedup-datasets

The ultimate dataset processor. Extremely fast merging, deduplication, and transformation, all in a single run.

2023-07-13

Features

  • Added customInputData object to the input so that custom values can be passed easily into preDedupTransformFunction and postDedupTransformFunction. It is available on the object passed as the 2nd parameter.
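A minimal sketch of what such a run input might look like, written as a Python dictionary that serializes to JSON. The names customInputData and postDedupTransformFunction come from the entry above; the datasetIds field, the array shape of fields, the dataset ID placeholder, the minPrice key, and the assumption that the transform function receives the array of items as its 1st parameter are illustrative and should be checked against the Actor's input schema.

```python
import json

# Hypothetical run input. The dataset ID and the "minPrice" key are placeholders.
run_input = {
    "datasetIds": ["<YOUR_DATASET_ID>"],  # assumed input field name
    "fields": ["url"],                    # assumed to be a list of field names
    "customInputData": {"minPrice": 10},  # arbitrary custom values
    # JavaScript source passed as a string. customInputData is read from the
    # 2nd parameter object; `items` as the 1st parameter is an assumption.
    "postDedupTransformFunction": (
        "(items, { customInputData }) => "
        "items.filter((item) => item.price >= customInputData.minPrice)"
    ),
}

# The input must survive JSON serialization to be sent to the platform.
print(json.dumps(run_input, indent=2))
```

Keeping thresholds and similar values in customInputData avoids editing the function source string every time a value changes.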

2021-01-24

Features

  • Added fieldsToLoad to input to increase speed and reduce memory usage if you don't need full items in the output
  • Added limit and offset to input to be able to process only a slice of a dataset
  • Removed uploadSleepMs as the platform can now handle a much higher upload load
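To illustrate how fieldsToLoad, limit, and offset could interact, here is a small local model in Python. It is a sketch of the general idea only, not the Actor's implementation: the function name load_slice is hypothetical, and applying the offset and limit window before the field projection is an assumption.

```python
from __future__ import annotations

from typing import Any, Optional


def load_slice(
    items: list[dict[str, Any]],
    offset: int = 0,
    limit: Optional[int] = None,
    fields_to_load: Optional[list[str]] = None,
) -> list[dict[str, Any]]:
    """Return a window of items, optionally keeping only selected fields."""
    # Take the window first: skip `offset` items, then keep at most `limit`.
    end = None if limit is None else offset + limit
    window = items[offset:end]
    if not fields_to_load:
        # No projection requested: return shallow copies of the full items.
        return [dict(item) for item in window]
    # Keep only the requested fields; fields missing from an item are skipped.
    return [
        {key: item[key] for key in fields_to_load if key in item}
        for item in window
    ]


items = [
    {"id": i, "url": f"https://example.com/{i}", "html": "<p>large body</p>"}
    for i in range(5)
]
subset = load_slice(items, offset=1, limit=2, fields_to_load=["id", "url"])
print(subset)
```

Dropping a large field such as html in this model is where the memory saving would come from, since only the projected items are held.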

2021-01-14

Features

  • outputDatasetId can now also take a dataset name. If a dataset with that name doesn't exist, a new dataset is created.

2020-07-10

Fixes

  • dedup-as-loading mode now works correctly with actor migrations. This means that this actor can finally be used for huge datasets with lower memory!

Features

  • fields is now optional, which means the actor can run without performing deduplication (for example, to only merge or transform datasets)
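The effect of an optional fields input can be modelled locally as follows. This is a sketch under stated assumptions rather than the Actor's code: the dedup function is hypothetical, the first occurrence of each key is kept, and the key is built by JSON-encoding the value of each listed field.

```python
from __future__ import annotations

import json
from typing import Any, Optional


def dedup(
    items: list[dict[str, Any]],
    fields: Optional[list[str]] = None,
) -> list[dict[str, Any]]:
    """Keep the first item for each combination of values in `fields`."""
    if not fields:
        # No fields given: skip deduplication and pass every item through.
        return list(items)
    seen: set[tuple[str, ...]] = set()
    unique: list[dict[str, Any]] = []
    for item in items:
        # JSON-encode each value so nested lists and dicts become hashable.
        key = tuple(json.dumps(item.get(f), sort_keys=True) for f in fields)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


rows = [
    {"url": "https://example.com/a", "price": 1},
    {"url": "https://example.com/a", "price": 2},
    {"url": "https://example.com/b", "price": 3},
]
print(dedup(rows, ["url"]))  # two items remain, one per distinct url
print(dedup(rows))           # all three items remain, no deduplication
```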

Previous updates

Previous updates were not tracked. See the GitHub commits if you need to find past changes, or ask in Issues or Discord.

Developer
Maintained by Apify
Actor metrics
  • 234 monthly users
  • 97.5% runs succeeded
  • 0.4 days response time
  • Created in Apr 2020
  • Modified 3 months ago