Merge, Dedup & Transform Datasets
lukaskrivka/dedup-datasets
The ultimate dataset processor. Extremely fast merging, deduplication & transformation, all in a single run.
2023-07-13
Features
- Add customInputData object to input for easy passing of custom values into preDedupTransformFunction and postDedupTransformFunction. It is part of the 2nd parameter object.
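A minimal sketch of how a transform function might read customInputData, assuming the second parameter is an object that exposes it under that name, as the entry above describes. The minPrice value and the price field are hypothetical and used only for illustration; the call at the bottom simulates the actor locally and is not the actor's own code.

```javascript
// Hypothetical transform function. It receives the loaded items and a
// second parameter object that (per the changelog entry) carries customInputData.
const preDedupTransformFunction = (items, { customInputData }) => {
    // minPrice is an illustrative custom value, not a built-in input field.
    const { minPrice } = customInputData;
    return items.filter((item) => item.price >= minPrice);
};

// Local simulation of how the actor could call the function.
const sampleItems = [
    { id: 1, price: 5 },
    { id: 2, price: 20 },
];
const filtered = preDedupTransformFunction(sampleItems, {
    customInputData: { minPrice: 10 },
});
console.log(filtered);
```

Passing values this way keeps the function body unchanged between runs; only the customInputData object in the input needs to be edited.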
2021-01-24
Features
- Added fieldsToLoad to input to increase speed and reduce memory if you don't need full items in output
- Added limit and offset to input to be able to process only slices of dataset
- Removed uploadSleepMs as the platform can now handle much higher load of upload
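The effect of fieldsToLoad, limit and offset can be illustrated with a small local sketch. It assumes offset skips items before limit caps the count, and that fieldsToLoad keeps only the listed top-level fields. The loadSlice helper is hypothetical and runs on an in-memory array; the actor itself applies these options while loading from the platform.

```javascript
// Hypothetical helper that mimics the three input options on a plain array.
const loadSlice = (items, { offset = 0, limit = Infinity, fieldsToLoad } = {}) => {
    // Skip `offset` items first, then take at most `limit` items (assumed order).
    const slice = items.slice(offset, offset + limit);
    if (!fieldsToLoad) return slice;
    // Keep only the requested fields so each item uses less memory.
    return slice.map((item) => Object.fromEntries(
        fieldsToLoad
            .filter((field) => field in item)
            .map((field) => [field, item[field]]),
    ));
};

const pages = [
    { url: 'a', title: 'A', html: '<html>...</html>' },
    { url: 'b', title: 'B', html: '<html>...</html>' },
    { url: 'c', title: 'C', html: '<html>...</html>' },
    { url: 'd', title: 'D', html: '<html>...</html>' },
];
const sliced = loadSlice(pages, { offset: 1, limit: 2, fieldsToLoad: ['url', 'title'] });
console.log(sliced);
```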
2021-01-14
Features
- outputDatasetId can now also use a dataset name. If a dataset with that name doesn't exist, a new dataset is created.
2020-07-10
Fixes:
- dedup-as-loading mode now works correctly with actor migrations. This means that this actor can finally be used for huge datasets with lower memory!
Features:
- fields are now optional, which means the actor does not need to perform deduplication
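As a rough illustration of what optional fields means, the sketch below deduplicates by the combination of the listed fields and returns the items unchanged when no fields are given. The dedupByFields helper and the sample data are hypothetical, and the way the key is built is an assumption, not necessarily how the actor does it.

```javascript
// Hypothetical helper: keep the first item seen for each combination of field values.
const dedupByFields = (items, fields) => {
    // No fields configured: skip deduplication entirely.
    if (!fields || fields.length === 0) return items;
    const seenKeys = new Set();
    const unique = [];
    for (const item of items) {
        // Serialize the selected values into one string key.
        // Note: missing (undefined) values serialize the same as null here.
        const key = JSON.stringify(fields.map((field) => item[field]));
        if (seenKeys.has(key)) continue;
        seenKeys.add(key);
        unique.push(item);
    }
    return unique;
};

const products = [
    { sku: 'x1', shop: 'A', price: 10 },
    { sku: 'x1', shop: 'A', price: 12 },
    { sku: 'x1', shop: 'B', price: 11 },
];
const uniqueProducts = dedupByFields(products, ['sku', 'shop']);
const untouched = dedupByFields(products, []);
console.log(uniqueProducts.length, untouched.length);
```

Because the set of seen keys is the only state needed, it is the kind of data that would have to be saved and restored for deduplication to survive a migration.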
Previous updates
Previous updates were not tracked. See the GitHub commits if you need to find past changes, or ask in Issues or Discord.
Developer
Maintained by Apify
Actor metrics
- 231 monthly users
- 97.6% runs succeeded
- 0.8 days response time
- Created in Apr 2020
- Modified 9 days ago
Categories