
Diff Datasets
Pricing
Pay per usage
Last modified
3 years ago
Takes one dataset on the Apify platform (the "base"), compares it to another (the "other"), and outputs the items from "other" that are missing in "base". It can also output only changed items by using a compound key.
Whole nested objects can be used as values: they are JSON.stringify'd before being turned into a small, space-efficient, non-cryptographic hash.
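The key construction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actor's actual code: `getPath`, `fnv1a` and `compoundKey` are hypothetical names, the path lookup is a simplified stand-in for lodash.get, and FNV-1a is only one example of a small non-cryptographic hash (the actor does not say which hash it uses).

```javascript
// Hypothetical helper: resolve a dotted path such as "sub.fields.0" or
// "sub.fields.length", similar in spirit to lodash.get.
function getPath(obj, path) {
    return path.split('.').reduce(
        (value, key) => (value === null || value === undefined ? undefined : value[key]),
        obj,
    );
}

// FNV-1a (32-bit) over UTF-16 code units. This is an assumed stand-in for
// the unspecified hash; it is not cryptographically secure.
function fnv1a(text) {
    let hash = 0x811c9dc5;
    for (let i = 0; i < text.length; i++) {
        hash ^= text.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193);
    }
    return (hash >>> 0).toString(36);
}

// Build one compound key per item: collect the value of every unique field,
// stringify the list (nested objects included), then hash the string.
function compoundKey(item, uniqueFields) {
    const values = uniqueFields.map((field) => getPath(item, field));
    return fnv1a(JSON.stringify(values));
}
```

Two items that agree on every listed field produce the same key, no matter what other fields they carry. Because the hash is short, distinct items can in principle collide, which is the usual trade-off for storing less per item.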
Example
    await Apify.call('pocesar/diff-datasets', {
        baseDatasetId: 'LdNAlaOY1aKGhwAah', // place the datasets here. The order of "base" and "other" matters
        otherDatasetId: 'Bzu1pgOjenN43VhPY', // existing items in "base" are not output from "other"
        uniqueFields: [
            // simple primitive field value, like string, number, boolean
            "pageUrl",

            // you can use lodash.get notation to get nested items,
            // in this case `sub.fields.0` works like `sub.fields[0]` and the object looks like
            // {
            //   pageUrl: "https//pageurl",
            //   sub: {
            //     fields: [
            //       {...},
            //       {...}
            //     ]
            //   }
            // }
            "sub.fields.0",

            // you can also use .length to count arrays or string characters, as in
            "sub.fields.length",
            "pageUrl.length"
        ],
    });
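To make the role of the "base" and "other" order and of the compound key more concrete, here is a minimal in-memory sketch of the comparison. It assumes plain arrays instead of Apify datasets; `diffItems`, `byUrl` and `byUrlAndPrice` are hypothetical names used only for illustration, and the actor's real implementation may differ.

```javascript
// Hypothetical in-memory version of the comparison. keyOf is any function
// that maps an item to the key it should be compared by.
function diffItems(baseItems, otherItems, keyOf) {
    const seen = new Set();
    for (const item of baseItems) {
        seen.add(keyOf(item));
    }
    // Keep only the items from "other" whose key never appeared in "base".
    return otherItems.filter((item) => !seen.has(keyOf(item)));
}

const base = [
    { pageUrl: 'https://a', price: 10 },
    { pageUrl: 'https://b', price: 20 },
];
const other = [
    { pageUrl: 'https://a', price: 10 },
    { pageUrl: 'https://b', price: 25 },
    { pageUrl: 'https://c', price: 5 },
];

// A single field finds items that are missing from "base".
const byUrl = (item) => item.pageUrl;
// A compound key also treats items whose price changed as different.
const byUrlAndPrice = (item) => JSON.stringify([item.pageUrl, item.price]);

const missing = diffItems(base, other, byUrl);
const changed = diffItems(base, other, byUrlAndPrice);
```

With the single-field key only `https://c` is reported. With the compound key, `https://b` is reported as well because its price differs between the two lists.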
Limitations
- Every value is kept in memory while reading from the base dataset, so the more items it contains, the more memory is needed.
- The key-value store might choke when trying to save the in-memory Set if it holds too many items.
License
Apache 2.0