Merge, Dedup & Transform Datasets

lukaskrivka/dedup-datasets

Developed by

Lukáš Křivka

Maintained by Community

The ultimate dataset processor. Extremely fast merging, deduplications & transformations all in a single run.

Pricing

Pay per usage

73

Monthly users

142

Runs succeeded

97%

Response time

9.8 days

Last modified

2 months ago

Select contact details

Closed
nico_v opened this issue
a year ago

Hello,

Given the very large number of columns in the contact details output for this run (1,070 columns in total), I'd like to keep only the first 2 contact details of each sub-category (e.g. emails/0 + emails/1; Facebooks/0 + Facebooks/1, ...).

How exactly can I make this happen, please? I've tried in the JSON input, using Fields to Load with e.g. 'contactDetails/facebooks/0', but it wouldn't output any of these.

Happy to hear from you on that, thanks.

lukaskrivka

Hello,

I can write a small script for you to do this in this Actor. Another option would be to join all emails into one column separated by ;, e.g. instead of emails/0 = john@apify.com and emails/1 = peter@apify.com, it would be emails = john@apify.com; peter@apify.com. It could join any number of values (you can decide on a limit). Would you prefer that or your original idea?
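For readers following along, the joining option could look roughly like the sketch below. It is only an illustration: the `ContactDetails` shape and the `joinContacts` helper are assumptions made up for this example, not the Actor's actual code or input format.

```typescript
// Hypothetical shape: each contact type (emails, facebooks, ...) is an array of strings.
type ContactDetails = Record<string, string[]>;

// Join up to `limit` values of every contact type into a single string,
// so each type ends up in one column instead of emails/0, emails/1, ...
function joinContacts(
    contactDetails: ContactDetails,
    limit: number = 2,
    separator: string = '; ',
): Record<string, string> {
    const joined: Record<string, string> = {};
    for (const [type, values] of Object.entries(contactDetails)) {
        joined[type] = values.slice(0, limit).join(separator);
    }
    return joined;
}

const example = joinContacts({
    emails: ['john@apify.com', 'peter@apify.com', 'extra@example.com'],
    facebooks: [],
});
// example.emails is 'john@apify.com; peter@apify.com'
// example.facebooks is ''
```

Passing `', '` as the separator would give a comma-separated column instead, and `limit` controls how many values per type are kept.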

nico_v

a year ago

Hello Lukáš,

Thank you for the suggestions. It would be great if you could indeed have contactDetails 0 AND 1 in one column for each contactDetails type (e.g. Facebook 0 + Facebook 1 into one column, separated by a comma).

How should I proceed?

Also, I would need this not just for this run in particular, but anytime (as I plan many different dataset merges).

Thanks for your help!

lukaskrivka

Hello,

I created a task "Merge Google Maps - Trim to 2 contacts" for you. There is just a simple transform function that cleans the contacts. See the run below. https://console.apify.com/actors/tasks/WyOhdrUQvylK2G1f6/runs/wONYnadbFx2JeWNY3#output

I also added placeId to the deduplication fields so you get rid of duplicate places; it goes down to 31053 places now.
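The transform function used in the task is not shown in this thread, so the following is only a rough sketch of the two steps described above: trimming each contact type to its first 2 entries and dropping items that share a placeId. The `PlaceItem` shape and both helper names are hypothetical, and the Actor handles deduplication itself through its deduplication fields; `dedupByPlaceId` is here only to show the idea.

```typescript
// Hypothetical item shape, loosely modeled on the fields named in this thread.
interface PlaceItem {
    placeId: string;
    contactDetails?: Record<string, string[]>;
}

// Keep only the first `limit` entries of every contact type.
function trimContacts(items: PlaceItem[], limit: number = 2): PlaceItem[] {
    return items.map((item) => {
        if (!item.contactDetails) return item;
        const trimmed: Record<string, string[]> = {};
        for (const [type, values] of Object.entries(item.contactDetails)) {
            trimmed[type] = values.slice(0, limit);
        }
        return { ...item, contactDetails: trimmed };
    });
}

// Keep the first item seen for each placeId and drop later duplicates.
function dedupByPlaceId(items: PlaceItem[]): PlaceItem[] {
    const seen = new Set<string>();
    return items.filter((item) => {
        if (seen.has(item.placeId)) return false;
        seen.add(item.placeId);
        return true;
    });
}

// Made-up sample data for illustration only.
const places: PlaceItem[] = [
    { placeId: 'a', contactDetails: { emails: ['1@example.com', '2@example.com', '3@example.com'] } },
    { placeId: 'a', contactDetails: { emails: ['dup@example.com'] } },
    { placeId: 'b' },
];

const result = trimContacts(dedupByPlaceId(places));
```

With the sample data, the second item is dropped as a duplicate of placeId 'a', and the first item's emails are cut to two entries, so a flat export would only contain the /0 and /1 columns for each contact type.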

Pricing

Pricing model

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage.