Easy Data Processor: Merge, Clean, and Transform Your Data

Deprecated

Pricing

$9.99/month + usage

Developed by

codemaster devops

Maintained by Community

Meet the Ultimate Data Processor, a human-friendly tool that simplifies your data tasks. With this Apify actor, you can merge datasets, remove duplicates, and transform data quickly, all in one run. Say goodbye to complex processes and hello to streamlined data management.

Last modified

a year ago

Automation

Integrations

The ultimate dataset processing actor - merge, dedup & transform

A refined and optimized dataset processing actor for large-scale merging, deduplication, and transformation.

Why use this actor

Extremely fast data processing thanks to parallelized workloads (easily 20x faster than default loading/pushing of datasets)
Reads from multiple datasets simultaneously, ideal for merging after scraping with many runs
Actor migration proof: all steps that can be persisted are persisted, so work is not repeated and no duplicate data is pushed
Dedup as loading mode allows near-constant-memory processing even for huge datasets (think 10M+ items)
Deduplication can combine many fields, including nested objects and arrays (those are JSON.stringified for a deep equality check)
Can store output into key-value store records
Allows super fast blank runs that only count duplicates
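
The dedup-as-loading and blank-run points above can be illustrated with a minimal sketch. This is not the actor's actual code: `dedupBatches`, `keyOf`, and the `dryRun` option are hypothetical names, and the batches are plain in-memory arrays standing in for pages loaded from datasets. The idea it shows is that only the set of seen keys persists between batches, so memory grows with the number of unique keys rather than with the full item payloads.

```javascript
// Hypothetical sketch: process items batch by batch, remembering only dedup keys.
function dedupBatches(batches, keyOf, { dryRun = false } = {}) {
    const seen = new Set(); // persists across batches; holds keys, not items
    const output = [];      // stands in for pushing to the output dataset
    let duplicates = 0;

    for (const batch of batches) {
        for (const item of batch) {
            const key = keyOf(item);
            if (seen.has(key)) {
                duplicates += 1;
                continue;
            }
            seen.add(key);
            // In a dry ("blank") run we only count and push nothing.
            if (!dryRun) output.push(item);
        }
    }
    return { output, duplicates, unique: seen.size };
}

// Two batches, as if loaded from two datasets.
const batches = [
    [{ id: 1 }, { id: 2 }],
    [{ id: 2 }, { id: 3 }, { id: 1 }],
];
const result = dedupBatches(batches, (item) => String(item.id));
const blank = dedupBatches(batches, (item) => String(item.id), { dryRun: true });
```

In the dry run the `output` array stays empty while `duplicates` and `unique` are still counted, which is roughly what a blank run is described as doing.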

Merging

You can provide more than one dataset. In that case, all items are merged into a single dataset or key-value store output. If you use the Dedup after load mode, the items keep the order of the datasets provided.
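
As a rough illustration of order-preserving merging, here is a minimal sketch that assumes the datasets have already been loaded into arrays. `mergeDatasets` is a hypothetical helper, not part of the actor or the Apify SDK.

```javascript
// Hypothetical sketch: merge several already-loaded datasets into one array.
function mergeDatasets(datasets) {
    // flat() flattens one level: dataset order first, then item order within each dataset.
    return datasets.flat();
}

const datasetA = [{ id: 'a1' }, { id: 'a2' }];
const datasetB = [{ id: 'b1' }];
const merged = mergeDatasets([datasetA, datasetB]);
```

All items of the first dataset come before any item of the second, which matches the ordering described for the Dedup after load mode.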

Deduplication

If you provide deduplication fields (optional), this actor deduplicates the dataset items. The deduplication process checks the value of each field for equality and returns only the first item that has a given value for that field.

You can provide more than one field. In that case, a combined string of those fields is checked; e.g. "name": "Adidas Shoes", "id": "12345" is converted into "Adidas Shoes12345" for the comparison. Only items that match on all of the fields are considered duplicates, so the more fields you add, the fewer duplicates will be found.

Fields that are objects or arrays are also deeply compared via JSON.stringify. Just be aware that doing this for very large structures might have performance implications.
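
The key construction described in this section can be sketched as follows. This is an illustration under assumptions, not the actor's implementation: `dedupKey` and `dedupItems` are hypothetical names, and the field values are joined without a separator only because that is how the "Adidas Shoes12345" example reads.

```javascript
// Hypothetical helper: build one comparison string from the chosen fields.
function dedupKey(item, fields) {
    return fields
        .map((field) => {
            const value = item[field];
            // Objects and arrays are serialized so they compare by content.
            return value !== null && typeof value === 'object'
                ? JSON.stringify(value)
                : String(value);
        })
        .join('');
}

// Keep only the first item seen for each key.
function dedupItems(items, fields) {
    const seen = new Set();
    return items.filter((item) => {
        const key = dedupKey(item, fields);
        if (seen.has(key)) return false;
        seen.add(key);
        return true;
    });
}

const products = [
    { name: 'Adidas Shoes', id: '12345', sizes: [40, 41] },
    { name: 'Adidas Shoes', id: '12345', sizes: [42] },
    { name: 'Adidas Shoes', id: '67890', sizes: [40, 41] },
];
const byName = dedupItems(products, ['name']);
const byNameAndId = dedupItems(products, ['name', 'id']);
const byAllFields = dedupItems(products, ['name', 'id', 'sizes']);
```

Adding fields makes the key more specific, so fewer items collide. Note that JSON.stringify is sensitive to property order, so two objects with the same properties in a different order produce different strings.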

Transformation

This actor enables you to do arbitrary data transformations before and after deduplication via preDedupTransformFunction and postDedupTransformFunction.

These functions take the array of items and should return an array of items. You don't need to return the same number of items (you can filter some out or add new ones).

As the second argument, you can access an object with helper variables, currently containing the Apify SDK reference.

The default transformation does nothing with the items:

(items, { Apify, customInputData }) => {
    return items;
}
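
For comparison with the default, here is a sketch of a transformation that both filters and rewrites items. The `price` and `name` fields and the `minPrice` key inside `customInputData` are assumptions made up for this example; the source only shows that `customInputData` is available in the helper object.

```javascript
// Hypothetical preDedupTransformFunction: drop items without a usable price
// and trim the name. `minPrice` in customInputData is an assumed input.
const preDedupTransformFunction = (items, { customInputData }) => {
    const minPrice = (customInputData && customInputData.minPrice) || 0;
    return items
        .filter((item) => typeof item.price === 'number' && item.price >= minPrice)
        .map((item) => ({ ...item, name: String(item.name).trim() }));
};

// Local usage example with a stand-in helper object.
const rawItems = [
    { name: ' Adidas Shoes ', price: 50 },
    { name: 'Free sample', price: null },
    { name: 'Socks', price: 5 },
];
const cleaned = preDedupTransformFunction(rawItems, {
    Apify: null,
    customInputData: { minPrice: 10 },
});
```

The function returns fewer items than it receives, which is allowed, and it returns new objects instead of mutating the input array.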

In dedup-as-loading mode, you only have access to the items of the current batch. You can also access the datasetId and datasetOffset parameters, because each batch comes from a single dataset.

(items, { Apify, datasetId, datasetOffset, customInputData }) => {
    return items;
}
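
One possible use of these parameters is to record where each item came from. The sketch below assumes that `datasetOffset` is the position of the batch's first item within its dataset, which the source does not state explicitly; `sourceDatasetId` and `sourceIndex` are made-up output fields.

```javascript
// Hypothetical transform for dedup-as-loading mode: annotate each item with its origin.
// Assumption: datasetOffset is the index of the batch's first item within its dataset.
const tagOrigin = (items, { datasetId, datasetOffset }) => {
    return items.map((item, index) => ({
        ...item,
        sourceDatasetId: datasetId,
        sourceIndex: datasetOffset + index,
    }));
};

// Local usage example with stand-in values.
const tagged = tagOrigin([{ id: 'a' }, { id: 'b' }], {
    Apify: null,
    datasetId: 'dataset-A',
    datasetOffset: 1000,
    customInputData: {},
});
```

Because each call only sees one batch, the function keeps no state between calls and derives everything from its arguments.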

Input

A detailed INPUT table with descriptions can be found on the actor's public page.

Changelog

Check the list of past updates here
