Deduplicator
Developed by Mustafa Irshaid · Maintained by Community
Pricing: $5.00/month + usage · Under maintenance

Filters out duplicates from previous runs and outputs only new data. Perfect for scheduled scrapers or chained Actors, ensuring you process fresh results every time. Seamlessly integrates with any Apify Actor using the default dataset ID.

🔁 Deduplicator Actor – Output New Data Only

This actor filters out duplicates from previous runs and emits only new data.
Designed to be used as a post-processing layer after any Apify Actor, it automatically detects and removes duplicate items by comparing against cached results stored in a persistent key-value store.
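
The comparison is content-based: each item is reduced to a fingerprint of its fields rather than matched on a specific key. A minimal sketch of such a fingerprint is shown below; it assumes a SHA-256 digest over the item serialized with sorted keys (so key order does not affect the result), which may differ from the actor's actual hashing scheme.

```ts
import { createHash } from 'node:crypto';

// Serialize a value with object keys sorted, so two items with the same
// content but different key order produce the same string.
function canonicalize(value: unknown): string {
    if (Array.isArray(value)) {
        return `[${value.map(canonicalize).join(',')}]`;
    }
    if (value !== null && typeof value === 'object') {
        const entries = Object.entries(value as Record<string, unknown>)
            .sort(([a], [b]) => a.localeCompare(b))
            .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
        return `{${entries.join(',')}}`;
    }
    return JSON.stringify(value);
}

// Content-based fingerprint of a dataset item (assumed: SHA-256 hex digest).
export function hashItem(item: Record<string, unknown>): string {
    return createHash('sha256').update(canonicalize(item)).digest('hex');
}
```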


🔧 How It Works

  • Accepts input from another actor via integration (no manual input required).
  • Reads from the defaultDatasetId in the incoming payload.
  • Hashes each item based on its content.
  • Compares against previously stored hashes in a key-value store.
  • Outputs only new, unseen items to this actor’s default dataset (a minimal sketch of this flow follows below).
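
Taken together, the steps above can be sketched roughly as follows. This is an illustrative sketch, not the actor's actual source: it assumes the webhook payload arrives as the actor input with a `resource.defaultDatasetId` field, that seen hashes are kept under a single record in a named key-value store (the store and record names here are made up), and that `hashItem` is the fingerprint helper sketched earlier.

```ts
import { Actor } from 'apify';
import { hashItem } from './hash.js'; // hypothetical module holding the fingerprint helper

await Actor.init();

// 1. Read the incoming payload (e.g. from an ACTOR.RUN.SUCCEEDED webhook).
const input = (await Actor.getInput()) as { resource?: { defaultDatasetId?: string } } | null;
const datasetId = input?.resource?.defaultDatasetId;
if (!datasetId) throw new Error('No defaultDatasetId found in the incoming payload.');

// 2. Load the items produced by the upstream actor run.
//    (Pagination is omitted for brevity; large datasets would be read in pages.)
const sourceDataset = await Actor.openDataset(datasetId);
const { items } = await sourceDataset.getData();

// 3. Load previously seen fingerprints from a persistent, named key-value store.
//    Store and record names are illustrative only.
const cache = await Actor.openKeyValueStore('deduplicator-cache');
const seen = new Set<string>((await cache.getValue<string[]>('seen-hashes')) ?? []);

// 4. Keep only items whose content hash has not appeared in any previous run.
const newItems = items.filter((item) => {
    const hash = hashItem(item);
    if (seen.has(hash)) return false;
    seen.add(hash);
    return true;
});

// 5. Output new items to this actor's default dataset and persist the updated cache.
await Actor.pushData(newItems);
await cache.setValue('seen-hashes', [...seen]);

console.log(`Processed ${items.length} items: ${newItems.length} new, ${items.length - newItems.length} skipped.`);

await Actor.exit();
```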

✅ Features

  • 🧠 Smart deduplication across multiple runs.
  • 💾 Persistent cache using Apify key-value store.
  • 🔗 Seamless integration with any actor (no input config needed).
  • Zero configuration – just plug it in and run.

🚀 Usage

  1. Integrate after any data-producing actor
    Use Apify’s Actor-to-Actor Integration or a webhook triggered on ACTOR.RUN.SUCCEEDED (an example setup appears after this list).

  2. Let it process automatically
    It detects the dataset from the actor run and filters out known items.

  3. Consume the results
    New data will be available in this actor’s default dataset.
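
For step 1, the integration can be set up in the Apify Console, or programmatically with apify-client as sketched below. Treat this as an assumption about one possible setup: the actor IDs are placeholders, and the request URL simply starts a new Deduplicator run through the Apify API, so the default webhook payload (which carries resource.defaultDatasetId) becomes the Deduplicator's input.

```ts
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Placeholder IDs for the upstream scraper and the Deduplicator actor.
const SOURCE_ACTOR_ID = '<source-actor-id>';
const DEDUPLICATOR_ACTOR_ID = '<deduplicator-actor-id>';

// Trigger a Deduplicator run whenever the source actor finishes successfully.
// Apify sends its default webhook payload, which includes resource.defaultDatasetId.
await client.webhooks().create({
    eventTypes: ['ACTOR.RUN.SUCCEEDED'],
    condition: { actorId: SOURCE_ACTOR_ID },
    requestUrl: `https://api.apify.com/v2/acts/${DEDUPLICATOR_ACTOR_ID}/runs?token=${process.env.APIFY_TOKEN}`,
});
```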


📤 Output

  • Dataset containing only new items not seen in previous runs.
  • Duplicates are skipped silently.
  • Logs include a summary of processed, new, and skipped entries.

🧪 Example Scenario

You're scraping job listings daily, and most results stay the same from run to run.
With this actor integrated, only newly discovered jobs are pushed forward to your database, notification system, or data pipeline (sketched below).
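
As a sketch of the downstream side, the snippet below reads the Deduplicator's default dataset for a finished run and forwards each new job; the run ID, the notify.example.com endpoint, and the POST-per-job approach are hypothetical stand-ins for your own pipeline.

```ts
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Hypothetical: the ID of a finished Deduplicator run (e.g. from a webhook or schedule).
const run = await client.run('<deduplicator-run-id>').get();
if (!run) throw new Error('Run not found.');

// Only new, previously unseen job listings end up in this dataset.
const { items: newJobs } = await client.dataset(run.defaultDatasetId).listItems();

// Forward each new job to a downstream system (placeholder endpoint).
for (const job of newJobs) {
    await fetch('https://notify.example.com/new-job', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(job),
    });
}

console.log(`Forwarded ${newJobs.length} new job listings.`);
```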


📄 License

MIT