Duplicate Run Guardian
Under maintenance

Pricing

Pay per event

Save costs by automatically aborting duplicate Actor runs. The essential integration for every scraping workflow.

Rating: 0.0 (0)

Developer: Tomáš Gabík (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 1
  • Monthly active users: 0
  • Last modified: 4 days ago

Duplicate Run Guardian

The essential integration for cost-efficient scraping.

Duplicate Run Guardian is a "set and forget" integration that protects your Apify account from wasted spend. It automatically detects when a scraper is triggered with duplicate inputs and aborts the run immediately.

🌟 Why you need this

In complex data pipelines, it's easy to accidentally trigger the same job twice, whether due to scheduler glitches, retry logic, or human error.

Without protection:

  • 💸 You pay double for the same data.
  • 📉 You risk rate limits on target sites.
  • 🗑️ You get duplicate records in your dataset.

Duplicate Run Guardian solves this. It acts as middleware that runs alongside your scraper: as soon as the scraper starts, the Guardian compares the run's input against recent run history and aborts or flags the run if it finds a match.

This Actor is designed to be triggered automatically whenever your Target Actor starts.

Step 1: Add the Integration

  1. Go to your Target Actor (the scraper).
  2. Navigate to Integrations -> Duplicate Run Guardian.
  3. Event: Select Run created.
  4. Input:
    • targetActorId: The ID of the scraper you want to protect.
    • action: Set to CANCEL (to abort duplicates) or ALERT (to just notify).
    • timeWindowHours: e.g., 24 (check for duplicates in the last 24 hours).
  5. Save these settings.
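
As an illustration of step 4, the integration input could look like the sketch below. The field names are the ones listed for this Actor; the Actor ID value is a hypothetical placeholder, not a real ID.

```python
import json

# Hypothetical example input for the Guardian integration.
# "YOUR_TARGET_ACTOR_ID" is a placeholder; replace it with your scraper's ID.
guardian_input = {
    "targetActorId": "YOUR_TARGET_ACTOR_ID",
    "action": "CANCEL",      # or "ALERT" to only notify
    "timeWindowHours": 24,   # look back 24 hours for duplicates
}

# Actor input is JSON, so serialize the dict before pasting it in.
guardian_input_json = json.dumps(guardian_input, indent=2)
print(guardian_input_json)
```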

Step 2: Relax

That's it! Now, every time your scraper starts:

  1. Apify triggers the Guardian.
  2. The Guardian checks whether the same input has been seen within the configured time window (timeWindowHours).
  3. If it's a duplicate, the Guardian aborts the scraper run (CANCEL) or sends you an alert (ALERT). 🛡️
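
The flow above can be sketched in plain Python. This is a minimal illustration of the idea, not the Guardian's actual implementation: the `PastRun` record, `find_duplicate`, `decide`, and the returned labels are all hypothetical names, and a real integration would load run history from the Apify platform rather than from an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


@dataclass
class PastRun:
    """A previous run of the Target Actor (hypothetical record shape)."""
    run_id: str
    started_at: datetime
    run_input: dict[str, Any]


def find_duplicate(
    new_input: dict[str, Any],
    history: list[PastRun],
    now: datetime,
    time_window_hours: int = 24,
) -> Optional[PastRun]:
    """Return the earliest run inside the window whose input equals new_input."""
    cutoff = now - timedelta(hours=time_window_hours)
    recent = [run for run in history if run.started_at >= cutoff]
    for run in sorted(recent, key=lambda r: r.started_at):
        if run.run_input == new_input:  # dict equality ignores key order
            return run
    return None


def decide(duplicate: Optional[PastRun], action: str) -> str:
    """Map the check result to what the Guardian would do next."""
    if duplicate is None:
        return "PROCEED"
    return "ABORT_RUN" if action == "CANCEL" else "SEND_ALERT"


# Usage: one run is outside the 24-hour window, one is inside it.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
history = [
    PastRun("run-1", now - timedelta(hours=30), {"url": "https://example.com"}),
    PastRun("run-2", now - timedelta(hours=3), {"url": "https://example.com"}),
]
match = find_duplicate({"url": "https://example.com"}, history, now)
print(match.run_id if match else None, decide(match, "CANCEL"))
```

With a 24-hour window only `run-2` is considered, so the new run is treated as a duplicate of `run-2`.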

Features

  • 🔍 Exact Match Detection: Compares normalized JSON inputs, so a run is flagged only when its input is identical to that of an earlier run.
  • ⚡ Zero-Config Standby: Runs as a standard Actor (no server needed), keeping costs extremely low.
  • 🛡️ Flexible Actions:
    • CANCEL: Abort the run (Save money).
    • ALERT: Send Slack/Email notifications (Be informed).
  • 📊 Detailed Logging: Clearly shows which run was a duplicate of which original run.
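
To make "normalized JSON inputs" concrete, here is one possible normalization, offered as an assumption rather than a description of what the Guardian does internally: serialize the input with sorted keys and no insignificant whitespace, then hash the result. The helper names `normalize_input` and `input_fingerprint` are hypothetical.

```python
import hashlib
import json
from typing import Any


def normalize_input(run_input: Any) -> str:
    """Serialize JSON-compatible input into a canonical string.

    Assumption: "normalized" means sorted object keys and compact separators.
    """
    return json.dumps(
        run_input, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )


def input_fingerprint(run_input: Any) -> str:
    """Hash the canonical string so inputs can be compared cheaply."""
    canonical = normalize_input(run_input)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different key order: the fingerprints match.
a = {"startUrl": "https://example.com", "maxItems": 10}
b = {"maxItems": 10, "startUrl": "https://example.com"}
print(input_fingerprint(a) == input_fingerprint(b))
```

Under this scheme key order does not matter, but list order and every value do, so `{"maxItems": 11}` would produce a different fingerprint.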

Configuration Options

Field           | Type    | Description
----------------|---------|------------------------------------------------------------
targetActorId   | String  | Required. The ID of the Actor to monitor.
timeWindowHours | Integer | How far back to check for duplicates (default: 24 hours).
action          | String  | CANCEL (recommended) or ALERT.
slackWebhookUrl | String  | Optional. Slack Webhook URL for alerts.
emailAddress    | String  | Optional. Email address for alerts.
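
The options above could be checked before use with a small validator like the following sketch. The function name `validate_guardian_input` is hypothetical, and two rules are assumptions not stated in the listing: that `action` falls back to the recommended CANCEL when omitted, and that `timeWindowHours` must be a positive integer.

```python
from typing import Any

ALLOWED_ACTIONS = {"CANCEL", "ALERT"}


def validate_guardian_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Check the documented fields and fill in defaults."""
    target = raw.get("targetActorId")
    if not isinstance(target, str) or not target.strip():
        raise ValueError("targetActorId is required and must be a non-empty string")

    hours = raw.get("timeWindowHours", 24)  # documented default: 24 hours
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(hours, bool) or not isinstance(hours, int) or hours <= 0:
        raise ValueError("timeWindowHours must be a positive integer")

    action = raw.get("action", "CANCEL")  # assumption: default to the recommended action
    if action not in ALLOWED_ACTIONS:
        raise ValueError("action must be CANCEL or ALERT")

    return {
        "targetActorId": target.strip(),
        "timeWindowHours": hours,
        "action": action,
        "slackWebhookUrl": raw.get("slackWebhookUrl"),  # optional
        "emailAddress": raw.get("emailAddress"),        # optional
    }


# Usage: only the required field is given, so defaults are applied.
config = validate_guardian_input({"targetActorId": "YOUR_TARGET_ACTOR_ID"})
print(config["action"], config["timeWindowHours"])
```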