Duplicate Run Guardian
Under maintenance

Pricing

$0.02 / actor start

Save costs by automatically aborting duplicate Actor runs. The essential integration for every scraping workflow.

Rating

0.0

(0)

Developer

Tomáš Gabík

Maintained by Community

Actor stats

0

Bookmarked

1

Total users

0

Monthly active users

25 days ago

Last modified

Duplicate Run Guardian

The essential integration for cost-efficient scraping.

Duplicate Run Guardian is a "set and forget" integration that protects your Apify account from wasted spend. It automatically detects when a scraper is triggered with duplicate inputs and aborts the run immediately.

🌟 Why you need this

In complex data pipelines, it's easy to accidentally trigger the same job twice, whether through scheduler glitches, retry logic, or human error.

Without protection:

  • 💸 You pay double for the same data.
  • 📉 You risk rate limits on target sites.
  • 🗑️ You get duplicate records in your dataset.

Duplicate Run Guardian solves this. It acts as middleware that runs alongside your scraper, checks the run's input against recent run history, and aborts or flags the run when it finds a match.

This Actor is designed to be triggered automatically whenever your Target Actor starts.

Step 1: Add the Integration

  1. Go to your Target Actor (the scraper).
  2. Navigate to Integrations -> Duplicate Run Guardian.
  3. Event: Select Run created.
  4. Input:
    • targetActorId: The ID of the scraper you want to protect.
    • action: Set to CANCEL (to abort duplicates) or ALERT (to just notify).
    • timeWindowHours: e.g., 24 (check for duplicates in the last 24 hours).
  5. Save these settings.
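
As an illustration, the input from step 4 could look like the sketch below, written as a Python dictionary and serialized to JSON. The Actor ID is a placeholder and the values are examples only; the field names come from the configuration options of this Actor.

```python
import json

# Example Guardian input. "YOUR_TARGET_ACTOR_ID" is a placeholder, not a real ID.
guardian_input = {
    "targetActorId": "YOUR_TARGET_ACTOR_ID",
    "action": "CANCEL",       # or "ALERT" to only notify
    "timeWindowHours": 24,    # look back 24 hours for duplicates
}

# JSON text that could be pasted into the integration's input editor.
guardian_input_json = json.dumps(guardian_input, indent=2)
print(guardian_input_json)
```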

Step 2: Relax

That's it! Now, every time your scraper starts:

  1. Apify triggers the Guardian.
  2. The Guardian checks if this input has been seen recently.
  3. If it's a duplicate, the Guardian aborts the scraper run or alerts you instantly. 🛡️
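
The three steps above can be sketched as a small decision function. This is not the Guardian's actual code: `RunRecord`, `find_original`, and `decide` are hypothetical names, and the history is a plain in-memory list rather than data fetched from Apify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class RunRecord:
    """Hypothetical record of a run: its ID, start time, and input."""
    run_id: str
    started_at: datetime
    run_input: dict


def find_original(current: RunRecord, history: List[RunRecord],
                  time_window_hours: int = 24) -> Optional[RunRecord]:
    """Return the earliest earlier run inside the window with an equal input."""
    cutoff = current.started_at - timedelta(hours=time_window_hours)
    candidates = [
        r for r in history
        if r.run_id != current.run_id
        and cutoff <= r.started_at < current.started_at
        and r.run_input == current.run_input  # dict equality ignores key order
    ]
    return min(candidates, key=lambda r: r.started_at, default=None)


def decide(current: RunRecord, history: List[RunRecord],
           action: str = "CANCEL", time_window_hours: int = 24) -> str:
    """Map the configured action to an outcome for the current run."""
    if find_original(current, history, time_window_hours) is None:
        return "PROCEED"
    return "ABORT" if action == "CANCEL" else "NOTIFY"


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
history = [
    RunRecord("run-0", now - timedelta(hours=30), {"url": "https://example.org"}),
    RunRecord("run-1", now - timedelta(hours=3), {"url": "https://example.com"}),
]
current = RunRecord("run-2", now, {"url": "https://example.com"})
print(decide(current, history))
```

In this example `run-2` repeats the input of `run-1` from three hours earlier, so it falls inside the 24-hour window. A run repeating the input of `run-0` would proceed, because `run-0` started 30 hours ago.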

Features

  • 🔍 Exact Match Detection: Compares normalized JSON inputs, so a run is flagged only when its input exactly matches that of an earlier run.
  • ⚡ Zero-Config Standby: Runs as a standard Actor (no server needed), keeping costs extremely low.
  • 🛡️ Flexible Actions:
    • CANCEL: Abort the run (Save money).
    • ALERT: Send Slack/Email notifications (Be informed).
  • 📊 Detailed Logging: Clearly shows which run was a duplicate of which original run.
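
The listing does not say how inputs are normalized. One common approach, shown here purely as an assumption, is to serialize each input to canonical JSON (sorted keys, fixed separators) and compare hashes. The `fingerprint` helper is hypothetical.

```python
import hashlib
import json


def fingerprint(run_input: dict) -> str:
    """Hash a canonical JSON form so equal inputs give equal fingerprints."""
    canonical = json.dumps(
        run_input,
        sort_keys=True,          # key order no longer matters
        separators=(",", ":"),   # no whitespace differences
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"startUrls": ["https://example.com"], "maxItems": 10}
b = {"maxItems": 10, "startUrls": ["https://example.com"]}  # same input, reordered keys
c = {"maxItems": 11, "startUrls": ["https://example.com"]}  # different value

print(fingerprint(a) == fingerprint(b))
print(fingerprint(a) == fingerprint(c))
```

In this sketch, list order still counts as significant, so two inputs with the same URLs in a different order would not match. Whether the Guardian treats them as duplicates is not stated in the source.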

Configuration Options

| Field | Type | Description |
| --- | --- | --- |
| targetActorId | String | Required. The ID of the Actor to monitor. |
| timeWindowHours | Integer | How far back to check for duplicates (default: 24 hours). |
| action | String | CANCEL (recommended) or ALERT. |
| slackWebhookUrl | String | Optional. Slack Webhook URL for alerts. |
| emailAddress | String | Optional. Email address for alerts. |
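
A minimal sketch of how these options might be validated before use. `GuardianConfig` and `parse_config` are hypothetical names. The 24-hour default comes from the table above, while defaulting `action` to CANCEL is an assumption based on it being marked as recommended.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_ACTIONS = {"CANCEL", "ALERT"}


@dataclass(frozen=True)
class GuardianConfig:
    target_actor_id: str
    time_window_hours: int = 24
    action: str = "CANCEL"  # assumed default; the source only says "recommended"
    slack_webhook_url: Optional[str] = None
    email_address: Optional[str] = None


def parse_config(raw: dict) -> GuardianConfig:
    """Validate a raw input dict and fill in defaults for optional fields."""
    target = raw.get("targetActorId")
    if not isinstance(target, str) or not target:
        raise ValueError("targetActorId is required")

    hours = raw.get("timeWindowHours", 24)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(hours, bool) or not isinstance(hours, int) or hours <= 0:
        raise ValueError("timeWindowHours must be a positive integer")

    action = raw.get("action", "CANCEL")
    if action not in ALLOWED_ACTIONS:
        raise ValueError("action must be CANCEL or ALERT")

    return GuardianConfig(
        target_actor_id=target,
        time_window_hours=hours,
        action=action,
        slack_webhook_url=raw.get("slackWebhookUrl"),
        email_address=raw.get("emailAddress"),
    )


config = parse_config({"targetActorId": "YOUR_TARGET_ACTOR_ID", "action": "ALERT"})
print(config)
```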