Duplicate Run Guardian
Pricing
$0.02 / actor start
Save costs by automatically aborting duplicate Actor runs. The essential integration for every scraping workflow.
Developer

Tomáš Gabík
The essential integration for cost-efficient scraping.
Duplicate Run Guardian is a "set and forget" integration that protects your Apify account from wasted spend. It automatically detects when a scraper is triggered with duplicate inputs and aborts the run immediately.
🌟 Why you need this
In complex data pipelines, it's easy to accidentally trigger the same job twice—whether due to scheduler glitches, retry logic, or human error.
Without protection:
- 💸 You pay double for the same data.
- 📉 You risk rate limits on target sites.
- 🗑️ You get duplicate records in your dataset.
Duplicate Run Guardian solves this. It acts as middleware that runs alongside your scraper: it checks the scraper's input against recent run history and, if the input has been seen before, aborts the run or alerts you.
🚀 Integration Setup (Recommended)
This Actor is designed to be triggered automatically whenever your Target Actor starts.
Step 1: Add the Integration
- Go to your Target Actor (the scraper).
- Navigate to Integrations -> Duplicate Run Guardian.
- Event: Select Run created.
- Input:
  - `targetActorId`: The ID of the scraper you want to protect.
  - `action`: Set to `CANCEL` (to abort duplicates) or `ALERT` (to just notify).
  - `timeWindowHours`: e.g., `24` (check for duplicates in the last 24 hours).
- Save these settings.
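As an illustration of the input described in Step 1, here is a minimal sketch that builds the three fields and prints them as JSON. The field names come from this page; the value `YOUR_TARGET_ACTOR_ID` is a placeholder you would replace with your own Actor's ID, and the exact shape the integration form expects is an assumption.

```python
import json

# Field names are taken from this README; the values are illustrative only.
guardian_input = {
    "targetActorId": "YOUR_TARGET_ACTOR_ID",  # placeholder, not a real Actor ID
    "action": "CANCEL",                       # or "ALERT" to only notify
    "timeWindowHours": 24,                    # look back 24 hours for duplicates
}

# Serialize to JSON, the format Actor inputs are written in.
guardian_input_json = json.dumps(guardian_input, indent=2)
print(guardian_input_json)
```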
Step 2: Relax
That's it! Now, every time your scraper starts:
- Apify triggers the Guardian.
- The Guardian checks if this input has been seen recently.
- If it's a duplicate, the Guardian aborts the scraper run or alerts you instantly. 🛡️
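The steps above can be sketched in plain Python. This is not the Guardian's actual code: the `Run` record, `find_original`, and `decide` are hypothetical names, the run history is passed in as a list rather than fetched from Apify, and inputs are compared with simple dictionary equality.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Run:
    """Hypothetical, simplified record of one Actor run."""
    run_id: str
    started_at: datetime
    input: dict


def find_original(new_run: Run, history: List[Run], window_hours: int) -> Optional[Run]:
    """Return the earliest earlier run inside the time window with the same input."""
    cutoff = new_run.started_at - timedelta(hours=window_hours)
    for past in sorted(history, key=lambda r: r.started_at):
        if past.run_id == new_run.run_id:
            continue  # never compare a run with itself
        in_window = cutoff <= past.started_at < new_run.started_at
        if in_window and past.input == new_run.input:
            return past
    return None


def decide(new_run: Run, history: List[Run],
           window_hours: int = 24, action: str = "CANCEL") -> str:
    """Return "PROCEED" for a fresh input, otherwise the configured action."""
    original = find_original(new_run, history, window_hours)
    return "PROCEED" if original is None else action


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
earlier = Run("run-a", now - timedelta(hours=3), {"url": "https://example.com", "max": 5})
repeat = Run("run-b", now, {"max": 5, "url": "https://example.com"})
print(decide(repeat, [earlier]))
```

In this sketch a run older than `window_hours` is ignored, so the same input can run again once the window has passed.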
Features
- 🔍 Exact Match Detection: Compares normalized JSON inputs, so a run is flagged only when its input is identical to an earlier one after normalization.
- ⚡ Zero-Config Standby: Runs as a standard Actor (no server needed), keeping costs extremely low.
- 🛡️ Flexible Actions:
  - `CANCEL`: Abort the run (save money).
  - `ALERT`: Send Slack/Email notifications (be informed).
- 📊 Detailed Logging: Clearly shows which run was a duplicate of which original run.
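To make "normalized JSON inputs" concrete, here is one common way to normalize and compare inputs. This page does not document how the Guardian normalizes, so this is an assumption: keys are sorted and whitespace is removed before hashing, and `fingerprint` is a hypothetical helper name.

```python
import hashlib
import json


def fingerprint(actor_input: dict) -> str:
    """Hash a canonical JSON form so that key order and spacing do not matter."""
    canonical = json.dumps(
        actor_input,
        sort_keys=True,          # same keys in any order give the same text
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"startUrls": ["https://example.com"], "maxItems": 10}
b = {"maxItems": 10, "startUrls": ["https://example.com"]}  # same input, keys reordered
c = {"maxItems": 11, "startUrls": ["https://example.com"]}  # one value differs

print(fingerprint(a) == fingerprint(b))
print(fingerprint(a) == fingerprint(c))
```

Note that in this sketch the order of items inside a list still matters, so two inputs listing the same URLs in a different order would not count as duplicates.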
Configuration Options
| Field | Type | Description |
|---|---|---|
| `targetActorId` | String | Required. The ID of the Actor to monitor. |
| `timeWindowHours` | Integer | How far back to check for duplicates (default: 24 hours). |
| `action` | String | `CANCEL` (recommended) or `ALERT`. |
| `slackWebhookUrl` | String | Optional. Slack Webhook URL for alerts. |
| `emailAddress` | String | Optional. Email address for alerts. |
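If you generate this configuration from a script, a small check like the following can catch mistakes before you save it. It is a sketch based only on the table above: `validate_config` is a hypothetical helper, and treating `CANCEL` as the default for `action` is an assumption (the table only calls it recommended).

```python
from typing import Any, Dict

ALLOWED_ACTIONS = {"CANCEL", "ALERT"}


def validate_config(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Check the fields from the table and fill in defaults."""
    target = raw.get("targetActorId")
    if not isinstance(target, str) or not target.strip():
        raise ValueError("targetActorId is required and must be a non-empty string")

    hours = raw.get("timeWindowHours", 24)  # documented default: 24 hours
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(hours, bool) or not isinstance(hours, int) or hours <= 0:
        raise ValueError("timeWindowHours must be a positive integer")

    action = raw.get("action", "CANCEL")  # assumed default, see note above
    if action not in ALLOWED_ACTIONS:
        raise ValueError("action must be CANCEL or ALERT")

    return {
        "targetActorId": target,
        "timeWindowHours": hours,
        "action": action,
        "slackWebhookUrl": raw.get("slackWebhookUrl"),  # optional
        "emailAddress": raw.get("emailAddress"),        # optional
    }


config = validate_config({"targetActorId": "YOUR_TARGET_ACTOR_ID", "action": "ALERT"})
print(config)
```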


