Duplicate Run Guardian
Pricing: Pay per event
Save costs by automatically aborting duplicate Actor runs. The essential integration for every scraping workflow.
Developer: Tomáš Gabík
Duplicate Run Guardian
The essential integration for cost-efficient scraping.
Duplicate Run Guardian is a "set and forget" integration that protects your Apify account from wasted spend. It automatically detects when a scraper is triggered with duplicate inputs and aborts the run immediately.
🌟 Why you need this
In complex data pipelines, it's easy to accidentally trigger the same job twice—whether due to scheduler glitches, retry logic, or human error.
Without protection:
- 💸 You pay double for the same data.
- 📉 You risk rate limits on target sites.
- 🗑️ You get duplicate records in your dataset.
Duplicate Run Guardian solves this. It acts as a middleware that runs alongside your scraper, verifying its input against history before allowing it to proceed.
🚀 Integration Setup (Recommended)
This Actor is designed to be triggered automatically whenever your Target Actor starts.
Step 1: Add the Integration
- Go to your Target Actor (the scraper).
- Navigate to Integrations -> Duplicate Run Guardian.
- Event: Select `Run created`.
- Input:
  - `targetActorId`: The ID of the scraper you want to protect.
  - `action`: Set to `CANCEL` (to abort duplicates) or `ALERT` (to just notify).
  - `timeWindowHours`: e.g., `24` (check for duplicates in the last 24 hours).
- Save these settings.
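As an illustration, the integration input from Step 1 could look like the object below. The field names come from this README; the Actor ID is a placeholder, and the snippet only builds and prints the JSON, it does not call Apify.

```python
import json

# Example integration input. "YOUR_ACTOR_ID" is a placeholder, not a real ID.
guardian_input = {
    "targetActorId": "YOUR_ACTOR_ID",  # the scraper you want to protect
    "action": "CANCEL",                # or "ALERT" to only be notified
    "timeWindowHours": 24,             # look back 24 hours for duplicates
}

# Print the JSON you would paste into the integration's input field.
print(json.dumps(guardian_input, indent=2))
```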
Step 2: Relax
That's it! Now, every time your scraper starts:
- Apify triggers the Guardian.
- The Guardian checks if this input has been seen recently.
- If it's a duplicate, the Guardian aborts the scraper run or alerts you instantly. 🛡️
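The flow above can be sketched roughly as follows. This is a minimal illustration of the time-window check, not the Guardian's actual implementation: the function names, the `(run_id, fingerprint, started_at)` history format, and the returned labels are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

# Hypothetical history entry: (run_id, input_fingerprint, started_at).
HistoryEntry = Tuple[str, str, datetime]


def find_duplicate(
    new_fingerprint: str,
    started_at: datetime,
    history: List[HistoryEntry],
    time_window_hours: int = 24,
) -> Optional[str]:
    """Return the ID of the first earlier run with the same fingerprint
    inside the time window, or None if there is no such run."""
    cutoff = started_at - timedelta(hours=time_window_hours)
    for run_id, fingerprint, earlier_start in history:
        in_window = cutoff <= earlier_start < started_at
        if in_window and fingerprint == new_fingerprint:
            return run_id
    return None


def decide(duplicate_of: Optional[str], action: str) -> str:
    """Map the check result and the configured action to a decision label."""
    if duplicate_of is None:
        return "PROCEED"
    return "ABORT_RUN" if action == "CANCEL" else "SEND_ALERT"


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
history = [
    ("run-a", "abc", now - timedelta(hours=3)),   # inside the 24h window
    ("run-b", "abc", now - timedelta(hours=30)),  # too old to count
]
original = find_duplicate("abc", now, history)
print(original, decide(original, "CANCEL"))
```

In this sketch a run counts as a duplicate only if an earlier run with the same fingerprint started within `timeWindowHours` of it; older matches are ignored.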
Features
- 🔍 Exact Match Detection: Compares normalized JSON inputs, so only runs with identical inputs are flagged as duplicates.
- ⚡ Zero-Config Standby: Runs as a standard Actor (no server needed), keeping costs extremely low.
- 🛡️ Flexible Actions:
  - `CANCEL`: Abort the run (Save money).
  - `ALERT`: Send Slack/Email notifications (Be informed).
- 📊 Detailed Logging: Clearly shows which run was a duplicate of which original run.
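To make "normalized JSON inputs" concrete, here is one common way to compare inputs regardless of key order. It is a sketch under the assumption that normalization means canonical serialization with sorted keys; the README does not specify the exact method the Guardian uses, and the `fingerprint` helper is hypothetical.

```python
import hashlib
import json


def fingerprint(run_input: dict) -> str:
    """Hash a canonical JSON form of the input (assumed normalization).

    Keys are sorted at every nesting level and whitespace is removed, so two
    inputs that differ only in key order produce the same fingerprint.
    List order is preserved and therefore still matters.
    """
    canonical = json.dumps(
        run_input, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"startUrls": ["https://example.com"], "maxItems": 10}
b = {"maxItems": 10, "startUrls": ["https://example.com"]}
c = {"maxItems": 11, "startUrls": ["https://example.com"]}

print(fingerprint(a) == fingerprint(b))  # same content, different key order
print(fingerprint(a) == fingerprint(c))  # different maxItems
```

Storing a fixed-length hash instead of the full input keeps the comparison cheap, at the cost of not being able to show what differed.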
Configuration Options
| Field | Type | Description |
|---|---|---|
| `targetActorId` | String | Required. The ID of the Actor to monitor. |
| `timeWindowHours` | Integer | How far back to check for duplicates (default: 24 hours). |
| `action` | String | `CANCEL` (recommended) or `ALERT`. |
| `slackWebhookUrl` | String | Optional. Slack Webhook URL for alerts. |
| `emailAddress` | String | Optional. Email address for alerts. |
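If you generate the input programmatically, a small validation step can catch mistakes before the integration is saved. The sketch below mirrors the fields in the table; the `GuardianConfig` class and `parse_config` helper are hypothetical. Treating `CANCEL` as the default action and requiring a positive window are assumptions, since the table only documents the 24-hour default.

```python
from dataclasses import dataclass
from typing import Optional

VALID_ACTIONS = {"CANCEL", "ALERT"}


@dataclass(frozen=True)
class GuardianConfig:
    target_actor_id: str
    time_window_hours: int = 24          # documented default
    action: str = "CANCEL"               # assumed default (the recommended value)
    slack_webhook_url: Optional[str] = None
    email_address: Optional[str] = None


def parse_config(raw: dict) -> GuardianConfig:
    """Validate a raw input dict and fill in defaults."""
    target = raw.get("targetActorId")
    if not isinstance(target, str) or not target:
        raise ValueError("targetActorId is required")

    hours = raw.get("timeWindowHours", 24)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(hours, bool) or not isinstance(hours, int) or hours <= 0:
        raise ValueError("timeWindowHours must be a positive integer")

    action = raw.get("action", "CANCEL")
    if action not in VALID_ACTIONS:
        raise ValueError("action must be CANCEL or ALERT")

    return GuardianConfig(
        target_actor_id=target,
        time_window_hours=hours,
        action=action,
        slack_webhook_url=raw.get("slackWebhookUrl"),
        email_address=raw.get("emailAddress"),
    )


config = parse_config({"targetActorId": "YOUR_ACTOR_ID", "action": "ALERT"})
print(config)
```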