Actor Health Monitor — Failures, Trends & Revenue
Pricing
$50.00 / 1,000 health checks
Actor Health Monitor. Available on the Apify Store with pay-per-event pricing.
Developer
ryan clinton
Actor Health Monitor
Monitor your entire Apify actor fleet from a single run — get failure diagnosis, trend tracking, revenue impact estimates, and actionable fix recommendations for every actor that has issues. Instead of manually clicking through run logs in the Console, this actor checks all your actors at once, reads failed run logs, categorizes the root cause, and tells you exactly what to fix.
Why use this over checking the Apify Console manually? The Console shows you that a run failed. This actor tells you why it failed, whether it's getting worse, how much money you're losing, and what to do about it. One run covers your entire fleet — 10 actors or 500.
Features
- Automatic failure diagnosis — Reads the last 500 characters of each failed run's log and categorizes the root cause: timeout, schema mismatch, credential error, memory exceeded, input validation, API down, or unknown
- Trend tracking — Compares the current failure rate against the previous period of equal length to detect whether issues are improving, worsening, stable, or new (reported as new_issue)
- Revenue impact estimation — Calculates estimated revenue loss using your PPE pricing and average results per successful run
- Actionable recommendations — Generates specific fix suggestions per actor based on failure categories and severity
- Webhook alerts — Optionally POST a JSON alert summary to Slack, Discord, Teams, Zapier, or any webhook endpoint when actors exceed your failure threshold
- Fleet-wide summary — Single summary record with total actors, failure rate, top issue, and total revenue impact
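The trend comparison described above can be pictured with a small sketch. This is not the actor's actual code: the function name `classify_trend` and the `tolerance` cutoff for "stable" are assumptions made for illustration, since the real thresholds are not documented here.

```python
def classify_trend(current_rate: float, previous_rate: float,
                   tolerance: float = 0.01) -> str:
    """Label the change in failure rate between two equal-length periods.

    `tolerance` is a hypothetical cutoff for "stable", not a documented value.
    """
    # Failures now, none in the previous period: treat as a new issue
    if previous_rate == 0 and current_rate > 0:
        return "new_issue"
    # Rates within the tolerance band count as unchanged
    if abs(current_rate - previous_rate) <= tolerance:
        return "stable"
    return "improving" if current_rate < previous_rate else "worsening"


# 6% now vs 12% in the previous period
print(classify_trend(0.06, 0.12))
```

The labels match the values listed for the trend output field; only the decision boundaries are guesses.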
Use cases
Daily fleet monitoring
Schedule this actor to run every morning. Get a health report of your entire fleet delivered to Slack. Catch problems before your users report them.
Post-deployment validation
Run after pushing updates to verify nothing broke. Compare failure rates before and after deployment by adjusting the hoursBack window.
Revenue protection
If you run PPE-priced actors, failures mean lost revenue. This actor quantifies exactly how much money failed runs cost you, so you can prioritize fixes by business impact.
SLA monitoring
Set a low failure threshold (e.g., 5%) and connect a webhook to your alerting system. Get notified immediately when any actor degrades below your quality bar.
Multi-actor debugging
When you have hundreds of actors, finding the one that's failing is needle-in-a-haystack. This actor surfaces all problems in one dataset, sorted by severity.
Input
| Field | Type | Description | Default |
|---|---|---|---|
| apiToken | String (secret) | Your Apify API token (required) | — |
| hoursBack | Integer (1-720) | How far back to check run history | 24 |
| webhookUrl | String | URL to POST alert summary to | — |
| failureThreshold | Number (0-1) | Alert if failure rate exceeds this (0.1 = 10%) | 0.1 |
Example input
    {
        "apiToken": "apify_api_YOUR_TOKEN_HERE",
        "hoursBack": 24,
        "webhookUrl": "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK",
        "failureThreshold": 0.1
    }
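If you build the input programmatically, it can help to check it against the ranges in the input table before starting a run. The sketch below is a local convenience only; `validate_input` is a hypothetical helper, not part of the actor or of the Apify client.

```python
def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems found in the run input (empty if none)."""
    errors = []

    # apiToken is the only required field
    if not run_input.get("apiToken"):
        errors.append("apiToken is required")

    # hoursBack: integer from 1 to 720, default 24
    hours_back = run_input.get("hoursBack", 24)
    if (not isinstance(hours_back, int) or isinstance(hours_back, bool)
            or not 1 <= hours_back <= 720):
        errors.append("hoursBack must be an integer from 1 to 720")

    # failureThreshold: number from 0 to 1, default 0.1
    threshold = run_input.get("failureThreshold", 0.1)
    if (not isinstance(threshold, (int, float)) or isinstance(threshold, bool)
            or not 0 <= threshold <= 1):
        errors.append("failureThreshold must be a number from 0 to 1")

    return errors


print(validate_input({"apiToken": "apify_api_YOUR_TOKEN_HERE", "hoursBack": 24}))
```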
Output
Each actor with failures produces one record in the dataset. The final record is always a fleet summary.
Actor report example
    {
        "actorName": "website-contact-scraper",
        "actorId": "BCq991ez5HObhS5n0",
        "status": "WARNING",
        "failedRuns": 3,
        "totalRuns": 50,
        "failureRate": 0.06,
        "trend": "improving",
        "trendDetail": "6% now vs 12% previous period",
        "failureCategories": {
            "timeout": 2,
            "schema_mismatch": 1
        },
        "estimatedRevenueLoss": 0.45,
        "latestFailureLog": "Error: pushData validation failed — field 'emails' expected array but got string...",
        "recommendations": [
            "2 runs timed out — consider increasing timeoutSecs in run configuration or reducing workload per run",
            "1 schema mismatch — check that pushData output matches your dataset schema, or validate with apifyforge-validate before pushing"
        ],
        "checkedAt": "2026-03-18T14:30:00.000Z"
    }
Fleet summary example
    {
        "type": "summary",
        "totalActors": 294,
        "actorsWithIssues": 5,
        "overallFailureRate": 0.03,
        "estimatedTotalRevenueLoss": 2.30,
        "topIssue": "timeout (affects 3 actors)",
        "alertSent": true
    }
Output fields — Actor report
| Field | Type | Description |
|---|---|---|
| actorName | String | Actor name |
| actorId | String | Apify actor ID |
| status | String | HEALTHY, WARNING, or CRITICAL |
| failedRuns | Integer | Number of failed runs in the period |
| totalRuns | Integer | Total runs in the period |
| failureRate | Number | Failed/total ratio (0.06 = 6%) |
| trend | String | improving, worsening, stable, or new_issue |
| trendDetail | String | Human-readable trend comparison |
| failureCategories | Object | Count of failures by diagnosed category |
| estimatedRevenueLoss | Number | Estimated USD lost from failed runs |
| latestFailureLog | String | Last 500 characters of most recent failed run's log |
| recommendations | Array | Actionable fix suggestions |
| checkedAt | String | ISO timestamp |
Output fields — Fleet summary
| Field | Type | Description |
|---|---|---|
| type | String | Always "summary" |
| totalActors | Integer | Total actors in your account |
| actorsWithIssues | Integer | Actors with at least one failure |
| overallFailureRate | Number | Fleet-wide failure ratio |
| estimatedTotalRevenueLoss | Number | Total estimated USD lost |
| topIssue | String | Most common failure category |
| alertSent | Boolean | Whether webhook alert was sent |
How to use the API
Python
    from apify_client import ApifyClient

    client = ApifyClient(token="YOUR_API_TOKEN")

    # Start the monitor and wait for the run to finish
    run = client.actor("ryanclinton/actor-health-monitor").call(run_input={
        "apiToken": "YOUR_API_TOKEN",
        "hoursBack": 24,
        "failureThreshold": 0.1,
    })

    # The dataset holds one report per failing actor plus the fleet summary
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        if item.get("type") == "summary":
            print(f"Fleet: {item['totalActors']} actors, {item['actorsWithIssues']} with issues")
            print(f"Revenue impact: ${item['estimatedTotalRevenueLoss']:.2f}")
        else:
            print(f"[{item['status']}] {item['actorName']}: {item['failureRate']*100:.0f}% failures")
            for rec in item.get("recommendations", []):
                print(f"  → {rec}")
JavaScript / Node.js
    import { ApifyClient } from 'apify-client';

    const client = new ApifyClient({ token: 'YOUR_API_TOKEN' });

    // Start the monitor and wait for the run to finish
    const run = await client.actor('ryanclinton/actor-health-monitor').call({
        apiToken: 'YOUR_API_TOKEN',
        hoursBack: 24,
        failureThreshold: 0.1,
    });

    const { items } = await client.dataset(run.defaultDatasetId).listItems();

    // The fleet summary is the record with type "summary"
    const summary = items.find(i => i.type === 'summary');
    console.log(`Fleet: ${summary.totalActors} actors, ${summary.actorsWithIssues} with issues`);

    // Every other record is a per-actor report
    const actorReports = items.filter(i => i.type !== 'summary');
    actorReports.forEach(report => {
        console.log(`[${report.status}] ${report.actorName}: ${(report.failureRate * 100).toFixed(0)}% failure rate`);
        report.recommendations.forEach(rec => console.log(`  → ${rec}`));
    });
cURL
    curl -X POST "https://api.apify.com/v2/acts/ryanclinton~actor-health-monitor/runs?token=YOUR_API_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"apiToken": "YOUR_API_TOKEN", "hoursBack": 24, "failureThreshold": 0.1}'
How it works
- Fetches all your actors via the Apify API using your token
- Pulls recent run history for each actor (up to 100 most recent runs)
- Filters to your time window — only considers runs started within `hoursBack`
- For actors with failures, reads the last 500 characters of each failed run's log (up to 10 per actor to stay within rate limits)
- Diagnoses each failure by pattern-matching the log text against known error signatures:
  - `pushData`, `schema`, `validation` → schema_mismatch
  - `timeout`, `TIMED-OUT` → timeout
  - `401`, `403`, `credential`, `API key` → credential_error
  - `ENOMEM`, `heap out of memory` → memory_exceeded
  - `required`, `missing`, `invalid input` → input_validation
  - `ECONNREFUSED`, `503`, `socket hang up` → api_down
  - Everything else → unknown
- Computes trends by comparing current period failure rate to the previous period of equal length
- Estimates revenue impact using your PPE pricing: `failedRuns * avgResultsPerRun * pricePerResult`
- Generates recommendations based on failure categories and severity
- Sends webhook alert if configured and any actor exceeds the failure threshold
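The diagnosis step can be sketched as a keyword lookup over the tail of the log. The signature lists come from the steps above, but everything else is an assumption: the names `SIGNATURES` and `diagnose`, the case-insensitive matching, and the rule that the first matching category in list order wins are illustrative choices, not the actor's documented behavior.

```python
from collections import Counter

# Signatures as listed above; the precedence (list order) is an assumption
SIGNATURES = [
    ("schema_mismatch", ["pushData", "schema", "validation"]),
    ("timeout", ["timeout", "TIMED-OUT"]),
    ("credential_error", ["401", "403", "credential", "API key"]),
    ("memory_exceeded", ["ENOMEM", "heap out of memory"]),
    ("input_validation", ["required", "missing", "invalid input"]),
    ("api_down", ["ECONNREFUSED", "503", "socket hang up"]),
]


def diagnose(log_text: str) -> str:
    """Categorize a failed run from the last 500 characters of its log."""
    tail = log_text[-500:].lower()  # case-insensitive matching is an assumption
    for category, keywords in SIGNATURES:
        if any(keyword.lower() in tail for keyword in keywords):
            return category
    return "unknown"


logs = [
    "Run TIMED-OUT after 300 seconds",
    "Error: pushData validation failed",
    "Actor run timeout reached",
]
# Tally categories the way the failureCategories field reports them
print(dict(Counter(diagnose(log) for log in logs)))
```

Plain substring matching like this is what makes the diagnosis heuristic: a log that mentions `503` in an unrelated URL would still be labeled api_down.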
Failure categories explained
| Category | What it means | Common causes |
|---|---|---|
| timeout | Run exceeded time limit | Too much data, slow API, infinite loop |
| schema_mismatch | Output data doesn't match schema | Code change broke pushData format |
| credential_error | Authentication failed | Expired API key, wrong token |
| memory_exceeded | Ran out of RAM | Large datasets in memory, memory leak |
| input_validation | Missing or invalid input fields | Caller didn't provide required fields |
| api_down | External service unreachable | Third-party API outage, rate limiting |
| unknown | Couldn't diagnose from log | Check full logs in Apify Console |
Integrations
Slack alerts
Use a Slack incoming webhook URL as the webhookUrl parameter. You'll get a JSON payload posted to your channel whenever actors exceed the failure threshold.
Scheduled monitoring
Use Apify Schedules to run this actor every hour, every morning, or on any cron schedule. Combined with webhook alerts, this gives you continuous fleet monitoring.
Zapier / Make / n8n
Connect the webhook to any automation platform to trigger workflows when actors fail — create Jira tickets, send emails, page on-call engineers.
Limitations
- Log analysis is heuristic — Failure diagnosis is based on pattern matching against the last 500 characters of run logs. Some failures may be categorized as "unknown" if the error message doesn't match known patterns.
- Revenue estimates are approximate — Based on PPE pricing and average results per run. Actual revenue impact depends on which specific runs failed and their expected output volume.
- Rate limits — Fetches logs for up to 10 failed runs per actor to stay within Apify API rate limits. Actors with more than 10 failures in the period will have some failures categorized as "unknown".
- 100 run limit per actor — Only checks the 100 most recent runs per actor. If an actor runs more than 100 times in your `hoursBack` window, older runs won't be included.
- No historical storage — Each run produces a fresh report. For historical tracking, export results to a database or use Apify datasets with named storage.
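To see why the revenue figure is only an approximation, here is the formula `failedRuns * avgResultsPerRun * pricePerResult` as a worked sketch. The function name and the input numbers (30 results per run, $0.005 per result) are hypothetical values chosen for illustration, not figures taken from the actor.

```python
def estimate_revenue_loss(failed_runs: int, avg_results_per_run: float,
                          price_per_result: float) -> float:
    """Estimated USD lost: failedRuns * avgResultsPerRun * pricePerResult."""
    return round(failed_runs * avg_results_per_run * price_per_result, 2)


# Hypothetical actor: 3 failed runs, 30 results per successful run, $0.005 per result
print(estimate_revenue_loss(3, 30, 0.005))  # 0.45
```

Because the average is taken over successful runs, a failed run that would have produced far more or far fewer results than average shifts the real loss away from this estimate.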
FAQ
How is this different from the Apify Console monitoring?
The Console shows individual run statuses. This actor aggregates across your entire fleet, diagnoses why runs failed by reading logs, tracks whether issues are improving or worsening, estimates revenue impact, and generates actionable recommendations. It's the difference between a dashboard and a health report.
Will this work with actors I don't own?
No. It uses your API token to fetch actors and run logs, so it only sees actors in your account. It cannot access other users' actors or shared organization actors unless your token has access.
How much does it cost to run?
$0.05 per health check. One run covers your entire fleet regardless of size. A daily check costs about $1.50/month.
Can I monitor specific actors only?
Currently, it monitors all actors in your account. You can filter the output dataset to focus on specific actors. A future version may add actor name/ID filters.
What if an actor has zero runs in the period?
It's silently skipped. The fleet summary counts it in totalActors but it won't appear in the individual reports.
Does the webhook send on every run?
Only when at least one actor exceeds your failureThreshold. If everything is healthy, no webhook is sent.
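The alert condition can be expressed in a few lines. This is a sketch of the rule as described, assuming "exceeds" means strictly greater than the threshold; `should_alert` is a hypothetical name, and the report dictionaries only mirror the `failureRate` output field.

```python
def should_alert(reports: list[dict], failure_threshold: float = 0.1) -> bool:
    """True if at least one actor's failure rate exceeds the threshold."""
    return any(report["failureRate"] > failure_threshold for report in reports)


reports = [
    {"actorName": "website-contact-scraper", "failureRate": 0.06},
]
print(should_alert(reports, failure_threshold=0.1))   # False: 6% is under 10%
print(should_alert(reports, failure_threshold=0.05))  # True: 6% exceeds 5%
```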
Pricing
- $0.05 per health check — one check covers your entire fleet
- Runs in under 2 minutes for fleets of 100+ actors
| Frequency | Monthly cost |
|---|---|
| Daily | ~$1.50 |
| Twice daily | ~$3.00 |
| Hourly | ~$36.00 |
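The monthly figures in the table follow from the per-check price. A quick sketch of the arithmetic, assuming a 30-day month (the function name `monthly_cost` is illustrative):

```python
PRICE_PER_CHECK = 0.05  # USD per health check, from the pricing above


def monthly_cost(runs_per_day: int, days: int = 30) -> float:
    """Approximate monthly cost in USD, assuming a 30-day month."""
    return round(runs_per_day * days * PRICE_PER_CHECK, 2)


for label, runs_per_day in [("Daily", 1), ("Twice daily", 2), ("Hourly", 24)]:
    print(f"{label}: ${monthly_cost(runs_per_day):.2f}")
```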
Changelog
v1.0.0 (2026-03-18)
- Complete rebuild from scratch (v0 was too thin)
- Failure diagnosis via log pattern matching
- 7 failure categories with specific recommendations
- Trend tracking (current vs previous period)
- Revenue impact estimation from PPE pricing
- Webhook alerting with JSON payload
- Fleet-wide summary with top issue identification