AI Cost Audit & Anomaly Detection
Pricing
from $0.01 / 1,000 processed usage records
Audit and explain your LLM/API usage costs from CSV or JSON. Calculates token-based cost per model, groups by model/day/feature, detects spikes and expensive-model patterns, and outputs a dataset plus summary.json and report.md. No external APIs required.
Developer: Hayder Al-Khalissi
AI Cost Audit & Anomaly Detection Actor
v3.0 — Audit, analyze, and optimize your AI / LLM usage costs from CSV or JSON data.
This Actor calculates token-based costs per model, aggregates usage, detects anomalies, analyzes data quality, measures token efficiency, identifies cost optimization opportunities, and sends webhook alerts — all without calling external APIs. Get concrete, actionable recommendations to reduce costs by 20-40%.
NEW in v3.0:
- 🏷️ Attribution Layer — Cost breakdown by workflow, node, tenant, environment
- 📊 Outcome Linkage — Link costs to business outcomes (cost per outcome, outcomes per dollar)
- 🎯 Smarter Anomaly Detection — Dual-threshold spike detection (relative + absolute)
- 🛡️ Enforcement Engine — Simulated cost-control actions (warn, throttle, switch_model, block)
- 🔄 Idempotency Protection — Event deduplication with KV store caching (now with late event support)
- 🔒 Prompt Security — Automatic SHA-256 hashing (raw prompts never stored)
- 📅 Retention Metadata — Policy tracking for downstream systems
- 🔗 n8n Integration — First-class support for n8n workflow event ingestion with flexible field mapping
What this Actor does
For each usage record, the Actor:
- Calculates input / output token costs
- Aggregates costs by:
  - model
  - day
  - feature
  - user (optional)
  - workflow, node, tenant, environment (v3 — optional)
- Detects anomalies and cost spikes (optional, v3: dual-threshold)
- Analyzes data quality with comprehensive metrics
- Calculates token efficiency (input/output ratio, cost per request, efficiency score)
- Links costs to outcomes (v3 — cost per outcome, outcomes per dollar)
- Identifies cost optimization opportunities with concrete, actionable recommendations
- Simulates enforcement actions (v3 — when thresholds exceeded)
- Deduplicates events (v3 — idempotency protection)
- Analyzes trends (last 7 vs previous 7 days) and projects monthly costs
- Sends webhook alerts for anomalies and budget thresholds (optional)
- Produces:
  - 📊 Dataset — detailed per-record costs (6 pre-configured views including Attribution)
  - 📄 summary.json — complete analysis with all metrics
  - 🔍 Data Quality Report — validation metrics & recommendations
  - ⚡ Token Efficiency Metrics — efficiency score & optimization insights
  - 💡 Cost Optimization Opportunities — rules-based recommendations with potential savings
  - 📈 Trend & Forecast — last 7 vs previous 7 days comparison and monthly projection
  - 🏷️ Attribution Breakdown (v3) — by workflow, node, tenant, environment
  - 📊 Outcome Efficiency (v3) — cost per outcome, outcomes per dollar
  - 🛡️ Enforcement Actions (v3) — simulated cost-control decisions
  - 📝 report.md — human-readable insights and recommendations
Typical use cases
- AI / LLM Cost Auditing — Comprehensive cost analysis with quality validation
- FinOps & Cost Optimization — Identify savings opportunities with concrete recommendations
- Anomaly Detection — Real-time alerts for unusual usage spikes (v3: dual-threshold)
- Token Efficiency Analysis — Optimize prompt engineering and model selection
- Cost Attribution (v3) — Track costs by workflow, node, tenant, environment
- Outcome-Based ROI (v3) — Link costs to business outcomes (leads, conversions, etc.)
- Cost Control (v3) — Simulate enforcement actions when thresholds exceeded
- Event Deduplication (v3) — Prevent double-counting with idempotency protection
- Data Quality Assurance — Validate usage data integrity and completeness
- Trend Analysis — Compare last 7 vs previous 7 days spending trends
- Cost Forecasting — Project monthly costs based on recent usage patterns
- Budget Monitoring — Webhook alerts when spending exceeds thresholds
- Internal Reporting — Executive-friendly reports for OpenAI / Anthropic / Groq / Gemini / Azure OpenAI usage
- Model Comparison — Compare costs and efficiency across different models
Quick start (Apify Platform)
To run immediately without hosting data: use the Example input (JSON) below — copy it into the Input tab and run.
1. Open the Input tab
2. Choose your input type:
   - CSV (URL or raw)
   - JSON array
   - n8n events (v3 — see the n8n Integration section)
3. Configure the price table for your models
4. (Optional) Enable v3 features:
   - Attribution fields (`workflowId`, `nodeId`, `tenantId`, `environment`)
   - Outcome linkage (`outcomeType`, `outcomeValue`)
   - Enforcement simulation
   - Idempotency protection
   - n8n integration (for the n8n input type)
5. Run the Actor
Results will appear automatically in:
- Dataset (detailed rows with 6 pre-configured views including Attribution)
- Key-Value Store:
  - `summary.json` (includes data quality metrics, attribution, outcomes, enforcement)
  - `report.md` (includes all sections including v3 enhancements)
Features:
- ✅ Multi-provider support (OpenAI, Anthropic, Groq, Gemini, Azure OpenAI, generic)
- ✅ Model-aware anomaly detection (v3: dual-threshold)
- ✅ Data quality analysis (success rate, field completeness, outliers)
- ✅ Token efficiency metrics (efficiency score, input/output ratio, cost per request)
- ✅ Cost optimization engine (rules-based recommendations with potential savings)
- ✅ Trend & forecast (last 7 vs previous 7 days, monthly projection)
- ✅ Attribution breakdown (v3 — workflow, node, tenant, environment)
- ✅ Outcome efficiency (v3 — cost per outcome, outcomes per dollar)
- ✅ Enforcement simulation (v3 — cost-control actions)
- ✅ Idempotency protection (v3 — event deduplication)
- ✅ Prompt security (v3 — SHA-256 hashing)
- ✅ Webhook alerts (anomalies and budget thresholds)
- ✅ No external API keys required
Input overview
Required
- Usage data (CSV or JSON)
- Provider (openai, anthropic, groq, gemini, azure-openai, generic)
Optional
- Price table (uses defaults if not provided)
- Grouping fields (model, day, feature, user, workflow, node, tenant, environment)
- Anomaly detection settings (v3: dual-threshold with `minAbsoluteIncrease`)
- Enforcement config (v3 — cost-control simulation)
- Idempotency config (v3 — event deduplication)
- Retention metadata (v3 — informational policy)
- Webhook URL (for real-time alerts)
- Budget threshold (for budget exceeded alerts)
- Output options
Example input (JSON)
Copy the JSON below into the Actor input and run — it works as-is with no file hosting. Do not use placeholder URLs like https://example.com/usage-data.csv; they do not exist and will return HTTP 404.
To use your own data via URL instead, leave out rawData, set inputType to "csv" or "json", and set dataUrl to a real URL that serves your data.
```json
{
  "inputType": "json",
  "provider": "openai",
  "currency": "USD",
  "rawData": [
    { "timestamp": "2026-02-01T09:00:00Z", "model": "gpt-4o", "prompt_tokens": 1200, "completion_tokens": 450, "feature": "chatbot", "user_id": "user-1", "workflowId": "wf-support", "nodeId": "node-classify", "environment": "production", "tenantId": "tenant-acme", "endpoint": "/api/chat", "outcomeType": "resolution", "outcomeValue": 1, "eventId": "evt-001" },
    { "timestamp": "2026-02-01T10:00:00Z", "model": "gpt-4o", "prompt_tokens": 1500, "completion_tokens": 600, "feature": "chatbot", "user_id": "user-1", "workflowId": "wf-support", "nodeId": "node-respond", "environment": "production", "tenantId": "tenant-acme", "endpoint": "/api/chat", "outcomeType": "resolution", "outcomeValue": 1, "eventId": "evt-002" },
    { "timestamp": "2026-02-01T11:00:00Z", "model": "gpt-4o-mini", "prompt_tokens": 300, "completion_tokens": 120, "feature": "summarizer", "user_id": "user-2", "workflowId": "wf-content", "nodeId": "node-summarize", "environment": "production", "tenantId": "tenant-acme", "endpoint": "/api/summarize", "outcomeType": "summary", "outcomeValue": 1, "eventId": "evt-003" },
    { "timestamp": "2026-02-01T13:00:00Z", "model": "claude-3-5-sonnet", "prompt_tokens": 1800, "completion_tokens": 700, "feature": "analysis", "user_id": "user-5", "workflowId": "wf-analytics", "nodeId": "node-analyze", "environment": "staging", "tenantId": "tenant-acme", "endpoint": "/api/analyze", "outcomeType": "insight", "outcomeValue": 2, "eventId": "evt-004" },
    { "timestamp": "2026-02-01T15:00:00Z", "model": "gpt-4o", "prompt_tokens": 900, "completion_tokens": 350, "feature": "assistant", "user_id": "user-3", "workflowId": "wf-internal", "nodeId": "node-draft", "environment": "development", "tenantId": "tenant-beta", "endpoint": "/api/assist", "outcomeType": "draft", "outcomeValue": 1, "eventId": "evt-005" },
    { "timestamp": "2026-02-02T09:00:00Z", "model": "gpt-4o", "prompt_tokens": 1100, "completion_tokens": 420, "feature": "chatbot", "user_id": "user-1", "workflowId": "wf-support", "nodeId": "node-classify", "environment": "production", "tenantId": "tenant-acme", "endpoint": "/api/chat", "outcomeType": "resolution", "outcomeValue": 1, "eventId": "evt-006" },
    { "timestamp": "2026-02-02T11:00:00Z", "model": "claude-3-5-sonnet", "prompt_tokens": 2200, "completion_tokens": 850, "feature": "analysis", "user_id": "user-5", "workflowId": "wf-analytics", "nodeId": "node-analyze", "environment": "staging", "tenantId": "tenant-acme", "endpoint": "/api/analyze", "outcomeType": "insight", "outcomeValue": 3, "eventId": "evt-007" },
    { "timestamp": "2026-02-02T14:00:00Z", "model": "gpt-4o", "prompt_tokens": 950, "completion_tokens": 370, "feature": "assistant", "user_id": "user-3", "workflowId": "wf-internal", "nodeId": "node-draft", "environment": "development", "tenantId": "tenant-beta", "endpoint": "/api/assist", "outcomeType": "draft", "outcomeValue": 1, "eventId": "evt-008" },
    { "timestamp": "2026-02-02T16:00:00Z", "model": "gpt-4o", "prompt_tokens": 1600, "completion_tokens": 620, "feature": "chatbot", "user_id": "user-4", "workflowId": "wf-support", "nodeId": "node-respond", "environment": "production", "tenantId": "tenant-gamma", "endpoint": "/api/chat", "outcomeType": "resolution", "outcomeValue": 1, "eventId": "evt-009" },
    { "timestamp": "2026-02-03T10:00:00Z", "model": "gpt-4o", "prompt_tokens": 1350, "completion_tokens": 530, "feature": "chatbot", "user_id": "user-4", "workflowId": "wf-support", "nodeId": "node-respond", "environment": "production", "tenantId": "tenant-gamma", "endpoint": "/api/chat", "outcomeType": "resolution", "outcomeValue": 1, "eventId": "evt-010" }
  ],
  "priceTable": {
    "gpt-4o": { "inputPer1K": 0.0025, "outputPer1K": 0.01 },
    "gpt-4o-mini": { "inputPer1K": 0.00015, "outputPer1K": 0.0006 },
    "claude-3-5-sonnet": { "inputPer1K": 0.003, "outputPer1K": 0.015 }
  },
  "groupBy": ["model", "day", "feature", "workflow", "tenant"],
  "anomalyDetection": { "enabled": true, "thresholdMultiplier": 2.5, "minAbsoluteIncrease": 0.01, "groupByFeature": false },
  "enforcement": { "enabled": true, "maxDailyCost": 0.50, "action": "warn" },
  "idempotency": { "enabled": true, "ttlSeconds": 86400 },
  "retention": { "rawDays": 90, "aggregateDays": 365 },
  "budgetThreshold": 100,
  "writeDataset": true,
  "writeReportMd": true,
  "includeSummaryWhenEmpty": true
}
```
Using a URL instead? Omit `rawData`, set `"inputType"` to `"csv"` or `"json"`, and set `"dataUrl"` to a real URL that serves your CSV/JSON (e.g. a file you uploaded or a URL from your own server). Placeholder URLs like example.com do not work.
Re-running the same example? With `idempotency.enabled: true`, event IDs are remembered (for `ttlSeconds`, default 24 h). A second run with the same `eventId` values will skip all events and complete successfully with 0 new records and an empty cost breakdown. To see the cost breakdown again, either set `"idempotency": { "enabled": false }` in the example or use different `eventId` values.
Supported input formats
CSV format
Your CSV should include columns such as:
- `timestamp` (or `date`, `created_at`)
- `model`
- `prompt_tokens` (or `input_tokens`, `promptTokenCount` for Gemini)
- `completion_tokens` (or `output_tokens`, `candidatesTokenCount` for Gemini)
- Optional: `feature`, `user_id`
- v3 Optional: `workflowId`, `workflowName`, `nodeId`, `environment`, `tenantId`, `endpoint`, `provider`, `outcomeType`, `outcomeValue`, `eventId`, `prompt` (will be hashed)
Provider-specific field names:
- OpenAI: `prompt_tokens`, `completion_tokens`
- Anthropic: `input_tokens`, `output_tokens`
- Groq: `prompt_tokens`, `completion_tokens`
- Gemini: `promptTokenCount`, `candidatesTokenCount`
- Azure OpenAI: `prompt_tokens`, `completion_tokens` (same as OpenAI)
Example:
```
timestamp,model,prompt_tokens,completion_tokens,feature,workflowId,nodeId,outcomeType,outcomeValue
2026-02-01T10:00:00Z,gpt-4o-mini,500,200,chatbot,workflow-123,node-1,lead,1
```
JSON format
OpenAI / Azure OpenAI:
```json
[
  {
    "timestamp": "2026-02-01T10:00:00Z",
    "model": "gpt-4o-mini",
    "prompt_tokens": 500,
    "completion_tokens": 200,
    "feature": "chatbot",
    "workflowId": "workflow-123",
    "nodeId": "node-1",
    "tenantId": "tenant-abc",
    "environment": "prod",
    "outcomeType": "lead",
    "outcomeValue": 1,
    "eventId": "evt-xyz-123",
    "prompt": "You are a helpful assistant..."
  }
]
```
The `prompt` field will be hashed automatically; the raw prompt is never stored.
Anthropic:
```json
[
  {
    "timestamp": "2026-02-01T10:00:00Z",
    "model": "claude-3-5-sonnet",
    "input_tokens": 500,
    "output_tokens": 200,
    "feature": "chatbot",
    "workflowId": "workflow-123"
  }
]
```
Gemini:
```json
[
  {
    "timestamp": "2026-02-01T10:00:00Z",
    "model": "gemini-1.5-pro",
    "promptTokenCount": 500,
    "candidatesTokenCount": 200,
    "feature": "chatbot"
  }
]
```
Getting all reports
To get every output the Actor can produce, set these in the Input (they are all on by default):
| Input | Default | Effect |
|---|---|---|
| `writeDataset` | `true` | Write cost records (or one summary record when all are duplicates) to the Dataset. |
| `writeReportMd` | `true` | Generate report.md in the Key-Value Store (human-readable audit report). |
| `includeSummaryWhenEmpty` | `true` | When idempotency skips all records, still push one summary record to the Dataset so workflows (e.g. n8n) get a result. |
Summary (summary.json) is always written to the Key-Value Store; there is no option to disable it.
To include all reports: keep defaults or set explicitly:
```json
{
  "writeDataset": true,
  "writeReportMd": true,
  "includeSummaryWhenEmpty": true
}
```
Then you get:
- Dataset – Cost records (one per usage event), or one summary record when all events are duplicates.
- Key-Value Store – `summary.json` (full JSON summary) and `report.md` (Markdown report).
Output details
Dataset (per record)
Each dataset item represents one usage record:
```json
{
  "timestamp": "2026-02-01T10:00:00.000Z",
  "model": "gpt-4o-mini",
  "inputTokens": 500,
  "outputTokens": 200,
  "inputCost": 0.000075,
  "outputCost": 0.00012,
  "totalCost": 0.000195,
  "currency": "USD",
  "feature": "chatbot",
  "userId": "user123",
  "workflowId": "workflow-123",
  "nodeId": "node-1",
  "environment": "prod",
  "tenantId": "tenant-abc",
  "outcomeType": "lead",
  "outcomeValue": 1,
  "promptHash": "a1b2c3d4e5f6...",
  "eventId": "evt-xyz-123"
}
```
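The cost fields follow directly from the price table: tokens are divided by 1,000 and multiplied by the per-1K rate for each direction. A minimal sketch of that arithmetic (the Actor's internal rounding may differ slightly):

```python
def record_cost(input_tokens: int, output_tokens: int,
                input_per_1k: float, output_per_1k: float) -> tuple[float, float, float]:
    """Token-based cost: (tokens / 1000) * per-1K price, per direction."""
    input_cost = input_tokens / 1000 * input_per_1k
    output_cost = output_tokens / 1000 * output_per_1k
    return input_cost, output_cost, input_cost + output_cost

# gpt-4o-mini at $0.00015 / $0.0006 per 1K tokens, matching the dataset record above
inp, out, total = record_cost(500, 200, 0.00015, 0.0006)
print(round(inp, 9), round(out, 9), round(total, 9))
```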
Summary (summary.json)
```json
{
  "totalCost": 0.5234,
  "totalInputTokens": 15000,
  "totalOutputTokens": 10000,
  "totalRecords": 30,
  "currency": "USD",
  "byModel": [...],
  "byDay": [...],
  "byWorkflow": [...],
  "byNode": [...],
  "byTenant": [...],
  "byEnvironment": [...],
  "topCostDrivers": [...],
  "anomalies": [...],
  "outcomeEfficiency": {
    "recordsWithOutcomes": 25,
    "totalOutcomes": 150,
    "costPerOutcome": 0.0035,
    "outcomesPerDollar": 286.5,
    "byOutcomeType": [...]
  },
  "enforcementActions": [...],
  "idempotencyStats": { "duplicatesSkipped": 5, "uniqueEventsProcessed": 25 },
  "retentionPolicyApplied": { "rawDays": 90, "aggregateDays": 365, "appliedAt": "2026-02-06T..." },
  "dataQuality": {
    "overallQualityScore": 100,
    "qualityGrade": "excellent",
    "successRate": 100,
    "fieldCompleteness": {...},
    "recommendations": [...]
  }
}
```
Report (report.md)
A human-readable Markdown report including:
- Executive Summary with overall metrics and efficiency score
- Cost Breakdown Tables (by model, day, feature)
- Attribution Breakdown (v3 — by workflow, node, tenant, environment)
- Outcome Efficiency (v3 — cost per outcome, outcomes per dollar, by outcome type)
- Top Cost Drivers analysis
- Anomaly Alerts with detailed context and top-offender attribution (v3)
- Enforcement Actions (v3 — simulated cost-control decisions)
- Data Quality Report section:
  - Quality score (0-100) with grade badge
  - Field completeness table
  - Data issues summary
  - Outlier detection results
  - Quality recommendations
- Token Efficiency Metrics section:
  - Efficiency score (0-100) with grade
  - Key metrics table (cost per request, tokens per request, ratios)
  - Cost efficiency table (cost per 1K tokens)
  - Token utilization percentages
  - Efficiency recommendations
- Cost Optimization Opportunities section:
  - Total potential savings
  - Recommendations grouped by priority (high/medium/low)
  - Specific action items for each opportunity
  - Affected records and costs
- Trend & Forecast section:
  - Last 7 days vs previous 7 days comparison table
  - Trend direction (increasing/decreasing/stable) with percentage change
  - Projected monthly cost based on recent daily averages
- Retention Policy (v3 — informational metadata)
- General Optimization Recommendations
Ideal for sharing with non-technical stakeholders and executives.
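The trend-and-forecast arithmetic in the report is easy to reproduce: sum the last 7 daily totals, compare them with the 7 days before, and project a month from the recent daily average. A hedged sketch (the ±5% stability band and the 30-day projection factor are illustrative assumptions, not the Actor's exact thresholds):

```python
def trend_and_forecast(daily_costs: list[float]) -> dict:
    """daily_costs: chronological per-day totals; needs >= 14 days for a full comparison."""
    last7 = sum(daily_costs[-7:])
    prev7 = sum(daily_costs[-14:-7])
    change_pct = (last7 - prev7) / prev7 * 100 if prev7 else float("inf")
    # Assumed stability band of +/- 5%:
    direction = "increasing" if change_pct > 5 else "decreasing" if change_pct < -5 else "stable"
    projected_monthly = last7 / 7 * 30  # recent daily average * 30 days
    return {"last7": last7, "prev7": prev7, "changePct": change_pct,
            "direction": direction, "projectedMonthly": projected_monthly}

# Spend doubled from $1/day to $2/day:
print(trend_and_forecast([1.0] * 7 + [2.0] * 7))
```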
v3.0 New Features
🏷️ Attribution Layer
Track costs by workflow, node, tenant, and environment:
```json
{
  "workflowId": "workflow-123",
  "workflowName": "Lead Generation Pipeline",
  "nodeId": "node-1",
  "environment": "prod",
  "tenantId": "tenant-abc",
  "endpoint": "/api/chat"
}
```
Aggregations:
- `byWorkflow` — Cost breakdown by workflow
- `byNode` — Cost breakdown by node/step
- `byTenant` — Cost breakdown by tenant
- `byEnvironment` — Cost breakdown by environment (prod/stage/dev)
Use cases:
- Multi-tenant cost allocation
- Workflow-level ROI analysis
- Environment cost comparison
- Node-level optimization
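Conceptually, each attribution aggregation is just a sum of record costs keyed by one field. A minimal sketch (the record shape is hypothetical and simplified to `totalCost` plus the attribution field):

```python
from collections import defaultdict

def aggregate_by(records: list[dict], key: str) -> dict[str, float]:
    """Sum totalCost per distinct value of `key` (e.g. workflowId, tenantId)."""
    totals: dict[str, float] = defaultdict(float)
    for rec in records:
        totals[rec.get(key, "unknown")] += rec["totalCost"]
    return dict(totals)

records = [
    {"workflowId": "wf-support", "totalCost": 0.02},
    {"workflowId": "wf-support", "totalCost": 0.03},
    {"workflowId": "wf-analytics", "totalCost": 0.05},
]
print(aggregate_by(records, "workflowId"))
```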
📊 Outcome Linkage
Link costs to business outcomes:
```json
{
  "outcomeType": "lead",
  "outcomeValue": 1
}
```
Metrics:
- `costPerOutcome` — Average cost to produce one outcome
- `outcomesPerDollar` — How many outcomes per dollar spent
- Breakdown by outcome type
Use cases:
- ROI calculation (cost per lead, cost per conversion)
- Efficiency tracking (outcomes per dollar)
- Outcome-based optimization
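The two headline metrics are simple ratios over the records that carry outcome data. A minimal sketch under that assumption (records without `outcomeValue` are excluded, which matches the `recordsWithOutcomes` counter in the summary):

```python
def outcome_efficiency(records: list[dict]) -> dict:
    """Cost per outcome and outcomes per dollar over records carrying outcome data."""
    linked = [r for r in records if r.get("outcomeValue") is not None]
    total_cost = sum(r["totalCost"] for r in linked)
    total_outcomes = sum(r["outcomeValue"] for r in linked)
    return {
        "recordsWithOutcomes": len(linked),
        "totalOutcomes": total_outcomes,
        "costPerOutcome": total_cost / total_outcomes if total_outcomes else None,
        "outcomesPerDollar": total_outcomes / total_cost if total_cost else None,
    }

print(outcome_efficiency([
    {"totalCost": 0.10, "outcomeValue": 2},
    {"totalCost": 0.30, "outcomeValue": 2},
    {"totalCost": 0.05},  # no outcome data -> excluded from the ratios
]))
```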
🎯 Smarter Anomaly Detection
Dual-threshold spike detection:
A spike is only flagged when both conditions are met:
- Relative spike > `thresholdMultiplier` (e.g., 2.5×)
- Absolute delta > `minAbsoluteIncrease` (e.g., $0.50)
This prevents noise from small-value spikes while catching significant increases.
```json
{
  "anomalyDetection": {
    "enabled": true,
    "thresholdMultiplier": 2.5,
    "minAbsoluteIncrease": 0.50,
    "groupByFeature": false
  }
}
```
Enhanced anomaly context:
- `topOffender` — Attribution context (workflowId, nodeId, model, promptHash)
- Helps operators pinpoint the root cause
🛡️ Enforcement Engine
Simulate cost-control actions when thresholds are exceeded:
```json
{
  "enforcement": {
    "enabled": true,
    "maxDailyCost": 100,
    "action": "warn"
  }
}
```
Actions:
- `warn` — Log warning (default)
- `throttle` — Simulate throttling
- `switch_model` — Simulate model downgrade
- `block` — Simulate request blocking
Important: All actions are simulated — no actual blocking occurs. This provides visibility into enforcement rules without risking data-pipeline disruption.
🔄 Idempotency Protection
Deduplicate events using eventId:
```json
{
  "idempotency": {
    "enabled": true,
    "ttlSeconds": 86400
  }
}
```
How it works:
- Records with the same `eventId` are processed only once (within the TTL window)
- Uses the Apify KV store for persistence
- Records without `eventId` always pass through (fail-open)
Use cases:
- Prevent double-counting from retries
- Handle duplicate webhook deliveries
- Ensure idempotent processing
🔒 Prompt Security
Automatic prompt hashing:
If a record contains a prompt field:
- The raw prompt is hashed using SHA-256
- Only the hash (`promptHash`) is stored
- The raw prompt is never logged or stored
Example — an input record containing:
```json
{ "prompt": "You are a helpful assistant..." }
```
is automatically converted to:
```json
{ "promptHash": "a1b2c3d4e5f6..." }
```
Security benefits:
- No PII leakage in logs
- No sensitive prompts in datasets
- Hash can be used for deduplication/analysis
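SHA-256 hashing of a prompt is a one-liner with the standard library. A minimal sketch of the transformation (the truncated hash shown in the example above is cosmetic; the real digest is 64 hex characters):

```python
import hashlib

def secure_record(record: dict) -> dict:
    """Replace a raw `prompt` with its SHA-256 hex digest; leave other fields intact."""
    record = dict(record)  # copy so the caller's object is not mutated
    prompt = record.pop("prompt", None)
    if prompt is not None:
        record["promptHash"] = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return record

out = secure_record({"prompt": "You are a helpful assistant...", "model": "gpt-4o"})
print("prompt" in out, len(out["promptHash"]))
```

Because the digest is deterministic, identical prompts always produce the same `promptHash`, which is what makes it usable for deduplication and repeat-prompt analysis.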
📅 Retention Metadata
Attach retention policy metadata (informational only):
```json
{
  "retention": {
    "rawDays": 90,
    "aggregateDays": 365
  }
}
```
Note: This actor does not delete data. The policy is attached to the summary for downstream systems to interpret.
🔗 n8n Integration
First-class support for ingesting n8n workflow events with flexible field mapping and automatic attribution.
Features:
- Accepts n8n events from HTTP Request/Webhook exports
- Maps custom field names to canonical structure
- Automatic eventId generation when missing
- Prompt hashing (SHA-256) with size limits
- Late event detection and marking
- Full attribution breakdown (workflow → node → run)
Configuration:
```json
{
  "inputType": "n8n",
  "provider": "generic",
  "n8n": {
    "enabled": true,
    "source": {
      "type": "raw",
      "rawEvents": [
        {
          "eventId": "evt-001",
          "timestamp": "2026-02-01T10:00:00Z",
          "provider": "openai",
          "model": "gpt-4o",
          "inputTokens": 1200,
          "outputTokens": 450,
          "workflowId": "wf-support",
          "workflowName": "Customer Support",
          "nodeId": "node-classify",
          "nodeName": "Classify Request",
          "runId": "run-001",
          "env": "prod",
          "tenantId": "tenant-acme",
          "userId": "user-1",
          "endpoint": "/v1/chat/completions",
          "prompt": "Classify this request",
          "outcomeType": "resolution",
          "outcomeValue": 1,
          "outcomeOk": true
        }
      ]
    },
    "mapping": {
      "workflowIdField": "workflowId",
      "workflowNameField": "workflowName",
      "nodeIdField": "nodeId",
      "nodeNameField": "nodeName",
      "environmentField": "env",
      "tenantIdField": "tenantId",
      "userIdField": "userId",
      "runIdField": "runId",
      "eventIdField": "eventId",
      "timestampField": "timestamp"
    },
    "defaults": { "environment": "prod" },
    "derive": {
      "workflowNameFrom": "workflowId",
      "nodeIdFrom": "nodeName",
      "hashPrompt": true
    }
  },
  "idempotency": {
    "enabled": true,
    "ttlHours": 168,
    "allowedLateHours": 48
  }
}
```
Canonical Event Structure:
The Actor expects events with these fields (all optional except timestamp, model/provider, and token counts):
```json
{
  "eventId": "uuid-or-hash",
  "timestamp": "ISO string",
  "provider": "openai|anthropic|google|bedrock|other",
  "model": "gpt-4o|claude-3-5-sonnet|...",
  "inputTokens": 123,
  "outputTokens": 456,
  "workflowId": "wf_123",
  "workflowName": "Lead Qualification",
  "nodeId": "node_7",
  "nodeName": "OpenAI Chat",
  "runId": "execution_999",
  "env": "prod|stage|dev",
  "tenantId": "customer_42",
  "userId": "agent_7",
  "endpoint": "/v1/responses",
  "prompt": "OPTIONAL raw prompt",
  "outcomeType": "lead",
  "outcomeValue": 1,
  "outcomeOk": true
}
```
Field Mapping:
Use `n8n.mapping` to map custom field names:
```json
{
  "n8n": {
    "mapping": {
      "workflowIdField": "custom_workflow_id",
      "nodeIdField": "custom_node_id"
    }
  }
}
```
EventId Generation:
If `eventId` is missing, it's automatically generated from:
```
sha256(runId + nodeId + timestamp + model + inputTokens + outputTokens)
```
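The derived id can be sketched directly from that formula. Note the exact concatenation and encoding used by the Actor is not documented here, so this is an assumption; what matters is that the same event fields always yield the same id:

```python
import hashlib

def derive_event_id(run_id: str, node_id: str, timestamp: str,
                    model: str, input_tokens: int, output_tokens: int) -> str:
    """Deterministic fallback eventId; the exact join/encoding is an assumption."""
    material = f"{run_id}{node_id}{timestamp}{model}{input_tokens}{output_tokens}"
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

a = derive_event_id("run-001", "node-classify", "2026-02-01T10:00:00Z", "gpt-4o", 1200, 450)
b = derive_event_id("run-001", "node-classify", "2026-02-01T10:00:00Z", "gpt-4o", 1200, 450)
print(a == b, len(a))  # identical inputs -> identical id, 64 hex chars
```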
Prompt Handling:
- Prompts are automatically hashed using SHA-256
- Raw prompts are never stored (only `promptHash`)
- Prompts exceeding 1MB are dropped with a warning
- Set `derive.hashPrompt: false` to disable hashing
Late Events:
Events older than `lastRunEnd - allowedLateHours` are:
- Still processed (not skipped)
- Marked with a `lateEvent: true` flag
- Included in `idempotencyStats.lateEventsCount`
Aggregations:
When n8n events include attribution fields, the Actor produces:
- `byWorkflow` — Cost breakdown by workflow
- `byNode` — Cost breakdown by node/step
- `byTenant` — Cost breakdown by tenant
- `byEnvironment` — Cost breakdown by environment
- `byRunId` — Cost breakdown by execution run
Use Cases:
- Multi-tenant cost allocation
- Workflow-level ROI analysis
- Node-level optimization
- Cost per successful run (with `outcomeOk`)
- Late event reconciliation
Example Output:
```json
{
  "byWorkflow": [
    { "key": "wf-support", "totalCost": 0.5234, "inputTokens": 15000, "outputTokens": 10000, "recordCount": 30 }
  ],
  "byNode": [
    { "key": "node-classify", "totalCost": 0.2345, "inputTokens": 8000, "outputTokens": 5000, "recordCount": 15 }
  ],
  "idempotencyStats": { "duplicatesSkipped": 5, "uniqueEventsProcessed": 25, "lateEventsCount": 2 }
}
```
Anomaly Detection
Overview
The Actor includes intelligent anomaly detection to identify unusual spending patterns. Anomalies are detected across multiple dimensions:
- Cost Spikes - Individual requests with unusually high costs (model-aware, v3: dual-threshold)
- Expensive Models - Models consuming disproportionate budget
- High Volume Days - Days with abnormally high usage
- Low Efficiency - Models with poor output/input token ratios
Configuration
Configure anomaly detection using the anomalyDetection object:
```json
{
  "anomalyDetection": {
    "enabled": true,
    "thresholdMultiplier": 2.5,
    "minAbsoluteIncrease": 0.50,
    "groupByFeature": false
  }
}
```
Options:
- `enabled` (boolean) — Enable/disable anomaly detection (default: `true`)
- `thresholdMultiplier` (number) — Relative sensitivity multiplier (default: `2.5`)
  - Higher = less sensitive (fewer anomalies)
  - Lower = more sensitive (more anomalies)
  - Range: 1.5 - 10
- `minAbsoluteIncrease` (number, v3) — Minimum absolute cost increase to trigger a spike (default: `0.01`)
  - Prevents noise from small-value spikes
  - Example: `0.50` means a $0.50 minimum delta is required
- `groupByFeature` (boolean) — Use per-model+feature baselines for spike detection (default: `false`)
  - `false` — Compare each record against other records from the same model
  - `true` — Compare against records from the same model AND feature
Data Requirements
Anomaly detection requires minimum data thresholds:
- Spike Detection: Requires at least 10 records
- Volume Detection: Requires at least 3 unique days
If your dataset doesn't meet these requirements:
- `anomalyDetectionStatus` will be set to `"insufficient_data"`
- `anomalyDetectionNotes` will explain what's missing
- The report will show a "Data too small" warning
- No spike/volume anomalies will be generated
Example insufficient data response:
```json
{
  "anomalyDetectionStatus": "insufficient_data",
  "anomalyDetectionNotes": [
    "Too few records for spike detection (need at least 10, have 3).",
    "Too few days for volume detection (need at least 3, have 1)."
  ],
  "anomalies": []
}
```
Model-Aware Spike Detection
Spike detection uses per-model baselines to avoid false positives when comparing different models:
- Each model's costs are analyzed separately
- A spike is only flagged if it's unusual for that specific model
- Example: a $0.50 request is normal for `gpt-4o` but a spike for `gpt-4o-mini`
When `groupByFeature: true`, baselines are computed per model+feature combination:
- Useful when features have different typical usage patterns
- Example: "image-generation" vs "text-completion" may have different cost profiles
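Per-model baselining just means the statistics are computed inside each model's group before any comparison happens. A sketch using the median as the baseline statistic (an illustrative choice; the Actor also reports p95 and rank):

```python
from collections import defaultdict
from statistics import median

def per_model_baselines(records: list[dict]) -> dict[str, float]:
    """Median totalCost per model, so models are never compared against each other."""
    by_model: dict[str, list[float]] = defaultdict(list)
    for rec in records:
        by_model[rec["model"]].append(rec["totalCost"])
    return {model: median(costs) for model, costs in by_model.items()}

records = (
    [{"model": "gpt-4o", "totalCost": 0.50}] * 5
    + [{"model": "gpt-4o-mini", "totalCost": 0.002}] * 5
    + [{"model": "gpt-4o-mini", "totalCost": 0.50}]  # unusual *for this model*
)
baselines = per_model_baselines(records)
# $0.50 sits right at gpt-4o's baseline but far above gpt-4o-mini's:
print(baselines["gpt-4o"], baselines["gpt-4o-mini"])
```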
Enriched Spike Anomalies
Spike anomalies include detailed context for investigation:
```json
{
  "type": "spike",
  "severity": "high",
  "model": "gpt-4o",
  "feature": "chatbot",
  "baseline": "per_model",
  "value": 0.5234,
  "threshold": 0.2500,
  "modelMedian": 0.0850,
  "modelP95": 0.3200,
  "groupRecordCount": 150,
  "rankInGroup": 1,
  "timestamp": "2026-02-01T14:32:00Z",
  "topOffender": {
    "workflowId": "workflow-123",
    "nodeId": "node-1",
    "model": "gpt-4o",
    "promptHash": "a1b2c3..."
  }
}
```
Fields:
- `baseline` — How the baseline was computed (`per_model` or `per_model_feature`)
- `modelMedian` — Median cost for this model/group
- `modelP95` — 95th percentile cost
- `groupRecordCount` — Total records in the baseline group
- `rankInGroup` — Where this record ranks (1 = most expensive)
- `topOffender` (v3) — Attribution context to help pinpoint the root cause
Backward Compatibility
Legacy configuration format is still supported:
```json
{
  "anomalyDetectionEnabled": true,
  "anomalyThreshold": 2.5
}
```
However, the new anomalyDetection object format is recommended for access to all features including dual-threshold.
Key Features
🔍 Data Quality Analysis
- Overall quality score (0-100) with grade (excellent/good/fair/poor)
- Success rate tracking (valid vs invalid records)
- Field completeness metrics (feature, userId, required fields)
- Data issue detection (normalization errors, invalid timestamps, duplicates, outliers)
- Actionable quality recommendations
⚡ Token Efficiency Metrics
- Efficiency score (0-100) based on multiple factors
- Input/output ratio analysis
- Cost per request metrics
- Cost per 1K tokens (input/output/total)
- Token utilization percentages
- Efficiency recommendations
💡 Cost Optimization Engine
- Rules-based recommendations (no AI magic)
- 5 detection rules:
- Model downgrade opportunities (expensive models for simple tasks)
- Prompt optimization (high input/low output ratio)
- Caching opportunities (repetitive requests)
- Feature optimization (dominating spend)
- High-cost pattern detection
- Concrete action items for each recommendation
- Potential savings estimates
- Priority-based grouping (high/medium/low)
🏷️ Attribution & Outcome Analysis (v3)
- Cost attribution by workflow, node, tenant, environment
- Outcome efficiency — cost per outcome, outcomes per dollar
- Breakdown by outcome type for ROI analysis
🛡️ Enforcement & Security (v3)
- Enforcement simulation — cost-control actions when thresholds exceeded
- Idempotency protection — event deduplication with KV store
- Prompt security — automatic SHA-256 hashing (raw prompts never stored)
📡 Webhook Alerts (Optional)
- Anomaly alerts — HTTP POST when anomalies detected
- Budget alerts — HTTP POST when budget threshold exceeded
- Works with Slack, Discord, Zapier, or any webhook endpoint
- Simple HTTP POST with JSON payloads
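Sending an alert is a plain HTTP POST with a JSON body. A minimal sketch using only the standard library (the payload field names here are hypothetical; check your own run's output for the exact alert schema):

```python
import json
import urllib.request

def send_alert(webhook_url: str, payload: dict) -> None:
    """POST a JSON alert; works with Slack-style and generic webhook endpoints."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)  # fire-and-forget; add retries/timeouts in production

# Hypothetical anomaly payload shape:
payload = {"type": "anomaly", "severity": "high", "model": "gpt-4o", "value": 0.5234}
print(json.dumps(payload))
# send_alert("https://webhook.site/<your-id>", payload)  # try a test endpoint first
```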
Pricing notes
- This Actor does not call external APIs
- All cost calculations are done locally using your price table
- You are charged only for running the Actor, not per record processed (Apify platform fees apply)
Common issues & fixes
"No input provided"
Ensure input is passed via:
- Apify UI
- Input JSON
- URL or raw data
"No price found for model"
Add the missing model to the priceTable:
```json
{
  "your-model-name": {
    "inputPer1K": 0.001,
    "outputPer1K": 0.002
  }
}
```
CSV parsing errors
- Ensure header row exists
- Use commas as separators
- Quote text fields with commas
Webhook not sending
- Verify `webhookUrl` is configured correctly (must be HTTP/HTTPS)
- Check that anomalies were detected or the budget threshold was exceeded
- Review Actor logs for webhook errors
- Test with https://webhook.site/ first
No optimization recommendations
- Optimization engine analyzes patterns automatically
- Recommendations appear when specific patterns are detected:
- Model downgrade: Requires 5+ simple requests with expensive models
- Prompt optimization: Requires 3+ requests with high input/low output ratio
- Caching: Requires 5+ repetitive requests (same model + feature)
- Feature optimization: Requires feature with >50% of spend
- If no recommendations: Current usage may already be efficient
v3 features not appearing
Attribution aggregations:
- Ensure records include `workflowId`, `nodeId`, `tenantId`, or `environment` fields
- Add these fields to your `groupBy` array if you want them in aggregations
Outcome efficiency:
- Ensure records include both `outcomeType` and `outcomeValue` fields
- The section only appears if at least one record has outcome data
Enforcement actions:
- Enable `enforcement.enabled: true` in the config
- Set a `maxDailyCost` threshold
- Actions only appear if the threshold is exceeded
Idempotency:
- Enable `idempotency.enabled: true` in the config
- Ensure records include an `eventId` field
- Stats only appear if deduplication is enabled
Local development (optional)
For contributors or advanced users.
Install dependencies
```shell
npm install
```
Build
```shell
npm run build
```
Run tests
```shell
npm test
```
Run locally
```shell
npx apify run --input-file=example-input-raw.json
```
Deployment (developers)
```shell
apify login
apify push
```
Then run via the Apify Console.
What Makes This Actor Unique?
🎯 Actionable Insights (Not Just Metrics)
- Concrete recommendations: "Switch 15 requests from gpt-4o to gpt-4o-mini"
- Potential savings: See estimated cost reduction before acting
- Specific action items: Know exactly what to do
📊 Comprehensive Analysis
- Cost analysis: Aggregations by model, day, feature, user, workflow, node, tenant, environment
- Anomaly detection: Model-aware spike detection with dual-threshold (v3)
- Data quality: Quality score, completeness metrics, validation errors
- Token efficiency: Efficiency score, ratios, utilization analysis
- Outcome linkage (v3): Cost per outcome, outcomes per dollar
- Optimization: Rules-based recommendations with savings estimates
- Trend & forecast: Last 7 vs previous 7 days comparison, monthly projection
- Enforcement simulation (v3): Cost-control actions when thresholds exceeded
🔔 Real-Time Alerts
- Webhook notifications: Get alerted when anomalies detected or budget exceeded
- Works with any endpoint: Slack, Discord, Zapier, custom APIs
- Simple setup: Just provide webhook URL
🚀 Production Ready
- No external APIs: All processing done locally
- Fast: Processes thousands of records in seconds
- Secure: No API keys, no prompt logging (v3: automatic prompt hashing)
- Idempotent (v3): Event deduplication prevents double-counting
- Well-documented: Complete documentation and examples
Backward Compatibility
v3.0 is fully backward compatible with v2.0:
- ✅ All existing fields preserved
- ✅ Legacy config format still supported
- ✅ Dataset schema only extended (new fields optional)
- ✅ Summary schema only extended (new sections optional)
- ✅ All v3 features disabled by default
- ✅ Actor behaves identically to v2.0 unless you opt in
- ✅ n8n mode is opt-in (does not affect csv/json modes)
Migration:
- No changes required — existing configs continue to work
- Enable v3 features by adding new config sections
- Add attribution/outcome fields to your data to unlock new aggregations
- Use `inputType: "n8n"` to enable n8n event ingestion
Ready to audit your AI costs?
Run the Actor, upload your usage data, and get:
- ✅ Complete cost breakdown
- ✅ Attribution by workflow/node/tenant/environment (v3)
- ✅ Outcome efficiency metrics (v3)
- ✅ Data quality validation
- ✅ Token efficiency analysis
- ✅ Concrete optimization recommendations
- ✅ Trend analysis & monthly forecast
- ✅ Dual-threshold anomaly detection (v3)
- ✅ Enforcement simulation (v3)
- ✅ Idempotency protection (v3)
- ✅ Real-time webhook alerts
Get clear answers about where your AI budget is going — and how to reduce it — in minutes.
🚀