AI Cost Audit & Anomaly Detection

Audit and explain your LLM/API usage costs from CSV or JSON. Calculates token-based cost per model, groups by model/day/feature, detects spikes and expensive-model patterns, and outputs a dataset plus summary.json and report.md. No external APIs required.

Pricing: from $0.01 / 1,000 processed usage records

Developer: Hayder Al-Khalissi (Maintained by Community)

AI Cost Audit & Anomaly Detection Actor

v3.0 — Audit, analyze, and optimize your AI / LLM usage costs from CSV or JSON data.

This Actor calculates token-based costs per model, aggregates usage, detects anomalies, analyzes data quality, measures token efficiency, identifies cost optimization opportunities, and sends webhook alerts — all without calling external APIs. Get concrete, actionable recommendations to reduce costs by 20-40%.

NEW in v3.0:

  • 🏷️ Attribution Layer — Cost breakdown by workflow, node, tenant, environment
  • 📊 Outcome Linkage — Link costs to business outcomes (cost per outcome, outcomes per dollar)
  • 🎯 Smarter Anomaly Detection — Dual-threshold spike detection (relative + absolute)
  • 🛡️ Enforcement Engine — Simulated cost-control actions (warn, throttle, switch_model, block)
  • 🔄 Idempotency Protection — Event deduplication with KV store caching (now with late event support)
  • 🔒 Prompt Security — Automatic SHA-256 hashing (raw prompts never stored)
  • 📅 Retention Metadata — Policy tracking for downstream systems
  • 🔗 n8n Integration — First-class support for n8n workflow event ingestion with flexible field mapping

What this Actor does

For each usage record, the Actor:

  • Calculates input / output token costs
  • Aggregates costs by:
    • model
    • day
    • feature
    • user (optional)
    • workflow, node, tenant, environment (v3 — optional)
  • Detects anomalies and cost spikes (optional, v3: dual-threshold)
  • Analyzes data quality with comprehensive metrics
  • Calculates token efficiency (input/output ratio, cost per request, efficiency score)
  • Links costs to outcomes (v3 — cost per outcome, outcomes per dollar)
  • Identifies cost optimization opportunities with concrete, actionable recommendations
  • Simulates enforcement actions (v3 — when thresholds are exceeded)
  • Deduplicates events (v3 — idempotency protection)
  • Analyzes trends (last 7 vs previous 7 days) and projects monthly costs
  • Sends webhook alerts for anomalies and budget thresholds (optional)
  • Produces:
    • 📊 Dataset — detailed per-record costs (6 pre-configured views including Attribution)
    • 📄 summary.json — complete analysis with all metrics
    • 🔍 Data Quality Report — validation metrics & recommendations
    • Token Efficiency Metrics — efficiency score & optimization insights
    • 💡 Cost Optimization Opportunities — rules-based recommendations with potential savings
    • 📈 Trend & Forecast — last 7 vs previous 7 days comparison and monthly projection
    • 🏷️ Attribution Breakdown (v3) — by workflow, node, tenant, environment
    • 📊 Outcome Efficiency (v3) — cost per outcome, outcomes per dollar
    • 🛡️ Enforcement Actions (v3) — simulated cost-control decisions
    • 📝 report.md — human-readable insights and recommendations
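
The token-cost arithmetic above can be sketched in a few lines. This is illustrative JavaScript, not the Actor's internal API; recordCost is a hypothetical helper, and prices follow the per-1K-token convention used by the priceTable.

```javascript
// Illustrative sketch of the per-record cost math; recordCost is a
// hypothetical helper name, not the Actor's internal API.
function recordCost(record, priceTable) {
  const price = priceTable[record.model];
  if (!price) throw new Error(`No price found for model: ${record.model}`);
  const inputCost = (record.prompt_tokens / 1000) * price.inputPer1K;
  const outputCost = (record.completion_tokens / 1000) * price.outputPer1K;
  return { inputCost, outputCost, totalCost: inputCost + outputCost };
}

const priceTable = { "gpt-4o-mini": { inputPer1K: 0.00015, outputPer1K: 0.0006 } };
const cost = recordCost(
  { model: "gpt-4o-mini", prompt_tokens: 500, completion_tokens: 200 },
  priceTable
);
// 0.5 * 0.00015 + 0.2 * 0.0006 = 0.000075 + 0.00012 = 0.000195
```

These numbers match the Dataset example later in this README (inputCost 0.000075, outputCost 0.00012, totalCost 0.000195).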

Typical use cases

  • AI / LLM Cost Auditing — Comprehensive cost analysis with quality validation
  • FinOps & Cost Optimization — Identify savings opportunities with concrete recommendations
  • Anomaly Detection — Real-time alerts for unusual usage spikes (v3: dual-threshold)
  • Token Efficiency Analysis — Optimize prompt engineering and model selection
  • Cost Attribution (v3) — Track costs by workflow, node, tenant, environment
  • Outcome-Based ROI (v3) — Link costs to business outcomes (leads, conversions, etc.)
  • Cost Control (v3) — Simulate enforcement actions when thresholds are exceeded
  • Event Deduplication (v3) — Prevent double-counting with idempotency protection
  • Data Quality Assurance — Validate usage data integrity and completeness
  • Trend Analysis — Compare last 7 vs previous 7 days spending trends
  • Cost Forecasting — Project monthly costs based on recent usage patterns
  • Budget Monitoring — Webhook alerts when spending exceeds thresholds
  • Internal Reporting — Executive-friendly reports for OpenAI / Anthropic / Groq / Gemini / Azure OpenAI usage
  • Model Comparison — Compare costs and efficiency across different models

Quick start (Apify Platform)

To run immediately without hosting data: use the Example input (JSON) below — copy it into the Input tab and run.

  1. Open the Input tab
  2. Choose your input type:
    • CSV (URL or raw)
    • JSON array
    • n8n events (v3 — see n8n Integration section)
  3. Configure the price table for your models
  4. (Optional) Enable v3 features:
    • Attribution fields (workflowId, nodeId, tenantId, environment)
    • Outcome linkage (outcomeType, outcomeValue)
    • Enforcement simulation
    • Idempotency protection
    • n8n integration (for n8n input type)
  5. Run the Actor

Results will appear automatically in:

  • Dataset (detailed rows with 6 pre-configured views including Attribution)
  • Key-Value Store
    • summary.json (includes data quality metrics, attribution, outcomes, enforcement)
    • report.md (includes all sections including v3 enhancements)

Features:

  • ✅ Multi-provider support (OpenAI, Anthropic, Groq, Gemini, Azure OpenAI, generic)
  • ✅ Model-aware anomaly detection (v3: dual-threshold)
  • ✅ Data quality analysis (success rate, field completeness, outliers)
  • ✅ Token efficiency metrics (efficiency score, input/output ratio, cost per request)
  • ✅ Cost optimization engine (rules-based recommendations with potential savings)
  • ✅ Trend & forecast (last 7 vs previous 7 days, monthly projection)
  • ✅ Attribution breakdown (v3 — workflow, node, tenant, environment)
  • ✅ Outcome efficiency (v3 — cost per outcome, outcomes per dollar)
  • ✅ Enforcement simulation (v3 — cost-control actions)
  • ✅ Idempotency protection (v3 — event deduplication)
  • ✅ Prompt security (v3 — SHA-256 hashing)
  • ✅ Webhook alerts (anomalies and budget thresholds)
  • ✅ No external API keys required

Input overview

Required

  • Usage data (CSV or JSON)
  • Provider (openai, anthropic, groq, gemini, azure-openai, generic)

Optional

  • Price table (uses defaults if not provided)
  • Grouping fields (model, day, feature, user, workflow, node, tenant, environment)
  • Anomaly detection settings (v3: dual-threshold with minAbsoluteIncrease)
  • Enforcement config (v3 — cost-control simulation)
  • Idempotency config (v3 — event deduplication)
  • Retention metadata (v3 — informational policy)
  • Webhook URL (for real-time alerts)
  • Budget threshold (for budget exceeded alerts)
  • Output options

Example input (JSON)

Copy the JSON below into the Actor input and run — it works as-is with no file hosting. Do not use placeholder URLs like https://example.com/usage-data.csv; they do not exist and will return HTTP 404.

To use your own data via URL instead, leave out rawData, set inputType to "csv" or "json", and set dataUrl to a real URL that serves your data.

{
  "inputType": "json",
  "provider": "openai",
  "currency": "USD",
  "rawData": [
    {"timestamp":"2026-02-01T09:00:00Z","model":"gpt-4o","prompt_tokens":1200,"completion_tokens":450,"feature":"chatbot","user_id":"user-1","workflowId":"wf-support","nodeId":"node-classify","environment":"production","tenantId":"tenant-acme","endpoint":"/api/chat","outcomeType":"resolution","outcomeValue":1,"eventId":"evt-001"},
    {"timestamp":"2026-02-01T10:00:00Z","model":"gpt-4o","prompt_tokens":1500,"completion_tokens":600,"feature":"chatbot","user_id":"user-1","workflowId":"wf-support","nodeId":"node-respond","environment":"production","tenantId":"tenant-acme","endpoint":"/api/chat","outcomeType":"resolution","outcomeValue":1,"eventId":"evt-002"},
    {"timestamp":"2026-02-01T11:00:00Z","model":"gpt-4o-mini","prompt_tokens":300,"completion_tokens":120,"feature":"summarizer","user_id":"user-2","workflowId":"wf-content","nodeId":"node-summarize","environment":"production","tenantId":"tenant-acme","endpoint":"/api/summarize","outcomeType":"summary","outcomeValue":1,"eventId":"evt-003"},
    {"timestamp":"2026-02-01T13:00:00Z","model":"claude-3-5-sonnet","prompt_tokens":1800,"completion_tokens":700,"feature":"analysis","user_id":"user-5","workflowId":"wf-analytics","nodeId":"node-analyze","environment":"staging","tenantId":"tenant-acme","endpoint":"/api/analyze","outcomeType":"insight","outcomeValue":2,"eventId":"evt-004"},
    {"timestamp":"2026-02-01T15:00:00Z","model":"gpt-4o","prompt_tokens":900,"completion_tokens":350,"feature":"assistant","user_id":"user-3","workflowId":"wf-internal","nodeId":"node-draft","environment":"development","tenantId":"tenant-beta","endpoint":"/api/assist","outcomeType":"draft","outcomeValue":1,"eventId":"evt-005"},
    {"timestamp":"2026-02-02T09:00:00Z","model":"gpt-4o","prompt_tokens":1100,"completion_tokens":420,"feature":"chatbot","user_id":"user-1","workflowId":"wf-support","nodeId":"node-classify","environment":"production","tenantId":"tenant-acme","endpoint":"/api/chat","outcomeType":"resolution","outcomeValue":1,"eventId":"evt-006"},
    {"timestamp":"2026-02-02T11:00:00Z","model":"claude-3-5-sonnet","prompt_tokens":2200,"completion_tokens":850,"feature":"analysis","user_id":"user-5","workflowId":"wf-analytics","nodeId":"node-analyze","environment":"staging","tenantId":"tenant-acme","endpoint":"/api/analyze","outcomeType":"insight","outcomeValue":3,"eventId":"evt-007"},
    {"timestamp":"2026-02-02T14:00:00Z","model":"gpt-4o","prompt_tokens":950,"completion_tokens":370,"feature":"assistant","user_id":"user-3","workflowId":"wf-internal","nodeId":"node-draft","environment":"development","tenantId":"tenant-beta","endpoint":"/api/assist","outcomeType":"draft","outcomeValue":1,"eventId":"evt-008"},
    {"timestamp":"2026-02-02T16:00:00Z","model":"gpt-4o","prompt_tokens":1600,"completion_tokens":620,"feature":"chatbot","user_id":"user-4","workflowId":"wf-support","nodeId":"node-respond","environment":"production","tenantId":"tenant-gamma","endpoint":"/api/chat","outcomeType":"resolution","outcomeValue":1,"eventId":"evt-009"},
    {"timestamp":"2026-02-03T10:00:00Z","model":"gpt-4o","prompt_tokens":1350,"completion_tokens":530,"feature":"chatbot","user_id":"user-4","workflowId":"wf-support","nodeId":"node-respond","environment":"production","tenantId":"tenant-gamma","endpoint":"/api/chat","outcomeType":"resolution","outcomeValue":1,"eventId":"evt-010"}
  ],
  "priceTable": {
    "gpt-4o": { "inputPer1K": 0.0025, "outputPer1K": 0.01 },
    "gpt-4o-mini": { "inputPer1K": 0.00015, "outputPer1K": 0.0006 },
    "claude-3-5-sonnet": { "inputPer1K": 0.003, "outputPer1K": 0.015 }
  },
  "groupBy": ["model", "day", "feature", "workflow", "tenant"],
  "anomalyDetection": {
    "enabled": true,
    "thresholdMultiplier": 2.5,
    "minAbsoluteIncrease": 0.01,
    "groupByFeature": false
  },
  "enforcement": {
    "enabled": true,
    "maxDailyCost": 0.50,
    "action": "warn"
  },
  "idempotency": {
    "enabled": true,
    "ttlSeconds": 86400
  },
  "retention": {
    "rawDays": 90,
    "aggregateDays": 365
  },
  "budgetThreshold": 100,
  "writeDataset": true,
  "writeReportMd": true,
  "includeSummaryWhenEmpty": true
}

Using a URL instead? Omit rawData, set "inputType": "csv" or "json", and set "dataUrl" to a real URL that serves your CSV/JSON (e.g. a file you uploaded or a URL from your own server). Placeholder URLs like example.com do not work.

Re-running the same example? With idempotency.enabled: true, event IDs are remembered (for ttlSeconds, default 24h). A second run with the same eventIds will skip all events and complete successfully with 0 new records and an empty cost breakdown. To see the cost breakdown again, either set "idempotency": { "enabled": false } in the example or use different eventId values.


Supported input formats

CSV format

Your CSV should include columns such as:

  • timestamp (or date, created_at)
  • model
  • prompt_tokens (or input_tokens, promptTokenCount for Gemini)
  • completion_tokens (or output_tokens, candidatesTokenCount for Gemini)
  • Optional: feature, user_id
  • v3 Optional: workflowId, workflowName, nodeId, environment, tenantId, endpoint, provider, outcomeType, outcomeValue, eventId, prompt (will be hashed)

Provider-specific field names:

  • OpenAI: prompt_tokens, completion_tokens
  • Anthropic: input_tokens, output_tokens
  • Groq: prompt_tokens, completion_tokens
  • Gemini: promptTokenCount, candidatesTokenCount
  • Azure OpenAI: prompt_tokens, completion_tokens (same as OpenAI)

Example:

timestamp,model,prompt_tokens,completion_tokens,feature,workflowId,nodeId,outcomeType,outcomeValue
2026-02-01T10:00:00Z,gpt-4o-mini,500,200,chatbot,workflow-123,node-1,lead,1
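
The provider-specific aliases above can be collapsed into one canonical token pair before costing. A minimal sketch, assuming a hypothetical normalizeTokens helper whose fallback chain mirrors the columns listed:

```javascript
// Map provider-specific token columns to canonical inputTokens/outputTokens.
// normalizeTokens is a hypothetical helper illustrating the alias handling.
function normalizeTokens(row) {
  const inputTokens =
    row.prompt_tokens ?? row.input_tokens ?? row.promptTokenCount ?? 0;
  const outputTokens =
    row.completion_tokens ?? row.output_tokens ?? row.candidatesTokenCount ?? 0;
  return { inputTokens: Number(inputTokens), outputTokens: Number(outputTokens) };
}

const fromGemini = normalizeTokens({ promptTokenCount: 500, candidatesTokenCount: 200 });
const fromAnthropic = normalizeTokens({ input_tokens: "500", output_tokens: "200" }); // CSV values arrive as strings
// Both yield { inputTokens: 500, outputTokens: 200 }
```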

JSON format

OpenAI / Azure OpenAI:

[
  {
    "timestamp": "2026-02-01T10:00:00Z",
    "model": "gpt-4o-mini",
    "prompt_tokens": 500,
    "completion_tokens": 200,
    "feature": "chatbot",
    "workflowId": "workflow-123",
    "nodeId": "node-1",
    "tenantId": "tenant-abc",
    "environment": "prod",
    "outcomeType": "lead",
    "outcomeValue": 1,
    "eventId": "evt-xyz-123",
    "prompt": "You are a helpful assistant..." // Will be hashed automatically
  }
]

Anthropic:

[
  {
    "timestamp": "2026-02-01T10:00:00Z",
    "model": "claude-3-5-sonnet",
    "input_tokens": 500,
    "output_tokens": 200,
    "feature": "chatbot",
    "workflowId": "workflow-123"
  }
]

Gemini:

[
  {
    "timestamp": "2026-02-01T10:00:00Z",
    "model": "gemini-1.5-pro",
    "promptTokenCount": 500,
    "candidatesTokenCount": 200,
    "feature": "chatbot"
  }
]

Getting all reports

To get every output the Actor can produce, set these in the Input (they are all on by default):

  • writeDataset (default: true) — Write cost records (or one summary record when all are duplicates) to the Dataset.
  • writeReportMd (default: true) — Generate report.md in the Key-Value Store (human-readable audit report).
  • includeSummaryWhenEmpty (default: true) — When idempotency skips all records, still push one summary record to the Dataset so workflows (e.g. n8n) get a result.

Summary (summary.json) is always written to the Key-Value Store; there is no option to disable it.

To include all reports: keep defaults or set explicitly:

"writeDataset": true,
"writeReportMd": true,
"includeSummaryWhenEmpty": true

Then you get:

  • Dataset – Cost records (one per usage event), or one summary record when all events are duplicates.
  • Key-Value Store – summary.json (full JSON summary) and report.md (markdown report).

Output details

Dataset (per record)

Each dataset item represents one usage record:

{
  "timestamp": "2026-02-01T10:00:00.000Z",
  "model": "gpt-4o-mini",
  "inputTokens": 500,
  "outputTokens": 200,
  "inputCost": 0.000075,
  "outputCost": 0.00012,
  "totalCost": 0.000195,
  "currency": "USD",
  "feature": "chatbot",
  "userId": "user123",
  "workflowId": "workflow-123",
  "nodeId": "node-1",
  "environment": "prod",
  "tenantId": "tenant-abc",
  "outcomeType": "lead",
  "outcomeValue": 1,
  "promptHash": "a1b2c3d4e5f6...",
  "eventId": "evt-xyz-123"
}

Summary (summary.json)

{
  "totalCost": 0.5234,
  "totalInputTokens": 15000,
  "totalOutputTokens": 10000,
  "totalRecords": 30,
  "currency": "USD",
  "byModel": [...],
  "byDay": [...],
  "byWorkflow": [...],
  "byNode": [...],
  "byTenant": [...],
  "byEnvironment": [...],
  "topCostDrivers": [...],
  "anomalies": [...],
  "outcomeEfficiency": {
    "recordsWithOutcomes": 25,
    "totalOutcomes": 150,
    "costPerOutcome": 0.0035,
    "outcomesPerDollar": 286.5,
    "byOutcomeType": [...]
  },
  "enforcementActions": [...],
  "idempotencyStats": {
    "duplicatesSkipped": 5,
    "uniqueEventsProcessed": 25
  },
  "retentionPolicyApplied": {
    "rawDays": 90,
    "aggregateDays": 365,
    "appliedAt": "2026-02-06T..."
  },
  "dataQuality": {
    "overallQualityScore": 100,
    "qualityGrade": "excellent",
    "successRate": 100,
    "fieldCompleteness": {...},
    "recommendations": [...]
  }
}

Report (report.md)

A human-readable Markdown report including:

  • Executive Summary with overall metrics and efficiency score
  • Cost Breakdown Tables (by model, day, feature)
  • Attribution Breakdown (v3 — by workflow, node, tenant, environment)
  • Outcome Efficiency (v3 — cost per outcome, outcomes per dollar, by outcome type)
  • Top Cost Drivers analysis
  • Anomaly Alerts with detailed context and top-offender attribution (v3)
  • Enforcement Actions (v3 — simulated cost-control decisions)
  • Data Quality Report section:
    • Quality score (0-100) with grade badge
    • Field completeness table
    • Data issues summary
    • Outlier detection results
    • Quality recommendations
  • Token Efficiency Metrics section:
    • Efficiency score (0-100) with grade
    • Key metrics table (cost per request, tokens per request, ratios)
    • Cost efficiency table (cost per 1K tokens)
    • Token utilization percentages
    • Efficiency recommendations
  • Cost Optimization Opportunities section:
    • Total potential savings
    • Recommendations grouped by priority (high/medium/low)
    • Specific action items for each opportunity
    • Affected records and costs
  • Trend & Forecast section:
    • Last 7 days vs previous 7 days comparison table
    • Trend direction (increasing/decreasing/stable) with percentage change
    • Projected monthly cost based on recent daily averages
  • Retention Policy (v3 — informational metadata)
  • General Optimization Recommendations

Ideal for sharing with non-technical stakeholders and executives.
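
The Trend & Forecast computation listed above reduces to two window sums and a daily-average projection. A sketch under stated assumptions: the +/-5% band for "stable" is an illustration, not the Actor's documented cutoff.

```javascript
// Sketch of the 7-vs-7 trend comparison and monthly projection.
// The +/-5% "stable" band is an assumption, not the Actor's documented cutoff.
function trendAndForecast(byDay) {
  // byDay: [{ day: "YYYY-MM-DD", totalCost }] sorted ascending by day
  const sum = (days) => days.reduce((s, d) => s + d.totalCost, 0);
  const last7 = sum(byDay.slice(-7));
  const prev7 = sum(byDay.slice(-14, -7));
  const changePct = prev7 > 0 ? ((last7 - prev7) / prev7) * 100 : null;
  const direction = changePct == null ? "unknown"
    : changePct > 5 ? "increasing"
    : changePct < -5 ? "decreasing"
    : "stable";
  const projectedMonthly = (last7 / 7) * 30; // recent daily average x 30 days
  return { last7, prev7, changePct, direction, projectedMonthly };
}

const days = Array.from({ length: 14 }, (_, i) => ({
  day: `2026-02-${String(i + 1).padStart(2, "0")}`,
  totalCost: i < 7 ? 1.0 : 2.0, // spend doubled in the last 7 days
}));
const trend = trendAndForecast(days);
// trend: { last7: 14, prev7: 7, changePct: 100, direction: "increasing", projectedMonthly: 60 }
```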


v3.0 New Features

🏷️ Attribution Layer

Track costs by workflow, node, tenant, and environment:

{
  "workflowId": "workflow-123",
  "workflowName": "Lead Generation Pipeline",
  "nodeId": "node-1",
  "environment": "prod",
  "tenantId": "tenant-abc",
  "endpoint": "/api/chat"
}

Aggregations:

  • byWorkflow — Cost breakdown by workflow
  • byNode — Cost breakdown by node/step
  • byTenant — Cost breakdown by tenant
  • byEnvironment — Cost breakdown by environment (prod/stage/dev)

Use cases:

  • Multi-tenant cost allocation
  • Workflow-level ROI analysis
  • Environment cost comparison
  • Node-level optimization

📊 Outcome Linkage

Link costs to business outcomes:

{
  "outcomeType": "lead",
  "outcomeValue": 1
}

Metrics:

  • costPerOutcome — Average cost to produce one outcome
  • outcomesPerDollar — How many outcomes per dollar spent
  • Breakdown by outcome type
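
Under these definitions, the two headline metrics reduce to simple divisions. A sketch with a hypothetical outcomeEfficiency helper; records without outcome data are excluded, matching the recordsWithOutcomes count in summary.json:

```javascript
// Sketch of the outcome-efficiency math; outcomeEfficiency is a hypothetical
// helper name. Records lacking outcome fields are excluded from the ratios.
function outcomeEfficiency(records) {
  const withOutcomes = records.filter((r) => r.outcomeType && r.outcomeValue != null);
  const totalOutcomes = withOutcomes.reduce((s, r) => s + r.outcomeValue, 0);
  const totalCost = withOutcomes.reduce((s, r) => s + r.totalCost, 0);
  return {
    recordsWithOutcomes: withOutcomes.length,
    totalOutcomes,
    costPerOutcome: totalOutcomes > 0 ? totalCost / totalOutcomes : null,
    outcomesPerDollar: totalCost > 0 ? totalOutcomes / totalCost : null,
  };
}

const stats = outcomeEfficiency([
  { outcomeType: "lead", outcomeValue: 2, totalCost: 0.01 },
  { outcomeType: "lead", outcomeValue: 1, totalCost: 0.02 },
  { totalCost: 0.05 }, // no outcome data: excluded
]);
// costPerOutcome ~ 0.03 / 3 = 0.01; outcomesPerDollar ~ 3 / 0.03 = 100
```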

Use cases:

  • ROI calculation (cost per lead, cost per conversion)
  • Efficiency tracking (outcomes per dollar)
  • Outcome-based optimization

🎯 Smarter Anomaly Detection

Dual-threshold spike detection:

A spike is only flagged when both conditions are met:

  1. Relative spike > thresholdMultiplier (e.g., 2.5x)
  2. Absolute delta > minAbsoluteIncrease (e.g., $0.50)

This prevents noise from small-value spikes while catching significant increases.

{
  "anomalyDetection": {
    "enabled": true,
    "thresholdMultiplier": 2.5,
    "minAbsoluteIncrease": 0.50,
    "groupByFeature": false
  }
}
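
The two conditions combine with a logical AND, which can be sketched as follows. The isSpike helper is hypothetical, and the baseline argument stands in for the per-model median the Actor computes:

```javascript
// Dual-threshold spike check: a record is flagged only when it exceeds the
// relative multiplier AND the absolute minimum increase (illustrative sketch).
function isSpike(cost, baseline, { thresholdMultiplier, minAbsoluteIncrease }) {
  const relativeSpike = cost > baseline * thresholdMultiplier;
  const absoluteSpike = cost - baseline > minAbsoluteIncrease;
  return relativeSpike && absoluteSpike;
}

const config = { thresholdMultiplier: 2.5, minAbsoluteIncrease: 0.5 };
const highDelta = isSpike(3.0, 1.0, config);    // true: 3x baseline AND +$2.00 delta
const smallDelta = isSpike(0.05, 0.01, config); // false: 5x baseline, but delta is only $0.04
```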

Enhanced anomaly context:

  • topOffender — Attribution context (workflowId, nodeId, model, promptHash)
  • Helps operators pinpoint root cause

🛡️ Enforcement Engine

Simulate cost-control actions when thresholds are exceeded:

{
  "enforcement": {
    "enabled": true,
    "maxDailyCost": 100,
    "action": "warn"
  }
}

Actions:

  • warn — Log warning (default)
  • throttle — Simulate throttling
  • switch_model — Simulate model downgrade
  • block — Simulate request blocking

Important: All actions are simulated — no actual blocking occurs. This provides visibility into enforcement rules without risking data-pipeline disruption.
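
The decision logic can be sketched as follows. This is illustrative only: the returned object's shape beyond action and the simulated flag is an assumption, and nothing is ever actually blocked.

```javascript
// Sketch of the simulated enforcement decision; the returned shape (beyond
// action and simulated) is an assumption. No real blocking occurs.
function enforce(dailyCost, { enabled, maxDailyCost, action = "warn" }) {
  if (!enabled || dailyCost <= maxDailyCost) return null;
  return {
    action, // "warn" | "throttle" | "switch_model" | "block" (all simulated)
    reason: `daily cost ${dailyCost} exceeds maxDailyCost ${maxDailyCost}`,
    simulated: true,
  };
}

const triggered = enforce(120, { enabled: true, maxDailyCost: 100, action: "warn" });
const underLimit = enforce(80, { enabled: true, maxDailyCost: 100 });
// triggered.action === "warn"; underLimit === null
```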


🔄 Idempotency Protection

Deduplicate events using eventId:

{
  "idempotency": {
    "enabled": true,
    "ttlSeconds": 86400
  }
}

How it works:

  • Records with the same eventId are processed only once (within TTL window)
  • Uses Apify KV store for persistence
  • Records without eventId always pass through (fail-open)
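
The dedup logic can be sketched with an in-memory Map standing in for the Apify key-value store (the real Actor persists entries across runs; shouldProcess is a hypothetical helper):

```javascript
// Sketch of TTL-based event deduplication. An in-memory Map stands in for
// the Apify key-value store the Actor actually uses for persistence.
const seen = new Map(); // eventId -> expiry timestamp (ms)

function shouldProcess(record, ttlSeconds, now = Date.now()) {
  if (!record.eventId) return true; // no eventId: fail-open, always process
  const expiry = seen.get(record.eventId);
  if (expiry && expiry > now) return false; // duplicate within TTL window
  seen.set(record.eventId, now + ttlSeconds * 1000);
  return true;
}

const first = shouldProcess({ eventId: "evt-001" }, 86400);  // true: first sighting
const second = shouldProcess({ eventId: "evt-001" }, 86400); // false: duplicate
const noId = shouldProcess({}, 86400);                       // true: no eventId, fail-open
```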

Use cases:

  • Prevent double-counting from retries
  • Handle duplicate webhook deliveries
  • Ensure idempotent processing

🔒 Prompt Security

Automatic prompt hashing:

If a record contains a prompt field:

  • Raw prompt is hashed using SHA-256
  • Only the hash (promptHash) is stored
  • Raw prompt is never logged or stored

Example:

{
  "prompt": "You are a helpful assistant...",
  // → Automatically converted to:
  "promptHash": "a1b2c3d4e5f6..."
}

Security benefits:

  • No PII leakage in logs
  • No sensitive prompts in datasets
  • Hash can be used for deduplication/analysis

📅 Retention Metadata

Attach retention policy metadata (informational only):

{
  "retention": {
    "rawDays": 90,
    "aggregateDays": 365
  }
}

Note: This Actor does not delete data. The policy is attached to the summary for downstream systems to interpret.


🔗 n8n Integration

First-class support for ingesting n8n workflow events with flexible field mapping and automatic attribution.

Features:

  • Accepts n8n events from HTTP Request/Webhook exports
  • Maps custom field names to canonical structure
  • Automatic eventId generation when missing
  • Prompt hashing (SHA-256) with size limits
  • Late event detection and marking
  • Full attribution breakdown (workflow → node → run)

Configuration:

{
  "inputType": "n8n",
  "provider": "generic",
  "n8n": {
    "enabled": true,
    "source": {
      "type": "raw",
      "rawEvents": [
        {
          "eventId": "evt-001",
          "timestamp": "2026-02-01T10:00:00Z",
          "provider": "openai",
          "model": "gpt-4o",
          "inputTokens": 1200,
          "outputTokens": 450,
          "workflowId": "wf-support",
          "workflowName": "Customer Support",
          "nodeId": "node-classify",
          "nodeName": "Classify Request",
          "runId": "run-001",
          "env": "prod",
          "tenantId": "tenant-acme",
          "userId": "user-1",
          "endpoint": "/v1/chat/completions",
          "prompt": "Classify this request",
          "outcomeType": "resolution",
          "outcomeValue": 1,
          "outcomeOk": true
        }
      ]
    },
    "mapping": {
      "workflowIdField": "workflowId",
      "workflowNameField": "workflowName",
      "nodeIdField": "nodeId",
      "nodeNameField": "nodeName",
      "environmentField": "env",
      "tenantIdField": "tenantId",
      "userIdField": "userId",
      "runIdField": "runId",
      "eventIdField": "eventId",
      "timestampField": "timestamp"
    },
    "defaults": {
      "environment": "prod"
    },
    "derive": {
      "workflowNameFrom": "workflowId",
      "nodeIdFrom": "nodeName",
      "hashPrompt": true
    }
  },
  "idempotency": {
    "enabled": true,
    "ttlHours": 168,
    "allowedLateHours": 48
  }
}

Canonical Event Structure:

The Actor expects events with these fields (all optional except timestamp, model/provider, and token counts):

{
  "eventId": "uuid-or-hash",
  "timestamp": "ISO string",
  "provider": "openai|anthropic|google|bedrock|other",
  "model": "gpt-4o|claude-3-5-sonnet|...",
  "inputTokens": 123,
  "outputTokens": 456,
  "workflowId": "wf_123",
  "workflowName": "Lead Qualification",
  "nodeId": "node_7",
  "nodeName": "OpenAI Chat",
  "runId": "execution_999",
  "env": "prod|stage|dev",
  "tenantId": "customer_42",
  "userId": "agent_7",
  "endpoint": "/v1/responses",
  "prompt": "OPTIONAL raw prompt",
  "outcomeType": "lead",
  "outcomeValue": 1,
  "outcomeOk": true
}

Field Mapping:

Use n8n.mapping to map custom field names:

{
  "n8n": {
    "mapping": {
      "workflowIdField": "custom_workflow_id",
      "nodeIdField": "custom_node_id"
    }
  }
}

EventId Generation:

If eventId is missing, it's automatically generated from:

sha256(runId + nodeId + timestamp + model + inputTokens + outputTokens)

Prompt Handling:

  • Prompts are automatically hashed using SHA-256
  • Raw prompts are never stored (only promptHash)
  • Prompts exceeding 1MB are dropped with a warning
  • Set derive.hashPrompt: false to disable hashing

Late Events:

Events older than lastRunEnd - allowedLateHours are:

  • Still processed (not skipped)
  • Marked with lateEvent: true flag
  • Included in idempotencyStats.lateEventsCount
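
The cutoff logic can be sketched as follows (markLate is a hypothetical helper; lastRunEnd stands for the end time of the previous run):

```javascript
// Sketch of late-event marking: events older than the cutoff are still
// processed, just flagged with lateEvent: true. markLate is hypothetical.
function markLate(event, lastRunEnd, allowedLateHours) {
  const cutoff = new Date(lastRunEnd).getTime() - allowedLateHours * 3600 * 1000;
  const isLate = new Date(event.timestamp).getTime() < cutoff;
  return isLate ? { ...event, lateEvent: true } : event;
}

const marked = markLate(
  { eventId: "evt-9", timestamp: "2026-02-01T00:00:00Z" },
  "2026-02-05T00:00:00Z", // previous run ended here
  48                      // allowedLateHours
);
// cutoff is 2026-02-03T00:00:00Z, so the event is flagged with lateEvent: true
```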

Aggregations:

When n8n events include attribution fields, the Actor produces:

  • byWorkflow — Cost breakdown by workflow
  • byNode — Cost breakdown by node/step
  • byTenant — Cost breakdown by tenant
  • byEnvironment — Cost breakdown by environment
  • byRunId — Cost breakdown by execution run

Use Cases:

  • Multi-tenant cost allocation
  • Workflow-level ROI analysis
  • Node-level optimization
  • Cost per successful run (with outcomeOk)
  • Late event reconciliation

Example Output:

{
  "byWorkflow": [
    {
      "key": "wf-support",
      "totalCost": 0.5234,
      "inputTokens": 15000,
      "outputTokens": 10000,
      "recordCount": 30
    }
  ],
  "byNode": [
    {
      "key": "node-classify",
      "totalCost": 0.2345,
      "inputTokens": 8000,
      "outputTokens": 5000,
      "recordCount": 15
    }
  ],
  "idempotencyStats": {
    "duplicatesSkipped": 5,
    "uniqueEventsProcessed": 25,
    "lateEventsCount": 2
  }
}

Anomaly Detection

Overview

The Actor includes intelligent anomaly detection to identify unusual spending patterns. Anomalies are detected across multiple dimensions:

  • Cost Spikes - Individual requests with unusually high costs (model-aware, v3: dual-threshold)
  • Expensive Models - Models consuming disproportionate budget
  • High Volume Days - Days with abnormally high usage
  • Low Efficiency - Models with poor output/input token ratios

Configuration

Configure anomaly detection using the anomalyDetection object:

{
  "anomalyDetection": {
    "enabled": true,
    "thresholdMultiplier": 2.5,
    "minAbsoluteIncrease": 0.50,
    "groupByFeature": false
  }
}

Options:

  • enabled (boolean) - Enable/disable anomaly detection (default: true)
  • thresholdMultiplier (number) - Relative sensitivity multiplier (default: 2.5)
    • Higher = less sensitive (fewer anomalies)
    • Lower = more sensitive (more anomalies)
    • Range: 1.5 - 10
  • minAbsoluteIncrease (number, v3) - Minimum absolute cost increase to trigger spike (default: 0.01)
    • Prevents noise from small-value spikes
    • Example: 0.50 means $0.50 minimum delta required
  • groupByFeature (boolean) - Use per-model+feature baselines for spike detection (default: false)
    • false - Compare each record against other records from the same model
    • true - Compare against records from the same model AND feature

Data Requirements

Anomaly detection requires minimum data thresholds:

  • Spike Detection: Requires at least 10 records
  • Volume Detection: Requires at least 3 unique days

If your dataset doesn't meet these requirements:

  • anomalyDetectionStatus will be set to "insufficient_data"
  • anomalyDetectionNotes will explain what's missing
  • The report will show a "Data too small" warning
  • No spike/volume anomalies will be generated

Example insufficient data response:

{
  "anomalyDetectionStatus": "insufficient_data",
  "anomalyDetectionNotes": [
    "Too few records for spike detection (need at least 10, have 3).",
    "Too few days for volume detection (need at least 3, have 1)."
  ],
  "anomalies": []
}

Model-Aware Spike Detection

Spike detection uses per-model baselines to avoid false positives when comparing different models:

  • Each model's costs are analyzed separately
  • A spike is only flagged if it's unusual for that specific model
  • Example: A $0.50 request is normal for gpt-4o but a spike for gpt-4o-mini

When groupByFeature: true, baselines are computed per model+feature combination:

  • Useful when features have different typical usage patterns
  • Example: "image-generation" vs "text-completion" may have different cost profiles
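
The per-model baseline step can be sketched as a grouped median (illustrative sketch; the Actor also tracks the P95 for each group and applies the dual-threshold test against these baselines):

```javascript
// Sketch of per-model baselines: each model's costs form their own
// distribution, so a "spike" is judged relative to that model only.
function median(values) {
  const s = [...values].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

function baselinesByModel(records) {
  const groups = new Map();
  for (const r of records) {
    if (!groups.has(r.model)) groups.set(r.model, []);
    groups.get(r.model).push(r.totalCost);
  }
  const out = {};
  for (const [model, costs] of groups) out[model] = median(costs);
  return out;
}

const baselines = baselinesByModel([
  { model: "gpt-4o", totalCost: 0.4 },
  { model: "gpt-4o", totalCost: 0.6 },
  { model: "gpt-4o-mini", totalCost: 0.02 },
]);
// baselines: { "gpt-4o": 0.5, "gpt-4o-mini": 0.02 }
```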

Enriched Spike Anomalies

Spike anomalies include detailed context for investigation:

{
  "type": "spike",
  "severity": "high",
  "model": "gpt-4o",
  "feature": "chatbot",
  "baseline": "per_model",
  "value": 0.5234,
  "threshold": 0.2500,
  "modelMedian": 0.0850,
  "modelP95": 0.3200,
  "groupRecordCount": 150,
  "rankInGroup": 1,
  "timestamp": "2026-02-01T14:32:00Z",
  "topOffender": {
    "workflowId": "workflow-123",
    "nodeId": "node-1",
    "model": "gpt-4o",
    "promptHash": "a1b2c3..."
  }
}

Fields:

  • baseline - How the baseline was computed (per_model or per_model_feature)
  • modelMedian - Median cost for this model/group
  • modelP95 - 95th percentile cost
  • groupRecordCount - Total records in baseline group
  • rankInGroup - Where this record ranks (1 = most expensive)
  • topOffender (v3) - Attribution context to help pinpoint root cause

Backward Compatibility

Legacy configuration format is still supported:

{
  "anomalyDetectionEnabled": true,
  "anomalyThreshold": 2.5
}

However, the new anomalyDetection object format is recommended for access to all features including dual-threshold.


Key Features

🔍 Data Quality Analysis

  • Overall quality score (0-100) with grade (excellent/good/fair/poor)
  • Success rate tracking (valid vs invalid records)
  • Field completeness metrics (feature, userId, required fields)
  • Data issue detection (normalization errors, invalid timestamps, duplicates, outliers)
  • Actionable quality recommendations

⚡ Token Efficiency Metrics

  • Efficiency score (0-100) based on multiple factors
  • Input/output ratio analysis
  • Cost per request metrics
  • Cost per 1K tokens (input/output/total)
  • Token utilization percentages
  • Efficiency recommendations

💡 Cost Optimization Engine

  • Rules-based recommendations (no AI magic)
  • 5 detection rules:
    • Model downgrade opportunities (expensive models for simple tasks)
    • Prompt optimization (high input/low output ratio)
    • Caching opportunities (repetitive requests)
    • Feature optimization (dominating spend)
    • High-cost pattern detection
  • Concrete action items for each recommendation
  • Potential savings estimates
  • Priority-based grouping (high/medium/low)

🏷️ Attribution & Outcome Analysis (v3)

  • Cost attribution by workflow, node, tenant, environment
  • Outcome efficiency — cost per outcome, outcomes per dollar
  • Breakdown by outcome type for ROI analysis

🛡️ Enforcement & Security (v3)

  • Enforcement simulation — cost-control actions when thresholds are exceeded
  • Idempotency protection — event deduplication with KV store
  • Prompt security — automatic SHA-256 hashing (raw prompts never stored)

📡 Webhook Alerts (Optional)

  • Anomaly alerts — HTTP POST when anomalies detected
  • Budget alerts — HTTP POST when budget threshold exceeded
  • Works with Slack, Discord, Zapier, or any webhook endpoint
  • Simple HTTP POST with JSON payloads
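
A minimal sender sketch follows. The payload shape here is an assumption for illustration; check a real run's summary.json anomalies for the actual field names before depending on them.

```javascript
// Sketch of a webhook alert. The payload shape is an assumption; inspect a
// real run's output for the exact field names the Actor sends.
function buildAlertPayload(anomaly) {
  return {
    type: "anomaly_alert",
    severity: anomaly.severity,
    model: anomaly.model,
    value: anomaly.value,
    detectedAt: new Date().toISOString(),
  };
}

async function sendAlert(webhookUrl, anomaly) {
  // fetch is built into Node 18+; any HTTP/HTTPS endpoint works
  await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildAlertPayload(anomaly)),
  });
}

const payload = buildAlertPayload({ severity: "high", model: "gpt-4o", value: 0.5234 });
// e.g. sendAlert("https://webhook.site/<your-id>", anomaly) to test delivery
```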

Pricing notes

  • This Actor does not call external APIs
  • All cost calculations are done locally using your price table
  • You incur no external API charges — only the Actor's listed price and standard Apify platform fees apply

Common issues & fixes

"No input provided"

Ensure input is passed via:

  • Apify UI
  • Input JSON
  • URL or raw data

"No price found for model"

Add the missing model to the priceTable:

{
  "your-model-name": {
    "inputPer1K": 0.001,
    "outputPer1K": 0.002
  }
}

CSV parsing errors

  • Ensure header row exists
  • Use commas as separators
  • Quote text fields with commas

Webhook not sending

  • Verify webhookUrl is configured correctly (must be HTTP/HTTPS)
  • Check that anomalies were detected or budget threshold was exceeded
  • Review Actor logs for webhook errors
  • Test with https://webhook.site/ first

No optimization recommendations

  • Optimization engine analyzes patterns automatically
  • Recommendations appear when specific patterns are detected:
    • Model downgrade: Requires 5+ simple requests with expensive models
    • Prompt optimization: Requires 3+ requests with high input/low output ratio
    • Caching: Requires 5+ repetitive requests (same model + feature)
    • Feature optimization: Requires feature with >50% of spend
  • If no recommendations: Current usage may already be efficient

v3 features not appearing

Attribution aggregations:

  • Ensure records include workflowId, nodeId, tenantId, or environment fields
  • Add these fields to your groupBy array if you want them in aggregations

Outcome efficiency:

  • Ensure records include both outcomeType and outcomeValue fields
  • Section only appears if at least one record has outcome data

Enforcement actions:

  • Enable enforcement.enabled: true in config
  • Set maxDailyCost threshold
  • Actions only appear if threshold is exceeded

Idempotency:

  • Enable idempotency.enabled: true in config
  • Ensure records include eventId field
  • Stats only appear if deduplication is enabled

Local development (optional)

For contributors or advanced users.

Install dependencies

$ npm install

Build

$ npm run build

Run tests

$ npm test

Run locally

$ npx apify run --input-file=example-input-raw.json

Deployment (developers)

apify login
apify push

Then run via the Apify Console.


What Makes This Actor Unique?

🎯 Actionable Insights (Not Just Metrics)

  • Concrete recommendations: "Switch 15 requests from gpt-4o to gpt-4o-mini"
  • Potential savings: See estimated cost reduction before acting
  • Specific action items: Know exactly what to do

📊 Comprehensive Analysis

  • Cost analysis: Aggregations by model, day, feature, user, workflow, node, tenant, environment
  • Anomaly detection: Model-aware spike detection with dual-threshold (v3)
  • Data quality: Quality score, completeness metrics, validation errors
  • Token efficiency: Efficiency score, ratios, utilization analysis
  • Outcome linkage (v3): Cost per outcome, outcomes per dollar
  • Optimization: Rules-based recommendations with savings estimates
  • Trend & forecast: Last 7 vs previous 7 days comparison, monthly projection
  • Enforcement simulation (v3): Cost-control actions when thresholds are exceeded

🔔 Real-Time Alerts

  • Webhook notifications: Get alerted when anomalies detected or budget exceeded
  • Works with any endpoint: Slack, Discord, Zapier, custom APIs
  • Simple setup: Just provide webhook URL

🚀 Production Ready

  • No external APIs: All processing done locally
  • Fast: Processes thousands of records in seconds
  • Secure: No API keys, no prompt logging (v3: automatic prompt hashing)
  • Idempotent (v3): Event deduplication prevents double-counting
  • Well-documented: Complete documentation and examples

Backward Compatibility

v3.0 is fully backward compatible with v2.0:

  • ✅ All existing fields preserved
  • ✅ Legacy config format still supported
  • ✅ Dataset schema only extended (new fields optional)
  • ✅ Summary schema only extended (new sections optional)
  • ✅ All v3 features disabled by default
  • ✅ Actor behaves identically to v2.0 unless you opt in
  • ✅ n8n mode is opt-in (does not affect csv/json modes)

Migration:

  • No changes required — existing configs continue to work
  • Enable v3 features by adding new config sections
  • Add attribution/outcome fields to your data to unlock new aggregations
  • Use inputType: "n8n" to enable n8n event ingestion

Ready to audit your AI costs?

Run the Actor, upload your usage data, and get:

  • ✅ Complete cost breakdown
  • ✅ Attribution by workflow/node/tenant/environment (v3)
  • ✅ Outcome efficiency metrics (v3)
  • ✅ Data quality validation
  • ✅ Token efficiency analysis
  • ✅ Concrete optimization recommendations
  • ✅ Trend analysis & monthly forecast
  • ✅ Dual-threshold anomaly detection (v3)
  • ✅ Enforcement simulation (v3)
  • ✅ Idempotency protection (v3)
  • ✅ Real-time webhook alerts

Get clear answers about where your AI budget is going — and how to reduce it — in minutes.

🚀