Actor Pipeline Builder — Validate Multi-Actor Workflows

Pricing

$400.00 / 1,000 pipeline builders


Rating: 0.0 (0 reviews)

Developer: ryan clinton (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 6 days ago

Pipeline Preflight — Validate Apify Actor Pipelines Before You Run Them

Pipeline errors should be caught at definition time, not after minutes of wasted runtime. Pipeline Preflight validates multi-actor Apify pipelines before execution and returns a production decision, acting as a pre-run validation stage in the Apify pipeline execution lifecycle.

Never run a pipeline in production without validating it first.

Contract

Pipeline Preflight checks that all stages in a pipeline compose correctly (input schemas, dataset schemas, field mappings, reachability) and returns a decision.

Use this actor when you need to verify that a pipeline is safe to run before executing it.

Execution pattern: define pipeline → run Pipeline Preflight → branch on decisionPosture → deploy or fix.

Guarantee: the pipeline is callable and stages compose correctly across inputs and outputs.

Output field: decisionPosture (a routable control signal for automation). This field determines what to do next:

  • ship_pipeline
  • canary_recommended
  • monitor_only
  • no_call

Always branch on decisionPosture. It is the only field you should use for control flow. Do not branch on oneLine or decisionReason.
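That branching rule can be sketched as a plain switch (the handler names below are illustrative, not part of the actor's output):

```typescript
type DecisionPosture = 'ship_pipeline' | 'canary_recommended' | 'monitor_only' | 'no_call';

// Map each posture to a next action; only decisionPosture drives control flow.
function nextAction(posture: DecisionPosture): 'deploy' | 'canary' | 'dry_run' | 'fix' {
  switch (posture) {
    case 'ship_pipeline':      return 'deploy';   // safe to pipe generatedCode to production
    case 'canary_recommended': return 'canary';   // deploy behind a one-record canary first
    case 'monitor_only':       return 'dry_run';  // dry-run a single record, human review
    case 'no_call':            return 'fix';      // work the fixPlan, then re-preflight
  }
}
```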

This actor does not run pipelines — it validates them before execution.

Flat rate: $0.40 per pipeline-build event. Platform compute (memory × runtime) billed separately by Apify.

No side effects. Pipeline Preflight reads Apify API metadata. It does not call, run, trigger, or schedule the target actors. Safe for CI, cron, and autonomous agents.

Mental model

Treat each actor as a function:

  • input schema = function arguments
  • dataset schema = return type
  • fieldMapping = argument binding

Pipeline Preflight checks that the types line up across the chain, i.e. that these functions compose correctly.

What it does

Core

  • Type-checks stage transitions (input schema ↔ dataset schema ↔ field mapping)
  • Resolves actor reachability via the Apify API (/v2/acts/{id}/builds/default)
  • Produces a deterministic production decision: ship_pipeline / canary_recommended / monitor_only / no_call

Additional

  • Schema completeness scoring (per-stage + pipeline-wide; drives SCHEMA_AGENTIC_COVERAGE_LOW)
  • Optional empirical input validation (validateRuntime: true → calls actor-input-tester per stage; no target actors run)
  • Ordered fixPlan[] + schema-based mappingSuggestions[]
  • TypeScript orchestration codegen (minimal / productionish / typed) with codegenAssumptions[] and codegenWarnings[]
  • Agent contract (agentContract.safeToCall + stable recommendedAction enum) for MCP planners

Common causes of pipeline failure

Most multi-actor Apify pipelines break on cross-stage shape mismatches. Pipeline Preflight surfaces each as a stable verdictReasonCodes[] entry with a typed recommendation:

  • MAPPED_FIELD_NOT_IN_PREV_OUTPUT — mapping points at a field the upstream actor does not emit (per its declared dataset schema)
  • TARGET_FIELD_NOT_IN_INPUT_SCHEMA — downstream actor's input schema does not declare the mapped field
  • NO_FIELD_MAPPING — non-first stage has no mapping; downstream actor receives { data: [...] } and rejects
  • DATASET_SCHEMA_MISSING — upstream declares no dataset schema; generated code cannot verify field names
  • ACTOR_NOT_FOUND — slug wrong, actor private, or token lacks access
  • RUNTIME_VALIDATION_FAILED — validateRuntime: true called actor-input-tester and a stage rejected the synthesized input
  • SCHEMA_AGENTIC_COVERAGE_LOW — < 50% of resolved stages declare both input and dataset schemas

Full enum in Failure modes.
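The two mapping mismatches above are simple set checks. A minimal sketch of one cross-stage transition check (the field lists are hypothetical; the real actor reads them from each actor's declared schemas via the Apify API):

```typescript
// fieldMapping shape: { downstreamInputField: upstreamOutputField }
function checkTransition(
  upstreamOutputFields: string[],
  downstreamInputFields: string[],
  fieldMapping: Record<string, string>,
): string[] {
  // Applies to non-first stages only: an absent/empty mapping is blocking.
  if (Object.keys(fieldMapping).length === 0) return ['NO_FIELD_MAPPING'];
  const codes: string[] = [];
  for (const [target, source] of Object.entries(fieldMapping)) {
    if (!upstreamOutputFields.includes(source)) codes.push('MAPPED_FIELD_NOT_IN_PREV_OUTPUT');
    if (!downstreamInputFields.includes(target)) codes.push('TARGET_FIELD_NOT_IN_INPUT_SCHEMA');
  }
  return codes;
}
```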

How is this different from Zapier or Make?

Pipeline Preflight does not execute workflows.

It validates that an Apify actor chain is callable and generates the orchestration code.

Zapier/Make run workflows. Pipeline Preflight verifies that your workflow definition is correct before you run it anywhere (Actor.call(), Apify scheduler, webhook, Zapier, Make, n8n, GitHub Actions, MCP agent).

When NOT to use Pipeline Preflight

Pipeline Preflight only validates pipeline definitions (schemas, mappings, reachability). Do NOT use this actor if you need to:

  • Validate a single actor's input JSON against its schema → use Input Guard.
  • Check real output data quality from a live run → use Output Guard.
  • Monitor production outputs for drift, null spikes, or schema regressions → use Output Guard.
  • Run integration or regression tests against an actor → use Deploy Guard.
  • Audit actors for PII, GDPR, CCPA, ToS, or CFAA risk → use Compliance Scanner.
  • Score an actor's historical run quality → use Quality Monitor.
  • Compare two actors A/B on the same input → use A/B Tester.
  • Execute the pipeline → copy generatedCode into your own orchestrator actor.
  • Run cost, revenue, or fleet-level analytics → use Fleet Analytics.
  • Do Apify Store competitive analysis → out of scope.

Pipeline Preflight validates pipeline definitions (schemas, mappings, reachability) and generates runnable orchestration code. Everything after the pipeline is callable lives in a sibling actor.

What to do with decisionPosture

| Posture | What it means | What to do |
|---|---|---|
| ship_pipeline | Valid, zero advisories, runtime-validated | Pipe generatedCode straight into your orchestrator — agentContract.safeToCall = true. |
| canary_recommended | Valid, zero advisories, runtime not verified | Deploy behind a canary (one record through first), then promote. |
| monitor_only | Valid, but schema advisories remain | Dry-run against a single record before scheduling. Treat as "will probably work, needs a human eyeball." |
| no_call | Blocking issues present | Do NOT call. Work the fixPlan[] top-to-bottom, then re-preflight. |

Example decision flow

Input — 3-stage pipeline with a missing mapping on stage 2:

{
  "stages": [
    { "actorId": "apify/rag-web-browser" },
    { "actorId": "apify/website-content-crawler" },
    { "actorId": "ryanclinton/bulk-email-verifier",
      "fieldMapping": { "emails": "emailPattern" } }
  ]
}

Output (abridged):

{
  "decisionPosture": "no_call",
  "decisionReason": "1 blocking issue — cannot generate a runnable pipeline. Fix the errors and retry.",
  "readinessScore": 0,
  "verdictReasonCodes": ["NO_FIELD_MAPPING"],
  "fixPlan": [
    { "order": 1, "stage": 2, "severity": "blocking",
      "code": "NO_FIELD_MAPPING",
      "action": "Add a fieldMapping on stage 2 describing which of apify/rag-web-browser's output fields to feed into apify/website-content-crawler's input.",
      "why": "Stage 2: no field mapping defined — output from apify/rag-web-browser won't be passed to apify/website-content-crawler" }
  ],
  "agentContract": {
    "safeToCall": false,
    "recommendedAction": "fix_mapping",
    "requiredFixes": [{ "stage": 2, "code": "NO_FIELD_MAPPING" }]
  }
}

Next step → fix the mapping → re-run Pipeline Preflight → decisionPosture flips to canary_recommended or ship_pipeline → deploy.
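That fix-and-retry loop can be written as orchestration code. A sketch, where runPreflight and applyFix are hypothetical stand-ins for your own actor call and remediation logic:

```typescript
interface PreflightReport {
  decisionPosture: 'ship_pipeline' | 'canary_recommended' | 'monitor_only' | 'no_call';
  fixPlan: Array<{ order: number; stage: number | null; code: string; action: string }>;
}

// Re-preflight until the pipeline is callable or attempts run out.
async function preflightUntilCallable(
  runPreflight: () => Promise<PreflightReport>,                          // wraps the actor call
  applyFix: (fix: PreflightReport['fixPlan'][number]) => Promise<void>,  // your remediation
  maxAttempts = 3,
): Promise<PreflightReport> {
  let report = await runPreflight();
  for (let i = 1; i < maxAttempts && report.decisionPosture === 'no_call'; i++) {
    for (const fix of report.fixPlan) await applyFix(fix); // fixPlan is ordered: blocking first
    report = await runPreflight();
  }
  return report;
}
```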

Decision contract

These are the always-true promises, enforced in code:

  • decisionPosture = ship_pipeline implies: valid = true, zero blocking issues, zero advisory issues, runtime validation ran AND passed, decisionReadiness = actionable, readinessScore = 1.0, agentContract.safeToCall = true. Safe to pipe through Actor.call() in production.
  • decisionPosture = canary_recommended implies: valid = true, zero blocking issues, zero advisory issues, runtime validation NOT run. readinessScore around 0.85. Pipeline likely works but hasn't been empirically verified — wire it up behind a canary.
  • decisionPosture = monitor_only implies: valid = true, zero blocking issues, at least one advisory. readinessScore around 0.6. Pipeline may run but schema advisories remain — dry-run before scheduling.
  • decisionPosture = no_call implies: valid = false, at least one blocking issue (ACTOR_NOT_FOUND, NO_FIELD_MAPPING, or RUNTIME_VALIDATION_FAILED), decisionReadiness = insufficient-data, readinessScore = 0, generatedCode = '', agentContract.safeToCall = false.
  • readinessScore and confidenceScore are independent. Readiness is "how close to safe execution" (gate-like). Confidence is "how much to trust the verdict" (driven by schema completeness and evidence). A pipeline can be 100% ready but 50% confident if the stages declare thin schemas.
  • Blocking vs advisory is stable. Automation should gate on blocking only (issues.filter(i => i.severity === 'blocking')). info is purely explanatory.
  • verdictReasonCodes is additive-only within a major version — new codes may be added; existing codes will not be renamed or repurposed.
  • confidencePolicyVersion is bumped whenever the confidence-scoring formula changes (component weights, harmonic base, bands). Scores are comparable only within the same policy version.
  • The actor never exits FAILED for user input errors. Every error branch (including <2 stages, unreachable sub-actors, and catch-block errors) pushes a structured record to the dataset and exits SUCCEEDED — safe to schedule on a cron without tripping Apify's default-input auto-test.
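The blocking-only gating promise above maps to one line of automation code (report shape abridged):

```typescript
interface Issue { severity: 'blocking' | 'advisory' | 'info'; code: string; message: string; }

// Gate automation on blocking issues only; advisory and info never fail the gate.
function shouldBlock(issues: Issue[]): boolean {
  return issues.some(i => i.severity === 'blocking');
}
```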

Schema quality

Apify platform drives the Console run form, API validation, and MCP tool inference from input-schema metadata. Thin schemas aren't unusable but are effectively invisible to agent planners. schemaCompleteness grades each stage and the pipeline on good / partial / poor / missing, exposing fieldDescriptionCoverage, exampleCoverage, typedFieldCoverage, and agenticCoverage as 0–1 floats. SCHEMA_AGENTIC_COVERAGE_LOW fires below 0.5.
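The coverage floats are plain fractions. A sketch of the agenticCoverage computation and its 0.5 threshold (illustrative; the actor's exact definition is internal):

```typescript
interface StageSchemas { hasInputSchema: boolean; hasDatasetSchema: boolean; }

// Fraction of resolved stages declaring BOTH input and dataset schemas.
// SCHEMA_AGENTIC_COVERAGE_LOW fires when this drops below 0.5.
function agenticCoverage(stages: StageSchemas[]): number {
  if (stages.length === 0) return 0;
  const full = stages.filter(s => s.hasInputSchema && s.hasDatasetSchema).length;
  return full / stages.length;
}
```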

Automation contract

Three common consumers, three different fields to read:

| Consumer | Read this field | Why |
|---|---|---|
| Webhook / Zapier / Slack alerting | decisionPosture + oneLine | One scalar + one sentence. No prose parsing. |
| Dashboard / UI | decisionCards[] + confidenceLevel + costEstimate | Scannable cards + human-readable level + cost. |
| Agent tool call / LLM | issues[] + verdictReasonCodes | Structured evidence with recommendations, stable codes. |

Input contract

type Input = {
  stages: Array<{
    actorId: string;                        // 'username/actor-name'
    fieldMapping?: Record<string, string>;  // { downstreamInputField: upstreamOutputField }
    memory?: number;                        // MB, embedded in generatedCode
    timeout?: number;                       // seconds, embedded in generatedCode
    alias?: string;                         // optional human name in generatedCode comments
  }>;                                       // >= 2; required
  validateRuntime?: boolean;                // default false — empirical per-stage check via input-tester
  codegenMode?: 'minimal' | 'productionish' | 'typed'; // default 'minimal'
  paginationMode?: 'limit_1000' | 'paginate_all';      // default 'limit_1000'
  emitAgentContract?: boolean;              // default true
  emitSignals?: boolean;                    // default true
  suggestionMode?: 'schema_only' | 'off';   // default 'schema_only'
  strictness?: 'default' | 'strict' | 'lenient'; // default 'default'
};

Output contract

type Report = {
  recordType: 'report' | 'input-error' | 'error';
  oneLine: string;
  decisionPosture: 'ship_pipeline' | 'canary_recommended' | 'monitor_only' | 'no_call';
  decisionReason: string;
  decisionReadiness: 'actionable' | 'monitor' | 'insufficient-data';
  readinessScore: number;  // 0..1, gate-like
  confidenceScore: number; // 0..1, harmonic mean of breakdown
  confidenceLevel: 'high' | 'medium' | 'low';
  confidencePolicyVersion: string;
  confidenceBreakdown: {
    resolutionCoverage: number; // fraction of actors resolved
    mappingCoverage: number;    // fraction of transitions with mapping
    schemaCoverage: number;     // fraction with both input + dataset schemas
    metadataCoverage: number;   // fraction of fields with title/desc/example
    runtimeBoost: number;       // 1.0 if validateRuntime passed, else 0.5-0.6
  };
  confidencePenaltyReasons: string[];
  verdictReasonCodes: IssueCode[]; // see Failure modes
  decisionCards: Array<{ // 2-3 cards: fix-this-first / watch-out / cost-heads-up
    kind: string; title: string; shortReason: string;
    recommendation: string | null; urgency: string; stage: number | null;
  }>;
  schemaCompleteness: {
    inputSchemaQuality: 'good' | 'partial' | 'poor' | 'missing';
    datasetSchemaQuality: 'good' | 'partial' | 'poor' | 'missing';
    outputSchemaPresent: boolean;
    fieldDescriptionCoverage: number;
    exampleCoverage: number;
    typedFieldCoverage: number;
    agenticCoverage: number;
  };
  stages: number;
  valid: boolean;
  errors: string[];   // legacy mirror of blocking issues[].message
  warnings: string[]; // legacy mirror of advisory issues[].message
  issues: Array<{
    severity: 'blocking' | 'advisory' | 'info';
    code: IssueCode;
    stage: number | null; // 1-based or null for pipeline-level
    message: string;
    recommendation: string | null;
    evidence?: Record<string, unknown>;
  }>;
  fixPlan: Array<{
    order: number; stage: number | null;
    severity: string; code: IssueCode;
    action: string; why: string;
  }>;
  mappingSuggestions?: Array<{
    stage: number; targetField: string; suggestedSourceField: string;
    basis: 'schema_name_match' | 'schema_metadata_match';
    confidence: number; // 0..1
  }>;
  stageDetails: Array<{
    stage: number; alias: string | null;
    actor: string; actorId: string;
    reachable: boolean; defaultBuildResolved: boolean;
    ppePrice: number; memory: number; timeout: number;
    inputFields: string[]; outputFields: string[];
    inputSchemaQuality: string; datasetSchemaQuality: string; outputSchemaPresent: boolean;
    fieldDescriptionCoverage: number; exampleCoverage: number;
    mappingStatus: 'ok' | 'partial' | 'broken' | 'not_applicable';
    stageSignals: IssueCode[];
  }>;
  generatedCode: string; // empty when decisionPosture = 'no_call'
  codegenMode: 'minimal' | 'productionish' | 'typed';
  codegenAssumptions: string[];
  codegenWarnings: string[];
  costEstimate: {
    perRun: number; // sum of sub-actor PPE
    monthly100: number;
    monthly1000: number;
    excludesPlatformCompute: true;
  } | null;
  runtimeValidation?: { // present when validateRuntime = true
    allStagesOk: boolean;
    stagesChecked: number; stagesPassed: number; stagesFailed: number;
    perStage: Array<{
      stage: number; inputTesterOk: boolean;
      inputTesterErrors: string[]; inputTesterWarnings: string[];
      durationSeconds: number;
    }>;
  };
  agentContract?: { // emitted when emitAgentContract = true (default)
    safeToCall: boolean;
    recommendedAction: 'ship' | 'canary' | 'fix_mapping' | 'fix_schema' | 'do_not_call';
    safeInvocationMode: 'production' | 'canary_only' | 'not_ready';
    expectedOutputHandle: 'defaultDataset';
    requiredFixes: Array<{ stage: number | null; code: IssueCode }>;
    toolHint: string;
    postRunGuardSuggestion: string | null;
  };
  signals?: IssueCode[]; // emitted when emitSignals = true (default)
  evidenceCounts: {
    resolvedStages: number; totalStages: number;
    withInputSchema: number; withDatasetSchema: number;
    issuesBlocking: number; issuesAdvisory: number; issuesInfo: number;
    mappingSuggestionsEmitted: number;
  };
  builtAt: string; // ISO 8601
};

A summary record is mirrored to the key-value store under the SUMMARY key (decision scalars + schema completeness + cost).
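A dashboard can read that mirror without touching the dataset. A sketch, where the interface below mirrors the keyValueStore(id).getRecord(key) surface of apify-client (treat the wiring as an assumption):

```typescript
// Minimal slice of the apify-client surface this sketch relies on.
interface KvClientLike {
  keyValueStore(id: string): {
    getRecord(key: string): Promise<{ value: unknown } | undefined>;
  };
}

// Fetch the mirrored decision scalars from the run's default key-value store.
async function readSummary(client: KvClientLike, storeId: string): Promise<unknown> {
  const record = await client.keyValueStore(storeId).getRecord('SUMMARY');
  return record?.value; // decision scalars + schema completeness + cost
}
```

With the real client you would pass `new ApifyClient({ token })` and the run's `defaultKeyValueStoreId` after the run finishes.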

Failure modes

Every issue carries a stable code (member of IssueCode) and a severity. Codes are additive-only within a major version; confidencePolicyVersion bumps when the scoring formula changes.

| code | severity | fires when |
|---|---|---|
| ACTOR_NOT_FOUND | blocking | /v2/acts/{id} returns non-2xx under the caller's token |
| NO_FIELD_MAPPING | blocking | non-first stage has fieldMapping = {} or absent |
| RUNTIME_VALIDATION_FAILED | blocking | validateRuntime=true and ≥1 stage's inputTesterOk = false |
| MAPPED_FIELD_NOT_IN_PREV_OUTPUT | advisory | mapping source field absent from upstream's declared dataset schema (only when schema is declared) |
| TARGET_FIELD_NOT_IN_INPUT_SCHEMA | advisory | mapping target field absent from downstream's declared input schema |
| FIRST_STAGE_HAS_MAPPING | advisory | stage 1 has a fieldMapping (meaningless — no upstream) |
| DUPLICATE_ACTOR_IN_PIPELINE | advisory | same actorId appears in two or more stages |
| PIPELINE_VERY_LARGE | advisory | stages.length > 20 |
| INPUT_SCHEMA_THIN | advisory | stage resolves but declares no input-schema properties |
| DATASET_SCHEMA_MISSING | advisory | stage declares no actorDefinition.storages.dataset.fields |
| SCHEMA_AGENTIC_COVERAGE_LOW | advisory | pipeline-wide agenticCoverage < 0.5 |
| RUNTIME_VALIDATION_UNAVAILABLE | advisory | validateRuntime=true and input-tester failed to complete |
| OUTPUT_SCHEMA_MISSING | info | stage declares no explicit output schema |
| FIELD_METADATA_THIN | info | input-schema fields have < 50% title/description coverage (suppressed in strictness=lenient) |

Error-branch records carry recordType: 'input-error' (fewer than 2 stages) or recordType: 'error' (catch-block) with message, recommendation, and timestamp. The actor never exits FAILED.

For AI agents

Pipeline Preflight is compatible with the Apify MCP server. Outputs are flat typed JSON; agentContract.recommendedAction is a stable enum consumers switch() on. Typical flow: propose {stages: [...]} → call Pipeline Preflight → branch on decisionPosture / agentContract.recommendedAction; if no_call, iterate requiredFixes[] and retry.

CI

  • Fail: decisionPosture === 'no_call'
  • Canary: decisionPosture ∈ {'ship_pipeline', 'canary_recommended'}
  • Promote: decisionPosture === 'ship_pipeline' && decisionReadiness === 'actionable'
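Those three CI rules fit in a single function. A sketch (how you surface the verdict to your CI system is up to you; treating monitor_only as non-promotable is an assumption, since the rules above leave it unstated):

```typescript
interface CiReport {
  decisionPosture: 'ship_pipeline' | 'canary_recommended' | 'monitor_only' | 'no_call';
  decisionReadiness: 'actionable' | 'monitor' | 'insufficient-data';
}

type CiVerdict = 'fail' | 'canary' | 'promote';

function ciVerdict(r: CiReport): CiVerdict {
  if (r.decisionPosture === 'no_call') return 'fail';
  if (r.decisionPosture === 'ship_pipeline' && r.decisionReadiness === 'actionable') return 'promote';
  if (r.decisionPosture === 'ship_pipeline' || r.decisionPosture === 'canary_recommended') return 'canary';
  return 'fail'; // monitor_only: needs a human eyeball, so do not promote in CI (assumption)
}
```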

Usage

Pass a stages array. Each stage must have actorId; non-first stages must have fieldMapping. 3-stage validation completes in ~30s and charges the flat $0.40 event price. generatedCode is the orchestrator — paste into your own actor or orchestration script.

Input parameters (reference)

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| stages | array | Yes | [] | Array of pipeline stage objects. Minimum 2 stages required. Each object: actorId (string, required), fieldMapping (object, optional), memory (number MB, optional), timeout (number seconds, optional). |
| validateRuntime | boolean | No | false | v3. When true, also call actor-input-tester on each stage with a synthetic input built from the field mapping, verifying the target actors' real input schemas would accept what the pipeline would send them. No actors are actually run — input-tester only validates shapes. Transforms Pipeline Preflight from "schemas line up on paper" to "schemas line up AND empirical input contracts hold". |

Stage object format

Each entry in the stages array follows this structure:

| Field | Type | Required | Description |
|---|---|---|---|
| actorId | string | Yes | Full actor identifier, e.g. ryanclinton/google-maps-email-extractor |
| fieldMapping | object | No | Maps this stage's input field names (keys) to the previous stage's output field names (values) |
| memory | number | No | Memory in MB for this stage (default: 512). Embedded in generated code. |
| timeout | number | No | Timeout in seconds for this stage (default: 120). Embedded in generated code. |

Input examples

Three-stage lead generation pipeline (Maps → Email → Verify):

{
  "stages": [
    {
      "actorId": "ryanclinton/google-maps-email-extractor",
      "memory": 1024,
      "timeout": 300
    },
    {
      "actorId": "ryanclinton/email-pattern-finder",
      "fieldMapping": {
        "urls": "website"
      },
      "memory": 512,
      "timeout": 120
    },
    {
      "actorId": "ryanclinton/bulk-email-verifier",
      "fieldMapping": {
        "emails": "emailPattern"
      },
      "memory": 256,
      "timeout": 60
    }
  ]
}

Two-stage enrichment pipeline (Contact scraper → CRM push):

{
  "stages": [
    {
      "actorId": "ryanclinton/website-contact-scraper",
      "memory": 512,
      "timeout": 120
    },
    {
      "actorId": "ryanclinton/hubspot-lead-pusher",
      "fieldMapping": {
        "email": "email",
        "name": "contactName",
        "company": "companyName"
      },
      "memory": 256,
      "timeout": 60
    }
  ]
}

Reachability-only smoke test (will return no_call due to missing mappings). Use this shape to confirm both actors resolve before committing to a mapping; every non-first stage needs a fieldMapping before the preflight can return ship_pipeline or canary_recommended.

{
  "stages": [
    { "actorId": "ryanclinton/website-tech-stack-detector" },
    { "actorId": "ryanclinton/b2b-lead-qualifier" }
  ]
}

With empirical runtime validation (calls input-tester per stage):

{
  "stages": [
    { "actorId": "ryanclinton/website-contact-scraper" },
    {
      "actorId": "ryanclinton/hubspot-lead-pusher",
      "fieldMapping": { "email": "email", "name": "contactName" }
    }
  ],
  "validateRuntime": true
}

Input tips

  • Define field mappings for every non-first stage — omitting fieldMapping is a blocking issue (NO_FIELD_MAPPING). Without it, the downstream actor receives { data: [...] } instead of its declared input shape and will reject the call at runtime.
  • Use the full actor identifier — always use username/actor-name format (e.g., ryanclinton/website-contact-scraper), not just the actor name. The actor lookup will fail without the username prefix.
  • Check field names against actor schemas first — use Schema Registry or Schema Diff to confirm the exact field names before building the pipeline.
  • Start with 2 stages — validate the core connection first, then extend to 3 or 4 stages once the first pair validates cleanly.
  • Set realistic memory values — the generated code uses the memory value you specify. Check each actor's recommended memory in the Apify Store before setting these values.

Output example

{
  "recordType": "report",
  "oneLine": "Pipeline canary recommended: 3 stages, 1 advisory, est $0.26/run (medium confidence).",
  "decisionPosture": "canary_recommended",
  "decisionReason": "1 advisory; runtime validation not requested — wire it up behind a canary before trusting it in production.",
  "decisionReadiness": "monitor",
  "confidenceScore": 0.68,
  "confidenceLevel": "medium",
  "confidenceBreakdown": {
    "resolutionCoverage": 1.0,
    "mappingCoverage": 0.67,
    "schemaCoverage": 0.83,
    "runtimeBoost": 0.5
  },
  "verdictReasonCodes": ["MAPPED_FIELD_NOT_IN_PREV_OUTPUT"],
  "decisionCards": [
    {
      "kind": "watch-out",
      "title": "Stage 3: mapped field 'emailPattern' not in ryanclinton/email-pattern-finder's output schema",
      "shortReason": "Stage 3: mapped field 'emailPattern' not in ryanclinton/email-pattern-finder's output schema",
      "recommendation": "Check ryanclinton/email-pattern-finder's dataset schema; the field name may be different (e.g. 'markdown' vs 'text').",
      "urgency": "advisory",
      "stage": 3
    },
    {
      "kind": "cost-heads-up",
      "title": "Estimated $0.26 per pipeline run",
      "shortReason": "3 stages, aggregate PPE of sub-actors",
      "recommendation": "Does not include platform compute (memory × runtime). Check each sub-actor's pricing for the full picture.",
      "urgency": "info",
      "stage": null
    }
  ],
  "stages": 3,
  "valid": true,
  "errors": [],
  "warnings": [
    "Stage 3: mapped field 'emailPattern' not in ryanclinton/email-pattern-finder's output schema"
  ],
  "issues": [
    {
      "severity": "advisory",
      "code": "MAPPED_FIELD_NOT_IN_PREV_OUTPUT",
      "stage": 3,
      "message": "Stage 3: mapped field 'emailPattern' not in ryanclinton/email-pattern-finder's output schema",
      "recommendation": "Check ryanclinton/email-pattern-finder's dataset schema; the field name may be different (e.g. 'markdown' vs 'text')."
    }
  ],
  "stageDetails": [
    {
      "stage": 1,
      "actor": "ryanclinton/google-maps-email-extractor",
      "actorId": "ryanclinton/google-maps-email-extractor",
      "ppePrice": 0.15,
      "memory": 1024,
      "timeout": 300,
      "outputFields": ["businessName", "website", "email", "phone", "address", "rating", "reviewCount"],
      "inputFields": ["searchQuery", "maxResults", "country", "language", "proxyConfig"]
    },
    {
      "stage": 2,
      "actor": "ryanclinton/email-pattern-finder",
      "actorId": "ryanclinton/email-pattern-finder",
      "ppePrice": 0.10,
      "memory": 512,
      "timeout": 120,
      "outputFields": ["domain", "emailPattern", "confidence", "examples"],
      "inputFields": ["urls", "maxResults", "timeout"]
    },
    {
      "stage": 3,
      "actor": "ryanclinton/bulk-email-verifier",
      "actorId": "ryanclinton/bulk-email-verifier",
      "ppePrice": 0.005,
      "memory": 256,
      "timeout": 60,
      "outputFields": ["email", "valid", "mxCheck", "smtpCheck", "score"],
      "inputFields": ["emails", "verifySmtp", "timeout"]
    }
  ],
  "generatedCode": "import { Actor } from 'apify';\n\nActor.main(async () => {\n // Stage 1: ryanclinton/google-maps-email-extractor\n const input = await Actor.getInput();\n const run1 = await Actor.call('ryanclinton/google-maps-email-extractor', input, { memory: 1024, timeout: 300 });\n\n // Stage 2: ryanclinton/email-pattern-finder\n const ds1 = await Actor.apifyClient.dataset(run1.defaultDatasetId).listItems();\n const run2 = await Actor.call('ryanclinton/email-pattern-finder', { urls: ds1.items.map(i => i.website) }, { memory: 512, timeout: 120 });\n\n // Stage 3: ryanclinton/bulk-email-verifier\n const ds2 = await Actor.apifyClient.dataset(run2.defaultDatasetId).listItems();\n const run3 = await Actor.call('ryanclinton/bulk-email-verifier', { emails: ds2.items.map(i => i.emailPattern) }, { memory: 256, timeout: 60 });\n\n // Collect final output\n const finalDs = await Actor.apifyClient.dataset(run3.defaultDatasetId).listItems();\n await Actor.pushData(finalDs.items);\n});",
  "costEstimate": {
    "perRun": 0.26,
    "monthly100": 26.00,
    "monthly1000": 260.00,
    "excludesPlatformCompute": true
  },
  "builtAt": "2026-03-20T14:32:11.000Z"
}

Output fields (reference)

| Field | Type | Description |
|---|---|---|
| recordType | string | Discriminator: "report" for the main analysis, "input-error" for <2-stage input rejections, "error" for catch-block records. Filter downstream with WHERE recordType = 'report'. |
| oneLine | string | Single-sentence verdict safe to paste into Slack, email subjects, or dashboard tiles. |
| decisionPosture | string | Routable verdict: ship_pipeline (valid + runtime-validated + zero advisories), canary_recommended (valid but unverified), monitor_only (valid but schema advisories), no_call (blocking issues present). Branch on this, not on prose. |
| decisionReason | string | One sentence explaining why the posture landed where it did. |
| decisionReadiness | string | actionable / monitor / insufficient-data. Automation should only execute pipelines with actionable readiness. |
| readinessScore | number | 0–1 gate-like score. 1.0 for ship_pipeline, ~0.85 for canary_recommended, ~0.6 for monitor_only, 0 when any blocking issue is present. |
| confidenceScore | number | 0–1 harmonic mean of the five confidenceBreakdown components. |
| confidenceLevel | string | high (≥0.75) / medium (≥0.5) / low (<0.5). Use the level for UI filtering, the score for sorting. |
| confidencePolicyVersion | string | Version tag for the scoring formula. Bumped when components, weights, or bands change. |
| confidenceBreakdown | object | Per-component scores (0–1): resolutionCoverage, mappingCoverage, schemaCoverage, metadataCoverage, runtimeBoost. |
| confidencePenaltyReasons | string[] | Plain-English reasons explaining why confidence is below 1.0. |
| schemaCompleteness | object | Pipeline-wide schema quality: inputSchemaQuality, datasetSchemaQuality (each good/partial/poor/missing), outputSchemaPresent, fieldDescriptionCoverage, exampleCoverage, typedFieldCoverage, agenticCoverage. |
| fixPlan | object[] | Ordered remediation: blocking first, then advisory, then info. Each entry {order, stage, action, why, severity, code}. Follow top-to-bottom. |
| mappingSuggestions | object[] | Present only when NO_FIELD_MAPPING fires and both schemas are declared. Each entry {stage, targetField, suggestedSourceField, basis, confidence}. Never apply without review. |
| agentContract | object | {safeToCall, recommendedAction, safeInvocationMode, expectedOutputHandle, requiredFixes[{stage, code}], toolHint, postRunGuardSuggestion}. Emitted when emitAgentContract=true (default). |
| signals | string[] | Fleet-consumable signal codes. Emitted when emitSignals=true (default). |
| codegenMode | string | Mirrors the input mode: minimal / productionish / typed. |
| codegenAssumptions | string[] | Plain-English assumptions baked into generatedCode (e.g. pagination mode). |
| codegenWarnings | string[] | Per-stage warnings about the generated code (e.g. no dataset schema declared). |
| evidenceCounts | object | Counts backing the verdict: resolvedStages, totalStages, withInputSchema, withDatasetSchema, issuesBlocking, issuesAdvisory, issuesInfo, mappingSuggestionsEmitted. |
| verdictReasonCodes | string[] | Stable machine-readable codes present on this report. Additive-only within a major version. |
| decisionCards | object[] | 2–3 scannable cards: {kind, title, shortReason, recommendation, urgency, stage}. Kinds: fix-this-first, watch-out, cost-heads-up. |
| issues | object[] | Structured issue list: {severity, code, stage, message, recommendation}. Branch on code, display message, act on recommendation. |
| stages | number | Total number of pipeline stages validated. |
| valid | boolean | true if no blocking issues. |
| errors | string[] | Blocking issue messages (mirrors issues[].message where severity='blocking'). Legacy shape kept for dashboard consumers. |
| warnings | string[] | Advisory issue messages (mirrors issues[].message where severity='advisory'). Legacy shape kept for dashboard consumers. |
| stageDetails | object[] | Per-stage details array (see nested fields below) |
| stageDetails[].stage | number | Stage index (1-based) |
| stageDetails[].actor | string | Resolved actor name in username/name format |
| stageDetails[].actorId | string | Original actor ID as provided in the input |
| stageDetails[].ppePrice | number | PPE price per event in USD from the Apify API |
| stageDetails[].memory | number | Memory in MB (from input or default 512) |
| stageDetails[].timeout | number | Timeout in seconds (from input or default 120) |
| stageDetails[].outputFields | string[] | Field names from the actor's dataset storage schema |
| stageDetails[].inputFields | string[] | Field names from the actor's input schema |
| generatedCode | string | Complete TypeScript Actor.main() orchestration script |
| costEstimate.perRun | number | Sum of all stage PPE prices, rounded to 2 decimal places |
| costEstimate.monthly100 | number | Projected monthly cost at 100 runs |
| costEstimate.monthly1000 | number | Projected monthly cost at 1,000 runs |
| runtimeValidation | object | v3. Present when validateRuntime: true. Contains allStagesOk, stagesChecked, stagesPassed, stagesFailed, and perStage[] with per-stage inputTesterOk, inputTesterErrors[], inputTesterWarnings[], and durationSeconds. If any stage fails empirical input validation, valid in the main report is forced to false. |
| builtAt | string | ISO 8601 timestamp of the validation run |
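For intuition on confidenceScore, which the reference above documents as a harmonic mean of the confidenceBreakdown components: a plain harmonic mean looks like this (illustrative only; the actor's actual weights and bands are governed by confidencePolicyVersion):

```typescript
// Harmonic mean punishes any single weak component harder than an arithmetic mean would.
function harmonicMean(components: number[]): number {
  if (components.length === 0 || components.some(c => c <= 0)) return 0; // one zero zeroes the score
  return components.length / components.reduce((acc, c) => acc + 1 / c, 0);
}
```

This is why a pipeline with one thin area (say, runtimeBoost = 0.5) lands noticeably below its arithmetic average.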

How much does it cost to build an actor pipeline?

Pipeline Preflight uses pay-per-event pricing — you pay $0.40 per pipeline build. Platform compute costs (memory × runtime) are separate and are charged on top of the event price by Apify; they are not included in costEstimate.perRun. A typical 3-stage run uses 256 MB for under 30 seconds and adds a few fractions of a cent to the event price.

| Scenario | Pipelines | Cost per build | Total cost |
|---|---|---|---|
| Quick test | 1 | $0.40 | $0.40 |
| Design sprint | 10 | $0.40 | $4.00 |
| Weekly CI validation | 50 | $0.40 | $20.00 |
| Daily automated checks | 200 | $0.40 | $80.00 |
| Continuous integration suite | 1,000 | $0.40 | $400.00 |
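The scenario totals are straight multiplication of the $0.40 event price. A sketch, with rounding to cents:

```typescript
// Total event-price cost for N pipeline builds (platform compute excluded).
function scenarioTotal(builds: number, perBuild = 0.40): number {
  return Math.round(builds * perBuild * 100) / 100;
}
```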

You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached.

Comparable pipeline design tools like Zapier ($19–$69/month) and Make ($9–$29/month) charge monthly subscriptions and do not generate TypeScript code or validate Apify actor schemas. With Pipeline Preflight, most teams spend $2–$10/month validating pipelines on demand, with no subscription.

Build an actor pipeline using the API

Python

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("ryanclinton/actor-pipeline-builder").call(run_input={
    "stages": [
        {
            "actorId": "ryanclinton/google-maps-email-extractor",
            "memory": 1024,
            "timeout": 300
        },
        {
            "actorId": "ryanclinton/email-pattern-finder",
            "fieldMapping": {"urls": "website"},
            "memory": 512,
            "timeout": 120
        },
        {
            "actorId": "ryanclinton/bulk-email-verifier",
            "fieldMapping": {"emails": "emailPattern"},
            "memory": 256,
            "timeout": 60
        }
    ]
})

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"Valid: {item['valid']} | Stages: {item['stages']} | Cost/run: ${item['costEstimate']['perRun']}")
    if item.get("warnings"):
        for w in item["warnings"]:
            print(f"  Warning: {w}")
    print("\n--- Generated Code ---")
    print(item["generatedCode"])

JavaScript

import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/actor-pipeline-builder").call({
    stages: [
        {
            actorId: "ryanclinton/google-maps-email-extractor",
            memory: 1024,
            timeout: 300
        },
        {
            actorId: "ryanclinton/email-pattern-finder",
            fieldMapping: { urls: "website" },
            memory: 512,
            timeout: 120
        },
        {
            actorId: "ryanclinton/bulk-email-verifier",
            fieldMapping: { emails: "emailPattern" },
            memory: 256,
            timeout: 60
        }
    ]
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`Valid: ${item.valid} | Stages: ${item.stages} | Cost/run: $${item.costEstimate.perRun}`);
    item.warnings?.forEach(w => console.log(`  Warning: ${w}`));
    console.log("\n--- Generated Code ---\n" + item.generatedCode);
}

cURL

# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~actor-pipeline-builder/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "stages": [
      {
        "actorId": "ryanclinton/google-maps-email-extractor",
        "memory": 1024,
        "timeout": 300
      },
      {
        "actorId": "ryanclinton/email-pattern-finder",
        "fieldMapping": {"urls": "website"},
        "memory": 512,
        "timeout": 120
      }
    ]
  }'

# Fetch results (replace DATASET_ID with the defaultDatasetId from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
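The POST above returns immediately; the dataset is only complete once the run reaches a terminal status. A hedged polling sketch in Python, with the status fetcher injected so any HTTP wrapper around GET /v2/actor-runs/{runId} can be used (the terminal-status names follow the Apify run lifecycle):

```python
import time

TERMINAL = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}

def wait_for_run(get_status, interval_s=2.0, max_polls=150):
    """Poll until the run reaches a terminal state; ~5 minutes at the defaults.

    get_status is any zero-argument callable returning the run's current
    status string, e.g. parsed from GET /v2/actor-runs/{runId}.
    """
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL:
            return status
        time.sleep(interval_s)
    raise TimeoutError(f"run not terminal after {max_polls} polls")

# Demo with a fake fetcher that reports READY, RUNNING, then SUCCEEDED:
statuses = iter(["READY", "RUNNING", "SUCCEEDED"])
print(wait_for_run(lambda: next(statuses), interval_s=0.0))  # SUCCEEDED
```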

How it works

  1. Resolve each stage. Parallel GET /v2/acts/{id}/builds/default with Promise.allSettled and a 30 s AbortSignal.timeout. When the default build is unavailable, falls back to GET /v2/acts/{id} to read taggedBuilds.latest (or any build), then GET /v2/acts/{id}/builds/{buildId}. Retries on 429/5xx with exponential backoff (500 ms, 1 s, 2 s). Failures become ACTOR_NOT_FOUND, not thrown exceptions.
  2. Parse schemas. buildData.inputSchema (JSON string) → input field names + per-field title/description/example coverage. buildData.actorDefinition.storages.dataset.fields → output field names + types + metadata coverage.
  3. Type-check transitions. For each non-first stage, check every fieldMapping[inputField] = outputField entry against both schemas. Missing field mappings are blocking; field-name mismatches are advisory.
  4. Score completeness. Per-stage and pipeline-wide schema grades (good/partial/poor/missing), metadata coverage, agenticCoverage = avg(schemaCoverage, metadataCoverage).
  5. Compute decision. decisionPosture from (valid, advisoryCount, runtimeValidated). confidenceScore = harmonic mean of confidenceBreakdown. readinessScore is an independent gate-like score (1.0 ship / 0.85 canary / 0.6 monitor / 0 no_call).
  6. Generate code. Emit Actor.main() with Actor.call() per stage, listItems({limit:1000}) or pagination helper, field-mapping projections. Assumptions and warnings captured in codegenAssumptions[] / codegenWarnings[].
  7. Sum cost. costEstimate.perRun = Σ pricingInfos[last].pricingPerEvent.actorChargeEvents[0].eventPriceUsd across resolved stages. excludesPlatformCompute: true is explicit.
  8. (Optional) Empirical runtime check. validateRuntime: true calls actor-input-tester per stage with a synthesized input built from the declared mapping. 5-minute wall-clock cap via Promise.race. Promise.allSettled so one stage hang doesn't crash the batch.
  9. Emit. Push a single recordType: 'report' item. Write SUMMARY to KV. Charge pipeline-build if isPPE. Exit SUCCEEDED.
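Step 1's retry policy is compact enough to sketch. The actor itself is TypeScript; this Python version mirrors the described schedule (500 ms, 1 s, 2 s) and retryable statuses, with the request function injected for illustration:

```python
import time

BACKOFF_MS = [500, 1000, 2000]  # matches the schedule described above

def is_retryable(status: int) -> bool:
    """429 and any 5xx are retried; everything else is returned as-is."""
    return status == 429 or 500 <= status <= 599

def fetch_with_retry(do_request, sleep=time.sleep):
    """Call do_request() (returning (status, body)); retry on 429/5xx with
    exponential backoff, then give up and report the last status."""
    status, body = do_request()
    for delay_ms in BACKOFF_MS:
        if not is_retryable(status):
            return status, body
        sleep(delay_ms / 1000)
        status, body = do_request()
    return status, body

# Fake request that fails twice with 503, then succeeds:
responses = iter([(503, None), (503, None), (200, {"ok": True})])
status, body = fetch_with_retry(lambda: next(responses), sleep=lambda s: None)
print(status)  # 200
```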

Limitations

  • Schema-declaration dependence. If an actor declares no dataset.fields, output-field validation degrades to DATASET_SCHEMA_MISSING advisory. Generated code may still work at runtime; it just wasn't verifiable at design time.
  • Design-time only. An actor can declare one shape in its schema and emit another at runtime. Enable validateRuntime: true for empirical per-stage input checks, but even that doesn't catch output-shape drift.
  • Token scope. Private actors outside the caller's token scope return ACTOR_NOT_FOUND.
  • Flat mappings only. fieldMapping is { string: string }. Nested paths and type coercion are out of scope; write them manually in the generated code.
  • Cost excludes platform compute. costEstimate.perRun sums PPE event prices. Memory × runtime compute is not modelled.
  • ≥2 stages required. Single-actor validation → Input Tester.

Troubleshooting

Stage returns "Actor not found" error despite the actor existing. Confirm that the actorId uses the full username/actor-name format (e.g., ryanclinton/website-contact-scraper, not just website-contact-scraper). Also verify that your Apify token has read access to the actor — private actors belonging to other users cannot be fetched.
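A client-side sanity check catches the bare-name mistake before a run is charged. A minimal sketch; the permitted character set is an assumption about typical Apify identifiers, not the platform's exact rule:

```python
import re

# username/actor-name (or username~actor-name in REST paths). The character
# classes here are an assumption, not the platform's exact naming rule.
ACTOR_ID_RE = re.compile(r"^[a-zA-Z0-9._-]+[/~][a-zA-Z0-9._-]+$")

def looks_like_full_actor_id(actor_id: str) -> bool:
    """True when the ID includes both a username and an actor name."""
    return bool(ACTOR_ID_RE.match(actor_id))

print(looks_like_full_actor_id("ryanclinton/website-contact-scraper"))  # True
print(looks_like_full_actor_id("website-contact-scraper"))              # False
```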

All output fields appear empty for a stage. The actor's latest build does not declare a dataset schema in actorDefinition.storages.dataset.fields. This is common for older actors. Pipeline Preflight will issue a warning but still generate code. Use the Apify Console to inspect the actor's actual output dataset and confirm field names manually before relying on the mapping.

Generated code runs but no data appears in the final dataset. This typically means a field mapping references a field name that does not match the actual runtime output. Check the warnings array in the validation report for mapping issues. For actors where the schema is not declared, run the upstream actor manually and inspect its output dataset to get the actual field names.

Validation warnings on every stage transition. If all stages produce warnings about missing schemas, the actors in your pipeline are likely older and do not expose build-time dataset schemas. The validation will still succeed (valid: true) and the generated code will still run — the warnings indicate reduced validation confidence, not a broken pipeline.

Run completes instantly with valid: false and no stageDetails. At least one actor ID was not found. Check each actorId in the errors array, correct the identifier, and re-run.

Responsible use

  • This actor only accesses actor metadata and schema information via the Apify API.
  • Only actors that your API token has permission to read will be processed.
  • Do not use this actor to harvest pricing or schema data from competitor actors at scale.
  • Generated code uses Actor.call() which triggers billable runs on the target actors — review cost estimates before deploying generated pipelines.

FAQ

How does Pipeline Preflight work? It validates that a chain of Apify actors composes correctly. It fetches each stage's declared input and dataset schemas from the Apify API, type-checks every field mapping against both schemas, and returns a decision (ship_pipeline / canary_recommended / monitor_only / no_call) plus a TypeScript orchestrator.

How many stages can an actor pipeline have? There is no hard maximum imposed by Pipeline Preflight — pipelines with 2 to 6+ stages have been validated successfully. Performance scales linearly since all actor schemas are fetched in parallel. Practical limits come from the Apify API rate limits and the 512 MB default memory allocation of Pipeline Preflight itself.

Does Pipeline Preflight run any of the actors in my pipeline? No. Pipeline Preflight only reads actor metadata and schemas from the Apify API. It never calls Actor.call() on your pipeline actors during validation. No credits are consumed by the target actors during a build.

How accurate is the field mapping validation? Accuracy depends on whether each actor publishes its dataset schema in its build metadata. Actors that declare actorDefinition.storages.dataset.fields are validated fully. Actors that define output fields at runtime receive warnings instead of confirmed passes. For best results, combine with Schema Registry to inspect field names before building.

Can I validate pipelines that include actors from other Apify users? Yes, as long as the actors are public (or your token has access to them). The actor ID format is always username/actor-name.

How long does a typical pipeline validation take? Most 2-4 stage pipelines complete in under 30 seconds. Each actor lookup has a 30-second timeout, and all stages are fetched concurrently, so total time is determined by the slowest individual lookup — typically 5-15 seconds for a 3-stage pipeline.

Is the generated TypeScript code production-ready? The generated code is a correct functional starting point for the described pipeline. It needs error handling, pagination for large datasets, logging, and environment-specific configuration before production deployment. Treat it as a reference implementation, not a finished product.
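Of the gaps listed, pagination is the most mechanical to add. A sketch of the standard offset/limit loop, with the page fetcher injected (the real endpoint is GET /v2/datasets/{id}/items with offset and limit query parameters):

```python
def iter_all_items(fetch_page, page_size=1000):
    """Yield every dataset item by walking offset/limit pages.

    fetch_page(offset, limit) -> list of items, mirroring the shape of
    GET /v2/datasets/{id}/items?offset=..&limit=..
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:  # short page means the dataset is exhausted
            return
        offset += page_size

# Demo against a fake 25-item dataset fetched 10 items at a time:
data = list(range(25))
items = list(iter_all_items(lambda o, l: data[o:o + l], page_size=10))
print(len(items))  # 25
```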

Can I use Pipeline Preflight to validate pipelines with non-Apify actors? No. Pipeline Preflight reads schemas exclusively from the Apify API. All stages must reference valid Apify actor IDs.

How is Pipeline Preflight different from Zapier or Make? Zapier and Make are no-code runtime orchestration tools. Pipeline Preflight is a design-time validation tool that generates Apify-native TypeScript code. It does not execute workflows — it validates that actor schemas connect correctly and produces the code for you to deploy yourself via the Apify platform.

Can I schedule Pipeline Preflight to run automatically? Yes. Use Apify's built-in scheduler to run Pipeline Preflight on a cron schedule — for example, nightly or after each actor deployment. This acts as a schema regression test for your pipeline.

What happens if an actor has no PPE pricing? The stage's ppePrice will be 0 and the cost estimate will exclude that stage. This occurs for actors using compute-unit pricing rather than pay-per-event. The pipeline will still validate and generate code normally.
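The exclusion behaviour can be mirrored when post-processing reports. A sketch, assuming a flattened per-stage ppePrice field for illustration; the actual report derives prices from the nested pricingInfos structure described under How it works:

```python
def estimate_per_run(stages):
    """Sum per-event prices across stages; stages without PPE pricing count
    as 0, mirroring the exclusion behaviour described above. The ppePrice
    field name is an illustrative assumption."""
    return round(sum(s.get("ppePrice", 0) or 0 for s in stages), 4)

stages = [
    {"actorId": "a/extractor", "ppePrice": 0.05},
    {"actorId": "a/finder", "ppePrice": 0},      # compute-unit priced: excluded
    {"actorId": "a/verifier", "ppePrice": 0.002},
]
print(estimate_per_run(stages))  # 0.052
```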

Can I use the output with the Apify MCP server? Yes. The structured JSON output and the generatedCode string can be passed to an LLM via the Apify MCP server for code review, documentation generation, or further pipeline design assistance.

Is it legal to read actor schemas and pricing via the Apify API? Yes. Pipeline Preflight uses the official Apify API with your own API token to read publicly available metadata for actors you have permission to access. This is standard platform API usage, not scraping.

Sibling actors in the same fleet

Pipeline Preflight is one step in a larger backend/DevOps fleet. When your need falls outside pre-run pipeline-design validation, route to the right tool:

| Need | Use this instead |
| --- | --- |
| Validate a single actor's input JSON against its declared schema | Input Tester |
| Diff two versions of an actor's schema to detect breaking changes | Schema Validator |
| Score an actor's overall quality (runs, reviews, revenue, issues) | Quality Monitor |
| Find actors by input/output shape in your account | Schema Registry |
| Run a real integration test on a single actor with a known-good input | Test Runner |
| Actually execute the generated pipeline (Pipeline Preflight only generates the code) | Copy generatedCode into your own orchestrator actor, or use Cloud Staging for a dry-run harness |
| Detect silent output-quality regressions after a live pipeline runs | Compliance Scanner |

Pipeline Preflight is a contract validator for Apify pipelines — it ensures that actor inputs and outputs align across stages. Once you have the generatedCode and a ship_pipeline (or canary_recommended) verdict, the orchestrator is yours to schedule, guard, and instrument with the sibling actors above.
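Branching on decisionPosture (and only that field) can be a single lookup. A sketch where the four posture values are the actor's documented ones and the action names are hypothetical:

```python
def next_action(decision_posture: str) -> str:
    """Map the routable decisionPosture values to a pipeline action.
    The action names are illustrative; the posture values are the actor's."""
    actions = {
        "ship_pipeline": "deploy",
        "canary_recommended": "deploy_canary",
        "monitor_only": "deploy_with_alerts",
        "no_call": "block",
    }
    try:
        return actions[decision_posture]
    except KeyError:
        raise ValueError(f"unknown decisionPosture: {decision_posture!r}")

print(next_action("ship_pipeline"))       # deploy
print(next_action("canary_recommended"))  # deploy_canary
```

Unknown values fail loudly rather than defaulting to deploy, which keeps the control flow safe if the report schema ever changes.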

Help us improve

If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:

  1. Go to Account Settings > Privacy
  2. Enable Share runs with public Actor creators

This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.

Support

Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.