Actor Pipeline Builder — Validate Multi-Actor Workflows
Pricing
$400.00 / 1,000 pipeline builders
Actor Pipeline Builder. Available on the Apify Store with pay-per-event pricing.
Developer: ryan clinton
Pipeline Preflight — Validate Apify Actor Pipelines Before You Run Them
Pipeline errors should be caught before execution, not after minutes of runtime. Invalid pipelines fail at runtime; Pipeline Preflight detects those failures at definition time.
Pipeline Preflight validates multi-actor Apify pipelines before execution and returns a production decision.
Pipeline Preflight is a pre-run validation stage in an Apify pipeline execution lifecycle.
Never run a pipeline in production without validating it first.
Contract
Pipeline Preflight checks that all stages in a pipeline compose correctly (input schemas, dataset schemas, field mappings, reachability) and returns a decision.
Use this actor when you need to verify that a pipeline is safe to run before executing it.
Execution pattern: define pipeline → run Pipeline Preflight → branch on decisionPosture → deploy or fix.
Guarantee: the pipeline is callable and stages compose correctly across inputs and outputs.
Output field: `decisionPosture` (routable control signal for automation)
This field determines what to do next: `ship_pipeline` / `canary_recommended` / `monitor_only` / `no_call`.
Always branch on `decisionPosture`. It is the only field you should use for control flow.
Do not branch on `oneLine` or `decisionReason`.
This actor does not run pipelines — it validates them before execution.
Flat rate: $0.40 per pipeline-build event. Platform compute (memory × runtime) billed separately by Apify.
No side effects. Pipeline Preflight reads Apify API metadata. It does not call, run, trigger, or schedule the target actors. Safe for CI, cron, and autonomous agents.
Mental model
Treat each actor as a function:
- input schema = function arguments
- dataset schema = return type
- `fieldMapping` = argument binding
Pipeline Preflight checks that the types line up across the chain, i.e. that these functions compose correctly.
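To make the analogy concrete, here is a minimal TypeScript sketch with hypothetical field names (`website`, `emails`); the real names come from each actor's declared schemas:

```typescript
// Hypothetical schemas for a two-stage chain. In the analogy:
// input schema = function arguments, dataset schema = return type.
type CrawlerInput = { urls: string[] };
type CrawlerRecord = { website: string };
type VerifierInput = { emails: string[] };

// fieldMapping = argument binding: pick upstream output fields into
// the downstream input shape.
function bind(records: CrawlerRecord[]): VerifierInput {
  return { emails: records.map((r) => r.website) };
}

// The chain "composes" only if bind() type-checks. That is the same
// check Pipeline Preflight performs against the declared schemas.
const downstreamInput = bind([{ website: "a@example.com" }]);
```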
What it does
Core
- Type-checks stage transitions (input schema ↔ dataset schema ↔ field mapping)
- Resolves actor reachability via the Apify API (`/v2/acts/{id}/builds/default`)
- Produces a deterministic production decision: `ship_pipeline` / `canary_recommended` / `monitor_only` / `no_call`
Additional
- Schema completeness scoring (per-stage + pipeline-wide; drives `SCHEMA_AGENTIC_COVERAGE_LOW`)
- Optional empirical input validation (`validateRuntime: true` → calls `actor-input-tester` per stage; no target actors run)
- Ordered `fixPlan[]` + schema-based `mappingSuggestions[]`
- TypeScript orchestration codegen (`minimal` / `productionish` / `typed`) with `codegenAssumptions[]` and `codegenWarnings[]`
- Agent contract (`agentContract.safeToCall` + stable `recommendedAction` enum) for MCP planners
Common causes of pipeline failure
Most multi-actor Apify pipelines break on cross-stage shape mismatches. Pipeline Preflight surfaces each as a stable `verdictReasonCodes[]` entry with a typed recommendation:
- `MAPPED_FIELD_NOT_IN_PREV_OUTPUT` — mapping points at a field the upstream actor does not emit (per its declared dataset schema)
- `TARGET_FIELD_NOT_IN_INPUT_SCHEMA` — downstream actor's input schema does not declare the mapped field
- `NO_FIELD_MAPPING` — non-first stage has no mapping; downstream actor receives `{ data: [...] }` and rejects
- `DATASET_SCHEMA_MISSING` — upstream declares no dataset schema; generated code cannot verify field names
- `ACTOR_NOT_FOUND` — slug wrong, actor private, or token lacks access
- `RUNTIME_VALIDATION_FAILED` — `validateRuntime: true` called `actor-input-tester` and a stage rejected the synthesized input
- `SCHEMA_AGENTIC_COVERAGE_LOW` — < 50% of resolved stages declare both input and dataset schemas
Full enum in Failure modes.
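As an illustrative sketch (not the actor's internal code), the mapping checks behind three of these codes can be reproduced from the declared schemas alone:

```typescript
type IssueCode =
  | "NO_FIELD_MAPPING"
  | "MAPPED_FIELD_NOT_IN_PREV_OUTPUT"
  | "TARGET_FIELD_NOT_IN_INPUT_SCHEMA";

// Check one non-first stage transition against the declared schemas.
// fieldMapping keys are downstream input fields; values are upstream output fields.
function checkTransition(
  upstreamOutputFields: string[],  // from upstream dataset schema
  downstreamInputFields: string[], // from downstream input schema
  fieldMapping: Record<string, string> | undefined,
): IssueCode[] {
  if (!fieldMapping || Object.keys(fieldMapping).length === 0) {
    // Blocking: downstream would receive { data: [...] } and reject.
    return ["NO_FIELD_MAPPING"];
  }
  const codes: IssueCode[] = [];
  for (const [target, source] of Object.entries(fieldMapping)) {
    if (!upstreamOutputFields.includes(source)) codes.push("MAPPED_FIELD_NOT_IN_PREV_OUTPUT");
    if (!downstreamInputFields.includes(target)) codes.push("TARGET_FIELD_NOT_IN_INPUT_SCHEMA");
  }
  return codes;
}
```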
How is this different from Zapier or Make?
Pipeline Preflight does not execute workflows.
It validates that an Apify actor chain is callable and generates the orchestration code.
Zapier/Make run workflows. Pipeline Preflight verifies that your workflow definition is correct before you run it anywhere (Actor.call(), Apify scheduler, webhook, Zapier, Make, n8n, GitHub Actions, MCP agent).
When NOT to use Pipeline Preflight
Pipeline Preflight only validates pipeline definitions (schemas, mappings, reachability). Do NOT use this actor if you need to:
- Validate a single actor's input JSON against its schema → use Input Guard.
- Check real output data quality from a live run → use Output Guard.
- Monitor production outputs for drift, null spikes, or schema regressions → use Output Guard.
- Run integration or regression tests against an actor → use Deploy Guard.
- Audit actors for PII, GDPR, CCPA, ToS, or CFAA risk → use Compliance Scanner.
- Score an actor's historical run quality → use Quality Monitor.
- Compare two actors A/B on the same input → use A/B Tester.
- Execute the pipeline → copy `generatedCode` into your own orchestrator actor.
- Run cost, revenue, or fleet-level analytics → use Fleet Analytics.
- Do Apify Store competitive analysis → out of scope.
Pipeline Preflight validates pipeline definitions (schemas, mappings, reachability) and generates runnable orchestration code. Everything after the pipeline is callable lives in a sibling actor.
What to do with decisionPosture
| Posture | What it means | What to do |
|---|---|---|
| `ship_pipeline` | Valid, zero advisories, runtime-validated | Pipe `generatedCode` straight into your orchestrator; `agentContract.safeToCall = true`. |
| `canary_recommended` | Valid, zero advisories, runtime not verified | Deploy behind a canary (one record through first), then promote. |
| `monitor_only` | Valid, but schema advisories remain | Dry-run against a single record before scheduling. Treat as "will probably work, needs a human eyeball." |
| `no_call` | Blocking issues present | Do NOT call. Work the `fixPlan[]` top-to-bottom, then re-preflight. |
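A minimal consumer of this table, sketched in TypeScript (the returned action labels are this sketch's own, not part of the report):

```typescript
type DecisionPosture = "ship_pipeline" | "canary_recommended" | "monitor_only" | "no_call";

// Route on the posture alone; never parse oneLine/decisionReason prose.
function nextStep(posture: DecisionPosture): "deploy" | "canary" | "dry_run" | "fix" {
  switch (posture) {
    case "ship_pipeline":      return "deploy";  // pipe generatedCode into the orchestrator
    case "canary_recommended": return "canary";  // one record through first, then promote
    case "monitor_only":       return "dry_run"; // human eyeball before scheduling
    case "no_call":            return "fix";     // work fixPlan[] top-to-bottom, re-preflight
  }
}
```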
Example decision flow
Input — 3-stage pipeline with a missing mapping on stage 2:
```json
{
  "stages": [
    { "actorId": "apify/rag-web-browser" },
    { "actorId": "apify/website-content-crawler" },
    {
      "actorId": "ryanclinton/bulk-email-verifier",
      "fieldMapping": { "emails": "emailPattern" }
    }
  ]
}
```
Output (abridged):
```json
{
  "decisionPosture": "no_call",
  "decisionReason": "1 blocking issue — cannot generate a runnable pipeline. Fix the errors and retry.",
  "readinessScore": 0,
  "verdictReasonCodes": ["NO_FIELD_MAPPING"],
  "fixPlan": [
    {
      "order": 1,
      "stage": 2,
      "severity": "blocking",
      "code": "NO_FIELD_MAPPING",
      "action": "Add a fieldMapping on stage 2 describing which of apify/rag-web-browser's output fields to feed into apify/website-content-crawler's input.",
      "why": "Stage 2: no field mapping defined — output from apify/rag-web-browser won't be passed to apify/website-content-crawler"
    }
  ],
  "agentContract": {
    "safeToCall": false,
    "recommendedAction": "fix_mapping",
    "requiredFixes": [{ "stage": 2, "code": "NO_FIELD_MAPPING" }]
  }
}
```
Next step → fix the mapping → re-run Pipeline Preflight → `decisionPosture` flips to `canary_recommended` or `ship_pipeline` → deploy.
Decision contract
These are the always-true promises, enforced in code:
- `decisionPosture = ship_pipeline` implies: `valid = true`, zero blocking issues, zero advisory issues, runtime validation ran AND passed, `decisionReadiness = actionable`, `readinessScore = 1.0`, `agentContract.safeToCall = true`. Safe to pipe through `Actor.call()` in production.
- `decisionPosture = canary_recommended` implies: `valid = true`, zero blocking issues, zero advisory issues, runtime validation NOT run. `readinessScore` around 0.85. Pipeline likely works but hasn't been empirically verified; wire it up behind a canary.
- `decisionPosture = monitor_only` implies: `valid = true`, zero blocking issues, at least one advisory. `readinessScore` around 0.6. Pipeline may run but schema advisories remain; dry-run before scheduling.
- `decisionPosture = no_call` implies: `valid = false`, at least one blocking issue (`ACTOR_NOT_FOUND`, `NO_FIELD_MAPPING`, or `RUNTIME_VALIDATION_FAILED`), `decisionReadiness = insufficient-data`, `readinessScore = 0`, `generatedCode = ''`, `agentContract.safeToCall = false`.
- `readinessScore` and `confidenceScore` are independent. Readiness is "how close to safe execution" (gate-like). Confidence is "how much to trust the verdict" (driven by schema completeness and evidence). A pipeline can be 100% ready but 50% confident if the stages declare thin schemas.
- Blocking vs advisory is stable. Automation should gate on `blocking` only (`issues.filter(i => i.severity === 'blocking')`). `info` is purely explanatory.
- `verdictReasonCodes` is additive-only within a major version: new codes may be added; existing codes will not be renamed or repurposed.
- `confidencePolicyVersion` is bumped whenever the confidence-scoring formula changes (component weights, harmonic base, bands). Scores are comparable only within the same policy version.
- The actor never exits FAILED for user input errors. Every error branch (including <2 stages, unreachable sub-actors, and catch-block errors) pushes a structured record to the dataset and exits SUCCEEDED, so it is safe to schedule on a cron without tripping Apify's default-input auto-test.
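A sketch of the recommended gate, using the `issues[]` shape from the output contract:

```typescript
type Severity = "blocking" | "advisory" | "info";

interface Issue {
  severity: Severity;
  code: string;
  message: string;
}

// Gate exactly as the contract recommends: blocking issues stop
// automation; advisory and info never do.
function safeToProceed(issues: Issue[]): boolean {
  return issues.filter((i) => i.severity === "blocking").length === 0;
}
```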
Schema quality
The Apify platform drives the Console run form, API validation, and MCP tool inference from input-schema metadata. Thin schemas aren't unusable, but they are effectively invisible to agent planners. `schemaCompleteness` grades each stage and the pipeline on good / partial / poor / missing, exposing `fieldDescriptionCoverage`, `exampleCoverage`, `typedFieldCoverage`, and `agenticCoverage` as 0–1 floats. `SCHEMA_AGENTIC_COVERAGE_LOW` fires below 0.5.
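Under the definition given earlier (fraction of resolved stages declaring both input and dataset schemas), the advisory's trigger can be sketched as:

```typescript
interface StageSchemas {
  hasInputSchema: boolean;
  hasDatasetSchema: boolean;
}

// agenticCoverage per the advisory's definition: fraction of resolved
// stages declaring BOTH an input schema and a dataset schema.
function agenticCoverage(stages: StageSchemas[]): number {
  if (stages.length === 0) return 0;
  const full = stages.filter((s) => s.hasInputSchema && s.hasDatasetSchema).length;
  return full / stages.length;
}

// SCHEMA_AGENTIC_COVERAGE_LOW fires strictly below 0.5.
const firesLowCoverage = (stages: StageSchemas[]) => agenticCoverage(stages) < 0.5;
```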
Automation contract
Three common consumers, three different fields to read:
| Consumer | Read this field | Why |
|---|---|---|
| Webhook / Zapier / Slack alerting | `decisionPosture` + `oneLine` | One scalar + one sentence. No prose parsing. |
| Dashboard / UI | `decisionCards[]` + `confidenceLevel` + `costEstimate` | Scannable cards + human-readable level + cost. |
| Agent tool call / LLM | `issues[]` + `verdictReasonCodes` | Structured evidence with recommendations, stable codes. |
Input contract
```typescript
type Input = {
  stages: Array<{
    actorId: string;                        // 'username/actor-name'
    fieldMapping?: Record<string, string>;  // { downstreamInputField: upstreamOutputField }
    memory?: number;                        // MB, embedded in generatedCode
    timeout?: number;                       // seconds, embedded in generatedCode
    alias?: string;                         // optional human name in generatedCode comments
  }>;                                       // >= 2; required
  validateRuntime?: boolean;                // default false — empirical per-stage check via input-tester
  codegenMode?: 'minimal' | 'productionish' | 'typed'; // default 'minimal'
  paginationMode?: 'limit_1000' | 'paginate_all';      // default 'limit_1000'
  emitAgentContract?: boolean;              // default true
  emitSignals?: boolean;                    // default true
  suggestionMode?: 'schema_only' | 'off';   // default 'schema_only'
  strictness?: 'default' | 'strict' | 'lenient'; // default 'default'
};
```
Output contract
```typescript
type Report = {
  recordType: 'report' | 'input-error' | 'error';
  oneLine: string;
  decisionPosture: 'ship_pipeline' | 'canary_recommended' | 'monitor_only' | 'no_call';
  decisionReason: string;
  decisionReadiness: 'actionable' | 'monitor' | 'insufficient-data';
  readinessScore: number;          // 0..1, gate-like
  confidenceScore: number;         // 0..1, harmonic mean of breakdown
  confidenceLevel: 'high' | 'medium' | 'low';
  confidencePolicyVersion: string;
  confidenceBreakdown: {
    resolutionCoverage: number;    // fraction of actors resolved
    mappingCoverage: number;       // fraction of transitions with mapping
    schemaCoverage: number;        // fraction with both input + dataset schemas
    metadataCoverage: number;      // fraction of fields with title/desc/example
    runtimeBoost: number;          // 1.0 if validateRuntime passed, else 0.5-0.6
  };
  confidencePenaltyReasons: string[];
  verdictReasonCodes: IssueCode[]; // see Failure modes
  decisionCards: Array<{           // 2-3 cards: fix-this-first / watch-out / cost-heads-up
    kind: string; title: string; shortReason: string;
    recommendation: string | null; urgency: string; stage: number | null;
  }>;
  schemaCompleteness: {
    inputSchemaQuality: 'good' | 'partial' | 'poor' | 'missing';
    datasetSchemaQuality: 'good' | 'partial' | 'poor' | 'missing';
    outputSchemaPresent: boolean;
    fieldDescriptionCoverage: number;
    exampleCoverage: number;
    typedFieldCoverage: number;
    agenticCoverage: number;
  };
  stages: number;
  valid: boolean;
  errors: string[];   // legacy mirror of blocking issues[].message
  warnings: string[]; // legacy mirror of advisory issues[].message
  issues: Array<{
    severity: 'blocking' | 'advisory' | 'info';
    code: IssueCode;
    stage: number | null; // 1-based or null for pipeline-level
    message: string;
    recommendation: string | null;
    evidence?: Record<string, unknown>;
  }>;
  fixPlan: Array<{
    order: number; stage: number | null;
    severity: string; code: IssueCode;
    action: string; why: string;
  }>;
  mappingSuggestions?: Array<{
    stage: number; targetField: string; suggestedSourceField: string;
    basis: 'schema_name_match' | 'schema_metadata_match';
    confidence: number; // 0..1
  }>;
  stageDetails: Array<{
    stage: number; alias: string | null;
    actor: string; actorId: string;
    reachable: boolean; defaultBuildResolved: boolean;
    ppePrice: number; memory: number; timeout: number;
    inputFields: string[]; outputFields: string[];
    inputSchemaQuality: string; datasetSchemaQuality: string; outputSchemaPresent: boolean;
    fieldDescriptionCoverage: number; exampleCoverage: number;
    mappingStatus: 'ok' | 'partial' | 'broken' | 'not_applicable';
    stageSignals: IssueCode[];
  }>;
  generatedCode: string; // empty when decisionPosture = 'no_call'
  codegenMode: 'minimal' | 'productionish' | 'typed';
  codegenAssumptions: string[];
  codegenWarnings: string[];
  costEstimate: {
    perRun: number; // sum of sub-actor PPE
    monthly100: number;
    monthly1000: number;
    excludesPlatformCompute: true;
  } | null;
  runtimeValidation?: { // present when validateRuntime = true
    allStagesOk: boolean;
    stagesChecked: number; stagesPassed: number; stagesFailed: number;
    perStage: Array<{
      stage: number; inputTesterOk: boolean;
      inputTesterErrors: string[]; inputTesterWarnings: string[];
      durationSeconds: number;
    }>;
  };
  agentContract?: { // emitted when emitAgentContract = true (default)
    safeToCall: boolean;
    recommendedAction: 'ship' | 'canary' | 'fix_mapping' | 'fix_schema' | 'do_not_call';
    safeInvocationMode: 'production' | 'canary_only' | 'not_ready';
    expectedOutputHandle: 'defaultDataset';
    requiredFixes: Array<{ stage: number | null; code: IssueCode }>;
    toolHint: string;
    postRunGuardSuggestion: string | null;
  };
  signals?: IssueCode[]; // emitted when emitSignals = true (default)
  evidenceCounts: {
    resolvedStages: number; totalStages: number;
    withInputSchema: number; withDatasetSchema: number;
    issuesBlocking: number; issuesAdvisory: number; issuesInfo: number;
    mappingSuggestionsEmitted: number;
  };
  builtAt: string; // ISO 8601
};
```
The report summary is mirrored to the key-value store under the `SUMMARY` key (decision scalars + schema completeness + cost).
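As a sketch of how the documented confidence pieces relate (the authoritative formula is whatever `confidencePolicyVersion` names), a harmonic mean of the breakdown components plus the documented 0.75 / 0.5 bands:

```typescript
// Harmonic mean punishes any single weak component harder than an
// arithmetic mean would, which is why one thin schema drags the score.
function harmonicMean(components: number[]): number {
  if (components.length === 0) return 0;
  if (components.some((c) => c <= 0)) return 0; // any zero component zeroes the mean
  return components.length / components.reduce((acc, c) => acc + 1 / c, 0);
}

// Bands documented for confidenceLevel: high >= 0.75, medium >= 0.5, else low.
function confidenceLevel(score: number): "high" | "medium" | "low" {
  if (score >= 0.75) return "high";
  if (score >= 0.5) return "medium";
  return "low";
}
```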
Failure modes
Every issue carries a stable code (member of `IssueCode`) and a severity. Codes are additive-only within a major version; `confidencePolicyVersion` bumps when the scoring formula changes.
| code | severity | fires when |
|---|---|---|
| `ACTOR_NOT_FOUND` | blocking | `/v2/acts/{id}` returns non-2xx under the caller's token |
| `NO_FIELD_MAPPING` | blocking | non-first stage has `fieldMapping = {}` or absent |
| `RUNTIME_VALIDATION_FAILED` | blocking | `validateRuntime=true` and ≥1 stage's `inputTesterOk = false` |
| `MAPPED_FIELD_NOT_IN_PREV_OUTPUT` | advisory | mapping source field absent from upstream's declared dataset schema (only when schema is declared) |
| `TARGET_FIELD_NOT_IN_INPUT_SCHEMA` | advisory | mapping target field absent from downstream's declared input schema |
| `FIRST_STAGE_HAS_MAPPING` | advisory | stage 1 has a `fieldMapping` (meaningless; no upstream) |
| `DUPLICATE_ACTOR_IN_PIPELINE` | advisory | same `actorId` appears in two or more stages |
| `PIPELINE_VERY_LARGE` | advisory | `stages.length > 20` |
| `INPUT_SCHEMA_THIN` | advisory | stage resolves but declares no input-schema properties |
| `DATASET_SCHEMA_MISSING` | advisory | stage declares no `actorDefinition.storages.dataset.fields` |
| `SCHEMA_AGENTIC_COVERAGE_LOW` | advisory | pipeline-wide `agenticCoverage` < 0.5 |
| `RUNTIME_VALIDATION_UNAVAILABLE` | advisory | `validateRuntime=true` and input-tester failed to complete |
| `OUTPUT_SCHEMA_MISSING` | info | stage declares no explicit output schema |
| `FIELD_METADATA_THIN` | info | input-schema fields have < 50% title/description coverage (suppressed in `strictness=lenient`) |
Error-branch records carry `recordType: 'input-error'` (fewer than 2 stages) or `recordType: 'error'` (catch-block) with `message`, `recommendation`, and `timestamp`. The actor never exits FAILED.
For AI agents
Pipeline Preflight is compatible with the Apify MCP server. Outputs are flat typed JSON; `agentContract.recommendedAction` is a stable enum consumers `switch()` on. Typical flow: propose `{stages: [...]}` → call Pipeline Preflight → branch on `decisionPosture` / `agentContract.recommendedAction`; if `no_call`, iterate `requiredFixes[]` and retry.
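A planner-side sketch of that loop's routing step, switching on the stable enum (the returned strings are illustrative, not part of any API):

```typescript
type RecommendedAction = "ship" | "canary" | "fix_mapping" | "fix_schema" | "do_not_call";

interface AgentContract {
  safeToCall: boolean;
  recommendedAction: RecommendedAction;
  requiredFixes: Array<{ stage: number | null; code: string }>;
}

// Exhaustive switch on the stable enum; new actions would be a compile
// error here, which is exactly what an agent integration wants.
function planNext(contract: AgentContract): string {
  switch (contract.recommendedAction) {
    case "ship":        return "call pipeline in production";
    case "canary":      return "call pipeline behind a canary";
    case "fix_mapping":
    case "fix_schema":  return `apply ${contract.requiredFixes.length} fix(es), then re-run preflight`;
    case "do_not_call": return "abort";
  }
}
```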
CI
- Fail: `decisionPosture === 'no_call'`
- Canary: `decisionPosture ∈ {'ship_pipeline', 'canary_recommended'}`
- Promote: `decisionPosture === 'ship_pipeline' && decisionReadiness === 'actionable'`
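The three gates can be collapsed into one function for a CI script; treating `monitor_only` as a failure is this sketch's assumption, not something the actor prescribes:

```typescript
type Posture = "ship_pipeline" | "canary_recommended" | "monitor_only" | "no_call";
type Readiness = "actionable" | "monitor" | "insufficient-data";

// Map the report onto the three CI gates listed above.
function ciGate(posture: Posture, readiness: Readiness): "fail" | "canary" | "promote" {
  if (posture === "no_call") return "fail";
  if (posture === "ship_pipeline" && readiness === "actionable") return "promote";
  if (posture === "ship_pipeline" || posture === "canary_recommended") return "canary";
  return "fail"; // monitor_only: assumed not CI-passable; tune to taste
}
```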
Usage
Pass a `stages` array. Each stage must have `actorId`; non-first stages must have `fieldMapping`. A 3-stage validation completes in ~30s and charges the flat $0.40 event price. `generatedCode` is the orchestrator; paste it into your own actor or orchestration script.
Input parameters (reference)
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `stages` | array | Yes | `[]` | Array of pipeline stage objects. Minimum 2 stages required. Each object: `actorId` (string, required), `fieldMapping` (object, optional), `memory` (number MB, optional), `timeout` (number seconds, optional). |
| `validateRuntime` | boolean | No | `false` | v3. When true, also call `actor-input-tester` on each stage with a synthetic input built from the field mapping, verifying the target actors' real input schemas would accept what the pipeline would send them. No actors are actually run; input-tester only validates shapes. Transforms Pipeline Preflight from "schemas line up on paper" to "schemas line up AND empirical input contracts hold". |
Stage object format
Each entry in the stages array follows this structure:
| Field | Type | Required | Description |
|---|---|---|---|
| `actorId` | string | Yes | Full actor identifier, e.g. `ryanclinton/google-maps-email-extractor` |
| `fieldMapping` | object | No | Maps this stage's input field names (keys) to the previous stage's output field names (values) |
| `memory` | number | No | Memory in MB for this stage (default: 512). Embedded in generated code. |
| `timeout` | number | No | Timeout in seconds for this stage (default: 120). Embedded in generated code. |
Input examples
Three-stage lead generation pipeline (Maps → Email → Verify):
```json
{
  "stages": [
    {
      "actorId": "ryanclinton/google-maps-email-extractor",
      "memory": 1024,
      "timeout": 300
    },
    {
      "actorId": "ryanclinton/email-pattern-finder",
      "fieldMapping": { "urls": "website" },
      "memory": 512,
      "timeout": 120
    },
    {
      "actorId": "ryanclinton/bulk-email-verifier",
      "fieldMapping": { "emails": "emailPattern" },
      "memory": 256,
      "timeout": 60
    }
  ]
}
```
Two-stage enrichment pipeline (Contact scraper → CRM push):
```json
{
  "stages": [
    {
      "actorId": "ryanclinton/website-contact-scraper",
      "memory": 512,
      "timeout": 120
    },
    {
      "actorId": "ryanclinton/hubspot-lead-pusher",
      "fieldMapping": {
        "email": "email",
        "name": "contactName",
        "company": "companyName"
      },
      "memory": 256,
      "timeout": 60
    }
  ]
}
```
Reachability-only smoke test (will return `no_call` due to missing mappings):
Use this shape to confirm both actors resolve without committing to a mapping yet. Every non-first stage must have a `fieldMapping` for the preflight to return `ship_pipeline` or `canary_recommended`.
```json
{
  "stages": [
    { "actorId": "ryanclinton/website-tech-stack-detector" },
    { "actorId": "ryanclinton/b2b-lead-qualifier" }
  ]
}
```
With empirical runtime validation (calls input-tester per stage):
```json
{
  "stages": [
    { "actorId": "ryanclinton/website-contact-scraper" },
    {
      "actorId": "ryanclinton/hubspot-lead-pusher",
      "fieldMapping": { "email": "email", "name": "contactName" }
    }
  ],
  "validateRuntime": true
}
```
Input tips
- Define field mappings for every non-first stage: omitting `fieldMapping` is a blocking issue (`NO_FIELD_MAPPING`). Without it, the downstream actor receives `{ data: [...] }` instead of its declared input shape and will reject the call at runtime.
- Use the full actor identifier: always use `username/actor-name` format (e.g., `ryanclinton/website-contact-scraper`), not just the actor name. The actor lookup will fail without the username prefix.
- Check field names against actor schemas first: use Schema Registry or Schema Diff to confirm the exact field names before building the pipeline.
- Start with 2 stages: validate the core connection first, then extend to 3 or 4 stages once the first pair validates cleanly.
- Set realistic memory values: the generated code uses the `memory` value you specify. Check each actor's recommended memory in the Apify Store before setting these values.
Output example
```json
{
  "recordType": "report",
  "oneLine": "Pipeline canary recommended: 3 stages, 1 advisory, est $0.26/run (medium confidence).",
  "decisionPosture": "canary_recommended",
  "decisionReason": "1 advisory; runtime validation not requested — wire it up behind a canary before trusting it in production.",
  "decisionReadiness": "monitor",
  "confidenceScore": 0.68,
  "confidenceLevel": "medium",
  "confidenceBreakdown": {
    "resolutionCoverage": 1.0,
    "mappingCoverage": 0.67,
    "schemaCoverage": 0.83,
    "runtimeBoost": 0.5
  },
  "verdictReasonCodes": ["MAPPED_FIELD_NOT_IN_PREV_OUTPUT"],
  "decisionCards": [
    {
      "kind": "watch-out",
      "title": "Stage 3: mapped field 'emailPattern' not in ryanclinton/email-pattern-finder's output schema",
      "shortReason": "Stage 3: mapped field 'emailPattern' not in ryanclinton/email-pattern-finder's output schema",
      "recommendation": "Check ryanclinton/email-pattern-finder's dataset schema; the field name may be different (e.g. 'markdown' vs 'text').",
      "urgency": "advisory",
      "stage": 3
    },
    {
      "kind": "cost-heads-up",
      "title": "Estimated $0.26 per pipeline run",
      "shortReason": "3 stages, aggregate PPE of sub-actors",
      "recommendation": "Does not include platform compute (memory × runtime). Check each sub-actor's pricing for the full picture.",
      "urgency": "info",
      "stage": null
    }
  ],
  "stages": 3,
  "valid": true,
  "errors": [],
  "warnings": ["Stage 3: mapped field 'emailPattern' not in ryanclinton/email-pattern-finder's output schema"],
  "issues": [
    {
      "severity": "advisory",
      "code": "MAPPED_FIELD_NOT_IN_PREV_OUTPUT",
      "stage": 3,
      "message": "Stage 3: mapped field 'emailPattern' not in ryanclinton/email-pattern-finder's output schema",
      "recommendation": "Check ryanclinton/email-pattern-finder's dataset schema; the field name may be different (e.g. 'markdown' vs 'text')."
    }
  ],
  "stageDetails": [
    {
      "stage": 1,
      "actor": "ryanclinton/google-maps-email-extractor",
      "actorId": "ryanclinton/google-maps-email-extractor",
      "ppePrice": 0.15,
      "memory": 1024,
      "timeout": 300,
      "outputFields": ["businessName", "website", "email", "phone", "address", "rating", "reviewCount"],
      "inputFields": ["searchQuery", "maxResults", "country", "language", "proxyConfig"]
    },
    {
      "stage": 2,
      "actor": "ryanclinton/email-pattern-finder",
      "actorId": "ryanclinton/email-pattern-finder",
      "ppePrice": 0.10,
      "memory": 512,
      "timeout": 120,
      "outputFields": ["domain", "emailPattern", "confidence", "examples"],
      "inputFields": ["urls", "maxResults", "timeout"]
    },
    {
      "stage": 3,
      "actor": "ryanclinton/bulk-email-verifier",
      "actorId": "ryanclinton/bulk-email-verifier",
      "ppePrice": 0.005,
      "memory": 256,
      "timeout": 60,
      "outputFields": ["email", "valid", "mxCheck", "smtpCheck", "score"],
      "inputFields": ["emails", "verifySmtp", "timeout"]
    }
  ],
  "generatedCode": "import { Actor } from 'apify';\n\nActor.main(async () => {\n // Stage 1: ryanclinton/google-maps-email-extractor\n const input = await Actor.getInput();\n const run1 = await Actor.call('ryanclinton/google-maps-email-extractor', input, { memory: 1024, timeout: 300 });\n\n // Stage 2: ryanclinton/email-pattern-finder\n const ds1 = await Actor.apifyClient.dataset(run1.defaultDatasetId).listItems();\n const run2 = await Actor.call('ryanclinton/email-pattern-finder', { urls: ds1.items.map(i => i.website) }, { memory: 512, timeout: 120 });\n\n // Stage 3: ryanclinton/bulk-email-verifier\n const ds2 = await Actor.apifyClient.dataset(run2.defaultDatasetId).listItems();\n const run3 = await Actor.call('ryanclinton/bulk-email-verifier', { emails: ds2.items.map(i => i.emailPattern) }, { memory: 256, timeout: 60 });\n\n // Collect final output\n const finalDs = await Actor.apifyClient.dataset(run3.defaultDatasetId).listItems();\n await Actor.pushData(finalDs.items);\n});",
  "costEstimate": {
    "perRun": 0.26,
    "monthly100": 26.00,
    "monthly1000": 260.00,
    "excludesPlatformCompute": true
  },
  "builtAt": "2026-03-20T14:32:11.000Z"
}
```
Output fields (reference)
| Field | Type | Description |
|---|---|---|
| `recordType` | string | Discriminator: `"report"` for the main analysis, `"input-error"` for <2-stage input rejections, `"error"` for catch-block records. Filter downstream with `WHERE recordType = 'report'`. |
| `oneLine` | string | Single-sentence verdict safe to paste into Slack, email subjects, or dashboard tiles. |
| `decisionPosture` | string | Routable verdict: `ship_pipeline` (valid + runtime-validated + zero advisories), `canary_recommended` (valid but unverified), `monitor_only` (valid but schema advisories), `no_call` (blocking issues present). Branch on this, not on prose. |
| `decisionReason` | string | One sentence explaining why the posture landed where it did. |
| `decisionReadiness` | string | `actionable` / `monitor` / `insufficient-data`. Automation should only execute pipelines with `actionable` readiness. |
| `readinessScore` | number | 0–1 gate-like score. 1.0 for `ship_pipeline`, ~0.85 for `canary_recommended`, ~0.6 for `monitor_only`, 0 when any blocking issue is present. |
| `confidenceScore` | number | 0–1 harmonic mean of the five `confidenceBreakdown` components. |
| `confidenceLevel` | string | `high` (≥0.75) / `medium` (≥0.5) / `low` (<0.5). Use the level for UI filtering, the score for sorting. |
| `confidencePolicyVersion` | string | Version tag for the scoring formula. Bumped when components, weights, or bands change. |
| `confidenceBreakdown` | object | Per-component scores (0–1): `resolutionCoverage`, `mappingCoverage`, `schemaCoverage`, `metadataCoverage`, `runtimeBoost`. |
| `confidencePenaltyReasons` | string[] | Plain-English reasons explaining why confidence is below 1.0. |
| `schemaCompleteness` | object | Pipeline-wide schema quality: `inputSchemaQuality`, `datasetSchemaQuality` (each good/partial/poor/missing), `outputSchemaPresent`, `fieldDescriptionCoverage`, `exampleCoverage`, `typedFieldCoverage`, `agenticCoverage`. |
| `fixPlan` | object[] | Ordered remediation: blocking first, then advisory, then info. Each entry `{order, stage, action, why, severity, code}`. Follow top-to-bottom. |
| `mappingSuggestions` | object[] | Present only when `NO_FIELD_MAPPING` fires and both schemas are declared. Each entry `{stage, targetField, suggestedSourceField, basis, confidence}`. Never apply without review. |
| `agentContract` | object | `{safeToCall, recommendedAction, safeInvocationMode, expectedOutputHandle, requiredFixes[{stage, code}], toolHint, postRunGuardSuggestion}`. Emitted when `emitAgentContract=true` (default). |
| `signals` | string[] | Fleet-consumable signal codes. Emitted when `emitSignals=true` (default). |
| `codegenMode` | string | Mirrors the input mode: `minimal` / `productionish` / `typed`. |
| `codegenAssumptions` | string[] | Plain-English assumptions baked into `generatedCode` (e.g. pagination mode). |
| `codegenWarnings` | string[] | Per-stage warnings about the generated code (e.g. no dataset schema declared). |
| `evidenceCounts` | object | Counts backing the verdict: `resolvedStages`, `totalStages`, `withInputSchema`, `withDatasetSchema`, `issuesBlocking`, `issuesAdvisory`, `issuesInfo`, `mappingSuggestionsEmitted`. |
| `verdictReasonCodes` | string[] | Stable machine-readable codes present on this report. Additive-only within a major version. |
| `decisionCards` | object[] | 2–3 scannable cards: `{kind, title, shortReason, recommendation, urgency, stage}`. Kinds: `fix-this-first`, `watch-out`, `cost-heads-up`. |
| `issues` | object[] | Structured issue list: `{severity, code, stage, message, recommendation}`. Branch on `code`, display `message`, act on `recommendation`. |
| `stages` | number | Total number of pipeline stages validated. |
| `valid` | boolean | `true` if no blocking issues. |
| `errors` | string[] | Blocking issue messages (mirrors `issues[].message` where `severity='blocking'`). Legacy shape kept for dashboard consumers. |
| `warnings` | string[] | Advisory issue messages (mirrors `issues[].message` where `severity='advisory'`). Legacy shape kept for dashboard consumers. |
| `stageDetails` | object[] | Per-stage details array (see nested fields below) |
| `stageDetails[].stage` | number | Stage index (1-based) |
| `stageDetails[].actor` | string | Resolved actor name in `username/name` format |
| `stageDetails[].actorId` | string | Original actor ID as provided in the input |
| `stageDetails[].ppePrice` | number | PPE price per event in USD from the Apify API |
| `stageDetails[].memory` | number | Memory in MB (from input or default 512) |
| `stageDetails[].timeout` | number | Timeout in seconds (from input or default 120) |
| `stageDetails[].outputFields` | string[] | Field names from the actor's dataset storage schema |
| `stageDetails[].inputFields` | string[] | Field names from the actor's input schema |
| `generatedCode` | string | Complete TypeScript `Actor.main()` orchestration script |
| `costEstimate.perRun` | number | Sum of all stage PPE prices, rounded to 2 decimal places |
| `costEstimate.monthly100` | number | Projected monthly cost at 100 runs |
| `costEstimate.monthly1000` | number | Projected monthly cost at 1,000 runs |
| `runtimeValidation` | object | v3. Present when `validateRuntime: true`. Contains `allStagesOk`, `stagesChecked`, `stagesPassed`, `stagesFailed`, and `perStage[]` with per-stage `inputTesterOk`, `inputTesterErrors[]`, `inputTesterWarnings[]`, and `durationSeconds`. If any stage fails empirical input validation, `valid` in the main report is forced to `false`. |
| `builtAt` | string | ISO 8601 timestamp of the validation run |
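The cost fields compose by simple arithmetic; a sketch mirroring the documented semantics (sum of per-stage PPE prices, rounded to 2 decimals, platform compute excluded):

```typescript
// Build a costEstimate from per-stage PPE prices (stageDetails[].ppePrice).
// perRun is rounded to 2 decimal places; monthly figures scale it linearly.
function costEstimate(ppePrices: number[]) {
  const perRun = Math.round(ppePrices.reduce((sum, p) => sum + p, 0) * 100) / 100;
  return {
    perRun,
    monthly100: perRun * 100,
    monthly1000: perRun * 1000,
    excludesPlatformCompute: true as const, // memory × runtime billed separately
  };
}
```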
How much does it cost to build an actor pipeline?
Pipeline Preflight uses pay-per-event pricing: you pay $0.40 per pipeline build. Platform compute costs (memory × runtime) are separate and charged on top of the event price by Apify; they are not included in `costEstimate.perRun`. A typical 3-stage run uses 256 MB for under 30 seconds and adds a few fractions of a cent to the event price.
| Scenario | Pipelines | Cost per build | Total cost |
|---|---|---|---|
| Quick test | 1 | $0.40 | $0.40 |
| Design sprint | 10 | $0.40 | $4.00 |
| Weekly CI validation | 50 | $0.40 | $20.00 |
| Daily automated checks | 200 | $0.40 | $80.00 |
| Continuous integration suite | 1,000 | $0.40 | $400.00 |
You can set a maximum spending limit per run to control costs. The actor stops when your budget is reached.
Comparable pipeline design tools like Zapier ($19–$69/month) and Make ($9–$29/month) charge monthly subscriptions and do not generate TypeScript code or validate Apify actor schemas. With Pipeline Preflight, most teams spend $2–$10/month validating pipelines on demand, with no subscription.
Build an actor pipeline using the API
Python
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("ryanclinton/actor-pipeline-builder").call(
    run_input={
        "stages": [
            {
                "actorId": "ryanclinton/google-maps-email-extractor",
                "memory": 1024,
                "timeout": 300,
            },
            {
                "actorId": "ryanclinton/email-pattern-finder",
                "fieldMapping": {"urls": "website"},
                "memory": 512,
                "timeout": 120,
            },
            {
                "actorId": "ryanclinton/bulk-email-verifier",
                "fieldMapping": {"emails": "emailPattern"},
                "memory": 256,
                "timeout": 60,
            },
        ]
    }
)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"Valid: {item['valid']} | Stages: {item['stages']} | Cost/run: ${item['costEstimate']['perRun']}")
    if item.get("warnings"):
        for w in item["warnings"]:
            print(f"  Warning: {w}")
    print("\n--- Generated Code ---")
    print(item["generatedCode"])
```
JavaScript
```javascript
import { ApifyClient } from "apify-client";

const client = new ApifyClient({ token: "YOUR_API_TOKEN" });

const run = await client.actor("ryanclinton/actor-pipeline-builder").call({
    stages: [
        { actorId: "ryanclinton/google-maps-email-extractor", memory: 1024, timeout: 300 },
        { actorId: "ryanclinton/email-pattern-finder", fieldMapping: { urls: "website" }, memory: 512, timeout: 120 },
        { actorId: "ryanclinton/bulk-email-verifier", fieldMapping: { emails: "emailPattern" }, memory: 256, timeout: 60 }
    ]
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
    console.log(`Valid: ${item.valid} | Stages: ${item.stages} | Cost/run: $${item.costEstimate.perRun}`);
    item.warnings?.forEach(w => console.log(`  Warning: ${w}`));
    console.log("\n--- Generated Code ---\n" + item.generatedCode);
}
```
cURL
```bash
# Start the actor run
curl -X POST "https://api.apify.com/v2/acts/ryanclinton~actor-pipeline-builder/runs?token=YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "stages": [
      {"actorId": "ryanclinton/google-maps-email-extractor", "memory": 1024, "timeout": 300},
      {"actorId": "ryanclinton/email-pattern-finder", "fieldMapping": {"urls": "website"}, "memory": 512, "timeout": 120}
    ]
  }'

# Fetch results (replace DATASET_ID from the run response)
curl "https://api.apify.com/v2/datasets/DATASET_ID/items?token=YOUR_API_TOKEN&format=json"
```
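Once the report item is fetched, automation should branch only on `decisionPosture`, as the contract above requires. A minimal routing sketch (the posture values come from this actor's output contract; the action strings returned are illustrative, not part of the API):

```python
# Route a Pipeline Preflight report item to a next action.
# decisionPosture values are from the actor's output contract;
# the returned action strings are illustrative placeholders.
def route(report: dict) -> str:
    posture = report.get("decisionPosture")
    if posture == "ship_pipeline":
        return "deploy"             # safe to schedule the generated orchestrator
    if posture == "canary_recommended":
        return "deploy_canary"      # run on a small slice first
    if posture == "monitor_only":
        return "deploy_with_alerts"
    return "block"                  # no_call or unknown: fix the pipeline first

print(route({"decisionPosture": "ship_pipeline", "valid": True}))  # deploy
print(route({"decisionPosture": "no_call", "valid": False}))       # block
```

Note that `valid`, `oneLine`, and `decisionReason` are deliberately ignored for control flow; they are for humans and logs.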
How it works
- Resolve each stage. Parallel `GET /v2/acts/{id}/builds/default` with `Promise.allSettled` and a 30 s `AbortSignal.timeout`. Falls back to `GET /v2/acts/{id}` → `taggedBuilds.latest || any` → `GET /v2/acts/{id}/builds/{buildId}` when `default` is unavailable. Retries on 429/5xx with exponential backoff (500 ms, 1 s, 2 s). Failures become `ACTOR_NOT_FOUND`, not thrown exceptions.
- Parse schemas. `buildData.inputSchema` (a JSON string) → input field names plus per-field title/description/example coverage. `buildData.actorDefinition.storages.dataset.fields` → output field names, types, and metadata coverage.
- Type-check transitions. For each non-first stage, check every `fieldMapping[inputField] = outputField` entry against both schemas. Missing field mappings are blocking; field-name mismatches are advisory.
- Score completeness. Per-stage and pipeline-wide schema grades (`good`/`partial`/`poor`/`missing`), metadata coverage, and `agenticCoverage` = avg(schemaCoverage, metadataCoverage).
- Compute decision. `decisionPosture` is derived from `(valid, advisoryCount, runtimeValidated)`. `confidenceScore` is the harmonic mean of `confidenceBreakdown`. `readinessScore` is an independent gate-like score (1.0 ship / 0.85 canary / 0.6 monitor / 0 no_call).
- Generate code. Emits `Actor.main()` with an `Actor.call()` per stage, `listItems({ limit: 1000 })` or a pagination helper, and field-mapping projections. Assumptions and warnings are captured in `codegenAssumptions[]` / `codegenWarnings[]`.
- Sum cost. `costEstimate.perRun` = Σ `pricingInfos[last].pricingPerEvent.actorChargeEvents[0].eventPriceUsd` across resolved stages. `excludesPlatformCompute: true` is explicit.
- (Optional) Empirical runtime check. `validateRuntime: true` calls `actor-input-tester` per stage with a synthesized input built from the declared mapping. A 5-minute wall-clock cap via `Promise.race`, and `Promise.allSettled` so one hanging stage doesn't crash the batch.
- Emit. Pushes a single `recordType: 'report'` item, writes `SUMMARY` to KV, charges `pipeline-build` if `isPPE`, and exits SUCCEEDED.
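The confidence arithmetic in the decision step can be sketched as follows. The breakdown keys shown are hypothetical placeholders; the actual `confidenceBreakdown` fields may differ. The point of a harmonic mean is that one weak component drags the score down far more than an arithmetic mean would:

```python
# Sketch of a harmonic-mean confidenceScore over a breakdown dict.
# Breakdown keys ("schema", "mapping", "metadata") are hypothetical;
# only the harmonic-mean aggregation mirrors the documented behavior.
def confidence_score(breakdown: dict) -> float:
    values = [v for v in breakdown.values() if v > 0]
    if len(values) < len(breakdown):
        return 0.0  # any zero component zeroes a harmonic mean
    return len(values) / sum(1.0 / v for v in values)

print(round(confidence_score({"schema": 0.9, "mapping": 0.9, "metadata": 0.9}), 3))  # 0.9
print(round(confidence_score({"schema": 0.9, "mapping": 0.9, "metadata": 0.3}), 3))  # 0.54
```

Compare the two runs: dropping a single component from 0.9 to 0.3 pulls the score to 0.54, well below the 0.7 an arithmetic mean would report.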
Limitations
- Schema-declaration dependence. If an actor declares no `dataset.fields`, output-field validation degrades to a `DATASET_SCHEMA_MISSING` advisory. Generated code may still work at runtime; it just wasn't verifiable at design time.
- Design-time only. An actor can declare one shape in its schema and emit another at runtime. Enable `validateRuntime: true` for empirical per-stage input checks, but even that doesn't catch output-shape drift.
- Token scope. Private actors outside the caller's token scope return `ACTOR_NOT_FOUND`.
- Flat mappings only. `fieldMapping` is `{ string: string }`. Nested paths and type coercion are out of scope; write them manually on top of the generated code.
- Cost excludes platform compute. `costEstimate.perRun` sums PPE event prices; memory × runtime compute is not modelled.
- ≥2 stages required. For single-actor validation, use Input Tester.
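The flat-mapping limitation is easiest to see in how the generated orchestrator projects one stage's output into the next stage's input. A sketch of that projection (the helper name and sample records are hypothetical; only the flat `{ inputField: outputField }` shape is from the contract above):

```python
# Sketch of the flat { inputField: outputField } projection applied
# between stages. Helper name and sample records are hypothetical.
# Nested paths such as "contact.email" are NOT resolved, which is why
# nested mappings must be hand-written on top of the generated code.
def project(items: list, field_mapping: dict) -> list:
    return [
        {input_field: item.get(output_field)
         for input_field, output_field in field_mapping.items()}
        for item in items
    ]

upstream = [{"website": "https://example.com", "title": "Example"}]
print(project(upstream, {"urls": "website"}))  # [{'urls': 'https://example.com'}]
```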
Troubleshooting
Stage returns "Actor not found" error despite the actor existing. Confirm that the actorId uses the full username/actor-name format (e.g., ryanclinton/website-contact-scraper, not just website-contact-scraper). Also verify that your Apify token has read access to the actor — private actors belonging to other users cannot be fetched.
All output fields appear empty for a stage. The actor's latest build does not declare a dataset schema in actorDefinition.storages.dataset.fields. This is common for older actors. Pipeline Preflight will issue a warning but still generate code. Use the Apify Console to inspect the actor's actual output dataset and confirm field names manually before relying on the mapping.
Generated code runs but no data appears in the final dataset. This typically means a field mapping references a field name that does not match the actual runtime output. Check the warnings array in the validation report for mapping issues. For actors where the schema is not declared, run the upstream actor manually and inspect its output dataset to get the actual field names.
Validation warnings on every stage transition. If all stages produce warnings about missing schemas, the actors in your pipeline are likely older and do not expose build-time dataset schemas. The validation will still succeed (valid: true) and the generated code will still run — the warnings indicate reduced validation confidence, not a broken pipeline.
Run completes instantly with valid: false and no stageDetails. At least one actor ID was not found. Check each actorId in the errors array, correct the identifier, and re-run.
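The failure modes above can be triaged mechanically. A sketch (the error-entry shape `{"code": ..., "actorId": ...}` is an assumption for illustration; inspect your own report items for the exact structure):

```python
# Triage a Pipeline Preflight report for the common failure modes above.
# The error-entry shape ({"code": ..., "actorId": ...}) is an assumed
# illustration; check your own report items for the exact fields.
def triage(report: dict) -> str:
    if not report.get("valid") and not report.get("stageDetails"):
        missing = [e.get("actorId", "?") for e in report.get("errors", [])
                   if e.get("code") == "ACTOR_NOT_FOUND"]
        return f"fix actor IDs: {', '.join(missing)}" if missing else "see errors[]"
    if report.get("warnings"):
        return "valid with warnings: verify field names manually"
    return "ok"

sample = {"valid": False, "errors": [{"code": "ACTOR_NOT_FOUND", "actorId": "user/missing-actor"}]}
print(triage(sample))  # fix actor IDs: user/missing-actor
```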
Responsible use
- This actor only accesses actor metadata and schema information via the Apify API.
- Only actors that your API token has permission to read will be processed.
- Do not use this actor to harvest pricing or schema data from competitor actors at scale.
- Generated code uses `Actor.call()`, which triggers billable runs on the target actors; review cost estimates before deploying generated pipelines.
FAQ
How does Pipeline Preflight work? It validates that a chain of Apify actors composes correctly. It fetches each stage's declared input and dataset schemas from the Apify API, type-checks every field mapping against both schemas, and returns a decision (ship_pipeline / canary_recommended / monitor_only / no_call) plus a TypeScript orchestrator.
How many stages can an actor pipeline have? There is no hard maximum imposed by Pipeline Preflight; pipelines with 2 to 6+ stages have been validated successfully. Because all actor schemas are fetched in parallel, total validation time is governed by the slowest individual lookup rather than the stage count. Practical limits come from Apify API rate limits and the 512 MB default memory allocation of Pipeline Preflight itself.
Does Pipeline Preflight run any of the actors in my pipeline? No. Pipeline Preflight only reads actor metadata and schemas from the Apify API. It never calls Actor.call() on your pipeline actors during validation. No credits are consumed by the target actors during a build.
How accurate is the field mapping validation? Accuracy depends on whether each actor publishes its dataset schema in its build metadata. Actors that declare actorDefinition.storages.dataset.fields are validated fully. Actors that define output fields at runtime receive warnings instead of confirmed passes. For best results, combine with Schema Registry to inspect field names before building.
Can I validate pipelines that include actors from other Apify users? Yes, as long as the actors are public (or your token has access to them). The actor ID format is always username/actor-name.
How long does a typical pipeline validation take? Most 2-4 stage pipelines complete in under 30 seconds. Each actor lookup has a 30-second timeout, and all stages are fetched concurrently, so total time is determined by the slowest individual lookup — typically 5-15 seconds for a 3-stage pipeline.
Is the generated TypeScript code production-ready? The generated code is a correct functional starting point for the described pipeline. It needs error handling, pagination for large datasets, logging, and environment-specific configuration before production deployment. Treat it as a reference implementation, not a finished product.
Can I use Pipeline Preflight to validate pipelines with non-Apify actors? No. Pipeline Preflight reads schemas exclusively from the Apify API. All stages must reference valid Apify actor IDs.
How is Pipeline Preflight different from Zapier or Make? Zapier and Make are no-code runtime orchestration tools. Pipeline Preflight is a design-time validation tool that generates Apify-native TypeScript code. It does not execute workflows — it validates that actor schemas connect correctly and produces the code for you to deploy yourself via the Apify platform.
Can I schedule Pipeline Preflight to run automatically? Yes. Use Apify's built-in scheduler to run Pipeline Preflight on a cron schedule — for example, nightly or after each actor deployment. This acts as a schema regression test for your pipeline.
What happens if an actor has no PPE pricing? The stage's ppePrice will be 0 and the cost estimate will exclude that stage. This occurs for actors using compute-unit pricing rather than pay-per-event. The pipeline will still validate and generate code normally.
Can I use the output with the Apify MCP server? Yes. The structured JSON output and the generatedCode string can be passed to an LLM via the Apify MCP server for code review, documentation generation, or further pipeline design assistance.
Is it legal to read actor schemas and pricing via the Apify API? Yes. Pipeline Preflight uses the official Apify API with your own API token to read publicly available metadata for actors you have permission to access. This is standard platform API usage, not scraping.
Sibling actors in the same fleet
Pipeline Preflight is one step in a larger backend/DevOps fleet. When your need falls outside pre-run pipeline-design validation, route to the right tool:
| Need | Use this instead |
|---|---|
| Validate a single actor's input JSON against its declared schema | Input Tester |
| Diff two versions of an actor's schema to detect breaking changes | Schema Validator |
| Score an actor's overall quality (runs, reviews, revenue, issues) | Quality Monitor |
| Find actors by input/output shape in your account | Schema Registry |
| Run a real integration test on a single actor with a known-good input | Test Runner |
| Actually execute the generated pipeline (Pipeline Preflight only generates the code) | Copy generatedCode into your own orchestrator actor, or use Cloud Staging for a dry-run harness |
| Detect silent output-quality regressions after a live pipeline runs | Compliance Scanner |
Pipeline Preflight is a contract validator for Apify pipelines — it ensures that actor inputs and outputs align across stages. Once you have the generatedCode and a ship_pipeline (or canary_recommended) verdict, the orchestrator is yours to schedule, guard, and instrument with the sibling actors above.
Help us improve
If you encounter issues, you can help us debug faster by enabling run sharing in your Apify account:
- Go to Account Settings > Privacy
- Enable Share runs with public Actor creators
This lets us see your run details when something goes wrong, so we can fix issues faster. Your data is only visible to the actor developer, not publicly.
Support
Found a bug or have a feature request? Open an issue in the Issues tab on this actor's page. For custom solutions or enterprise integrations, reach out through the Apify platform.