Pricing

from $5.00 / 1,000 completed jobs

Custom Run Async Endpoint

Custom Run Async Endpoint runs an async HTTP API inside an Apify Actor container. Submit jobs, poll status, wait for results, control concurrency, protect routes with Bearer auth, and save completed job outputs to the dataset.

Developer

Sovanza

Async API Runner – Custom Endpoint Execution & Background Processing

Run asynchronous workloads behind a hosted HTTP API on your Actor’s container web server URL. Submit jobs, poll for status, or wait longer than typical synchronous proxies allow — ideal when you need queued execution, controlled concurrency, and structured results in the Apify dataset without building extra infrastructure.

What is Custom Run Async Endpoint and How Does It Work?

Custom Run Async Endpoint is an Apify Actor that runs a long-lived Express HTTP server inside your Actor container. Clients call the Container URL to enqueue work; an internal worker pool executes tasks with a configurable concurrency cap; each finished job is written as one dataset row for audit and downstream automation.

This Actor is designed for:

Developers building async handoffs on Apify
Automation engineers who need 202 Accepted job submission
Teams that want dataset-auditable task outcomes
Integration prototypes with echo, sleep, or guarded fetch workloads

What this repo is: src/main.js implements job enqueue, an internal queue (maxConcurrentTasks), optional Bearer auth, blocking wait endpoints, and dataset export per job.

What it is not: a universal “retry any REST method with arbitrary JSON/form bodies” integration engine. Tasks are echo, sleep, or (optionally) fetch wrappers — extend the codebase or call from your own service for full REST orchestration.

Why Use This Async API Runner?

Use this Actor when:

Upstream gateways time out before your work finishes
You need async submission with optional blocking wait on the same run
You want to cap parallel work with maxConcurrentTasks
You need a dataset trail of every completed or failed job
You already run on Apify and want a container URL without separate hosting

➡️ Best for controlled, auditable background-style processing inside an Apify run — not a detached serverless fleet.

Core Capabilities

Capability	Description
Async submission	`POST /tasks` returns `jobId` immediately (`202`); work continues if the client disconnects
Concurrency pool	`maxConcurrentTasks` (1–50) limits parallel workers
Blocking waits	`POST /tasks/:jobId/wait` and `POST /run-and-wait` poll until terminal status or timeout
Optional Bearer auth	`accessToken` in input → protected routes require `Authorization: Bearer …` or `?token=` (except `/health`)
Guarded fetch	Outbound HTTP `fetch` tasks only when `allowFetchTasks` is `true` (off by default)
Dataset audit	Each terminal job pushes one row with status, task, result/error, timestamps

Retries: automatic HTTP retry/backoff is not implemented in main.js. Add retry logic in clients calling this API or extend the worker.
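
Since retries are left to the caller, a client can wrap its requests in a small helper. The following is a minimal sketch, not part of the Actor: `withRetry`, `attempts`, and `baseDelayMs` are illustrative names, and the exponential backoff schedule is an assumption rather than a recommendation from the author.

```javascript
// Client-side retry sketch. The Actor does not retry, so the caller wraps its own requests.
// `withRetry`, `attempts`, and `baseDelayMs` are hypothetical names.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry(fn, { attempts = 3, baseDelayMs = 250 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < attempts; attempt += 1) {
    try {
      // `fn` receives the zero-based attempt number and may return a promise.
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) {
        // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on.
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

Each retry that resubmits `POST /tasks` creates a new job with its own dataset row, so this pattern fits best when the task is safe to repeat.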

Execution Flow

Enable Container web server in Actor settings; Apify assigns ACTOR_WEB_SERVER_PORT and a public Container URL.
HTTP server listens on that port (default 4321 locally).
The client sends POST /tasks with task JSON; the job starts as pending, moves to running when a worker picks it up, and ends as completed or failed.
Terminal jobs call Actor.pushData with jobId, task, result/error, timestamps.
Wait endpoints poll every ~200 ms until done or timeout (clamped by input).
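
The job lifecycle above can be pictured with a small in-memory queue. This is a sketch under assumptions and not the code in src/main.js: `createQueue`, `enqueue`, `execute`, and `onTerminal` are hypothetical names, and a counter stands in for the UUID job IDs the Actor assigns.

```javascript
// Illustrative in-memory queue with a concurrency cap (not the Actor's implementation).
function createQueue({ maxConcurrentTasks = 5, execute, onTerminal = () => {} }) {
  const jobs = new Map();
  const pending = [];
  let active = 0;
  let nextId = 1; // simplification: the Actor assigns UUIDs

  function drain() {
    // Start queued jobs until the concurrency cap is reached.
    while (active < maxConcurrentTasks && pending.length > 0) {
      const job = pending.shift();
      active += 1;
      job.status = 'running';
      job.startedAt = new Date().toISOString();
      Promise.resolve()
        .then(() => execute(job.task))
        .then(
          (result) => { job.status = 'completed'; job.result = result; },
          (error) => {
            job.status = 'failed';
            job.error = error instanceof Error ? error.message : String(error);
          },
        )
        .then(() => {
          job.finishedAt = new Date().toISOString();
          active -= 1;
          drain(); // a slot is free, so pick up the next pending job
          try {
            onTerminal(job); // the point where a dataset row would be pushed
          } catch (err) {
            console.error('onTerminal failed', err);
          }
        });
    }
  }

  function enqueue(task) {
    const job = {
      jobId: String(nextId++), status: 'pending', task, result: null, error: null,
      createdAt: new Date().toISOString(), startedAt: null, finishedAt: null,
    };
    jobs.set(job.jobId, job);
    pending.push(job);
    drain();
    return job;
  }

  return {
    enqueue,
    get: (jobId) => jobs.get(jobId),
    stats: () => ({ activeTasks: active, queued: pending.length }),
  };
}
```

Because the queue lives in process memory, anything still pending or running disappears if the process stops, which matches the in-memory limitation noted later in this README.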

HTTP Routes

Method	Path	Description
GET	`/health`	Liveness `{ ok, service, activeTasks, queued }` — no auth
GET	`/`	Discovery JSON (routes + optional `containerUrl`)
POST	`/tasks`	Body = task JSON → 202 `{ jobId, getUrl, waitUrl }`
GET	`/tasks/:jobId`	Full job snapshot
POST	`/tasks/:jobId/wait`	Block until terminal or timeout (`timeoutMs` query/body)
POST	`/run-and-wait`	Enqueue task + block until terminal or timeout
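
A thin client for the routes in the table might look like the sketch below. It assumes Node 18+ for the global `fetch`. `createClient` and `fetchImpl` are hypothetical names, and passing `timeoutMs` to `/run-and-wait` as a query parameter is an assumption, since the table only documents query/body placement for `/tasks/:jobId/wait`.

```javascript
// Client sketch for the documented routes. Response shapes beyond
// { jobId, getUrl, waitUrl } are not assumed here.
function createClient(baseUrl, { token, fetchImpl = fetch } = {}) {
  const headers = { 'Content-Type': 'application/json' };
  if (token) headers.Authorization = `Bearer ${token}`;

  async function request(method, path, body) {
    const response = await fetchImpl(new URL(path, baseUrl), {
      method,
      headers,
      body: body === undefined ? undefined : JSON.stringify(body),
    });
    if (!response.ok) {
      throw new Error(`${method} ${path} failed with HTTP ${response.status}`);
    }
    return response.json();
  }

  return {
    health: () => request('GET', '/health'),
    submit: (task) => request('POST', '/tasks', task),
    getJob: (jobId) => request('GET', `/tasks/${jobId}`),
    wait: (jobId, timeoutMs) => request('POST', `/tasks/${jobId}/wait`, { timeoutMs }),
    // Assumption: timeoutMs travels as a query parameter because the body is the task itself.
    runAndWait: (task, timeoutMs) =>
      request('POST', timeoutMs ? `/run-and-wait?timeoutMs=${timeoutMs}` : '/run-and-wait', task),
  };
}
```

A typical call sequence would be `const client = createClient(containerUrl, { token })`, then `const { jobId } = await client.submit({ type: 'echo', payload: { message: 'hello' } })`, followed by `await client.wait(jobId, 30000)` or repeated `client.getJob(jobId)` calls.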

Task JSON (Supported Workloads)

Default type is echo if omitted.

`type`	Example body	Behaviour
`echo`	`{ "type": "echo", "payload": { … } }`	Returns structured echo result (safe demos)
`sleep`	`{ "type": "sleep", "ms": 5000 }`	Async delay capped by `maxSleepMs`
`fetch`	`{ "type": "fetch", "url": "https://…", "method": "GET", "headers": { } }`	Outbound `fetch()` only if `allowFetchTasks` is `true`. SSRF risk — use `accessToken` and trusted callers only. No configurable request body in shipped code.
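
Callers may want to validate task bodies before submitting them. The sketch below mirrors the table under stated assumptions: `normalizeTask` and its error messages are hypothetical, and clamping an oversized `ms` (rather than rejecting it) is a guess at what "capped" means here.

```javascript
// Pre-submit validation sketch for the three task types (not the Actor's own checks).
function normalizeTask(task = {}, { allowFetchTasks = false, maxSleepMs = 3600000 } = {}) {
  const type = task.type ?? 'echo'; // echo is the documented default
  switch (type) {
    case 'echo':
      return { type, payload: task.payload ?? null };
    case 'sleep': {
      const ms = Number(task.ms);
      if (!Number.isFinite(ms) || ms < 0) {
        throw new Error('sleep task needs a non-negative "ms"');
      }
      // Assumption: oversized sleeps are clamped to maxSleepMs.
      return { type, ms: Math.min(ms, maxSleepMs) };
    }
    case 'fetch': {
      if (!allowFetchTasks) {
        throw new Error('fetch tasks are disabled (allowFetchTasks is false)');
      }
      const url = new URL(task.url); // throws on malformed URLs
      if (!['http:', 'https:'].includes(url.protocol)) {
        throw new Error('only http(s) URLs are supported');
      }
      return {
        type,
        url: url.href,
        method: (task.method ?? 'GET').toUpperCase(),
        headers: task.headers ?? {},
      };
    }
    default:
      throw new Error(`unsupported task type: ${type}`);
  }
}
```

The fetch branch deliberately returns no body field, in line with the note that the shipped code has no configurable request body.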

How to Use on Apify

Using the Actor

Go to Custom Run Async Endpoint on the Apify platform.
Enable Container web server in Actor Settings → Container web server.
Configure input (optional accessToken, concurrency, wait timeouts, allowFetchTasks).
Start a run and copy the Container URL from the run page.
Call GET /health, then POST /tasks with your task JSON.
Poll GET /tasks/:jobId, block with POST /tasks/:jobId/wait, or use POST /run-and-wait.
Open the Dataset tab (or Output schema links) for finished job rows.

Apify Console (health check, icon, monetization)

After deploying build 0.2+:

Health check: prefilled input sets runHealthCheckOnStart: true, which runs a quick echo task, writes one dataset row, and exits within seconds (well under 5 minutes). For production API runs, set runHealthCheckOnStart: false to start the long-lived HTTP server.
Standby mode: enabled in actor.json (usesStandbyMode: true). The server handles x-apify-container-server-readiness-probe on GET / and GET /health.
Actor icon: included as .actor/icon.svg (pictureUrl in actor definition).
Monetization (Console only): in Publication → Monetization, switch to Pay per event without passing platform usage to users, and enable Store discounts for Bronze/Silver/Gold tiers.

Input Configuration

Health-check / Try actor (fast exit):

{
  "runHealthCheckOnStart": true,
  "maxConcurrentTasks": 2,
  "allowFetchTasks": false
}

Production server mode:

{
  "runHealthCheckOnStart": false,
  "accessToken": "your-secret-token",
  "maxConcurrentTasks": 5,
  "defaultWaitTimeoutMs": 600000,
  "maxWaitTimeoutMs": 3600000,
  "allowFetchTasks": false,
  "maxSleepMs": 3600000
}

Field	Description
`runHealthCheckOnStart`	If `true`, runs a demo echo task and exits (Apify health checks). If `false`, starts the HTTP server (production).
`accessToken`	Optional shared secret (secret input). Protects all routes except `GET /health`.
`maxConcurrentTasks`	Max parallel workers (default `5`, range 1–50).
`defaultWaitTimeoutMs`	Default wait window when client omits `timeoutMs` (default `600000`).
`maxWaitTimeoutMs`	Hard cap for any wait timeout (default `3600000`).
`allowFetchTasks`	Must be `true` to enable outbound `fetch` tasks (default `false`).
`maxSleepMs`	Maximum milliseconds for `sleep` tasks (default `3600000`).
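
To see how the defaults and limits in the table interact, here is a small sketch that resolves a raw input object and computes the wait timeout actually applied to a request. `resolveInput` and `effectiveWaitTimeout` are hypothetical helpers, not functions from src/main.js, and the `runHealthCheckOnStart` default shown is an assumption because the table does not state one.

```javascript
// Defaults copied from the field table; runHealthCheckOnStart's default is assumed.
const DEFAULTS = {
  runHealthCheckOnStart: false,
  accessToken: undefined,
  maxConcurrentTasks: 5,
  defaultWaitTimeoutMs: 600000,
  maxWaitTimeoutMs: 3600000,
  allowFetchTasks: false,
  maxSleepMs: 3600000,
};

const clamp = (value, min, max) => Math.min(Math.max(value, min), max);

function resolveInput(raw = {}) {
  const input = { ...DEFAULTS, ...raw };
  // Documented range for the worker pool is 1 to 50.
  input.maxConcurrentTasks = clamp(Math.trunc(input.maxConcurrentTasks), 1, 50);
  // Keep the default wait window inside the hard cap.
  input.defaultWaitTimeoutMs = Math.min(input.defaultWaitTimeoutMs, input.maxWaitTimeoutMs);
  return input;
}

function effectiveWaitTimeout(requestedMs, input) {
  const requested = Number(requestedMs);
  // Fall back to the default when the client omits or mangles timeoutMs.
  const base = Number.isFinite(requested) && requested > 0
    ? requested
    : input.defaultWaitTimeoutMs;
  return Math.min(base, input.maxWaitTimeoutMs);
}
```

With the defaults, a request that omits `timeoutMs` waits up to 600000 ms, and a request asking for more than 3600000 ms is cut back to that cap.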

Output

The Actor exposes two output surfaces (see Output schema in Console):

Output	Description
Container API	Live base URL (`{{run.containerUrl}}`) while the run is active — use for `/health`, `/tasks`, wait routes
Job results	Default dataset — one row per completed or failed job

Dataset fields (per job)

Field	Description
`jobId`	UUID assigned at enqueue
`status`	`completed` or `failed`
`task`	Original task JSON
`result`	Structured result object when successful
`error`	Error message when failed
`createdAt`	When the job was enqueued
`startedAt`	When execution started
`finishedAt`	When the job finished

Example dataset row (echo task):

{
  "jobId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "completed",
  "task": { "type": "echo", "payload": { "message": "hello" } },
  "result": { "taskType": "echo", "payload": { "message": "hello" } },
  "error": null,
  "createdAt": "2026-05-21T10:00:00.000Z",
  "startedAt": "2026-05-21T10:00:00.100Z",
  "finishedAt": "2026-05-21T10:00:00.150Z"
}
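
Rows in this shape are easy to post-process once downloaded from the dataset. As a minimal sketch, the hypothetical helper `summarizeJobs` below counts outcomes and averages the time jobs spent queued and running; it relies only on the fields listed in the table above.

```javascript
// Post-processing sketch for dataset rows (`summarizeJobs` is a hypothetical helper).
function summarizeJobs(rows) {
  const summary = { completed: 0, failed: 0, avgQueueMs: 0, avgRunMs: 0 };
  let queueTotal = 0;
  let runTotal = 0;
  for (const row of rows) {
    if (row.status === 'completed') summary.completed += 1;
    else if (row.status === 'failed') summary.failed += 1;
    // Time between enqueue and start, then between start and finish.
    queueTotal += Date.parse(row.startedAt) - Date.parse(row.createdAt);
    runTotal += Date.parse(row.finishedAt) - Date.parse(row.startedAt);
  }
  if (rows.length > 0) {
    summary.avgQueueMs = queueTotal / rows.length;
    summary.avgRunMs = runTotal / rows.length;
  }
  return summary;
}
```

For the example row above, this yields one completed job with a queue time of 100 ms and a run time of 50 ms.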

Authentication

When accessToken is set, protected routes accept:

Header: Authorization: Bearer <token>
Query: ?token=<token>

GET /health remains public for probes.
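
Both options can be produced by one small helper on the client side. This is an illustrative sketch: `withAuth` and its `via` option are hypothetical names, not part of the Actor.

```javascript
// Attach the token either as a Bearer header (default) or as the ?token= query parameter.
function withAuth(url, token, { via = 'header' } = {}) {
  const target = new URL(url);
  const headers = {};
  if (via === 'query') {
    target.searchParams.set('token', token); // the value is URL-encoded automatically
  } else {
    headers.Authorization = `Bearer ${token}`;
  }
  return { url: target.href, headers };
}
```

The header form is usually the safer default, because query strings tend to end up in access logs and browser history; the query form helps with tools that cannot set custom headers.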

Performance & Reliability

Increase maxConcurrentTasks to drain the queue faster — avoid overloading downstream targets if fetch is enabled.
timeoutMs on wait endpoints falls back to defaultWaitTimeoutMs when omitted and is capped at maxWaitTimeoutMs.
Actor run lifetime is still bounded by your Apify plan timeout.
Edge proxies may impose limits below your configured timeoutMs.

Use Cases

Scenario	Fits when…
Long interactions without blocking callers	Accept job now, finalize later via poll/wait
Parallel fan-out demos	Many echo or bounded sleep jobs to prove queues
Controlled enrichment fetch	You explicitly enable `allowFetchTasks` and lock down `accessToken`
Webhook receivers	Your external service POSTs tasks to the Container URL

For bulk REST integrations requiring bodies, multipart form, retries, or OAuth — extend this codebase or wrap a dedicated upstream service.

Integrations & API

Call the Container URL from curl, Postman, Zapier, Make, or your backend
Read finished jobs from the Apify dataset API
Chain with schedules: keep a long-lived run or restart per batch depending on your pattern
Use Output schema links in Console for Container API and dataset URLs after each run

FAQ

Is this async or synchronous?

Both. Jobs are async by default (POST /tasks → 202). Wait endpoints block on the same Actor process until the job finishes or times out.

Does the Actor retry failed fetch tasks?

No automatic retry inside the worker. Clients may submit a new POST /tasks if needed.

Can I send PUT/PATCH bodies or multipart forms?

The shipped fetch path forwards method + headers only — extend executeTask if you need bodies or uploads.

How do I secure outbound fetch?

Keep allowFetchTasks: false unless required. Always set accessToken and restrict who can reach the Container URL.

Where do results go?

Each finished job is written to the default dataset with Actor.pushData. Pending and running jobs exist only in memory until they complete.

Why is my Container URL empty locally?

Set ACTOR_WEB_SERVER_URL (see Running Locally below). On Apify, the URL appears on the run page when the container web server is enabled.

SEO Keywords (high-intent)

async api runner apify
custom run async endpoint
apify container web server
background job queue actor
async task endpoint
apify express api actor
long running api apify
dataset job audit apify

Why Choose This Actor?

Native Apify container URL — no separate hosting for the API layer
202 Accepted job model with optional blocking waits
Concurrency control built in
Dataset row per job for automation and auditing
Optional auth and fetch guardrails

Limitations

Item	Detail
Run lifetime	Bounded by Actor plan / max run duration
No built-in retry/backoff	Add in clients or extend worker
Fetch semantics	Single `fetch` call; body preview capped at 50k chars
In-memory queue	Jobs are lost if the run aborts before completion
Not a full integration bus	Echo/sleep/guarded fetch only in stock code

Running Locally

Apify CLI (recommended):

cd custom-run-async-endpoint
npm install
apify run

Manual Node (Windows PowerShell example):

cd custom-run-async-endpoint
npm install
Remove-Item Env:APIFY_IS_AT_HOME -ErrorAction SilentlyContinue
$env:APIFY_LOCAL_STORAGE_DIR = "$PWD\storage"
$env:CRAWLEE_STORAGE_DIR = "$PWD\storage"
$env:CRAWLEE_PURGE_ON_START = "0"
$env:ACTOR_WEB_SERVER_PORT = "4321"
$env:ACTOR_WEB_SERVER_URL = "http://127.0.0.1:4321"
npm start

Then open http://127.0.0.1:4321/health.

Deploy to Apify

Push from custom-run-async-endpoint/ (apify push or Git integration).
Enable Container web server in Actor settings.
Build and start a run → copy Container URL → verify GET /health.
Use Output schema links for Container API and dataset access.

Get Started

Enable the container web server, start a run, submit your first echo task to POST /tasks, and inspect job rows in the dataset — then scale up with concurrency, auth, and wait endpoints as needed.
