Pricing

from $50.00 / 1,000 results

AI Web Task Runner

Run natural-language browser tasks with Playwright. Extract structured data, follow task-relevant links, capture screenshots, generate reports, and export reusable scripts.

Developer

Solutions Smart

Last modified: a month ago

What does AI Web Task Runner do?

AI Web Task Runner is an Apify Actor that turns natural-language browser tasks into controlled Playwright automation runs.

It can:

browse public websites
follow task-relevant links
extract structured results
capture screenshots
save raw HTML
generate a human-readable report
export a reusable Playwright Python script from the successful task trajectory

This Actor is designed for public-web automation, extraction, research, and script generation.

It is not a login bot, spam bot, comment bot, messaging bot, or anti-bot bypass tool.

How it differs from a fixed scraper

Most scrapers are built for one website and one output shape.

AI Web Task Runner is different:

you describe the task in natural language
the Actor opens one or more start URLs
it follows task-relevant public pages
it records an action trajectory
it extracts best-effort results even without an LLM
it can optionally use an LLM for safer planning, schema mapping, and summarization

This makes it useful for a wider class of public-web tasks than a single-purpose scraper, while still staying controlled and safety-constrained.

Main modes

`run_task`

Default mode.

Use this for general browser-task execution, such as:

finding features
locating pricing information
summarizing a product page
finding the correct public page for a business task

`extract`

Use this for structured extraction.

If you provide an extractionSchema, the Actor tries to map observed content into that schema.
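
One way to picture this mapping is to treat the schema as a template whose keys define the allowed output fields. The sketch below is only an illustration under that assumption: the function name `fit_to_schema` and the shape of the observed data are hypothetical, and the Actor's internal mapping may work differently.

```python
from typing import Any


def fit_to_schema(template: Any, observed: Any) -> Any:
    """Shape `observed` like `template`, dropping unknown keys and filling gaps."""
    if isinstance(template, dict):
        source = observed if isinstance(observed, dict) else {}
        return {key: fit_to_schema(sub, source.get(key)) for key, sub in template.items()}
    if isinstance(template, list):
        items = observed if isinstance(observed, list) else []
        if not template:
            # An empty list in the template accepts any list of values as-is.
            return list(items)
        # A one-element list describes the shape of every item.
        return [fit_to_schema(template[0], item) for item in items]
    # Scalar leaf: keep the observed value, or fall back to the template default.
    return observed if observed is not None else template


# Hypothetical observed content; keys outside the schema are discarded.
schema = {"plans": [{"name": "", "price": "", "billingPeriod": "", "features": []}]}
observed = {
    "plans": [{"name": "Pro", "price": "$29", "features": ["API access"], "badge": "Popular"}],
    "footerText": "All rights reserved",
}
result = fit_to_schema(schema, observed)
```

Fields the page did not provide (here `billingPeriod`) keep the template's empty default, so every output row has the same shape.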

`research`

Use this to browse task-relevant public pages and produce a summary with source URLs.

`generate_script`

Use this to run a task and export a reusable standalone Playwright Python script based on the successful action trajectory.

`audit_lead`

Optional compatibility mode.

This preserves the previous lead/contact-audit workflow and outputs company-profile style results for website contact and outreach auditing.

Input examples

Example 1: Pricing extraction

{
  "task": "Find the pricing plans and extract plan name, price, billing period, and main features.",
  "startUrls": [
    { "url": "https://example.com" }
  ],
  "mode": "extract",
  "extractionSchema": {
    "plans": [
      {
        "name": "",
        "price": "",
        "billingPeriod": "",
        "features": []
      }
    ]
  },
  "maxPages": 5,
  "captureScreenshots": true
}

Example 2: Research task

{
  "task": "Find what services this company offers and summarize them with source URLs.",
  "startUrls": [
    { "url": "https://example.com" }
  ],
  "mode": "research",
  "maxPages": 6,
  "maxDepth": 2,
  "sameDomainOnly": true
}

Example 3: Generate reusable Playwright script

{
  "task": "Open the website, navigate to the pricing page, and extract the pricing table.",
  "startUrls": [
    { "url": "https://example.com" }
  ],
  "mode": "generate_script",
  "generateReusableScript": true,
  "maxPages": 5,
  "captureScreenshots": true
}

Example 4: Optional lead audit template

{
  "task": "Audit this website for contact and sales outreach readiness.",
  "startUrls": [
    { "url": "https://example.com" }
  ],
  "mode": "audit_lead",
  "maxPages": 5,
  "captureScreenshots": true
}
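
The inputs above can also be assembled and submitted from Python. The following is a minimal sketch, not an official client: `build_run_input` and `run_actor` are hypothetical helper names, the Actor ID is left as a parameter because this page does not state it, and `run_actor` assumes the `apify-client` package is installed and that `call` returns a run dictionary.

```python
from typing import Any

# Modes listed under "Main modes".
MODES = {"run_task", "extract", "research", "generate_script", "audit_lead"}


def build_run_input(task: str, start_urls: list[str], mode: str = "run_task",
                    **options: Any) -> dict[str, Any]:
    """Build an input object shaped like the examples above."""
    if mode not in MODES:
        raise ValueError(f"Unknown mode: {mode!r}")
    if not task.strip():
        raise ValueError("task must not be empty")
    if not start_urls:
        raise ValueError("at least one start URL is required")
    run_input: dict[str, Any] = {
        "task": task,
        "startUrls": [{"url": url} for url in start_urls],
        "mode": mode,
    }
    # Optional fields such as maxPages, maxDepth, or captureScreenshots.
    run_input.update(options)
    return run_input


def run_actor(actor_id: str, token: str, run_input: dict[str, Any]) -> list[dict]:
    """Start a run and return the default dataset rows (assumes apify-client)."""
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


pricing_input = build_run_input(
    "Find the pricing plans and extract plan name and price.",
    ["https://example.com"],
    mode="extract",
    maxPages=5,
)
```

Validating the mode locally catches typos before a billable run is started.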

Output record types

For all main modes except audit_lead, the default dataset contains only:

`task_result`

One final task-level record. This is the main export row and the recommended unit for pricing and CSV export.

Detailed page snapshots and extracted items are stored in the key-value store and referenced from the final task result.

The task_result record contains:

task
mode
final status
pages visited
steps executed
summary
result payload
confidence
screenshot keys
report key
trajectory key
generated script key if applicable
page snapshots key
extracted items key

`audit_lead` compatibility output

When mode = audit_lead, the default dataset contains:

company_profile

The older page records are preserved as compatibility artifacts in the key-value store, not as billable default dataset rows.

Key-value store artifacts

For the main task-runner modes, the Actor saves:

REPORT.html
TASK_RESULT.json
TASK_TRAJECTORY.json
PAGE_SNAPSHOTS.json
EXTRACTED_ITEMS.json
GENERATED_SCRIPT_RECORD.json if enabled
generated_script.py if enabled
generated_script_metadata.json if enabled
screenshots if enabled
raw HTML if saveHtml = true

This means the default dataset stays clean and export-friendly, while detailed execution artifacts remain available in the key-value store.

For audit_lead, compatibility artifacts include:

COMPANY_PROFILES.json
PAGE_RECORDS.json
REPORT.html
run_log.json
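
To check a finished run, it can help to compute which artifact keys to expect for a given input. The sketch below only restates the lists above in code. `expected_artifacts` and `fetch_artifact` are hypothetical helper names, screenshot and raw HTML keys are omitted because their names are not documented here, and `fetch_artifact` assumes the `apify-client` package.

```python
from typing import Any, Optional


def expected_artifacts(mode: str, generate_reusable_script: bool = False) -> list[str]:
    """Return the documented key-value store keys for a run configuration."""
    if mode == "audit_lead":
        return ["COMPANY_PROFILES.json", "PAGE_RECORDS.json", "REPORT.html", "run_log.json"]
    keys = [
        "REPORT.html",
        "TASK_RESULT.json",
        "TASK_TRAJECTORY.json",
        "PAGE_SNAPSHOTS.json",
        "EXTRACTED_ITEMS.json",
    ]
    if generate_reusable_script or mode == "generate_script":
        keys += [
            "GENERATED_SCRIPT_RECORD.json",
            "generated_script.py",
            "generated_script_metadata.json",
        ]
    return keys


def fetch_artifact(token: str, store_id: str, key: str) -> Optional[Any]:
    """Read one artifact from a key-value store, or None if it is missing."""
    from apify_client import ApifyClient  # pip install apify-client

    record = ApifyClient(token).key_value_store(store_id).get_record(key)
    return None if record is None else record["value"]
```

The store ID would typically come from the run's default key-value store; which field carries it depends on the client version, so treat that lookup as an assumption to verify.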

Generated Playwright script

When generateReusableScript = true or mode = generate_script, the Actor saves:

generated_script.py
generated_script_metadata.json

The generated script:

is standalone Playwright Python
is based on the recorded safe action trajectory
includes comments for the reproduced browser steps
contains no secrets
does not rely on arbitrary LLM-generated executable code
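
The idea of building a script from a recorded trajectory rather than from free-form model output can be sketched with fixed templates per action. This is an illustration only: the trajectory format, the `RENDERERS` table, and `render_script` are assumptions, only a subset of the safe actions is covered, and the real generated script may look different.

```python
# Each safe action maps to one fixed line of Playwright code.
# Values are embedded with repr() so they are always valid string literals.
RENDERERS = {
    "visit_url": lambda s: f"page.goto({s['url']!r})",
    "click_link_text": lambda s: f"page.get_by_role('link', name={s['text']!r}).first.click()",
    "click_css_selector": lambda s: f"page.locator({s['selector']!r}).first.click()",
    "press_key": lambda s: f"page.keyboard.press({s['key']!r})",
    "wait": lambda s: f"page.wait_for_timeout({int(s['ms'])})",
}


def render_script(trajectory: list[dict]) -> str:
    """Render a recorded trajectory as standalone Playwright Python source."""
    lines = [
        "from playwright.sync_api import sync_playwright",
        "",
        "with sync_playwright() as p:",
        "    browser = p.chromium.launch(headless=True)",
        "    page = browser.new_page()",
    ]
    for number, step in enumerate(trajectory, start=1):
        action = step["action"]
        if action not in RENDERERS:
            raise ValueError(f"No template for action: {action!r}")
        lines.append(f"    # Step {number}: {action}")
        lines.append(f"    {RENDERERS[action](step)}")
    lines.append("    browser.close()")
    return "\n".join(lines) + "\n"


# Hypothetical two-step trajectory.
trajectory = [
    {"action": "visit_url", "url": "https://example.com"},
    {"action": "click_link_text", "text": "Pricing"},
]
script = render_script(trajectory)
```

Because every emitted line comes from a fixed template, the output cannot contain code that was not part of the allowed action set.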

Safety model

The Actor only allows a fixed safe action set:

visit_url
click_link_text
click_css_selector
type_text
press_key
select_option
wait
extract_current_page
collect_links
stop

The Actor does not allow:

arbitrary Python execution
shell execution
unrestricted JavaScript execution
login/cookie automation in V1
posting, commenting, or messaging automation
paywall bypass
CAPTCHA bypass
destructive actions
purchases or form submissions that change state
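
An allowlist like this is usually enforced by checking every proposed action before it reaches the browser. A minimal sketch, assuming actions are dictionaries with an `action` field (that shape, the `validate_action` name, and the URL scheme check are assumptions, not documented behavior):

```python
from urllib.parse import urlparse

# The fixed safe action set listed above.
SAFE_ACTIONS = frozenset({
    "visit_url", "click_link_text", "click_css_selector", "type_text",
    "press_key", "select_option", "wait", "extract_current_page",
    "collect_links", "stop",
})


def validate_action(action: dict) -> dict:
    """Return the action unchanged if it passes the allowlist, else raise ValueError."""
    name = action.get("action")
    if name not in SAFE_ACTIONS:
        raise ValueError(f"Action not allowed: {name!r}")
    if name == "visit_url":
        # Assumed extra check: only plain web URLs, so javascript: or file:
        # targets are refused even though the action name itself is allowed.
        scheme = urlparse(str(action.get("url", ""))).scheme
        if scheme not in ("http", "https"):
            raise ValueError(f"URL scheme not allowed: {scheme!r}")
    return action
```

Rejecting by default means a new or misspelled action name fails closed instead of being executed.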

Deterministic vs LLM-assisted mode

Deterministic mode

If llmProvider = none, the Actor still works.

It will:

open start URLs
collect page snapshots
infer task keywords
follow obvious task-relevant links within limits
extract visible text, headings, links, tables, prices, emails, phones, and structured candidates
produce a best-effort task result
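
The keyword and link-following steps above can be approximated with simple heuristics. The sketch below is a guess at the general technique, not the Actor's actual logic: the stopword list, the scoring rule, the regular expressions, and all function names are assumptions.

```python
import re

# Small, hypothetical stopword list for task instructions.
STOPWORDS = {"find", "extract", "this", "that", "with", "from", "them", "what", "open", "main"}

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PRICE_RE = re.compile(r"[$€£]\s?\d[\d,]*(?:\.\d{2})?")


def infer_keywords(task: str) -> set[str]:
    """Keep longer, non-stopword terms from the task description."""
    words = re.findall(r"[a-z]+", task.lower())
    return {w for w in words if len(w) > 3 and w not in STOPWORDS}


def rank_links(links: list[dict], keywords: set[str], limit: int = 5) -> list[str]:
    """Score links by keyword hits in anchor text or URL; drop zero-score links."""
    scored = []
    for link in links:
        haystack = f"{link['text']} {link['url']}".lower()
        score = sum(1 for kw in keywords if kw in haystack)
        if score > 0:
            scored.append((score, link["url"]))
    # Highest score first; ties are broken by URL for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored[:limit]]


def extract_candidates(text: str) -> dict[str, list[str]]:
    """Pull email and price candidates out of visible page text."""
    return {"emails": EMAIL_RE.findall(text), "prices": PRICE_RE.findall(text)}


keywords = infer_keywords("Find the pricing plans")
ranked = rank_links(
    [
        {"text": "Pricing", "url": "https://example.com/pricing"},
        {"text": "Blog", "url": "https://example.com/blog"},
        {"text": "Plans & billing", "url": "https://example.com/plans"},
    ],
    keywords,
)
found = extract_candidates("Pro plan $29.00 per month, write to sales@example.com today")
```

Heuristics like these are cheap and predictable, which is why results in this mode are best-effort rather than guaranteed.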

LLM-assisted mode

If an LLM provider is configured, the Actor may use the model for:

safe action planning
choosing the next allowed action
mapping page snapshots into extractionSchema
summarizing findings

The LLM is constrained to structured JSON output, and that output is validated before the Actor acts on it.

If the LLM fails, the Actor falls back to deterministic behavior.
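
The validate-then-fall-back pattern can be sketched as follows. The JSON shape with an `action` field and the function name `choose_next_action` are assumptions for illustration; the source only states that output is structured JSON, validated, and replaced by deterministic behavior on failure.

```python
import json

# Same safe action set as in the safety model.
ALLOWED_ACTIONS = frozenset({
    "visit_url", "click_link_text", "click_css_selector", "type_text",
    "press_key", "select_option", "wait", "extract_current_page",
    "collect_links", "stop",
})


def choose_next_action(llm_output, fallback: dict) -> dict:
    """Use the model's proposal only if it parses and names an allowed action."""
    try:
        proposal = json.loads(llm_output)
    except (json.JSONDecodeError, TypeError):
        # Not JSON at all (or no output): use the deterministic choice.
        return fallback
    if not isinstance(proposal, dict) or proposal.get("action") not in ALLOWED_ACTIONS:
        return fallback
    return proposal


# The deterministic planner's choice, used whenever the model output is unusable.
deterministic = {"action": "collect_links"}
accepted = choose_next_action('{"action": "click_link_text", "text": "Pricing"}', deterministic)
rejected = choose_next_action('{"action": "submit_form"}', deterministic)
garbled = choose_next_action("Sure! I will click the pricing link.", deterministic)
```

In this arrangement the model can only select among actions the Actor would already permit, so a bad response costs a planning step rather than safety.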

Optional lead-audit template

The older lead/contact-auditor behavior is still available in:

"mode": "audit_lead"

That mode keeps:

lead-audit heuristics
page-level contact findings
company profile aggregation
compatibility outputs for outreach workflows

It is now an optional template mode, not the main identity of the Actor.

Example use cases

Extract pricing plans from a SaaS website
Extract a table from a public webpage
Research product features from a company website
Find contact or sales paths from a public website
Summarize same-domain public pages related to a topic
Generate a reusable Playwright script for a repeated browser task

Limitations

This Actor is intended for public websites.
It does not support login or state-changing automation in V1.
Deterministic extraction is heuristic and best-effort.
Some websites hide key data behind scripts, forms, or client-side UI patterns.
LLM mode can improve planning and summarization, but it remains limited to the safe action set and falls back to deterministic behavior if the LLM fails.
It is not an anti-bot bypass product.

Troubleshooting

I only got the homepage

Increase maxDepth or start from a more relevant public page.
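
Why a low depth limit yields only the homepage can be seen in a small breadth-first sketch. It assumes the start URL counts as depth 0 and that limits behave as shown; the exact semantics of maxDepth, maxPages, and sameDomainOnly in the Actor are not documented here, and `crawl_order` and the link graph are hypothetical.

```python
from collections import deque
from urllib.parse import urlparse


def crawl_order(start_url: str, link_graph: dict[str, list[str]], max_pages: int,
                max_depth: int, same_domain_only: bool = True) -> list[str]:
    """Return the pages a breadth-first visit would reach under the given limits."""
    start_host = urlparse(start_url).netloc
    visited: list[str] = []
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            # Links on this page are deeper than allowed, so they are not queued.
            continue
        for target in link_graph.get(url, []):
            if target in seen:
                continue
            if same_domain_only and urlparse(target).netloc != start_host:
                continue
            seen.add(target)
            queue.append((target, depth + 1))
    return visited


# Hypothetical site: homepage -> pricing -> enterprise, plus one external link.
graph = {
    "https://example.com/": ["https://example.com/pricing", "https://other.example.org/"],
    "https://example.com/pricing": ["https://example.com/pricing/enterprise"],
}
only_home = crawl_order("https://example.com/", graph, max_pages=5, max_depth=0)
two_levels = crawl_order("https://example.com/", graph, max_pages=5, max_depth=2)
```

Under these assumptions, starting from a deeper page has the same effect as raising the depth limit by the number of clicks saved.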

The final result is incomplete

Increase maxPages, maxDepth, or timeoutSeconds, and consider using extract mode with an extractionSchema.

The Actor did not visit the page I expected

Check:

TASK_TRAJECTORY.json
PAGE_SNAPSHOTS.json
EXTRACTED_ITEMS.json
REPORT.html

These artifacts show what the Actor actually saw and did.

I want the older contact-audit behavior

Use:

"mode": "audit_lead"

The Actor did not submit a form or log in

That is expected in V1. The safety model deliberately avoids state-changing automation.
