Playwright MCP Server — Browser Automation via MCP Protocol
Model Context Protocol (MCP) server providing Playwright browser automation tools. Navigate, click, fill forms, take screenshots, evaluate JS — all via WebSocket MCP transport for AI agents.
Introduction
Playwright MCP Server runs a headless Chromium browser on Apify infrastructure and exposes it as a Model Context Protocol (MCP) server over WebSocket. AI agents — including Claude, GPT-4, LangChain agents, and any MCP-compatible client — connect to the server and instantly gain access to a full browser automation toolkit: navigate URLs, click elements, fill forms, take screenshots, extract page content, run JavaScript, and generate PDFs.
Every connection gets an isolated browser context with its own cookies, session storage, and browsing history. No shared state between agents. Sessions auto-close on idle or after a configurable maximum duration.
Primary use cases:
- AI agents that need real browser access without managing infrastructure
- Claude or GPT-4 workflows requiring live web interaction
- QA automation with AI-driven test generation and execution
- Research agents that browse and extract structured data from dynamic sites
- Startups building AI assistants that interact with third-party web apps
Key advantages: zero server setup, MCP-native protocol (no REST adapters), built-in Apify proxy rotation, pay-per-action pricing, and full session isolation with audit logs.
Why Use This Actor
Managing a self-hosted Playwright MCP server means provisioning a VPS, installing Chromium, handling memory leaks, setting up proxy rotation, and paying for always-on compute — even when your agent isn't running. This actor eliminates all of that.
No server to manage. Start an actor run, grab the WebSocket URL from the output, connect your agent. The browser is ready in under 3 seconds. When you're done, the actor stops and you stop paying.
MCP-native protocol. Unlike Browserbase or Steel.dev (which expose REST APIs requiring custom adapters), this actor speaks MCP directly. Any MCP-compatible client works out of the box — Claude Desktop, LangChain, LlamaIndex, custom agents.
Cost comparison:
| Scenario | This Actor | Self-Hosted VPS | Browserbase |
|---|---|---|---|
| 100 actions/day | ~$0.65/day | $5–50/day server | ~$1.50/day |
| 1,000 actions/day | ~$6.50/day | $5–50/day server | ~$15/day |
| Idle time | $0 | Full server cost | Per-minute |
Built-in Apify proxy. Datacenter and residential proxies rotate automatically — no separate proxy account needed.
How to Use
Step 1: Start the actor
Run with zero configuration. All defaults are production-ready. The actor starts a browser, binds a WebSocket server, and writes the connection URL to the run output.
Step 2: Get the WebSocket URL
Once the actor status changes to RUNNING, open the Key-Value Store tab and look for the OUTPUT record. It contains:

    {
      "wsUrl": "wss://...",
      "status": "ready",
      "tools": ["navigate", "click", "fill", "screenshot", ...]
    }

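Before handing the URL to a client, it can help to check that the record says the server is ready. The following is a minimal sketch, assuming the record has exactly the fields shown above; `parseOutputRecord`, its error messages, and the placeholder host are hypothetical and not part of the actor.

```javascript
// Validate the OUTPUT record before connecting.
// Field names (wsUrl, status, tools) come from the record shown above;
// parseOutputRecord itself is a hypothetical helper.
function parseOutputRecord(record) {
  if (record === null || typeof record !== 'object') {
    throw new Error('OUTPUT record is missing or not an object');
  }
  if (record.status !== 'ready') {
    throw new Error(`Server not ready yet (status: ${record.status})`);
  }
  // new URL() throws on malformed input, which is the desired behavior here.
  const url = new URL(record.wsUrl);
  if (url.protocol !== 'wss:' && url.protocol !== 'ws:') {
    throw new Error(`Unexpected protocol: ${url.protocol}`);
  }
  return {
    wsUrl: url.href,
    tools: Array.isArray(record.tools) ? record.tools : [],
  };
}

// Usage with a placeholder record (the host below is made up).
const output = parseOutputRecord({
  wsUrl: 'wss://example.invalid/mcp',
  status: 'ready',
  tools: ['navigate', 'click'],
});
console.log(output.wsUrl, output.tools.length);
```

Failing fast here gives a clearer error than a WebSocket handshake failure later in the agent.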
Step 3: Connect your MCP client
Using @modelcontextprotocol/sdk (Node.js):

    import { Client } from '@modelcontextprotocol/sdk/client/index.js';
    import { WebSocketClientTransport } from '@modelcontextprotocol/sdk/client/websocket.js';

    // Paste the wsUrl value from the OUTPUT record (Step 2).
    const wsUrl = 'wss://...';

    const transport = new WebSocketClientTransport(new URL(wsUrl));
    const client = new Client({ name: 'my-agent', version: '1.0.0' }, { capabilities: {} });
    await client.connect(transport);

    // List available tools
    const { tools } = await client.listTools();

    // Navigate to a URL
    const result = await client.callTool({ name: 'navigate', arguments: { url: 'https://example.com' } });

Using Claude Desktop (claude_desktop_config.json):

    {
      "mcpServers": {
        "playwright": {
          "command": "npx",
          "args": ["-y", "mcp-remote", "wss://YOUR_ACTOR_WS_URL/mcp"]
        }
      }
    }

Input Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| transport | string | websocket | MCP transport: websocket or sse |
| maxSessionDurationSecs | integer | 1800 | Max session length before auto-close (60–7200s) |
| idleTimeoutSecs | integer | 300 | Inactivity timeout (30–1800s) |
| maxConcurrentSessions | integer | 1 | Max simultaneous browser contexts (1–10) |
| viewport | object | {width:1280, height:720} | Default viewport size |
| locale | string | en-US | Browser locale |
| timezone | string | America/New_York | Browser timezone |
| blockResources | array | [] | Resource types to block (image, stylesheet, font, media, script) |
| allowedDomains | array | [] | Restrict navigation to these domains (empty = all allowed) |
| blockDomains | array | [] | Domains the browser cannot navigate to |
| authToken | string | "" | Bearer token for MCP connection auth |
| enabledTools | array | ["*"] | MCP tools to expose (["*"] = all 14 tools) |
| screenshotFormat | string | png | Default screenshot format: png, jpeg, webp |
| screenshotQuality | integer | 80 | JPEG/WebP quality (1–100) |
| saveScreenshotsToKV | boolean | true | Save screenshots to Key-Value Store |
| saveSessionLogs | boolean | true | Log all tool calls to dataset |
| proxyConfiguration | object | {useApifyProxy: true} | Proxy settings |
Common configurations:
Quick session (defaults): Just run the actor — no parameters needed.
Stealth browsing: Set proxyConfiguration to use residential proxies, add a custom userAgent, and set locale/timezone to match the proxy location.
Restricted session: Set allowedDomains: ["example.com"] and enabledTools: ["navigate", "getContent", "screenshot"] to limit what the agent can do.
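As a rough illustration, the restricted-session input can be written as a plain object and checked against the documented ranges before starting a run. This is a sketch only: the ranges are copied from the Input Configuration table, while `validateInput` is a hypothetical helper and not something the actor provides.

```javascript
// Documented ranges from the Input Configuration table.
const RANGES = {
  maxSessionDurationSecs: [60, 7200],
  idleTimeoutSecs: [30, 1800],
  maxConcurrentSessions: [1, 10],
  screenshotQuality: [1, 100],
};

// Hypothetical helper: returns a list of range violations (empty if none).
function validateInput(input) {
  const errors = [];
  for (const [key, [min, max]] of Object.entries(RANGES)) {
    if (!(key in input)) continue; // omitted fields fall back to actor defaults
    const value = input[key];
    if (!Number.isInteger(value) || value < min || value > max) {
      errors.push(`${key} must be an integer in ${min}-${max}, got ${value}`);
    }
  }
  return errors;
}

// The "Restricted session" configuration described above, plus a shorter idle timeout.
const restrictedSession = {
  allowedDomains: ['example.com'],
  enabledTools: ['navigate', 'getContent', 'screenshot'],
  idleTimeoutSecs: 120,
};

console.log(validateInput(restrictedSession)); // prints []
```

The `restrictedSession` object is what would be passed as the actor input; the check only catches out-of-range numbers and does not verify tool or domain names.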
Available MCP Tools
| Tool | Description |
|---|---|
| navigate | Navigate to a URL, wait for page load |
| click | Click an element by CSS selector |
| fill | Fill a form input with text |
| screenshot | Capture viewport or full-page screenshot |
| evaluate | Run JavaScript in the page context |
| getContent | Extract page content as text, HTML, markdown, or accessibility tree |
| select | Select a <select> dropdown option |
| hover | Hover over an element |
| scroll | Scroll the page or a specific element |
| waitFor | Wait for a selector to appear/disappear |
| getCookies | Get cookies from the browser context |
| setCookies | Set cookies on the browser context |
| keyboard | Type text or press keyboard keys |
| pdf | Generate a PDF of the current page |
Example: Navigate and extract content

    // Tool call
    { "name": "navigate", "arguments": { "url": "https://news.ycombinator.com" } }

    // Response
    { "content": [{ "type": "text", "text": "Navigated to https://news.ycombinator.com — Page title: Hacker News — Status: 200" }] }

    // Follow up
    { "name": "getContent", "arguments": { "format": "text", "selector": ".itemlist" } }

Tip: For dynamic pages, call waitFor with a selector before click or fill to ensure the element is ready. Use getContent with format: "accessibility" to give your LLM a structured view of interactive elements.
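The wait-then-act pattern from the tip can be sketched as a small helper that builds the tool calls in order. The `selector` argument name for waitFor and click is an assumption modeled on the getContent example above; check the schemas returned by `listTools()` for the real argument names.

```javascript
// Hypothetical helper: build the tool calls for "wait, then click".
// The `selector` argument name is an assumption, not confirmed by the source.
function waitThenClick(selector) {
  return [
    { name: 'waitFor', arguments: { selector } },
    { name: 'click', arguments: { selector } },
  ];
}

// Structured view of interactive elements, using the documented format value.
const accessibilitySnapshot = {
  name: 'getContent',
  arguments: { format: 'accessibility' },
};

// Each entry can be passed to client.callTool(...) in order.
const calls = waitThenClick('button[type="submit"]');
console.log(calls.map((c) => c.name).join(' -> ')); // prints: waitFor -> click
```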
Tips and Advanced Usage
Session management: Each WebSocket connection is a separate isolated session. Multiple agents can run concurrently (up to maxConcurrentSessions) without sharing cookies or storage. Set maxConcurrentSessions: 3 and connect three agents simultaneously for parallel workflows.
Handling authenticated sites: Use setCookies to inject session cookies before navigating to protected pages, or use fill + click to complete a login flow. Cookies persist for the entire session duration.
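The cookie-injection route might look like the sketch below. The `cookies` argument name is an assumption, and the cookie fields follow Playwright's cookie shape (name, value, domain, path), which the actor may or may not accept as-is. `cookieLoginCalls` is a hypothetical helper.

```javascript
// Hypothetical helper: build the two tool calls for cookie-based login.
// Assumes setCookies takes a `cookies` array of Playwright-style cookie objects.
function cookieLoginCalls(cookie, protectedUrl) {
  for (const field of ['name', 'value', 'domain', 'path']) {
    if (typeof cookie[field] !== 'string') {
      throw new Error(`cookie.${field} must be a string`);
    }
  }
  return [
    { name: 'setCookies', arguments: { cookies: [cookie] } },
    { name: 'navigate', arguments: { url: protectedUrl } },
  ];
}

// Placeholder cookie and URL; substitute a real session cookie.
const loginCalls = cookieLoginCalls(
  { name: 'session', value: 'REDACTED', domain: 'example.com', path: '/' },
  'https://example.com/account',
);
console.log(loginCalls.map((c) => c.name).join(' -> ')); // prints: setCookies -> navigate
```

Setting the cookie before the first navigation means the protected page is requested with the session already attached.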
Proxy for blocked sites: Switch from datacenter to residential proxies in proxyConfiguration for sites that block datacenter IPs. Residential proxies typically succeed more often on sites with bot protection.
Screenshot storage: All screenshots are saved to the Apify Key-Value Store with keys like screenshot-{sessionId}-{timestamp}.png. Retrieve them via the Apify API after the session ends. Set saveScreenshotsToKV: false to disable storage and reduce run costs.
Security: Use allowedDomains in production deployments to prevent agents from navigating to unintended sites. Combine with authToken to restrict which clients can connect to the MCP server.
PDF generation: The pdf tool works best after a full page load. PDFs are saved to the KV store alongside screenshots. Use printBackground: true to capture CSS backgrounds.
Pricing
$6.50 per 1,000 browser actions (Pay-Per-Event)
Pricing includes all platform compute costs — no hidden fees.
Each MCP tool call (navigate, click, fill, screenshot, etc.) counts as one action. Protocol messages (initialize, tools/list, ping) are free.
| Scenario | Actions | Cost |
|---|---|---|
| Simple page scrape (navigate + getContent) | 2 | $0.01 |
| Form fill flow (navigate + 5 fills + click + screenshot) | 8 | $0.05 |
| Multi-page browse session (50 actions) | 50 | $0.33 |
| Automated testing session (200 actions) | 200 | $1.30 |
| Heavy agent workflow (1,000 actions/day) | 1,000 | $6.50 |
Free tier: First 500 actions free to try the actor.
Compare: Self-hosted browser servers cost $5–50/day in compute regardless of usage. Browserbase charges per minute (~$0.60/min), making idle time expensive. This actor charges only for actual browser actions.
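The per-action arithmetic behind the scenario table can be reproduced with a short estimator. This is a sketch: the price and free-tier size come from this section, but how the free tier is actually deducted (here, subtracted up front from the action count) is an assumption.

```javascript
// $6.50 per 1,000 actions = $0.0065 per action, kept in micro-dollars
// so the multiplication stays in integers.
const PRICE_MICRO_USD_PER_ACTION = 6500;
const FREE_ACTIONS = 500; // free tier from the text; deduction model is assumed

// Only tool calls count as actions; protocol messages are free.
function estimateCostUsd(actions, freeActionsRemaining = 0) {
  const billable = Math.max(0, actions - freeActionsRemaining);
  return (billable * PRICE_MICRO_USD_PER_ACTION) / 1_000_000;
}

console.log(estimateCostUsd(200)); // 1.3
console.log(estimateCostUsd(600, FREE_ACTIONS)); // 0.65
```

The table's figures are these values rounded to the nearest cent, for example 50 actions at $0.325 shown as $0.33.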
FAQ
What is MCP and how does it work?
Model Context Protocol (MCP) is an open standard from Anthropic that defines how AI models communicate with external tools and services. An MCP server exposes a list of "tools" (functions) that a connected AI agent can call. The agent sends tool call requests over the connection, the server executes them, and returns results. This actor implements an MCP server where the "tools" are browser automation actions backed by Playwright.
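On the wire, MCP messages are JSON-RPC 2.0, and a tool call uses the `tools/call` method. The sketch below builds such a request by hand to show its shape; in practice the SDK's `callTool` does this for you, and `toolCallRequest` is a hypothetical helper.

```javascript
// Hypothetical helper: build the JSON-RPC 2.0 envelope for an MCP tool call.
let nextId = 1;
function toolCallRequest(name, args) {
  return {
    jsonrpc: '2.0',
    id: nextId++, // unique per request so the response can be matched to it
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

const request = toolCallRequest('navigate', { url: 'https://example.com' });
console.log(JSON.stringify(request, null, 2));
```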
Which AI models and agents are compatible?
Any MCP-compatible client works: Claude Desktop (via claude_desktop_config.json), Claude API with tool use, GPT-4 with function calling via an MCP adapter, LangChain agents, LlamaIndex agents, AutoGen, CrewAI, and custom agents using the @modelcontextprotocol/sdk. The actor speaks standard MCP over WebSocket — if your framework supports MCP, it connects directly.
Can I use this with Claude Desktop?
Yes. Add the actor's WebSocket URL to your Claude Desktop config using mcp-remote as a bridge (since Claude Desktop uses stdio transport internally). The mcp-remote package handles the WebSocket-to-stdio translation automatically.
How do I handle login-protected sites?
Two approaches: (1) Use fill and click tools to complete a login form as a human would, then navigate to protected pages — cookies persist for the session. (2) Use setCookies to inject a pre-authenticated session cookie directly, then navigate immediately to protected content.
Is there a limit on session duration?
Yes — configurable via maxSessionDurationSecs (default: 1800s / 30 min, max: 7200s / 2 hours). Sessions also close after idleTimeoutSecs of inactivity (default: 300s / 5 min). The actor sends warning notifications before closing so agents can wrap up gracefully.
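The two limits interact as "whichever comes first". A minimal sketch of that rule follows, assuming any tool call resets the idle timer (the source does not say exactly what counts as activity); `sessionCloseTimeMs` is a hypothetical helper.

```javascript
// Hypothetical helper: when will a session close, in ms since the same epoch
// as the inputs? Defaults mirror the documented defaults (1800s and 300s).
function sessionCloseTimeMs(startedAtMs, lastActivityMs, opts = {}) {
  const { maxSessionDurationSecs = 1800, idleTimeoutSecs = 300 } = opts;
  const hardLimit = startedAtMs + maxSessionDurationSecs * 1000;
  const idleLimit = lastActivityMs + idleTimeoutSecs * 1000;
  return Math.min(hardLimit, idleLimit);
}

// Started at t=0, last tool call at 10 min: the idle limit wins.
console.log(sessionCloseTimeMs(0, 600_000) / 60_000); // 15
// Last tool call at 28 min: the 30 min hard limit wins.
console.log(sessionCloseTimeMs(0, 1_680_000) / 60_000); // 30
```

An agent that must hold a session open through long pauses needs a larger idleTimeoutSecs, since each tool call is also a billable action.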
Can multiple agents share one browser?
No — each WebSocket connection gets its own isolated browser context. Cookies, storage, and browsing history are completely separate between sessions. Concurrent sessions share the same browser process but not any state. This is by design for security and reproducibility.
How are screenshots stored?
Screenshots are saved to the Apify Key-Value Store with keys formatted as screenshot-{sessionId}-{timestamp}.{format}. Access them via the Apify API: GET https://api.apify.com/v2/key-value-stores/{storeId}/records/{key}. Set saveScreenshotsToKV: false if you only need the base64 data returned in the tool response.
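Building the record URL from a store ID and key is a one-liner, sketched here with placeholder values. The exact timestamp format inside the key is not specified in this document, so the key is taken as an input instead of being constructed; authentication is also left out.

```javascript
// Hypothetical helper: build the Key-Value Store record URL for a screenshot.
function screenshotRecordUrl(storeId, key) {
  const base = 'https://api.apify.com/v2/key-value-stores';
  // Encode both parts so unusual characters cannot break the path.
  return `${base}/${encodeURIComponent(storeId)}/records/${encodeURIComponent(key)}`;
}

// STORE_ID and the key below are placeholders, not real values.
const url = screenshotRecordUrl('STORE_ID', 'screenshot-abc-1700000000000.png');
console.log(url);
```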
What happens if the browser crashes?
The actor catches browser process exits and returns a BROWSER_CRASHED error to the connected client. The actor itself remains running and new sessions can be created. If the browser cannot be restarted, the actor exits with an error code for Apify's failure webhook.

