Browser MCP Actor - RAG Web Browser

Browser automation bridge for AI agents using the Model Context Protocol (MCP) with RAG optimization

The Browser MCP Actor integrates the robust browser automation capabilities of Scrapling (powered by Camoufox) with the Model Context Protocol (MCP), enabling AI agents and language models to perform web scraping, testing, and automation tasks through a standardized interface. This Actor acts as a bridge between AI systems and web browsers, allowing models to navigate websites, extract data, fill forms, and perform complex browser interactions.

NEW: Includes RAG-optimized content extraction, Google Search integration, and intelligent content processing for LLM consumption.

🎯 Key Features

  • MCP Protocol Integration: Facilitates seamless communication between AI agents and web browsers
  • RAG-Optimized Content Extraction: Clean markdown, text, and HTML output perfect for LLM consumption
  • Google Search Integration: Query Google and scrape top results automatically
  • Intelligent Content Processing: Remove navigation, ads, cookie banners; extract readable content
  • Comprehensive Browser Automation: Supports multiple browsers with device emulation and stealth mode
  • Cloudflare Challenge Solving: Automatically bypasses Cloudflare protection when enabled
  • Intelligent Element Detection: Includes retry mechanisms and error handling for robust automation
  • Multiple Output Formats: Markdown, text, and HTML (["markdown", "text", "html"])
  • Session Management: Maintains persistent browser sessions for complex multi-step workflows
  • State Persistence: Automatically saves and restores state during server migrations
  • Proxy Support: Built-in support for Apify proxy and custom proxy configurations

👥 Target Audience

  • AI Developers: Building autonomous agents that need web interaction capabilities
  • QA Engineers: Implementing AI-assisted testing workflows
  • Data Scientists: Requiring intelligent web scraping solutions
  • Businesses: Looking to automate web-based processes through conversational AI interfaces

🚀 Benefits

  • Reduced Development Time: Eliminates the need for custom browser automation code
  • Enhanced Reliability: Features AI-driven error recovery and adaptive element selection
  • Improved Accessibility: Allows non-technical users to describe tasks in natural language
  • Scalable Automation: Handles dynamic websites and complex user workflows with minimal manual intervention

📋 Available Tools

RAG-Optimized Tools

1. RAG Web Browser

The all-in-one tool for RAG pipelines: search Google or scrape a URL, then automatically extract clean content optimized for LLM consumption.

Parameters:

  • query (required): Google Search keywords OR a direct URL to scrape
  • maxResults (optional): Maximum search results to scrape (default: 3, max: 10)
  • outputFormats (optional): Array of formats - text, markdown, html (default: ["markdown"])
  • htmlTransformer (optional): none or readability for main content extraction (default: none)
  • removeElements (optional): Array of CSS selectors for elements to remove
  • removeCookieWarnings (optional): Remove cookie consent dialogs (default: true)
  • solveCloudflare (optional): Solve Cloudflare challenges (default: true)
  • proxy (optional): Proxy URL
  • timeout (optional): Request timeout in milliseconds (default: 40000)

Example - Search and scrape:

{
  "command": "rag_web_browser",
  "arguments": {
    "query": "python async programming best practices",
    "maxResults": 3,
    "outputFormats": ["markdown", "text"],
    "htmlTransformer": "readability",
    "solveCloudflare": true
  }
}

Example - Direct URL:

{
  "command": "rag_web_browser",
  "arguments": {
    "query": "https://docs.python.org/3/library/asyncio.html",
    "outputFormats": ["markdown"]
  }
}

2. Google Search

Search Google and return the top organic results with URLs, titles, and descriptions.

Parameters:

  • query (required): Search query
  • maxResults (optional): Maximum results (default: 5, max: 20)
  • proxy (optional): Proxy URL

Example:

{
  "command": "google_search",
  "arguments": {
    "query": "machine learning tutorials site:medium.com",
    "maxResults": 10
  }
}

3. Extract Content

Extract and process content from an active browser session in RAG-optimized formats.

Parameters:

  • session_id (required): Active browser session ID
  • outputFormats (optional): Array of formats (default: ["markdown"])
  • htmlTransformer (optional): none or readability (default: none)
  • removeElements (optional): CSS selectors to remove
  • includeMetadata (optional): Include page metadata (default: true)

Example:

{
  "command": "extract_content",
  "arguments": {
    "session_id": "abc-123",
    "outputFormats": ["markdown", "text", "html"],
    "htmlTransformer": "readability",
    "includeMetadata": true
  }
}

Browser Automation Tools

4. Navigate

Navigate to a URL with comprehensive configuration options.

Parameters:

  • url (required): Target URL to navigate to
  • headless (optional): Run in headless mode (default: true)
  • solve_cloudflare (optional): Automatically solve Cloudflare challenges (default: false)
  • network_idle (optional): Wait for network to be idle (default: false)
  • wait_selector (optional): CSS selector to wait for before returning
  • timeout (optional): Timeout in milliseconds (default: 30000)
  • proxy (optional): Proxy URL or configuration
  • disable_resources (optional): Disable loading of images, fonts, etc. for speed
  • session_id (optional): Session ID to reuse existing browser session

Example:

{
  "command": "navigate",
  "arguments": {
    "url": "https://example.com",
    "solve_cloudflare": true,
    "headless": true,
    "network_idle": true
  }
}

5. Extract

Extract data from the current page using CSS selectors.

Parameters:

  • session_id (required): Session ID of the browser
  • selector (required): CSS selector to extract data from
  • attribute (optional): Attribute to extract (text, href, src, etc.) - default: text
  • multiple (optional): Extract multiple elements (default: false)
  • format (optional): Output format - json, text, or html (default: text)

Example:

{
  "command": "extract",
  "arguments": {
    "session_id": "abc-123",
    "selector": "h1.title",
    "attribute": "text",
    "format": "json"
  }
}

6. Screenshot

Take a screenshot of the current page or specific element.

Parameters:

  • session_id (required): Session ID of the browser
  • selector (optional): CSS selector of element to screenshot (full page if omitted)
  • format (optional): Image format - png or jpeg (default: png)
  • full_page (optional): Capture full scrollable page (default: false)

Example:

{
  "command": "screenshot",
  "arguments": {
    "session_id": "abc-123",
    "full_page": true,
    "format": "png"
  }
}

7. Click

Click on an element identified by CSS selector.

Parameters:

  • session_id (required): Session ID of the browser
  • selector (required): CSS selector of element to click
  • wait_navigation (optional): Wait for navigation after click (default: false)
  • timeout (optional): Timeout in milliseconds

Example:

{
  "command": "click",
  "arguments": {
    "session_id": "abc-123",
    "selector": "button.submit",
    "wait_navigation": true
  }
}

8. Fill Form

Fill form fields with provided data.

Parameters:

  • session_id (required): Session ID of the browser
  • fields (required): Map of CSS selectors to values to fill

Example:

{
  "command": "fill_form",
  "arguments": {
    "session_id": "abc-123",
    "fields": {
      "input[name='email']": "user@example.com",
      "input[name='password']": "secret123"
    }
  }
}

9. Execute Script

Execute JavaScript code on the current page.

Parameters:

  • session_id (required): Session ID of the browser
  • script (required): JavaScript code to execute
  • args (optional): Arguments to pass to the script

Example:

{
  "command": "execute_script",
  "arguments": {
    "session_id": "abc-123",
    "script": "return document.title"
  }
}

10. Wait

Wait for a specific condition on the page.

Parameters:

  • session_id (required): Session ID of the browser
  • selector (optional): CSS selector to wait for
  • state (optional): State to wait for - attached, detached, visible, hidden (default: attached)
  • timeout (optional): Timeout in milliseconds

Example:

{
  "command": "wait",
  "arguments": {
    "session_id": "abc-123",
    "selector": ".content-loaded",
    "state": "visible",
    "timeout": 5000
  }
}

11. Get Page Info

Get information about the current page (URL, title, cookies, etc.).

Parameters:

  • session_id (required): Session ID of the browser

Example:

{
  "command": "get_page_info",
  "arguments": {
    "session_id": "abc-123"
  }
}

12. Close Session

Close a browser session and free resources.

Parameters:

  • session_id (required): Session ID of the browser to close

Example:

{
  "command": "close_session",
  "arguments": {
    "session_id": "abc-123"
  }
}

🔧 Full Browser Configuration

The Actor supports the full set of browser configuration options when navigating:

| Argument | Description | Optional |
|---|---|---|
| url | Target URL | ❌ |
| headless | Run browser in headless (true) or headful (false) mode | ✔️ |
| disable_resources | Drop unnecessary resources (font, image, media) for speed | ✔️ |
| cookies | Set cookies for the request | ✔️ |
| useragent | Custom user agent string | ✔️ |
| network_idle | Wait until no network connections for 500ms | ✔️ |
| load_dom | Wait for JavaScript to fully load (default: true) | ✔️ |
| timeout | Timeout in milliseconds (default: 30000) | ✔️ |
| wait | Additional wait time after page load | ✔️ |
| wait_selector | Wait for specific CSS selector | ✔️ |
| wait_selector_state | State to wait for selector (default: attached) | ✔️ |
| google_search | Set referer as Google search (default: true) | ✔️ |
| extra_headers | Dictionary of extra HTTP headers | ✔️ |
| proxy | Proxy string or configuration | ✔️ |
| solve_cloudflare | Solve Cloudflare challenges automatically | ✔️ |
| block_webrtc | Force WebRTC to respect proxy settings | ✔️ |
| hide_canvas | Add noise to canvas for fingerprint prevention | ✔️ |
| allow_webgl | Enable WebGL (default: true) | ✔️ |
| real_chrome | Use installed Chrome browser | ✔️ |
| locale | Specify user locale (e.g., en-GB, de-DE) | ✔️ |
| timezone_id | Change browser timezone | ✔️ |

💡 Usage Examples

Example 1: Simple Web Scraping

{
  "command": "navigate",
  "arguments": {
    "url": "https://example.com",
    "wait_selector": "article.post"
  }
}

After navigation, extract titles:

{
  "command": "extract",
  "arguments": {
    "session_id": "<returned-session-id>",
    "selector": "h2.post-title",
    "multiple": true,
    "format": "json"
  }
}

Example 2: Cloudflare-Protected Site

{
  "command": "navigate",
  "arguments": {
    "url": "https://cloudflare-protected-site.com",
    "solve_cloudflare": true,
    "headless": true,
    "network_idle": true
  }
}

Example 3: Form Automation

Navigate, fill, and submit a form:

{
  "command": "navigate",
  "arguments": {
    "url": "https://example.com/login"
  }
}

Then fill and submit:

{
  "command": "fill_form",
  "arguments": {
    "session_id": "<session-id>",
    "fields": {
      "#username": "myuser",
      "#password": "mypass"
    }
  }
}
{
  "command": "click",
  "arguments": {
    "session_id": "<session-id>",
    "selector": "button[type='submit']",
    "wait_navigation": true
  }
}

Example 4: Using Apify Proxy

{
  "command": "navigate",
  "arguments": {
    "url": "https://example.com"
  },
  "proxyConfig": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

πŸƒ Running the Actor

On Apify Platform

  1. Create a new Actor
  2. Upload this code
  3. Configure input in JSON format
  4. Run the Actor

Locally

$ apify run

Input Schema

{
  "command": "navigate",
  "arguments": {
    "url": "https://example.com",
    "solve_cloudflare": true
  },
  "proxyConfig": {
    "useApifyProxy": true
  }
}
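
Programmatic runs are also possible. Below is a minimal sketch that starts a run with the input above using the Apify Python client (apify-client); the Actor ID <username>/browser-mcp and the token are placeholders, not the published identifiers.

from apify_client import ApifyClient

# Placeholders: substitute your own API token and the Actor's real ID from the Store.
client = ApifyClient("<YOUR_APIFY_TOKEN>")

run_input = {
    "command": "navigate",
    "arguments": {
        "url": "https://example.com",
        "solve_cloudflare": True,
    },
    "proxyConfig": {"useApifyProxy": True},
}

# call() starts the run and waits for it to finish.
run = client.actor("<username>/browser-mcp").call(run_input=run_input)
print(run["status"], run["defaultDatasetId"])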

MCP Server Mode

If no command is specified, the Actor runs in MCP server mode, accepting commands via stdio:

$ python -m src.main
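
For local experiments, an MCP client can connect to this stdio server. The sketch below uses the official MCP Python SDK (the mcp package) and assumes the server exposes tools under the command names listed above; treat it as an illustration rather than a tested client.

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Spawn the Actor's MCP server as a child process speaking stdio.
    server = StdioServerParameters(command="python", args=["-m", "src.main"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Available tools:", [tool.name for tool in tools.tools])
            # Open a page; the response should include a session_id that later
            # tool calls (extract, click, screenshot, ...) can reuse.
            result = await session.call_tool("navigate", arguments={"url": "https://example.com"})
            print(result.content)

asyncio.run(main())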

📊 Output

The Actor outputs structured data to the Apify dataset:

{
  "command": "extract",
  "arguments": {
    "session_id": "abc-123",
    "selector": "h1"
  },
  "result": [
    {
      "data": "Example Title",
      "selector": "h1"
    }
  ],
  "status": "success"
}
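
The Apify Python client can read these records back from the run's default dataset. A minimal sketch, with the token and dataset ID left as placeholders:

from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

# The dataset ID comes from the finished run (run["defaultDatasetId"] in the
# earlier sketch); a literal dataset ID works just as well.
for item in client.dataset("<DATASET_ID>").iterate_items():
    if item.get("status") == "success":
        print(item["command"], item.get("result"))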

🛠️ Development

Install Dependencies

$ pip install .

Project Structure

browser-mcp/
├── src/
│   ├── __init__.py
│   ├── __main__.py
│   ├── main.py                # Main entry point
│   ├── mcp_server.py          # MCP protocol implementation
│   └── browser_session.py     # Browser session management
├── pyproject.toml
└── README.md

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under the MIT License.

πŸ› Troubleshooting

Common Issues

  1. Cloudflare not solving: Ensure solve_cloudflare: true is set in arguments
  2. Session not found: Always use the returned session_id from navigate command
  3. Timeout errors: Increase the timeout value or use network_idle: true
  4. Element not found: Verify selectors and use wait_selector to ensure page is loaded
  5. Sessions lost after migration: This is expected behavior. Browser sessions cannot be fully restored after server migration. Clients should re-create sessions when they receive migration notifications.

State Persistence & Migrations

The Actor implements automatic state persistence to handle Apify server migrations:

How it works:

  • When a migration event is detected, the Actor saves the current state (active session IDs and metadata)
  • The Actor then reboots automatically to speed up the migration process
  • On restart, the Actor checks for previously saved state

Important notes:

  • Browser sessions cannot be fully restored - Active browser contexts and page states are lost during migration
  • Session IDs are preserved for tracking purposes, but the underlying browser instances must be recreated
  • Clients should handle reconnection - After a migration, clients need to create new sessions via the navigate command
  • For long-running operations - Consider checkpointing your workflow at logical points

Migration frequency:

  • Migrations can occur at any time due to server maintenance, load balancing, or crashes
  • The Actor is optimized to complete migrations within seconds

Best practices:

  • Design workflows to be resumable
  • Store extracted data frequently using Actor.push_data()
  • For critical operations, implement retry logic in your client code (a sketch follows below)
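
To make the reconnection and retry advice above concrete, here is a rough client-side sketch. send_command is a hypothetical helper standing in for however your client dispatches commands to the Actor (Apify API call, MCP tool call, etc.), and the session_id / error fields reflect assumptions about the response shape rather than a guaranteed contract.

def send_command(command: str, arguments: dict) -> dict:
    """Hypothetical helper: dispatch one command to the Actor and return its result."""
    raise NotImplementedError  # replace with your real transport

def run_with_session_retry(url: str, command: str, arguments: dict, retries: int = 2) -> dict:
    """Re-create the browser session and retry a command if the session was lost."""
    # Assumption: navigate returns the new session ID under a "session_id" key.
    session_id = send_command("navigate", {"url": url})["session_id"]
    result: dict = {}
    for _ in range(retries + 1):
        result = send_command(command, {**arguments, "session_id": session_id})
        # Assumption: a lost session surfaces as an "error" field mentioning the session.
        if "session" not in str(result.get("error", "")).lower():
            return result
        # Session gone (e.g. after a migration): navigate again for a fresh session_id.
        session_id = send_command("navigate", {"url": url})["session_id"]
    return result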

📞 Support

For issues and questions:

  • Open an issue on GitHub
  • Contact via Apify platform
  • Check the Scrapling/Camoufox documentation for browser-specific issues

Usage

  1. Configure the command, arguments, and proxy settings in the Actor input.
  2. Run the Actor to execute the command or start the MCP server.
  3. Review the results in the output dataset.