Pricing

Pay per usage

Wick Web Fetcher — Browser-Grade Content Extraction

Fetch web pages using Chrome's real TLS fingerprint. Returns clean markdown for LLMs and RAG pipelines. No headless browser needed — fast and lightweight.

Developer

Adam Fisk

Last modified: 2 months ago

Wick Web Fetcher

A lightweight content extraction Actor powered by Wick, an open-source tool that uses Chrome's real network stack (Cronet) to fetch web pages. Because requests go through the same TLS implementation as a real Chrome browser (BoringSSL, HTTP/2, QUIC), Wick reaches sites that block raw HTTP clients.

When to use this Actor

Quick single-page fetches where spinning up a full browser is overkill
LLM and RAG pipelines that need clean markdown from web pages
Lightweight content extraction at low memory cost (256 MB)
As a complement to browser-based Actors: use Wick for the pages that don't need JS rendering and save browser compute for the pages that do

How it works

Under the hood, this Actor runs the Wick binary as a local HTTP API server inside the container. Wick makes requests using Cronet -- Chrome's network stack extracted as a standalone library. The response HTML is converted to clean markdown, stripping navigation, ads, and boilerplate.

No headless browser is launched. This makes it fast (~1-3s per page) and lightweight (256 MB vs typical 1-4 GB for browser-based Actors).

Getting started

Run the Actor with a list of URLs:

{
    "urls": ["https://www.nytimes.com", "https://docs.example.com"],
    "mode": "fetch",
    "format": "markdown"
}

Or crawl a whole site:

{
    "urls": ["https://docs.example.com"],
    "mode": "crawl",
    "maxDepth": 2,
    "maxPages": 20
}

Results appear in the Output tab as a table. Each row is one page with its URL, title, content, status code, and timing.
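
The same input can also be built and submitted from code. The sketch below is illustrative only: it assumes the third-party `apify-client` Python package, an `APIFY_TOKEN` environment variable, and a placeholder Actor ID, none of which are specified in this README. Check the Actor's API tab for the real ID before using it.

```python
import os


def build_fetch_input(urls, output_format="markdown"):
    """Build the fetch-mode input shown above for a list of URLs."""
    if not urls:
        raise ValueError("at least one URL is required")
    return {"urls": list(urls), "mode": "fetch", "format": output_format}


def run_actor(actor_id, run_input):
    """Run the Actor and return its dataset rows.

    Assumes `pip install apify-client` and an APIFY_TOKEN environment variable.
    """
    from apify_client import ApifyClient  # imported lazily: third-party package

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example usage (the Actor ID is a placeholder, not taken from this README):
#   rows = run_actor("<username>/<actor-name>",
#                    build_fetch_input(["https://docs.example.com"]))
#   for row in rows:
#       print(row["statusCode"], row["url"], row["title"])
```

Keeping the input builder separate from the network call makes the input easy to check locally before spending any compute.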

Modes

Fetch (default)

Fetches one or more URLs and returns clean content. Each URL becomes one row in the output dataset with title, content, status code, and timing.

Crawl

Starts from a URL and follows same-domain links. Returns content for every page discovered, each as a separate dataset row. Set maxDepth (1-5) and maxPages (1-50) to bound the crawl.
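
To make the depth and page limits concrete, here is a minimal sketch of a breadth-first, same-domain crawl over an in-memory link graph. It is not Wick's implementation. The function name, the `get_links` callback, and the choice to count the start page as depth 0 are assumptions made for illustration.

```python
from collections import deque
from urllib.parse import urlparse


def crawl_order(start_url, get_links, max_depth=2, max_pages=20):
    """Return the URLs a breadth-first, same-domain crawl would visit.

    get_links(url) is a hypothetical callback returning the links on a page.
    """
    if not 1 <= max_depth <= 5:
        raise ValueError("max_depth must be between 1 and 5")
    if not 1 <= max_pages <= 50:
        raise ValueError("max_pages must be between 1 and 50")

    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])  # (url, depth); start page is depth 0
    visited = []

    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # do not expand links beyond the depth limit
        for link in get_links(url):
            # Follow only unseen links on the same host as the start URL.
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

With this ordering, pages closer to the start URL are fetched first, so a low page limit still covers the top of the site before deeper pages.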

Map

Discovers all URLs on a site by checking sitemap.xml and following links. Returns a URL list without fetching content -- useful for planning a targeted crawl or building a sitemap.
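
As an illustration of the sitemap half of this mode, the sketch below extracts page URLs from a sitemap.xml document using the standard sitemap namespace. It shows the general technique only and is not taken from Wick's source. The function name and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_from_sitemap(xml_text):
    """Return every <loc> URL found in a sitemap document, in order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://docs.example.com/</loc></url>
  <url><loc>https://docs.example.com/guide</loc></url>
</urlset>"""

print(urls_from_sitemap(SAMPLE))
```

A sitemap index file uses the same `<loc>` element for nested sitemaps, so a fuller version would fetch and parse those recursively.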

Output

Each dataset row contains:

Field	Description
`url`	The URL that was fetched
`title`	Page title
`content`	Page content in markdown, HTML, or plain text
`statusCode`	HTTP response status
`timingMs`	Fetch duration in milliseconds
`format`	Output format used
`fetchedAt`	ISO 8601 timestamp
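
Downstream code can treat each row as a small record. The sketch below mirrors the fields in the table. The Python types are assumptions (the table does not state them), and the helper name is made up for the example. Rows from map mode may not carry all of these fields.

```python
from dataclasses import dataclass, fields


@dataclass
class PageRow:
    url: str
    title: str
    content: str
    statusCode: int   # assumed to be an integer HTTP status
    timingMs: int     # assumed to be an integer number of milliseconds
    format: str
    fetchedAt: str    # ISO 8601 timestamp, kept as a string here


_FIELD_NAMES = {f.name for f in fields(PageRow)}


def successful_rows(items):
    """Keep rows with a 2xx status and non-empty content."""
    rows = [
        # Ignore any extra keys so unknown fields do not break parsing.
        PageRow(**{k: v for k, v in item.items() if k in _FIELD_NAMES})
        for item in items
    ]
    return [r for r in rows if 200 <= r.statusCode < 300 and r.content.strip()]
```

Filtering on status code and empty content before indexing keeps blocked or error pages out of a RAG corpus.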

Residential IP mode (optional)

To further reduce the chance of being blocked, you can connect this Actor to your own Wick instance running on your machine. Requests then route through your residential IP, so you keep Apify's scheduling and monitoring while traffic originates from your own network.

1. Install Wick on your machine (`brew install wick` or `npm i -g wick-mcp`)
2. Start the API server: `wick serve --api`
3. Expose it via a tunnel (Cloudflare Tunnel, ngrok, etc.)
4. Enter the tunnel URL in the Wick Tunnel URL input field

Limitations

This Actor uses Wick's Cronet-only build. The full Wick binary also supports CEF-based JavaScript rendering, but the Cronet build keeps this Actor lightweight (256 MB vs ~1.5 GB with CEF bundled). For JS-heavy SPAs, run Wick locally with the CEF renderer or pair this with a browser-based Actor.
Best for content pages. Wick excels at articles, documentation, blogs, and product pages. For structured data extraction (e.g., specific fields from a listing), consider combining Wick's output with an LLM or a purpose-built scraper.

Integrations

Wick's output works with Apify's built-in integrations. Some ideas:

Pinecone / Qdrant / PGVector -- Crawl a docs site, then push the markdown straight into a vector database for RAG.
OpenAI Vector Store -- Feed crawled content to an OpenAI Assistant.
Google Sheets -- Export fetched pages to a spreadsheet for review.
Zapier / Make / n8n -- Trigger downstream workflows when a crawl finishes.

Set these up from the Integrations tab on your Actor run page.

Cost estimate

This Actor uses 256 MB of memory and runs fast, so compute costs are low:

Task	Approximate cost
Fetch 10 URLs	~$0.001
Crawl 50 pages	~$0.005
Map a site (100 URLs)	~$0.001

You only pay for Apify compute units. The Wick engine is fully open source (MIT license) — no subscription, no paid tier.
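
For a rough estimate of your own workload, note that one Apify compute unit corresponds to 1 GB of memory used for 1 hour. The sketch below applies that formula. The price per compute unit depends on your Apify plan and is passed in as a parameter, and the model assumes pages are fetched one after another with no startup overhead, so treat the result as a lower bound rather than a quote.

```python
def compute_units(memory_mb, run_seconds):
    """Apify compute units: memory in GB multiplied by run time in hours."""
    return (memory_mb / 1024) * (run_seconds / 3600)


def estimate_cost(pages, seconds_per_page, usd_per_cu, memory_mb=256):
    """Estimate run cost in USD for sequential fetches.

    usd_per_cu is an assumption supplied by the caller; look it up for your plan.
    """
    return compute_units(memory_mb, pages * seconds_per_page) * usd_per_cu


# One hour at 256 MB uses a quarter of a compute unit.
print(compute_units(256, 3600))  # 0.25
```

For example, `estimate_cost(50, 2, usd_per_cu=...)` models a 50-page crawl at about 2 seconds per page.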
