Enter Google Search keywords or a URL of a specific web page. Supports advanced search operators. Examples: san francisco weather, https://www.cnn.com, function calling site:openai.com. Leave empty only when using URLs (optional bulk) below.

Type:string

URLs (optional bulk)

urls

Optional

Skip Google Search and scrape these URLs directly. If set, the search term above is ignored.

Type:array

Default:

[]
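
The precedence between the search term and the urls field can be sketched as below. This is only an illustration of the rules stated in the field descriptions, not the Actor's own code. The key name `query` for the search-term field and the helper `resolve_targets` are assumptions made for the example.

```python
from urllib.parse import urlparse


def resolve_targets(run_input: dict) -> tuple[str, list[str]]:
    """Return (mode, values) for a run input (hypothetical helper)."""
    urls = run_input.get("urls") or []
    # "query" is an assumed key name for the search-term field.
    query = (run_input.get("query") or "").strip()

    if urls:
        # A non-empty urls list skips Google Search; the search term is ignored.
        return "urls", list(urls)
    if not query:
        raise ValueError("Provide a search term, a URL, or a non-empty urls list")

    parsed = urlparse(query)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # The search term is itself a URL, so only that page is fetched.
        return "url", [query]
    return "search", [query]
```

With this reading, a run input that sets both fields is treated as a direct scrape of the listed URLs.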

Maximum results

maxResults

Optional

The maximum number of top organic Google Search results whose web pages will be extracted. If the query is a URL, this field is ignored and only that page is fetched.

Type:integer

Minimum:1

Maximum:100000

Default:10

Output formats

outputFormats

Optional

Select one or more formats in which the extracted content of the target web pages is saved to the resulting dataset.

Type:string[]

Default:

[
  "markdown"
]

SERP proxy group

serpProxyGroup

Optional

Overrides the default Apify Proxy group used for fetching Google Search results.

Type:string

Default:GOOGLE_SERP

Options:

GOOGLE_SERP, SHADER

SERP max retries

serpMaxRetries

Optional

The maximum number of times the Actor will retry fetching Google Search results on error. If the last attempt fails, the entire search step fails.

Type:integer

Minimum:0

Maximum:5

Default:2
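
A minimal sketch of the retry behaviour described for serpMaxRetries, assuming one initial attempt plus up to serpMaxRetries retries. The function name `fetch_serp_with_retries` and the `fetch` callable are hypothetical; the Actor's actual implementation may differ (for example, it may wait between attempts).

```python
from typing import Callable


def fetch_serp_with_retries(
    fetch: Callable[[], list[str]], serp_max_retries: int = 2
) -> list[str]:
    """Call fetch() once, then retry up to serp_max_retries times on error."""
    if not 0 <= serp_max_retries <= 5:
        raise ValueError("serpMaxRetries must be between 0 and 5")

    last_error = None
    # One initial attempt plus up to serp_max_retries retries.
    for _ in range(1 + serp_max_retries):
        try:
            return fetch()
        except Exception as error:  # any error counts as a failed attempt
            last_error = error

    # The last attempt failed, so the entire search step fails.
    raise RuntimeError(
        f"Search step failed after {1 + serp_max_retries} attempts"
    ) from last_error
```

Under this assumption the default of 2 allows three attempts in total, and a value of 0 disables retries.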

Proxy configuration

proxyConfiguration

Optional

Apify Proxy settings for scraping target web pages. When enabled, requests start on datacenter proxies. If a site blocks the request, the Actor automatically escalates to residential proxies (up to 3 retries, then stays on residential for the rest of the run).

Type:object

Default:

{
  "useApifyProxy": true
}
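
The escalation described above can be pictured with the following sketch. It is one possible reading of the description, not the Actor's code: the class `ProxyEscalator`, the tier names, and the `fetch_page(url, tier)` callable (which returns page content, or None when blocked) are all assumptions for illustration.

```python
from typing import Callable, Optional

DATACENTER = "datacenter"
RESIDENTIAL = "residential"
MAX_RESIDENTIAL_RETRIES = 3  # "up to 3 retries" from the field description


class ProxyEscalator:
    """Tracks the proxy tier for a run; escalation is one-way."""

    def __init__(self) -> None:
        self.tier = DATACENTER

    def fetch(
        self, url: str, fetch_page: Callable[[str, str], Optional[str]]
    ) -> Optional[str]:
        if self.tier == DATACENTER:
            content = fetch_page(url, DATACENTER)
            if content is not None:
                return content
            # Blocked on datacenter: switch to residential and stay there.
            self.tier = RESIDENTIAL

        for _ in range(MAX_RESIDENTIAL_RETRIES):
            content = fetch_page(url, RESIDENTIAL)
            if content is not None:
                return content
        return None  # still blocked after the residential attempts
```

Because the tier is stored on the object, every page fetched after the first block goes straight to residential proxies for the rest of the run.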

Select a scraping tool

scrapingTool

Optional

Raw HTTP is fast and works for most static sites. Browser (Playwright) mode is not available in this Python build — if selected, Raw HTTP is used instead.

Type:string

Default:raw-http

Options:

raw-http, browser-playwright

Remove HTML elements (CSS selector)

removeElementsCssSelector

Optional

CSS selectors for elements removed from the DOM before conversion to text or Markdown. Set to a non-matching selector like dummy_keep_everything to disable removal.

Type:string

Default:nav, footer, script, style, noscript, svg, img[src^='data:'], [role="alert"], [role="banner"], [role="dialog"], [role="alertdialog"], [role="region"][aria-label*="skip" i], [aria-modal="true"]

HTML transformer

htmlTransformer

Optional

How to transform HTML after element removal. None keeps the cleaned page; Readable text extracts the main article body.

Type:string

Default:none

Options:

none, readable

Target page max retries

maxRequestRetries

Optional

Number of times a target page is retried after its first failed request, before the page is skipped or proxy escalation continues.

Type:integer

Minimum:0

Maximum:3

Default:1

Target page dynamic content timeout

dynamicContentWaitSecs

Optional

Maximum seconds to wait for dynamic content when using Browser mode. Ignored for Raw HTTP (the default scraping tool in this build).

Type:integer

Minimum:0

Maximum:60

Default:10

Remove cookie warnings

removeCookieWarnings

Optional

Remove cookie consent banners before extraction. Slightly increases processing time.

Type:boolean

Default:true

Enable debug mode

debugMode

Optional

Store debugging information (proxy tier, final URL, byte length) in each dataset record under the debug field.

Type:boolean

Default:false
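
To tie the fields together, here is a small sketch that assembles a run input from the documented defaults and checks the documented bounds and options before a run is started. The helper `build_run_input` is hypothetical. removeElementsCssSelector is left out so that the Actor's own default applies, and the search-term key is not included because its name is not shown in this reference; any extra keys are passed through unchanged.

```python
import copy

# Defaults as listed in the field reference above.
DEFAULTS = {
    "urls": [],
    "maxResults": 10,
    "outputFormats": ["markdown"],
    "serpProxyGroup": "GOOGLE_SERP",
    "serpMaxRetries": 2,
    "proxyConfiguration": {"useApifyProxy": True},
    "scrapingTool": "raw-http",
    "htmlTransformer": "none",
    "maxRequestRetries": 1,
    "dynamicContentWaitSecs": 10,
    "removeCookieWarnings": True,
    "debugMode": False,
}

# (minimum, maximum) for each integer field.
INT_BOUNDS = {
    "maxResults": (1, 100000),
    "serpMaxRetries": (0, 5),
    "maxRequestRetries": (0, 3),
    "dynamicContentWaitSecs": (0, 60),
}

# Allowed values for each enumerated string field.
CHOICES = {
    "serpProxyGroup": {"GOOGLE_SERP", "SHADER"},
    "scrapingTool": {"raw-http", "browser-playwright"},
    "htmlTransformer": {"none", "readable"},
}


def build_run_input(**overrides) -> dict:
    """Merge overrides into the defaults and validate bounds and options."""
    run_input = copy.deepcopy(DEFAULTS)
    run_input.update(overrides)

    for key, (low, high) in INT_BOUNDS.items():
        value = run_input[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{key} must be an integer")
        if not low <= value <= high:
            raise ValueError(f"{key} must be between {low} and {high}")

    for key, allowed in CHOICES.items():
        if run_input[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")

    return run_input
```

For example, `build_run_input(maxResults=3, debugMode=True)` keeps every other field at its documented default.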
