Selenium Cloud Runner

Pricing

from $8.00 / 1,000 scraped pages

Selenium Cloud Runner scrapes JavaScript-heavy websites using Selenium and headless Chrome. It extracts data with CSS or XPath rules, supports scrolling, popup handling, screenshots, proxies, retries, and structured dataset exports.

Rating: 0.0 (0)

Developer: Sovanza

Maintained by Community

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 4 days ago

🌐 Selenium Cloud Runner – Dynamic Website Scraper & Browser Automation Tool

A powerful Selenium-based web scraper that renders JavaScript-heavy pages and extracts data using flexible CSS selectors or XPath rules. This actor works as a universal browser scraping engine, allowing you to scrape almost any website without building a custom scraper from scratch.


🚀 Start Scraping Dynamic Websites

Extract data from modern web apps and single-page applications (SPAs).

👉 Render JavaScript-heavy websites
👉 Extract data using CSS or XPath
👉 Handle scrolling and popups automatically
👉 Export results to JSON, CSV, or Excel

Click “Run” and start scraping instantly.


🧠 What This Tool Actually Solves

Scrapers that only fetch raw HTML often fail on modern websites because:

Content loads dynamically via JavaScript
The data is not present in the initial HTML response
Pages require scrolling or interaction before content appears

This actor solves that by using:

Selenium + real browser rendering
Dynamic DOM extraction
Configurable waits and interactions
Flexible extraction rules


⚑ Key Features

Selenium web scraper (headless Chrome)
Dynamic website scraping (SPA support)
CSS selector and XPath extraction
Infinite scroll handling
Popup and modal handling
Retry logic and error handling
Screenshot capture support
Proxy support for blocked sites


🔄 How It Works

  1. Open the URL in a headless browser
  2. Wait for the page to load (selector-based)
  3. Handle popups and scrolling (optional)
  4. Extract data using the extraction rules
  5. Save structured results to the dataset
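
The optional scrolling step can be sketched roughly as follows. This is an illustration, not the actor's actual code: it assumes the scroll options (maxRounds, scrollBy, pauseSecs, stopOnNoNewHeightRounds) behave as their names suggest, and it relies only on a Selenium-style driver.execute_script method. The function name scroll_until_stable is hypothetical.

```python
import time


def scroll_until_stable(driver, max_rounds=10, scroll_by=900,
                        pause_secs=1.0, stop_on_no_new_height_rounds=2,
                        sleep=time.sleep):
    """Scroll down in steps until the page height stops growing.

    `driver` is any object with a Selenium-style execute_script(script, *args).
    Returns the number of scroll rounds performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    unchanged_rounds = 0
    rounds_done = 0
    for _ in range(max_rounds):
        driver.execute_script("window.scrollBy(0, arguments[0]);", scroll_by)
        sleep(pause_secs)  # give lazy-loaded content time to arrive
        rounds_done += 1
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height > last_height:
            # New content appeared, so reset the "no growth" counter.
            unchanged_rounds = 0
            last_height = new_height
        else:
            unchanged_rounds += 1
            if unchanged_rounds >= stop_on_no_new_height_rounds:
                break  # assumed meaning of stopOnNoNewHeightRounds
    return rounds_done
```

Stopping after several rounds without growth, rather than after the first one, tolerates feeds that occasionally load a batch late.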


📊 Data You Can Extract

Using extraction rules, you can collect:

Text content
HTML elements
Attributes (links, images, etc.)
Lists of elements (multiple matches)


🎯 Real-World Use Cases

🌐 Dynamic Website Scraping
Extract data from JavaScript-heavy sites and SPAs.

🛒 E-commerce Scraping
Collect product listings, prices, and details.

📊 Content Monitoring
Track changes in dashboards, feeds, or listings.

🔍 Search Result Extraction
Scrape search pages with infinite scrolling.

🤖 Automation Workflows
Integrate scraping into pipelines and APIs.


🛠️ How to Use

Add URLs to scrape
Define extraction rules (CSS/XPath)
Configure waits and scrolling
Enable proxy if needed
Run the actor
Export or integrate results


🛠️ How to Use Selenium Cloud Runner on Apify

At a high level:

  1. Add one or more URLs in urls.
  2. Set waitForSelector so the actor knows when the page is “ready”.
  3. Define extract rules to collect fields using CSS or XPath.
  4. (Optional) Enable scroll for infinite-scroll feeds and set closeSelectors for popups.
  5. (Optional) Enable takeScreenshot (saves SCREENSHOT_00001.png etc. to the default key-value store).
  6. (Optional) Enable proxyConfiguration for blocked sites (residential proxy recommended).
  7. Run the actor and export results from the default Dataset (JSON/CSV/Excel) or fetch via API.
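
Step 7 mentions fetching results via the API. A minimal sketch of that, using only the Python standard library, might look like the code below. It assumes the public Apify API endpoint for dataset items and bearer-token authentication; the helper names dataset_items_url and download_items are hypothetical, and the dataset ID and token must come from your own run and account.

```python
import urllib.request
from urllib.parse import quote, urlencode

API_BASE = "https://api.apify.com/v2"
# The three export formats named in this document; "xlsx" is the Excel export.
EXPORT_FORMATS = ("json", "csv", "xlsx")


def dataset_items_url(dataset_id, fmt="json"):
    """Build the URL that returns a dataset's items in the given format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"fmt must be one of {EXPORT_FORMATS}, got {fmt!r}")
    query = urlencode({"format": fmt})
    return f"{API_BASE}/datasets/{quote(dataset_id, safe='')}/items?{query}"


def download_items(dataset_id, token, fmt="json"):
    """Download the export as raw bytes. This performs a network request."""
    request = urllib.request.Request(
        dataset_items_url(dataset_id, fmt),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read()
```

Keeping URL construction separate from the download makes the former easy to check without network access.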

Input example

Full schema: INPUT_SCHEMA.json. Example:

{
  "urls": ["https://example.com/"],
  "waitForSelector": "h1",
  "waitTimeoutSecs": 25,
  "extract": [
    { "name": "title", "selector": "h1", "type": "text", "all": false },
    { "name": "links", "selector": "a", "type": "attr", "attr": "href", "all": true }
  ],
  "scroll": {
    "enabled": false,
    "maxRounds": 10,
    "scrollBy": 900,
    "pauseSecs": 1.0,
    "stopOnNoNewHeightRounds": 2
  },
  "closeSelectors": [],
  "maxRetries": 2,
  "retryDelaySecs": 3,
  "includeVisibleText": true,
  "includeHtml": false,
  "takeScreenshot": false,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "US"
  }
}

  • urls (required): One or more URLs to open.
  • waitForSelector (optional): CSS or XPath selector to wait for before extraction (default body).
  • extract (optional): Extraction rules (CSS or XPath) that produce custom fields.
  • scroll (optional): Infinite scroll behavior.
  • closeSelectors (optional): CSS selectors to click to close popups/modals.
  • maxRetries / retryDelaySecs (optional): Retry policy per URL.
  • includeVisibleText (optional): Include page visible text (default true).
  • includeHtml (optional): Include full HTML (default false).
  • takeScreenshot (optional): Save screenshots to key-value store (default false).
  • proxyConfiguration (optional): Apify proxy settings.
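
When the input is built programmatically, a small helper can apply the documented defaults and catch obvious mistakes before a run is started. The sketch below is hypothetical: build_input is not part of the actor, the required rule keys (name, selector, type) are inferred from the example above, and the http(s) check is an added assumption. INPUT_SCHEMA.json remains the authoritative definition.

```python
import copy

# Only the defaults stated in the parameter list above.
DOCUMENTED_DEFAULTS = {
    "waitForSelector": "body",
    "includeVisibleText": True,
    "includeHtml": False,
    "takeScreenshot": False,
}


def build_input(urls, **overrides):
    """Return an input dict: documented defaults plus caller overrides."""
    if isinstance(urls, str):
        urls = [urls]
    urls = [u.strip() for u in urls if u and u.strip()]
    if not urls:
        raise ValueError("urls is required and must contain at least one URL")
    for url in urls:
        # Assumption: only absolute http(s) URLs make sense for a browser run.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
    for rule in overrides.get("extract", []):
        # Assumption: every rule needs these keys, as in the example input.
        missing = {"name", "selector", "type"} - set(rule)
        if missing:
            raise ValueError(f"extract rule {rule!r} is missing {sorted(missing)}")
        if rule["type"] == "attr" and not rule.get("attr"):
            raise ValueError(f"rule {rule['name']!r} has type 'attr' but no 'attr'")
    run_input = copy.deepcopy(DOCUMENTED_DEFAULTS)
    run_input.update(copy.deepcopy(overrides))
    run_input["urls"] = urls
    return run_input
```

For example, build_input("https://example.com/", takeScreenshot=True) yields an input with a single URL, screenshots enabled, and the remaining documented defaults filled in.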

⚙️ Input Configuration

Required
urls → One or more URLs

Key Options
waitForSelector → Page readiness condition
extract[] → Data extraction rules
scroll → Infinite scroll configuration
closeSelectors → Popup handling
maxRetries / retryDelaySecs → Retry logic
includeVisibleText → Extract full text
includeHtml → Include page HTML
takeScreenshot → Save page screenshots
proxyConfiguration → Use proxy
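
To make the retry options concrete, here is a rough sketch of a per-URL retry loop. It assumes maxRetries counts additional attempts after the first one and that retryDelaySecs is a fixed pause between attempts; the actor's real policy may differ. The names fetch_with_retries and scrape_once are hypothetical.

```python
import time


def fetch_with_retries(scrape_once, url, max_retries=2, retry_delay_secs=3,
                       sleep=time.sleep):
    """Call scrape_once(url) and retry on any exception.

    scrape_once must return a dict of result fields. The return value is a
    dict shaped like a dataset item, with status "OK" or "ERROR".
    """
    attempts = max_retries + 1  # assumption: first try plus max_retries retries
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            result = scrape_once(url)
            return {"inputUrl": url, "status": "OK", "error": None, **result}
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
            if attempt < attempts:
                sleep(retry_delay_secs)
    return {"inputUrl": url, "status": "ERROR", "error": str(last_error)}
```

Returning an ERROR row instead of raising keeps one bad URL from aborting the rest of the batch, which matches the error row shown in the output section.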


🧩 Example Extraction Rules

{
  "extract": [
    { "name": "title", "selector": "h1", "type": "text" },
    { "name": "links", "selector": "a", "type": "attr", "attr": "href", "all": true }
  ]
}
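
To show how rules like these could map onto Selenium calls, here is a simplified sketch. It is not the actor's implementation. In particular, the heuristic for telling XPath from CSS and the "html" rule type are assumptions made for illustration. The driver only needs Selenium's find_elements(by, value), and elements need .text and get_attribute(name).

```python
# Locator strategy strings; these equal selenium's By.CSS_SELECTOR and By.XPATH.
CSS, XPATH = "css selector", "xpath"


def locator_for(selector):
    """Guess the strategy: selectors starting with /, ./ or ( are XPath.

    This heuristic is an assumption; the actor may detect XPath differently.
    """
    s = selector.strip()
    return (XPATH if s.startswith(("/", "./", "(")) else CSS), s


def read_value(element, rule):
    """Read one value from an element according to the rule's type."""
    kind = rule.get("type", "text")
    if kind == "text":
        return element.text.strip()
    if kind == "attr":
        return element.get_attribute(rule["attr"])
    if kind == "html":  # assumed type name for returning element HTML
        return element.get_attribute("outerHTML")
    raise ValueError(f"unsupported rule type: {kind!r}")


def apply_rules(driver, rules):
    """Return {rule name: value, or list of values when "all" is true}."""
    extracted = {}
    for rule in rules:
        by, value = locator_for(rule["selector"])
        elements = driver.find_elements(by, value)
        if rule.get("all", False):
            extracted[rule["name"]] = [read_value(el, rule) for el in elements]
        else:
            # First match only; None when nothing matched.
            extracted[rule["name"]] = read_value(elements[0], rule) if elements else None
    return extracted
```

With a real browser session this would be called as apply_rules(driver, run_input["extract"]) once the waitForSelector condition has been met.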


📦 Output

Each dataset item includes:

inputUrl
finalUrl
status (OK / ERROR)
pageTitle
extracted (custom fields produced by your extraction rules)
visibleText (optional)
html (optional)
screenshotKey (optional)
error (if failed)
timestamp

Output example

Example dataset item (illustrative):

{
  "inputUrl": "https://example.com/",
  "finalUrl": "https://example.com/",
  "status": "OK",
  "pageTitle": "Example Domain",
  "extracted": {
    "title": "Example Domain",
    "links": ["https://www.iana.org/domains/example"]
  },
  "visibleText": "Example Domain ...",
  "screenshotKey": null,
  "error": null,
  "timestamp": "2026-04-29T12:00:00Z"
}

Example error row:

{
  "inputUrl": "https://example.com/",
  "finalUrl": "https://example.com/",
  "status": "ERROR",
  "pageTitle": null,
  "extracted": {},
  "error": "Navigation failed (Chrome error page). chrome_error_url:chrome-error://chromewebdata/",
  "timestamp": "2026-04-29T12:00:00Z"
}
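
Because every row carries a status field, downstream code can separate successes from failures before using the data. The sketch below assumes items shaped like the two examples above; the helper names split_results and urls_to_retry are hypothetical.

```python
def split_results(items):
    """Split dataset items into (ok_items, failed_items) by their status."""
    ok, failed = [], []
    for item in items:
        (ok if item.get("status") == "OK" else failed).append(item)
    return ok, failed


def urls_to_retry(items):
    """Return the distinct inputUrl values of failed rows, in first-seen order."""
    _, failed = split_results(items)
    seen, urls = set(), []
    for item in failed:
        url = item.get("inputUrl")
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

The list from urls_to_retry can be passed back as the urls input of a follow-up run, for example with a longer waitTimeoutSecs or a different proxy group.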

🔐 Anti-Blocking & Best Practices

To improve success rate:

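Use a residential proxy (set apifyProxyGroups to ["RESIDENTIAL"] in proxyConfiguration)
Increase waitTimeoutSecs for slow-loading pages
Check that waitForSelector and your extraction selectors match the rendered page
Limit request frequency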


FAQ

What is Selenium Cloud Runner used for?

It is a dynamic website scraper that extracts data from JavaScript-heavy pages using Selenium and browser automation.

When should I use Selenium instead of simple HTTP scraping?

Use Selenium when:

Content is loaded dynamically
JavaScript rendering is required
Pages need interaction (scrolling, clicks)

Can I scrape multiple pages in one run?

Yes. You can provide multiple URLs and process them in a single run.

What is the difference between CSS and XPath selectors?

Both are used to locate elements in the DOM:

CSS is simpler and usually faster
XPath is more expressive for complex queries, such as matching on text content or selecting a parent element

Why am I getting empty results?

Possible reasons:

Incorrect selector
Page not fully loaded
Site blocking requests
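
A quick way to narrow this down is to check which fields actually came back empty in a row that otherwise succeeded. The helper below is a hypothetical sketch that assumes the item layout from the output example, with custom fields under extracted.

```python
def empty_fields(item):
    """Return the sorted names of extracted fields that are None, "" or []."""
    extracted = item.get("extracted") or {}
    return sorted(
        name for name, value in extracted.items()
        if value is None or value == "" or value == []
    )
```

If only some fields are empty, the selector for those fields is the likely culprit. If every field is empty while status is OK, the page may not have finished loading or the site may have served a block page, so inspecting visibleText or a screenshot is a reasonable next step.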

Can I extract multiple elements at once?

Yes. Use "all": true in extraction rules to return lists.

Can I take screenshots of pages?

Yes. Enable takeScreenshot to save screenshots to the key-value store.
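
Saved screenshots can then be located through the screenshotKey field of each dataset item. The sketch below builds download URLs for them. It assumes the public Apify API endpoint for key-value store records; screenshot_urls is a hypothetical helper, and the store ID comes from your own run.

```python
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"


def screenshot_urls(items, store_id):
    """Map inputUrl -> record URL for every item that has a screenshotKey."""
    urls = {}
    for item in items:
        key = item.get("screenshotKey")
        if key:
            urls[item["inputUrl"]] = (
                f"{API_BASE}/key-value-stores/{quote(store_id, safe='')}"
                f"/records/{quote(key, safe='')}"
            )
    return urls
```

Reading the key from the dataset item avoids guessing how screenshot names are numbered.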

What formats can I export data in?

JSON, CSV, and Excel. You can also fetch results programmatically through the Apify platform API.

Is this suitable for large-scale scraping?

Yes, but you should configure proxies, retries, and concurrency properly.

Can I integrate this into automation workflows?

Yes. This actor is designed for pipelines, APIs, and scheduled jobs.


📈 Why Use This Tool?

Instead of building custom Selenium scripts, you get:

Ready-to-use browser scraper
Flexible extraction system
Scalable scraping workflows
Structured data output


🚀 Get Started

Add your URLs and extraction rules to start scraping dynamic websites instantly.