A static list of URLs to check for captchas. To add new URLs on the fly, enable the Use request queue option.

For details, see Start URLs in README.

Type:array

Proxy Configuration

proxyConfiguration

Optional

Specifies the proxy servers that the checker uses to hide its origin.

For details, see Proxy configuration in README.

Type:object

Default:

{}

Cheerio

checkers.cheerio

Optional

Crawl with Cheerio

Type:boolean

Default:true

Puppeteer

checkers.puppeteer

Optional

Crawl with Puppeteer

Type:boolean

Default:true

Playwright

checkers.playwright

Optional

Crawl with Playwright

Type:boolean

Default:true

Enabled

saveSnapshot

Optional

Saves the HTML for Cheerio, and the HTML plus a screenshot for Puppeteer and Playwright.

Type:boolean

Default:true

Enqueue any URL on domain (no need for link selector or pseudo URLs)

enqueueAllOnDomain

Optional

Enqueues every URL found on the same domain.

Type:boolean

Default:true

Link Selector

linkSelector

Optional

A CSS selector that specifies which links on the page (<a> elements with an href attribute) are followed and added to the request queue. This setting only applies if Use request queue is enabled. To filter the links added to the queue, use the Pseudo-URLs setting.

If Link selector is empty, the page links are ignored.

For details, see Link selector in README.

Type:string

Pseudo-URLs

pseudoUrls

Optional

Specifies what kind of URLs found by Link selector should be added to the request queue. A pseudo-URL is a URL with regular expressions enclosed in [] brackets, e.g. http://www.example.com/[.*]. This setting only applies if the Use request queue option is enabled.

If Pseudo-URLs are omitted, the actor enqueues all links matched by the Link selector.

For details, see Pseudo-URLs in README.

Type:array

Default:

[]
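
As a rough illustration of how a pseudo-URL could be matched, the Python sketch below converts one into a regular expression and applies the rule that omitted Pseudo-URLs let every link through. The helper names `pseudo_url_to_regex` and `should_enqueue` are hypothetical, and whole-URL, case-insensitive matching is an assumption rather than the actor's confirmed behavior.

```python
import re
from typing import List


def pseudo_url_to_regex(pseudo_url: str) -> "re.Pattern[str]":
    """Turn a pseudo-URL into a compiled regex (hypothetical helper).

    Text outside [] is matched literally; text inside the outermost []
    is treated as a regular expression.
    """
    parts: List[str] = []
    literal: List[str] = []
    regex: List[str] = []
    depth = 0  # bracket nesting depth, so character classes like [a-z] survive
    for ch in pseudo_url:
        if ch == "[":
            if depth == 0:
                parts.append(re.escape("".join(literal)))
                literal = []
            else:
                regex.append(ch)
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
            if depth == 0:
                parts.append("(?:" + "".join(regex) + ")")
                regex = []
            else:
                regex.append(ch)
        elif depth > 0:
            regex.append(ch)
        else:
            literal.append(ch)
    if depth != 0:
        raise ValueError("unbalanced [ in pseudo-URL: " + pseudo_url)
    parts.append(re.escape("".join(literal)))
    # Assumption: matching is case-insensitive.
    return re.compile("".join(parts), re.IGNORECASE)


def should_enqueue(url: str, pseudo_urls: List[str]) -> bool:
    """Return True if the link would be added to the request queue."""
    if not pseudo_urls:
        # With no Pseudo-URLs, every link matched by the Link selector passes.
        return True
    return any(pseudo_url_to_regex(p).fullmatch(url) for p in pseudo_urls)
```

For example, `should_enqueue("http://www.example.com/a/b", ["http://www.example.com/[.*]"])` returns `True`, while a URL on another host does not match.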

Allow only links from the same domain

allowOnlyLinksFromSameDomain

Optional

An additional check to make sure that only links from the same domain are enqueued.

Type:boolean

Repeat checks on provided URLs

repeatChecksOnProvidedUrls

Optional

Accesses each URL multiple times. Useful for testing the same URL repeatedly or for bypassing blocking of the first page.

Type:integer

Max number of pages checked per domain

maxNumberOfPagesCheckedPerDomain

Optional

The maximum number of pages that the checker will load. The checker will stop when this limit is reached. It's always a good idea to set this limit in order to prevent excess platform usage for misconfigured scrapers. Note that the actual number of pages loaded might be slightly higher than this value.

If set to 0, there is no limit.

Type:integer

Maximum concurrent pages checked per domain

maxConcurrentPagesCheckedPerDomain

Optional

Specifies the maximum number of pages that can be processed by the checker in parallel for one domain. The checker automatically increases and decreases concurrency based on available system resources. This option enables you to set an upper limit, for example to reduce the load on a target website.

Type:integer

Minimum:1

Default:500

Maximum number of concurrent domains checked

maxConcurrentDomainsChecked

Optional

Specifies the maximum number of domains that should be checked at a time. This setting is relevant when passing in more than one URL to check.

Type:integer

Minimum:1

Maximum:10

Default:5
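
To get a feel for how the two concurrency limits interact, the sketch below computes a worst-case number of pages open at once. It assumes the limits simply multiply, which this page does not state. Since the checker scales concurrency with available resources, the real figure would normally be lower. The function name `max_parallel_pages` is hypothetical.

```python
def max_parallel_pages(pages_per_domain: int = 500, domains: int = 5) -> int:
    """Worst-case pages processed in parallel (assumes the limits multiply)."""
    if pages_per_domain < 1:
        raise ValueError("maxConcurrentPagesCheckedPerDomain must be at least 1")
    if not 1 <= domains <= 10:
        raise ValueError("maxConcurrentDomainsChecked must be between 1 and 10")
    return pages_per_domain * domains


print(max_parallel_pages())                               # 2500 with the defaults
print(max_parallel_pages(pages_per_domain=10, domains=2)) # 20
```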

Retire browser instance after request count

retireBrowserInstanceAfterRequestCount

Optional

How many requests a browser instance handles before it is retired and replaced. Pick a higher number to reduce resource consumption, or a lower number to rotate (test) more proxies.

Type:integer

Minimum:1

Default:10

Navigation timeout (seconds)

navigationTimeoutSecs

Optional

Specifies the maximum time in seconds the request will wait for the page to load. If the page is not loaded within this time, the browser will throw an error and the page will be marked as failed.

Type:integer

Minimum:1

Default:60
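
The integer options above each document a minimum, sometimes a maximum, and a default. The sketch below shows one way to fill in those defaults and check the bounds before starting a run. The table `INTEGER_OPTIONS` and the function `resolve_integer_options` are hypothetical helpers, not part of the actor. Only the keys, limits, and defaults are taken from this page.

```python
# Limits and defaults as documented above; None means no documented maximum.
INTEGER_OPTIONS = {
    "maxConcurrentPagesCheckedPerDomain": {"min": 1, "max": None, "default": 500},
    "maxConcurrentDomainsChecked": {"min": 1, "max": 10, "default": 5},
    "retireBrowserInstanceAfterRequestCount": {"min": 1, "max": None, "default": 10},
    "navigationTimeoutSecs": {"min": 1, "max": None, "default": 60},
}


def resolve_integer_options(user_input: dict) -> dict:
    """Apply documented defaults and reject out-of-range values."""
    resolved = {}
    for key, spec in INTEGER_OPTIONS.items():
        value = user_input.get(key, spec["default"])
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{key} must be an integer")
        if value < spec["min"]:
            raise ValueError(f"{key} must be at least {spec['min']}")
        if spec["max"] is not None and value > spec["max"]:
            raise ValueError(f"{key} must be at most {spec['max']}")
        resolved[key] = value
    return resolved


options = resolve_integer_options({"navigationTimeoutSecs": 30})
print(options["navigationTimeoutSecs"])        # 30
print(options["maxConcurrentDomainsChecked"])  # 5
```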

Headfull browser (XVFB)

puppeteer.headfull

Optional

Runs the browser in headful mode. Only works for the Puppeteer type.

Type:boolean

Use Chrome

puppeteer.useChrome

Optional

Only works for the Puppeteer type. Note that Chrome is not guaranteed to work with Puppeteer.

Type:boolean

Wait for

puppeteer.waitFor

Optional

Only works for the Puppeteer type. Waits on each page. You can provide a number in milliseconds or a selector.

Type:string

Default:2000
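
Because this option is a string that can hold either a number of milliseconds or a selector, a caller has to tell the two apart. The sketch below shows one plausible rule, treating an all-digit value as milliseconds and anything else as a selector. The function `parse_wait_for` is hypothetical, and the actor's own rule may differ.

```python
from typing import Union


def parse_wait_for(value: str) -> Union[int, str]:
    """Interpret a waitFor value as milliseconds or as a selector (assumed rule)."""
    text = value.strip()
    # Only plain ASCII digits count as a millisecond value.
    if text.isascii() and text.isdigit():
        return int(text)
    return text


print(parse_wait_for("2000"))      # 2000
print(parse_wait_for("#content"))  # #content
```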

Memory

puppeteer.memory

Optional

Must be a power of 2 between 128 and 32768.

Type:integer

Minimum:1024

Maximum:32768

Default:4096
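
A memory value can be checked before a run with a small helper like the one sketched below. The description allows powers of 2 from 128 upward, while the listed minimum is 1024. This sketch assumes the stricter listed minimum applies. The function name `is_valid_memory` is hypothetical.

```python
def is_valid_memory(megabytes: int) -> bool:
    """Check that a memory value is a power of 2 within the listed bounds."""
    # A positive integer is a power of 2 exactly when it has a single set bit.
    is_power_of_two = megabytes > 0 and (megabytes & (megabytes - 1)) == 0
    # Assumption: the listed Minimum (1024) and Maximum (32768) are enforced.
    return is_power_of_two and 1024 <= megabytes <= 32768


print(is_valid_memory(4096))  # True
print(is_valid_memory(3000))  # False
```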

Chrome

playwright.chrome

Optional

Use Chrome when checking

Type:boolean

Default:false

Firefox

playwright.firefox

Optional

Use Firefox when checking

Type:boolean

Default:true

Safari (Webkit)

playwright.webkit

Optional

Use Safari when checking

Type:boolean

Use Chrome instead of Chromium

playwright.useChrome

Optional

Only works for the Playwright type. Note that Chrome is not guaranteed to work with Playwright.

Type:boolean

Headfull browser (XVFB)

playwright.headfull

Optional

Whether the browser runs in headful mode.

Type:boolean

Wait for

playwright.waitFor

Optional

Only works for the Playwright type. Waits on each page. You can provide a number in milliseconds or a selector.

Type:string

Default:2000

Memory

playwright.memory

Optional

Must be a power of 2 between 128 and 32768.

Type:integer

Minimum:1024

Maximum:32768

Default:4096
