Pricing

Pay per usage

AI-Ready Website Crawler

Crawl websites and convert to clean markdown for AI/RAG, LLM fine-tuning, and document pipelines.

Rating

0.0

(0)

Developer

Fulcria Labs

Last modified

4 months ago

Start URL

startUrl

Required

The primary URL to start crawling from. The crawler will follow links within the same domain.

Type: string
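
The same-domain rule could be approximated as a hostname comparison. This is a minimal sketch, assuming "same domain" means an identical hostname; the Actor may treat subdomains differently. The helper name `same_domain` and the example URLs are hypothetical.

```python
from urllib.parse import urljoin, urlparse


def same_domain(start_url: str, candidate: str) -> bool:
    """Return True when candidate resolves to the same hostname as start_url."""
    # Resolve relative links against the start URL before comparing.
    absolute = urljoin(start_url, candidate)
    start, target = urlparse(start_url), urlparse(absolute)
    # Assumption: only http(s) links are crawlable, and subdomains count as other domains.
    return target.scheme in ("http", "https") and start.hostname == target.hostname


start = "https://docs.example.com/guide/"
links = [
    "intro.html",
    "https://docs.example.com/api",
    "https://blog.example.com/",
    "mailto:a@example.com",
]
in_scope = [link for link in links if same_domain(start, link)]
```

Under this assumption, `blog.example.com` is out of scope for a crawl that starts on `docs.example.com`.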

Additional URLs

additionalUrls

Optional

Additional URLs to crawl. Each URL is crawled independently.

Type: string[]

Default:

[]

Max Pages

maxPages

Optional

Maximum number of pages to crawl. Although 0 is described as unlimited (not recommended), the schema minimum below is 1, so 0 may be rejected by input validation.

Type: integer

Minimum: 1

Maximum: 10000

Default: 50

Max Crawl Depth

maxDepth

Optional

Maximum link depth to follow from the start URL. Depth 0 means only the start URL itself.

Type: integer

Minimum: 0

Maximum: 50

Default: 3
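
To illustrate how maxPages and maxDepth interact, here is a minimal sketch of a depth-limited crawl over an in-memory link graph. It assumes breadth-first traversal, which this page does not confirm; the function `crawl_order` and the `site` graph are hypothetical.

```python
from collections import deque


def crawl_order(links, start_url, max_pages=50, max_depth=3):
    """Return (url, depth) pairs in the order a breadth-first crawl would visit them."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if depth == max_depth:
            continue  # depth limit reached: do not follow links from this page
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return visited


# Hypothetical site: each key lists the links found on that page.
site = {"/": ["/a", "/b"], "/a": ["/a/1"], "/a/1": ["/a/1/x"], "/b": ["/"]}

only_start = crawl_order(site, "/", max_depth=0)
one_level = crawl_order(site, "/", max_depth=1)
capped = crawl_order(site, "/", max_pages=2)
```

With `max_depth=0` only the start URL is visited, matching the description above; `max_pages` stops the crawl regardless of how much depth remains.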

Requests Per Second

requestsPerSecond

Optional

Maximum number of requests per second. Lower values are more polite to target servers.

Type: number

Minimum: 0.1

Maximum: 20

Default: 2
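
A requests-per-second cap translates into a minimum gap of 1 / requestsPerSecond seconds between requests. The sketch below shows one way to enforce that gap; the `RateLimiter` class is hypothetical and is not taken from the Actor's source.

```python
import time


class RateLimiter:
    """Spaces calls at least 1 / requests_per_second seconds apart."""

    def __init__(self, requests_per_second=2.0, clock=time.monotonic, sleep=time.sleep):
        # Bounds taken from the documented minimum (0.1) and maximum (20).
        if not 0.1 <= requests_per_second <= 20:
            raise ValueError("requests_per_second must be between 0.1 and 20")
        self.min_interval = 1.0 / requests_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None

    def wait(self):
        """Block until the next request may be sent; return the delay applied."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay


limiter = RateLimiter(requests_per_second=2)
```

At the default of 2 requests per second the gap is 0.5 seconds; at the minimum of 0.1 it is 10 seconds. Calling `limiter.wait()` before each request applies the gap.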

Respect robots.txt

respectRobotsTxt

Optional

Whether to obey the target site's robots.txt rules. Keeping this enabled is strongly recommended.

Type: boolean

Default: true
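
As an illustration of what respecting robots.txt involves, the sketch below checks URLs against a sample robots.txt using Python's standard `urllib.robotparser`. The robots.txt content, the `allowed` helper, and the user-agent token are assumptions; how the Actor itself evaluates the rules is not documented here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for the target site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Assumed product token, based on the default User-Agent string on this page.
USER_AGENT = "AIReadyWebsiteCrawler"


def allowed(url, respect_robots_txt=True):
    """Return True when the URL may be fetched under the parsed rules."""
    if not respect_robots_txt:
        return True  # rules are ignored entirely when the option is disabled
    return parser.can_fetch(USER_AGENT, url)
```

Here `allowed("https://example.com/private/report")` is refused while the option is enabled and permitted once it is switched off.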

Include URL Patterns

includeUrlPatterns

Optional

Only crawl URLs matching these regex patterns. Leave empty to crawl all URLs on the same domain.

Type: string[]

Default:

[]

Exclude URL Patterns

excludeUrlPatterns

Optional

Skip URLs matching these regex patterns. Common exclusions: login pages, API endpoints, media files.

Type: string[]

Default:

[
  "\\.(pdf|zip|tar|gz|mp4|mp3|avi|mov|wmv|jpg|jpeg|png|gif|svg|ico|woff|woff2|ttf|eot)$",
  "/api/",
  "/login",
  "/logout",
  "/signin",
  "/signup",
  "/auth/"
]
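
The include and exclude lists can be thought of as a two-stage filter. The sketch below assumes unanchored, case-sensitive regex search and that exclusions take priority over inclusions; neither is confirmed by this page. `should_crawl` is a hypothetical helper, and the patterns are the documented defaults written as Python raw strings (so `\\.` in JSON becomes `\.`).

```python
import re

DEFAULT_EXCLUDES = [
    r"\.(pdf|zip|tar|gz|mp4|mp3|avi|mov|wmv|jpg|jpeg|png|gif|svg|ico|woff|woff2|ttf|eot)$",
    "/api/",
    "/login",
    "/logout",
    "/signin",
    "/signup",
    "/auth/",
]


def should_crawl(url, include_patterns=(), exclude_patterns=DEFAULT_EXCLUDES):
    """Apply exclude patterns first, then include patterns if any are given."""
    if any(re.search(pattern, url) for pattern in exclude_patterns):
        return False
    if include_patterns:
        return any(re.search(pattern, url) for pattern in include_patterns)
    return True  # an empty include list means every remaining URL is crawled
```

Two details follow from the default patterns under these assumptions: the `$` anchor means `report.pdf?download=1` is not excluded, and `/login` also matches paths such as `/login-help`.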

Remove CSS Selectors

removeSelectors

Optional

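CSS selectors for elements to remove before converting to markdown. The defaults remove navigation, headers, footers, sidebars, ads, cookie banners, popups, comments, and non-content elements such as script, style, iframe, and svg.

Type: string[]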

Default:

[
  "nav",
  "footer",
  "header",
  "aside",
  ".sidebar",
  ".nav",
  ".navigation",
  ".menu",
  ".footer",
  ".header",
  ".advertisement",
  ".ad",
  ".ads",
  ".social-share",
  ".cookie-banner",
  ".cookie-consent",
  ".popup",
  ".modal",
  ".breadcrumb",
  ".pagination",
  "#comments",
  ".comments",
  "script",
  "style",
  "noscript",
  "iframe",
  "svg"
]

Content CSS Selectors

contentSelectors

Optional

CSS selectors to target main content. If specified, only content within these selectors is extracted. Leave empty to auto-detect.

Type: string[]

Default:

[]

Request Timeout (seconds)

requestTimeoutSecs

Optional

Timeout for each HTTP request in seconds.

Type: integer

Minimum: 5

Maximum: 120

Default: 30

User Agent

userAgent

Optional

Custom User-Agent header. Leave empty to use the default.

Type: string

Default: Mozilla/5.0 (compatible; AIReadyWebsiteCrawler/1.0; +https://apify.com)
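
Putting the fields together, the sketch below builds an example input and checks it against the documented numeric bounds before submission. The `validate_input` helper and the example values are hypothetical; only the field names and bounds come from this page.

```python
# Documented (minimum, maximum) bounds for the numeric fields.
NUMERIC_BOUNDS = {
    "maxPages": (1, 10000),
    "maxDepth": (0, 50),
    "requestsPerSecond": (0.1, 20),
    "requestTimeoutSecs": (5, 120),
}


def validate_input(run_input):
    """Return a list of problems; an empty list means the input looks valid."""
    errors = []
    start_url = run_input.get("startUrl")
    if not isinstance(start_url, str) or not start_url:
        errors.append("startUrl is required and must be a non-empty string")
    for field, (low, high) in NUMERIC_BOUNDS.items():
        if field in run_input and not low <= run_input[field] <= high:
            errors.append(f"{field} must be between {low} and {high}")
    return errors


# Example input; omitted fields fall back to their documented defaults.
run_input = {
    "startUrl": "https://example.com/docs/",
    "maxPages": 100,
    "maxDepth": 2,
    "requestsPerSecond": 1,
    "respectRobotsTxt": True,
    "includeUrlPatterns": ["/docs/"],
    "contentSelectors": ["main", "article"],
}

problems = validate_input(run_input)
```

A dictionary like `run_input` could then be passed as the run input through the Apify API or a client library; the Actor's ID is not shown in this section, so that call is left out.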
