Firecrawl Website Crawler

Enhanced website crawler using Firecrawl's Crawl API for superior JS rendering, smart rate limiting, anti-bot bypass, and clean markdown extraction. Built by John Rippy (https://www.linkedin.com/in/johnrippy/).


What is Firecrawl?

Firecrawl is a web scraping API that handles JavaScript-heavy websites, anti-bot protection, and converts pages to clean markdown. It's particularly good at:

  • Rendering JavaScript (React, Vue, Angular sites)
  • Bypassing Cloudflare and other bot protection
  • Extracting clean, structured content

This actor uses Firecrawl under the hood. You provide your Firecrawl API key, and the actor orchestrates the crawl, handles rate limits, and delivers structured results to Apify.

Why use this instead of Firecrawl directly?

  • Runs on Apify's infrastructure (no server needed)
  • Integrates with Apify datasets, webhooks, and scheduling
  • Works with Zapier, Make, n8n via webhooks
  • Pay-per-result pricing on Apify

How to Get Your Firecrawl API Key

  1. Go to firecrawl.dev (or firecrawl.link/john-rippy for 10% off)
  2. Sign up for an account (free tier: 500 credits/month)
  3. Go to your dashboard → API Keys
  4. Copy your API key
  5. Paste it in the firecrawlApiKey field below

Features

  • Superior JS Rendering - Handles complex JavaScript-heavy websites
  • Anti-Bot Bypass - Built-in techniques to avoid blocking
  • Smart Rate Limiting - Automatic throttling to prevent IP bans
  • Clean Markdown Output - Get beautifully formatted content
  • Subdomain Crawling - Optionally include subdomains
  • URL Pattern Filtering - Include/exclude specific URL patterns
  • Screenshot Capture - Optional visual snapshots of pages
  • Geo-Targeting - Crawl from specific countries
  • Demo Mode - Test without an API key using sample data

Quick Start

Try it first (Free - Demo Mode)

{
  "demoMode": true
}

This returns sample crawl data so you can see the output format without incurring charges or needing an API key.

Crawl a Website

{
  "firecrawlApiKey": "fc-XXXXXXXXXXXXXXXX",
  "url": "https://example.com",
  "maxPages": 50,
  "outputFormat": "markdown",
  "demoMode": false
}

Crawl Specific Sections Only

{
  "firecrawlApiKey": "fc-XXXXXXXXXXXXXXXX",
  "url": "https://docs.example.com",
  "maxPages": 100,
  "includePatterns": ["/docs/.*", "/guides/.*"],
  "excludePatterns": ["/blog/.*", "/changelog/.*"],
  "demoMode": false
}
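
To make the filtering behavior concrete, here is a minimal Python sketch of how include and exclude patterns like the ones above could be applied to a URL. It is an illustration only: the function name should_crawl is hypothetical, and it assumes that patterns are regular expressions searched against the URL path and that exclusions take priority over inclusions. The actor's actual matching rules may differ.

```python
import re
from urllib.parse import urlparse


def should_crawl(url, include_patterns=None, exclude_patterns=None):
    """Return True if the URL passes the include/exclude filters.

    Assumptions (not confirmed by the actor's documentation):
    - patterns are regexes searched against the URL path
    - an exclude match always wins over an include match
    """
    path = urlparse(url).path or "/"
    if any(re.search(p, path) for p in (exclude_patterns or [])):
        return False
    if include_patterns:
        return any(re.search(p, path) for p in include_patterns)
    # No include patterns given: everything not excluded is allowed.
    return True


include = ["/docs/.*", "/guides/.*"]
exclude = ["/blog/.*", "/changelog/.*"]

for candidate in [
    "https://docs.example.com/docs/intro",
    "https://docs.example.com/blog/launch",
    "https://docs.example.com/pricing",
]:
    print(candidate, should_crawl(candidate, include, exclude))
```

Under these assumptions, a pattern such as "/docs/.*" matches "/docs/intro" but not a bare "/docs" path, so it can be worth testing patterns against a few known URLs before starting a large crawl.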

Deep Crawl with Screenshots

{
  "firecrawlApiKey": "fc-XXXXXXXXXXXXXXXX",
  "url": "https://example.com",
  "maxPages": 200,
  "maxDepth": 10,
  "includeSubdomains": true,
  "includeScreenshots": true,
  "demoMode": false
}

JavaScript-Heavy Site

{
  "firecrawlApiKey": "fc-XXXXXXXXXXXXXXXX",
  "url": "https://react-app.example.com",
  "maxPages": 25,
  "waitForSelector": ".main-content",
  "demoMode": false
}
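
Any of the inputs above can also be submitted programmatically. The following is a hedged Python sketch that builds, but does not send, a request to the Apify API endpoint for starting an actor run. The actor ID "USERNAME~ACTOR-NAME" and the token value are placeholders, not this actor's real identifiers, and the helper name build_run_request is hypothetical.

```python
import json
import urllib.request

APIFY_API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, apify_token, actor_input):
    """Build (but do not send) a POST request that starts an actor run."""
    url = f"{APIFY_API_BASE}/acts/{actor_id}/runs"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {apify_token}",
        },
        method="POST",
    )


request = build_run_request(
    actor_id="USERNAME~ACTOR-NAME",   # placeholder: look up the real ID in Apify Console
    apify_token="YOUR_APIFY_TOKEN",   # placeholder: your Apify API token
    actor_input={
        "firecrawlApiKey": "fc-XXXXXXXXXXXXXXXX",
        "url": "https://example.com",
        "maxPages": 50,
        "outputFormat": "markdown",
        "demoMode": False,
    },
)
print(request.get_method(), request.full_url)

# To actually start the run, pass the request to urllib.request.urlopen(request).
```

Building the request separately from sending it makes it easy to inspect the URL and body first, which helps avoid spending Firecrawl credits on a misconfigured run.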

Demo Mode

Set demoMode: true to test with sample data (no charges). When you're ready for real results, set demoMode: false or omit it.

{
  "demoMode": true,
  ...
}

Input Parameters

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| url | string | Website URL to crawl | Required |
| maxPages | number | Maximum pages to crawl | 100 |
| maxDepth | number | Maximum crawl depth | 5 |
| includeSubdomains | boolean | Include subdomains | false |
| excludePatterns | array | URL patterns to exclude (regex) | - |
| includePatterns | array | Only include matching URLs (regex) | - |
| outputFormat | string | Content format: markdown, html, text, links | markdown |
| includeScreenshots | boolean | Capture page screenshots | false |
| waitForSelector | string | CSS selector to wait for (JS-heavy sites) | - |
| country | string | Country code for geo-targeting | - |
| firecrawlApiKey | string | Your Firecrawl API key | - |
| webhookUrl | string | URL for completion notification | - |
| demoMode | boolean | Run with sample data | false |
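
As a rough illustration of how these parameters fit together, the Python sketch below merges caller-supplied values with the documented defaults and checks them before a run is started. The helper name build_input is hypothetical, and the rule that url and firecrawlApiKey are only needed when demoMode is false is an assumption inferred from the Quick Start examples, not a documented guarantee.

```python
# Defaults copied from the Input Parameters table.
DEFAULTS = {
    "maxPages": 100,
    "maxDepth": 5,
    "includeSubdomains": False,
    "outputFormat": "markdown",
    "includeScreenshots": False,
    "demoMode": False,
}
OUTPUT_FORMATS = {"markdown", "html", "text", "links"}


def build_input(**overrides):
    """Merge overrides with the documented defaults and validate the result."""
    actor_input = {**DEFAULTS, **overrides}
    if actor_input["outputFormat"] not in OUTPUT_FORMATS:
        raise ValueError(f"Unsupported outputFormat: {actor_input['outputFormat']!r}")
    # Assumption: demo mode needs neither a URL nor an API key.
    if not actor_input["demoMode"]:
        missing = [k for k in ("url", "firecrawlApiKey") if not actor_input.get(k)]
        if missing:
            raise ValueError(f"Missing required field(s): {', '.join(missing)}")
    return actor_input


demo_input = build_input(demoMode=True)
real_input = build_input(
    url="https://example.com",
    firecrawlApiKey="fc-XXXXXXXXXXXXXXXX",
    maxPages=50,
)
print(real_input["maxPages"], real_input["maxDepth"], real_input["outputFormat"])
```

Validating locally like this catches a missing key or a mistyped output format before any run is billed.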

Output Format

{
  "url": "https://example.com/page",
  "title": "Page Title",
  "description": "Meta description of the page",
  "markdown": "# Page Title\n\nFull markdown content...",
  "wordCount": 450,
  "statusCode": 200,
  "crawledAt": "2024-01-15T10:30:00Z"
}
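
Once results are in a dataset, each item can be handled as a plain record. The Python sketch below shows one possible way to load items shaped like the sample above and summarize them. The CrawledPage class and the helper names are hypothetical, and the sketch assumes every item carries the fields shown in the sample; other output formats may use different field names.

```python
from dataclasses import dataclass


@dataclass
class CrawledPage:
    url: str
    title: str
    markdown: str
    word_count: int
    status_code: int


def parse_page(item):
    """Convert one dataset item (a dict) into a CrawledPage."""
    return CrawledPage(
        url=item["url"],
        title=item.get("title", ""),
        markdown=item.get("markdown", ""),
        word_count=item.get("wordCount", 0),
        status_code=item.get("statusCode", 0),
    )


def successful_pages(pages):
    """Keep only pages that returned a 2xx status code."""
    return [p for p in pages if 200 <= p.status_code < 300]


items = [
    {
        "url": "https://example.com/page",
        "title": "Page Title",
        "description": "Meta description of the page",
        "markdown": "# Page Title\n\nFull markdown content...",
        "wordCount": 450,
        "statusCode": 200,
        "crawledAt": "2024-01-15T10:30:00Z",
    },
    {"url": "https://example.com/missing", "statusCode": 404},
]

pages = [parse_page(item) for item in items]
ok = successful_pages(pages)
print(len(ok), sum(p.word_count for p in ok))
```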

Common Problems & Solutions

"Invalid API key" error

Cause: Your Firecrawl API key is incorrect or has expired. Fix: Copy the key from your firecrawl.dev dashboard and paste it exactly as shown.

"Rate limit exceeded" error

Cause: You've hit Firecrawl's rate limits or used all your credits. Fix:

  • Check your usage at firecrawl.dev dashboard
  • Upgrade your plan for more credits
  • Reduce maxPages to crawl fewer pages

Empty or missing content

Cause: The site may require JavaScript rendering or may have bot protection. Fix:

  • Use waitForSelector to wait for content to load
  • Try a different outputFormat (html instead of markdown)
  • Some heavily protected sites may not work

Crawl taking too long

Cause: Large sites with many pages take time to crawl. Fix:

  • Reduce maxPages or maxDepth
  • Use includePatterns to only crawl specific sections
  • Use excludePatterns to skip unnecessary areas (blog, changelog, etc.)

Demo data showing instead of real results

Cause: demoMode is still set to true. Fix: Set demoMode: false and provide your firecrawlApiKey.

Firecrawl Actors Comparison

| Actor | Best For | Use When... |
| --- | --- | --- |
| Firecrawl Website Crawler (this one) | Full site crawling | You need to crawl an entire website |
| Firecrawl Scrape | Single pages | You have specific URLs to scrape |
| Firecrawl Search | Finding pages | You need to find URLs before scraping |
| Firecrawl Site Mapper | Site structure | You need a sitemap/URL list |

Pricing

This actor uses pay-per-event billing:

| Event | Description | Price |
| --- | --- | --- |
| Crawl Started | Charged when a website crawl is initiated | $0.02 |
| Pages Crawled (per 10) | Charged per 10 pages successfully crawled | $0.01 |
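
To estimate the Apify charge for a run from the two events above, a small Python sketch follows. The function name estimate_cost is hypothetical. Whether a partial block of fewer than 10 pages is billed is not stated here, so the sketch exposes that as a round_up option and assumes rounding up by default. Firecrawl credits consumed by your own API key are not included in this estimate.

```python
import math

# Prices from the pay-per-event table, expressed in cents to avoid float drift.
CRAWL_STARTED_CENTS = 2
PER_TEN_PAGES_CENTS = 1


def estimate_cost(pages_crawled, round_up=True):
    """Estimate the Apify charge in USD for a single crawl.

    Assumption: partial blocks of 10 pages are billed as a full block
    when round_up is True, and not billed at all when it is False.
    """
    if round_up:
        blocks = math.ceil(pages_crawled / 10)
    else:
        blocks = pages_crawled // 10
    return (CRAWL_STARTED_CENTS + blocks * PER_TEN_PAGES_CENTS) / 100


print(estimate_cost(50))                   # 2 + 5 * 1 = 7 cents
print(estimate_cost(25))                   # 2 + 3 * 1 = 5 cents (rounded up)
print(estimate_cost(25, round_up=False))   # 2 + 2 * 1 = 4 cents
```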

Use Cases

  • Content Migration - Extract all content for website migrations
  • SEO Audits - Crawl sites for technical SEO analysis
  • Research & Analysis - Gather content for competitive research
  • Data Extraction - Collect structured data from websites
  • Archival - Create markdown backups of website content
  • Training Data - Gather content for AI/ML training datasets

Built by John Rippy | Actor Arsenal