Firecrawl Website Crawler

Pricing

from $0.01 / 1,000 results

Enhanced Website Crawling with Superior JS Rendering

Enhanced website crawler using Firecrawl's Crawl API for superior JavaScript rendering, smart rate limiting, anti-bot bypass, and clean markdown extraction.

Rating

0.0 (0)

Developer

John Rippy

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 13 hours ago

"Enhanced Website Crawling with Superior JS Rendering" by John Rippy | johnrippy.link

🏆 2025 Zapier Automation Hero of the Year. Project Phoenix: A 95-step AI sales pipeline cutting development time by 50%.


Features

  • Superior JS Rendering - Handles complex JavaScript-heavy websites
  • Anti-Bot Bypass - Built-in techniques to avoid blocking
  • Smart Rate Limiting - Automatic throttling to prevent IP bans
  • Clean Markdown Output - Get beautifully formatted content
  • Subdomain Crawling - Optionally include subdomains
  • URL Pattern Filtering - Include/exclude specific URL patterns
  • Screenshot Capture - Optional visual snapshots of pages
  • Geo-Targeting - Crawl from specific countries
  • Demo Mode - Test without an API key using sample data

Use Cases

  • Content Migration - Extract all content for website migrations
  • SEO Audits - Crawl sites for technical SEO analysis
  • Research & Analysis - Gather content for competitive research
  • Data Extraction - Collect structured data from websites
  • Archival - Create markdown backups of website content
  • Training Data - Gather content for AI/ML training datasets

Input

| Field | Type | Description | Default |
| --- | --- | --- | --- |
| url | string | Website URL to crawl | Required |
| maxPages | number | Maximum pages to crawl | 100 |
| maxDepth | number | Maximum crawl depth | 5 |
| includeSubdomains | boolean | Include subdomains | false |
| excludePatterns | array | URL patterns to exclude (regex) | - |
| includePatterns | array | Only include matching URLs (regex) | - |
| outputFormat | string | Content format: markdown, html, text, links | markdown |
| includeScreenshots | boolean | Capture page screenshots | false |
| waitForSelector | string | CSS selector to wait for (JS-heavy sites) | - |
| country | string | Country code for geo-targeting | - |
| firecrawlApiKey | string | Your Firecrawl API key | - |
| webhookUrl | string | URL for completion notification | - |
| demoMode | boolean | Run with sample data | false |
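
The defaults in the table can be applied client-side before a run is started. The following is a minimal sketch, not part of the actor itself: `build_input` is a hypothetical helper, and the checks it performs (absolute http(s) URL, rejecting unknown fields) are assumptions about what a sensible input looks like rather than documented actor behavior.

```python
from typing import Any

# Defaults copied from the input table. Fields whose default is "-" have
# no default and are left out unless the caller supplies them.
DEFAULTS: dict[str, Any] = {
    "maxPages": 100,
    "maxDepth": 5,
    "includeSubdomains": False,
    "outputFormat": "markdown",
    "includeScreenshots": False,
    "demoMode": False,
}

# Optional fields from the table that have no default value.
OPTIONAL_FIELDS = {
    "excludePatterns",
    "includePatterns",
    "waitForSelector",
    "country",
    "firecrawlApiKey",
    "webhookUrl",
}

OUTPUT_FORMATS = {"markdown", "html", "text", "links"}


def build_input(url: str, **overrides: Any) -> dict[str, Any]:
    """Merge caller overrides onto the documented defaults (hypothetical helper)."""
    # Assumption: the actor expects an absolute http(s) URL.
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an absolute http(s) URL")
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {"url": url, **DEFAULTS, **overrides}
    if run_input["outputFormat"] not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported outputFormat: {run_input['outputFormat']}")
    return run_input


run_input = build_input("https://example.com", maxPages=50)
```

The resulting dictionary has the same shape as the JSON inputs shown under Examples, so it can be serialized with `json.dumps` and pasted into the actor's input editor.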

Output

{
  "url": "https://example.com/page",
  "title": "Page Title",
  "description": "Meta description of the page",
  "markdown": "# Page Title\n\nFull markdown content...",
  "wordCount": 450,
  "statusCode": 200,
  "crawledAt": "2024-01-15T10:30:00Z"
}
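
For downstream processing, a record of this shape can be loaded into a typed structure. This is a sketch under the assumption that every item carries exactly the fields shown above when `outputFormat` is `markdown`; `CrawledPage` and `parse_page` are illustrative names, not part of the actor.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class CrawledPage:
    url: str
    title: str
    description: str
    markdown: str
    word_count: int
    status_code: int
    crawled_at: datetime


def parse_page(item: dict[str, Any]) -> CrawledPage:
    """Convert one dataset item (already decoded from JSON) into a CrawledPage."""
    # Rewrite the trailing "Z" so datetime.fromisoformat also accepts the
    # timestamp on Python versions older than 3.11, then normalize to UTC.
    crawled_at = datetime.fromisoformat(
        item["crawledAt"].replace("Z", "+00:00")
    ).astimezone(timezone.utc)
    return CrawledPage(
        url=item["url"],
        title=item["title"],
        description=item["description"],
        markdown=item["markdown"],
        word_count=int(item["wordCount"]),
        status_code=int(item["statusCode"]),
        crawled_at=crawled_at,
    )


page = parse_page({
    "url": "https://example.com/page",
    "title": "Page Title",
    "description": "Meta description of the page",
    "markdown": "# Page Title\n\nFull markdown content...",
    "wordCount": 450,
    "statusCode": 200,
    "crawledAt": "2024-01-15T10:30:00Z",
})
```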

Output Formats

| Format | Description |
| --- | --- |
| markdown | Clean, formatted markdown with headers and links |
| html | Raw HTML content |
| text | Plain text with markdown stripped |
| links | Only extracted links from each page |

Pricing

This actor uses pay-per-event pricing:

| Event | Description | Price |
| --- | --- | --- |
| Crawl Started | Charged when a website crawl is initiated | $0.02 |
| Pages Crawled (per 10) | Charged per 10 pages successfully crawled | $0.01 |
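
The two events above make the cost of a run easy to estimate. The sketch below works in whole cents to avoid floating-point rounding. Whether a final, partially filled batch of fewer than 10 pages is charged is not stated here, so the `round_up` flag is an assumption that lets you estimate both cases; `estimate_cost_cents` is a hypothetical helper.

```python
CRAWL_START_CENTS = 2    # "Crawl Started" event: $0.02
PER_TEN_PAGES_CENTS = 1  # "Pages Crawled (per 10)" event: $0.01


def estimate_cost_cents(pages_crawled: int, round_up: bool = True) -> int:
    """Estimate the pay-per-event charge for one crawl, in cents."""
    if pages_crawled < 0:
        raise ValueError("pages_crawled must be non-negative")
    if round_up:
        # Assumption: a partial batch of pages counts as a full batch.
        batches = (pages_crawled + 9) // 10
    else:
        # Assumption: only complete batches of 10 pages are charged.
        batches = pages_crawled // 10
    return CRAWL_START_CENTS + batches * PER_TEN_PAGES_CENTS


cost = estimate_cost_cents(100)
print(f"${cost / 100:.2f}")
```

For a 100-page crawl this gives 12 cents: 2 for the crawl start plus 10 batches at 1 cent each.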

Getting Your Firecrawl API Key

  1. Visit firecrawl.dev
  2. Sign up for an account
  3. Copy your API key from the dashboard

Demo Mode

Enable Demo Mode to test without an API key. Demo mode returns realistic sample crawl data from a fictional website showing various page types and content.

Examples

Basic Crawl

{
  "url": "https://example.com",
  "maxPages": 50,
  "outputFormat": "markdown"
}

Deep Crawl with Filtering

{
  "url": "https://example.com",
  "maxPages": 500,
  "maxDepth": 10,
  "includeSubdomains": true,
  "excludePatterns": ["/admin/.*", "/login/.*"],
  "includePatterns": ["/blog/.*", "/docs/.*"]
}
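
To preview which URLs a set of patterns would keep, the filtering can be approximated locally. This is a sketch only: it assumes patterns are regular expressions (as the input table describes) searched against the URL path, that exclusions take precedence over inclusions, and that an empty include list admits everything. The actor's real matching rules may differ, and `should_crawl` is a hypothetical helper.

```python
import re
from urllib.parse import urlparse


def should_crawl(url: str, include_patterns: list[str], exclude_patterns: list[str]) -> bool:
    """Return True if the URL passes the include/exclude filters (assumed semantics)."""
    path = urlparse(url).path or "/"
    # Assumption: any matching exclude pattern rejects the URL outright.
    if any(re.search(pattern, path) for pattern in exclude_patterns):
        return False
    # Assumption: with no include patterns, every remaining URL is accepted.
    if not include_patterns:
        return True
    return any(re.search(pattern, path) for pattern in include_patterns)


include = ["/blog/.*", "/docs/.*"]
exclude = ["/admin/.*", "/login/.*"]

for candidate in (
    "https://example.com/blog/post-1",
    "https://example.com/admin/users",
    "https://example.com/pricing",
):
    print(candidate, should_crawl(candidate, include, exclude))
```

Running a handful of representative URLs through a check like this before a large crawl helps catch patterns that are too broad or too narrow.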

JS-Heavy Site with Screenshots

{
  "url": "https://spa-example.com",
  "waitForSelector": ".content-loaded",
  "includeScreenshots": true,
  "maxPages": 100
}

Best Practices

  1. Start Small - Test with a low maxPages first to verify results
  2. Use Filters - Exclude admin/login pages to focus on public content
  3. Wait for JS - Use waitForSelector for single-page applications
  4. Rate Limiting - Firecrawl throttles requests automatically, so no manual delay is needed; crawls with a lower maxPages still finish sooner

Support

For questions or issues, contact support@localhowl.com


Built by John Rippy | johnrippy.link


Keywords

firecrawl, website crawler, web scraping, javascript rendering, anti-bot bypass, markdown extraction, content migration, seo audit, site crawl, apify actor