Website Content Crawler
Pricing
from $0.01 / 1,000 results
Crawl websites for SEO audits. Extracts HTML, title, meta tags, headings, links, and text content from pages.
- Automatic sitemap detection and parsing
- Metadata extraction (title, description, OG tags)
- Heading structure (H1, H2, H3)
- Internal and external link analysis
- Image extraction with alt text
- Word count
Developer
The Howlers
Website Crawler - SEO Audit Crawler with Markdown Extraction
Fast, reliable website crawler built for SEO audits and AI/LLM content analysis. Auto-discovers sitemaps, extracts metadata, headings, and LLM-ready markdown content from every page.
BYOK (Bring Your Own Key) -- you provide your own API credentials.
Before You Start
This actor requires your own API credentials to fetch real data.
Which key you need: a Firecrawl API key for web crawling. When a key is provided, Firecrawl is used as the primary crawler (faster, and it handles JavaScript rendering). The actor falls back to Playwright if no key is provided or if Firecrawl fails.
You can test with Demo Mode first (free, no key needed) to see the output format before committing.
Quick Start
Test with Demo Mode (free, no API key needed)
{"demoMode": true, "startUrls": "https://example.com"}
Run with real data
{"demoMode": false, "startUrls": "https://example.com", "maxCrawlPages": 25, "maxCrawlDepth": 2, "crawlSitemap": true, "firecrawlApiKey": "YOUR_API_KEY_HERE", "useFirecrawl": true}
Input Parameters
| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| startUrls | array | - | No | URLs to start crawling from |
| maxCrawlPages | integer | 25 | No | Maximum number of pages to crawl |
| maxCrawlDepth | integer | 2 | No | Maximum link depth to follow |
| crawlSitemap | boolean | true | No | Try to find and use sitemap for URL discovery |
| firecrawlApiKey | string | - | Yes* | Your Firecrawl API key for web crawling. When provided, Firecrawl is used as the primary crawler (faster, handles JS rendering). Falls back to Playwright if not provided or if Firecrawl fails. |
| useFirecrawl | boolean | true | No | Use Firecrawl as the primary crawling method. Requires Firecrawl API key. Falls back to Playwright if disabled or if Firecrawl fails. |
| demoMode | boolean | false | No | Run in demo mode without real credentials. Returns mock success response for testing. |
| webhookUrl | string | - | No | URL to POST results when scraping completes (Zapier, Make, n8n, custom endpoint) |
*Required when Demo Mode is off.
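
If you build the input programmatically, a small pre-flight check can catch the most common mistake before a run starts. The sketch below is only an illustration: `build_input` and `validate_input` are hypothetical helper names, not part of the actor, and the checks cover only the rules stated in the table (documented defaults, integer types, and the key requirement when Demo Mode is off).

```python
from typing import Any

# Defaults copied from the Input Parameters table.
DEFAULTS: dict[str, Any] = {
    "maxCrawlPages": 25,
    "maxCrawlDepth": 2,
    "crawlSitemap": True,
    "useFirecrawl": True,
    "demoMode": False,
}


def build_input(**overrides: Any) -> dict[str, Any]:
    """Merge caller-supplied values over the documented defaults (hypothetical helper)."""
    return {**DEFAULTS, **overrides}


def validate_input(run_input: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems: list[str] = []

    # Rule from the table footnote: the key is required when Demo Mode is off.
    if not run_input.get("demoMode", False) and not run_input.get("firecrawlApiKey"):
        problems.append("firecrawlApiKey is required when demoMode is false")

    # The table lists these as integers. bool is rejected explicitly because
    # bool is a subclass of int in Python.
    for name in ("maxCrawlPages", "maxCrawlDepth"):
        value = run_input.get(name)
        if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
            problems.append(f"{name} must be an integer")

    return problems


if __name__ == "__main__":
    demo = build_input(demoMode=True, startUrls=["https://example.com"])
    print(validate_input(demo))
```

An empty list from `validate_input` only means the input passed these local checks; the actor itself remains the authority on what it accepts.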
Pricing
This actor uses pay-per-event billing:
| Event | Description | Price |
|---|---|---|
| Page Crawled | Per page crawled with full HTML, text, and markdown extraction | $0.05 |
| Sitemap Discovered | Per sitemap discovered and parsed for URL extraction | $0.05 |
Demo mode is free -- no charges for sample data.
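
To budget a run, you can multiply the expected event counts by the prices in the table. The sketch below assumes the two listed events are the only charges from this actor; `estimate_cost` is a hypothetical helper, and anything billed elsewhere (for example by your Firecrawl account) is out of scope.

```python
from decimal import Decimal

# Per-event prices copied from the pricing table above.
PRICE_PER_PAGE = Decimal("0.05")
PRICE_PER_SITEMAP = Decimal("0.05")


def estimate_cost(pages_crawled: int, sitemaps_discovered: int = 0,
                  demo_mode: bool = False) -> Decimal:
    """Estimate the charge in USD for one run from its event counts."""
    if demo_mode:
        # Demo mode is documented as free.
        return Decimal("0.00")
    return pages_crawled * PRICE_PER_PAGE + sitemaps_discovered * PRICE_PER_SITEMAP


if __name__ == "__main__":
    # 25 pages (the default maxCrawlPages) plus one sitemap.
    print(estimate_cost(25, 1))  # prints 1.30
```

`Decimal` is used instead of `float` so that sums of $0.05 increments stay exact.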
Troubleshooting
"API key is required"
You have Demo Mode turned off but didn't provide an API key. Either:
- Turn Demo Mode on to test with sample data
- Add your API key in the input
"API error 403" or "Unauthorized"
Your API key is invalid, expired, or doesn't have access to this specific API endpoint. Double-check your key and account permissions.
"API error 429" or "Rate limit"
Too many requests. Wait a minute and try again, or reduce the number of pages per run (for example, by lowering maxCrawlPages).
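
If you start runs from your own script, the "wait and try again" advice can be automated with a simple backoff loop. This is a minimal sketch, not the actor's behavior: `RateLimitError` is a hypothetical exception standing in for however your client reports a 429, `start_run` is whatever function you use to start a run, and the 60-second initial wait simply mirrors the "wait a minute" advice.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical exception standing in for an "API error 429" failure."""


def run_with_backoff(
    start_run: Callable[[], T],
    max_attempts: int = 4,
    initial_wait: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call start_run, waiting longer after each rate-limit failure."""
    wait = initial_wait
    for attempt in range(1, max_attempts + 1):
        try:
            return start_run()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            sleep(wait)
            wait *= 2  # double the wait before the next attempt
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the waiting can be replaced in tests; in normal use you can leave it at the default.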
No results or empty dataset
Check the run log for error messages. Common causes:
- Invalid input format (check the examples above)
- API key without proper permissions
- The target data doesn't exist or is too small to track
How do I test without an API key?
Enable Demo Mode in the input. This returns realistic sample data so you can verify the output format works for your workflow.
Built by John Rippy | Actor Arsenal
