🔥 FireScrape AI Website Content Markdown Scraper

Pricing

$30.00/month + usage

Developed by

mohamed el hadi msaid

Maintained by Community

Advanced web scraper powered by Crawlee and Puppeteer. It extracts website content, converts it to Markdown, and structures it for LLM training datasets.

5.0 (1)

1

Monthly users

17

Runs succeeded

>99%

Last modified

a month ago

Overview

FireScrape is a web scraper built with Crawlee and Puppeteer. It crawls websites, extracts page content, converts it to Markdown, and structures the result as JSON records suitable for building LLM datasets.


🎯 Features

  • Extracts visible text or full HTML content
  • Converts content to Markdown
  • Captures screenshots
  • Supports proxy configurations
  • Follows links for deep crawling

🛠️ Input Schema

{
  "title": "FireScrape Input Schema",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "startUrls": {
      "title": "Start URLs",
      "type": "array",
      "description": "List of URLs to start crawling from.",
      "editor": "requestListSources",
      "prefill": [{ "url": "https://apify.com" }]
    },
    "maxPages": {
      "title": "Maximum Pages",
      "type": "integer",
      "description": "The maximum number of pages to crawl.",
      "default": 50,
      "minimum": 1
    },
    "proxyConfig": {
      "title": "Proxy Configuration",
      "type": "object",
      "description": "Select proxy settings.",
      "editor": "proxy",
      "default": { "useApifyProxy": true }
    },
    "screenshot": {
      "title": "Take Screenshots",
      "type": "boolean",
      "description": "Enable this to capture a screenshot of each page.",
      "default": true
    },
    "enqueue": {
      "title": "Enqueue Links",
      "type": "boolean",
      "description": "Whether to follow and enqueue new links on the page.",
      "default": true
    },
    "getText": {
      "title": "Extract Text Content",
      "type": "boolean",
      "description": "Extract only the visible text content from the page.",
      "default": false
    },
    "getHtml": {
      "title": "Extract HTML Content",
      "type": "boolean",
      "description": "Extract the full HTML content of the page.",
      "default": false
    }
  },
  "required": ["startUrls"]
}
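
As a rough illustration of how the schema above maps to a concrete run input, the following Python sketch fills in the documented defaults and checks the two constraints the schema states (`startUrls` is required, `maxPages` is an integer of at least 1). The helper name `build_input` is hypothetical and not part of FireScrape, and treating an empty URL list as an error is an assumption; the Apify platform performs its own validation against the schema.

```python
import copy

# Defaults copied from the input schema above.
DEFAULTS = {
    "maxPages": 50,
    "proxyConfig": {"useApifyProxy": True},
    "screenshot": True,
    "enqueue": True,
    "getText": False,
    "getHtml": False,
}


def build_input(start_urls, **overrides):
    """Return a run-input dict (hypothetical helper, not a FireScrape API)."""
    # Assumption: an empty list is treated the same as a missing field.
    if not start_urls:
        raise ValueError("startUrls is required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")

    # Deep-copy so nested defaults such as proxyConfig are never shared.
    run_input = copy.deepcopy(DEFAULTS)
    run_input.update(overrides)

    max_pages = run_input["maxPages"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_pages, bool) or not isinstance(max_pages, int) or max_pages < 1:
        raise ValueError("maxPages must be an integer >= 1")

    # Same shape as the schema's prefill: a list of {"url": ...} objects.
    run_input["startUrls"] = [{"url": url} for url in start_urls]
    return run_input
```

For example, `build_input(["https://apify.com"], maxPages=10, getText=True)` yields an input that crawls at most 10 pages and requests text extraction, with every other field left at its schema default.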

✅ Output Format

Each successfully scraped page produces one structured JSON object:

{
  "url": "https://example.com",
  "title": "Example Page",
  "metadata": { "description": "An example page", "keywords": ["example", "page"] },
  "markdown": "# Example Page\n\nThis is an example page content...",
  "textContent": "This is an example page content...",
  "htmlContent": "<html><body><h1>Example Page</h1>...</body></html>",
  "screenshot": "data:image/png;base64,iVBORw..."
}
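
One way to consume records of this shape is sketched below in Python: it serializes the items as JSON Lines for a dataset file and decodes the `screenshot` data URI into raw image bytes. The function names and the output keys `source` and `text` are illustrative assumptions, not something FireScrape defines.

```python
import base64
import json


def to_jsonl(items):
    """Serialize scraped items as JSON Lines, one record per page."""
    lines = []
    for item in items:
        record = {
            "source": item["url"],             # hypothetical key name
            "title": item.get("title", ""),
            "text": item.get("markdown", ""),  # hypothetical key name
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


def decode_screenshot(data_uri):
    """Return raw image bytes from a base64 data URI such as `screenshot`."""
    header, _, payload = data_uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    return base64.b64decode(payload)
```

The `.get` calls use empty-string fallbacks so that a record missing `title` or `markdown` still serializes; whether FireScrape ever omits those fields is not stated here.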

🚀 How to Run

  1. Deploy the Actor on Apify.
  2. Input the desired URLs and configuration.
  3. Start the scraper and monitor progress.
  4. Download results as JSON or Markdown.
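
The steps above can also be scripted. The sketch below assumes the official `apify-client` Python package and an API token in an `APIFY_TOKEN` environment variable; the Actor ID is a placeholder to be copied from the Actor's page, and `run_firescrape` is a hypothetical helper name.

```python
import os


def run_firescrape(client, actor_id, run_input):
    """Start the Actor, wait for it to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    dataset = client.dataset(run["defaultDatasetId"])
    return list(dataset.iterate_items())


def main():
    # Requires `pip install apify-client`; imported lazily so the helper
    # above can be used and tested without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    items = run_firescrape(
        client,
        "<username>/<actor-name>",  # placeholder, not the real Actor ID
        {"startUrls": [{"url": "https://apify.com"}], "maxPages": 5},
    )
    for item in items:
        print(item["url"], len(item.get("markdown", "")))
```

Calling `main()` starts a run and blocks until it finishes, so a large `maxPages` value can take a while and incurs platform usage.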

🔧 Customization

Feel free to extend FireScrape with additional features, such as handling dynamic content, authentication, or specialized formatting.
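
As one possible example of specialized formatting, the Python sketch below post-processes the `markdown` field by splitting it into sections at `#`-style headings, which can be handy when chunking pages for an LLM dataset. This is an illustrative add-on under stated assumptions, not part of FireScrape; `split_by_heading` is a hypothetical name, and only ATX headings outside fenced code blocks are recognized.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_heading(markdown):
    """Split Markdown into (heading, body) pairs at ATX headings."""
    sections = []
    heading, body = "", []
    in_fence = False

    def flush():
        # Keep a section only if it has a heading or non-blank body text.
        if heading or any(line.strip() for line in body):
            sections.append((heading, "\n".join(body).strip()))

    for line in markdown.splitlines():
        # Track fenced code blocks so '#' comments inside them are ignored.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            heading, body = match.group(2).strip(), []
        else:
            body.append(line)
    flush()
    return sections
```

Text that appears before the first heading is returned with an empty heading string, so no content is dropped.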

Happy scraping! 🚀🔥

Pricing

Pricing model

Rental

To use this Actor, you have to pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for the Apify platform usage.

Free trial

1 day

Price

$30.00