Pricing

$30.00/month + usage

🔥 FireScrape AI Website Content Markdown Scraper

Developed by

mohamed el hadi msaid

Advanced web scraper powered by Crawlee and Puppeteer — extracts website content, converts it to Markdown, and structures it for LLM training datasets.

3.5 (3)

Runs succeeded

>99%

Last modified

13 days ago

Developer tools

Automation

Overview

FireScrape is a web scraper built with Crawlee and Puppeteer. It crawls websites, extracts each page's content, converts it to Markdown, and returns the result as structured JSON records suited to building datasets for LLMs.

🎯 Features

Extracts visible text or full HTML content
Converts content to Markdown
Captures screenshots
Supports proxy configurations
Follows links for deep crawling

🛠️ Input Schema

{
  "title": "FireScrape Input Schema",
  "type": "object",
  "schemaVersion": 1,
  "properties": {
    "startUrls": {
      "title": "Start URLs",
      "type": "array",
      "description": "List of URLs to start crawling from.",
      "editor": "requestListSources",
      "prefill": [{ "url": "https://apify.com" }]
    },
    "maxPages": {
      "title": "Maximum Pages",
      "type": "integer",
      "description": "The maximum number of pages to crawl.",
      "default": 50,
      "minimum": 1
    },
    "proxyConfig": {
      "title": "Proxy Configuration",
      "type": "object",
      "description": "Select proxy settings.",
      "editor": "proxy",
      "default": { "useApifyProxy": true }
    },
    "screenshot": {
      "title": "Take Screenshots",
      "type": "boolean",
      "description": "Enable this to capture a screenshot of each page.",
      "default": true
    },
    "enqueue": {
      "title": "Enqueue Links",
      "type": "boolean",
      "description": "Whether to follow and enqueue new links on the page.",
      "default": true
    },
    "getText": {
      "title": "Extract Text Content",
      "type": "boolean",
      "description": "Extract only the visible text content from the page.",
      "default": false
    },
    "getHtml": {
      "title": "Extract HTML Content",
      "type": "boolean",
      "description": "Extract the full HTML content of the page.",
      "default": false
    }
  },
  "required": ["startUrls"]
}
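
As a rough illustration of how the schema above translates into a run input, the following Python sketch fills in the documented defaults and checks the one numeric constraint (`maxPages` of at least 1). The helper name `build_input` is hypothetical, and rejecting an empty `startUrls` list is an assumption; the schema only marks the field as required.

```python
import copy
from typing import Any, Dict, List

# Defaults copied from the input schema above.
DEFAULTS: Dict[str, Any] = {
    "maxPages": 50,
    "proxyConfig": {"useApifyProxy": True},
    "screenshot": True,
    "enqueue": True,
    "getText": False,
    "getHtml": False,
}


def build_input(start_urls: List[str], **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: build a run input with schema defaults filled in."""
    # Assumption: an empty list is treated as an error, although the schema
    # only says that startUrls is required.
    if not start_urls:
        raise ValueError("startUrls is required and must not be empty")

    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")

    # The schema's requestListSources editor stores URLs as {"url": ...} objects.
    run_input: Dict[str, Any] = {"startUrls": [{"url": u} for u in start_urls]}
    for key, default in DEFAULTS.items():
        # deepcopy so callers never share the nested proxyConfig dict.
        run_input[key] = copy.deepcopy(overrides.get(key, default))

    max_pages = run_input["maxPages"]
    if isinstance(max_pages, bool) or not isinstance(max_pages, int) or max_pages < 1:
        raise ValueError("maxPages must be an integer >= 1")
    return run_input
```

For example, `build_input(["https://apify.com"], maxPages=5, getText=True)` yields a dict that keeps the remaining defaults, including `screenshot: true`.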

✅ Output Format

Each successfully scraped page produces one structured JSON object:

{
  "url": "https://example.com",
  "title": "Example Page",
  "metadata": { "description": "An example page", "keywords": ["example", "page"] },
  "markdown": "# Example Page\n\nThis is an example page content...",
  "textContent": "This is an example page content...",
  "htmlContent": "<html><body><h1>Example Page</h1>...</body></html>",
  "screenshot": "data:image/png;base64,iVBORw..."
}
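
The `screenshot` field in the example above is a base64 data URI. A minimal Python sketch for turning it back into raw PNG bytes might look like this; it assumes the `data:image/png;base64,` prefix shown in the sample is always used, and the function name `decode_screenshot` is hypothetical.

```python
import base64
import binascii
from typing import Optional

# Prefix taken from the sample output above; assumed to be constant.
PNG_PREFIX = "data:image/png;base64,"


def decode_screenshot(item: dict) -> Optional[bytes]:
    """Return raw PNG bytes from an item's screenshot data URI, or None if absent."""
    data_uri = item.get("screenshot")
    if not data_uri:
        # Field missing or empty (for example, screenshots were disabled).
        return None
    if not data_uri.startswith(PNG_PREFIX):
        raise ValueError("unexpected screenshot format")
    try:
        return base64.b64decode(data_uri[len(PNG_PREFIX):], validate=True)
    except binascii.Error as exc:
        raise ValueError("screenshot is not valid base64") from exc
```

The returned bytes can be written to a `.png` file opened in binary mode.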

🚀 How to Run

Deploy the actor on Apify.
Input the desired URLs and configuration.
Start the scraper and monitor progress.
Download results as JSON or Markdown.
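
The steps above can also be driven from code. The sketch below uses only the Python standard library to call Apify's public REST API (the `run-sync-get-dataset-items` endpoint), which starts a run and returns the dataset items in one request. The actor ID is a placeholder, since this page does not state it, and the function names are hypothetical. The synchronous endpoint only suits short crawls; longer runs would likely need the asynchronous run endpoints instead.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs an actor synchronously."""
    # The REST API expects "username~actor-name" rather than "username/actor-name".
    safe_id = urllib.parse.quote(actor_id.replace("/", "~"), safe="~")
    url = f"{API_BASE}/acts/{safe_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_and_fetch(actor_id: str, token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and return the scraped items as a list of dicts."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder values: replace with the real actor ID and your API token.
    items = run_and_fetch(
        "your-username/your-actor-name",
        "YOUR_APIFY_TOKEN",
        {"startUrls": [{"url": "https://apify.com"}], "maxPages": 5},
    )
    for item in items:
        print(item.get("url"), len(item.get("markdown") or ""))
```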

🔧 Customization

FireScrape can be extended with additional features, such as handling dynamic content, authentication, or specialized formatting.
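
As one possible example of specialized formatting, the sketch below post-processes scraped items into JSON Lines, a layout often used for LLM training data. It runs outside the actor on already-downloaded results. The record field names (`source`, `title`, `text`) and the helper name `to_jsonl` are illustrative assumptions, not part of FireScrape.

```python
import json
from typing import Iterable


def to_jsonl(items: Iterable[dict], min_chars: int = 1) -> str:
    """Convert scraped items into JSON Lines text, one record per page."""
    lines = []
    for item in items:
        markdown = (item.get("markdown") or "").strip()
        # Skip pages whose Markdown is missing or shorter than min_chars.
        if len(markdown) < min_chars:
            continue
        record = {
            "source": item.get("url"),    # hypothetical field name
            "title": item.get("title"),
            "text": markdown,             # hypothetical field name
        }
        # json.dumps escapes newlines, so each record stays on one line.
        lines.append(json.dumps(record))
    return "\n".join(lines) + ("\n" if lines else "")
```

Raising `min_chars` is a simple way to drop near-empty pages before they reach a dataset.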

🎁 Bonus: n8n Workflow Integration

As a free bonus for using FireScrape, you can integrate these n8n workflows with this actor:

These workflows can help automate post-scraping actions and expand your automation capabilities.

Happy scraping! 🚀🔥
