Create a plain-text dataset from documentation pages

Crawl a small public documentation section and save plain text plus page metadata for search indexing, classification, or simple knowledge-base imports.

Actor: Website Content Crawler Lite (fetch_cat/website-content-crawler-lite)

Input

🌐 Start URLs (required): https://docs.apify.com/platform/actors
📄 Maximum pages: 4
🔗 Maximum link depth: 1
Stay on the same domain: true
Include URL globs: https://docs.apify.com/**
Exclude URL globs: **/login** (+1 more)
Main content format: text
Respect robots.txt: true
Request timeout (seconds): 20
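
The input above can be read as a set of link filters. The sketch below is one way those filters might combine. It is not the Actor's actual implementation: the key names in `CRAWL_INPUT` are hypothetical, glob matching is approximated with Python's `fnmatch`, and "same domain" is simplified to an exact hostname match.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Values copied from the Input section; the key names are hypothetical.
CRAWL_INPUT = {
    "startUrls": [{"url": "https://docs.apify.com/platform/actors"}],
    "maxPages": 4,
    "maxDepth": 1,
    "sameDomain": True,
    "includeGlobs": ["https://docs.apify.com/**"],
    "excludeGlobs": ["**/login**"],  # the page elides one more exclude glob
    "contentFormat": "text",
    "respectRobotsTxt": True,
    "requestTimeoutSecs": 20,
}


def should_enqueue(url, depth, cfg=CRAWL_INPUT):
    """Return True if a discovered link passes the depth, domain, and glob filters."""
    # Links deeper than the configured maximum are never followed.
    if depth > cfg["maxDepth"]:
        return False
    # Simplification: "same domain" is treated as an exact hostname match.
    start_host = urlsplit(cfg["startUrls"][0]["url"]).hostname
    if cfg["sameDomain"] and urlsplit(url).hostname != start_host:
        return False
    # Exclude globs take priority over include globs in this sketch.
    if any(fnmatchcase(url, glob) for glob in cfg["excludeGlobs"]):
        return False
    return any(fnmatchcase(url, glob) for glob in cfg["includeGlobs"])
```

With these settings, a link such as `https://docs.apify.com/platform/storage` found on the start page would pass, while a login URL, an off-domain URL, or anything at depth 2 would be dropped. The page limit (4) would be enforced separately by the crawl loop.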

Output fields

Requested URL
Loaded URL
Title
Description
H1
Text
Markdown
HTML
Links
Status
Content type
Depth
Parent URL
Fetched at
Error
Skipped reason
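
For search indexing or knowledge-base imports, each dataset item usually needs to be reduced to plain text plus a little metadata. The following is a minimal sketch of that step. The JSON key names (`loadedUrl`, `requestedUrl`, `title`, `h1`, `text`, `depth`, `error`, `skippedReason`) are assumptions derived from the field labels above, so check them against a real dataset item before relying on them.

```python
def to_index_docs(items):
    """Keep successfully fetched pages and reduce them to text plus metadata."""
    docs = []
    for item in items:
        # Skip pages that failed or were deliberately skipped by the crawler.
        if item.get("error") or item.get("skippedReason"):
            continue
        text = (item.get("text") or "").strip()
        if not text:
            continue  # nothing to index
        docs.append({
            # Prefer the final URL after redirects, fall back to the requested one.
            "id": item.get("loadedUrl") or item.get("requestedUrl"),
            "title": item.get("title") or item.get("h1") or "",
            "text": text,
            "depth": item.get("depth"),
        })
    return docs
```

Using the loaded URL as the document ID keeps redirected pages from being indexed under two different keys.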

How it works

1. Sign up on Apify

Create an Apify account to access Website Content Crawler Lite.

2. Start the run

The Actor starts running automatically with the input shown above.

3. Receive the output

Monitor progress in real time. You are notified as soon as the dataset is complete and ready for review.

4. Integrate into your workflow

The final output is delivered in JSON, CSV, or Excel format, ready to be plugged into your workflow.
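
As an illustration of the export formats, the sketch below builds download URLs for a finished dataset using the Apify API's dataset items endpoint and its `format` query parameter. The dataset ID shown is a placeholder, and non-public datasets typically also require an API token, which this sketch leaves out.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
# Only the three formats named on this page; "xlsx" is the Excel format.
EXPORT_FORMATS = {"json", "csv", "xlsx"}


def dataset_items_url(dataset_id, fmt="json"):
    """Build the download URL for a dataset's items in the given format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


# "YOUR_DATASET_ID" is a placeholder, not a real dataset.
csv_url = dataset_items_url("YOUR_DATASET_ID", "csv")
```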

Integrate Actor directly into your workflow

Choose from 100+ integration options, or integrate via the API.

Webhook
n8n
Make
Zapier
Airbyte
Keboola
IFTTT
HubSpot
GDrive
Gmail
Apify MCP
GitHub
Slack
LangChain
LlamaIndex
Flowise
Pinecone
OpenAI
Mastra
Clay
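
For the API route, a run could be started from Python roughly as follows. This is a sketch, not official usage: it assumes the `apify-client` package, whose client object is passed in as a parameter, and the input keys the Actor expects are not shown on this page, so `run_input` must follow the Actor's real input schema.

```python
ACTOR_ID = "fetch_cat/website-content-crawler-lite"


def run_crawl(client, run_input):
    """Start the Actor, wait for it to finish, and return the dataset items."""
    # call() starts the run and blocks until it finishes.
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    # Each run stores its results in a default dataset.
    return client.dataset(run["defaultDatasetId"]).list_items().items


# Usage, assuming `pip install apify-client` and a valid API token:
#   from apify_client import ApifyClient
#   items = run_crawl(ApifyClient("<YOUR_API_TOKEN>"), run_input)
```

Passing the client in keeps the function easy to exercise with a stub before spending platform credits on a real run.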