Create a plain-text dataset from documentation pages
Created by Hanna Nosova
Crawl a small public documentation section and save plain text plus page metadata for search indexing, classification, or simple knowledge-base imports.
Website Content Crawler Lite (fetch_cat/website-content-crawler-lite)
Input
🌐 Start URLs (required)
url: https://docs.apify.com/platform/actors
📄 Maximum pages: 4
🔗 Maximum link depth: 1
Stay on the same domain: true
Include URL globs: https://docs.apify.com/**
Exclude URL globs: **/login** (+1 more)
Main content format: text
Respect robots.txt: true
Request timeout (seconds): 20
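The settings above could be written down as a single input object when the run is started from code rather than from the form. The following is a minimal sketch in Python. Every key name (`startUrls`, `maxPages`, `maxDepth`, and so on) is an assumption chosen to mirror the form labels, not a confirmed field of this Actor, so check the Actor's input schema before relying on it.

```python
import json


def build_run_input() -> dict:
    """Return the form values above as a plain dict.

    NOTE: all key names are hypothetical and only mirror the form labels;
    the Actor's real input schema may use different names.
    """
    return {
        "startUrls": [{"url": "https://docs.apify.com/platform/actors"}],
        "maxPages": 4,            # Maximum pages
        "maxDepth": 1,            # Maximum link depth
        "sameDomain": True,       # Stay on the same domain
        "includeUrlGlobs": ["https://docs.apify.com/**"],
        # The page lists one more exclude glob that is not shown,
        # so only the visible one is included here.
        "excludeUrlGlobs": ["**/login**"],
        "mainContentFormat": "text",
        "respectRobotsTxt": True,
        "requestTimeoutSecs": 20,  # Request timeout (seconds)
    }


if __name__ == "__main__":
    # Print the input as JSON, e.g. to paste into a JSON input editor.
    print(json.dumps(build_run_input(), indent=2))
```

With the `apify-client` Python package, a dict like this is typically passed as `run_input` when calling an Actor, provided the keys match the Actor's actual schema.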
Output fields
Requested URL
Loaded URL
Title
Description
H1
Text
Markdown
HTML
Links
Status
Content type
Depth
Parent URL
Fetched at
Error
Skipped reason
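For search indexing or a knowledge-base import, usually only the pages that loaded successfully and contain text are of interest. The sketch below shows one way to reduce dataset items to such records. The camelCase key names (`loadedUrl`, `requestedUrl`, `skippedReason`, and so on) are assumptions derived from the field labels above, and the function name `to_index_records` is hypothetical.

```python
from typing import Any, Dict, Iterable, List


def to_index_records(items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep pages with text and no error, reduced to text plus metadata.

    NOTE: key names are assumed from the field labels and may differ
    from the keys in the real dataset.
    """
    records = []
    for item in items:
        # Drop pages that failed or were skipped by the crawler.
        if item.get("error") or item.get("skippedReason"):
            continue
        text = (item.get("text") or "").strip()
        if not text:
            continue
        records.append({
            # Prefer the URL that was actually loaded (after redirects).
            "url": item.get("loadedUrl") or item.get("requestedUrl"),
            "title": item.get("title") or item.get("h1") or "",
            "description": item.get("description") or "",
            "depth": item.get("depth"),
            "text": text,
        })
    return records
```

Falling back from `title` to `h1` is a design choice for pages without a title tag; drop it if the index should only carry the document title.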
1. Sign up on Apify
Create your Apify account to access the Website Content Crawler Lite.
2. Start the run
The Actor starts running automatically based on the input.
3. Receive the output
Monitor the progress in real time. You will be notified as soon as your dataset is complete and ready for review.
4. Integrate into your workflow
The final output is delivered in JSON, CSV, or Excel format, ready to be plugged into your workflow.
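As one illustration of plugging the JSON output into a workflow, the sketch below converts a downloaded JSON export into JSON Lines, a format many search indexers and import tools accept. It assumes the export is a single top-level JSON array of page objects; the file paths and the function name `export_to_jsonl` are hypothetical.

```python
import json
from pathlib import Path


def export_to_jsonl(export_path: Path, jsonl_path: Path) -> int:
    """Rewrite a JSON array export as one JSON object per line.

    Assumes the export file holds a top-level JSON array.
    Returns the number of items written.
    """
    items = json.loads(export_path.read_text(encoding="utf-8"))
    with jsonl_path.open("w", encoding="utf-8") as out:
        for item in items:
            # ensure_ascii=False keeps non-ASCII page text readable.
            out.write(json.dumps(item, ensure_ascii=False) + "\n")
    return len(items)
```

A call such as `export_to_jsonl(Path("dataset.json"), Path("dataset.jsonl"))` would then produce a file that can be streamed line by line.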

