Site to llms.txt Generator

Pricing

Pay per usage

Generate a complete llms.txt file for any website in one run. Crawls up to 200 same-origin pages, extracts titles and meta descriptions, and outputs a clean, spec-compliant llms.txt that makes your site readable for AI assistants and agents.

Developer

Steffano van Hoven

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 20 hours ago

What does Site to llms.txt do?

Site to llms.txt crawls any website and generates an llms.txt file in one run. The proposed llms.txt standard gives AI assistants a structured, machine-readable overview of your site, so tools like Claude, ChatGPT, and Perplexity can answer questions about your content more accurately. Point the Actor at your docs, marketing site, or product pages, and receive a ready-to-publish llms.txt within minutes.

Why use Site to llms.txt?

  • AI discoverability: llms.txt is an emerging convention that AI tools can consult, much as search engines consult robots.txt. A well-structured file can improve how AI tools cite and represent your content.
  • Zero setup — no code, no CLI, no configuration files. Paste a URL and run.
  • Same-origin crawl — only pages on your own domain are collected, so you stay in control.
  • Runs on Apify — full API access, scheduling, webhook notifications, and run history out of the box.

How to use Site to llms.txt

  1. Open the Actor in the Apify Console and click Try for free.
  2. Enter the Website URL (e.g. https://docs.yoursite.com).
  3. Optionally set Max pages (default 30, maximum 200).
  4. Click Start and wait for the run to finish (typically under 2 minutes for 30 pages).
  5. Download your llms.txt from the Key-Value Store output tab.

Input

Field    | Type    | Required | Default          | Description
url      | string  | yes      | (none)           | Start URL of the website to crawl
maxPages | integer | no       | 30               | Maximum pages to crawl (1–200)
siteName | string  | no       | hostname         | Overrides the H1 heading in llms.txt
summary  | string  | no       | meta description | One-line summary in llms.txt

Example input:

{
  "url": "https://docs.apify.com",
  "maxPages": 10
}
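
If you build the input programmatically, it can help to check it against the table above before starting a run. The following is a minimal sketch, not part of the Actor: the helper name `build_run_input` is hypothetical, and only the field names, the default of 30, and the 1–200 range come from the Input table.

```python
import json
from urllib.parse import urlparse


def build_run_input(url, max_pages=30, site_name=None, summary=None):
    """Build an input dict using the field names from the Input table."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be an absolute http(s) URL, got {url!r}")
    if isinstance(max_pages, bool) or not isinstance(max_pages, int):
        raise ValueError("maxPages must be an integer")
    if not 1 <= max_pages <= 200:
        raise ValueError("maxPages must be between 1 and 200")

    run_input = {"url": url, "maxPages": max_pages}
    # Optional overrides are only included when set, so the documented
    # defaults (hostname, meta description) apply otherwise.
    if site_name:
        run_input["siteName"] = site_name
    if summary:
        run_input["summary"] = summary
    return run_input


if __name__ == "__main__":
    print(json.dumps(build_run_input("https://docs.apify.com", max_pages=10), indent=2))
```

The printed JSON can be pasted into the Actor's input editor or sent through the API.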

Output

The Actor produces two outputs:

Key-Value Store — llms.txt (text/plain): The generated file, ready to publish at https://yoursite.com/llms.txt.

Dataset: One row per run with url, pagesCrawled, and llmsTxt fields.
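
As an illustration of how the dataset row could be consumed, here is a small sketch that writes the `llmsTxt` field to disk. It assumes the row has already been downloaded as a Python dict; the helper name `save_llms_txt` and the sample row are hypothetical, and only the three field names come from the description above.

```python
import tempfile
from pathlib import Path


def save_llms_txt(row, out_dir):
    """Write the llmsTxt field of one dataset row to <out_dir>/llms.txt."""
    missing = [key for key in ("url", "pagesCrawled", "llmsTxt") if key not in row]
    if missing:
        raise KeyError(f"dataset row is missing fields: {missing}")
    target = Path(out_dir) / "llms.txt"
    target.write_text(row["llmsTxt"], encoding="utf-8")
    return target


if __name__ == "__main__":
    # Hypothetical row, shaped like the fields listed above.
    row = {
        "url": "https://docs.apify.com",
        "pagesCrawled": 10,
        "llmsTxt": "# docs.apify.com\n> Overview of docs.apify.com\n",
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = save_llms_txt(row, tmp)
        print(f"wrote {path.name} for {row['url']} ({row['pagesCrawled']} pages)")
```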

Example llms.txt output (first 10 lines from a real run on docs.apify.com):

# docs.apify.com
> Overview of docs.apify.com
## academy
- [Apify Academy | Academy | Apify Documentation](https://docs.apify.com/academy): Learn everything about web scraping and automation with our free courses that will turn you into an expert scraper developer.
## api
- [Apify API | Apify Documentation](https://docs.apify.com/api/v2): The Apify API (version 2) provides programmatic access to the Apify
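
To sanity-check a generated file before publishing it, you could parse it back into its parts. The sketch below is a consumer-side example only: it assumes the layout visible in the excerpt above (one `#` title, one `>` summary, `##` sections, and `- [title](url): description` entries), and `parse_llms_txt` is a hypothetical helper, not something the Actor provides.

```python
import re

# Matches "- [title](url)" with an optional ": description" tail.
LINK_RE = re.compile(r"^- \[(?P<title>.+)\]\((?P<url>\S+)\)(?:: (?P<desc>.*))?$")

SAMPLE = """# docs.apify.com
> Overview of docs.apify.com
## academy
- [Apify Academy | Academy | Apify Documentation](https://docs.apify.com/academy): Learn everything about web scraping and automation with our free courses that will turn you into an expert scraper developer.
## api
- [Apify API | Apify Documentation](https://docs.apify.com/api/v2): The Apify API (version 2) provides programmatic access to the Apify
"""


def parse_llms_txt(text):
    """Parse the title / summary / section / link layout shown in the excerpt."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("## "):
            current = line[3:]
            result["sections"][current] = []
        elif line.startswith("# "):
            result["title"] = line[2:]
        elif line.startswith("> "):
            result["summary"] = line[2:]
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current].append(match.groupdict())
    return result


if __name__ == "__main__":
    parsed = parse_llms_txt(SAMPLE)
    for section, links in parsed["sections"].items():
        print(section, [link["url"] for link in links])
```

A file with no title, or with sections that contain no links, is a hint that the crawl returned little usable content.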

You can download the dataset in various formats such as JSON, HTML, CSV, or Excel.

Pricing

This Actor uses pay-per-event pricing: one event is charged for each successfully generated llms.txt file, regardless of how many pages were crawled. You are not charged for failed runs or runs that produced no output.

Check the Apify pricing page for the current cost per event. For most sites, a single run costs less than $0.01.

Limitations

  • Same-origin only — links to external domains are not followed.
  • Maximum 200 pages — for larger sites, crawl sections separately and merge the results.
  • No JavaScript rendering — pages that require JavaScript to load their content will return empty or partial data. Use a Playwright-based Actor for JS-heavy sites.
  • Meta description as summary — if the homepage has no <meta name="description">, the summary falls back to a generic Overview of <hostname> line. Override it with the summary input field for a better result.
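
To make the same-origin limitation concrete, here is a rough sketch of how such a filter can work. It is an assumption about the general technique, not the Actor's actual code: it treats "same origin" as matching scheme, host, and port, and the names `origin` and `same_origin_links` are hypothetical.

```python
from urllib.parse import urljoin, urlsplit


def origin(url):
    """Return (scheme, host, port) with default ports filled in."""
    parts = urlsplit(url)
    default_port = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, parts.port or default_port)


def same_origin_links(base_url, hrefs):
    """Resolve hrefs against base_url and keep only same-origin links."""
    kept = []
    for href in hrefs:
        absolute = urljoin(base_url, href).split("#", 1)[0]  # drop fragments
        if origin(absolute) == origin(base_url) and absolute not in kept:
            kept.append(absolute)
    return kept


if __name__ == "__main__":
    links = ["/academy", "https://apify.com/pricing", "api/v2#top", "mailto:hi@example.com"]
    print(same_origin_links("https://docs.apify.com/", links))
```

Under this definition a subdomain such as blog.yoursite.com counts as a different origin from docs.yoursite.com, so it would need its own run.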

FAQ and support

Is this legal? The Actor only crawls pages your web server already serves publicly. It respects server-imposed limits (timeouts, connection errors). Always verify you have the right to crawl the target site.

The summary says "Overview of ..." — why? Your homepage does not have a <meta name="description"> tag, so the Actor used its generic fallback. Set the summary input field to provide a better one manually.
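
The fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the Actor's implementation: `pick_summary` and `MetaDescriptionParser` are hypothetical names, and the order of precedence (summary input, then meta description, then the generic line) is inferred from this page.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class MetaDescriptionParser(HTMLParser):
    """Collects the first non-empty <meta name="description"> content."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "description":
            content = (attr.get("content") or "").strip()
            if content:
                self.description = content


def pick_summary(homepage_html, homepage_url, override=None):
    """Prefer the override, then the meta description, then a generic line."""
    if override:
        return override
    parser = MetaDescriptionParser()
    parser.feed(homepage_html)
    return parser.description or f"Overview of {urlsplit(homepage_url).hostname}"


if __name__ == "__main__":
    html = "<html><head><title>Docs</title></head><body></body></html>"
    print(pick_summary(html, "https://docs.apify.com"))
```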

Can I automate this? Yes — use the Apify scheduler to regenerate your llms.txt weekly, or trigger it via webhook whenever your docs are published.
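
If you prefer to trigger runs from your own deploy pipeline, one option is to call the Apify API's run endpoint directly. The sketch below only builds the request and does not send it. The Actor ID `someuser~site-to-llms-txt` and the token value are placeholders (this page does not state the real ID), and `build_run_request` is a hypothetical helper.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, run_input):
    """Build (but do not send) a POST request that starts an Actor run."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder Actor ID and token; replace both with your own values.
    request = build_run_request(
        "someuser~site-to-llms-txt",
        "YOUR_APIFY_TOKEN",
        {"url": "https://docs.apify.com", "maxPages": 10},
    )
    print(request.get_method(), request.full_url)
    # To actually start the run: urllib.request.urlopen(request)
```

Keeping the token in an environment variable or secret store, rather than in source code, is the safer choice for CI use.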

For bugs or feature requests, open an issue in the Issues tab. For a custom enterprise solution, contact Apify support.