# Crawl Help Center Content for AI Search

**Use case:** Extract clean Markdown and text from help center pages so the content can be indexed for AI search, support bots, RAG, and knowledge base workflows.

## Input

```json
{
  "startUrls": [
    {
      "url": "https://help.openai.com/en/"
    }
  ],
  "maxPages": 5,
  "crawlMode": "website",
  "sitemapUrls": [
    "https://docs.apify.com/sitemap.xml"
  ],
  "maxDepth": 1,
  "sameDomainOnly": true,
  "outputFormat": "markdown",
  "saveCleanHtml": false,
  "minTextLength": 0,
  "excludeUrlGlobs": [
    "**/search/**",
    "**/login/**"
  ],
  "navigationTimeoutSecs": 25
}
```
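
The input above can also be submitted programmatically. The following is a minimal sketch, assuming the `apify-client` Python package is installed and an Apify API token is available. The helper names `build_run_input` and `run_actor` are hypothetical and chosen only for illustration; the Actor ID is taken from the Actor URL on this page, and the input fields mirror the JSON above as given.

```python
ACTOR_ID = "nezha/website-content-crawler"  # taken from the Actor URL on this page


def build_run_input(start_url, max_pages=5):
    """Return the input shown above as a Python dict (hypothetical helper)."""
    return {
        "startUrls": [{"url": start_url}],
        "maxPages": max_pages,
        "crawlMode": "website",
        "sitemapUrls": ["https://docs.apify.com/sitemap.xml"],
        "maxDepth": 1,
        "sameDomainOnly": True,
        "outputFormat": "markdown",
        "saveCleanHtml": False,
        "minTextLength": 0,
        "excludeUrlGlobs": ["**/search/**", "**/login/**"],
        "navigationTimeoutSecs": 25,
    }


def run_actor(run_input, token):
    """Start the Actor, wait for it to finish, and return its dataset items.

    Assumes `pip install apify-client`; the import is local so that
    build_run_input() stays usable without the package installed.
    """
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With a token stored in an environment variable such as `APIFY_TOKEN`, calling `run_actor(build_run_input("https://help.openai.com/en/"), os.environ["APIFY_TOKEN"])` should return the crawled pages as a list of dicts.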

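## Output

The Actor's output includes the following fields, shown here with their display labels rather than sample values: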

```json
{
  "url": {
    "label": "Url"
  },
  "title": {
    "label": "Title"
  },
  "description": {
    "label": "Description"
  },
  "contentFormat": {
    "label": "Content format"
  },
  "wordCount": {
    "label": "Word count"
  },
  "language": {
    "label": "Language"
  },
  "canonicalUrl": {
    "label": "Canonical url"
  },
  "depth": {
    "label": "Depth"
  },
  "httpStatusCode": {
    "label": "Http status code"
  },
  "crawledAt": {
    "label": "Crawled at"
  }
}
```
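
Before indexing, it often helps to drop failed or empty pages and collapse duplicates. The sketch below uses only the field names listed above (`httpStatusCode`, `wordCount`, `canonicalUrl`, `url`). It assumes that `httpStatusCode` and `wordCount` are integers, which this page does not confirm, and the function name `select_indexable` is hypothetical.

```python
def select_indexable(items, min_words=1):
    """Keep successfully fetched, non-empty pages, one per canonical URL.

    Assumption: httpStatusCode and wordCount are integers in each item.
    """
    seen = set()
    kept = []
    for item in items:
        # Skip pages that did not load successfully.
        if item.get("httpStatusCode") != 200:
            continue
        # Skip pages with too little text to be worth indexing.
        if (item.get("wordCount") or 0) < min_words:
            continue
        # Deduplicate on the canonical URL, falling back to the crawled URL.
        key = item.get("canonicalUrl") or item.get("url")
        if not key or key in seen:
            continue
        seen.add(key)
        kept.append(item)
    return kept
```

Deduplicating on `canonicalUrl` keeps the first occurrence of each page, so URL variants that point to the same canonical page are indexed only once.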

## About this Actor

This example shows how to run [Website Content Extractor for RAG: Markdown, HTML, Text](https://apify.com/nezha/website-content-crawler) with the input configuration shown above. Visit the [Actor detail page](https://apify.com/nezha/website-content-crawler) to learn more, explore other use cases, and run it yourself.