XML Sitemap Checker

Pricing

$3.99/month + usage

Verify if your website has a properly configured XML sitemap. Checks robots.txt and common paths, validates accessibility, XML structure, content type, and URL count — ensuring search engines can easily crawl and index your site.

Rating

0.0 (0)

Developer

Luffy (Maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 2
  • Monthly active users: 1
  • Last modified: 7 days ago

Verify if your website has a properly configured XML sitemap, ensuring search engines can easily crawl and index your site's pages. This actor checks both robots.txt and common sitemap paths, then validates each discovered sitemap for accessibility, XML validity, correct content type, and URL count.


Features

  • Discovers sitemaps from robots.txt and common fallback paths (/sitemap.xml, /sitemap_index.xml, etc.)
  • Validates each sitemap: accessibility, valid XML structure, correct content type header
  • Counts the number of URLs/entries in each sitemap
  • Reports whether each sitemap is listed in robots.txt
  • Outputs a flat list — one row per sitemap for easy filtering and export
  • Supports single or bulk URL checking
  • Fast and lightweight (async with aiohttp)
  • Built-in proxy support
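
The discovery step described above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: the function names, the exact fallback list (the source names only the first two paths), and the choice to skip fallback paths already listed in robots.txt are all assumptions.

```python
from urllib.parse import urljoin

# Assumption: only these two fallback paths are named in the description;
# the real actor may probe more.
FALLBACK_PATHS = ["/sitemap.xml", "/sitemap_index.xml"]


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return URLs declared on 'Sitemap:' lines, in order, without duplicates."""
    found = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            url = value.strip()
            if url and url not in found:
                found.append(url)
    return found


def candidate_sitemaps(base_url: str, robots_txt: str) -> list[tuple[str, bool]]:
    """Pair each candidate sitemap URL with a found_in_robots_txt flag."""
    listed = sitemaps_from_robots(robots_txt)
    candidates = [(url, True) for url in listed]
    for path in FALLBACK_PATHS:
        url = urljoin(base_url, path)
        if url not in listed:
            candidates.append((url, False))
    return candidates
```

Each pair maps naturally onto the sitemap_url and found_in_robots_txt output fields; every candidate would then be fetched and validated separately.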

Input

The actor accepts the following JSON input:

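{
  "url": "example.com",
  "timeout": 30
}

| Parameter | Type    | Required | Description                              |
|-----------|---------|----------|------------------------------------------|
| url       | string  | No*      | A single website URL to check            |
| urls      | array   | No*      | Multiple website URLs (one per line)     |
| timeout   | integer | No       | Request timeout in seconds (default: 30) |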

*At least one of url or urls must be provided.

Input URLs are normalized automatically, so bare domains such as example.com or www.example.com are accepted.
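
As a rough illustration of the input rules above, the sketch below merges url and urls, rejects empty input, and normalizes bare domains. The helper names are hypothetical, and defaulting to https:// and reducing each URL to its scheme and host are assumptions; the actor's real normalization may differ.

```python
from urllib.parse import urlsplit


def normalize_website(raw: str) -> str:
    """Turn 'example.com' or 'www.example.com/page' into a site root URL."""
    raw = raw.strip()
    if "://" not in raw:
        raw = "https://" + raw  # assumption: default to HTTPS
    parts = urlsplit(raw)
    return f"{parts.scheme}://{parts.netloc}".lower()


def collect_targets(actor_input: dict) -> list[str]:
    """Merge 'url' and 'urls', normalize them, and drop duplicates."""
    raw_urls = []
    if actor_input.get("url"):
        raw_urls.append(actor_input["url"])
    raw_urls.extend(actor_input.get("urls") or [])
    raw_urls = [raw for raw in raw_urls if raw.strip()]  # ignore blank entries
    if not raw_urls:
        raise ValueError("Provide at least one of 'url' or 'urls'.")
    targets = []
    for raw in raw_urls:
        site = normalize_website(raw)
        if site not in targets:
            targets.append(site)
    return targets
```

For example, collect_targets({"url": "example.com", "timeout": 30}) yields a single normalized target, while an input containing only timeout raises ValueError.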


Output

Each discovered sitemap produces one row in the output dataset:

| Field               | Type    | Description                                     |
|---------------------|---------|-------------------------------------------------|
| source_website      | string  | The normalized website URL that was checked     |
| sitemap_url         | string  | The discovered sitemap URL                      |
| is_accessible       | boolean | Whether the sitemap returned HTTP 200           |
| http_status         | number  | The HTTP status code returned                   |
| is_valid_xml        | boolean | Whether the content is valid XML sitemap format |
| content_type        | string  | The Content-Type header returned by the server  |
| is_xml_content_type | boolean | Whether the Content-Type contains "xml"         |
| url_count           | number  | Number of <url> or <sitemap> entries found      |
| found_in_robots_txt | boolean | Whether this sitemap was listed in robots.txt   |
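
One way the per-sitemap check fields could be derived from an HTTP response is sketched below. It assumes the status code, Content-Type header, and body have already been fetched; inspect_sitemap is a hypothetical name, and treating "valid XML sitemap format" as "parses as XML with a urlset or sitemapindex root" is an assumption about the actor's rule.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}urlset'."""
    return tag.rsplit("}", 1)[-1]


def inspect_sitemap(status: int, content_type: str, body: bytes) -> dict:
    """Compute the check fields for one fetched sitemap response."""
    is_valid_xml = False
    url_count = 0
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        root = None  # not well-formed XML
    if root is not None:
        root_name = local_name(root.tag)
        if root_name in ("urlset", "sitemapindex"):
            is_valid_xml = True
            # A urlset holds <url> entries; a sitemapindex holds <sitemap> entries.
            entry = "url" if root_name == "urlset" else "sitemap"
            url_count = sum(1 for child in root if local_name(child.tag) == entry)
    return {
        "is_accessible": status == 200,
        "http_status": status,
        "is_valid_xml": is_valid_xml,
        "content_type": content_type,
        "is_xml_content_type": "xml" in content_type.lower(),
        "url_count": url_count,
    }
```

Keeping this step free of network calls makes it easy to test against saved responses; the fetch itself (for example with aiohttp, which the feature list mentions) can live in a separate function.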

Example Output

{
  "source_website": "https://example.com",
  "sitemap_url": "https://example.com/sitemap.xml",
  "is_accessible": true,
  "http_status": 200,
  "is_valid_xml": true,
  "content_type": "application/xml",
  "is_xml_content_type": true,
  "url_count": 142,
  "found_in_robots_txt": true
}

If no sitemaps are found for a website, a single row is returned for it with sitemap_url set to "None found", every boolean check field set to false, and url_count set to 0.
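
Because the output is flat, the placeholder row makes it straightforward to separate sites with no sitemap from sites with a healthy one after export. The sketch below works on rows already loaded as dictionaries; the function names are hypothetical, and the definition of "healthy" (accessible, valid XML, at least one entry) is only one reasonable choice.

```python
def sites_without_sitemap(rows: list[dict]) -> list[str]:
    """Websites whose only row is the 'None found' placeholder."""
    return [row["source_website"] for row in rows if row["sitemap_url"] == "None found"]


def healthy_sitemaps(rows: list[dict]) -> list[str]:
    """Sitemap URLs that are accessible, valid XML, and non-empty."""
    return [
        row["sitemap_url"]
        for row in rows
        if row["is_accessible"] and row["is_valid_xml"] and row["url_count"] > 0
    ]
```

The placeholder row fails the healthy_sitemaps filter on its own, since its check fields are false and its url_count is 0.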