Sitemap Generator - Crawl Website & Create XML Sitemap
Pricing
Pay per usage
Generate an XML sitemap for any website. Crawls internal pages from start URLs (with depth + page limits), deduplicates URLs, and stores a ready-to-submit sitemap.xml plus a structured dataset and summary for SEO audits.
Developer

Bikram Adhikari
Last modified
6 days ago
Generate an XML sitemap (sitemap.xml) for any website by crawling internal pages from one or more start URLs.
This Actor is designed for:
- SEO audits (discover missing pages)
- Creating/refreshing sitemaps for search engines
- QA / monitoring of site URL coverage
What it does
- Crawls internal links (same hostname as the provided start URLs)
- Deduplicates URLs
- Stores sitemap.xml in the default key-value store
- If the site has more than 50,000 discovered URLs, it creates multiple sitemap-*.xml parts plus a sitemap-index.xml (and sitemap.xml will contain the index for compatibility)
- Writes a dataset item for each crawled page (included/excluded + reason)
- Writes a SUMMARY JSON report (counts, settings, sitemap URL count)
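The crawl behavior listed above (same-hostname links, deduplication, depth and page limits) can be sketched roughly as a breadth-first traversal. This is only an illustration, not the Actor's actual code: crawl_internal, get_links, and the toy LINKS graph are hypothetical names, and a real crawler would fetch and parse HTML instead of reading a dictionary.

```python
from collections import deque
from urllib.parse import urlsplit


def crawl_internal(start_urls, get_links, max_pages=100, max_depth=3):
    """Breadth-first crawl restricted to the hostnames of the start URLs.

    get_links is a hypothetical callable (url -> iterable of absolute URLs).
    Returns a list of (url, depth) pairs in crawl order.
    """
    allowed_hosts = {urlsplit(u).hostname for u in start_urls}
    seen = set()      # every URL ever queued, used for deduplication
    queue = deque()
    for url in start_urls:
        if url not in seen:
            seen.add(url)
            queue.append((url, 0))

    crawled = []
    while queue and len(crawled) < max_pages:
        url, depth = queue.popleft()
        crawled.append((url, depth))
        if depth >= max_depth:
            continue  # depth limit reached: record the page, do not follow links
        for link in get_links(url):
            if urlsplit(link).hostname not in allowed_hosts:
                continue  # external link
            if link in seen:
                continue  # duplicate
            seen.add(link)
            queue.append((link, depth + 1))
    return crawled


# Toy link graph standing in for real pages (hypothetical data).
LINKS = {
    "https://example.com/": [
        "https://example.com/a",
        "https://example.com/b",
        "https://other.example.org/x",
    ],
    "https://example.com/a": ["https://example.com/", "https://example.com/c"],
    "https://example.com/b": ["https://example.com/c"],
    "https://example.com/c": ["https://example.com/d"],
}

pages = crawl_internal(
    ["https://example.com/"],
    lambda url: LINKS.get(url, []),
    max_pages=10,
    max_depth=2,
)
for url, depth in pages:
    print(depth, url)
```

With max_depth=2 the page /c is crawled but its link to /d is not followed, and the external other.example.org link is skipped entirely.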
Input
- startUrls (required): Start URLs (request list)
- maxPages: Max pages to crawl (limits total requests)
- maxDepth: Max link depth from the start URLs
- ignoreUrlPatterns: Array of regex strings to exclude URLs
- includeQueryParams: Include ?query=params in sitemap URLs
- includeFragments: Include #fragments in sitemap URLs (usually disabled)
- includeLastModified: If enabled, uses the HTTP Last-Modified header for <lastmod> when available
- respectRobotsTxt: If enabled, skips URLs disallowed by robots.txt for User-agent: * (best-effort)
- robotsTxtTimeoutSecs: Timeout for downloading robots.txt
- changefreq, priority: Optional sitemap hints applied to all URLs
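To make the effect of includeQueryParams, includeFragments, and ignoreUrlPatterns more concrete, here is a minimal sketch of how such options could be applied to discovered URLs. The function names normalize_url and is_ignored are hypothetical, and the exact normalization rules the Actor uses are not documented here, so treat this as an assumption about the general idea only.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url, include_query_params=False, include_fragments=False):
    """Drop the query string and/or fragment unless the matching flag is set."""
    parts = urlsplit(url)
    query = parts.query if include_query_params else ""
    fragment = parts.fragment if include_fragments else ""
    path = parts.path or "/"  # assumption: treat an empty path as the root
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


def is_ignored(url, ignore_url_patterns):
    """True if any regex in ignore_url_patterns matches anywhere in the URL."""
    return any(re.search(pattern, url) for pattern in ignore_url_patterns)


# Hypothetical discovered URLs and ignore patterns.
discovered = [
    "https://example.com/page?utm=1#top",
    "https://example.com/page",
    "https://example.com/files/report.pdf",
    "https://example.com/wp-admin/login",
]
patterns = [r"/wp-admin/", r"\.pdf$"]

# Filter first, then normalize; the set removes duplicates after normalization.
kept = sorted({normalize_url(u) for u in discovered if not is_ignored(u, patterns)})
print(kept)
```

In this sketch the two /page variants collapse into one entry because both the query string and the fragment are dropped by default.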
Output
Key-value store
- sitemap.xml (XML)
- sitemap-index.xml (XML, only for large sites)
- sitemap-1.xml, sitemap-2.xml, ... (XML parts, only for large sites)
- SUMMARY (JSON)
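The relationship between these records can be illustrated with a short sketch that splits a URL list into parts and builds an index once the 50,000-URL limit is exceeded. This is an assumption-based example rather than the Actor's implementation: build_urlset, build_sitemaps, and the base_url parameter (where the parts would be publicly hosted) are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit from the sitemap protocol


def build_urlset(urls):
    """Serialize a list of URLs as a <urlset> document (UTF-8 bytes)."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_sitemaps(urls, base_url, max_per_file=MAX_URLS_PER_SITEMAP):
    """Return a dict mapping record key to XML bytes.

    base_url is a hypothetical public location for the sitemap parts.
    """
    if len(urls) <= max_per_file:
        return {"sitemap.xml": build_urlset(urls)}

    records = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n, start in enumerate(range(0, len(urls), max_per_file), start=1):
        key = f"sitemap-{n}.xml"
        records[key] = build_urlset(urls[start:start + max_per_file])
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = f"{base_url}/{key}"

    index_xml = ET.tostring(index, encoding="utf-8", xml_declaration=True)
    records["sitemap-index.xml"] = index_xml
    records["sitemap.xml"] = index_xml  # sitemap.xml mirrors the index
    return records


# Small demo with an artificially low limit so the split is visible.
urls = [f"https://example.com/p{i}" for i in range(5)]
records = build_sitemaps(urls, "https://example.com", max_per_file=2)
print(sorted(records))
```

With five URLs and a limit of two per file, the demo produces three parts plus the index, and sitemap.xml holds the same bytes as sitemap-index.xml.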
Dataset
Each item contains:
- url, normalizedUrl, statusCode, contentType
- depth, discoveredFrom
- includedInSitemap, exclusionReason
- lastModified, crawledAt
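For an SEO audit, the dataset fields above can be used to see why pages were left out of the sitemap. The sketch below counts excluded items by exclusionReason. The sample items and the reason strings (robots_disallowed, non_200_status) are invented for illustration; the actual values the Actor writes may differ.

```python
from collections import Counter


def summarize_exclusions(items):
    """Count dataset items excluded from the sitemap, grouped by reason."""
    return Counter(
        item.get("exclusionReason") or "unknown"
        for item in items
        if not item.get("includedInSitemap")
    )


# Hypothetical dataset items using the field names listed above.
items = [
    {"url": "https://example.com/", "statusCode": 200,
     "includedInSitemap": True, "exclusionReason": None},
    {"url": "https://example.com/admin", "statusCode": 200,
     "includedInSitemap": False, "exclusionReason": "robots_disallowed"},
    {"url": "https://example.com/private", "statusCode": 200,
     "includedInSitemap": False, "exclusionReason": "robots_disallowed"},
    {"url": "https://example.com/old", "statusCode": 404,
     "includedInSitemap": False, "exclusionReason": "non_200_status"},
]

summary = summarize_exclusions(items)
for reason, count in summary.most_common():
    print(f"{reason}: {count}")
```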
SEO keywords
sitemap generator, xml sitemap generator, website sitemap crawler, generate sitemap.xml, seo sitemap tool, internal link crawler
Quick start
Store page: https://apify.com/scrappy_garden/sitemap-generator
Paste this into Input and click Run:
{"startUrls": [{"url": "https://example.com/"}],"proxyConfiguration": {"useApifyProxy": false}}
Outputs (what you get)
- Dataset: Dataset items typically include fields like: url, statusCode, includedInSitemap, exclusionReason, depth, lastModified, crawledAt.
- Key-value store: SUMMARY, sitemap.xml
Tips (trust + predictable results)
- Start with 1–3 URLs to validate behavior, then scale up.
- If a target blocks requests, enable Proxy and/or slow down concurrency in Input.
- Use the SUMMARY/REPORT keys (when present) for automation pipelines and monitoring.
Related actors
- robots-txt-validator (https://apify.com/scrappy_garden/robots-txt-validator)
- broken-link-checker (https://apify.com/scrappy_garden/broken-link-checker)
- canonical-url-checker (https://apify.com/scrappy_garden/canonical-url-checker)
- web-page-change-monitor (https://apify.com/scrappy_garden/web-page-change-monitor)