Sitemap Generator - Crawl Website & Create XML Sitemap
Pricing
$4.99/month + usage
Generate an XML sitemap for any website. Crawls internal pages from start URLs (with depth + page limits), deduplicates URLs, and stores a ready-to-submit sitemap.xml plus a structured dataset and summary for SEO audits.
Rating
0.0 (0)
Developer
Bikram Adhikari
Actor stats
- 0 bookmarked
- 5 total users
- 2 monthly active users
- Last modified 16 days ago
Generate an XML sitemap (sitemap.xml) for any website by crawling internal pages from one or more start URLs.
This Actor is designed for:
- SEO audits (discover missing pages)
- Creating/refreshing sitemaps for search engines
- QA / monitoring of site URL coverage
What it does
- Crawls internal links (same hostname as the provided start URLs)
- Deduplicates URLs
- Stores sitemap.xml in the default key-value store
- If the site has more than 50,000 discovered URLs, it creates multiple sitemap-*.xml parts plus a sitemap-index.xml (and sitemap.xml will contain the index for compatibility)
- Writes a dataset item for each crawled page (included/excluded + reason)
- Writes a SUMMARY JSON report (counts, settings, sitemap URL count)
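The 50,000-URL split described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the Actor's actual code. The function names, the base_url parameter, and the way part URLs are built are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # limit named in the text above


def build_urlset(urls):
    """Return one <urlset> document (as a string) for a list of URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_sitemaps(urls, base_url, limit=MAX_URLS_PER_FILE):
    """Map file names to XML strings, splitting into parts when needed.

    base_url is a hypothetical prefix under which the parts are served.
    """
    unique = list(dict.fromkeys(urls))  # deduplicate, keep first-seen order
    if len(unique) <= limit:
        return {"sitemap.xml": build_urlset(unique)}

    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for part, start in enumerate(range(0, len(unique), limit), start=1):
        name = f"sitemap-{part}.xml"
        files[name] = build_urlset(unique[start:start + limit])
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = base_url + name
    index_xml = ET.tostring(index, encoding="unicode")
    files["sitemap-index.xml"] = index_xml
    files["sitemap.xml"] = index_xml  # same index, kept for compatibility
    return files
```

Passing a small limit (for example limit=2) makes the split easy to try on a handful of URLs.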
Input
- startUrls (required): Start URLs (request list)
- maxPages: Max pages to crawl (limits total requests)
- maxDepth: Max link depth from the start URLs
- ignoreUrlPatterns: Array of regex strings to exclude URLs
- includeQueryParams: Include ?query=params in sitemap URLs
- includeFragments: Include #fragments in sitemap URLs (usually disabled)
- includeLastModified: If enabled, uses the HTTP Last-Modified header for <lastmod> when available
- respectRobotsTxt: If enabled, skips URLs disallowed by robots.txt for User-agent: * (best-effort)
- robotsTxtTimeoutSecs: Timeout for downloading robots.txt
- changefreq, priority: Optional sitemap hints applied to all URLs
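To make the URL-related options more concrete, here is a rough sketch of how includeQueryParams, includeFragments, ignoreUrlPatterns, and the same-hostname rule could behave. The helper names are hypothetical, and the exact normalization and regex matching the Actor performs may differ (this sketch assumes patterns match anywhere in the URL).

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url, *, include_query_params, include_fragments):
    """Drop the query string and/or fragment unless the option keeps them."""
    parts = urlsplit(url)
    query = parts.query if include_query_params else ""
    fragment = parts.fragment if include_fragments else ""
    # Lowercase scheme and host; treat an empty path as "/".
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", query, fragment)
    )


def is_ignored(url, ignore_url_patterns):
    """True if any regex matches somewhere in the URL (assumed semantics)."""
    return any(re.search(pattern, url) for pattern in ignore_url_patterns)


def is_internal(url, start_urls):
    """True if the URL's hostname equals the hostname of a start URL."""
    hosts = {urlsplit(start).hostname for start in start_urls}
    return urlsplit(url).hostname in hosts
```

With both options disabled, URLs that differ only in query string or fragment collapse to one entry, which is what lets deduplication shrink the sitemap.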
Output
Key-value store
- sitemap.xml (XML)
- sitemap-index.xml (XML, only for large sites)
- sitemap-1.xml, sitemap-2.xml, ... (XML parts, only for large sites)
- SUMMARY (JSON)
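Because sitemap.xml holds a sitemap index on large sites, a consumer may want to check the root element before reading it. The sketch below shows one way to do that with the standard library. The function name read_sitemap is made up for the example, and it assumes the files use the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_sitemap(xml_text):
    """Return (kind, locations) where kind is 'urlset' or 'sitemapindex'."""
    root = ET.fromstring(xml_text)
    kind = root.tag.rsplit("}", 1)[-1]  # strip the "{namespace}" prefix
    if kind == "sitemapindex":
        locs = root.findall("sm:sitemap/sm:loc", NS)  # URLs of the part files
    elif kind == "urlset":
        locs = root.findall("sm:url/sm:loc", NS)  # page URLs
    else:
        raise ValueError(f"unexpected root element: {kind}")
    return kind, [loc.text.strip() for loc in locs if loc.text]
```

When kind is 'sitemapindex', the returned locations point at the sitemap-N.xml parts, each of which can be passed through the same function.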
Dataset
Each item contains:
- url, normalizedUrl, statusCode, contentType
- depth, discoveredFrom
- includedInSitemap, exclusionReason
- lastModified, crawledAt
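For an SEO audit, the dataset items can be grouped by why pages were left out. The sketch below uses the field names listed above. The function name is hypothetical, and the "unknown" fallback for items without a reason is an assumption of the example.

```python
from collections import Counter


def summarize_items(items):
    """Return (included URLs, Counter of exclusion reasons) for dataset items."""
    included = [item["url"] for item in items if item.get("includedInSitemap")]
    reasons = Counter(
        item.get("exclusionReason") or "unknown"  # fallback label is an assumption
        for item in items
        if not item.get("includedInSitemap")
    )
    return included, reasons
```

Calling reasons.most_common() on the result lists the most frequent exclusion reasons first, which is usually where an audit starts.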
SEO keywords
sitemap generator, xml sitemap generator, website sitemap crawler, generate sitemap.xml, seo sitemap tool, internal link crawler
Quick start
Store page: https://apify.com/scrappy_garden/sitemap-generator
Paste this into Input and click Run:
{"startUrls": [{"url": "https://example.com/"}],"proxyConfiguration": {"useApifyProxy": false}}
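The same input can also be sent from a script. The sketch below assumes the third-party apify-client package for Python and assumes the Actor ID scrappy_garden/sitemap-generator, which is inferred from the Store URL above rather than confirmed. The helper names are made up for the example.

```python
ACTOR_ID = "scrappy_garden/sitemap-generator"  # assumed from the Store URL above


def build_run_input(start_urls, use_apify_proxy=False):
    """Build the same input object as the quick-start JSON."""
    return {
        "startUrls": [{"url": url} for url in start_urls],
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


def run_and_fetch_sitemap(token, run_input):
    """Run the Actor and return the stored sitemap.xml, or None if missing."""
    from apify_client import ApifyClient  # third-party: pip install apify-client

    client = ApifyClient(token)
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        return None
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("sitemap.xml")
    return record["value"] if record else None
```

A call such as run_and_fetch_sitemap(token, build_run_input(["https://example.com/"])) needs a valid Apify API token and starts a billable run.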
Outputs (what you get)
- Dataset: Items typically include fields like url, statusCode, includedInSitemap, exclusionReason, depth, lastModified, crawledAt.
- Key-value store: SUMMARY, sitemap.xml
Tips (trust + predictable results)
- Start with 1–3 URLs to validate behavior, then scale up.
- If a target blocks requests, enable Proxy and/or slow down concurrency in Input.
- Use the SUMMARY / REPORT keys (when present) for automation pipelines and monitoring.
Related actors
- robots-txt-validator (https://apify.com/scrappy_garden/robots-txt-validator)
- broken-link-checker (https://apify.com/scrappy_garden/broken-link-checker)
- canonical-url-checker (https://apify.com/scrappy_garden/canonical-url-checker)
- web-page-change-monitor (https://apify.com/scrappy_garden/web-page-change-monitor)