Sitemap Generator
Pricing: from $8.00 / 1,000 results
Developer: Sachin Kumar Yadav
Last modified: 10 days ago
XML Sitemap Generator - Auto Generate SEO Sitemaps for Google Search Console
Automatically crawl any website and generate XML, HTML, and text sitemaps for SEO optimization. Perfect for submitting to Google Search Console, Bing Webmaster Tools, and improving search engine indexing. This automated sitemap generator discovers all pages on your website and creates compliant sitemaps in minutes.
Features
- Automatic Page Discovery: Intelligently crawls websites following internal links and navigation patterns
- Customizable Crawling: Set crawling depth and apply filters to include/exclude specific pages
- Multiple Sitemap Formats:
- XML (Standard sitemap format for search engines)
- HTML (Human-readable sitemap for visitors)
- Text (Simple list of URLs)
- Built-in Validation: Ensures sitemaps comply with Google and Bing specifications
- Image Sitemap Support: Optional inclusion of images following Google's image sitemap extension
- Priority & Change Frequency: Automatic priority calculation based on URL depth
- Same-Domain Filtering: Only includes pages from the target domain
- Pattern-Based Filtering: Include/exclude URLs using regular expressions
- Proxy Support: Built-in support for Apify Proxy for reliable crawling
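The automatic page discovery described above is essentially a breadth-first crawl bounded by a depth limit and a page limit. A minimal sketch in Python (the `discover_pages` name and the `get_links` callback are illustrative, not the Actor's actual implementation):

```python
from collections import deque

def discover_pages(start_url, get_links, max_depth=3, max_pages=1000):
    """Breadth-first page discovery. get_links(url) stands in for
    fetching a page and extracting its same-domain links."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    discovered = []
    while queue and len(discovered) < max_pages:
        url, depth = queue.popleft()
        discovered.append(url)
        if depth >= max_depth:
            continue  # don't follow links past the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return discovered
```

With `max_depth=0` only the start URL is returned; each extra level adds the pages linked from the previous one, mirroring the Max Crawl Depth setting below.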
Input Configuration
Required Fields
- Start URLs (required): List of URLs where crawling begins. Typically your website's homepage.
Optional Fields
- Max Crawl Depth (default: 3): How many link levels to follow from the start URLs
  - 0 = only the start URLs
  - 1 = start URLs plus the pages they link to
  - 2 = two levels deep, and so on
- Max Pages Per Crawl (default: 1000): Maximum number of pages to crawl
- Include URL Patterns: Regular expressions for URLs to include. Leave empty to include all same-domain URLs.
  - Example: `^https://example\.com/blog/.*` (only blog pages)
- Exclude URL Patterns: Regular expressions for URLs to exclude
  - Example patterns: `.*/admin/.*` (admin pages), `.*/login.*` (login pages), `.*\?.*` (URLs with query parameters)
- Sitemap Formats (default: xml): Which formats to generate
  - xml: standard XML sitemap
  - html: human-readable HTML sitemap
  - text: plain text list of URLs
- Respect robots.txt (default: true): Follow the website's robots.txt rules
- Change Frequency (default: weekly): Expected update frequency. Options: always, hourly, daily, weekly, monthly, yearly, never
- Default Priority (default: 0.5): Default page priority (0.0 to 1.0). The homepage automatically gets 1.0, and priority decreases with page depth.
- Include Images (default: false): Add images to the XML sitemap
- Proxy Configuration: Use Apify Proxy for better reliability
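The interplay between same-domain filtering and the include/exclude patterns can be sketched as follows. The function name and the precedence order (excludes win over includes) are assumptions for illustration, not a description of the Actor's internals:

```python
import re
from urllib.parse import urlparse

def should_include(url, base_domain, include_patterns, exclude_patterns):
    # Same-domain filtering: only keep pages from the target domain
    if urlparse(url).netloc != base_domain:
        return False
    # Exclude patterns are assumed to take precedence over includes
    if any(re.search(p, url) for p in exclude_patterns):
        return False
    # An empty include list means "include all same-domain URLs"
    if include_patterns and not any(re.search(p, url) for p in include_patterns):
        return False
    return True
```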
Output
The Actor generates the following outputs:
Key-Value Store Files
- sitemap.xml (if the XML format is selected)
  - Standard XML sitemap format
  - Includes loc, lastmod, changefreq, and priority
  - Optional image:image elements
- sitemap.html (if the HTML format is selected)
  - Human-readable sitemap
  - Styled with CSS for better presentation
  - Shows priority and change frequency
- sitemap.txt (if the text format is selected)
  - Simple text file with one URL per line
  - Easy to parse and import
Dataset
Statistics about the crawl:
```json
{
  "totalUrls": 150,
  "baseDomain": "example.com",
  "crawlDepth": 3,
  "generatedAt": "2025-11-05T10:30:00.000Z",
  "formats": ["xml", "html"]
}
```
Use Cases
- SEO Optimization: Submit sitemaps to Google Search Console and Bing Webmaster Tools
- Website Audits: Discover all pages on a website
- Migration Planning: Document site structure before migration
- Content Inventory: Get a complete list of all website pages
- Quality Assurance: Ensure all important pages are discoverable
Examples
Basic Usage
```json
{
  "startUrls": [{ "url": "https://example.com" }],
  "maxCrawlDepth": 3,
  "sitemapFormats": ["xml"]
}
```
Advanced Usage with Filters
```json
{
  "startUrls": [{ "url": "https://example.com" }],
  "maxCrawlDepth": 4,
  "maxPagesPerCrawl": 5000,
  "includePatterns": [
    "^https://example\\.com/blog/.*",
    "^https://example\\.com/products/.*"
  ],
  "excludePatterns": [".*/admin/.*", ".*/login.*", ".*\\?.*"],
  "sitemapFormats": ["xml", "html", "text"],
  "includeImages": true,
  "changefreq": "daily",
  "proxyConfiguration": { "useApifyProxy": true }
}
```
E-commerce Site
```json
{
  "startUrls": [{ "url": "https://shop.example.com" }],
  "maxCrawlDepth": 5,
  "includePatterns": [
    "^https://shop\\.example\\.com/products/.*",
    "^https://shop\\.example\\.com/categories/.*"
  ],
  "excludePatterns": [".*/cart.*", ".*/checkout.*", ".*/account.*"],
  "sitemapFormats": ["xml"],
  "includeImages": true
}
```
Technical Details
XML Sitemap Format
The generated XML sitemap follows the sitemaps.org protocol:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2025-11-05</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <!-- More URLs -->
</urlset>
```
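As a sketch of how such a urlset could be produced with Python's standard library (this is an illustration of the format, not the Actor's code; `build_sitemap_xml` and the entry shape are assumptions):

```python
import xml.etree.ElementTree as ET

def build_sitemap_xml(entries):
    # Each entry is a dict with loc/lastmod/changefreq/priority keys
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            ET.SubElement(url, tag).text = str(entry[tag])
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(urlset, encoding="unicode")
```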
Priority Calculation
- Homepage: 1.0
- Depth 1 pages: 0.8
- Depth 2 pages: 0.6
- Depth 3+ pages: max(0.3, defaultPriority)
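The table above maps directly to a small function (the name is hypothetical, but the values match the documented rules):

```python
def page_priority(depth, default_priority=0.5):
    # Priorities by crawl depth, as documented above
    by_depth = {0: 1.0, 1: 0.8, 2: 0.6}
    # Depth 3+ falls back to max(0.3, defaultPriority)
    return by_depth.get(depth, max(0.3, default_priority))
```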
Performance
- Speed: Crawls 10-50 pages per minute (depending on website speed)
- Concurrency: Up to 10 concurrent requests
- Memory: Efficient memory usage with streaming
- Limits: Can handle websites with 100,000+ pages
Best Practices
- Start with Small Depth: Test with maxCrawlDepth=2 first
- Use Exclude Patterns: Filter out unnecessary pages (login, admin, etc.)
- Enable Proxy: Use Apify Proxy for better reliability
- Set Realistic Limits: Don't crawl more pages than needed
- Respect robots.txt: Keep it enabled unless you have permission
- Test Patterns: Verify your regex patterns work correctly
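For the "Test Patterns" tip, a quick local check against sample URLs can catch regex mistakes before a full crawl. The URLs and patterns below are made up; substitute your own:

```python
import re

# Made-up sample URLs and exclude patterns to verify locally
sample_urls = [
    "https://example.com/blog/hello-world",
    "https://example.com/admin/users",
    "https://example.com/page?id=7",
]
exclude_patterns = [r".*/admin/.*", r".*\?.*"]

# Map each URL to whether any exclude pattern matches it
results = {url: any(re.search(p, url) for p in exclude_patterns)
           for url in sample_urls}
for url, excluded in results.items():
    print(url, "->", "excluded" if excluded else "kept")
```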
Troubleshooting
Actor Finds Too Few URLs
- Increase maxCrawlDepth
- Check that your excludePatterns aren't too restrictive
- Verify that the website has internal links

Actor Finds Too Many URLs
- Decrease maxCrawlDepth
- Add more excludePatterns
- Reduce maxPagesPerCrawl

Crawling is Slow
- Enable proxy configuration
- Check the website's response time
- Reduce maxConcurrency if the website blocks requests
License
Apache-2.0
Support
For issues, feature requests, or questions, please contact support or create an issue.
Keywords
XML sitemaps generator, sitemap generator for Blogger, Google sitemap generator, sitemap generator tool, sitemap generator for WordPress, visual sitemap generator, free sitemap generator.