Get URLs from link

7 days trial then $2.95/month - No credit card required now

boring_code/get-urls-from-link

Extracts URLs from a sitemap or webpage with intuitive path matching. Use comma-separated patterns to include or exclude URL paths: '/tags/' matches the exact path, '/product' matches paths starting with /product, and plain text such as 'product' matches anywhere in the URL.

This actor extracts URLs from a sitemap or any webpage containing links. It provides intuitive URL path matching and flexible filtering options to get exactly the URLs you need.
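
To make the two input types concrete, here is a minimal sketch of how URLs could be pulled from a sitemap and from an ordinary webpage using only the Python standard library. It is an illustration, not the actor's actual implementation: the function names `urls_from_sitemap` and `urls_from_html` are hypothetical, and it assumes sitemap entries live in `<loc>` elements and page links in `<a href>` attributes.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


def urls_from_sitemap(xml_text: str) -> list[str]:
    """Collect the text of every <loc> element, ignoring XML namespaces."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.iter()
        # Tags look like "{namespace}loc"; keep only the local name.
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    ]


class _LinkCollector(HTMLParser):
    """Gathers href values from <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def urls_from_html(html_text: str, base_url: str) -> list[str]:
    """Resolve every <a href> against base_url so relative links become absolute."""
    collector = _LinkCollector()
    collector.feed(html_text)
    return [urljoin(base_url, href) for href in collector.hrefs]
```

Fetching the document itself (for example with `urllib.request`) is left out so the sketch stays focused on the parsing step.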

Features

  • Extract URLs from XML sitemaps or webpages
  • Smart URL path matching:
    • Use '/tags/' to match the exact path
    • Use '/product' to match paths starting with /product
    • Use 'tags/' to match paths ending with tags/
    • Use 'product' to match URLs containing this text anywhere
  • Exclude specific file extensions (e.g., images)
  • Exclude URLs using the same smart path matching
  • Limit the number of processed URLs
  • Simple comma-separated syntax for filters

Input Configuration

| Field | Type | Description |
|---|---|---|
| link | String | URL to process (required) |
| urlPattern | String | List of URL parts to include (comma separated). Use '*' to include all URLs. When using slashes: '/tags/' matches exact path, '/tags' matches path starting with /tags, 'tags/' matches path ending with tags/. Without slashes (e.g., 'product') matches anywhere in URL |
| maxUrls | Integer | Maximum number of URLs to process (0 for no limit). Good for testing purposes |
| excludeExtensions | String | List of file extensions to exclude (comma separated). Example: jpg,jpeg,png,gif |
| customExcludePattern | String | List of URL parts to exclude (comma separated). Uses same pattern matching as urlPattern. Examples: '/tags/,category' or '/blog/,author' |
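
The slash rules for urlPattern and customExcludePattern can be sketched as a small matcher. This is one possible reading of the description, not the actor's source code: `matches_pattern` and `matches_any` are hypothetical names, "exact path" is assumed to mean the URL's path equals the pattern, and whitespace around commas is assumed to be ignored.

```python
from urllib.parse import urlparse


def matches_pattern(url: str, pattern: str) -> bool:
    """Apply one pattern to a URL using the documented slash rules."""
    if pattern == "*":
        return True
    path = urlparse(url).path
    starts, ends = pattern.startswith("/"), pattern.endswith("/")
    if starts and ends:
        return path == pattern            # assumed reading of "exact path"
    if starts:
        return path.startswith(pattern)   # '/tags' -> path starts with /tags
    if ends:
        return path.endswith(pattern)     # 'tags/' -> path ends with tags/
    return pattern in url                 # no edge slashes: substring of the whole URL


def matches_any(url: str, patterns: str) -> bool:
    """True if the URL matches at least one comma-separated pattern."""
    parts = [p.strip() for p in patterns.split(",") if p.strip()]
    return any(matches_pattern(url, p) for p in parts)
```

Under these assumptions, `matches_any(url, urlPattern) and not matches_any(url, customExcludePattern)` would decide whether a URL is kept.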

Output

The actor outputs a dataset containing URLs that match your specified criteria. Each record has the following field:

{
    "url": "https://example.com/page"
}

Usage Examples

Basic Usage

Extract all URLs from a sitemap:

{
    "link": "https://example.com/sitemap.xml"
}
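
For readers who want to run the actor from a script rather than the web console, the following sketch shows one way to do it with the `apify-client` Python package. Several things here are assumptions: that the package is installed, that an API token is available in an environment variable named `APIFY_TOKEN`, and that `boring_code/get-urls-from-link` is the identifier to pass to the client. `build_run_input` and `run_actor` are hypothetical helper names.

```python
import os

ACTOR_ID = "boring_code/get-urls-from-link"  # assumed actor identifier


def build_run_input(link: str, **options) -> dict:
    """Assemble the actor input; only `link` is required."""
    if not link:
        raise ValueError("link is required")
    return {"link": link, **options}


def run_actor(run_input: dict) -> list[str]:
    """Start a run and return the URLs from its dataset (needs network access)."""
    # Imported lazily so the helper above works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env var name
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    items = client.dataset(run["defaultDatasetId"]).iterate_items()
    return [item["url"] for item in items]
```

A call such as `run_actor(build_run_input("https://example.com/sitemap.xml"))` would then correspond to the basic input shown above.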

Smart Path Matching

Get only product URLs with different matching options:

{
    "link": "https://example.com/sitemap.xml",
    "urlPattern": "/products/,productId,deals/"
}

This will match:

  • URLs with the exact path '/products/'
  • URLs containing 'productId' anywhere
  • URLs whose path ends with 'deals/'
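
To see why each of the three patterns falls under a different rule, the sketch below labels every comma-separated pattern by where its slashes sit. `classify_patterns` is a hypothetical helper written for illustration, and the rule names ("exact", "prefix", "suffix", "substring") are labels chosen here, not terms used by the actor.

```python
def classify_patterns(patterns: str) -> list[tuple[str, str]]:
    """Label each comma-separated pattern with the rule it would trigger."""
    labelled = []
    for raw in patterns.split(","):
        pattern = raw.strip()  # assumption: surrounding whitespace is ignored
        if not pattern:
            continue
        if pattern == "*":
            kind = "all"
        elif pattern.startswith("/") and pattern.endswith("/"):
            kind = "exact"
        elif pattern.startswith("/"):
            kind = "prefix"
        elif pattern.endswith("/"):
            kind = "suffix"
        else:
            kind = "substring"
        labelled.append((kind, pattern))
    return labelled


print(classify_patterns("/products/,productId,deals/"))
# [('exact', '/products/'), ('substring', 'productId'), ('suffix', 'deals/')]
```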

Exclude File Types and Sections

Get URLs excluding images and specific sections:

{
    "link": "https://example.com/sitemap.xml",
    "excludeExtensions": "jpg,jpeg,png,gif",
    "customExcludePattern": "/tags/,/category/,author"
}
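
The extension filter can be illustrated with a short check on the URL path. This is a sketch under assumptions rather than the actor's own logic: `has_excluded_extension` is a hypothetical name, the comparison is assumed to be case-insensitive, and query strings are assumed not to count as part of the extension.

```python
import posixpath
from urllib.parse import urlparse


def has_excluded_extension(url: str, exclude_extensions: str) -> bool:
    """True if the URL path ends in one of the comma-separated extensions."""
    excluded = {e.strip().lower().lstrip(".") for e in exclude_extensions.split(",")}
    excluded.discard("")  # ignore empty entries such as a trailing comma
    # splitext on the path only, so "?v=2" does not hide the extension.
    ext = posixpath.splitext(urlparse(url).path)[1]
    return ext.lower().lstrip(".") in excluded


urls = [
    "https://example.com/page",
    "https://example.com/img/logo.PNG?v=2",
    "https://example.com/guide.pdf",
]
kept = [u for u in urls if not has_excluded_extension(u, "jpg,jpeg,png,gif")]
```

With the sample list above, only the image URL is dropped.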

Limit Results

Get first 100 URLs for testing:

{
    "link": "https://example.com/sitemap.xml",
    "maxUrls": 100
}
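
The "0 for no limit" convention of maxUrls is easy to get wrong when reimplemented, so here is a small sketch of how such a cap could behave. `limit_urls` is a hypothetical helper; rejecting negative values is an assumption, since the description does not say how the actor treats them.

```python
from itertools import islice
from typing import Iterable, Iterator


def limit_urls(urls: Iterable[str], max_urls: int = 0) -> Iterator[str]:
    """Yield at most max_urls items; 0 means no limit."""
    if max_urls < 0:
        raise ValueError("max_urls must be 0 or a positive integer")
    if max_urls == 0:
        return iter(urls)
    # islice stops early, so a large source is not consumed past the cap.
    return islice(urls, max_urls)
```

Because the helper is lazy, a small cap during testing avoids processing the rest of a large sitemap.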

Developer

Maintained by Community

Actor Metrics

  • 11 monthly users
  • 4 stars
  • >99% runs succeeded
  • Created in Oct 2024
  • Modified 3 months ago
