URLs List - Extract ALL website urls

Pricing

from $0.20 / 1,000 results

Automatically discovers and extracts ALL URLs from any website. Perfect for SEO analysis, content inventory, and bulk URL extraction from multiple websites. Get complete URL lists with metadata including last modified dates and priority levels.

Rating: 5.0 (2)

Developer: Lofomachines (maintained by Community)

Actor stats: 1 bookmarked · 25 total users · 8 monthly active users · last modified 12 days ago

Comprehensive URL Extractor for SEO audits, content inventory, and bulk analysis.

This Actor automatically discovers and extracts ALL URLs from any target website. It is designed as the entry point for SEO audits, site migrations, and content analysis pipelines, and it crawls recursively to build a complete map of a domain.

✨ Key Features

  • 🔍 Automatic Discovery: Intelligently finds all available URLs from any website structure.
  • 💨 Fast & Efficient: Optimized for speed to handle large sites (50k+ URLs).
  • 📦 Bulk Processing: Accepts multiple domain roots to process simultaneously.
  • 🏷️ Rich Metadata: Extracts last modified dates, priority levels, and update frequency (where available).
  • 🗜️ Smart Handling: Works with standard sitemaps, recursive crawling, and standard web formats.
  • 🛡️ Resilient: Automatic retries on temporary errors and infinite loop prevention.
  • 🎯 Result Limiting: Control the maximum number of URLs extracted with maxResults or enable returnAll for complete extraction.
  • 🔎 Keyword Filtering: Filter URLs by keywords; only URLs containing all specified keywords are returned.
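
To illustrate the kind of metadata a standard sitemap carries, here is a minimal sketch that reads `<url>` entries from sitemap XML using Python's standard library. It is not the Actor's implementation; the function name `parse_sitemap` and the output keys (`url`, `lastmod`, `changefreq`, `priority`) are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text):
    """Return one dict per <url> entry; optional fields are None when absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url_el in root.findall("sm:url", SITEMAP_NS):
        loc = url_el.findtext("sm:loc", default="", namespaces=SITEMAP_NS)
        entries.append({
            "url": loc.strip(),
            "lastmod": url_el.findtext("sm:lastmod", namespaces=SITEMAP_NS),
            "changefreq": url_el.findtext("sm:changefreq", namespaces=SITEMAP_NS),
            "priority": url_el.findtext("sm:priority", namespaces=SITEMAP_NS),
        })
    return entries


# Hypothetical sitemap content used only for this example.
SAMPLE = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/blog/post</loc>
  </url>
</urlset>
"""

entries = parse_sitemap(SAMPLE)
for entry in entries:
    print(entry)
```

The second entry shows why metadata is described as "where available": a sitemap may list a URL without any of the optional fields.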

🎯 Use Cases

| Use Case | Description |
| --- | --- |
| SEO Audit | Extract all URLs to analyze site architecture and identify orphan pages. |
| Content Inventory | Create a comprehensive list of all existing pages for migration planning. |
| Monitoring | Track lastmod dates to identify which content has been updated recently. |
| Data Pipelines | Feed the output URLs into other scrapers (e.g., Scrape HTML, Google Sheets export). |
| Targeted Extraction | Use keyword filtering to extract only specific sections (e.g., all blog posts, product pages). |
| Sampling | Use maxResults to extract a sample of URLs for quick analysis without processing entire sites. |
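
As a rough sketch of the Monitoring use case, the snippet below selects records whose last-modified date falls within a recent window. The record shape (`url` and `lastmod` keys) and the helper names are assumptions for illustration, not the Actor's documented output schema.

```python
from datetime import datetime, timedelta, timezone


def parse_lastmod(value):
    """Parse an ISO 8601 date or datetime string; treat naive values as UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def recently_updated(records, days, now):
    """Return URLs whose lastmod is within `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return [
        record["url"]
        for record in records
        # Records without a lastmod value cannot be dated, so skip them.
        if record.get("lastmod") and parse_lastmod(record["lastmod"]) >= cutoff
    ]


# Hypothetical records used only for this example.
records = [
    {"url": "https://example.com/", "lastmod": "2024-01-15"},
    {"url": "https://example.com/old", "lastmod": "2023-06-01T08:30:00Z"},
    {"url": "https://example.com/unknown", "lastmod": None},
]

now = datetime(2024, 1, 20, tzinfo=timezone.utc)
recent = recently_updated(records, days=30, now=now)
print(recent)
```

Passing `now` explicitly keeps the function deterministic, which makes it easier to compare results between scheduled runs.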

💰 Cost of Usage

This scraper is designed to be lightweight. It parses URL structures without rendering page JavaScript unless necessary, which keeps costs low.

  • Small Sites (< 1,000 URLs): Cents per run.
  • Medium Sites (10,000 URLs): Typically < $1.00.
  • Large Sites: Efficiency scales well, but usage depends on the complexity of the target site's architecture.

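Tip: Use Apify Proxy to ensure consistent access and avoid blocking. The parameter table below lists useApifyProxy as false by default, so enable it explicitly in proxyConfiguration, as the example input does.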


📥 Input Configuration

The Actor expects a JSON input defining the websites to scan.

Example Input

{
  "startUrls": [
    { "url": "https://apify.com" },
    { "url": "https://crawlee.dev" }
  ],
  "proxyConfiguration": {
    "useApifyProxy": true
  },
  "returnAll": true,
  "maxResults": 1000,
  "keywords": ["blog", "article"]
}
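
The input above can also be assembled in code. The following is a minimal sketch, assuming the `apify-client` Python package is installed for the optional run step. The helper names `build_run_input` and `run_actor` are hypothetical, and the Actor ID must be copied from the Actor's store page because it is not stated here.

```python
def build_run_input(start_urls, keywords=None, max_results=None, use_proxy=True):
    """Build an input dict using the parameter names documented for this Actor."""
    run_input = {
        "startUrls": [{"url": url} for url in start_urls],
        "proxyConfiguration": {"useApifyProxy": use_proxy},
        "keywords": list(keywords or []),
    }
    if max_results is None:
        # No limit requested: ask for every discovered URL.
        run_input["returnAll"] = True
    else:
        # maxResults only takes effect when returnAll is false.
        run_input["returnAll"] = False
        run_input["maxResults"] = max_results
    return run_input


def run_actor(token, actor_id, run_input):
    """Start a run and return its dataset items (requires `pip install apify-client`)."""
    from apify_client import ApifyClient  # imported lazily so the builder works without it

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


sample_input = build_run_input(
    ["https://apify.com", "https://crawlee.dev"],
    keywords=["blog"],
    max_results=50,
)
print(sample_input)
```

To execute a run, call `run_actor` with your API token, the Actor ID, and `sample_input`.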

Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| startUrls | Array | ✅ Yes | [{ url: "https://apify.com" }] | List of website URLs to extract pages from. |
| proxyConfiguration | Object | ❌ No | { useApifyProxy: false } | Proxy settings for reliable access. |
| returnAll | Boolean | ❌ No | true | If true, extracts all available URLs regardless of maxResults. If false, applies the maxResults limit. |
| maxResults | Integer | ❌ No | 1000 | Maximum number of URLs to extract. Ignored if returnAll is true or if maxResults is set to 0. |
| keywords | Array | ❌ No | [] | Filter URLs to only include those containing ALL specified keywords. Matching is case-insensitive. Example: ["blog"] returns only URLs containing "blog" (e.g., https://example.com/blog/article). |
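
The interaction between keywords, returnAll, and maxResults can be sketched as follows. This models the documented semantics only; the function names are hypothetical, and applying the keyword filter before the limit is an assumption about ordering rather than confirmed Actor behavior.

```python
def matches_keywords(url, keywords):
    """True when the URL contains every keyword, ignoring case."""
    lowered = url.lower()
    return all(keyword.lower() in lowered for keyword in keywords)


def select_urls(urls, keywords=(), return_all=True, max_results=1000):
    """Apply the keyword filter, then the result limit, as described in the table."""
    matched = [url for url in urls if matches_keywords(url, keywords)]
    # The limit is ignored when returnAll is true or maxResults is 0.
    if return_all or max_results == 0:
        return matched
    return matched[:max_results]


# Hypothetical URLs used only for this example.
urls = [
    "https://example.com/blog/article",
    "https://example.com/Blog/news",
    "https://example.com/about",
]

blog_only = select_urls(urls, keywords=["blog"])
blog_articles = select_urls(urls, keywords=["blog", "article"])
sample = select_urls(urls, return_all=False, max_results=2)
print(blog_only, blog_articles, sample)
```

An empty keywords list matches every URL, which is consistent with the default of [] meaning no filtering.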