Pricing

$10.00/month + usage

Domain Scraper

Developed by

Iqbal R

This actor scrapes unique domains from a list of provided URLs. It crawls each page, extracts domains, and stores them in a dataset. The actor respects a defined maximum depth and filters domains based on whether they are ICANN-approved and whether private domains are allowed.

Runs succeeded: >99%

Last modified: 6 months ago

Categories: Automation, Developer tools, SEO tools

This actor scrapes domains starting from a list of provided URLs. It recursively crawls linked pages up to a defined maximum depth, extracts the domains it finds, and stores them in a dataset. Domains are filtered by whether they are ICANN-approved and whether private domains are allowed. Each domain is saved only once, so the dataset contains no duplicates.

Features

Domain Extraction: Extracts domains from a list of provided URLs and recursively explores linked pages to gather additional domains.
Recursive Crawling: Crawls web pages to a user-defined maximum depth, enabling detailed exploration while managing resource usage.
Domain Filtering: Filters domains by ICANN approval and by whether private domains are allowed.
Unique Dataset: Ensures only unique domains are saved by preventing duplicates during the crawling process.
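The depth limit and the deduplication described above can be sketched as a breadth-first crawl. This is only an illustration under stated assumptions, not the actor's actual code: the `LINKS` table is a hypothetical in-memory stand-in for real pages, `crawl_hostnames` is a made-up helper name, and the sketch records the hostname of each visited page rather than performing any network requests.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical in-memory "web": page URL -> links found on that page.
LINKS = {
    "https://a.example/": ["https://b.example/page", "https://a.example/about"],
    "https://a.example/about": ["https://c.example/"],
    "https://b.example/page": ["https://d.example/", "https://a.example/"],
    "https://c.example/": [],
    "https://d.example/": [],
}


def crawl_hostnames(start_urls, max_depth, links=LINKS):
    """Breadth-first crawl up to max_depth; return unique hostnames in discovery order."""
    seen_urls = set(start_urls)   # never enqueue the same page twice
    seen_hosts = set()            # never store the same hostname twice
    hostnames = []
    queue = deque((url, 0) for url in start_urls)

    while queue:
        url, depth = queue.popleft()
        host = urlparse(url).hostname
        if host and host not in seen_hosts:
            seen_hosts.add(host)
            hostnames.append(host)
        if depth >= max_depth:
            continue  # depth limit reached: do not follow this page's links
        for link in links.get(url, []):
            if link not in seen_urls:
                seen_urls.add(link)
                queue.append((link, depth + 1))
    return hostnames


print(crawl_hostnames(["https://a.example/"], max_depth=1))
```

With a depth of 0 only the start pages are visited; each extra level of depth lets the crawl follow one more hop of links, which is where the trade-off between coverage and resource usage comes from.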

Input Schema

Start URLs (required): A list of URLs to start crawling from.
Maximum Depth: The maximum link depth to crawl, counted from the start URLs.
Allow Private Domains: Option to enable or disable crawling of private domains.
ICANN Domains Only: Option to restrict processing to ICANN-approved domains only.
Proxy Configuration: Configuration settings for selecting and using proxies during crawling.
Minimum Concurrency: The minimum number of concurrent requests or pages to process.
Maximum Concurrency: The maximum number of concurrent requests or pages to process.
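As a rough illustration, an input object covering the fields above could be assembled and sanity-checked as follows. Every key name (`startUrls`, `maxDepth`, `allowPrivateDomains`, `icannOnly`, `proxyConfiguration`, `minConcurrency`, `maxConcurrency`) and every default value here is an assumption made for this sketch; the actor's Input tab is the authority on the real names and defaults.

```python
import json


def build_input(start_urls, max_depth=1, allow_private=False, icann_only=True,
                min_concurrency=1, max_concurrency=10, use_apify_proxy=True):
    """Build a JSON-serializable input object. All key names are hypothetical."""
    if not start_urls:
        raise ValueError("Start URLs are required")
    if max_depth < 0:
        raise ValueError("Maximum Depth must not be negative")
    if not 1 <= min_concurrency <= max_concurrency:
        raise ValueError("Need 1 <= Minimum Concurrency <= Maximum Concurrency")
    return {
        "startUrls": [{"url": url} for url in start_urls],  # assumed shape
        "maxDepth": max_depth,
        "allowPrivateDomains": allow_private,
        "icannOnly": icann_only,
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
        "minConcurrency": min_concurrency,
        "maxConcurrency": max_concurrency,
    }


payload = build_input(["https://example.com"], max_depth=2)
print(json.dumps(payload, indent=2))
```

Validating the values before starting a run catches an empty URL list or an inverted concurrency range early, instead of after the run has been billed.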

Dataset Schema

domain: The full registrable domain name, including the public suffix (e.g., example.com).
domainWithoutSuffix: The domain without the public suffix (e.g., example from example.com).
hostname: The full hostname the domain was extracted from (e.g., sub.example.com).
isIcann: Indicates whether the domain is ICANN-approved (boolean).
publicSuffix: The public suffix of the domain (e.g., .com, .org).
isPrivate: Indicates whether the domain is a private domain (boolean).
subdomain: The subdomain part of the hostname (e.g., sub from sub.example.com).
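To make the relationship between these fields concrete, the sketch below splits a hostname using a tiny hand-written suffix table. The table is a four-entry stand-in for the Public Suffix List, and `parse_hostname` and `keep` are hypothetical helpers; how the actor itself derives the fields or applies its two filter options is not documented here, so the behavior shown is an assumption.

```python
# Tiny stand-in for the Public Suffix List: suffix -> section it belongs to.
SUFFIXES = {"com": "icann", "io": "icann", "co.uk": "icann", "github.io": "private"}


def parse_hostname(hostname, allow_private=True):
    """Split a hostname into the dataset fields, or return None if no suffix matches."""
    labels = hostname.lower().rstrip(".").split(".")
    for i in range(len(labels)):  # longest candidate suffix first
        suffix = ".".join(labels[i:])
        section = SUFFIXES.get(suffix)
        if section is None or (section == "private" and not allow_private):
            continue
        if i == 0:
            return None  # the hostname is itself a public suffix
        return {
            "hostname": ".".join(labels),
            "domain": ".".join(labels[i - 1:]),
            "domainWithoutSuffix": labels[i - 1],
            "subdomain": ".".join(labels[:i - 1]),
            "publicSuffix": suffix,
            "isIcann": section == "icann",
            "isPrivate": section == "private",
        }
    return None


def keep(record, icann_only):
    """Assumed filter: drop unparsable hosts and, if requested, non-ICANN suffixes."""
    return record is not None and (record["isIcann"] or not icann_only)


print(parse_hostname("sub.example.co.uk"))
print(parse_hostname("user.github.io"))
print(parse_hostname("user.github.io", allow_private=False))
```

In this sketch, allowing private domains makes `github.io` count as the suffix, so `user.github.io` is the domain; disallowing them falls back to `io`, so `github.io` becomes the domain and `user` the subdomain.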

How to Use

Set up the Actor
Start by providing a list of URLs to begin the crawling process. You can either manually input the URLs or provide a list in the actor configuration.
Configure the Input Parameters
- Start URLs: Provide the initial URLs from which the crawler will start.
- Maximum Depth: Define how deep the crawler should explore.
- Allow Private Domains: Choose whether to allow crawling of private domains.
- ICANN Domains Only: Set whether to crawl only ICANN-approved domains.
- Proxy Configuration: If necessary, configure the proxy settings for your crawler.
- Concurrency: Adjust the minimum and maximum concurrency based on your needs.
Run the Actor
Once the input parameters are configured, run the actor to start the crawling process. The actor will crawl the pages, extract unique domains, and store the results in the dataset.
View Results
After the actor finishes running, you can view the extracted domains in the dataset. The data will be displayed in a table format with the following fields:
- Domain
- Domain Without Suffix
- Hostname
- ICANN Domain
- Public Suffix
- Private Domain
- Subdomain
Export Data
You can export the dataset for further processing or analysis. The results are saved in a structured format for easy integration with other tools.
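For post-processing outside the platform, dataset items can be flattened to CSV with the standard library alone. The sketch below assumes the items have already been downloaded as a list of dictionaries keyed by the Dataset Schema field names; the sample item and the `to_csv` helper are made up for illustration.

```python
import csv
import io

# Column order follows the Dataset Schema section.
FIELDS = ["domain", "domainWithoutSuffix", "hostname", "isIcann",
          "publicSuffix", "isPrivate", "subdomain"]


def to_csv(items):
    """Serialize dataset items to CSV text, ignoring any unexpected keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(items)
    return buffer.getvalue()


# Made-up sample item for illustration only.
items = [{
    "domain": "example.com", "domainWithoutSuffix": "example",
    "hostname": "sub.example.com", "isIcann": True,
    "publicSuffix": "com", "isPrivate": False, "subdomain": "sub",
}]
print(to_csv(items))
```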
Modify Parameters
Adjust the configuration and rerun the actor as needed to gather additional data or refine the crawling process.

Conclusion

This actor provides an efficient solution for scraping and extracting unique domains from a list of URLs. It recursively crawls the provided pages, extracts domains, and stores them in a dataset. By respecting a defined maximum depth and filtering domains based on ICANN approval and private domain allowance, it ensures only relevant domains are captured.

The actor saves each domain only once, so the dataset stays free of duplicates. This makes it a useful tool for gathering domain data in a structured way while keeping control over the types of domains collected.
