Naked Domains Analyzer
jancurn/analyze-domains
Crawls and downloads web pages running on a list of provided naked domains, e.g. "example.com". The actor stores an HTML snapshot, a screenshot, the text body, and the HTTP response headers of every page. It also extracts email addresses, phone numbers, and social handles for Facebook, Twitter, LinkedIn, and Instagram.

Domains

domains (string, optional)

List of domains to crawl. The domains must be naked, i.e. specified without a protocol or sub-domain (e.g. example.com).

Default value of this property is ""
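
As a rough illustration of what "naked" means here, the following sketch rejects values that carry a protocol, a path, or a leading www. label. The helper name looks_naked is hypothetical and this is not the actor's own validation; it is only a syntactic check, and it cannot detect arbitrary sub-domains (e.g. blog.example.com), which would need a public-suffix list.

```python
def looks_naked(value: str) -> bool:
    """Heuristic check that a value resembles a naked domain.

    Hypothetical helper for illustration only, not part of the actor.
    """
    v = value.strip().lower()
    # Reject empty values, protocols, paths, and embedded whitespace.
    if not v or "://" in v or "/" in v or " " in v:
        return False
    # Reject the most common sub-domain prefix.
    if v.startswith("www."):
        return False
    # Require at least one dot that is neither leading nor trailing.
    return "." in v and not v.startswith(".") and not v.endswith(".")


if __name__ == "__main__":
    for candidate in ["example.com", "https://example.com", "www.example.com"]:
        print(candidate, looks_naked(candidate))
```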

Domains file URL

domainsFileUrl (string, optional)

URL of a text file that contains the list of domains to crawl. The domains must be naked, i.e. specified without a protocol or sub-domain (e.g. example.com).

This field is useful if you have a large number of domains.

Domains file offset

domainsFileOffset (integer, optional)

Indicates how many domains to skip at the beginning of the file. This is useful if you only want to crawl a portion of the domains.

Default value of this property is 0

Domains file count

domainsFileCount (integer, optional)

Indicates how many domains from the file should be crawled, starting from the offset. This is useful if you only want to crawl a portion of the domains. Leave empty to crawl all domains.
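
The way the offset and count select a window of the domains file can be sketched as a plain list slice. This is an assumption-laden illustration, not the actor's code: it assumes the file holds one domain per line and that blank lines are ignored, and the function name select_domains is hypothetical.

```python
from typing import Iterable, List, Optional


def select_domains(
    lines: Iterable[str], offset: int = 0, count: Optional[int] = None
) -> List[str]:
    """Return the window of domains described by an offset and a count.

    Assumes one domain per line; blank lines are dropped before slicing.
    """
    domains = [line.strip() for line in lines if line.strip()]
    # An empty count (None) means "crawl everything after the offset".
    end = None if count is None else offset + count
    return domains[offset:end]


if __name__ == "__main__":
    sample = ["a.com", "b.com", "c.com", "d.com"]
    print(select_domains(sample, offset=1, count=2))
```

With offset 1 and count 2, the sample list yields the second and third domains; leaving the count empty returns everything from the offset onward.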

Use Chrome

useChrome (boolean, optional)

If checked, the actor uses Chrome instead of Puppeteer's Chromium for crawling. This might help to avoid being blocked by some pages.

Default value of this property is false

Use Apify Proxy

useApifyProxy (boolean, optional)

If checked, the actor uses Apify Proxy to access the target pages. This might help to avoid being blocked by some pages.

Default value of this property is false

Max page retries

maxRequestRetries (integer, optional)

Indicates how many times the crawler retries loading a page after an error.

Default value of this property is 0

crawlLinkCount (integer, optional)

Indicates how many links on the main page that point to the same domain are also crawled.

Default value of this property is 0

Crawl HTTPS version

crawlHttpsVersion (boolean, optional)

If checked, the actor attempts to crawl the HTTPS version of the website (e.g. https://example.com for domain example.com).

Default value of this property is true

Crawl www. sub-domain

crawlWwwSubdomain (boolean, optional)

If checked, the actor attempts to crawl the www. sub-domain of the website (e.g. http://www.example.com for domain example.com).

Default value of this property is true
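
To make the effect of the two flags above concrete, here is a minimal sketch that expands one naked domain into candidate start URLs. It follows only the examples given in the descriptions and assumes the plain http:// version is always crawled; whether the actor also tries other combinations (such as HTTPS on the www. sub-domain) is not stated, so none are generated here. The function name start_urls is hypothetical.

```python
from typing import List


def start_urls(
    domain: str,
    crawl_https_version: bool = True,
    crawl_www_subdomain: bool = True,
) -> List[str]:
    """Expand a naked domain into candidate URLs (illustrative sketch only)."""
    # Assumption: the plain HTTP version of the naked domain is always included.
    urls = [f"http://{domain}"]
    if crawl_https_version:
        urls.append(f"https://{domain}")
    if crawl_www_subdomain:
        urls.append(f"http://www.{domain}")
    return urls


if __name__ == "__main__":
    for url in start_urls("example.com"):
        print(url)
```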

Save screenshots

saveScreenshot (boolean, optional)

If checked, the actor stores screenshots of all loaded pages into the key-value store.

Default value of this property is true

Save HTML content

saveHtml (boolean, optional)

If checked, the actor stores HTML content of all loaded pages into the key-value store.

Default value of this property is true

Save text content

saveText (boolean, optional)

If checked, the actor stores text content of all loaded pages into the dataset results.

Default value of this property is true

Consider child frames

considerChildFrames (boolean, optional)

If checked, the actor also searches for social handles in the content of first-level child frames. In that case, 'page.text' contains the combined text of the main frame and its direct child frames.

Default value of this property is false
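
Putting the fields together, an input object with the documented defaults might look like the sketch below. The property names and default values come from the descriptions above; the domain value is a placeholder, and domainsFileUrl and domainsFileCount are left out because no default is documented for them.

```python
import json

# Input for jancurn/analyze-domains, using the documented defaults.
# "example.com" is a placeholder value, not a documented default.
run_input = {
    "domains": "example.com",
    "domainsFileOffset": 0,
    "useChrome": False,
    "useApifyProxy": False,
    "maxRequestRetries": 0,
    "crawlLinkCount": 0,
    "crawlHttpsVersion": True,
    "crawlWwwSubdomain": True,
    "saveScreenshot": True,
    "saveHtml": True,
    "saveText": True,
    "considerChildFrames": False,
}

if __name__ == "__main__":
    # Serialize to JSON, the form in which actor input is typically supplied.
    print(json.dumps(run_input, indent=2))
```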

Developer
Maintained by Community
Actor metrics
  • 9 monthly users
  • 99.9% runs succeeded
  • Created in Nov 2018
  • Modified 10 months ago