Email ✉️ & Phone ☎️ Extractor
7-day trial, then $30.00/month - No credit card required now
Extract emails, phone numbers, and other contact information like Twitter, LinkedIn, Instagram... from websites you provide. Best for lead generation and data enrichment. Export data in structured formats and dominate your outreach game. Capture your leads almost for free, fast, and without limits.
What is the difference between these 3: Maximum pages per start URL, Maximum link depth, Total maximum pages?
see image
Hello, and thanks for your message.
I will update the documentation so that it is clearer from now on.
In the meantime, here is what I can tell you about the three settings:
Maximum pages per start URL: This setting caps how many pages the crawler will fetch for each start URL. For example, if it is set to 100, the crawler will fetch at most 100 pages for each start URL in your list.
Maximum link depth: This setting specifies how many links away from a start URL the crawler will follow, i.e. how many levels deep it will explore. For instance, if the maximum link depth is set to 3, the crawler will only follow links up to three levels away from the start URL. In short, pages that are not directly linked from the URL you specify are less likely to be crawled.
Total maximum pages: This setting caps the total number of pages fetched across the whole run. Once this limit is reached, the crawler stops fetching additional pages, regardless of how many pages it has fetched per start URL or how deep it has followed links. All three limits are illustrated in the sketch below.
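To make the three limits concrete, here is a minimal sketch of an input object. The field names (maxPagesPerStartUrl, maxLinkDepth, totalMaxPages) are illustrative guesses based on the setting labels above, not the actor's confirmed input schema:

// Hedged sketch: field names are assumed from the UI labels above,
// not taken from the actor's real input schema.
const input = {
  startUrls: [
    { url: "https://www.domain1.com" },
    { url: "https://www.domain2.com" },
  ],
  maxPagesPerStartUrl: 100, // fetch at most 100 pages per start URL
  maxLinkDepth: 3, // follow links at most 3 levels from each start URL
  totalMaxPages: 500, // hard cap on pages fetched in the whole run
};
console.log(JSON.stringify(input, null, 2));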
I hope this helps.
Michael R, Apr 4, 2024, 15:25 CEST
Hi, and thanks.
So, as I understand you:
Maximum pages per start URL: will "always" stay on the domain, e.g. www.domain1. /home, www.domain1. /contact, www.domain1. /product, etc.?
Maximum link depth: will "count" all other domains, e.g. starting URL: www.domain1. /home, www.domain2. /home, www.domain3. /home, etc.?
Total maximum pages: will count all the sites that are visited, e.g. www.domain1. /home + www.domain1. /contact + www.domain1. /product + www.domain2. /home + www.domain3. /home = 5 sites. And if my list is 100 domains and I only set this to 5, then it will stop crawling after the first 2 domains?
Have a lovely day!
Kind regards,
Michael Roger
guillim, Apr 2, 2024, 09:14 CEST
Don't 100% understand these 3 settings (anchor/email-phone-extractor)
Almost :)
"Maximum pages per start URL: will "always" stay on the domaín - eks: www.domain1. /home - www.domain1. /contact - www.domain1. /product ect..... ??"
---> Staying on the domain is actually controlled by a separate parameter called "Stay within domain". If you don't check that box and domain1 has a link to domain2, the crawler will follow that link, and the page will still count toward "Maximum pages per start URL". To avoid this behaviour, check the box or specify a pseudoUrl.
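For illustration, this is roughly how that could look in the actor input. The stayWithinDomain field name is my guess for the checkbox; the pseudoUrls entry follows Apify's [regex] placeholder convention for whitelisting links:

// Hedged sketch: stayWithinDomain is an assumed name for the
// "Stay within domain" checkbox; pseudoUrls uses Apify's
// [regex] placeholder convention.
const input = {
  startUrls: [{ url: "https://www.domain1.com/home" }],
  stayWithinDomain: true, // don't follow links that leave domain1
  // Alternatively, only follow links that match this pattern:
  pseudoUrls: [{ purl: "https://www.domain1.com/[.*]" }],
};
console.log(JSON.stringify(input, null, 2));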
"Maximum link depth: will "count" all other domains - eks. starting url: www.domain1. /home - www.domain2. /home - www.domain3. /home ect..... ??"
---> yes
"Total maximum pages: will count all the sites that are wisit: eks. www.domain1. /home + www.domain1. /contact + www.domain1. /product + www.domain2. /home + www.domain3. /home = 5 sits - and if my list is 100 domaisn and i only set this on 5 then it will stop crawling after the first 2 domains ??"
---> You almost got it right. In your case you have listed 5 pages, so if your total pages limit is 5, domain1, domain2, and domain3 will all get crawled. But if I add one more page, it will not:
1. www.domain1. /home -> crawled
2. www.domain1. /contact -> crawled
3. www.domain1. /product -> crawled
4. www.domain2. /home -> crawled
5. www.domain3. /home -> crawled
6. www.domain4. /home -> not crawled
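To mimic the behaviour above, here is a tiny stand-alone sketch of the global counter that "Total maximum pages" implies (plain TypeScript, not the actor's actual code):

// Stand-alone simulation of the "Total maximum pages" cutoff:
// one global counter shared across all domains and start URLs.
const totalMaxPages = 5;
const pages = [
  "www.domain1. /home",
  "www.domain1. /contact",
  "www.domain1. /product",
  "www.domain2. /home",
  "www.domain3. /home",
  "www.domain4. /home",
];

let crawled = 0;
for (const page of pages) {
  if (crawled < totalMaxPages) {
    crawled += 1;
    console.log(`${page} -> crawled`);
  } else {
    console.log(`${page} -> not crawled (total limit reached)`);
  }
}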
Closing this as a stale issue. Feel free to reopen.
Actor Metrics
190 monthly users
50 stars
>99% runs succeeded
3.3 hours response time
Created in Oct 2021
Modified 2 months ago