Email ✉️ & Phone ☎️ Extractor

anchor/email-phone-extractor

7-day free trial, then $30.00/month. No credit card required.
Extract emails, phone numbers, and other contact information like Twitter, LinkedIn, Instagram... from websites you provide. Best for lead generation and data enrichment. Export data in structured formats and dominate your outreach game. Capture your leads almost for free, fast, and without limits.


Don't 100% understand these 3 settings

Closed

Michall2117 opened this issue
8 months ago

What is the difference between these 3: Maximum pages per start URL, Maximum link depth, and Total maximum pages?


Michall2117

8 months ago

see image


Anchor (anchor)

8 months ago

Hello, and thanks for your message.

I will update the documentation so that it is clearer from now on.

In the meantime, here is what I can tell you about the three settings:

Maximum pages per start URL: This setting determines the maximum number of pages the crawler will fetch for each start URL you provide. For example, if it is set to 100, the crawler will fetch at most 100 pages for each start URL during its crawl.

Maximum link depth: This setting specifies the maximum number of links away from the start URL that the crawler will follow during its traversal. In other words, it limits how many levels deep the crawler will explore from the start URL. For instance, if the maximum link depth is set to 3, the crawler will only follow links up to three levels away from the start URL. Basically, links that are not directly reachable from the URL you specify have less chance of being crawled.

Total maximum pages: This setting sets the overall limit on the number of pages fetched during the whole crawl. Once this limit is reached, the crawler stops fetching additional pages, regardless of how many pages it has fetched for each individual start URL or how deep the links it has followed are.
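To make the interplay of these three limits concrete, here is a minimal TypeScript sketch of how a breadth-first crawler could enforce them. This is an illustration only, not the actor's actual implementation; `CrawlLimits`, `crawl`, and the stubbed `fetchLinks` helper are made-up names.

```typescript
// Illustration only: how the three limits could interact in a breadth-first crawl.
// `fetchLinks(url)` is a hypothetical stand-in for downloading a page and
// extracting its links; it is not the actor's real code.

interface CrawlLimits {
  maxPagesPerStartUrl: number; // "Maximum pages per start URL"
  maxLinkDepth: number;        // "Maximum link depth"
  totalMaxPages: number;       // "Total maximum pages"
}

async function fetchLinks(url: string): Promise<string[]> {
  // Placeholder: a real crawler would fetch the page and parse its <a href> targets.
  return [];
}

async function crawl(startUrls: string[], limits: CrawlLimits): Promise<string[]> {
  const visited = new Set<string>();
  const crawled: string[] = [];

  for (const startUrl of startUrls) {
    let pagesForThisStart = 0;
    // Queue of [url, depth]; the start URL itself is at depth 0.
    const queue: Array<[string, number]> = [[startUrl, 0]];

    while (queue.length > 0) {
      // Global cap: stop the whole crawl once "Total maximum pages" is reached.
      if (crawled.length >= limits.totalMaxPages) return crawled;
      // Per-start-URL cap: move on to the next start URL once its own limit is reached.
      if (pagesForThisStart >= limits.maxPagesPerStartUrl) break;

      const [url, depth] = queue.shift()!;
      if (visited.has(url)) continue;
      visited.add(url);

      crawled.push(url);
      pagesForThisStart += 1;

      // Depth cap: only enqueue links from pages that are still above the depth limit.
      if (depth < limits.maxLinkDepth) {
        for (const link of await fetchLinks(url)) {
          queue.push([link, depth + 1]);
        }
      }
    }
  }
  return crawled;
}
```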

I hope this helps.


Michall2117-owner

8 months ago


Hi, and thanks.

So, as I understand you:

Maximum pages per start URL: will "always" stay on the domain, e.g. www.domain1./home, www.domain1./contact, www.domain1./product, etc.?

Maximum link depth: will "count" all other domains, e.g. starting URL www.domain1./home, then www.domain2./home, www.domain3./home, etc.?

Total maximum pages: will count all the sites that are visited, e.g. www.domain1./home + www.domain1./contact + www.domain1./product + www.domain2./home + www.domain3./home = 5 sites. And if my list is 100 domains and I only set this to 5, then it will stop crawling after the first 2 domains?



Anchor (anchor)

8 months ago

Almost :)

"Maximum pages per start URL: will "always" stay on the domaín - eks: www.domain1. /home - www.domain1. /contact - www.domain1. /product ect..... ??"

---> When you talk about staying on the domain, this is related to another parameter called "Stay within domain". If you don't check this box, then if domain1 has a link to domain2, your crawler will follow that link, and it will count toward the "Maximum pages per start URL" limit. To avoid this behaviour, you need to check the box or specify some pseudo-URLs.
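As a rough illustration of this point (again, not the actor's real code), a crawler could filter outgoing links like this; `shouldFollow` and `allowPattern` are hypothetical names, and the regular expression is only a loose stand-in for a pseudo-URL pattern:

```typescript
// Illustration only: one way a crawler could decide whether to follow a link.
// The actor's real "Stay within domain" option and pseudo-URL matching may be
// implemented differently.

function shouldFollow(
  startUrl: string,
  linkUrl: string,
  stayWithinDomain: boolean,
  allowPattern?: RegExp, // rough stand-in for a pseudo-URL pattern
): boolean {
  if (stayWithinDomain) {
    const sameHost = new URL(linkUrl).hostname === new URL(startUrl).hostname;
    if (!sameHost) return false; // e.g. a domain1 -> domain2 link is skipped
  }
  if (allowPattern && !allowPattern.test(linkUrl)) return false;
  return true;
}

// With stayWithinDomain = true, a link from domain1 to domain2 is not followed,
// so it never counts toward "Maximum pages per start URL".
console.log(shouldFollow('https://www.domain1.com/home',
                         'https://www.domain2.com/home', true)); // false
```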

"Maximum link depth: will "count" all other domains - eks. starting url: www.domain1. /home - www.domain2. /home - www.domain3. /home ect..... ??"

---> Yes.

"Total maximum pages: will count all the sites that are wisit: eks. www.domain1. /home + www.domain1. /contact + www.domain1. /product + www.domain2. /home + www.domain3. /home = 5 sits - and if my list is 100 domaisn and i only set this on 5 then it will stop crawling after the first 2 domains ??"

---> You almost got it right. In your case, you have listed 5 pages, so if your page limit is 5, domain1, domain2, and domain3 will all get crawled. But if I add a sixth page, it will not be crawled (a small sketch of this cut-off follows the list):

1. www.domain1./home -> crawled
2. www.domain1./contact -> crawled
3. www.domain1./product -> crawled
4. www.domain2./home -> crawled
5. www.domain3./home -> crawled
6. www.domain4./home -> not crawled
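A tiny sketch of the cut-off in the numbered list above, expressed as a plain global counter (illustrative only; the variable names are made up):

```typescript
// The numbered example above as a global page counter:
// crawling stops as soon as `totalMaxPages` pages have been fetched.

const pagesInCrawlOrder = [
  'www.domain1./home',
  'www.domain1./contact',
  'www.domain1./product',
  'www.domain2./home',
  'www.domain3./home',
  'www.domain4./home',
];

const totalMaxPages = 5;
const crawledPages: string[] = [];

for (const page of pagesInCrawlOrder) {
  if (crawledPages.length >= totalMaxPages) break; // the 6th page is never fetched
  crawledPages.push(page);
}

console.log(crawledPages); // ends at www.domain3./home; www.domain4./home is skipped
```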


Anchor (anchor)

7 months ago

Closing this as a stale issue. Feel free to reopen.

Developer
Maintained by Community

Actor Metrics

  • 189 monthly users

  • 50 stars

  • >99% runs succeeded

  • 3.3 hours response time

  • Created in Oct 2021

  • Modified 2 months ago