Contact Details Scraper

Pay $3.00 for 1,000 pages

vdrmota/contact-info-scraper

Free email extractor to extract and download emails, phone numbers, Facebook, Twitter, LinkedIn, and Instagram profiles from any website. Extract contact information at scale from lists of URLs and download the data as Excel, CSV, JSON, HTML, and XML.

Picking which pages to crawl

Closed

spr123 opened this issue
2 months ago

Under the "Maximum link depth (optional)" setting, could you add an option to scrape only pages whose URLs contain specific words?

Many websites do not have their contact details on the home page.

So if you add the home page

https://florencescoveljewelry.com/

and add the word "Contact" as a URL word to scrape, it will then find this page:

https://florencescoveljewelry.com/pages/contact-us

It's not practical to search hundreds of product pages on an e-commerce website.

Thanks,

Scott
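
The request above amounts to a keyword filter on discovered links. A minimal Python sketch of that idea follows; it is not the Actor's code, the function name `filter_links_by_keyword` is hypothetical, and the product URL is made up for illustration.

```python
from urllib.parse import urlparse


def filter_links_by_keyword(links, keywords):
    """Keep only links whose URL path contains one of the keywords (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    kept = []
    for link in links:
        path = urlparse(link).path.lower()
        if any(keyword in path for keyword in lowered):
            kept.append(link)
    return kept


links = [
    "https://florencescoveljewelry.com/products/example-ring",  # hypothetical product URL
    "https://florencescoveljewelry.com/pages/contact-us",
]
contact_links = filter_links_by_keyword(links, ["Contact"])
print(contact_links)
```

With this filter, only the contact page would be enqueued and the product pages would be skipped.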

milunnn

Hi Scott,

Looks like a great idea! We will add this feature to our backlog and let you know of any updates.

spr123

2 months ago

Thanks so much, looking forward to this.

ondrejklinovsky

Hey,

The Actor already does this behind the scenes: pages containing words like contact, about, or support take precedence over the others. So it should be enough to limit the number of requests per start URL:

{
    ...
    "maxRequestsPerStartUrl": 3,
    "maxDepth": 1
}

See this run to see the results.
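
To illustrate how prioritization and a per-start-URL request cap could interact, here is a rough Python sketch. It is an assumption about the general idea, not the Actor's actual implementation: the function `pick_requests` is hypothetical, it assumes the start URL itself counts toward the limit, it matches the words against the URL path only, and the product URL is made up.

```python
from urllib.parse import urlparse

# Words named in the reply above; the Actor's real list may differ.
PRIORITY_WORDS = ("contact", "about", "support")


def pick_requests(start_url, discovered_links, max_requests_per_start_url):
    """Return the URLs to crawl: start URL first, then priority links, then the rest."""

    def is_priority(url):
        path = urlparse(url).path.lower()
        return any(word in path for word in PRIORITY_WORDS)

    # sorted() is stable, so links keep their discovery order within each group.
    ordered = sorted(discovered_links, key=lambda url: not is_priority(url))
    return ([start_url] + ordered)[:max_requests_per_start_url]


discovered = [
    "https://florencescoveljewelry.com/products/example-ring",  # hypothetical product URL
    "https://florencescoveljewelry.com/pages/contact-us",
]
selected = pick_requests("https://florencescoveljewelry.com/", discovered, 2)
print(selected)
```

Under these assumptions, a limit of 2 leaves room for the home page plus the contact page, and the product page is never requested.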

spr123

2 months ago

The prioritization does not seem to work.

For example:

https://console.apify.com/actors/9Sk4JJhEma9vBKqrg/runs/fNYxztiX5otE2Hdv4#output

It crawls the home page, then goes to

https://www.scosche.com/magic-mount-cell-phone-holder-tablet-mount

https://www.scosche.com/collections/magnetic-phone-mount

https://www.scosche.com/heavy-duty-phone-tablet-mounts

The second URL it visits should be

https://www.scosche.com/contact

Also, once it finds all the data, say on the second URL, shouldn't the run stop, even if you have selected 5?

Many thanks

Scott
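
The early-stop behavior asked about above could look roughly like the Python sketch below. It is only an illustration: "all the data" is assumed here to mean one email address and one phone number, the regular expressions are deliberately simple, the function `crawl_until_complete` is hypothetical, and the page texts are made up rather than fetched from the real site.

```python
import re

# Simplified patterns for illustration; real-world extraction needs stricter rules.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def crawl_until_complete(urls, fetch_text):
    """Visit URLs in order and stop once both an email and a phone number are found."""
    found = {"email": None, "phone": None}
    visited = []
    for url in urls:
        text = fetch_text(url)
        visited.append(url)
        for field, pattern in (("email", EMAIL_RE), ("phone", PHONE_RE)):
            if found[field] is None:
                match = pattern.search(text)
                if match:
                    found[field] = match.group()
        # Stop early: nothing left to look for.
        if all(value is not None for value in found.values()):
            break
    return found, visited


# Made-up page contents keyed by URLs from this thread.
pages = {
    "https://www.scosche.com/": "Welcome to our store",
    "https://www.scosche.com/contact": "Email help@example.com or call +1 555 010 0199",
    "https://www.scosche.com/collections/magnetic-phone-mount": "Product listing",
}
found, visited = crawl_until_complete(list(pages), lambda url: pages[url])
print(found, visited)
```

In this sketch the third page is never requested, because both fields are filled after the contact page.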

spr123

2 months ago

It seems to work OK if you select just 2.

Developer
Maintained by Apify

Actor Metrics

  • 1.3k monthly users

  • 190 stars

  • >99% runs succeeded

  • 4.1 days response time

  • Created in May 2019

  • Modified a day ago