Contact Details Scraper
Free email extractor to extract and download emails, phone numbers, Facebook, Twitter, LinkedIn, and Instagram profiles from any website. Extract contact information at scale from lists of URLs and download the data as Excel, CSV, JSON, HTML, and XML.
I've run into a case where the following input:
{
  "startUrls": [
    {
      "url": "https://www.hotelsouthmelbourne.com/"
    }
  ],
  "maxRequests": 5,
  "maxRequestsPerStartUrl": 1,
  "maxDepth": 2,
  "sameDomain": true,
  "considerChildFrames": true,
  "proxyConfig": {
    "useApifyProxy": true
  }
}
results in 3 emails:
"emails": [
  "7@300x-100.jpg",
  "17@300x-100.jpg",
  "info@hotelsouthmelbourne.com"
],
As you can see, the two .jpg image filenames match the regex pattern for an email! Could this be updated to exclude known image extensions, and maybe font extensions too?
It used to be common practice to use the @ symbol in filenames to target devices with different pixel densities, so filtering out known file types would make this otherwise great scraper a bit more robust!
Thanks,
Dav
Hi,
The issue has been fixed in the latest build; we added a check for image and font extensions.
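The actual change is not shown in this thread, so the following is only a rough sketch of what such a check might look like. The regex, the `ASSET_EXTENSIONS` deny-list, and the `extract_emails` helper are all assumptions made for illustration, not the scraper's real code.

```python
import re

# Deliberately simple email pattern for illustration only;
# the scraper's real regex is not shown in this thread.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical deny-list of image and font file extensions.
ASSET_EXTENSIONS = {
    "jpg", "jpeg", "png", "gif", "webp", "svg", "bmp", "ico", "avif",
    "woff", "woff2", "ttf", "otf", "eot",
}


def extract_emails(text):
    """Return unique email-like matches whose final suffix is not an asset extension."""
    found = []
    for match in EMAIL_RE.findall(text):
        # Compare the part after the last dot, case-insensitively.
        extension = match.rsplit(".", 1)[-1].lower()
        if extension in ASSET_EXTENSIONS:
            continue  # e.g. "7@300x-100.jpg" is a filename, not an address
        if match not in found:
            found.append(match)
    return found


sample = (
    'src="7@300x-100.jpg" src="17@300x-100.jpg" '
    "mailto:info@hotelsouthmelbourne.com"
)
print(extract_emails(sample))
```

With the three values from the report above, only `info@hotelsouthmelbourne.com` would pass the filter. A deny-list like this trades a little recall for precision: it has to stay clear of suffixes that are also real top-level domains, so it is kept to well-known image and font extensions here.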
- 809 monthly users
- 96.1% runs succeeded
- 30.8 days response time
- Created in May 2019
- Modified 3 days ago