Simple Contact Info and Social Media Scraper
Pay $3.00 for 1,000 results
This Apify actor is designed to crawl web pages and extract social media handles, emails, and phone numbers using Puppeteer. It can handle dynamic content and navigate through multiple pages, making it suitable for comprehensive data extraction tasks.
Included features
- Data Extraction: Extracts social media handles, emails, and phone numbers.
- Dynamic Content Handling: Supports crawling through links and HTML frames.
- Configurable: Set depth and request limits.
- Proxy Support: Uses Apify's proxy configuration for anonymity and IP rotation.
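The extraction step can be illustrated with a few regular expressions applied to a page's HTML. This is a minimal sketch, not the actor's own logic: the patterns, the list of social networks, and the function name `extractContacts` are all assumptions chosen for illustration.

```javascript
// Illustrative patterns only (assumptions); the actor's real rules are not documented here.
const EMAIL_RE = /[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}/gi;
// Loose phone pattern: optional "+", then digits separated by spaces, dots, dashes, or parentheses.
const PHONE_RE = /\+?\d[\d\s().-]{6,}\d/g;
// Hypothetical shortlist of networks; extend as needed.
const SOCIAL_RE = /https?:\/\/(?:www\.)?(?:facebook|twitter|linkedin|instagram)\.com\/[\w\-./]+/gi;

// Return the distinct values of an array, preserving first-seen order.
function unique(values) {
  return [...new Set(values)];
}

// Collect emails, phone numbers, and social profile links from raw HTML or text.
function extractContacts(html) {
  return {
    emails: unique((html.match(EMAIL_RE) ?? []).map((e) => e.toLowerCase())),
    phones: unique((html.match(PHONE_RE) ?? []).map((p) => p.trim())),
    socialProfiles: unique(html.match(SOCIAL_RE) ?? []),
  };
}
```

Regex-based phone matching is deliberately loose and will produce false positives on pages with long numeric strings, so results usually need post-filtering.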
How it works
- Input: Define start URLs in INPUT.json.
- Proxy Configuration: Set up proxies to avoid IP blocking.
- Crawler Setup: Use PuppeteerCrawler with custom routing.
- Request Handling: Customize page handling in routes.js.
- Execution: Start the crawler with crawler.run(startUrls).
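The traversal these steps describe can be sketched without any dependencies. In the actor, PuppeteerCrawler handles the queueing and page loading; the version below only illustrates how depth, request, and same-domain limits might interact. The function `crawl`, the callback `loadLinks`, and the convention that start URLs sit at depth 0 are assumptions made for this example.

```javascript
// Dependency-free sketch of a bounded breadth-first crawl (not the actor's code).
// `loadLinks` is a hypothetical async callback returning the absolute URLs found on a page.
async function crawl(startUrls, { maxDepth = 2, maxRequests = 100, sameDomain = true }, loadLinks) {
  const queue = startUrls.map(({ url }) => ({ url, depth: 0, origin: new URL(url).hostname }));
  const seen = new Set(startUrls.map(({ url }) => url));
  const visited = [];

  while (queue.length > 0 && visited.length < maxRequests) {
    const { url, depth, origin } = queue.shift();
    visited.push(url);
    if (depth >= maxDepth) continue; // pages at maxDepth are visited, but their links are not followed

    for (const link of await loadLinks(url)) {
      if (seen.has(link)) continue; // skip URLs already queued or visited
      if (sameDomain && new URL(link).hostname !== origin) continue; // stay on the start URL's host
      seen.add(link);
      queue.push({ url: link, depth: depth + 1, origin });
    }
  }
  return visited;
}
```

Passing the page loader in as a callback keeps the limits testable in isolation; in a real run that callback would be replaced by the crawler's request handler.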
Input Configuration
{
  "considerChildFrames": true,
  "maxDepth": 2,
  "maxRequests": 100,
  "sameDomain": true,
  "startUrls": [
    { "url": "https://nonos.ph/", "method": "GET" }
  ]
}
- startUrls: List of URLs to start crawling from.
- proxyConfig: Configuration for using Apify's proxy services.
- sameDomain: Restrict crawling to the same domain.
- maxDepth: Maximum depth of links to follow.
- considerChildFrames: Enable crawling of HTML frames.
- maxRequests: Total number of requests to make.
- maxRequestsPerStartUrl: Limit requests per start URL.
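Before starting a run, it can help to validate the input and fill in missing options. The sketch below is hypothetical: the source lists the option names but not their defaults, so the values in `DEFAULTS` are simply copied from the sample input, and `normalizeInput` is a name invented for this example.

```javascript
// Assumed defaults, taken from the sample input above rather than from documented behavior.
const DEFAULTS = { sameDomain: true, maxDepth: 2, considerChildFrames: true, maxRequests: 100 };

// Validate the actor input and merge it over the assumed defaults.
function normalizeInput(input) {
  if (!Array.isArray(input.startUrls) || input.startUrls.length === 0) {
    throw new Error('startUrls must be a non-empty array');
  }
  for (const { url } of input.startUrls) {
    new URL(url); // throws a TypeError if the URL is malformed
  }
  const merged = { ...DEFAULTS, ...input };
  for (const key of ['maxDepth', 'maxRequests']) {
    if (!Number.isInteger(merged[key]) || merged[key] < 0) {
      throw new Error(`${key} must be a non-negative integer`);
    }
  }
  return merged;
}
```

For example, `normalizeInput({ startUrls: [{ url: 'https://nonos.ph/' }], maxDepth: 1 })` keeps the supplied `maxDepth` and fills in the remaining options.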
Output Dataset
The actor stores its results in the default dataset associated with the actor run. You can download the results in formats such as JSON, HTML, CSV, XML, or Excel. Each record in the dataset includes:
- URL: The page URL.
- Email: Extracted email addresses.
- Phone Number: Extracted phone numbers.
- Social Media Profiles: Links to social media profiles (e.g., Facebook, Twitter, LinkedIn).
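One way to download the results is through the Apify API's dataset items endpoint, which accepts a `format` query parameter. The helper below only builds the URL; `datasetItemsUrl` is a name made up for this sketch, the dataset ID is a placeholder you would take from your own run, and private datasets may additionally require an API token.

```javascript
// Formats mentioned in this README, written as the API's format values (Excel is "xlsx").
const FORMATS = new Set(['json', 'csv', 'xml', 'html', 'xlsx']);

// Build the download URL for a dataset's items in the requested format.
function datasetItemsUrl(datasetId, format = 'json') {
  if (!FORMATS.has(format)) {
    throw new Error(`Unsupported format: ${format}`);
  }
  const url = new URL(`https://api.apify.com/v2/datasets/${encodeURIComponent(datasetId)}/items`);
  url.searchParams.set('format', format);
  return url.toString();
}
```

The resulting URL can be passed to `fetch` in Node.js 18 or later, or opened directly in a browser.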
Resources
If you're looking for examples or want to learn more, visit:
- Crawlee + Apify Platform guide
- Documentation and examples
- Node.js tutorials in Academy
- How to scale Puppeteer and Playwright
- Video guide on getting data using Apify API
- Integration with Make, GitHub, Zapier, Google Drive, and other apps
Actor Metrics
2 monthly users
1 star
>99% runs succeeded
Created in Nov 2024
Modified 8 days ago