
LeadScraper
Pricing
$2.30 / 1,000 runs

Scrape a list of URLs and receive business contact information, social media links, and a description of the services offered. This actor scrapes multiple pages from each site's sitemap and assigns a confidence score to every phone number and email address it finds. Keywords: webscraper, scrape leads, web scraper.
Total users: 19
Monthly users: 8
Runs succeeded: >99%
Last modified: 2 months ago
# Service Company Website Scraper

An Apify actor that scrapes service company websites and extracts structured information about the business, including contact information, services offered, hours of operation, and more.

## Features

- Extracts company name, description, and contact information
- Identifies services offered by the company
- Extracts business hours, social media links, and reviews
- Finds pricing information and FAQs
- Handles multiple URLs in a single run
- Supports SSL verification options
- Optional Cloudflare bypass capability

## Input

The actor accepts the following input parameters:

- `urls` - An array of service company website URLs to scrape (required)
- `verifySSL` - Whether to verify SSL certificates (default: `true`)
- `bypassCloudflare` - Whether to attempt to bypass Cloudflare protection (default: `true`)
- `metadata` - Optional custom metadata to include with each result

Example input:

    {
      "urls": [
        "https://www.example1.com/",
        "https://www.example2.com/"
      ],
      "verifySSL": true,
      "bypassCloudflare": true,
      "metadata": {
        "project_id": "example-project",
        "source": "manual",
        "category": "roofing"
      }
    }

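
As a rough illustration of how these parameters and their defaults fit together, the sketch below assembles an input object in Python. The helper name `build_input` and its validation rules are assumptions made for this example; they are not part of the actor.

```python
import json
from typing import Any, Dict, List, Optional


def build_input(
    urls: List[str],
    verify_ssl: bool = True,
    bypass_cloudflare: bool = True,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble an actor input dict (hypothetical helper, not part of the actor)."""
    if not urls:
        raise ValueError("'urls' is required and must contain at least one URL")
    for url in urls:
        # Assumption: only absolute http(s) URLs are meaningful to scrape.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not an absolute http(s) URL: {url!r}")

    actor_input: Dict[str, Any] = {
        "urls": list(urls),
        "verifySSL": verify_ssl,
        "bypassCloudflare": bypass_cloudflare,
    }
    if metadata is not None:
        # 'metadata' is optional, so it is only included when provided.
        actor_input["metadata"] = dict(metadata)
    return actor_input


if __name__ == "__main__":
    example = build_input(
        ["https://www.example1.com/", "https://www.example2.com/"],
        metadata={"project_id": "example-project", "category": "roofing"},
    )
    print(json.dumps(example, indent=2))
```

Leaving `verify_ssl` and `bypass_cloudflare` unset reproduces the documented defaults of `true` for both options.
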
## Output

The actor outputs a JSON object for each URL containing the following information:

- `url` - The URL of the scraped website
- `title` - The title of the website
- `meta_description` - The meta description of the website
- `main_content` - The main content of the website
- `contact_information` - Contact information extracted from the website
  - `phones` - List of phone numbers with confidence scores
  - `main_phone` - The main phone number with highest confidence
  - `emails` - List of email addresses with confidence scores
  - `main_email` - The main email address with highest confidence
  - `address` - The physical address of the business
- `services` - List of services offered by the company
- `hours_of_operation` - Business hours by day of the week
- `social_media_links` - Links to social media profiles
- `reviews` - Customer reviews found on the website
- `pricing` - Pricing information for services
- `faqs` - Frequently asked questions
- `success` - Whether the scraping was successful
- `error` - Error message if scraping failed

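
The relationship between the candidate lists and the `main_*` fields can be sketched as follows. This README does not specify the shape of the entries in `phones` and `emails`, so the sketch assumes each entry is an object with a `value` key and a numeric `confidence` key. Those key names, and the helper names `pick_main` and `summarize`, are hypothetical.

```python
from typing import Any, Dict, List, Optional


def pick_main(candidates: Optional[List[Dict[str, Any]]]) -> Optional[str]:
    """Return the value of the highest-confidence candidate, or None if there is none."""
    if not candidates:
        return None
    # Assumed entry shape: {"value": "...", "confidence": 0.0-1.0}.
    best = max(candidates, key=lambda c: c.get("confidence", 0.0))
    return best.get("value")


def summarize(item: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce one result item to a compact lead record."""
    if not item.get("success"):
        # Failed scrapes carry an error message instead of contact data.
        return {"url": item.get("url"), "ok": False, "error": item.get("error")}

    contact = item.get("contact_information") or {}
    return {
        "url": item.get("url"),
        "ok": True,
        # Prefer the actor's own main_* fields; fall back to the candidate lists.
        "phone": contact.get("main_phone") or pick_main(contact.get("phones")),
        "email": contact.get("main_email") or pick_main(contact.get("emails")),
    }
```

Checking `success` first keeps failed URLs in the output with their `error` message rather than silently dropping them.
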
## Example Usage

    const Apify = require('apify');

    Apify.main(async () => {
        const input = {
            urls: [
                "https://www.example1.com/",
                "https://www.example2.com/"
            ],
            verifySSL: true,
            bypassCloudflare: true,
            metadata: {
                project_id: "example-project",
                source: "manual",
                category: "roofing"
            }
        };

        // Run the actor and wait for it to finish
        const run = await Apify.call('your-username/service-company-scraper', input);

        // Print the results
        const dataset = await Apify.openDataset(run.defaultDatasetId);
        const { items } = await dataset.getData();
        console.log('Results:', items);
    });

## Development

### Project Structure

- `main.py` - Entry point for the Apify actor
- `scraper.py` - Contains the `ServiceCompanyScraper` class
- `requirements.txt` - Python dependencies
- `INPUT_SCHEMA.json` - Input schema for the Apify actor
- `OUTPUT_SCHEMA.json` - Output schema for the Apify actor
- `Dockerfile` - Docker configuration for the Apify actor

### Adding New Features

To add new extraction capabilities:

1. Add a new method to the `ServiceCompanyScraper` class in `scraper.py`
2. Call the method from the `scrape` method
3. Update the output schema if necessary
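
The three steps above might look like the following in `scraper.py`. Only the class name `ServiceCompanyScraper` and the `scrape` method name come from this README. The `scrape` signature, the `extract_certifications` method, and the `certifications` output field are hypothetical stand-ins, since the real class is not shown here.

```python
import re
from typing import Any, Dict, List


class ServiceCompanyScraper:
    """Stripped-down stand-in for the real class, which is not shown in this README."""

    # Hypothetical pattern for phrases such as "licensed" or "BBB accredited".
    _CERT_PATTERN = re.compile(
        r"\b(licensed|bonded|insured|bbb accredited)\b", re.IGNORECASE
    )

    def extract_certifications(self, text: str) -> List[str]:
        """Step 1: a new extraction method (hypothetical example)."""
        found = {match.group(1).lower() for match in self._CERT_PATTERN.finditer(text)}
        return sorted(found)

    def scrape(self, url: str, page_text: str) -> Dict[str, Any]:
        """Step 2: call the new method from scrape (signature assumed)."""
        result: Dict[str, Any] = {"url": url, "success": True, "error": None}
        # Step 3: 'certifications' would also need an entry in OUTPUT_SCHEMA.json.
        result["certifications"] = self.extract_certifications(page_text)
        return result
```

Keeping each extractor as its own method means a new field can be exercised on plain text without fetching a page.
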