Search Results Email & Contact Info Scraper

Pricing

$20.00/month + usage

Easily gather contact information with this Apify actor. Enter your search queries, and it automatically extracts email addresses and phone numbers from business listings, personal websites, and other public sources. It works like a personal assistant for contact research.

Rating

0.0

(0)

Developer

Jamshaid Arif

Maintained by Community

Actor stats

0

Bookmarked

28

Total users

2

Monthly active users

3 days ago

Last modified

Apify Contact Scraper Actor

A robust, production-ready web scraper for extracting contact information from search results using DuckDuckGo Search.

Features

Core Capabilities

  • Multi-Source Search: Text, video, and news search support
  • Contact Extraction: Automated email and phone number extraction
  • Smart Filtering: Email provider filtering and site-specific searches
  • Proxy Support: Built-in residential proxy support with country selection
  • Data Validation: Comprehensive input validation and error handling
  • Performance Monitoring: Built-in performance metrics and logging
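
The site-specific and email-provider filters listed above can be pictured as plain query rewriting. The sketch below is an assumption about how such filters might be composed, not the actor's actual code: `build_query` and its `email_provider` argument are hypothetical names, and only the `site:` search operator is a standard search-engine feature.

```python
from typing import Optional


def build_query(search: str,
                linked_site: Optional[str] = None,
                email_provider: Optional[str] = None) -> str:
    """Compose a search string from a base query and optional filters (hypothetical helper)."""
    parts = [search.strip()]
    if email_provider:
        # Quote the "@domain" fragment so the engine favours pages that mention it.
        provider = email_provider.lstrip("@")
        parts.append(f'"@{provider}"')
    if linked_site:
        # "site:" restricts results to a single domain.
        parts.append(f"site:{linked_site}")
    return " ".join(parts)


print(build_query("lawyers in lahore", linked_site="linkedin.com", email_provider="gmail.com"))
```

Narrow filters shrink the result set quickly, which is why an over-restrictive site filter is a common cause of empty results.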

Technical Highlights

  • Type-Safe: Uses Python dataclasses and enums for type safety
  • Error Handling: Comprehensive exception handling at every level
  • Clean Architecture: Separation of concerns with dedicated classes
  • Async/Await: Fully asynchronous for optimal performance
  • Extensible: Easy to add new search types or extraction patterns

Usage

Basic Usage

{
"search": "lawyers in lahore",
"searchType": "text",
"maxResults": 100
}
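
One way to send this input from Python is the official `apify-client` package. The sketch below assumes that package is installed and that an API token is available in an `APIFY_TOKEN` environment variable; `ACTOR_ID` is a placeholder, since the actor's real identifier is not stated here and should be copied from its Apify Store page.

```python
from typing import Any, Dict, List

# Placeholder: replace with the actor's real "<username>/<actor-name>" ID.
ACTOR_ID = "<username>/<actor-name>"


def run_scraper(client: Any, actor_id: str, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Start the actor, wait for the run to finish, and return its dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    import os
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    items = run_scraper(client, ACTOR_ID, {
        "search": "lawyers in lahore",
        "searchType": "text",
        "maxResults": 100,
    })
    for item in items:
        # Video and news items carry no "emails" field, so fall back to an empty list.
        print(item["title"], item.get("emails", []))
```

Call `main()` once the placeholder ID and the token are in place. Passing the client into `run_scraper` keeps the function easy to exercise with a stub during testing.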

Input Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| search | string | Yes | "lawyers in lahore" | Search query |
| searchType | string | No | "text" | Search type: "text", "video", or "news" |
| maxResults | integer | No | 100 | Maximum number of results |
| linkedSites | string | No | null | Limit search to specific domain |
| proxyCountryCode | string | No | "US" | ISO country code for proxy |
| timeout | integer | No | 30 | Request timeout in seconds |
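
To make the parameter rules concrete, here is a minimal validation sketch built on a dataclass and an enum. The class names (`SearchType`, `ActorInput`) and the specific checks are assumptions for illustration; the actor's real validation code is not shown in this document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional


class SearchType(str, Enum):
    TEXT = "text"
    VIDEO = "video"
    NEWS = "news"


@dataclass(frozen=True)
class ActorInput:
    search: str
    search_type: SearchType = SearchType.TEXT
    max_results: int = 100
    linked_sites: Optional[str] = None
    proxy_country_code: str = "US"
    timeout: int = 30

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "ActorInput":
        """Apply the documented defaults and reject values outside the documented ranges."""
        search = str(raw.get("search") or "").strip()
        if not search:
            raise ValueError("'search' is required and must not be empty")
        try:
            search_type = SearchType(raw.get("searchType", "text"))
        except ValueError:
            raise ValueError("'searchType' must be 'text', 'video', or 'news'") from None
        max_results = int(raw.get("maxResults", 100))
        timeout = int(raw.get("timeout", 30))
        if max_results < 1 or timeout < 1:
            raise ValueError("'maxResults' and 'timeout' must be positive integers")
        country = str(raw.get("proxyCountryCode", "US")).upper()
        # Assumed rule: a two-letter code such as "US".
        if len(country) != 2 or not country.isalpha():
            raise ValueError("'proxyCountryCode' must be a two-letter country code")
        return cls(search, search_type, max_results,
                   raw.get("linkedSites"), country, timeout)
```

For example, `ActorInput.from_dict({"search": "lawyers in lahore"})` fills in every default from the table, while an unknown `searchType` raises `ValueError` before any search is attempted.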

Output Schema

Text Search Results

{
"title": "Example Law Firm",
"link": "https://example.com",
"source": "text",
"timestamp": "2024-01-15T10:30:00.000000",
"emails": ["contact@example.com", "info@example.com"],
"phones": ["+1-555-123-4567"]
}
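
The `emails` and `phones` fields could plausibly come from pattern matching over page text. The following sketch is an assumption, not the actor's actual extraction algorithm; the two regular expressions are deliberately loose, and the phone pattern in particular will also match other digit runs such as dates.

```python
import re
from typing import Dict, Iterable, List

# Illustrative patterns only; real-world extraction usually needs stricter rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def _unique(items: Iterable[str]) -> List[str]:
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def extract_contacts(text: str) -> Dict[str, List[str]]:
    """Return de-duplicated emails (lower-cased) and phone-like strings found in text."""
    emails = _unique(match.lower() for match in EMAIL_RE.findall(text))
    phones = _unique(match.strip() for match in PHONE_RE.findall(text))
    return {"emails": emails, "phones": phones}


sample = "Contact us at Contact@Example.com or info@example.com. Call +1-555-123-4567."
print(extract_contacts(sample))
```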

Video Search Results

{
"title": "Video Title",
"link": "https://youtube.com/watch?v=...",
"description": "Video description",
"source": "video",
"timestamp": "2024-01-15T10:30:00.000000"
}

News Search Results

{
"title": "News Article Title",
"link": "https://news-site.com/article",
"source": "news",
"timestamp": "2024-01-15T10:30:00.000000"
}
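
Because only text results carry `emails` and `phones`, downstream code has to tolerate records without those fields. A small consumer-side sketch, assuming records shaped like the three examples above (`contacts_by_link` is a hypothetical helper name):

```python
from typing import Any, Dict, Iterable, List


def contacts_by_link(items: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, List[str]]]:
    """Merge contact details per link, skipping records that have none."""
    merged: Dict[str, Dict[str, List[str]]] = {}
    for item in items:
        emails = item.get("emails") or []
        phones = item.get("phones") or []
        if not emails and not phones:
            continue  # video and news records land here
        entry = merged.setdefault(item["link"], {"emails": [], "phones": []})
        for key, values in (("emails", emails), ("phones", phones)):
            for value in values:
                if value not in entry[key]:
                    entry[key].append(value)
    return merged
```

Grouping by `link` keeps repeated hits on the same page from inflating the contact list.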

Key Improvements

1. Object-Oriented Design

  • Before: Procedural functions with dictionaries
  • After: Clean class-based architecture with clear responsibilities

Error Handling

The actor handles various error scenarios:

  1. Invalid Input: Validates all parameters before processing
  2. Proxy Failures: Graceful handling with clear error messages
  3. Search Errors: Catches and logs search exceptions
  4. Storage Failures: Validates data before storage
  5. Network Timeouts: Configurable timeout with retry logic
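
The "configurable timeout with retry logic" item can be sketched as follows. This is a generic asyncio pattern under assumed names (`with_retries`, `attempts`, `base_delay`), not the actor's actual implementation, and the exponential backoff is one common choice among several.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def with_retries(make_call: Callable[[], Awaitable[T]],
                       timeout: float = 30.0,
                       attempts: int = 3,
                       base_delay: float = 1.0) -> T:
    """Await make_call() with a per-attempt timeout, retrying on timeouts and connection errors."""
    last_error: Optional[BaseException] = None
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(make_call(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Back off: base_delay, then 2x, 4x, ...
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error
```

`make_call` is a zero-argument factory rather than a coroutine object because a coroutine can only be awaited once; each retry needs a fresh one.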

Best Practices Implemented

1. Separation of Concerns

Each class has a single, well-defined responsibility:

  • ContactExtractor: Only extracts contact info
  • SearchEngine: Only performs searches
  • DataStorage: Only handles storage
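
A skeleton of how these three classes might fit together is shown below. Only the class names come from this document; the method names, the `body` field, the in-memory storage, and the canned search results are assumptions made so the sketch runs without network access.

```python
import re
from typing import Dict, List


class SearchEngine:
    """Only performs searches. This stub serves canned results instead of querying the web."""

    def __init__(self, canned: List[Dict[str, str]]) -> None:
        self._canned = canned

    def search(self, query: str, max_results: int) -> List[Dict[str, str]]:
        return self._canned[:max_results]


class ContactExtractor:
    """Only extracts contact info (emails only, to keep the sketch short)."""

    _EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

    def extract_emails(self, text: str) -> List[str]:
        return sorted(set(self._EMAIL.findall(text)))


class DataStorage:
    """Only handles storage. Here that is an in-memory list."""

    def __init__(self) -> None:
        self.records: List[Dict[str, object]] = []

    def save(self, record: Dict[str, object]) -> None:
        self.records.append(record)


def run_pipeline(engine: SearchEngine, extractor: ContactExtractor,
                 storage: DataStorage, query: str, max_results: int = 10) -> int:
    """Wire the collaborators together; returns the number of stored records."""
    for result in engine.search(query, max_results):
        storage.save({
            "title": result["title"],
            "link": result["link"],
            "emails": extractor.extract_emails(result.get("body", "")),
        })
    return len(storage.records)
```

Because each collaborator is passed in, any one of them can be swapped for a stub in tests or replaced with a different backend without touching the others.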

Performance Tips

  1. Adjust maxResults: Start small (10-20) for testing
  2. Use site filters: Narrow searches for faster results
  3. Optimize timeout: Balance between speed and completeness
  4. Batch processing: Process multiple queries in parallel
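
For the batch-processing tip, one common approach is to run queries concurrently while capping how many are in flight. The sketch below assumes a hypothetical `run_query` coroutine that executes a single search; `run_batch` and `max_concurrency` are illustrative names.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def run_batch(queries: List[str],
                    run_query: Callable[[str], Awaitable[list]],
                    max_concurrency: int = 3) -> Dict[str, list]:
    """Run all queries concurrently, with at most max_concurrency active at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(query: str) -> list:
        async with semaphore:
            return await run_query(query)

    # gather() preserves input order, so results line up with queries.
    results = await asyncio.gather(*(guarded(query) for query in queries))
    # Note: duplicate queries collapse into a single dictionary key.
    return dict(zip(queries, results))
```

A modest concurrency limit helps avoid tripping rate limits on the search backend or the proxy.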

Common Issues

No Results Found

  • Check if search query is too specific
  • Verify site filter is not too restrictive
  • Try different email provider filters

Proxy Errors

  • Verify Apify proxy is configured
  • Check that the country code is a valid two-letter ISO code (for example, "US")
  • Ensure sufficient Apify credits

Timeout Errors

  • Increase timeout parameter
  • Reduce maxResults
  • Check network connectivity

License

This code is provided as-is for use with Apify actors.

Support

For issues or questions:

  1. Check the logs for detailed error messages
  2. Review the input parameters
  3. Test with a smaller maxResults value first
  4. Verify proxy configuration

Changelog

Version 2.0 (Enhanced)

  • Complete architectural redesign
  • Added dataclasses and enums for type safety
  • Improved error handling and validation
  • Enhanced contact extraction algorithms
  • Added performance monitoring
  • Better logging with visual indicators
  • Comprehensive documentation

Version 1.0 (Original)

  • Basic search functionality
  • Simple contact extraction
  • Basic error handling