Contact Phone Extractor
Pricing
from $1.00 / 1,000 results
Fast, accurate phone number extractor. It automatically crawls relevant contact and about pages to scrape valid international phone numbers while filtering out faxes, dates, and VAT IDs.
Developer: CodeScraper (maintained by Community)
Contact Phone Extractor: High-Speed B2B Data Engine
This Apify actor extracts business and customer service phone numbers directly from websites with high accuracy and intelligent validation.
It combines contextual filtering, DOM targeting, intelligent contact page discovery, dynamic country detection, and libphonenumber-js validation to identify, normalize, validate, and deduplicate phone numbers while filtering out dates, zip codes, fax numbers, serial numbers, and other false positives.
What It Does
For every website URL provided, the actor extracts:
Website Overview
- Input URL
- Resolved Source URL
- Extraction Status
- Contact Pages Crawled
- Phone Numbers Count
Individual Phone Data
For each phone number found:
- Phone Number
- Location Found (header, footer, body)
- Confidence Score
- Context Snippet
- Source URL
- DOM Subsection
It Handles
- Multiple website URLs
- Automatic URL normalization
- High-speed raw HTML scraping
- Anti-blocking capabilities
- Automatic Contact/About page discovery
- Dynamic country code detection
- Phone validation using libphonenumber-js
- Cross-page deduplication
- Context-based confidence scoring
- Header, footer, navigation, and body extraction
- Filtering of dates, zip codes, fax numbers, and invalid numeric patterns
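Cross-page deduplication can be pictured as keying every number on a canonical digit string, so that differently formatted copies of the same number collapse into one entry. The following is a minimal Python sketch, not the actor's code: `canonical_key`, `dedupe_phones`, and the "keep the highest confidence" policy are assumptions made for illustration, and the input is assumed to be already normalized to international format.

```python
import re

def canonical_key(phone: str) -> str:
    """Reduce a formatted number to its digits so spacing and punctuation do not matter."""
    return re.sub(r"\D", "", phone)

def dedupe_phones(records):
    """Keep the highest-confidence record for each canonical number.

    `records` is a list of dicts with "phoneNumber" and "confidence" keys,
    mirroring the actor's output fields; the merge policy is an assumption.
    """
    best = {}
    for rec in records:
        key = canonical_key(rec["phoneNumber"])
        if key not in best or rec["confidence"] > best[key]["confidence"]:
            best[key] = rec
    return list(best.values())
```

With this keying, `+49 35874 20425` found in a footer and `+4935874-20425` found on a contact page count as one number.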
How It Works
1. Loads website URLs
2. Fetches raw HTML using CheerioCrawler with session rotation and retry handling
3. Scans:
   - Headers
   - Footers
   - Navigation sections
   - Body content
4. Automatically discovers and visits:
   - Contact pages
   - About pages
   - Customer service pages
5. Extracts phone candidates from:
   - Visible text
   - `tel:` links
   - Structured HTML content
6. Detects country based on website TLD
7. Validates numbers using libphonenumber-js
8. Calculates confidence scores from surrounding context
9. Removes duplicates and low-quality matches
10. Saves structured data to the Apify Dataset
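The candidate-extraction step can be illustrated with a small standard-library sketch. The actor itself uses CheerioCrawler and libphonenumber-js; the Python below swaps in `html.parser` and a deliberately loose regular expression purely to show the idea of collecting candidates from `tel:` links and visible text. The class name, pattern, and origin labels are assumptions, and the candidates would still need validation and scoring afterwards.

```python
import re
from html.parser import HTMLParser

# Loose pattern: optional "+", then digits separated by spaces, dots,
# dashes, slashes, or parentheses. It over-matches on purpose.
CANDIDATE_RE = re.compile(r"\+?\d[\d\s().\-/]{6,}\d")

class PhoneCandidateParser(HTMLParser):
    """Collects (raw_text, origin) tuples from tel: links and visible text."""

    def __init__(self):
        super().__init__()
        self.candidates = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().startswith("tel:"):
                self.candidates.append((href[4:], "tel-link"))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore numbers inside scripts and stylesheets
        for match in CANDIDATE_RE.finditer(data):
            self.candidates.append((match.group().strip(), "text"))

def extract_candidates(html: str):
    parser = PhoneCandidateParser()
    parser.feed(html)
    parser.close()
    return parser.candidates
```

Keeping the origin alongside each candidate lets a later scoring step treat an explicit `tel:` link as stronger evidence than a bare digit run in body text.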
Input Configuration
| Field | Type | Description | Default |
|---|---|---|---|
| `startUrls` | Array | List of website URLs to scrape | `[]` |
| `defaultCountryCode` | String | Fallback ISO country code | `"US"` |
| `searchSections` | Object | Select sections to scan | `{"header":true,"footer":true,"body":false}` |
| `deduplicationStrictness` | String | Deduplication mode | `"balanced"` |
| `minPhoneLength` | Integer | Minimum digits required | `8` |
| `maxPhoneLength` | Integer | Maximum digits allowed | `15` |
| `excludePatterns` | Array | Regex patterns to exclude | `["^800","^888"]` |
| `includeOnlyCountryCodes` | Array | Allow only specific calling codes | `[]` |
| `confidenceThreshold` | Number | Minimum confidence score | `0.5` |
| `maxResultsPerUrl` | Integer | Maximum phones returned per URL | `5` |
| `outputFormat` | String | Output format | `"both"` |
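
When building inputs programmatically, it can help to start from the documented defaults and override only what changes. The sketch below copies the defaults from the table; the helper `build_input`, the nested merge of `searchSections`, and the two sanity checks (including the 0 to 1 range for `confidenceThreshold`) are assumptions for illustration, not documented actor behavior.

```python
import copy

# Defaults copied from the input table.
DEFAULT_INPUT = {
    "startUrls": [],
    "defaultCountryCode": "US",
    "searchSections": {"header": True, "footer": True, "body": False},
    "deduplicationStrictness": "balanced",
    "minPhoneLength": 8,
    "maxPhoneLength": 15,
    "excludePatterns": ["^800", "^888"],
    "includeOnlyCountryCodes": [],
    "confidenceThreshold": 0.5,
    "maxResultsPerUrl": 5,
    "outputFormat": "both",
}

def build_input(overrides: dict) -> dict:
    """Merge user overrides into the defaults and run basic sanity checks."""
    unknown = set(overrides) - set(DEFAULT_INPUT)
    if unknown:
        raise KeyError(f"Unknown input fields: {sorted(unknown)}")
    merged = copy.deepcopy(DEFAULT_INPUT)
    for key, value in overrides.items():
        if isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key].update(value)  # nested objects such as searchSections
        else:
            merged[key] = value
    if merged["minPhoneLength"] > merged["maxPhoneLength"]:
        raise ValueError("minPhoneLength must not exceed maxPhoneLength")
    if not 0 <= merged["confidenceThreshold"] <= 1:  # assumed range
        raise ValueError("confidenceThreshold must be between 0 and 1")
    return merged
```

For example, `build_input({"startUrls": ["https://example.com"], "searchSections": {"body": True}})` turns on body scanning while leaving header and footer scanning enabled.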
Example Input
{
  "startUrls": ["https://oberlausitzer-alpakaland.de"],
  "defaultCountryCode": "US",
  "deduplicationStrictness": "balanced",
  "confidenceThreshold": 0.5,
  "excludePatterns": ["^800"],
  "searchSections": {
    "header": true,
    "footer": true,
    "body": false
  }
}
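
To make the filtering fields concrete, here is a hedged Python sketch of how `minPhoneLength`, `maxPhoneLength`, `excludePatterns`, `confidenceThreshold`, and `maxResultsPerUrl` could interact. The README does not say which form of the number the patterns are matched against; this sketch assumes a digits-only string stored under a hypothetical `digits` key, and the function `apply_filters` is likewise illustrative.

```python
import re

def apply_filters(phones, config):
    """Filter and rank candidates for one URL.

    `phones` is a list of dicts with a digits-only "digits" string
    (hypothetical field) and a "confidence" score.
    """
    patterns = [re.compile(p) for p in config.get("excludePatterns", [])]
    kept = []
    for phone in phones:
        digits = phone["digits"]
        if not config["minPhoneLength"] <= len(digits) <= config["maxPhoneLength"]:
            continue  # too short or too long
        if any(p.search(digits) for p in patterns):
            continue  # matches an exclusion pattern such as ^800
        if phone["confidence"] < config["confidenceThreshold"]:
            continue  # not confident enough
        kept.append(phone)
    # Highest confidence first, then cap the list per URL.
    kept.sort(key=lambda p: p["confidence"], reverse=True)
    return kept[: config["maxResultsPerUrl"]]
```

Under these assumptions, a pattern like `^800` drops numbers whose digit string starts with 800, and the cap is applied only after the remaining numbers are ranked by confidence.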
Example Output
{
  "originalInputUrl": "https://oberlausitzer-alpakaland.de",
  "source": "https://oberlausitzer-alpakaland.de",
  "contactPagesVisited": [
    "https://oberlausitzer-alpakaland.de/pages/kontakt",
    "https://oberlausitzer-alpakaland.de/policies/contact-information",
    "https://oberlausitzer-alpakaland.de/policies/legal-notice"
  ],
  "status": "Found",
  "phoneNumbersCount": 2,
  "phoneNumbers": [
    {
      "phoneNumber": "+49 35874 20425",
      "formattedVariations": ["035874-20425", "+49 35874 20425"],
      "source": "https://oberlausitzer-alpakaland.de/pages/kontakt",
      "location": "body",
      "subsection": "main-content",
      "confidence": 1,
      "context": "...alb von 24 Stunden., Telefon: 035874-20425 & 035874-223599, Bitte beacht..."
    },
    {
      "phoneNumber": "+49 35874 223599",
      "formattedVariations": ["035874-223599", "+49 35874 223599", "035874223599"],
      "source": "https://oberlausitzer-alpakaland.de/pages/kontakt",
      "location": "body",
      "subsection": "main-content",
      "confidence": 1,
      "context": "...den., Telefon: 035874-20425 & 035874-223599, Bitte beachte, dass wir kein..."
    }
  ]
}
If no phone numbers are found:
{
  "originalInputUrl": "www.emiconner.com",
  "source": "http://www.emiconner.com/",
  "status": "No phone numbers found",
  "contactPagesVisited": [],
  "phoneNumbersCount": 0,
  "phoneNumbers": []
}
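
Because each dataset item nests its phone numbers in an array, downstream tools such as spreadsheets or CRMs often want one row per number. The following Python sketch flattens items shaped like the examples above into CSV. The column selection and the helper names `flatten_item` and `to_csv` are assumptions; items with no phone numbers are kept as a single row so their status is not lost.

```python
import csv
import io

FIELDS = ["originalInputUrl", "status", "phoneNumber", "confidence", "location", "source"]

def flatten_item(item: dict):
    """Yield one flat row per phone number; items without phones yield one status row."""
    phones = item.get("phoneNumbers") or [{}]
    for phone in phones:
        yield {
            "originalInputUrl": item.get("originalInputUrl", ""),
            "status": item.get("status", ""),
            "phoneNumber": phone.get("phoneNumber", ""),
            "confidence": phone.get("confidence", ""),
            "location": phone.get("location", ""),
            # Prefer the page the number was found on, else the resolved site URL.
            "source": phone.get("source", item.get("source", "")),
        }

def to_csv(items) -> str:
    """Serialize dataset items to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerows(flatten_item(item))
    return buffer.getvalue()
```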
Error Handling
If a website cannot be accessed or processed:
{
  "originalInputUrl": "http://www.jewelrybyARSA.com/",
  "source": "http://www.jewelrybyARSA.com/",
  "status": "Failed",
  "contactPagesVisited": [],
  "error": "Request failed completely (check proxy or rate limits)",
  "phoneNumbersCount": 0,
  "phoneNumbers": []
}
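
Since failures are reported as regular dataset items rather than raised errors, a consumer has to inspect the `status` field. A minimal Python sketch for splitting results by status and collecting URLs worth retrying is shown below. The three status strings are the ones that appear in this README's examples; the helper names and the retry policy are assumptions.

```python
from collections import defaultdict

def group_by_status(items):
    """Bucket dataset items by their "status" value."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get("status", "Unknown")].append(item)
    return dict(groups)

def urls_to_retry(items):
    """Return input URLs whose status is "Failed", in original order and without repeats."""
    seen = set()
    retry = []
    for item in items:
        url = item.get("originalInputUrl")
        if item.get("status") == "Failed" and url and url not in seen:
            seen.add(url)
            retry.append(url)
    return retry
```

The returned list can be passed back as `startUrls` in a follow-up run, for example after changing proxy settings.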
Features
- Accurate phone number extraction
- High-speed HTML-based scraping
- Built-in anti-blocking mechanisms
- Dynamic country detection
- Automatic contact page crawling
- Confidence-based validation
- Automatic deduplication
- Targeted DOM extraction
- Context-aware scoring
- False-positive filtering
Use Cases
- B2B Lead Generation
- CRM Data Enrichment
- Sales Prospecting
- Business Directory Building
- Contact Database Creation
- Cold Outreach Campaigns
- Business Intelligence Research
FAQs
1. Why is it so fast?
The actor downloads and parses raw HTML directly instead of launching a full browser session. CheerioCrawler, session pooling, and optimized parsing keep throughput high and resource usage low.
2. Can it extract phone numbers hidden behind buttons or clicks?
No. Because the actor works with raw HTML, it does not execute JavaScript or interact with page elements. It only extracts contact information that is present in the static HTML.
3. Does it filter fax numbers?
Yes. The contextual scoring engine heavily penalizes numbers associated with terms such as Fax, Fax:, or F: to prevent them from being returned as primary contact numbers.
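As a rough illustration of this kind of contextual scoring, the Python sketch below looks at the text just before a number and lets the nearest label decide. The keyword list, the window size, and the score values 0.9, 0.5, and 0.1 are all assumptions chosen for the example; the actor's real scoring engine is not documented here.

```python
import re

# Illustrative labels only. "F" counts as a fax label only when followed by a colon.
LABEL_RE = re.compile(
    r"\b(?:(?P<fax>fax|f(?=\s*:))|(?P<phone>tel|telefon|phone|mobile))\b",
    re.IGNORECASE,
)

def score_candidate(text: str, start: int, window: int = 40) -> float:
    """Score the number starting at text[start] from the nearest preceding label."""
    before = text[max(0, start - window):start]
    labels = list(LABEL_RE.finditer(before))
    if not labels:
        return 0.5  # no label nearby: neutral score (assumed value)
    nearest = labels[-1]
    # A fax label is heavily penalized; a phone label boosts the score.
    return 0.1 if nearest.group("fax") else 0.9
```

Using the nearest label rather than any label in the window matters for strings such as `Telefon: 035874-20425, Fax: 035874-20426`, where both words precede the second number.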
4. Does it support international phone numbers?
Yes. Dynamic Country Code Detection automatically maps website TLDs (such as .de, .uk, .fr) to their respective countries and validates numbers using libphonenumber-js. A fallback country can also be configured for generic domains.
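The TLD-to-country step can be sketched in a few lines of Python. The mapping below is a tiny illustrative subset, and `detect_country` is a hypothetical helper rather than the actor's function; the fallback argument plays the role of `defaultCountryCode`. Note that `.uk` maps to the ISO code `GB`.

```python
from urllib.parse import urlparse

# Tiny illustrative subset; a real mapping would cover all country-code TLDs.
TLD_TO_COUNTRY = {"de": "DE", "uk": "GB", "fr": "FR", "it": "IT", "es": "ES"}

def detect_country(url: str, default_country: str = "US") -> str:
    """Guess the ISO country code from the URL's top-level domain."""
    if "://" not in url:
        url = "http://" + url  # allow bare hostnames such as "www.example.com"
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    # Generic TLDs (.com, .org, ...) fall back to the configured default.
    return TLD_TO_COUNTRY.get(tld, default_country)
```

The detected code would then be passed to the validator as the default region, so that a national-format number such as `035874-20425` on a `.de` site is interpreted as German.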
Developer Info
Author: codescraper
Email: codescraper011@gmail.com
Tags
phone-scraper · contact-extractor · lead-generation · b2b-data · phone-number-extractor · data-enrichment · sales-intelligence · web-scraping