Changelog
All notable changes to the Email Verifier actor will be documented in this file.
The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.
[1.1.0] - 2025-10-07
Added - Configuration & Optimization
Memory Configuration
- minMemoryMbytes: 256 MB (minimum for testing/small batches)
- maxMemoryMbytes: 2048 MB (maximum for safety/future features)
- Memory-optimized design: ~150 MB peak usage for 10,000 emails
- Platform costs: ~$0.001 per 10,000 emails (98% profit margin)
- Comprehensive MEMORY-REQUIREMENTS.md guide added
Added - Phase 2: Analytics & Filtering Features
Output Filtering & Segmentation
- outputFilter option to filter results by recommendation level ('all', 'accept', 'review', 'reject')
- separateDatasets option to automatically split results into 3 datasets by recommendation
- Creates valid-emails dataset for ACCEPT results (low-risk)
- Creates review-emails dataset for REVIEW results (medium-risk)
- Creates reject-emails dataset for REJECT results (high-risk)
- Filtering statistics in OUTPUT summary when outputFilter is applied
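As a rough illustration of how outputFilter and separateDatasets relate to the recommendation field, here is a minimal TypeScript sketch. It is not the actor's actual code: the ValidationResult shape and the function names applyOutputFilter and splitByDataset are hypothetical, and only the filter values and dataset names come from the entries above.

```typescript
// Hypothetical, simplified result shape; the actor's real output has more fields.
type Recommendation = "ACCEPT" | "REVIEW" | "REJECT";
type OutputFilter = "all" | "accept" | "review" | "reject";

interface ValidationResult {
  email: string;
  recommendation: Recommendation;
}

// Keep only the results that match the filter; "all" keeps everything.
function applyOutputFilter(results: ValidationResult[], filter: OutputFilter): ValidationResult[] {
  if (filter === "all") return results;
  return results.filter((r) => r.recommendation.toLowerCase() === filter);
}

// Dataset names as listed in this changelog entry.
const DATASET_NAMES: Record<Recommendation, string> = {
  ACCEPT: "valid-emails",
  REVIEW: "review-emails",
  REJECT: "reject-emails",
};

// Group results by the dataset they would be written to.
function splitByDataset(results: ValidationResult[]): Record<string, ValidationResult[]> {
  const buckets: Record<Recommendation, ValidationResult[]> = { ACCEPT: [], REVIEW: [], REJECT: [] };
  for (const r of results) {
    buckets[r.recommendation].push(r);
  }
  return {
    [DATASET_NAMES.ACCEPT]: buckets.ACCEPT,
    [DATASET_NAMES.REVIEW]: buckets.REVIEW,
    [DATASET_NAMES.REJECT]: buckets.REJECT,
  };
}
```

In this sketch the two options are independent: filtering narrows a single result list, while splitting routes every result to exactly one of the three named datasets.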
Email Processing
- deduplicate option for case-insensitive duplicate removal before validation
- Deduplication statistics in OUTPUT summary (originalCount, duplicatesRemoved, uniqueCount)
- Saves 5-30% processing time and costs for lists with duplicates
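The deduplication step can be pictured with the following TypeScript sketch. It assumes the first occurrence of each address is kept and that surrounding whitespace is trimmed before comparison; neither detail is confirmed by this changelog, and deduplicateEmails is a hypothetical name. The statistic field names match those listed above.

```typescript
interface DedupeStats {
  originalCount: number;
  duplicatesRemoved: number;
  uniqueCount: number;
}

// Case-insensitive de-duplication that keeps the first spelling seen.
// Assumption: whitespace is trimmed before comparing addresses.
function deduplicateEmails(emails: string[]): { unique: string[]; stats: DedupeStats } {
  const seen = new Set<string>();
  const unique: string[] = [];
  for (const raw of emails) {
    const email = raw.trim();
    const key = email.toLowerCase();
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(email);
  }
  return {
    unique,
    stats: {
      originalCount: emails.length,
      duplicatesRemoved: emails.length - unique.length,
      uniqueCount: unique.length,
    },
  };
}
```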
Domain Analysis
- includeDomainAnalysis option for comprehensive domain statistics
- Top 10 domains with counts and percentages
- Provider breakdown: freeEmail, business, disposable, roleBased (counts and percentages)
- Domain diversity score using Shannon entropy (0.0-1.0 scale)
- Unique domain count tracking
- Use case: List quality assessment, fraud detection, campaign targeting
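One plausible way to obtain a 0.0-1.0 diversity score from Shannon entropy is to divide the entropy of the domain distribution by its maximum possible value, log2 of the number of unique domains. The TypeScript sketch below uses that normalization as an assumption; the changelog does not state the exact formula, and analyzeDomains is a hypothetical name.

```typescript
interface DomainStats {
  uniqueDomains: number;
  diversityScore: number; // 0.0 (one domain) to 1.0 (all domains equally common)
  topDomains: { domain: string; count: number; percentage: number }[];
}

function analyzeDomains(emails: string[]): DomainStats {
  const counts = new Map<string, number>();
  let total = 0;
  for (const email of emails) {
    const at = email.lastIndexOf("@");
    if (at < 0) continue; // skip strings without a domain part
    const domain = email.slice(at + 1).toLowerCase();
    counts.set(domain, (counts.get(domain) ?? 0) + 1);
    total += 1;
  }

  // Shannon entropy in bits: H = -sum(p * log2(p)).
  let entropy = 0;
  counts.forEach((count) => {
    const p = count / total;
    entropy -= p * Math.log2(p);
  });

  // Assumed normalization: divide by the maximum entropy, log2(uniqueDomains).
  const diversityScore = counts.size > 1 ? entropy / Math.log2(counts.size) : 0;

  const topDomains = Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, 10)
    .map(([domain, count]) => ({ domain, count, percentage: (count / total) * 100 }));

  return { uniqueDomains: counts.size, diversityScore, topDomains };
}
```

Under this normalization, a list drawn evenly from many domains scores close to 1.0, while a list dominated by a single domain scores close to 0.0.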
Enhanced Logging
- Detailed logging for all Phase 2 features
- Logs deduplication results (duplicates removed, savings percentage)
- Logs domain analysis summary (unique domains, diversity score, top domain)
- Logs filtering mode when outputFilter is applied
- Logs separate datasets creation with counts
Added - Phase 1: Performance & Validation Features
- failFast option for early exit on critical validation failures
- Provides 30-50% faster validation for invalid emails
- Exits immediately on syntax, reserved domain, MX, or disposable failures
- Role-based failures do NOT trigger early exit
- DNS caching with 5-minute TTL (automatic, always enabled)
- 30-40% faster when domains repeat within same batch
- 70%+ faster for bulk validation with many duplicate domains
- 95%+ faster for second batch using same domains (within 5 minutes)
- DNS cache statistics in OUTPUT (cachedDomains, cacheHitRate)
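The DNS caching behaviour described above can be approximated by a small time-to-live cache keyed by domain. The TypeScript sketch below is illustrative only: the class name DnsCache, its methods, and the injectable clock are hypothetical, and expressing cacheHitRate as a percentage of lookups is an assumption. The 5-minute TTL and the statistic names come from the entries above.

```typescript
interface CacheStats {
  cachedDomains: number;
  cacheHitRate: number; // assumed to be a percentage of all lookups
}

class DnsCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;
  private hits = 0;
  private lookups = 0;

  // Default TTL is 5 minutes; the clock is injectable so expiry can be tested.
  constructor(ttlMs: number = 5 * 60 * 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined on a miss or an expired entry.
  get(domain: string): V | undefined {
    this.lookups += 1;
    const key = domain.toLowerCase();
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    this.hits += 1;
    return entry.value;
  }

  set(domain: string, value: V): void {
    this.entries.set(domain.toLowerCase(), { value, expiresAt: this.now() + this.ttlMs });
  }

  stats(): CacheStats {
    const rate = this.lookups === 0 ? 0 : (this.hits / this.lookups) * 100;
    return { cachedDomains: this.entries.size, cacheHitRate: rate };
  }
}
```

A cache like this explains why the gains grow with domain repetition: each repeated domain within the TTL window skips a network round trip.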
Validation Enhancements
- Reserved domain detection (always enabled, cannot be disabled)
- Detects RFC 2606 domains (example.com, example.org, example.net)
- Detects test domains (test, localhost, *.local)
- Prevents test data pollution in production databases
- RFC 5321 email length validation (automatic)
- 320 character maximum for total email address
- 255 character maximum for domain part
- Proper validation errors for length violations
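The two checks above can be sketched in TypeScript as follows. This is not the underlying library's code: isReservedDomain and lengthError are hypothetical names, and matching "test" and "localhost" as exact domain names is an assumption about how the library interprets them. The domain list and the 320/255 limits are taken from the entries above.

```typescript
// Domains named in this changelog entry; "*.local" is handled as a suffix match.
const RESERVED_DOMAINS: string[] = ["example.com", "example.org", "example.net", "test", "localhost"];
const MAX_EMAIL_LENGTH = 320;
const MAX_DOMAIN_LENGTH = 255;

function isReservedDomain(domain: string): boolean {
  const d = domain.toLowerCase();
  return RESERVED_DOMAINS.indexOf(d) !== -1 || d.endsWith(".local");
}

// Returns an error message for a length violation, or null if both limits are met.
function lengthError(email: string): string | null {
  if (email.length > MAX_EMAIL_LENGTH) {
    return `Email exceeds ${MAX_EMAIL_LENGTH} characters`;
  }
  const at = email.lastIndexOf("@");
  const domain = at >= 0 ? email.slice(at + 1) : "";
  if (domain.length > MAX_DOMAIN_LENGTH) {
    return `Domain exceeds ${MAX_DOMAIN_LENGTH} characters`;
  }
  return null;
}
```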
Event System & Monitoring
- Real-time event tracking system with 5 event types:
- syntax - Email format validation
- reserved - Reserved domain check
- mx - MX records DNS lookup
- disposable - Disposable email detection
- role - Role-based email check
- Debug-level logging for each validation step
- Event counts tracked in OUTPUT summary
- Use case: Real-time monitoring, debugging, progress tracking
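To show how the five event types could interact with fail-fast mode, here is a simplified TypeScript sketch. It is an assumption-laden model, not the library's implementation: the Step type, runSteps, and demoSteps are hypothetical, the checks are synchronous stubs (a real MX lookup would be asynchronous), and the step order simply follows the list above. It does reflect the documented rule that role-based failures do not trigger early exit.

```typescript
type ValidationEvent = "syntax" | "reserved" | "mx" | "disposable" | "role";

interface Step {
  event: ValidationEvent;
  check: (email: string) => boolean; // true means the check passed
  critical: boolean; // critical failures stop validation in fail-fast mode
}

// Runs the steps in order, reports each one via onEvent, and returns the failed events.
function runSteps(
  email: string,
  steps: Step[],
  failFast: boolean,
  onEvent: (event: ValidationEvent, passed: boolean) => void,
): ValidationEvent[] {
  const failures: ValidationEvent[] = [];
  for (const step of steps) {
    const passed = step.check(email);
    onEvent(step.event, passed);
    if (!passed) {
      failures.push(step.event);
      if (failFast && step.critical) break; // skip the remaining checks
    }
  }
  return failures;
}

// Stub checks for illustration only; mx and disposable always pass here.
const demoSteps: Step[] = [
  { event: "syntax", check: (e) => /^[^@\s]+@[^@\s]+$/.test(e), critical: true },
  { event: "reserved", check: (e) => !e.endsWith("@example.com"), critical: true },
  { event: "mx", check: () => true, critical: true },
  { event: "disposable", check: () => true, critical: true },
  { event: "role", check: (e) => !e.startsWith("admin@"), critical: false },
];
```

With these stubs, validating a malformed address in fail-fast mode emits a single syntax event and stops, while the same address without fail-fast emits all five events.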
Resource Management
- cleanup() method for proper event listener management
- Prevents memory leaks in long-running validations
- Automatic cleanup in finally block
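The cleanup pattern described above amounts to releasing listeners in a finally block so they are removed even when validation throws. A minimal TypeScript sketch, assuming a hypothetical ListenerRegistry class rather than the library's actual validator:

```typescript
// Hypothetical stand-in for an object that holds event listeners.
class ListenerRegistry {
  private listeners: Array<(event: string) => void> = [];

  on(listener: (event: string) => void): void {
    this.listeners.push(listener);
  }

  emit(event: string): void {
    this.listeners.forEach((listener) => listener(event));
  }

  listenerCount(): number {
    return this.listeners.length;
  }

  // Drop all listeners so they cannot accumulate across runs.
  cleanup(): void {
    this.listeners = [];
  }
}

// Runs the work and always cleans up, whether it succeeds or throws.
function runWithCleanup(registry: ListenerRegistry, work: () => void): void {
  try {
    work();
  } finally {
    registry.cleanup();
  }
}
```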
Enhanced Output
- Performance metrics in OUTPUT summary:
- avgValidationTimeMs per email
- dnsCachedDomains count
- dnsCacheHitRate percentage
- Event counts for all validation steps
- Processing time per email
- Comprehensive logging of validation progress
Changed
- Updated INPUT_SCHEMA.json with 8 new options (4 from Phase 1, 4 from Phase 2)
- Enhanced OUTPUT format with new statistics sections
- Improved error messages and logging throughout
- Updated actor.json dataset views to include reservedDomain field
Fixed
- Removed checkReservedDomain option (library doesn't support it as optional)
- Fixed TypeScript type errors in actor input validation
- Clarified in documentation that reserved domain checking is always enabled
- Updated all examples to remove checkReservedDomain references
Documentation
- Comprehensive README updates with:
- All new options documented with descriptions and defaults
- Phase 1 features explained (Fail-Fast Mode, Reserved Domain Detection, DNS Caching, Event Tracking)
- Phase 2 features explained (Output Filtering, Separate Datasets, Deduplication, Domain Analysis)
- Input examples for basic usage, fail-fast mode, and Phase 2 features
- Enhanced output format examples with all new fields
- Use cases and best practices for each feature
- Performance metrics and benefits quantified
Testing
- 67 comprehensive tests covering all features (up from original count)
- Phase 1 tests: failFast, reserved domain detection, event tracking, cleanup
- Phase 2 tests: outputFilter, separateDatasets, deduplicate, includeDomainAnalysis
- Integration tests with EmailValidator library
- Validation tests for all input options
- Combined options scenario tests
[1.0.0] - Initial Release
Added
- Basic email validation with syntax checking
- DNS/MX record verification
- Disposable email detection (500+ services)
- Role-based email identification
- Bulk processing with configurable concurrency (up to 10,000 emails)
- Confidence scoring with risk levels (LOW/MEDIUM/HIGH)
- Recommendations (ACCEPT/REVIEW/REJECT)
- Progress tracking with percentage callbacks
- Comprehensive error handling
- Dataset output for individual results
- Key-value store OUTPUT for summary statistics
Features
- Configurable validation options (checkMX, checkDisposable, checkRoleBased)
- Adjustable DNS timeout (1000-30000ms)
- Concurrent validation (1-50 concurrent emails)
- Weighted scoring algorithm
- Detailed validation results per email
- Summary statistics (total, valid, invalid, risky, processing time)
Migration Guides
Upgrading from 1.0.x to 1.1.0
No Breaking Changes
This is a backward-compatible minor release. All existing configurations will continue to work.
Fixed/Removed
Removed: checkReservedDomain option (bug fix)
- Reason: This option never actually worked - reserved domain checking is always performed by the underlying library and cannot be disabled
- Impact: If you were using this option, it had no effect. Simply remove it from your configurations.
- Action Required: Optional - clean up your input configurations if you were using this non-functional option
New Features You Can Use
Output Filtering - Save only the results you need:
{
"emails": [...],
"outputFilter": "accept"
}
Deduplication - Save processing costs:
{
"emails": [...],
"deduplicate": true
}
Domain Analysis - Understand your list composition:
{
"emails": [...],
"includeDomainAnalysis": true
}
Fail-Fast Mode - Speed up validation for invalid emails:
{
"emails": [...],
"failFast": true
}
| Feature | Performance Gain | Use Case |
|---|---|---|
| Fail-Fast Mode | 30-50% faster | Invalid email validation |
| DNS Caching (same batch) | 30-40% faster | Duplicate domains |
| DNS Caching (bulk) | 70%+ faster | Many duplicate domains |
| DNS Caching (repeated) | 95%+ faster | Re-validating same domains |
| Deduplication | 5-30% saved | Lists with duplicates |
Feature Adoption Recommendations
Email Marketing Campaigns
Use: outputFilter: "accept", deduplicate: true, failFast: true
- Only get safe-to-send emails
- Remove duplicates to save costs
- Faster validation of invalid addresses
Lead Validation
Use: separateDatasets: true, includeDomainAnalysis: true
- Process each risk category separately
- Understand lead source quality
- Detect fraud patterns
List Cleanup
Use: deduplicate: true, failFast: true, outputFilter: "reject"
- Focus on problematic addresses
- Fast processing of invalid emails
- Remove duplicates first
Quality Assurance
Use: includeDomainAnalysis: true
- Assess list diversity and quality
- Identify potential issues
- Compare before/after cleaning
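The recommendations above can be captured as reusable presets. The TypeScript sketch below is one way to do that. The option names and values are taken from this section, while the VerifierOptions type, the UseCase keys, and the buildInput helper are hypothetical and not part of the actor.

```typescript
// Subset of actor input options discussed in this changelog.
interface VerifierOptions {
  outputFilter?: "all" | "accept" | "review" | "reject";
  separateDatasets?: boolean;
  deduplicate?: boolean;
  includeDomainAnalysis?: boolean;
  failFast?: boolean;
}

type UseCase = "emailMarketing" | "leadValidation" | "listCleanup" | "qualityAssurance";

// Option combinations as recommended above, one entry per use case.
const PRESETS: Record<UseCase, VerifierOptions> = {
  emailMarketing: { outputFilter: "accept", deduplicate: true, failFast: true },
  leadValidation: { separateDatasets: true, includeDomainAnalysis: true },
  listCleanup: { deduplicate: true, failFast: true, outputFilter: "reject" },
  qualityAssurance: { includeDomainAnalysis: true },
};

// Combine an email list with the preset for a use case to form an actor input object.
function buildInput(emails: string[], useCase: UseCase): VerifierOptions & { emails: string[] } {
  return { emails, ...PRESETS[useCase] };
}
```

For example, buildInput(list, "listCleanup") yields an input with deduplicate and failFast enabled and outputFilter set to "reject".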