Yellow pages Kenya Scraper

Pricing

Pay per event

Developed by

Calvin Kimathi

Maintained by Community

This Apify actor scrapes business listings from Yellow Pages Kenya. Features:

  • 🔍 Search by category or keyword
  • 📞 Extracts name, phone, email, address, and website
  • 📄 Automatic pagination handling
  • 🔒 Proxy support (Apify Proxy)
  • ⚡ Configurable max items limit

An Apify actor that scrapes business listings from Yellow Pages Kenya. It extracts contact information (names, phone numbers, emails, addresses, and websites) and filters out known placeholder values.

✨ Features

  • 🔍 Smart Category Search - Search by business category (hotels, restaurants, dentists, etc.)
  • 📞 Clean Contact Data - Automatically filters placeholder emails and phone numbers
  • 🌐 Website Extraction - Finds real business websites (filters out social media)
  • 📄 Automatic Pagination - Scrapes multiple pages until maxItems reached
  • 🔒 Proxy Support - Built-in Apify Proxy support with optimized settings
  • ⚡ Optimized Performance - Handles 50-100 businesses efficiently (~10-18 minutes)
  • 🎯 High Data Quality - 85-95% success rate with accurate information

📊 What Gets Scraped

Each business listing includes:

| Field | Description | Success Rate |
| --- | --- | --- |
| name | Business name (cleaned) | 100% |
| phone | Kenyan phone numbers in +254 format | 85-95% |
| email | Business email addresses | 70-85% |
| address | Physical address in Kenya | 60-75% |
| url | Business website (excludes social media) | 40-60% |

Sample Output

[
  {
    "name": "Hilton Nairobi",
    "phone": "+254719026000, +254732120000",
    "email": "hilton_nairobi@hilton.com",
    "address": "Mama Ngina Street, Nairobi",
    "url": "https://www.hilton.com/nairobi"
  },
  {
    "name": "Sarova Stanley Hotel",
    "phone": "+254202228830",
    "email": "stanley@sarovahotels.com",
    "address": "Corner Kenyatta Avenue & Kimathi Street, Nairobi",
    "url": "https://www.sarovahotels.com"
  }
]
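The phone field can hold several numbers in one comma-separated string, as in the first record above. A minimal Python sketch for splitting it before loading the data elsewhere; the helper name split_phones is illustrative and not part of the actor:

```python
import json


def split_phones(record: dict) -> list[str]:
    """Return the record's phone numbers as a list of trimmed strings."""
    raw = record.get("phone") or ""
    return [part.strip() for part in raw.split(",") if part.strip()]


# Two records shaped like the sample output (other fields omitted).
sample = json.loads("""
[
  {"name": "Hilton Nairobi", "phone": "+254719026000, +254732120000"},
  {"name": "Sarova Stanley Hotel", "phone": "+254202228830"}
]
""")

for record in sample:
    print(record["name"], split_phones(record))
```

Records without a phone value yield an empty list, so downstream code does not need a separate None check.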

🚀 Quick Start

Basic Usage

{
  "searchTerm": "hotels",
  "maxItems": 50,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
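The same input can be sent from a script. The sketch below assumes the apify-client Python package, whose ApifyClient exposes actor(...).call(run_input=...) and dataset(...).iterate_items(); YOUR_ACTOR_ID and YOUR_APIFY_TOKEN are placeholders, and the helper names are illustrative:

```python
def build_input(search_term: str, max_items: int = 50) -> dict:
    """Build the actor input shown in Basic Usage."""
    return {
        "searchTerm": search_term,
        "maxItems": max_items,
        "proxyConfiguration": {"useApifyProxy": True},
    }


def scrape(client, actor_id: str, run_input: dict) -> list[dict]:
    """Start a run, wait for it to finish, and return the dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


print(build_input("hotels", 20))

# Usage with the real client (needs network access and a valid token):
#   from apify_client import ApifyClient   # pip install apify-client
#   client = ApifyClient("YOUR_APIFY_TOKEN")
#   items = scrape(client, "YOUR_ACTOR_ID", build_input("hotels", 20))
```

Passing the client in as an argument keeps the helper easy to test with a stub before spending credits on a real run.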

Common Search Categories

  • Hospitality: hotels, restaurants, bars, cafeterias
  • Healthcare: dental-clinic, doctors, hospitals, pharmacies
  • Professional: lawyers-advocates, accountants, real-estate-agents
  • Construction: building-contractors, plumbers, electricians
  • Beauty: beauty-salons, beauty-salons-spas, fitness-centres
  • Automotive: motorvehicle-dealers-new, garages, car-wash
  • Technology: website-designers, software-developers, computers-hardware-maintenance

⚙️ Input Configuration

Required Fields

searchTerm (string)

The business category to search for. Use lowercase with hyphens.

Examples:

  • "hotels" - Hotels and lodging
  • "dental-clinic" - Dental clinics and dentists
  • "real-estate-agents" - Real estate agencies
  • "restaurants" - Restaurants and eateries
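If category names come from a spreadsheet or user input, they can be converted to the lowercase, hyphenated form first. A small sketch (to_search_term is a hypothetical helper); it only fixes the formatting and cannot tell whether the resulting category actually exists on Yellow Pages Kenya:

```python
import re


def to_search_term(label: str) -> str:
    """Lowercase a label and join its words with hyphens."""
    words = re.findall(r"[a-z0-9]+", label.lower())
    return "-".join(words)


print(to_search_term("Dental Clinic"))       # dental-clinic
print(to_search_term("Real Estate Agents"))  # real-estate-agents
```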

maxItems (integer)

Maximum number of businesses to scrape.

Recommendations:

  • Testing: 10-20 items (~2-4 minutes)
  • Production: 50 items (~10 minutes)
  • Large scrapes: 100 items (~18 minutes)
  • Unlimited: Set to 0 (not recommended, since run time grows with the number of listings)

Optional Fields

startUrls (array)

Direct URLs to scrape instead of using search. Useful for specific business pages.

{
  "startUrls": [
    {"url": "https://www.yellowpageskenya.com/business-category/hotels"}
  ]
}
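A startUrls value can also be generated from category names. This sketch assumes that other categories follow the same /business-category/<name> URL pattern as the hotels example, which should be confirmed in a browser first; start_urls is an illustrative helper:

```python
# Assumption: every category page lives under this prefix, as in the example above.
BASE = "https://www.yellowpageskenya.com/business-category/"


def start_urls(categories: list[str]) -> dict:
    """Build a startUrls input fragment from category names."""
    return {"startUrls": [{"url": BASE + name} for name in categories]}


print(start_urls(["hotels"]))
```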

proxyConfiguration (object)

Proxy settings for the scraper.

Recommended (Datacenter):

{
  "useApifyProxy": true,
  "apifyProxyGroups": ["SHADER"]
}

For Difficult Sites (Residential):

{
  "useApifyProxy": true,
  "apifyProxyGroups": ["RESIDENTIAL"]
}

No Proxy (Testing Only):

{
  "useApifyProxy": false
}

📈 Performance & Costs

Scraping Time

| Items | Time | Memory Used |
| --- | --- | --- |
| 10 | ~2 min | 1.5 GB |
| 20 | ~4 min | 1.5 GB |
| 50 | ~10 min | 2 GB |
| 100 | ~18 min | 2 GB |
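For item counts between the rows above, a rough run time can be interpolated from the table. This is only a planning aid: it assumes time grows linearly between the listed points (and beyond the last one), which real runs may not follow.

```python
# (items, minutes) pairs taken from the Scraping Time table.
TIMINGS = [(10, 2.0), (20, 4.0), (50, 10.0), (100, 18.0)]


def estimate_minutes(items: int) -> float:
    """Linearly interpolate the expected run time in minutes."""
    if items <= 0:
        raise ValueError("items must be positive")
    first_n, first_t = TIMINGS[0]
    if items <= first_n:
        return first_t * items / first_n
    for (n0, t0), (n1, t1) in zip(TIMINGS, TIMINGS[1:]):
        if items <= n1:
            return t0 + (t1 - t0) * (items - n0) / (n1 - n0)
    # Beyond the table: keep the slope of the last segment.
    (n0, t0), (n1, t1) = TIMINGS[-2], TIMINGS[-1]
    return t1 + (t1 - t0) * (items - n1) / (n1 - n0)


print(estimate_minutes(75))  # 14.0
```

The estimate is useful for choosing a run timeout with some headroom.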

Cost Estimates

Compute Units:

  • 50 businesses: 0.02 CU ($0.005)
  • 100 businesses: 0.04 CU ($0.01)

Proxy Costs:

  • Datacenter (SHADER): ~$0.10-0.20 per 100 businesses
  • Residential: ~$2-3 per 100 businesses

Total per 100 businesses (compute plus proxy):

  • Datacenter: ~$0.11-0.21
  • Residential: ~$2.01-3.01
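The totals are the compute cost plus the proxy cost. A short sketch of that arithmetic, assuming both costs scale linearly with the number of businesses; the constants are the estimates listed above, not billing data:

```python
COMPUTE_USD_PER_100 = 0.01  # 0.04 CU per 100 businesses, from the list above
PROXY_USD_PER_100 = {
    "datacenter": (0.10, 0.20),
    "residential": (2.00, 3.00),
}


def estimate_cost(items: int, proxy: str = "datacenter") -> tuple[float, float]:
    """Return a (low, high) cost estimate in USD for a run."""
    low, high = PROXY_USD_PER_100[proxy]
    scale = items / 100
    return (
        round((COMPUTE_USD_PER_100 + low) * scale, 2),
        round((COMPUTE_USD_PER_100 + high) * scale, 2),
    )


print(estimate_cost(100, "datacenter"))   # (0.11, 0.21)
print(estimate_cost(100, "residential"))  # (2.01, 3.01)
```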

🎯 Best Practices

1. Start Small

Always test with 10-20 items first to verify the category works.

2. Use Correct Category Names

Yellow Pages Kenya uses specific category URLs:

  • ✅ "dental-clinic" (correct)
  • ❌ "dentists" (will work but might get different results)
  • ❌ "Dental Clinic" (case matters)

3. Monitor Your First Run

Check the logs for:

  • Success rate (should be >85%)
  • Proxy errors (minimal)
  • Data quality (no placeholders)

4. Batch Large Scrapes

Instead of scraping 200 items at once:

  • Run 4 times with 50 items each
  • More reliable and easier to debug

5. Use Datacenter Proxies

Unless you're getting blocked, use SHADER (datacenter) proxies:

  • At least 10x cheaper than residential (see Cost Estimates)
  • Faster performance
  • Sufficient for most cases

⚠️ Important Notes

What Gets Filtered Out

The actor automatically removes:

  • ❌ Placeholder phone numbers: +254 700 000 000, 0551037607
  • ❌ Template emails: info@yellowpageskenya.com, contact@company.com
  • ❌ Placeholder URLs: paginasamarelas.co.ao, leafletjs.com
  • ❌ Social media links: Facebook, Twitter, Instagram, LinkedIn
  • ❌ Invalid Kenyan phone numbers
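To apply similar rules to data from other sources, the filters above can be approximated as follows. This is an illustrative sketch, not the actor's actual code: the function names are hypothetical, the phone check assumes nine digits after the +254 country code, and the website check expects URLs that include a scheme such as https://.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Values taken from the list above, compared after stripping non-digits.
PLACEHOLDER_PHONE_DIGITS = {"254700000000", "0551037607"}
PLACEHOLDER_EMAILS = {"info@yellowpageskenya.com", "contact@company.com"}
BLOCKED_DOMAINS = {
    "paginasamarelas.co.ao", "leafletjs.com",
    "facebook.com", "twitter.com", "instagram.com", "linkedin.com",
}


def normalize_phone(raw: str) -> Optional[str]:
    """Return the number in +254 format, or None if it should be dropped."""
    digits = re.sub(r"\D", "", raw)
    if digits in PLACEHOLDER_PHONE_DIGITS:
        return None
    if digits.startswith("254"):
        digits = digits[3:]
    elif digits.startswith("0"):
        digits = digits[1:]
    # Assumption: a valid number has nine digits after the country code.
    return "+254" + digits if len(digits) == 9 else None


def is_real_email(email: str) -> bool:
    """Reject template addresses and strings without an @ sign."""
    cleaned = email.strip().lower()
    return "@" in cleaned and cleaned not in PLACEHOLDER_EMAILS


def is_real_website(url: str) -> bool:
    """Reject placeholder and social media hosts, including subdomains."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    blocked = any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)
    return bool(host) and not blocked


print(normalize_phone("0719 026 000"))
print(is_real_website("https://www.facebook.com/some-page"))
```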

Data Quality

Expected Results:

  • 85-95% of businesses will have phone numbers
  • 70-85% will have email addresses
  • 40-60% will have websites
  • 60-75% will have addresses

Some businesses legitimately don't have:

  • Email addresses (phone-only businesses)
  • Websites (local shops, street vendors)
  • Complete addresses (mobile services)

This is normal and expected!
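To compare a finished run against these ranges, count how many records have a non-empty value for each field. A minimal sketch, assuming missing values appear as absent keys, null, or empty strings in the dataset (fill_rates is an illustrative helper):

```python
FIELDS = ("name", "phone", "email", "address", "url")


def fill_rates(items: list[dict]) -> dict[str, float]:
    """Percentage of records with a non-empty value, per field."""
    if not items:
        return {field: 0.0 for field in FIELDS}
    return {
        field: 100.0 * sum(1 for item in items if item.get(field)) / len(items)
        for field in FIELDS
    }


records = [
    {"name": "A", "phone": "+254719026000", "email": ""},
    {"name": "B", "phone": None, "url": "https://example.com"},
]
for field, rate in fill_rates(records).items():
    print(f"{field}: {rate:.0f}%")
```

A phone rate well below 85% on a run of 20 or more items is a signal to check the logs for proxy errors.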

🐛 Troubleshooting

Issue: Actor Times Out

Solution:

  1. Reduce maxItems to 50 or less
  2. Increase timeout in actor settings (25+ minutes recommended)
  3. Use datacenter proxies instead of residential

Issue: Many Proxy Errors

Symptoms: Logs show ERR_TUNNEL_CONNECTION_FAILED

Solution:

  1. Reduce maxItems to 20
  2. Switch to residential proxies:
    "apifyProxyGroups": ["RESIDENTIAL"]
  3. Wait 5-10 minutes and try again
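The steps above can be automated as a simple fallback: try datacenter proxies with a small batch, then wait and retry with residential proxies. In this sketch run_actor is a hypothetical caller-supplied function that starts a run with the given input and raises an exception when the run fails; a run that finishes but logs proxy errors would need its own check.

```python
import time


def proxy_config(group: str) -> dict:
    """Build a proxyConfiguration object for one Apify Proxy group."""
    return {"useApifyProxy": True, "apifyProxyGroups": [group]}


def run_with_fallback(run_actor, search_term, max_items=20,
                      wait_seconds=300, sleep=time.sleep):
    """Try SHADER first, then wait and retry once with RESIDENTIAL."""
    last_error = None
    for attempt, group in enumerate(("SHADER", "RESIDENTIAL")):
        if attempt:
            sleep(wait_seconds)  # step 3: give the proxies time to recover
        run_input = {
            "searchTerm": search_term,
            "maxItems": max_items,
            "proxyConfiguration": proxy_config(group),
        }
        try:
            return run_actor(run_input)
        except Exception as error:  # sketch: narrow this to your client's error type
            last_error = error
    raise last_error
```

The sleep argument is injectable so the logic can be tested without waiting five minutes.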

Issue: No Results Found

Possible Causes:

  • Category name is incorrect
  • Category exists but has no listings
  • Website structure changed

Solution:

  1. Check Yellow Pages Kenya website manually
  2. Try a common category like "hotels" to verify actor works
  3. Use startUrls with direct category URL

Issue: Too Many Placeholder Emails/Phones

This should not happen with the current version, but if it does:

  • The actor is designed to filter these automatically, so any that slip through indicate a bug
  • Report the issue with examples of the affected records

🔄 Integration with n8n

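This actor works with n8n workflows through the HTTP Request node. The run-sync-get-dataset-items endpoint waits for the run to finish and returns the dataset items in the response, but the request fails with a timeout if the run takes longer than 300 seconds. Keep maxItems small for synchronous calls (about 20 items, roughly 4 minutes) and start larger scrapes asynchronously.

n8n HTTP Request node settings:

{
  "method": "POST",
  "url": "https://api.apify.com/v2/acts/YOUR_ACTOR_ID/run-sync-get-dataset-items",
  "qs": {
    "token": "YOUR_APIFY_TOKEN",
    "timeout": 300
  },
  "body": {
    "searchTerm": "hotels",
    "maxItems": 20,
    "proxyConfiguration": {
      "useApifyProxy": true
    }
  }
}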
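Outside n8n, the same request can be built with the Python standard library. The sketch below only constructs the request so it can be inspected; YOUR_ACTOR_ID and YOUR_APIFY_TOKEN are placeholders and build_request is an illustrative helper:

```python
import json
import urllib.parse
import urllib.request

API = "https://api.apify.com/v2/acts/{actor_id}/run-sync-get-dataset-items"


def build_request(actor_id: str, token: str, run_input: dict,
                  timeout: int) -> urllib.request.Request:
    """Build the POST request; `timeout` is the run timeout in seconds."""
    query = urllib.parse.urlencode({"token": token, "timeout": timeout})
    return urllib.request.Request(
        url=API.format(actor_id=actor_id) + "?" + query,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "YOUR_ACTOR_ID",
    "YOUR_APIFY_TOKEN",
    {"searchTerm": "hotels", "maxItems": 20,
     "proxyConfiguration": {"useApifyProxy": True}},
    timeout=300,
)
print(request.get_method(), request.full_url)

# To send it (needs network access and a valid token):
#   with urllib.request.urlopen(request) as response:
#       items = json.load(response)
```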

📞 Use Cases

1. Cold Email Outreach

  • Scrape businesses in your target industry
  • Verify emails with services like Reoon
  • Create personalized email campaigns

2. Lead Generation

  • Build prospect lists for B2B sales
  • Export to CRM systems
  • Enrich with additional data sources

3. Market Research

  • Analyze business density by category
  • Identify competitors in specific regions
  • Track industry trends

4. Data Enrichment

  • Complete existing business databases
  • Verify contact information
  • Update outdated records

🛠️ Advanced Configuration

Custom Category URL

If you know the exact Yellow Pages category URL:

{
  "startUrls": [
    {
      "url": "https://www.yellowpageskenya.com/business-category/hotels"
    }
  ],
  "maxItems": 50,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}

Multiple Categories

Run the actor multiple times with different categories, or create a workflow that loops through categories.
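A loop over categories might look like the sketch below. run_actor is a hypothetical caller-supplied function that runs the actor with one input and returns its dataset items; merging on the name and phone pair is an assumption about what identifies a duplicate listing.

```python
def inputs_for_categories(categories: list[str], max_items_each: int = 50) -> list[dict]:
    """Build one actor input per category."""
    return [
        {
            "searchTerm": category,
            "maxItems": max_items_each,
            "proxyConfiguration": {"useApifyProxy": True},
        }
        for category in categories
    ]


def run_all(run_actor, categories: list[str], max_items_each: int = 50) -> list[dict]:
    """Run categories one at a time and merge the results without duplicates."""
    seen, merged = set(), []
    for run_input in inputs_for_categories(categories, max_items_each):
        for item in run_actor(run_input):
            key = (item.get("name"), item.get("phone"))
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged
```

Running categories sequentially keeps each run small, in line with the batching advice in Best Practices.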

📜 Changelog

Version 1.0.1 (Current)

  • ✅ Fixed proxy timeout issues
  • ✅ Optimized concurrency (5→2) for reliability
  • ✅ Added rate limiting (30 requests/min)
  • ✅ Improved session management
  • ✅ Reduced wait times for faster scraping
  • ✅ Better error handling and logging
  • ✅ Default memory: 2048 MB
  • ✅ Default timeout: 25 minutes

Version 1.0.0

  • Initial release
  • Basic scraping functionality
  • Proxy support

📝 Notes

  • Data Freshness: Data is scraped in real-time from Yellow Pages Kenya
  • Legal: The actor collects publicly available business information; use it for legitimate purposes only
  • Rate Limits: Actor respects website resources with built-in rate limiting
  • Support: For issues, contact via Apify Console

🎓 Tips for Success

  1. Test First: Always run with 10-20 items before large scrapes
  2. Check Logs: Monitor for errors and adjust settings accordingly
  3. Batch Processing: Split large scrapes into smaller runs
  4. Data Validation: Use email verification services for best results
  5. Stay Updated: Yellow Pages Kenya may update their website structure

🌟 Happy Scraping!

This actor is optimized for reliability and data quality. For best results:

  • Start with small batches
  • Use datacenter proxies
  • Monitor your first run
  • Adjust based on results

Need help? Check the troubleshooting section or contact support via Apify Console.


License: Apache-2.0
Author: Apify Community
Maintained: Yes ✅
Last Updated: October 2025