# Yellow Pages UAE Scraper (`khadinakbar/yellow-pages-uae-scraper`) Actor

Scrape UAE business listings from Yello (yellowpages UAE) by category and emirate: name, phone, website, address, geo. MCP-ready. $0.003 per business.

- **URL**: https://apify.com/khadinakbar/yellow-pages-uae-scraper.md
- **Developed by:** [Khadin Akbar](https://apify.com/khadinakbar) (community)
- **Categories:** Lead generation, MCP servers, AI
- **Stats:** 1 total users, 0 monthly users, 0.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.00 / 1,000 businesses scraped

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Yellow Pages UAE Scraper

Scrape UAE business listings from **Yello (yello.ae)** — the United Arab Emirates' verified online business directory — by category, emirate, or direct URL. Built for B2B lead generation, sales prospecting, and market research, and designed to be called directly by AI agents (MCP-ready).

### What you get

One clean record per business, parsed primarily from each listing's structured `schema.org/LocalBusiness` data:

| Field | Description |
|---|---|
| `businessName` | Company name |
| `phone` | Primary phone number (UAE format) |
| `website` | Business website URL (null if none listed) |
| `email` | Best-effort contact email (often null — Yello gates email behind an enquiry form) |
| `streetAddress` | Street / area address |
| `city` / `emirate` | Locality and emirate |
| `country` | Always `AE` |
| `latitude` / `longitude` | Geo-coordinates |
| `primaryCategory` / `categories` | Business category and full breadcrumb categories |
| `workingHours` | Opening hours per day |
| `establishedYear` | Year the business was established (when published) |
| `description` | Business description (when published) |
| `imageUrl` | Logo / listing image |
| `profileUrl` / `companyId` | Yello profile URL and ID |

### When to use it

- Building **UAE B2B lead lists** by industry and emirate (restaurants in Dubai, estate agents in Abu Dhabi, etc.).
- **Sales prospecting** with phone, website, and location in one row.
- **Market / competitor research** across a UAE category.
- Feeding a downstream enrichment or CRM pipeline.

Not for non-UAE directories. For US Yellow Pages or Google Maps leads, use a dedicated Actor.

### Pricing

Pay-per-event:

- **$0.00005** per run (Actor start).
- **$0.003** per business record returned.

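A run that returns 1,000 businesses costs about **$3.00** in event charges (1,000 × $0.003, plus the $0.00005 start fee); Apify platform usage is billed on top. The `maxResults` input is a hard cap on how many businesses are returned and billed.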
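The arithmetic behind that estimate can be sketched as follows. This is a rough sketch covering only the two pay-per-event charges listed above; `estimate_event_cost` is a hypothetical helper name, and Apify platform usage is billed separately and is not modelled here.

```python
from decimal import Decimal

# Event prices copied from the pricing list above (USD).
PRICE_PER_RUN = Decimal("0.00005")
PRICE_PER_BUSINESS = Decimal("0.003")


def estimate_event_cost(businesses: int, runs: int = 1) -> Decimal:
    """Estimate pay-per-event charges only; platform usage is not included."""
    if businesses < 0 or runs < 0:
        raise ValueError("businesses and runs must be non-negative")
    return PRICE_PER_RUN * runs + PRICE_PER_BUSINESS * businesses


if __name__ == "__main__":
    # 1,000 businesses returned by a single run
    print(estimate_event_cost(1000))  # 3.00005
```

`Decimal` is used instead of floats so that very small per-event prices add up without rounding noise.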

### Input

All inputs are optional, but provide at least one of `categories`, `city`, or `startUrls`.

| Input | Type | Description |
|---|---|---|
| `categories` | array | Category slugs/names, e.g. `["restaurants", "estate-agents"]`. |
| `city` | enum | Emirate filter applied to categories: `dubai`, `abu-dhabi`, `sharjah`, `ajman`, `ras-al-khaimah`, `fujairah`, `umm-al-quwain`, `al-ain`, `jebel-ali-free-zone`. Empty = all UAE. |
| `startUrls` | array | Direct yello.ae URLs: `/category/<slug>`, `/location/<emirate>`, or `/company/<id>/<slug>`. |
| `maxResults` | integer | Max businesses to return (default 1000). |
| `enrichDetails` | boolean | Visit each business page for full data (default `true`). Set `false` for fast, listing-only runs. |

#### Example input

```json
{
  "categories": ["restaurants"],
  "city": "dubai",
  "maxResults": 200,
  "enrichDetails": true
}
````
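Before starting a run, it can help to check an input object against the rules in the input table. The sketch below is a client-side check only and not part of the Actor; `validate_input` and `ALLOWED_CITIES` are hypothetical names, and the Actor's own validation may differ.

```python
# Emirate / city values listed in the input table above.
ALLOWED_CITIES = {
    "dubai", "abu-dhabi", "sharjah", "ajman", "ras-al-khaimah",
    "fujairah", "umm-al-quwain", "al-ain", "jebel-ali-free-zone",
}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems found; an empty list means the input looks usable."""
    problems = []
    categories = run_input.get("categories") or []
    city = run_input.get("city") or ""
    start_urls = run_input.get("startUrls") or []

    # At least one of categories, city, or startUrls should be provided.
    if not (categories or city or start_urls):
        problems.append("provide at least one of categories, city, or startUrls")
    # An empty city means "all UAE"; anything else must be a listed value.
    if city and city not in ALLOWED_CITIES:
        problems.append(f"unknown city: {city!r}")
    # maxResults, when given, must be a positive integer (bool is rejected explicitly).
    max_results = run_input.get("maxResults", 1000)
    if isinstance(max_results, bool) or not isinstance(max_results, int) or max_results < 1:
        problems.append("maxResults must be an integer >= 1")
    return problems
```

For the example input above, `validate_input` returns an empty list; an empty object `{}` yields the "provide at least one of" message.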

### Run it via API

JavaScript (Apify client):

```javascript
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'YOUR_TOKEN' });
const run = await client.actor('khadinakbar/yellow-pages-uae-scraper').call({
    categories: ['estate-agents'],
    city: 'abu-dhabi',
    maxResults: 500,
});
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
```

Python (Apify client):

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_TOKEN")
run = client.actor("khadinakbar/yellow-pages-uae-scraper").call(run_input={
    "categories": ["doctors-and-clinics"],
    "city": "sharjah",
    "maxResults": 300,
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["businessName"], item.get("phone"))
```

### Use with AI agents (MCP)

This Actor is MCP-ready and exposed through the Apify MCP server as `khadinakbar/yellow-pages-uae-scraper`. An agent can call it with a category and emirate and receive structured JSON business records, which suits autonomous lead-research workflows.

### How it works

1. Builds start requests from your categories (+ optional city filter), location pages, or direct URLs.
2. Paginates each category listing page (`/category/<slug>/<page>`), collecting business profile links.
3. Visits each business page and extracts data from its `schema.org/LocalBusiness` JSON-LD block plus on-page fields.
4. Returns one flat record per business and stops at `maxResults`.

Yello.ae serves static HTML with structured data and no aggressive anti-bot protection, so runs are fast and reliable. Only directory content paths allowed by `robots.txt` are crawled.
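The URL shapes named in the steps above can be told apart with a few path patterns. The sketch below is an illustration only, not the Actor's actual routing code; `classify_url` and the regular expressions are assumptions based on the `/category/<slug>/<page>`, `/location/<emirate>`, and `/company/<id>/<slug>` patterns this README lists. How location pages paginate is not documented here, so no page segment is modelled for them.

```python
import re
from urllib.parse import urlparse

# Path patterns taken from this README; the exact rules the Actor uses are not documented.
_PATTERNS = [
    ("company", re.compile(r"^/company/(?P<id>[^/]+)/(?P<slug>[^/]+)/?$")),
    ("category", re.compile(r"^/category/(?P<slug>[^/]+)(?:/(?P<page>\d+))?/?$")),
    ("location", re.compile(r"^/location/(?P<emirate>[^/]+)/?$")),
]


def classify_url(url: str):
    """Return (kind, parts) for a yello.ae URL, or None if it matches no known pattern."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host not in ("yello.ae", "www.yello.ae"):
        return None  # startUrls is not meant for non-yello.ae domains
    for kind, pattern in _PATTERNS:
        match = pattern.match(parsed.path)
        if match:
            # Drop optional groups that did not take part in the match (e.g. page).
            return kind, {k: v for k, v in match.groupdict().items() if v is not None}
    return None
```

A check like this could be run over `startUrls` before a run to catch URLs that fall outside the three supported shapes.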

### FAQ

**Why is `email` often null?**
Yello.ae hides most business emails behind a "Send Enquiry" form rather than publishing them. The Actor returns an email only when one is publicly present on the page. Phone and website are available for the vast majority of listings.

**Can I scrape an entire emirate?**
Yes. Provide only a `city` (no categories) and the Actor expands that emirate's category index, bounded by `maxResults`.

**How do I scrape specific businesses?**
Put their `/company/<id>/<slug>` URLs in `startUrls`.

**Does it deduplicate?**
Yes — businesses are deduplicated by company ID within a run.
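That deduplication applies within a single run. If you combine items from several runs (for example one run per category), the same business could appear more than once. A minimal sketch of merging on `companyId` follows; `merge_by_company_id` is a hypothetical helper, and it assumes items carry the `companyId` field from the output table with a hashable value.

```python
def merge_by_company_id(*runs):
    """Merge item lists from several runs, keeping the first record seen per companyId."""
    seen = set()
    merged = []
    for items in runs:
        for item in items:
            key = item.get("companyId")
            if key is not None and key in seen:
                continue  # this business was already collected from an earlier item
            if key is not None:
                seen.add(key)
            # Records without an ID are kept as-is, since there is nothing to compare on.
            merged.append(item)
    return merged
```

Each argument is a list of dataset items, such as the `items` returned by `listItems()` or collected from `iterate_items()` in the client examples.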

### Legal

This Actor collects publicly available business directory information from yello.ae for legitimate business purposes such as lead generation and market research. You are responsible for using the data in compliance with yello.ae's Terms of Service, the UAE PDPL, GDPR, and any other applicable laws and regulations. Do not use scraped personal data for unsolicited communication that violates applicable anti-spam or privacy law. This Actor is not affiliated with, endorsed by, or sponsored by Yello or yellowpages.ae.

# Actor input Schema

## `categories` (type: `array`):

Business categories to scrape from yello.ae, as slugs or plain names (e.g. 'restaurants', 'estate-agents', 'doctors and clinics'). Each is resolved to a yello.ae category listing and paginated. Leave empty if you provide Start URLs instead. NOT a free-text keyword search - use a real directory category.

## `city` (type: `string`):

Optional emirate or city to restrict category results to (applied as a yello.ae city filter). Pick one value; leave empty to scrape all of the UAE. Only applies to the Categories input above, not to direct company Start URLs. Example: 'dubai'.

## `startUrls` (type: `array`):

Direct yello.ae URLs to scrape, one per line: category pages (https://www.yello.ae/category/<slug>), location pages (https://www.yello.ae/location/<emirate>), or company pages (https://www.yello.ae/company/<id>/<slug>). Category and location URLs are paginated; company URLs are scraped directly. Use this for precise control. NOT for non-yello.ae domains.

## `maxResults` (type: `integer`):

Maximum number of business records to return across the whole run (hard cap on billing). The run stops and finishes gracefully once this many businesses are scraped. Defaults to 1000; set lower for a quick test. Counts final business records, not intermediate listing pages.

## `enrichDetails` (type: `boolean`):

When true (default), each business's own page is visited to extract full data (phone, website, address, geo, working hours, established year) from its JSON-LD block. When false, only the lighter listing-page fields (name, profile URL, address) are returned - faster and cheaper but less complete. Keep true for lead-generation use cases.

## `proxyConfiguration` (type: `object`):

Proxy settings for outbound requests. Defaults to Apify datacenter proxy, which is sufficient because yello.ae has no anti-bot protection. Leave as default unless you have a specific network requirement.

## Actor input object example

```json
{
  "categories": [
    "restaurants",
    "estate-agents"
  ],
  "city": "dubai",
  "startUrls": [],
  "maxResults": 50,
  "enrichDetails": true,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# Actor output Schema

## `businesses` (type: `string`):

Dataset of UAE business records.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "categories": [
        "restaurants"
    ],
    "city": "dubai",
    "startUrls": [],
    "maxResults": 50
};

// Run the Actor and wait for it to finish
const run = await client.actor("khadinakbar/yellow-pages-uae-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "categories": ["restaurants"],
    "city": "dubai",
    "startUrls": [],
    "maxResults": 50,
}

# Run the Actor and wait for it to finish
run = client.actor("khadinakbar/yellow-pages-uae-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "categories": [
    "restaurants"
  ],
  "city": "dubai",
  "startUrls": [],
  "maxResults": 50
}' |
apify call khadinakbar/yellow-pages-uae-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=khadinakbar/yellow-pages-uae-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Yellow Pages UAE Scraper",
        "description": "Scrape UAE business listings from Yello (yellowpages UAE) by category and emirate: name, phone, website, address, geo. MCP-ready. $0.003 per business.",
        "version": "0.1",
        "x-build-id": "5PmitzWnZge6noBvh"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/khadinakbar~yellow-pages-uae-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-khadinakbar-yellow-pages-uae-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/khadinakbar~yellow-pages-uae-scraper/runs": {
            "post": {
                "operationId": "runs-sync-khadinakbar-yellow-pages-uae-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/khadinakbar~yellow-pages-uae-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-khadinakbar-yellow-pages-uae-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "categories": {
                        "title": "Categories",
                        "type": "array",
                        "description": "Business categories to scrape from yello.ae, as slugs or plain names (e.g. 'restaurants', 'estate-agents', 'doctors and clinics'). Each is resolved to a yello.ae category listing and paginated. Leave empty if you provide Start URLs instead. NOT a free-text keyword search - use a real directory category.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "city": {
                        "title": "Emirate / City filter",
                        "enum": [
                            "",
                            "dubai",
                            "abu-dhabi",
                            "sharjah",
                            "ajman",
                            "ras-al-khaimah",
                            "fujairah",
                            "umm-al-quwain",
                            "al-ain",
                            "jebel-ali-free-zone"
                        ],
                        "type": "string",
                        "description": "Optional emirate or city to restrict category results to (applied as a yello.ae city filter). Pick one value; leave empty to scrape all of the UAE. Only applies to the Categories input above, not to direct company Start URLs. Example: 'dubai'.",
                        "default": ""
                    },
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Direct yello.ae URLs to scrape, one per line: category pages (https://www.yello.ae/category/<slug>), location pages (https://www.yello.ae/location/<emirate>), or company pages (https://www.yello.ae/company/<id>/<slug>). Category and location URLs are paginated; company URLs are scraped directly. Use this for precise control. NOT for non-yello.ae domains.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxResults": {
                        "title": "Max results",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of business records to return across the whole run (hard cap on billing). The run stops and finishes gracefully once this many businesses are scraped. Defaults to 1000; set lower for a quick test. Counts final business records, not intermediate listing pages.",
                        "default": 1000
                    },
                    "enrichDetails": {
                        "title": "Enrich from detail pages",
                        "type": "boolean",
                        "description": "When true (default), each business's own page is visited to extract full data (phone, website, address, geo, working hours, established year) from its JSON-LD block. When false, only the lighter listing-page fields (name, profile URL, address) are returned - faster and cheaper but less complete. Keep true for lead-generation use cases.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Proxy settings for outbound requests. Defaults to Apify datacenter proxy, which is sufficient because yello.ae has no anti-bot protection. Leave as default unless you have a specific network requirement.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
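For projects without an Apify client, the `run-sync-get-dataset-items` path in the specification above can be called over plain HTTP. The sketch below uses only the Python standard library; `build_request` and `run_sync` are hypothetical helper names, and the response body is assumed to be JSON, since the specification does not describe the `200` payload for this path.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"  # from the "servers" entry in the spec
ACTOR_PATH = "/acts/khadinakbar~yellow-pages-uae-scraper/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Build the POST request: token as a query parameter, input as a JSON body."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{BASE_URL}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, run_input: dict, timeout: float = 300.0):
    """Send the request and decode the response body (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(token, run_input), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only builds the request; call run_sync(...) with a real token to execute a run.
    req = build_request("<YOUR_API_TOKEN>", {"categories": ["restaurants"], "city": "dubai", "maxResults": 5})
    print(req.get_method(), req.full_url.split("?")[0])
```

Keeping request construction separate from sending makes it possible to inspect or log the URL and body before any billable run starts.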
