# Alibaba Product Extractor (`kawsar/alibaba-product-extractor`) Actor

Extracts high-quality product data, price ranges, MOQ, ratings, and supplier details from Alibaba search results and category URLs using built-in bypass infrastructure.

- **URL**: https://apify.com/kawsar/alibaba-product-extractor.md
- **Developed by:** [Kawsar](https://apify.com/kawsar) (community)
- **Categories:** E-commerce, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.99 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases as your subscription plan tier increases.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
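
As a rough budgeting aid, the arithmetic behind the listed rate can be sketched in Python. This assumes the "from $3.99 / 1,000 results" figure applies as a flat rate to every result and that no other events are billed; the actual charge depends on your plan and on the Actor's event definitions, so treat the result as an estimate only. The function name `estimate_cost_usd` is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption: the listed starting rate is applied flat to every result.
PRICE_PER_1000_RESULTS = Decimal("3.99")


def estimate_cost_usd(
    result_count: int,
    price_per_1000: Decimal = PRICE_PER_1000_RESULTS,
) -> Decimal:
    """Estimate the charge in USD for `result_count` results, rounded to cents."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = price_per_1000 * Decimal(result_count) / Decimal(1000)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For example, `estimate_cost_usd(30)` (the default `maxItems` for one query) comes to about $0.12 under these assumptions.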

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## 🏷️ Alibaba Product Extractor

Alibaba Product Extractor is a web scraping tool that collects structured product data, pricing tiers, minimum order quantities (MOQ), star ratings, and detailed supplier information from Alibaba.

With Residential Proxy routing enabled (the default), the extractor routes requests through residential IP pools and handles security challenges automatically. Whether you are monitoring competitor prices, analyzing bulk discount thresholds, sourcing manufacturers, or running large-scale market research, this scraper delivers structured datasets in **JSON, CSV, or Excel** formats.

---

### 🌟 Key Features

*   **⚡ SEO-Optimized Fast Extraction:** Automatically translates standard searches into optimized, pre-rendered endpoints to bypass heavy browser loads and speed up scraping times.
*   **🎯 Dual Search Input Modes:** Run extractions using either raw search keywords (e.g. `wireless earbuds`) or direct URLs of filtered category/search result pages.
*   **🛡️ Robust Security Bypass:** Seamlessly handles anti-bot defenses, CAPTCHA walls, and IP blocks. Includes configurable toggle controls for **Residential Proxy routing** and **JavaScript headless browser rendering**.
*   **📦 Deep Supplier Analytics:** Collects critical vendor fields such as company name, official supplier homepages, vendor country of origin, and certified Gold Supplier years.
*   **📊 Comprehensive Product Details:** Captures product names, high-res main images, rating scores, transaction volumes/orders, price ranges, discount/promotional rates, and MOQs.
*   **🔄 Smart Auto-Pagination:** Effortlessly traverses multiple search pages to retrieve exactly the volume of items requested.

---

### 🚀 How to Use

1.  **Configure Input Targets:** Add product search keywords or paste raw Alibaba search/listing URLs under the **Search Keywords** or **Search / Category URLs** sections.
2.  **Fine-tune settings:** Adjust limit parameters (default is `30` items) and toggles for Residential Proxy routing or browser rendering as needed.
3.  **Execute & Run:** Run the Actor. It will paginate automatically, handle CAPTCHA challenges, and extract products.
4.  **Download Clean Data:** Instantly export the collected records from the Apify Dataset tab into your preferred format (JSON, CSV, or Excel).

---

### ⚙️ Input Parameters

The extractor is fully customizable through the following input parameters:

| Parameter Field | Type | Default | Description |
| :--- | :--- | :--- | :--- |
| **Search Keywords** (`queries`) | Array | `["wireless earbuds"]` | List of terms to search on Alibaba. (e.g., `["industrial valves", "leather shoes"]`). |
| **Search / Category URLs** (`urls`) | Array | `[]` | Direct raw Alibaba listing or category page URLs to scrape. |
| **Max Items per Query** (`maxItems`) | Integer | `30` | Maximum number of products to extract per keyword or URL (Min: 1, Max: 1000). |
| **Use Residential Proxies** (`useResidential`) | Boolean | `true` | Route traffic through premium mobile and residential IP pools to prevent rate-limits and blocks. |
| **Use JS Browser Rendering** (`useRender`) | Boolean | `false` | Enable headless browser rendering to execute dynamic JavaScript or get past heavier security layers. |
| **Request Timeout** (`requestTimeoutSecs`) | Integer | `30` | Timeout in seconds for fetching each target page. |
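
The parameters above can be assembled and sanity-checked before a run. The following is a minimal Python sketch, not part of the Actor itself: the helper name `build_input` is hypothetical, and the numeric bounds (1 to 1000 for `maxItems`, 5 to 120 for `requestTimeoutSecs`) are taken from the input schema in the OpenAPI specification further down this page.

```python
from typing import Any, Sequence

# Bounds taken from the Actor's published input schema.
MAX_ITEMS_MIN, MAX_ITEMS_MAX = 1, 1000
TIMEOUT_MIN_SECS, TIMEOUT_MAX_SECS = 5, 120


def build_input(
    queries: Sequence[str] = (),
    urls: Sequence[str] = (),
    max_items: int = 30,
    use_residential: bool = True,
    use_render: bool = False,
    request_timeout_secs: int = 30,
) -> dict[str, Any]:
    """Return an input dict for the Actor, rejecting out-of-range numbers early."""
    if not MAX_ITEMS_MIN <= max_items <= MAX_ITEMS_MAX:
        raise ValueError(
            f"maxItems must be between {MAX_ITEMS_MIN} and {MAX_ITEMS_MAX}"
        )
    if not TIMEOUT_MIN_SECS <= request_timeout_secs <= TIMEOUT_MAX_SECS:
        raise ValueError(
            f"requestTimeoutSecs must be between {TIMEOUT_MIN_SECS} and {TIMEOUT_MAX_SECS}"
        )
    return {
        # Drop blank entries so stray whitespace never becomes a search term.
        "queries": [q.strip() for q in queries if q.strip()],
        "urls": [u.strip() for u in urls if u.strip()],
        "maxItems": max_items,
        "useResidential": use_residential,
        "useRender": use_render,
        "requestTimeoutSecs": request_timeout_secs,
    }
```

The returned dict can be passed as `run_input` to the Python client or serialized to JSON for the CLI and REST API. Validating locally simply surfaces a bad value before a run is started.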


---

### 📊 Extracted Dataset Schema

Every collected item generates a clean record with the following structured keys:

| Field Name | Type | Description |
| :--- | :--- | :--- |
| `id` | String | Unique product ID on Alibaba. |
| `name` | String | Fully sanitized product title (HTML tags automatically stripped). |
| `productUrl` | String | Direct, absolute URL to the product detail page. |
| `imageUrl` | String | Absolute link to the primary high-resolution product image. |
| `price` | String | Base listing price text (in original or target currency). |
| `priceRange` | String | Range of pricing or wholesale price tiers if available. |
| `moq` | String | Minimum Order Quantity required by the supplier (e.g., `Min. order: 10 pieces`). |
| `companyName` | String | Legal name of the manufacturer or supplier. |
| `supplierHref` | String | Absolute link to the supplier's store profile on Alibaba. |
| `goldSupplierYears` | Integer | Certified number of years active as an Alibaba Gold Supplier. |
| `rating` | Float | Product rating score out of 5.0. |
| `reviewCount` | Integer | Total count of customer reviews written for this listing. |
| `soldOrder` | String | Total order volume or quantity sold (e.g., `10,000+ orders`). |
| `countryName` | String | Full name of the supplier's country/region of operation. |
| `countryCode` | String | Standard ISO 2-letter country code of the supplier (e.g., `CN`, `US`). |
| `promotionPrice` | String | Discounted/promotion price currently active. |
| `discount` | String | Discount percentage or discount value string. |
| `query` | String | The search keyword or target URL that produced this result. |
| `scrapedAt` | String | ISO formatted UTC timestamp of when this item was extracted. |
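
When consuming the dataset downstream, it can help to check records against the types listed in this table. The sketch below is one possible approach in Python; the helper name `find_type_mismatches` is hypothetical, and it assumes that a missing or null field is acceptable, since the table does not say which fields are always present.

```python
from typing import Any

# Expected Python types per field, transcribed from the schema table above.
# `rating` also accepts int, because JSON does not distinguish 5 from 5.0.
EXPECTED_TYPES: dict[str, tuple[type, ...]] = {
    "id": (str,),
    "name": (str,),
    "productUrl": (str,),
    "imageUrl": (str,),
    "price": (str,),
    "priceRange": (str,),
    "moq": (str,),
    "companyName": (str,),
    "supplierHref": (str,),
    "goldSupplierYears": (int,),
    "rating": (int, float),
    "reviewCount": (int,),
    "soldOrder": (str,),
    "countryName": (str,),
    "countryCode": (str,),
    "promotionPrice": (str,),
    "discount": (str,),
    "query": (str,),
    "scrapedAt": (str,),
}


def find_type_mismatches(record: dict[str, Any]) -> list[str]:
    """Return the names of fields whose value has an unexpected type.

    Assumption: a missing or null field is acceptable and is skipped.
    """
    mismatches = []
    for field, allowed in EXPECTED_TYPES.items():
        value = record.get(field)
        if value is None:
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, allowed):
            mismatches.append(field)
    return mismatches
```

An empty list means every present field matched its documented type; anything else is worth logging before the record reaches a database or spreadsheet.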

---

### 📝 Sample Output

A typical dataset item looks like this in JSON:

```json
{
  "id": "1601723870128",
  "name": "Daily Waterproof Wireless Call Music Touch Screen Smartwatch with Earbuds 2 in 1 Multifunctional Sports Smart Watch",
  "productUrl": "https://www.alibaba.com/product-detail/Daily-Waterproof-Wireless-Call-Music-Touch_1601723870128.html",
  "imageUrl": "https://s.alicdn.com/@sc04/kf/Hbe720e2d36304910aafb70ec7c4d92bfz.png_300x300.png",
  "price": "$12.45 - $14.80",
  "priceRange": "$12.45 - $14.80",
  "moq": "Min. order: 500 pieces",
  "companyName": "Dongguan Cenyuan Electronic Technology Co., Ltd.",
  "supplierHref": "https://cyadult.en.alibaba.com/company_profile.html",
  "goldSupplierYears": 2,
  "rating": 4.5,
  "reviewCount": 57,
  "soldOrder": "50,000+ sold",
  "countryName": "China",
  "countryCode": "CN",
  "promotionPrice": "$11.20",
  "discount": "10% off",
  "query": "wireless earbuds",
  "scrapedAt": "2026-06-09T05:07:15.000Z"
}
````
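
Several fields in the sample (`price`, `priceRange`, `moq`, `soldOrder`, `discount`) are display strings rather than numbers. A minimal Python sketch for pulling numeric values out of them follows. The helper names are hypothetical, and the parsing assumes strings shaped like the sample above, with US-style separators (comma for thousands, dot for decimals); other locales or layouts would need different handling.

```python
import re
from typing import Optional

# Matches numbers such as "12.45", "500", or "50,000" (US-style separators assumed).
_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def _numbers(text: str) -> list[float]:
    """Extract every number in `text` as a float, ignoring thousands separators."""
    return [float(match.replace(",", "")) for match in _NUMBER.findall(text)]


def parse_price_range(text: str) -> Optional[tuple[float, float]]:
    """Return (low, high) from a price string; a single price yields (p, p)."""
    values = _numbers(text)
    if not values:
        return None
    return (min(values), max(values))


def parse_first_int(text: str) -> Optional[int]:
    """Return the first number in `text` as an int, or None if there is none."""
    values = _numbers(text)
    return int(values[0]) if values else None
```

With the sample record, `parse_price_range` applied to `priceRange` gives the low and high bounds, and `parse_first_int` applied to `moq`, `soldOrder`, or `discount` gives the quantity, order count, or percentage. Currency symbols are discarded, so keep the original string if the currency matters.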

---

### ⚖️ Disclaimer & Legal Notice

This scraper is built strictly for personal research, educational analysis, and public-data monitoring. You are solely responsible for ensuring your data extraction complies with Alibaba's Terms of Service, Privacy Policies, and any local laws regarding automated data scraping. Use responsibly and schedule scrapers with appropriate intervals to respect server load.

# Actor input Schema

## `queries` (type: `array`):

List of keywords or search queries to run on Alibaba (e.g. 'wireless earbuds', 'leather shoes').

## `urls` (type: `array`):

List of raw Alibaba search result URLs or category page URLs to scrape.

## `useResidential` (type: `boolean`):

Highly recommended for Alibaba! Routes requests through residential and mobile IP pools to bypass strict blocks (uses super=true).

## `useRender` (type: `boolean`):

Uses headless browser rendering to execute JavaScript and solve security challenges (uses render=true).

## `maxItems` (type: `integer`):

Maximum number of products to extract per keyword or URL.

## `requestTimeoutSecs` (type: `integer`):

Timeout in seconds for fetching each page.

## Actor input object example

```json
{
  "queries": [
    "wireless earbuds"
  ],
  "urls": [
    "https://www.alibaba.com/trade/search?keywords=wireless+earbuds"
  ],
  "useResidential": true,
  "useRender": false,
  "maxItems": 30,
  "requestTimeoutSecs": 30
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "queries": [
        "wireless earbuds"
    ],
    "urls": [
        "https://www.alibaba.com/trade/search?keywords=wireless+earbuds"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("kawsar/alibaba-product-extractor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "queries": ["wireless earbuds"],
    "urls": ["https://www.alibaba.com/trade/search?keywords=wireless+earbuds"],
}

# Run the Actor and wait for it to finish
run = client.actor("kawsar/alibaba-product-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "queries": [
    "wireless earbuds"
  ],
  "urls": [
    "https://www.alibaba.com/trade/search?keywords=wireless+earbuds"
  ]
}' |
apify call kawsar/alibaba-product-extractor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=kawsar/alibaba-product-extractor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Alibaba Product Extractor",
        "description": "Extracts high-quality product data, price ranges, MOQ, ratings, and supplier details from Alibaba search results and category URLs using built-in bypass infrastructure.",
        "version": "0.0",
        "x-build-id": "9BiCZdLGs8fBJSBxx"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/kawsar~alibaba-product-extractor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-kawsar-alibaba-product-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/kawsar~alibaba-product-extractor/runs": {
            "post": {
                "operationId": "runs-sync-kawsar-alibaba-product-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/kawsar~alibaba-product-extractor/run-sync": {
            "post": {
                "operationId": "run-sync-kawsar-alibaba-product-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "queries": {
                        "title": "Search Keywords",
                        "type": "array",
                        "description": "List of keywords or search queries to run on Alibaba (e.g. 'wireless earbuds', 'leather shoes').",
                        "items": {
                            "type": "string"
                        }
                    },
                    "urls": {
                        "title": "Search / Category URLs",
                        "type": "array",
                        "description": "List of raw Alibaba search result URLs or category page URLs to scrape.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "useResidential": {
                        "title": "Use Residential Proxies",
                        "type": "boolean",
                        "description": "Highly recommended for Alibaba! Routes requests through residential and mobile IP pools to bypass strict blocks (uses super=true).",
                        "default": true
                    },
                    "useRender": {
                        "title": "Use JS Browser Rendering",
                        "type": "boolean",
                        "description": "Uses headless browser rendering to execute JavaScript and solve security challenges (uses render=true).",
                        "default": false
                    },
                    "maxItems": {
                        "title": "Max Items per Query",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of products to extract per keyword or URL.",
                        "default": 30
                    },
                    "requestTimeoutSecs": {
                        "title": "Request Timeout (seconds)",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Timeout in seconds for fetching each page.",
                        "default": 30
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
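
For projects that call the REST API directly, the `run-sync-get-dataset-items` operation described in the specification can be sketched with the Python standard library. The endpoint path and the `token` query parameter come from the specification above; the function names, the 300-second client timeout, and the expectation that the response body is JSON are assumptions made for this illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "kawsar~alibaba-product-extractor"


def build_sync_request(token: str, actor_input: dict[str, Any]) -> urllib.request.Request:
    """Build the POST request for run-sync-get-dataset-items without sending it."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict[str, Any], timeout_secs: float = 300) -> Any:
    """Send the request and decode the response body.

    Assumptions: the response is JSON, and 300 seconds is enough for the run;
    both should be adjusted to the workload.
    """
    request = build_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout_secs) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `run_sync("<YOUR_API_TOKEN>", {"queries": ["wireless earbuds"]})` would start a run and block until it finishes. Separating request construction from sending keeps the URL and payload easy to inspect or test without network access.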
