# Alibaba Product & Supplier Scraper (`sovanza.inc/alibaba-product-supplier-scraper`) Actor

Scrape Alibaba product data including title, price, images, description, reviews, specs, and variations. Handles anti-bot, proxies, and regional layouts. Export clean JSON for eCommerce, market research, and competitor tracking.

- **URL**: https://apify.com/sovanza.inc/alibaba-product-supplier-scraper.md
- **Developed by:** [Sovanza](https://apify.com/sovanza.inc) (community)
- **Categories:** E-commerce
- **Stats:** 3 total users, 1 monthly user, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

$15.00/month + usage

To use this Actor, you pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for Apify platform usage, which gets cheaper on higher Apify subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#rental-actors

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### Alibaba Product & Supplier Scraper – Extract Data, Specs & Reviews

### What is Alibaba Scraper and How Does It Work?

Alibaba Scraper is an Alibaba product data extraction tool built on Apify. It scrapes product information, specifications, supplier details, and reviews from Alibaba listings. It is designed for businesses, wholesalers, dropshippers, and researchers who want to automate product sourcing, supplier analysis, competitor research, and bulk data extraction without manual effort.

➡️ This scraper helps turn Alibaba into a structured dataset for business decisions and automation.

### Why Use This Alibaba Scraper?

Use this scraper to:
- Extract product data from Alibaba at scale
- Analyze suppliers, pricing, and product specifications
- Discover trending and profitable products
- Monitor competitor listings and suppliers
- Automate sourcing and research workflows

### Features

- Scrape detailed Alibaba product listings
- Extract specifications, features, and descriptions
- Capture pricing information (and MOQ where available)
- Extract ratings and review metrics (when available / enabled)
- Optional extraction of variants/options (size, model, etc.)
- Proxy support and retries for reliability
- Structured output exportable in JSON, CSV, or Excel via Apify datasets

### How to Use Alibaba Product & Supplier Scraper on Apify

#### Using the Actor

To use this Actor on Apify, follow these steps:

1. **Go to the Alibaba Product & Supplier Scraper** on the Apify platform.

2. **Input Configuration**:
   - Enter one or more Alibaba product URLs you want to scrape.
   - Enable optional extraction settings (reviews, variants, details) as needed.
   - Select language and proxy country if required.

#### Input Configuration

The Actor accepts the following input parameters (based on `INPUT_SCHEMA.json`):

```json
{
  "startUrls": [
    { "url": "https://www.alibaba.com/product-detail/EXAMPLE.html" }
  ],
  "url": "https://www.alibaba.com/product-detail/EXAMPLE.html",
  "scrapeReviews": false,
  "scrapeProductVariants": false,
  "scrapeProductDetails": false,
  "language": "en",
  "proxyCountry": "AUTO_SELECT_PROXY_COUNTRY"
}
````

- `startUrls` (optional): List of Alibaba product page URLs to scrape (request list format).
- `url` (optional): Single product URL (legacy; use `startUrls` for multiple).
- `scrapeReviews` (optional): Whether to scrape product reviews (default: `false`).
- `scrapeProductVariants` (optional): Whether to scrape product variants/options (default: `false`).
- `scrapeProductDetails` (optional): Whether to scrape detailed product specifications (default: `false`).
- `language` (optional): Language to use on Alibaba (default: `en`).
- `proxyCountry` (optional): Proxy country (`AUTO_SELECT_PROXY_COUNTRY`, `US`, `GB`, `DE`, `FR`, `JP`, `CA`, `IT`).

3. **Run the Actor**:
   - Click the **Start** button to begin scraping.
   - The Actor will process each URL and store extracted items in the default dataset.

4. **Access Your Results**:
   - View results in the **Dataset** tab.
   - Export in JSON, CSV, or Excel.
   - Access via Apify API for automation workflows.

5. **Schedule Regular Runs** (Optional):
   - Schedule recurring scraping to monitor competitor listings and sourcing opportunities.
   - Use webhooks to trigger downstream workflows.
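
The input parameters described under Input Configuration can be assembled in code before a run is started. The helper below is a minimal sketch: `build_input` and its keyword names are hypothetical, and the requirement of at least one URL is a choice made for this example rather than a documented rule of the Actor. The returned dictionary uses only the field names listed above.

```python
from __future__ import annotations

from typing import Any


def build_input(
    urls: list[str],
    *,
    scrape_reviews: bool = False,
    scrape_variants: bool = False,
    scrape_details: bool = False,
    language: str = "en",
    proxy_country: str = "AUTO_SELECT_PROXY_COUNTRY",
) -> dict[str, Any]:
    """Assemble an input object using the documented field names."""
    # Drop empty entries and surrounding whitespace (assumption of this sketch).
    cleaned = [u.strip() for u in urls if u and u.strip()]
    if not cleaned:
        raise ValueError("at least one product URL is required")
    return {
        # startUrls uses the request list format: one {"url": ...} object per page.
        "startUrls": [{"url": u} for u in cleaned],
        "scrapeReviews": scrape_reviews,
        "scrapeProductVariants": scrape_variants,
        "scrapeProductDetails": scrape_details,
        "language": language,
        "proxyCountry": proxy_country,
    }
```

The result can be passed as `run_input` to the Python client call shown in the API section.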

### Output

All results are stored in Apify dataset storage. According to the dataset schema, each item typically includes:

- `url`: Product URL.
- `title`: Product title.
- `price`: Current price (string).
- `availability`: Stock availability status (string).
- `images`: Array of product image URLs.
- `description`: Product description.
- `features`: Array of product feature bullets.
- `average_rating`: Average rating (string or `null`).
- `review_count`: Review count (string or `null`).
- `product_details`: Object with additional product specifications and extracted details.
- `timestamp`: ISO date-time string indicating when the product was scraped.

Example item (simplified):

```json
{
  "url": "https://www.alibaba.com/product-detail/EXAMPLE.html",
  "title": "Example Alibaba Product",
  "price": "$19.99 - $29.99",
  "availability": "In Stock",
  "images": ["https://example.com/image1.jpg"],
  "description": "Detailed product description...",
  "features": ["Feature 1", "Feature 2"],
  "average_rating": "4.5",
  "review_count": "120",
  "product_details": {
    "Brand Name": "Example Brand",
    "Material": "Stainless Steel",
    "MOQ": "100 Pieces",
    "Supplier": "Example Supplier Co., Ltd."
  },
  "timestamp": "2025-06-18T09:30:00Z"
}
```

➡️ Output is clean, structured, and ready for sourcing, analytics, and automation.
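
Because `price`, `average_rating`, and `review_count` arrive as strings, downstream analytics usually need them converted to numbers. The sketch below shows one way to do that for items shaped like the example above. The function names are hypothetical, and it assumes commas are thousands separators, which may not hold for localized pages.

```python
from __future__ import annotations

import re
from typing import Any, Callable, Optional

_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_price_range(price: Optional[str]) -> Optional[tuple[float, float]]:
    """Return (low, high) from a string such as "$19.99 - $29.99"."""
    if not price:
        return None
    # Assumption: commas are thousands separators, so they are removed first.
    numbers = [float(n) for n in _NUMBER.findall(price.replace(",", ""))]
    if not numbers:
        return None
    return (min(numbers), max(numbers))


def _to_number(value: Any, cast: Callable[[str], Any]) -> Any:
    """Convert a string field to a number, returning None when it is absent or unparsable."""
    if value is None:
        return None
    try:
        return cast(str(value).replace(",", "").strip())
    except ValueError:
        return None


def normalize_item(item: dict[str, Any]) -> dict[str, Any]:
    """Produce numeric price, rating, and review-count fields for one dataset item."""
    price = parse_price_range(item.get("price"))
    return {
        "url": item.get("url"),
        "price_low": price[0] if price else None,
        "price_high": price[1] if price else None,
        "average_rating": _to_number(item.get("average_rating"), float),
        "review_count": _to_number(item.get("review_count"), int),
    }
```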

### How the Scraper Works

In general, the Actor:

1. Loads each Alibaba product URL (using a browser automation approach where needed).
2. Extracts core product data (title, price, images, description).
3. Extracts structured specifications/details into `product_details` when enabled.
4. Extracts ratings/review metrics when available (and when enabled).
5. Saves each product as a structured dataset item.

### Anti-blocking Measures

To improve reliability on Alibaba:

- Supports proxy configuration (country selection).
- Retries failed requests where appropriate.
- Uses browser-like behavior when needed to handle dynamic page structures.

### Performance Optimization

- Scrape multiple product URLs per run with `startUrls`.
- Enable only the options you need (`scrapeReviews`, `scrapeProductVariants`, `scrapeProductDetails`) for faster runs.
- Schedule runs to monitor changes over time.
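
When a URL list is long, it can be deduplicated and split into several `startUrls` batches, one per run. This is a sketch only: `batch_start_urls` is a hypothetical helper, and the default batch size of 50 is an arbitrary illustration, not a limit documented for this Actor.

```python
from __future__ import annotations


def batch_start_urls(urls: list[str], batch_size: int = 50) -> list[list[dict[str, str]]]:
    """Deduplicate URLs (keeping first-seen order) and split them into startUrls batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # dict.fromkeys preserves insertion order while dropping duplicates.
    unique = list(dict.fromkeys(u.strip() for u in urls if u.strip()))
    return [
        [{"url": u} for u in unique[i : i + batch_size]]
        for i in range(0, len(unique), batch_size)
    ]
```

Each inner list can be used as the `startUrls` value of a separate run input.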

### Why Choose This Actor?

- Extract product data, specs, and sourcing signals in one run
- No official Alibaba API required
- Automation-ready via Apify API, scheduling, and webhooks
- Clean structured datasets for sourcing, dropshipping, and analytics
- Built for serious business workflows

### FAQ

#### How does Alibaba Scraper work?

It extracts publicly available data directly from Alibaba product pages. When you provide a product URL, the Actor loads the page, collects structured information such as product details, specifications, pricing, and supplier data, and then organizes it into a clean dataset.

#### Can I scrape multiple Alibaba products at once?

Yes. Use `startUrls` to provide multiple product URLs and scrape them in a single run.

#### Does this scraper require Alibaba API access?

No. It works independently by extracting publicly available data directly from product pages.

#### What kind of supplier information can I extract?

Supplier/company details can be captured when available on the product page. Depending on the listing, this may appear under `product_details` in the output.

#### Can I extract product specifications and technical details?

Yes. Enable `scrapeProductDetails` to capture detailed specifications and structured attributes.

#### Is the extracted data accurate and up-to-date?

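Data is extracted directly from Alibaba pages at the moment of scraping, so each item reflects what the listing showed at that time. The `timestamp` field records when the product was scraped.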

#### Can I automate scraping and run it regularly?

Yes. Use Apify scheduling and API features to run at regular intervals.

#### What output formats are supported?

JSON, CSV, and Excel via Apify dataset export.
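
For scripted exports, the Apify API's dataset items endpoint accepts a `format` query parameter. The sketch below builds such a URL for the three formats named here. `dataset_export_url` is a hypothetical helper, and authentication (for example an `Authorization: Bearer <YOUR_API_TOKEN>` header) is left to the caller.

```python
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"

# Maps the format names used in this README to the API's `format` values.
EXPORT_FORMATS = {"json": "json", "csv": "csv", "excel": "xlsx"}


def dataset_export_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the dataset items URL for the requested export format."""
    key = fmt.lower()
    if key not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    query = urlencode({"format": EXPORT_FORMATS[key]})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"
```

The `dataset_id` is the `defaultDatasetId` of a finished run.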

#### Is scraping Alibaba legal?

Scraping publicly available data is generally allowed, but you should ensure compliance with Alibaba’s terms of service and applicable laws.

### Actor permissions

This Actor is designed to work with **limited permissions**. It only reads input and writes to its default dataset; it does not access other user data or require full account access.

**To set limited permissions in Apify Console:**

1. Open your Actor on the Apify platform.
2. Go to the **Source** tab (or **Settings**).
3. Click **Review permissions** (or open **Settings** → **Permissions**).
4. Select **Limited permissions** and save.

Using limited permissions improves trust and can improve your Actor's quality score in the Store.

### Limitations

- Alibaba pages and protections change frequently, which may require scraper updates.
- Some supplier or review data may not be available on all listings.
- Large-scale scraping may require appropriate proxy configuration and Apify resources.
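
Since some fields may be missing on a given listing, it can help to flag incomplete items before they reach downstream workflows. The check below is a minimal sketch: `missing_fields` is a hypothetical helper, and the field names are the ones listed in the Output section.

```python
from __future__ import annotations

from typing import Any

# Field names taken from the Output section of this README.
EXPECTED_FIELDS = (
    "url", "title", "price", "availability", "images", "description",
    "features", "average_rating", "review_count", "product_details", "timestamp",
)


def missing_fields(item: dict[str, Any]) -> list[str]:
    """List documented fields that are absent, null, or empty in a dataset item."""
    return [f for f in EXPECTED_FIELDS if item.get(f) in (None, "", [], {})]
```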

### License

This project is licensed under the MIT License - see the LICENSE file for details.

### Get Started

Start extracting Alibaba product and supplier data to power sourcing, dropshipping, and automation workflows today. 🚀

# Actor input Schema

## `startUrls` (type: `array`):

List of Alibaba product page URLs to scrape (one per line)

## `url` (type: `string`):

URL of a single Alibaba product page to scrape (use 'Start URLs' for multiple)

## `scrapeReviews` (type: `boolean`):

Whether to scrape product reviews

## `scrapeProductVariants` (type: `boolean`):

Whether to scrape product variants/options

## `scrapeProductDetails` (type: `boolean`):

Whether to scrape detailed product specifications

## `language` (type: `string`):

Language to use on Alibaba

## `proxyCountry` (type: `string`):

Country for proxy (AUTO\_SELECT\_PROXY\_COUNTRY to auto-select based on domain)

## Actor input object example

```json
{
  "scrapeReviews": false,
  "scrapeProductVariants": false,
  "scrapeProductDetails": false,
  "language": "en",
  "proxyCountry": "AUTO_SELECT_PROXY_COUNTRY"
}
```
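
An input object can be checked locally before it is submitted. The sketch below validates the fields described in the schema above. `validate_input` is a hypothetical helper, and the allowed `language` and `proxyCountry` values are copied from the OpenAPI specification later in this document.

```python
from __future__ import annotations

from typing import Any

LANGUAGES = (
    "en", "de", "es", "fr", "it", "ja", "zh_CN", "pt", "nl", "pl",
    "tr", "ar", "sv", "ko", "hi", "cs", "da", "he", "ru", "th",
)
PROXY_COUNTRIES = ("AUTO_SELECT_PROXY_COUNTRY", "US", "GB", "DE", "FR", "JP", "CA", "IT")
BOOLEAN_FIELDS = ("scrapeReviews", "scrapeProductVariants", "scrapeProductDetails")


def validate_input(actor_input: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means the input looks valid."""
    errors = []
    for field in BOOLEAN_FIELDS:
        if field in actor_input and not isinstance(actor_input[field], bool):
            errors.append(f"{field} must be a boolean")
    # Omitted fields fall back to the schema defaults.
    if actor_input.get("language", "en") not in LANGUAGES:
        errors.append("language is not one of the listed values")
    if actor_input.get("proxyCountry", "AUTO_SELECT_PROXY_COUNTRY") not in PROXY_COUNTRIES:
        errors.append("proxyCountry is not one of the listed values")
    for index, entry in enumerate(actor_input.get("startUrls", [])):
        if not isinstance(entry, dict) or not isinstance(entry.get("url"), str):
            errors.append(f"startUrls[{index}] needs a string 'url'")
    return errors
```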

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {};

// Run the Actor and wait for it to finish
const run = await client.actor("sovanza.inc/alibaba-product-supplier-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {}

# Run the Actor and wait for it to finish
run = client.actor("sovanza.inc/alibaba-product-supplier-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{}' |
apify call sovanza.inc/alibaba-product-supplier-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=sovanza.inc/alibaba-product-supplier-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Alibaba Product & Supplier Scraper",
        "description": "Scrape Alibaba product data including title, price, images, description, reviews, specs, and variations. Handles anti-bot, proxies, and regional layouts. Export clean JSON for eCommerce, market research, and competitor tracking.",
        "version": "0.0",
        "x-build-id": "j8K5QBSdeQ36W7Lib"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/sovanza.inc~alibaba-product-supplier-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-sovanza.inc-alibaba-product-supplier-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/sovanza.inc~alibaba-product-supplier-scraper/runs": {
            "post": {
                "operationId": "runs-sync-sovanza.inc-alibaba-product-supplier-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/sovanza.inc~alibaba-product-supplier-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-sovanza.inc-alibaba-product-supplier-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Product URLs",
                        "type": "array",
                        "description": "List of Alibaba product page URLs to scrape (one per line)",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "url": {
                        "title": "Single Product URL (Legacy)",
                        "type": "string",
                        "description": "URL of a single Alibaba product page to scrape (use 'Start URLs' for multiple)"
                    },
                    "scrapeReviews": {
                        "title": "Scrape Reviews",
                        "type": "boolean",
                        "description": "Whether to scrape product reviews",
                        "default": false
                    },
                    "scrapeProductVariants": {
                        "title": "Scrape Product Variants",
                        "type": "boolean",
                        "description": "Whether to scrape product variants/options",
                        "default": false
                    },
                    "scrapeProductDetails": {
                        "title": "Scrape Product Details",
                        "type": "boolean",
                        "description": "Whether to scrape detailed product specifications",
                        "default": false
                    },
                    "language": {
                        "title": "Language",
                        "enum": [
                            "en",
                            "de",
                            "es",
                            "fr",
                            "it",
                            "ja",
                            "zh_CN",
                            "pt",
                            "nl",
                            "pl",
                            "tr",
                            "ar",
                            "sv",
                            "ko",
                            "hi",
                            "cs",
                            "da",
                            "he",
                            "ru",
                            "th"
                        ],
                        "type": "string",
                        "description": "Language to use on Alibaba",
                        "default": "en"
                    },
                    "proxyCountry": {
                        "title": "Proxy Country",
                        "enum": [
                            "AUTO_SELECT_PROXY_COUNTRY",
                            "US",
                            "GB",
                            "DE",
                            "FR",
                            "JP",
                            "CA",
                            "IT"
                        ],
                        "type": "string",
                        "description": "Country for proxy (AUTO_SELECT_PROXY_COUNTRY to auto-select based on domain)",
                        "default": "AUTO_SELECT_PROXY_COUNTRY"
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
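
For projects that call the REST API without a client library, the `run-sync-get-dataset-items` path in the specification above can be turned into a request with the standard library alone. This is a sketch under stated assumptions: `build_sync_request` is a hypothetical helper, the token is passed as the `token` query parameter as the specification describes, and sending the request is left to the caller.

```python
from __future__ import annotations

import json
from typing import Any
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH_ID = "sovanza.inc~alibaba-product-supplier-scraper"


def build_sync_request(token: str, actor_input: dict[str, Any]) -> Request:
    """Build (but do not send) a POST request for run-sync-get-dataset-items."""
    query = urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH_ID}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it. According to the endpoint summary, the response contains the Actor's dataset items once the run completes.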
