# Taobao Tmall Product Scraper (`sian.agency/taobao-tmall-product-scraper`) Actor

Extract Taobao & Tmall product data with clean structured output. Four scrapers in one — product details, keyword search, shop catalogs, customer reviews. Perfect for dropshippers, sourcing agents & e-commerce researchers.

- **URL**: https://apify.com/sian.agency/taobao-tmall-product-scraper.md
- **Developed by:** [SIÁN OÜ](https://apify.com/sian.agency) (community)
- **Categories:** E-commerce
- **Stats:** 12 total users, 5 monthly users, 88.0% runs succeeded, 3 bookmarks
- **User rating**: 5.00 out of 5 stars

## Pricing

from $6.00 / 1,000 detail results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the higher your subscription plan, the lower the price.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Taobao & Tmall Scraper — Products, Shops, Search, Reviews 🛍️

[![SIÁN Agency Store](https://img.shields.io/badge/Store-SI%C3%81N%20Agency-1AE392)](https://apify.com/sian.agency?fpr=sian)
[![Taobao](https://img.shields.io/badge/Platform-Taobao-FF4200)](https://apify.com/sian.agency/taobao-tmall-product-scraper?fpr=sian)
[![Tmall](https://img.shields.io/badge/Platform-Tmall-D01E1E)](https://apify.com/sian.agency/taobao-tmall-product-scraper?fpr=sian)

#### 🎉 Four scrapers in one — product details, keyword search, shop catalogs, and customer reviews in a single actor
##### Built for dropshippers, sourcing teams, e-commerce analysts, and anyone who needs clean Taobao data without the setup headache

---

### 📋 Overview

**Tired of hacking together Taobao scrapers that break every other week?** This actor gives you reliable, structured data from Taobao and Tmall — one clean run per task, one tidy dataset out.

**Why thousands of professionals choose SIÁN scrapers:**
- ✅ **Four operations, one actor**: Product detail, keyword search, shop catalog, and review scraping — pick what you need
- ⚡ **79 structured fields**: Every response pre-flattened into a flat row — no parsing nested Chinese JSON
- 🎯 **Production-ready output**: Three predefined dataset views (Overview, Products, Reviews) for instant BI integration
- 💰 **Best price on the market**: Pay-per-result — you only pay for data you actually receive
- 💎 **No account, no API key, no setup**: Just paste an item ID or keyword and run — works out of the box
- ✨ **NEW**: SKU-level variant data with per-variant prices, stock, and swatch images for dropshipping decisions

---

### ✨ Features

- 🛍️ **Product Detail Scraping**: Full product payload — title, SKUs with variant prices, gallery + description images, videos, seller info, shipping costs, coupons, and buyer Q&A
- 🔍 **Keyword Search**: Search Taobao and Tmall by query (Chinese, English, or mixed) with optional price range and Tmall-only filter
- 🏪 **Shop Catalog Sweep**: Pull a seller's full product list by shop ID — ideal for brand mapping and competitor intelligence
- 💬 **Review Scraping**: Customer reviews with text, photos, videos, purchased variant, reviewer info, and follow-up appended comments
- 🎨 **SKU Variant Intelligence**: Per-SKU prices, stock levels, property paths, and swatch images for every color/size combo
- 📊 **Three Dataset Views**: Overview (mixed), Products (search/catalog/detail), Reviews — switch views without re-running
- 🎬 **Media Included**: Product videos, description gallery, review photos and videos — all as direct URLs
- 🌏 **Chinese + English Titles**: Search results include machine-translated English titles where available
- 🔋 **Two Detail Depths**: Rich (full 38 KB payload) for research; Lite (compact 7 KB) for high-volume runs
- 📄 **Paginated Operations**: Keyword Search, Shop Catalog, and Reviews all paginate automatically — set `maxPages` to control scope

---

### 🎬 Quick Start

Run one operation, get one dataset. The operation selector drives the entire run: search by keyword, look up a product by ID, dump a shop, or pull reviews.

```bash
curl -X POST "https://api.apify.com/v2/acts/sian.agency~taobao-tmall-product-scraper/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"operation": "productDetail", "itemId": "744983869996"}'
````

***

### 🚀 Getting Started (3 Simple Steps)

#### Step 1: Pick an Operation

Choose one of four modes in the `operation` dropdown: 🛍️ Product Detail, 🔍 Keyword Search, 🏪 Shop Catalog, or 💬 Product Reviews.

#### Step 2: Provide the Key Input

- **Product Detail** / **Reviews** → an `itemId` (the number from any Taobao product URL after `id=`)
- **Keyword Search** → any `keyword` (Chinese, English, or mixed — e.g. `iphone 15` or `无线耳机`)
- **Shop Catalog** → a `userId` (the seller / shop ID, visible as `shopId` in any search result row)

#### Step 3: Run It

Click **Start** — the actor handles the rest. Paginated operations respect `maxPages` (default 5, max 50).

**That's it! In under a minute, you'll have:**

- A flat dataset with up to 79 structured fields per row
- Every image, video, and variant URL as direct links
- Three ready-to-filter views for BI tools, Airtable, Google Sheets, or n8n

***

### 📥 Input Configuration

| Field | Type | Required | Description |
|---|---|---|---|
| `operation` | enum | ✅ | `productDetail`, `keywordSearch`, `shopCatalog`, or `productReviews` |
| `itemId` | string | for productDetail + productReviews | Numeric Taobao/Tmall item ID |
| `detailVersion` | enum | — | `v1` (rich, default) or `v5` (lite, faster) |
| `keyword` | string | for keywordSearch | Search query — supports Chinese, English, mixed |
| `startPrice` | integer | — | Min price in CNY (keywordSearch only) |
| `endPrice` | integer | — | Max price in CNY (keywordSearch only) |
| `tmallOnly` | boolean | — | Restrict search to Tmall brand stores |
| `userId` | string | for shopCatalog | Numeric shop / seller ID |
| `maxPages` | integer | — | Pages to fetch (1–50, default 5) |
| `includeRawResponse` | boolean | — | Include unflattened `_raw` field per row |
| `proxy` | object | — | Apify proxy config (default: residential) |

#### Example — Product Detail

```json
{
  "operation": "productDetail",
  "itemId": "744983869996",
  "detailVersion": "v1"
}
```

#### Example — Keyword Search with price filter

```json
{
  "operation": "keywordSearch",
  "keyword": "sony headphones",
  "startPrice": 500,
  "endPrice": 3000,
  "tmallOnly": true,
  "maxPages": 10
}
```

#### Example — Shop Catalog

```json
{
  "operation": "shopCatalog",
  "userId": "713464357",
  "maxPages": 20
}
```

#### Example — Product Reviews

```json
{
  "operation": "productReviews",
  "itemId": "742902854135",
  "maxPages": 10
}
```
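
The per-operation requirements in the input table can be checked locally before a run is started. The following is a minimal sketch, not part of the Actor itself: the helper name `check_run_input` and the wording of its messages are hypothetical, and only the field names, the default of 5 pages, and the 1 to 50 range come from the table above.

```python
# Required key per operation, taken from the input table above.
REQUIRED_BY_OPERATION = {
    "productDetail": "itemId",
    "productReviews": "itemId",
    "keywordSearch": "keyword",
    "shopCatalog": "userId",
}


def check_run_input(run_input: dict) -> list:
    """Return a list of problems; an empty list means the input looks usable."""
    operation = run_input.get("operation")
    if operation not in REQUIRED_BY_OPERATION:
        return [f"unknown or missing operation: {operation!r}"]

    problems = []
    required = REQUIRED_BY_OPERATION[operation]
    if not run_input.get(required):
        problems.append(f"{operation} requires {required}")

    max_pages = run_input.get("maxPages", 5)  # documented default is 5
    if not 1 <= max_pages <= 50:
        problems.append("maxPages must be between 1 and 50")
    return problems


print(check_run_input({"operation": "shopCatalog", "maxPages": 20}))
```

Running a check like this first avoids paying for a run that would fail on a missing `itemId`, `keyword`, or `userId`.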

#### 💡 How to find IDs

- **Item ID** — in any Taobao / Tmall product URL after `id=`
  `https://item.taobao.com/item.htm?id=744983869996` → `744983869996`
  Every search and shop-catalog row also returns `itemId`.
- **Shop / Seller ID** — returned as `shopId` in every keyword-search or shop-catalog row. The simplest chain: run Keyword Search for a brand first, note the top `shopId`, then feed that ID into Shop Catalog.
- **Keywords** — no special formatting. Chinese, English, or mixed queries all work.
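
If you start from a list of product URLs rather than IDs, the `id=` lookup described above can be automated. This is a small sketch using only the Python standard library; the function name `item_id_from_url` is hypothetical, and it assumes the ID is always carried in the `id` query parameter as in the example URL.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def item_id_from_url(url: str) -> Optional[str]:
    """Return the numeric `id` query parameter of a product URL, or None."""
    values = parse_qs(urlparse(url).query).get("id", [])
    if values and values[0].isdigit():
        return values[0]
    return None


print(item_id_from_url("https://item.taobao.com/item.htm?id=744983869996"))
```

The returned string can be passed directly as `itemId` for the Product Detail or Product Reviews operations.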

***

### 📤 Output

Each run writes to an Apify dataset with up to **79 structured fields** per row. Every row carries an `_operation` discriminator so you can filter mixed datasets.

#### Core fields (all operations)

| Field | Type | Description |
|---|---|---|
| `_operation` | string | Which operation produced this row |
| `_fetchedAt` | string | ISO-8601 timestamp |
| `itemId` | string | Stable product ID |
| `title` | string | Full product title |
| `priceYuan` | number | Current price in CNY |
| `imageUrl` | string | Primary image URL |
| `shopId` | string | Seller ID (pivot into Shop Catalog) |
| `shopName` | string | Seller display name |
| `status` | string | `success` / `error` |
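
When rows from several runs end up in one place, the `_operation` and `status` fields are enough to split them again. A minimal sketch, assuming rows are plain dictionaries as returned by the dataset API; the helper name `split_by_operation` and the sample rows are illustrative only.

```python
from collections import defaultdict


def split_by_operation(rows):
    """Group dataset rows by `_operation`, keeping only rows with status 'success'."""
    groups = defaultdict(list)
    for row in rows:
        if row.get("status") == "success":
            groups[row.get("_operation")].append(row)
    return dict(groups)


# Illustrative rows, not real output.
sample = [
    {"_operation": "keywordSearch", "itemId": "1", "status": "success"},
    {"_operation": "productReviews", "itemId": "1", "status": "success"},
    {"_operation": "keywordSearch", "itemId": "2", "status": "error"},
]
groups = split_by_operation(sample)
print({name: len(rows) for name, rows in groups.items()})
```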

#### Product Detail extras

`originalPriceYuan`, `promotionPriceYuan`, `discountPct`, `priceRange`, `skus[]` (variant list with prices + stock + swatch image), `skuCount`, `quantityInStock`, `images[]`, `descImages[]`, `videoUrl`, `videoCoverUrl`, `properties[]`, `itemDescHtml`, `creativeText`, `couponInfo`, `couponUrl`, `freeShipping`, `deliveryFee`, `qna[]`, `tags[]`, `categoryId`, `rootCategoryId`, `location`

#### Keyword Search extras

`titleEn` (English title), `subTitle`, `discntPriceYuan`, `commentCount`, `itemGradeAvg` (product rating), `sellerLevel`, `sellerGoodRate`, `sellerLoc`, `userType`, `tags[]`, `_sourceKeyword`, `_page`

#### Shop Catalog extras

`promotionPrice`, `finalPromotionPrice`, `reservePrice`, `minDiscountPrice`, `priceAfterCoupon`, `commissionAmount`, `commissionRate`, `payRate30Days`, `dailySellCount`, `provcity`, `levelOneCategoryName`, `_sourceUserId`

#### Product Review extras

`reviewId`, `reviewDate`, `reviewContent`, `reviewAppend`, `reviewAppendDays`, `reviewRatingStars`, `reviewTag`, `reviewPhotos[]`, `reviewVideoUrl`, `reviewSkuLabel`, `reviewBuyAmount`, `reviewUsefulCount`, `reviewerNick`, `reviewerAvatar`, `reviewerVipLevel`, `reviewerAnonymous`, `_sourceItemId`, `_page`

#### Example — Product Detail row (abridged)

```json
{
  "_operation": "productDetail",
  "_fetchedAt": "2026-04-20T11:34:51Z",
  "itemId": "744983869996",
  "title": "绿联转换插头英标港版Switch2中国香港地区马来西亚新加坡ns通用",
  "priceYuan": 32.9,
  "priceRange": "32.9 - 211",
  "promotionPriceYuan": 20.9,
  "discountPct": 36.5,
  "imageUrl": "https://img.alicdn.com/bao/uploaded/i4/...",
  "imageUrls": ["...", "...", "...", "...", "..."],
  "videoUrl": null,
  "skuCount": 9,
  "skus": [
    { "skuId": "5606370363622", "propPath": "1627207:28562459650", "price": "32.9", "promotionPrice": "32.9", "quantity": "200", "imageUrl": "..." }
  ],
  "shopId": "67095450",
  "shopName": "绿联数码旗舰店",
  "userType": "C",
  "freeShipping": true,
  "deliveryFee": "0.00",
  "categoryId": "50025386",
  "tags": ["官方旗舰", "品牌商家"],
  "status": "success"
}
```
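
In the abridged row above, the SKU `price`, `promotionPrice`, and `quantity` values are strings, so they need converting before comparison. The sketch below picks the cheapest variant that is still in stock. It assumes every SKU follows the shape shown in the example; the function name `cheapest_in_stock_sku` and the second and third sample SKUs are hypothetical.

```python
def cheapest_in_stock_sku(row):
    """Return the in-stock SKU with the lowest effective price, or None."""
    candidates = []
    for sku in row.get("skus") or []:
        if int(sku.get("quantity") or 0) <= 0:
            continue  # skip variants with no stock
        # Prefer the promotion price when present, else the list price.
        price = float(sku.get("promotionPrice") or sku["price"])
        candidates.append((price, sku))
    if not candidates:
        return None
    return min(candidates, key=lambda pair: pair[0])[1]


row = {
    "skus": [
        {"skuId": "5606370363622", "price": "32.9", "promotionPrice": "32.9", "quantity": "200"},
        {"skuId": "sku-2", "price": "45.0", "promotionPrice": "29.9", "quantity": "0"},   # hypothetical
        {"skuId": "sku-3", "price": "39.0", "promotionPrice": "30.5", "quantity": "12"},  # hypothetical
    ]
}
print(cheapest_in_stock_sku(row)["skuId"])
```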

#### Three built-in views

- **Overview** — all rows, mixed across operations, 18 most-useful columns
- **Products** — only product rows (search / catalog / detail), hides review-only fields
- **Reviews** — only review rows, reviewer info + photos + variant purchased

Switch views in the Apify dataset UI — no re-running required.

***

### 💼 Use Cases & Examples

#### 1. Dropshipping Product Research

**Sourcing specialists and dropshipping operators finding hot products with healthy margins.**

**Input:** Keyword Search for a product category (e.g. `wireless earphones`, price 100–500 CNY, Tmall only)
**Output:** Ranked list of 100+ products with prices, seller ratings, review counts, shop locations
**Use:** Sort by `itemGradeAvg` × `commentCount` to find validated best-sellers. Chain into Shop Catalog to map the winning seller's full lineup.
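
The `itemGradeAvg` × `commentCount` ranking can be sketched in a few lines. This assumes both fields arrive as numbers and treats missing values as zero; the helper name `rank_by_validation` and the sample rows are illustrative.

```python
def rank_by_validation(rows, top_n=10):
    """Sort rows by itemGradeAvg * commentCount, highest first."""
    def score(row):
        return (row.get("itemGradeAvg") or 0) * (row.get("commentCount") or 0)

    return sorted(rows, key=score, reverse=True)[:top_n]


# Illustrative rows, not real output.
rows = [
    {"itemId": "a", "itemGradeAvg": 4.9, "commentCount": 120},
    {"itemId": "b", "itemGradeAvg": 4.5, "commentCount": 3000},
    {"itemId": "c", "itemGradeAvg": 5.0, "commentCount": None},
]
print([r["itemId"] for r in rank_by_validation(rows)])
```

A product with a perfect rating but no reviews scores zero here, which is the point of multiplying the two signals.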

#### 2. Competitor Price Monitoring

**E-commerce teams tracking competitor prices and promotions across Taobao and Tmall.**

**Input:** Shop Catalog for each competitor's `shopId`, scheduled daily
**Output:** Full catalog with `price`, `promotionPrice`, `finalPromotionPrice`, `minDiscountPrice` per SKU
**Use:** Detect price drops, new promotions, and inventory additions. Feed into a BI dashboard with `_fetchedAt` as the time axis.
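
One way to detect price drops is to compare two scheduled snapshots by `itemId`. The sketch below is an assumption-laden example, not the Actor's own logic: it compares the core `priceYuan` field, uses a hypothetical helper name `price_drops`, and applies an arbitrary 5% threshold.

```python
def price_drops(previous_rows, current_rows, min_drop_pct=5.0):
    """Return items whose priceYuan fell by at least min_drop_pct between snapshots."""
    previous = {r["itemId"]: r["priceYuan"] for r in previous_rows if r.get("priceYuan")}
    drops = []
    for row in current_rows:
        old = previous.get(row["itemId"])
        new = row.get("priceYuan")
        if old is None or new is None or new >= old:
            continue
        pct = (old - new) / old * 100
        if pct >= min_drop_pct:
            drops.append({"itemId": row["itemId"], "old": old, "new": new, "dropPct": round(pct, 1)})
    return drops


# Illustrative snapshots, not real output.
yesterday = [{"itemId": "a", "priceYuan": 100.0}, {"itemId": "b", "priceYuan": 50.0}]
today = [{"itemId": "a", "priceYuan": 80.0}, {"itemId": "b", "priceYuan": 49.0}]
print(price_drops(yesterday, today))
```

Items that appear only in the newer snapshot are ignored here; they could be reported separately as inventory additions.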

#### 3. Private-Label Sourcing

**Brand founders identifying white-label manufacturers and OEM partners.**

**Input:** Keyword Search for generic product terms (`unbranded bluetooth speaker`), filter by `userType` = factory/wholesaler
**Output:** Sellers with `provcity`, `sellerLevel`, `sellerGoodRate`, `dailySellCount` indicators
**Use:** Build a shortlist of credible factories. Deep-dive with Product Detail to check SKU variety and Q&A for reliability signals.

#### 4. Review Sentiment & Media Mining

**Marketing agencies pulling unboxing photos and authentic customer language for ad creative.**

**Input:** Product Reviews for your own or competitor item IDs
**Output:** Dataset of review text, photos, videos, ratings, and purchased variants
**Use:** Extract Chinese review language for UGC ad copy. Download review photos and videos for social proof content. Filter `reviewRatingStars` + `reviewTag` to find negative reviews that surface product flaws.
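
Filtering for low-rated reviews might look like the sketch below. The rating scale is not documented here, so the example assumes `reviewRatingStars` is a number on a 1 to 5 scale; the helper name `negative_reviews` and the threshold of 2 stars are hypothetical.

```python
def negative_reviews(rows, max_stars=2):
    """Return review rows rated at or below max_stars (assumes a 1-5 numeric scale)."""
    return [
        row
        for row in rows
        if row.get("_operation") == "productReviews"
        and row.get("reviewRatingStars") is not None
        and row["reviewRatingStars"] <= max_stars
    ]


# Illustrative rows, not real output.
reviews = [
    {"_operation": "productReviews", "reviewId": "r1", "reviewRatingStars": 5},
    {"_operation": "productReviews", "reviewId": "r2", "reviewRatingStars": 1},
    {"_operation": "productReviews", "reviewId": "r3", "reviewRatingStars": None},
    {"_operation": "keywordSearch", "itemId": "x"},
]
print([r["reviewId"] for r in negative_reviews(reviews)])
```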

#### 5. Category Intelligence Sweeps

**Market researchers mapping an entire product category across Taobao.**

**Input:** Keyword Search with broad queries, `maxPages: 50`
**Output:** Up to 500 products (10 per page) with pricing, seller distribution, category IDs
**Use:** Build a category map — count sellers per `categoryId`, compute price quartiles, identify gaps. Export to Tableau or Looker.
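
A category map of this kind can be computed with the standard library alone. The sketch below assumes each row carries `categoryId`, `priceYuan`, and `shopId`; the helper name `category_summary` is hypothetical, and quartiles use the default method of Python's `statistics.quantiles`.

```python
from collections import defaultdict
from statistics import quantiles


def category_summary(rows):
    """Per categoryId: distinct seller count and price quartiles."""
    prices = defaultdict(list)
    sellers = defaultdict(set)
    for row in rows:
        category = row.get("categoryId")
        if category is None or row.get("priceYuan") is None:
            continue
        prices[category].append(row["priceYuan"])
        sellers[category].add(row.get("shopId"))

    summary = {}
    for category, values in prices.items():
        # Fall back to the single value when there is only one data point.
        q = quantiles(values, n=4) if len(values) >= 2 else [values[0]] * 3
        summary[category] = {
            "sellers": len(sellers[category]),
            "q1": q[0],
            "median": q[1],
            "q3": q[2],
        }
    return summary


# Illustrative rows, not real output.
rows = [
    {"categoryId": "A", "priceYuan": p, "shopId": s}
    for p, s in [(10, "s1"), (20, "s1"), (30, "s2"), (40, "s2"), (50, "s1")]
]
summary = category_summary(rows)
print(summary["A"])
```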

#### 6. Brand Catalog Mapping

**Amazon sellers doing arbitrage research on Chinese brands expanding overseas.**

**Input:** Keyword Search for the Chinese brand, note top `shopId`, then Shop Catalog
**Output:** Complete SKU list with variant breakdowns
**Use:** Map which products the brand sells domestically vs exports. Spot candidates for Amazon FBA.

#### 7. BI & Data Warehouse Enrichment

**Data teams augmenting internal product databases with Taobao market data.**

**Input:** Product Detail lookups driven by internal item IDs
**Output:** Canonical Taobao fields joined into your existing product records
**Use:** Pipe into Snowflake / BigQuery / Postgres. Use `_fetchedAt` for slowly-changing-dimension logic. Join `shopId` for seller-level rollups.

***

### 🔀 Which Chinese E-commerce Scraper Is Right For You?

There are several scrapers on the Apify Store for Chinese marketplaces — each targets a different audience. Quick decision guide:

| Your use case | Right tool |
|---|---|
| 🛒 **B2C retail product research** (consumer prices, SKU variants, authentic reviews) | **🎯 This actor — Taobao & Tmall** |
| 🏭 **B2B wholesale sourcing** (find factories, bulk pricing, MOQ, Gold Supplier data) | `1688.com Products Scraper` · `Alibaba Supplier Scraper` |
| 🌍 **Cross-border retail for Western buyers** (English listings, shipping-ready) | `AliExpress Products Scraper` |
| ♻️ **Second-hand / used goods** | `Xianyu (Goofish) Listings Scraper` |
| 💼 **Dropshipping research needing retail prices + SKU variants + Chinese customer reviews** | **🎯 This actor — Taobao & Tmall** |

#### Why Taobao & Tmall specifically

- **Taobao** = China's #1 consumer marketplace — the equivalent of US Amazon for retail. **Lower prices**, SKU-level variants, authentic customer reviews with photos + videos.
- **Tmall** = Taobao's premium brand-store tier — the equivalent of Amazon Brand Registry. Use the `tmallOnly` filter when you need licensed sellers or official stores only.
- **Alibaba and 1688 = B2B wholesale.** Different audience (importers, resellers). Products listed in bulk MOQ, usually unbranded.
- **AliExpress = Alibaba's retail export site.** English-first, but a subset of Taobao inventory with **20–50% markup** and slower shipping. Scraping Taobao directly gets you the source-price data before markup.

This is **the only Apify Store actor focused specifically on Taobao + Tmall retail data** — and the only one bundling product detail, keyword search, shop catalog, and reviews into a single actor.

#### What we include that B2B scrapers skip

- ✅ **SKU-level variant data** — color / size pricing + per-variant stock + swatch images (B2B scrapers list bulk listings only)
- ✅ **Review photos and videos** — UGC from real Chinese customers for ad creative (B2B scrapers skip reviews entirely)
- ✅ **Taobao Union commission data** — `commissionAmount`, `commissionRate`, `maifanPromotionDiscount` in the shop catalog for dropshipping margin decisions
- ✅ **Tmall-only filter** — restrict results to licensed brand stores in one click
- ✅ **Machine-translated English titles** in search results for cross-language workflows
- ✅ **Four operations in one actor** — no need to subscribe to 3+ separate scrapers

#### Keyword translation between marketplaces

If you're used to Alibaba / 1688 vocabulary, here's how concepts map:

| Alibaba / 1688 term | Taobao equivalent |
|---|---|
| Gold Supplier | Tmall brand store (`userType = B`) |
| Verified Supplier | Taobao C-shop with high `sellerLevel` + `sellerGoodRate` |
| MOQ (Minimum Order Quantity) | Not applicable — Taobao is retail (MOQ = 1) |
| Factory Direct | N/A — Taobao sells to end consumers |
| Order count | `sellCount` (sales volume for the listing) |
| Supplier Response Rate | `fahuoDsr` (shipping DSR score) |

***

### 🔗 Integration Examples

#### JavaScript / Node.js

```javascript
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: 'YOUR_TOKEN' });

const run = await client.actor('sian.agency/taobao-tmall-product-scraper').call({
  operation: 'keywordSearch',
  keyword: 'iphone 15',
  tmallOnly: true,
  maxPages: 5,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(`Found ${items.length} products`);
items.forEach(p => console.log(`${p.title} — ¥${p.priceYuan} (${p.shopName})`));
```

#### Python

```python
from apify_client import ApifyClient

client = ApifyClient('YOUR_TOKEN')

run = client.actor('sian.agency/taobao-tmall-product-scraper').call(
    run_input={
        'operation': 'shopCatalog',
        'userId': '713464357',
        'maxPages': 10,
    }
)

for item in client.dataset(run['defaultDatasetId']).iterate_items():
    print(item['itemId'], item['title'], item.get('priceYuan'))
```

#### cURL

```bash
curl -X POST "https://api.apify.com/v2/acts/sian.agency~taobao-tmall-product-scraper/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "operation": "productReviews",
    "itemId": "742902854135",
    "maxPages": 5
  }'
```

#### Automation Workflows (n8n / Zapier / Make)

1. **Trigger**: Schedule (daily/hourly) or webhook on a product tracker
2. **Apify Run**: Call this actor with your chosen `operation` and inputs
3. **Process**: Parse the returned dataset — filter by `_operation`, pick the view you need
4. **Action**: Save to Google Sheets / Airtable / Postgres, alert on price drops, generate reports

***

### 📊 Performance & Pricing

#### 🎁 FREE Tier (Try It Now)

- **5 items** per run — full feature access across all four operations
- No credit card required
- Perfect for evaluating the output shape before scaling up

#### 💎 PAID Tier (Production Ready)

- **Unlimited** items per run — pull hundreds of products or thousands of reviews in a single run
- Pay-per-result: you're only charged for successful rows
- Three dataset views unlocked by default for BI integration

💰 **Best price on the market** — flat pay-per-result pricing with no monthly minimums or hidden fees.

🔗 [View current pricing](https://apify.com/sian.agency/taobao-tmall-product-scraper?fpr=sian)

***

### ❓ Frequently Asked Questions

**Q: How many products / reviews can I scrape in one run?**
A: FREE tier: 5 per run. PAID tier: unlimited, capped only by `maxPages` (up to 50). A single Keyword Search run with `maxPages: 50` returns up to 500 products; a single Shop Catalog run can return up to 1,500.

**Q: Do I need a Taobao account or any API key?**
A: No. No account, no API key, no setup. Paste an item ID or keyword and run.

**Q: What output formats are available?**
A: JSON, CSV, Excel, XML, HTML — exported directly from the Apify dataset UI or API.

**Q: Does it work with Taobao Mobile / Xianyu / 1688?**
A: This actor targets Taobao and Tmall. For 1688 and Xianyu, check our [SIÁN Agency Store](https://apify.com/sian.agency?fpr=sian) for dedicated actors.

**Q: How fresh is the data?**
A: Live — every run fetches fresh data at request time. Use `_fetchedAt` to track freshness in downstream pipelines.

**Q: Can I get English translations?**
A: Search results include a machine-translated `titleEn` field where available. Product titles, descriptions, and reviews are returned in their original Chinese — pair with a translation step downstream for full English output.

**Q: Is it legal to scrape Taobao?**
A: Yes — this actor only accesses publicly available product and seller data. See the [Legal](#️-is-it-legal-to-scrape-data) section below.

**Q: How long does a run take?**
A: Product Detail: ~3 seconds per item. Keyword Search: ~4 seconds per page. Shop Catalog: ~5 seconds per page (larger payloads). Product Reviews: ~3 seconds per page.
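
For rough planning, the approximate timings in the last answer can be turned into a back-of-the-envelope estimate. This is only a sketch: the figures are the approximate values quoted above, the helper name `estimate_seconds` is hypothetical, and real runs will vary with proxy and network conditions.

```python
# Approximate seconds per item (productDetail) or per page (other operations),
# copied from the FAQ answer above.
SECONDS_PER_UNIT = {
    "productDetail": 3,
    "keywordSearch": 4,
    "shopCatalog": 5,
    "productReviews": 3,
}


def estimate_seconds(operation: str, units: int = 1) -> int:
    """Rough run time: units are items for productDetail, pages otherwise."""
    return SECONDS_PER_UNIT[operation] * units


print(estimate_seconds("keywordSearch", 50))
```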

***

### 🐛 Troubleshooting

**Empty dataset for Keyword Search**

- The query may be too narrow — try a broader term or remove price filters.
- Set `tmallOnly: false` to include Taobao C2C sellers.

**`status: error` with "item not found"**

- The `itemId` has been delisted or never existed. Verify by opening `https://item.taobao.com/item.htm?id=<itemId>` in a browser.

**Shop Catalog returns fewer items than expected**

- The seller may have a smaller catalog than `maxPages × 30`. The actor stops early when no more pages are available.
- Confirm the `userId` is the shop / seller ID, not a product ID. It's the number you see as `shopId` in search results.

**Fields are `null` for a product I know exists**

- Some fields (e.g. `brandName`, `categoryName`) are only populated for certain product categories or by Tmall brand stores.
- Try switching `detailVersion` from `v5` to `v1` for richer detail.

***

### ⚖️ Is it legal to scrape data?

Our actors are ethical and do not extract any private user data, such as email addresses, gender, or location. They only extract what the user has chosen to share publicly. We therefore believe that our actors, when used for ethical purposes by Apify users, are safe.

However, you should be aware that your results could contain personal data (for example, reviewer nicknames or avatar URLs). Personal data is protected by the **GDPR** in the European Union and by other regulations around the world. You should not scrape personal data unless you have a legitimate reason to do so. If you're unsure whether your reason is legitimate, consult your lawyers.

You can also read Apify's blog post on the [legality of web scraping](https://blog.apify.com/is-web-scraping-legal/).

***

### 🤝 Support

[![Telegram Support](https://img.shields.io/badge/Telegram-Support%20Group-0088cc?logo=telegram)](https://t.me/+vyh1sRE08sAxMGRi)

**Join our active support community**

- For issues or feature requests, open an issue in the actor's repository
- Check [SIÁN Agency Store](https://apify.com/sian.agency?fpr=sian) for more automation tools — Instagram, TikTok, LinkedIn, YouTube scrapers and more
- 📧 <hello@sian-agency.online>

***

**Built by [SIÁN Agency](https://www.sian-agency.online)** | **[More Tools](https://apify.com/sian.agency?fpr=sian)**

# Actor input Schema

## `operation` (type: `string`):

🎯 **PICK ONE OPERATION PER RUN.** Each run produces one clean dataset matching the chosen mode.

- **🛍️ Product Detail** — deep scrape of a single product by item ID (title, images, SKUs, pricing, seller, description, Q&A, coupons)
- **🔍 Keyword Search** — search Taobao / Tmall by keyword, paginate through results
- **🏪 Shop Catalog** — pull the full product catalog for a seller (by numeric user / seller ID)
- **💬 Product Reviews** — paginate customer reviews for a product, including photos and variant info

💡 **TIP:** To combine operations, run the actor multiple times with different configurations.

## `itemId` (type: `string`):

🛍️ **Required for `Product Detail` and `Product Reviews` operations.**

The numeric Taobao/Tmall item ID. You can find it:

- In any Taobao product URL after `id=` (e.g. `https://item.taobao.com/item.htm?id=744983869996` → `744983869996`)
- In the `itemId` field of any search or shop-catalog result row

💡 **TIP:** For bulk product lookups, run the actor once per item ID, or use `Keyword Search` / `Shop Catalog` to build the list first.

⚠️ **Ignored** for Keyword Search and Shop Catalog operations.
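
Since Product Detail takes one item ID per run, bulk lookups need one input object per ID. A minimal sketch of building that list follows; the helper name `detail_inputs` is hypothetical, and the choice of `v5` as the default here is only an illustration for high-volume use, not the Actor's default.

```python
def detail_inputs(item_ids, detail_version="v5"):
    """Build one productDetail run input per unique item ID, preserving order."""
    seen = set()
    inputs = []
    for item_id in item_ids:
        if item_id in seen:
            continue  # avoid paying twice for the same item
        seen.add(item_id)
        inputs.append(
            {"operation": "productDetail", "itemId": item_id, "detailVersion": detail_version}
        )
    return inputs


batch = detail_inputs(["744983869996", "742902854135", "744983869996"])
print(len(batch))
```

Each element of the resulting list can then be passed as the run input to a separate Actor call.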

## `detailVersion` (type: `string`):

🛍️ **Applies to `Product Detail` only.**

- **Rich (v1)** — full payload: SKUs, pricing tiers, Q&A, item description HTML, coupons, shipping, return policy (~38 KB per item, slower)
- **Lite (v5)** — normalized compact response: item + delivery + seller (~7 KB per item, faster)

Rich is the default and suits most users. Switch to Lite if you only need basic fields or are processing high volumes on a tight budget.

## `keyword` (type: `string`):

🔍 **Required for `Keyword Search` operation.**

Any Taobao / Tmall search query. Supports Chinese, English, and mixed queries:

- `iphone 15`
- `无线耳机` (wireless earphones)
- `sony camera a7`
- `running shoes men`

💡 **TIP:** More specific queries return higher-quality results. Use `Tmall only` + price range to filter to brand merchants.

## `startPrice` (type: `integer`):

💰 Minimum price in Chinese Yuan (integer). Leave blank for no minimum. Applies to `Keyword Search` only.

## `endPrice` (type: `integer`):

💰 Maximum price in Chinese Yuan (integer). Leave blank for no maximum. Applies to `Keyword Search` only.

## `tmallOnly` (type: `boolean`):

🏬 When enabled, search results are restricted to Tmall (brand / official stores). Default: false (searches both Taobao and Tmall).

## `userId` (type: `string`):

🏪 **Required for `Shop Catalog` operation.**

The numeric `userId` (aka seller ID) of the Taobao / Tmall shop. You can find it:

- In the `shopId` or `sellerId` field of any search result row (copy it from a prior run)
- In a seller's shop URL query string

💡 **TIP:** For a brand's full catalog, first run a Keyword Search for the brand name, note the top `shopId`, then run Shop Catalog with that ID.

⚠️ **Ignored** for Product Detail, Keyword Search, and Product Reviews.
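
The search-then-catalog chain from the tip can be sketched as a small helper that turns Keyword Search rows into a Shop Catalog input. "Top `shopId`" is interpreted here as the shop that appears most often in the search rows, which is an assumption; the helper name `shop_catalog_input` is hypothetical.

```python
from collections import Counter


def shop_catalog_input(search_rows, max_pages=5):
    """Pick the most frequent shopId in search rows and build a shopCatalog input."""
    counts = Counter(row["shopId"] for row in search_rows if row.get("shopId"))
    if not counts:
        return None
    top_shop_id, _ = counts.most_common(1)[0]
    return {"operation": "shopCatalog", "userId": top_shop_id, "maxPages": max_pages}


# Illustrative rows, not real output.
search_rows = [
    {"itemId": "1", "shopId": "67095450"},
    {"itemId": "2", "shopId": "713464357"},
    {"itemId": "3", "shopId": "67095450"},
]
print(shop_catalog_input(search_rows))
```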

## `maxPages` (type: `integer`):

📄 **Applies to paginated operations** (Keyword Search, Shop Catalog, Product Reviews). Ignored for Product Detail.

- **Keyword Search:** 10 items per page
- **Shop Catalog:** up to 30 items per page
- **Product Reviews:** 20 reviews per page

💡 **TIP:** Start small (1–3 pages) to preview results before scaling up.

⚠️ Hard cap: 50 pages to prevent runaway runs. If fewer pages are actually available, the actor stops early.
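
The page sizes above give a simple upper bound on the number of rows a run can return. The sketch below multiplies pages by page size and clamps to the 50-page cap; the helper name `max_rows` is hypothetical, and actual counts can be lower because the Actor stops early when pages run out.

```python
# Items per page as listed above; Shop Catalog is "up to" 30.
ITEMS_PER_PAGE = {"keywordSearch": 10, "shopCatalog": 30, "productReviews": 20}
MAX_PAGES = 50  # documented hard cap


def max_rows(operation: str, max_pages: int = 5) -> int:
    """Upper bound on rows for a paginated operation."""
    pages = min(max(max_pages, 1), MAX_PAGES)
    return ITEMS_PER_PAGE[operation] * pages


for name in ITEMS_PER_PAGE:
    print(name, max_rows(name, 50))
```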

## Actor input object example

```json
{
  "operation": "keywordSearch",
  "itemId": "744983869996",
  "detailVersion": "v1",
  "keyword": "iphone 15",
  "startPrice": 100,
  "endPrice": 5000,
  "tmallOnly": false,
  "userId": "713464357",
  "maxPages": 5
}
```

# Actor output Schema

## `taobaoResults` (type: `string`):

Products, search results, shop catalog entries, or customer reviews — one flat row per upstream item with full SKU variants, media URLs, engagement metrics, and reviewer details.

## `scrapingSummary` (type: `string`):

HTML report with run status, success/error row counts, success rate, pages fetched, duration, and the inputs used — written even on fatal crash.

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "operation": "keywordSearch",
    "itemId": "744983869996",
    "keyword": "iphone 15",
    "userId": "713464357"
};

// Run the Actor and wait for it to finish
const run = await client.actor("sian.agency/taobao-tmall-product-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "operation": "keywordSearch",
    "itemId": "744983869996",
    "keyword": "iphone 15",
    "userId": "713464357",
}

# Run the Actor and wait for it to finish
run = client.actor("sian.agency/taobao-tmall-product-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "operation": "keywordSearch",
  "itemId": "744983869996",
  "keyword": "iphone 15",
  "userId": "713464357"
}' |
apify call sian.agency/taobao-tmall-product-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=sian.agency/taobao-tmall-product-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Taobao Tmall Product Scraper",
        "description": "Extract Taobao & Tmall product data with clean structured output. Four scrapers in one — product details, keyword search, shop catalogs, customer reviews. Perfect for dropshippers, sourcing agents & e-commerce researchers.",
        "version": "1.0",
        "x-build-id": "Y2HamWTwO4Ibeatmw"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/sian.agency~taobao-tmall-product-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-sian.agency-taobao-tmall-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/sian.agency~taobao-tmall-product-scraper/runs": {
            "post": {
                "operationId": "runs-sync-sian.agency-taobao-tmall-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/sian.agency~taobao-tmall-product-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-sian.agency-taobao-tmall-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "operation"
                ],
                "properties": {
                    "operation": {
                        "title": "🎯 Operation — what do you want to scrape?",
                        "enum": [
                            "productDetail",
                            "keywordSearch",
                            "shopCatalog",
                            "productReviews"
                        ],
                        "type": "string",
                        "description": "🎯 **PICK ONE OPERATION PER RUN.** Each run produces one clean dataset matching the chosen mode.\n\n- **🛍️ Product Detail**: deep scrape of a single product by item ID (title, images, SKUs, pricing, seller, description, Q&A, coupons)\n- **🔍 Keyword Search**: search Taobao / Tmall by keyword, paginate through results\n- **🏪 Shop Catalog**: pull the full product catalog for a seller (by numeric user / seller ID)\n- **💬 Product Reviews**: paginate customer reviews for a product, including photos and variant info\n\n💡 **TIP:** To combine operations, run the Actor multiple times with different configurations.",
                        "default": "keywordSearch"
                    },
                    "itemId": {
                        "title": "🛍️ Item ID (Product Detail / Reviews)",
                        "type": "string",
                        "description": "🛍️ **Required for `Product Detail` and `Product Reviews` operations.**\n\nThe numeric Taobao/Tmall item ID. You can find it:\n- In any Taobao product URL after `id=` (e.g. `https://item.taobao.com/item.htm?id=744983869996` → `744983869996`)\n- In the `itemId` field of any search or shop-catalog result row\n\n💡 **TIP:** For bulk product lookups, run the Actor once per item ID, or use `Keyword Search` / `Shop Catalog` to build the list first.\n\n⚠️ **Ignored** for Keyword Search and Shop Catalog operations."
                    },
                    "detailVersion": {
                        "title": "🛍️ Detail Response Depth",
                        "enum": [
                            "v1",
                            "v5"
                        ],
                        "type": "string",
                        "description": "🛍️ **Applies to `Product Detail` only.**\n\n- **Rich (v1)** — full payload: SKUs, pricing tiers, Q&A, item description HTML, coupons, shipping, return policy (~38 KB per item, slower)\n- **Lite (v5)** — normalized compact response: item + delivery + seller (~7 KB per item, faster)\n\nDefault is Rich for most users. Switch to Lite if you only need basic fields or are processing high volumes on a tight budget.",
                        "default": "v1"
                    },
                    "keyword": {
                        "title": "🔍 Search Keyword",
                        "type": "string",
                        "description": "🔍 **Required for `Keyword Search` operation.**\n\nAny Taobao / Tmall search query. Supports Chinese, English, and mixed queries:\n- `iphone 15`\n- `无线耳机` (wireless earphones)\n- `sony camera a7`\n- `running shoes men`\n\n💡 **TIP:** More specific queries return higher-quality results. Use `Tmall only` + price range to filter to brand merchants."
                    },
                    "startPrice": {
                        "title": "💰 Min Price (CNY) — optional",
                        "minimum": 0,
                        "type": "integer",
                        "description": "💰 Minimum price in Chinese Yuan (integer). Leave blank for no minimum. Applies to `Keyword Search` only."
                    },
                    "endPrice": {
                        "title": "💰 Max Price (CNY) — optional",
                        "minimum": 0,
                        "type": "integer",
                        "description": "💰 Maximum price in Chinese Yuan (integer). Leave blank for no maximum. Applies to `Keyword Search` only."
                    },
                    "tmallOnly": {
                        "title": "🏬 Tmall only (filter out Taobao C2C sellers)",
                        "type": "boolean",
                        "description": "🏬 When enabled, search results are restricted to Tmall (brand / official stores). Default: false (searches both Taobao and Tmall).",
                        "default": false
                    },
                    "userId": {
                        "title": "🏪 Seller User ID",
                        "type": "string",
                        "description": "🏪 **Required for `Shop Catalog` operation.**\n\nThe numeric `userId` (aka seller ID) of the Taobao / Tmall shop. You can find it:\n- In the `shopId` or `sellerId` field of any search result row (copy it from a prior run)\n- In a seller's shop URL query string\n\n💡 **TIP:** For a brand's full catalog, first run a Keyword Search for the brand name, note the top `shopId`, then run Shop Catalog with that ID.\n\n⚠️ **Ignored** for Product Detail, Keyword Search, and Product Reviews."
                    },
                    "maxPages": {
                        "title": "📄 Max pages to fetch",
                        "minimum": 1,
                        "maximum": 50,
                        "type": "integer",
                        "description": "📄 **Applies to paginated operations** (Keyword Search, Shop Catalog, Product Reviews). Ignored for Product Detail.\n\n- **Keyword Search:** 10 items per page\n- **Shop Catalog:** up to 30 items per page\n- **Product Reviews:** 20 reviews per page\n\n💡 **TIP:** Start small (1–3 pages) to preview results before scaling up.\n\n⚠️ Hard cap: 50 pages to prevent runaway runs. If fewer pages are actually available, the Actor stops early.",
                        "default": 5
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
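
The `run-sync-get-dataset-items` path in the specification above can also be called without the Apify client. The sketch below only builds the request using Python's standard library and does not send it; `build_sync_request` is a hypothetical helper name, and the token is passed as the `token` query parameter as the specification describes.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/sian.agency~taobao-tmall-product-scraper"


def build_sync_request(token, run_input):
    """Build (but do not send) a POST to run-sync-get-dataset-items."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request(
    "<YOUR_API_TOKEN>",
    {"operation": "keywordSearch", "keyword": "iphone 15", "maxPages": 1},
)

# Sending is left commented out so the sketch has no side effects.
# Assumption: the response body is a JSON array of dataset items.
# with urllib.request.urlopen(request) as response:
#     items = json.loads(response.read())
```

Building the request separately from sending it makes the URL and body easy to inspect or log before any paid events are triggered.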
