# JD.com Product Scraper (`piotrv1001/jd-com-product-scraper`) Actor

The JD.com Product Scraper extracts structured product data from JD.com item pages, capturing product name, brand, category breadcrumb, vendor and shop IDs, shop name, and the full image gallery — ideal for price monitoring, catalog intelligence, and competitor research.

- **URL**: https://apify.com/piotrv1001/jd-com-product-scraper.md
- **Developed by:** [FalconScrape](https://apify.com/piotrv1001) (community)
- **Categories:** E-commerce, Automation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

from $2.00 / 1,000 products

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
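As a rough illustration of the per-event arithmetic, the sketch below estimates a lower bound for a run's cost from the listed starting price of $2.00 per 1,000 products. The helper name `estimate_min_cost_usd` is hypothetical, and the sketch assumes every scraped product is billed as exactly one event at that starting rate; the actual charge depends on the pricing shown for your account.

```python
from decimal import Decimal

# Starting price listed above: $2.00 per 1,000 products.
PRICE_PER_1000_PRODUCTS = Decimal("2.00")


def estimate_min_cost_usd(product_count: int) -> Decimal:
    """Lower-bound estimate, assuming one billed event per scraped product."""
    if product_count < 0:
        raise ValueError("product_count must be non-negative")
    cost = PRICE_PER_1000_PRODUCTS * product_count / 1000
    return cost.quantize(Decimal("0.01"))  # round to whole cents


print(estimate_min_cost_usd(25_000))  # prints 50.00
```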

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### 🛒 JD.com Product Scraper

Easily extract product data from **JD.com**, China's #2 e-commerce platform after Taobao / Tmall. The **JD.com Product Scraper** turns a list of SKU IDs or `item.jd.com` URLs into structured product records — ideal for price monitoring, catalog intelligence, brand tracking, and competitor research across the Chinese e-commerce market.

### ✨ Features

-   🏷️ **Rich Product Data**: Retrieve product name, brand, category breadcrumb, vendor / shop IDs, shop name, main image, and the full gallery for every SKU.
-   🔢 **SKU-First Input**: Just paste numeric SKU IDs (e.g. `100008348542`) or full `item.jd.com/{sku}.html` URLs — duplicates are merged automatically.
-   🛡️ **Anti-Bot Handled For You**: JD's verification interstitial is transparently retried so you get clean SSR data every time, with no proxy or API-key setup on your side.
-   🔗 **Cross-Platform Schema**: Output fields line up with the sibling `alibaba-listings-scraper` Actor, making JD ↔ Alibaba joins on brand or category trivial.
-   🏬 **POP Seller Intel**: Each record exposes `venderId` / `shopId` — perfect for building a JD third-party-seller directory.

### 🛠️ How It Works

1. **Enter SKUs or URLs** – Paste JD SKU IDs into `skuIds`, or full product URLs into `startUrls`. Both inputs are combined and de-duplicated.
2. **Run the Scraper** – Start the run from the Apify Console (or via the API). The Actor fetches each product page, parses the SSR HTML, and pushes one record per SKU to the dataset.
3. **Download the Data** – Export your results from the Storage → Dataset tab as JSON, CSV, Excel, HTML, RSS, or JSONL.

### 📥 Input

The Actor accepts two fields. At least one must be supplied.

| Field | Type | Required | Description |
|---|---|---|---|
| `skuIds` | string[] | One of | Numeric JD product SKU IDs. Each becomes `https://item.jd.com/{skuId}.html`. |
| `startUrls` | RequestList | One of | Direct `item.jd.com/{skuId}.html` URLs. Merged with `skuIds`; duplicates removed. |

Example input:

```json
{
    "skuIds": ["100008348542", "100015253059", "100012043978"],
    "startUrls": [
        { "url": "https://item.jd.com/100020891608.html" }
    ]
}
````
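To make the "combined and de-duplicated" behaviour concrete, here is a minimal Python sketch of how the two input fields could be merged into one list of canonical product URLs. It is not the Actor's actual code, which is not published here: `merge_inputs` is a hypothetical helper, and silently skipping malformed entries is a choice made for this sketch.

```python
import re

# URL shape taken from the input table: item.jd.com/{skuId}.html
_SKU_URL = re.compile(r"^https?://item\.jd\.com/([0-9]+)\.html")


def merge_inputs(sku_ids, start_urls):
    """Return unique canonical product URLs, preserving first-seen order."""
    skus = []
    for sku in sku_ids:
        sku = str(sku).strip()
        if re.fullmatch(r"[0-9]+", sku):  # keep numeric SKU IDs only
            skus.append(sku)
    for entry in start_urls:
        match = _SKU_URL.match(entry["url"].strip())
        if match:  # extract the SKU so URLs and IDs share one key
            skus.append(match.group(1))
    unique = dict.fromkeys(skus)  # dicts keep insertion order
    return [f"https://item.jd.com/{sku}.html" for sku in unique]


urls = merge_inputs(
    ["100008348542", "100015253059"],
    [{"url": "https://item.jd.com/100008348542.html"},  # duplicate of a SKU ID
     {"url": "https://item.jd.com/100020891608.html"}],
)
print(len(urls))  # prints 3
```

Reducing both inputs to the SKU before de-duplicating means a SKU supplied once as an ID and once as a URL is fetched only once.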

### 📊 Sample Output Data

The scraper provides structured JSON output with key product details. Example:

```json
[
    {
        "skuId": "100008348542",
        "url": "https://item.jd.com/100008348542.html",
        "title": "【AppleiPhone 11】Apple iPhone 11 (A2223) 128GB 黑色 移动联通电信4G手机 双卡双待【行情 报价 价格 评测】",
        "name": "Apple iPhone 11 (A2223) 128GB 黑色 移动联通电信4G手机 双卡双待",
        "brandId": "14026",
        "brandName": "Apple",
        "venderId": "1000000127",
        "shopId": "1000000127",
        "shopName": "Apple产品京东自营旗舰店",
        "categoryIds": [9987, 653, 655],
        "categoryNames": ["手机通讯", "手机", "手机"],
        "breadcrumb": ["手机通讯", "手机", "手机", "Apple", "AppleiPhone 11"],
        "mainImage": "https://img12.360buyimg.com/n12/jfs/t1/148767/39/18017/86358/5fd32ff0E5ca41721/d885f7c401dfa557.jpg",
        "images": [
            "https://img12.360buyimg.com/n12/jfs/t1/148767/.../d885f7c401dfa557.jpg",
            "https://img12.360buyimg.com/n12/jfs/t1/142574/.../d2d35afca393e566.jpg"
        ],
        "description": "【AppleiPhone 11】京东JD.COM提供AppleiPhone 11正品行货...",
        "price": null,
        "specs": {},
        "scrapedAt": "2026-05-16T12:00:00.000Z"
    },
    {
        "skuId": "100015253059",
        "url": "https://item.jd.com/100015253059.html",
        "title": "【富士X-T30 II】富士（FUJIFILM）X-T30 II/XT30 II 微单相机 套机（XC35F2 镜头) 银色【行情 报价 价格 评测】",
        "name": "富士（FUJIFILM）X-T30 II/XT30 II 微单相机 套机（XC35F2 镜头) 银色 2610万像素 18种胶片模拟 视频提升",
        "brandId": "7195",
        "brandName": "富士（FUJIFILM）",
        "venderId": "1000000858",
        "shopId": "1000000858",
        "shopName": "富士（FUJIFILM）京东自营旗舰店",
        "categoryIds": [652, 654, 5012],
        "categoryNames": ["数码", "摄影摄像", "微单相机"],
        "breadcrumb": ["数码", "摄影摄像", "微单相机", "富士（FUJIFILM）", "富士X-T30 II"],
        "mainImage": "https://img12.360buyimg.com/n12/jfs/t1/209321/.../4a72a36b3800c84e.jpg",
        "images": ["..."],
        "description": "...",
        "price": null,
        "specs": {},
        "scrapedAt": "2026-05-16T12:00:00.000Z"
    }
]
```
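Because each record carries `venderId`, `shopId`, and `shopName`, dataset items shaped like the sample above can be grouped into a simple seller directory. The following Python sketch shows one way to do that; `build_seller_directory` is a hypothetical helper, and `items` stands for a list of records you have already downloaded from the dataset.

```python
def build_seller_directory(items):
    """Group scraped records by shopId, collecting the SKUs seen per shop."""
    directory = {}
    for item in items:
        shop_id = item.get("shopId")
        if not shop_id:
            continue  # skip records without shop information
        entry = directory.setdefault(
            shop_id,
            {
                "shopName": item.get("shopName"),
                "venderId": item.get("venderId"),
                "skuIds": [],
            },
        )
        entry["skuIds"].append(item["skuId"])
    return directory


records = [
    {"skuId": "100008348542", "venderId": "1000000127", "shopId": "1000000127",
     "shopName": "Apple产品京东自营旗舰店"},
    {"skuId": "100015253059", "venderId": "1000000858", "shopId": "1000000858",
     "shopName": "富士（FUJIFILM）京东自营旗舰店"},
]
directory = build_seller_directory(records)
print(sorted(directory))  # shop IDs found in the records
```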

> **Note on `price` and `specs`.** JD.com loads both fields asynchronously, after the product page renders, through APIs that are geo-restricted to mainland China. The Actor therefore intentionally leaves them as `null` and `{}` in the output. Everything else (name, brand, category, vendor / shop, images) comes straight from the SSR HTML.
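Since `price` and `specs` are always empty, a downstream pipeline may prefer a flat table that omits them. The sketch below is one possible post-processing step in Python: the column selection, the `" > "` category separator, and the helper names `to_row` and `to_csv` are all assumptions made for illustration, not part of the Actor.

```python
import csv
import io

# Columns kept for the flat export; price and specs are left out because
# the note above says they are always null / {}.
COLUMNS = ["skuId", "name", "brandName", "shopName",
           "category", "mainImage", "imageCount"]


def to_row(item):
    """Flatten one dataset record into a dict with scalar values only."""
    return {
        "skuId": item["skuId"],
        "name": item.get("name", ""),
        "brandName": item.get("brandName", ""),
        "shopName": item.get("shopName", ""),
        "category": " > ".join(item.get("categoryNames", [])),
        "mainImage": item.get("mainImage", ""),
        "imageCount": len(item.get("images", [])),
    }


def to_csv(items):
    """Render records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(to_row(item) for item in items)
    return buffer.getvalue()
```

Calling `to_csv(items)` on the downloaded dataset items returns a string you can write to a file or load into a spreadsheet.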

Unlock JD.com's catalogue for your pricing, brand-monitoring, and marketplace-intelligence pipelines with **JD.com Product Scraper** today! 🚀

# Actor input Schema

## `skuIds` (type: `array`):

List of JD product SKU IDs to scrape (e.g. "100008348542"). Each SKU is fetched from https://item.jd.com/{skuId}.html.

## `startUrls` (type: `array`):

Direct JD product URLs (item.jd.com/{skuId}.html). Combined with `skuIds`; duplicates are removed.

## Actor input object example

```json
{
  "skuIds": [
    "100008348542"
  ],
  "startUrls": []
}
```
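Before starting a run, it can help to check an input object against the rules described in this document: at least one of the two fields is supplied, SKU IDs are numeric strings, and start URLs point at `item.jd.com`. The Python sketch below is a client-side check only; `validate_input` is a hypothetical helper, and the Actor may apply its own, different validation.

```python
import re
from urllib.parse import urlparse


def validate_input(actor_input):
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    sku_ids = actor_input.get("skuIds") or []
    start_urls = actor_input.get("startUrls") or []
    if not sku_ids and not start_urls:
        problems.append("Provide at least one of skuIds or startUrls.")
    for sku in sku_ids:
        if not isinstance(sku, str) or not re.fullmatch(r"[0-9]+", sku):
            problems.append(f"skuIds entry is not a numeric string: {sku!r}")
    for entry in start_urls:
        url = entry.get("url", "") if isinstance(entry, dict) else ""
        if urlparse(url).hostname != "item.jd.com":
            problems.append(f"startUrls entry is not an item.jd.com URL: {url!r}")
    return problems


print(validate_input({"skuIds": ["100008348542"], "startUrls": []}))  # prints []
```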

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "skuIds": [
        "100008348542"
    ],
    "startUrls": []
};

// Run the Actor and wait for it to finish
const run = await client.actor("piotrv1001/jd-com-product-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "skuIds": ["100008348542"],
    "startUrls": [],
}

# Run the Actor and wait for it to finish
run = client.actor("piotrv1001/jd-com-product-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
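In a production integration you would typically read the token from the environment and fail loudly when a run does not succeed. The sketch below wraps the same client calls as the example above in a reusable function. The names `scrape_products` and `main`, and the `APIFY_TOKEN` environment variable, are assumptions made for this sketch; it also assumes `call()` returns the run as a dictionary (or `None`), as in the example above.

```python
import os

ACTOR_ID = "piotrv1001/jd-com-product-scraper"


def scrape_products(client, sku_ids):
    """Run the Actor and return its dataset items; raise if the run failed."""
    run = client.actor(ACTOR_ID).call(
        run_input={"skuIds": sku_ids, "startUrls": []}
    )
    status = None if run is None else run.get("status")
    if status != "SUCCEEDED":
        raise RuntimeError(f"Actor run did not succeed (status: {status})")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main():
    # Imported here so the helper above can be used with any compatible client.
    from apify_client import ApifyClient  # pip install apify-client

    # APIFY_TOKEN is an assumed variable name, not something the Actor requires.
    client = ApifyClient(os.environ["APIFY_TOKEN"])
    for product in scrape_products(client, ["100008348542"]):
        print(product["skuId"], product.get("name"))
```

Call `main()` from your own entry point once `APIFY_TOKEN` is set. Passing the client in as a parameter keeps `scrape_products` easy to test with a stub.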

## CLI example

```bash
echo '{
  "skuIds": [
    "100008348542"
  ],
  "startUrls": []
}' |
apify call piotrv1001/jd-com-product-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=piotrv1001/jd-com-product-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "JD.com Product Scraper",
        "description": "The JD.com Product Scraper extracts structured product data from JD.com item pages, capturing product name, brand, category breadcrumb, vendor and shop IDs, shop name, and the full image gallery — ideal for price monitoring, catalog intelligence, and competitor research.",
        "version": "0.0",
        "x-build-id": "UcKVq0OiGBSiH3K9t"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/piotrv1001~jd-com-product-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-piotrv1001-jd-com-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/piotrv1001~jd-com-product-scraper/runs": {
            "post": {
                "operationId": "runs-sync-piotrv1001-jd-com-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/piotrv1001~jd-com-product-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-piotrv1001-jd-com-product-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "skuIds": {
                        "title": "JD SKU IDs",
                        "type": "array",
                        "description": "List of JD product SKU IDs to scrape (e.g. \"100008348542\"). Each SKU is fetched from https://item.jd.com/{skuId}.html.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Direct JD product URLs (item.jd.com/{skuId}.html). Combined with `skuIds`; duplicates are de-duplicated.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
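For projects that do not use a client library, the `run-sync-get-dataset-items` path in the specification above can be called directly. The following Python sketch uses only the standard library. The helper names `build_sync_request` and `fetch_items` are hypothetical, the 300-second timeout is an arbitrary choice, and the sketch assumes the endpoint returns the dataset items as a JSON array.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"  # from the "servers" entry in the spec
ACTOR_PATH = "piotrv1001~jd-com-product-scraper"


def build_sync_request(token, actor_input):
    """Build the POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})  # token is a query parameter
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_items(token, actor_input, timeout=300):
    """Send the request and decode the response body as JSON."""
    request = build_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `fetch_items(token, {"skuIds": ["100008348542"], "startUrls": []})` blocks until the run finishes, so the asynchronous `/runs` path may suit long SKU lists better.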
