[DEPRECATED v0.7] JD.com Scraper - Price Endpoint Blocked
Status: Deprecated
Pricing: from $8.00 / 1,000 product details extracted
DEPRECATED: JD.com pricing endpoints are blocked at the proxy-infrastructure level on the Apify residential pool. The Actor returns enrichment fields only (brand, title, category, images, stock) and does NOT charge; realtimePrice is universally null. Do not subscribe. A v1.0 release with a premium proxy is on the roadmap.
Rating: 0.0 (0)
Developer: Sami
Maintained by: Community
Actor stats: 0 bookmarked, 4 total users, 2 monthly active users
Last modified: 17 days ago
[DEPRECATED v0.7] JD.com Scraper — Price Endpoint Blocked
⚠️ DEPRECATED — Do not subscribe
JD.com's pricing endpoints (`p.3.cn`, `c0.3.cn`, `item-soa.jd.com`, `cd.jd.com/promotion`, `union-pim.jd.com`) are all blocked at the proxy-infrastructure level on Apify residential CN. Two distinct block patterns were verified across five endpoints and four proxy geographies (CN / HK / SG / no-country):
- `*.jd.com` price aggregators return HTML error pages instead of JSON (WAF intercept)
- `*.3.cn` hosts return `curl (56) CONNECT tunnel failed, response 590` (proxy refuses to tunnel)
As a result, `realtimePrice` is universally `null` on this Actor today, and the PPE gate (added in v0.6.5) refuses to charge for any record that lacks a price. The Actor returns the enrichment fields (brand, title, category, images, stock, JD-self-run flag, service tags) for diagnostic visibility but never bills.
Subscribing to this Actor today will give you $0 in charges but also $0 in usable price data. If you need JD product pricing data, contact the author about funding a Bright Data / Oxylabs / Soax premium residential pool; a v1.0 release with that integration is on the roadmap.
Existing Saved Tasks continue to function (they receive the diagnostic records, not billing).
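The first block pattern above (HTML where JSON was expected) can be detected from the response body alone. The following is a minimal sketch, not the Actor's actual code; `classify_price_response` and its labels are hypothetical names chosen for illustration. The second pattern (the 590 tunnel failure) happens before any body is received, so it would surface as a connection error rather than as input to this function.

```python
import json


def classify_price_response(body: str) -> str:
    """Label a raw price-endpoint response body (hypothetical helper).

    Returns "json" when the body parses as a JSON object or array,
    "html_block" when it looks like an HTML page (the WAF-intercept pattern),
    and "unparseable" for anything else, including an empty body.
    """
    stripped = body.lstrip()
    if stripped[:1] == "<":
        return "html_block"
    try:
        parsed = json.loads(stripped)
    except json.JSONDecodeError:
        return "unparseable"
    return "json" if isinstance(parsed, (dict, list)) else "unparseable"
```

A caller could treat anything other than `"json"` as a blocked endpoint and leave `realtimePrice` as `null`.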
Extract enrichment fields from JD.com (京东 / Jingdong) product pages. Currently shipped fields (when not blocked): productTitle, brandName (3-layer fallback), categoryPath, images, isJdSelfRun, stockStatus, serviceTags, sellerId. Real-time price extraction is currently non-functional — see deprecation warning above.
Part of the Chinese Digital Intelligence Suite by zhorex — pairs with the Chinese Brand Monitor (cross-platform brand mention aggregator across Weibo + RedNote + Bilibili + Douban + Xueqiu, $0.045/mention) and the individual Weibo, RedNote, Xueqiu, Douban scrapers for full-stack China data coverage.
What you get per product
For each JD product URL or SKU ID you submit, one record with:
- Identifiers: `productId`, `productUrl`, `brandName`, `categoryPath` (full breadcrumb)
- Specs: full JD specs panel as a `{key: value}` dict (商品名称, 商品编号, 上架时间, dimensions, color options…)
- Images: `primaryImageUrl` plus full `descriptionImages` array for catalog ingestion
- Pricing: `realtimePrice` queried fresh at scrape time (not stale page-load price)
- Seller signal: `sellerId`, `isJdSelfRun` flag (true = JD's own warehouse + warranty + return logistics; false = third-party merchant)
- Stock: `stockStatus` enum (`in_stock` / `low_stock` / `out_of_stock`), best-effort `stockCount`
- Service tags: JD's protection options decoded to human-readable English (`7-day return`, `JD Tianhua (warranty)`, `JD Plus exclusive`, `cash on delivery`, etc.)
- Origin: `shippingCity` (defaults to Beijing, JD's central warehouse region)
- Timestamp: `scrapedAt`, UTC ISO 8601
Why this Actor, not a generic e-commerce scraper
- `isJdSelfRun` flag: JD's hybrid model means each SKU is either fulfilled by JD itself (own logistics, warranty, return path) or by a third-party merchant on the marketplace. Generic scrapers don't distinguish; this one surfaces the flag on every record.
- Service tag decoding: JD's protection options arrive as opaque numeric codes; this Actor maps them to human-readable English so downstream pipelines don't have to maintain the code table.
- Real-time price: `realtimePrice` is fetched fresh at scrape time, not parsed from cached HTML, so it captures flash-discount cycles that move within hours.
Example input
{
  "mode": "product_detail",
  "productUrls": [
    "https://item.jd.com/100037053980.html",
    "100066898260"
  ]
}
The Actor accepts a mix of full URLs and bare SKU IDs. Duplicates are removed automatically.
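As an illustration of how mixed input like this could be normalized, here is a minimal sketch that reduces URLs and bare SKU IDs to a de-duplicated list of SKU IDs. It is an assumption about the behavior, not the Actor's own implementation; `normalize_product_ids` and the URL pattern are hypothetical and only cover the `item.jd.com/<digits>.html` form shown in the example.

```python
import re

# Assumed URL shape, taken from the example input above.
_SKU_IN_URL = re.compile(r"item\.jd\.com/(\d+)\.html")


def normalize_product_ids(entries: list[str]) -> list[str]:
    """Map URLs and bare SKU IDs to unique SKU IDs, keeping first-seen order."""
    seen: set[str] = set()
    result: list[str] = []
    for entry in entries:
        text = entry.strip()
        match = _SKU_IN_URL.search(text)
        if match:
            sku = match.group(1)
        elif text.isdigit():
            sku = text
        else:
            continue  # neither a recognised URL nor a numeric SKU ID
        if sku not in seen:
            seen.add(sku)
            result.append(sku)
    return result
```

With this sketch, a URL and a bare ID that point at the same SKU collapse into one entry.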
Example output (truncated)
{
  "mode": "product_detail",
  "productId": "100037053980",
  "productTitle": "得力(deli) 6018 剪刀剪子剪纸裁剪 办公文具用品",
  "brandName": "得力",
  "categoryPath": ["文教文化用品", "切屑文具", "办公/学生剪刀"],
  "specs": {"商品名称": "得力 6018", "商品编号": "100037053980"},
  "realtimePrice": "12.90",
  "priceCurrency": "CNY",
  "priceSource": "item-soa",
  "sellerId": "0",
  "isJdSelfRun": true,
  "stockStatus": "in_stock",
  "serviceTags": ["7-day return", "JD Tianhua (warranty)"],
  "primaryImageUrl": "https://img12.360buyimg.com/n0/jfs/...",
  "descriptionImages": ["...", "..."],
  "shippingCity": "北京",
  "productUrl": "https://item.jd.com/100037053980.html",
  "scrapedAt": "2026-05-16T10:00:00+00:00"
}
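Note that `realtimePrice` is a string in the example and is `null` while the pricing endpoints are blocked, so downstream code should handle both cases. A minimal sketch of reading the two typed fields from one record follows; `parse_price` and `parse_scraped_at` are hypothetical helper names, and the field shapes are assumed from the example output above.

```python
from datetime import datetime
from decimal import Decimal
from typing import Optional


def parse_price(record: dict) -> Optional[Decimal]:
    """Return the price as a Decimal, or None when realtimePrice is null or absent."""
    raw = record.get("realtimePrice")
    return Decimal(raw) if raw is not None else None


def parse_scraped_at(record: dict) -> datetime:
    """Parse the ISO 8601 scrapedAt timestamp into a timezone-aware datetime."""
    return datetime.fromisoformat(record["scrapedAt"])
```

Using `Decimal` rather than `float` avoids rounding drift when prices are compared or summed across runs.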
Pricing (Pay-Per-Event)
| Event | Price |
|---|---|
| product-detail-scraped | $0.008 / product |
Realistic workflow costs
| Workflow | Volume | Cost |
|---|---|---|
| Daily price tracking on 200 SKUs (per month) | 6,000 details/mo | ~$48 / month |
| Competitor SKU enrichment (one-time batch) | 1,000 products | $8 |
| Hedge-fund-grade daily refresh, 5,000 SKUs | 150,000 details/mo | ~$1,200 / month |
| Catalog migration / one-time pull | 50,000 SKUs | $400 |
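The figures in this table follow from the per-event price multiplied by volume. A small sketch of that arithmetic, assuming a 30-day month (which is what the 200 SKUs to 6,000 details row implies); `monthly_cost` is a hypothetical helper, and proxy traffic is billed separately and not included.

```python
from decimal import Decimal

# Listed price of one product-detail-scraped event, in USD.
PRICE_PER_DETAIL = Decimal("0.008")


def monthly_cost(skus: int, runs_per_day: int = 1, days: int = 30) -> Decimal:
    """Estimated event cost in USD for refreshing `skus` products on a schedule."""
    return PRICE_PER_DETAIL * skus * runs_per_day * days


def batch_cost(products: int) -> Decimal:
    """Estimated event cost in USD for a one-time pull of `products` items."""
    return PRICE_PER_DETAIL * products
```

For example, `monthly_cost(200)` corresponds to the first row and `batch_cost(50_000)` to the last.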
Proxy
The Actor's input schema defaults to Apify residential proxy with apifyProxyCountry: "CN" — leave it on for production workflows. Apify residential proxy is billed separately from the per-event price (typically a few cents per MB transferred — see your Apify Billing → Proxy usage).
Most non-CN residential IPs also work because item.jd.com applies lighter rate-limiting than other JD subdomains. The default config is the safest bet but you can experiment.
Status note (v0.6.3)
JD's classic price endpoint p.3.cn/prices/mgets is currently rate-limited at the edge for most shared residential proxy pools (Apify included). v0.6.3 ships with a multi-endpoint price fallback — the Actor tries item-soa.jd.com/getWareBusiness first (a modern aggregator that also returns brand info), then c0.3.cn/stock, then the legacy p.3.cn. The first endpoint that returns parseable data wins; price + brand are filled together when possible.
If all three price hosts are blocked AND the page parser can't recover a brand name from the HTML / title, the record is pushed with an error: "missing_price_and_brand" field and no PPE event is charged for it. You always see the diagnostic in the dataset, you never pay for an empty record.
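The fallback order and the no-charge rule described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the Actor's source: the fetchers are injected callables (so no real endpoint is called here), and `first_price_result`, `build_record`, and `is_billable` are hypothetical names. The gate shown is the v0.6.3 rule from this section, where a record goes unbilled only when both price and brand are missing.

```python
from typing import Callable, Optional

# A fetcher takes a SKU and returns e.g. {"price": "12.90", "brand": "得力"},
# or None when the host is blocked or the response cannot be parsed.
Fetcher = Callable[[str], Optional[dict]]


def first_price_result(sku: str, fetchers: list[tuple[str, Fetcher]]) -> Optional[dict]:
    """Try each (source, fetcher) in order; the first one yielding a price wins."""
    for source, fetch in fetchers:
        try:
            data = fetch(sku)
        except Exception:
            data = None  # treat network or proxy failures like a blocked host
        if data and data.get("price"):
            return {**data, "priceSource": source}
    return None


def build_record(sku: str, page_brand: Optional[str], price_result: Optional[dict]) -> dict:
    """Merge endpoint data with the page-parsed brand and flag empty records."""
    found = price_result or {}
    brand = found.get("brand") or page_brand
    price = found.get("price")
    record = {"productId": sku, "brandName": brand, "realtimePrice": price}
    if price is None and brand is None:
        record["error"] = "missing_price_and_brand"
    return record


def is_billable(record: dict) -> bool:
    """A record carrying the error field is pushed but never charged."""
    return "error" not in record
```

Passing the fetchers in as a list keeps the ordering (`item-soa`, then `c0.3.cn`, then `p.3.cn`) in one place and makes the chain easy to test with stubs.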
Brand-name extraction has three fallback layers: the canonical #parameter-brand link, the parameter2 品牌 field, and a title-pattern heuristic (JD product titles wrap the brand inside the first (...) pair).
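A rough sketch of that layering, with the two HTML-derived values passed in as already-extracted strings so that no page selectors are guessed at. The function names and the regex are assumptions for illustration; the pattern accepts both ASCII and full-width parentheses, since JD titles such as the one in the example output use the full-width form.

```python
import re
from typing import Optional

# Text inside the first parenthesised pair, ASCII "()" or full-width "()".
_FIRST_PARENS = re.compile(r"[((]([^()()]+)[))]")


def brand_from_title(title: str) -> Optional[str]:
    """Layer 3: title-pattern heuristic based on the first parenthesised pair."""
    match = _FIRST_PARENS.search(title)
    return match.group(1).strip() if match else None


def pick_brand(link_brand: Optional[str], param_brand: Optional[str], title: str) -> Optional[str]:
    """Return the first non-empty value across the three layers, in priority order."""
    for candidate in (link_brand, param_brand):
        if candidate and candidate.strip():
            return candidate.strip()
    return brand_from_title(title)
```

The title heuristic is the weakest layer, which is why it only runs when both structured sources come back empty.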
What this Actor does NOT do (and why)
Earlier versions (v0.1–v0.5) shipped four modes: product_detail, seller_store, product_search, product_reviews. The latter three were removed in v0.6 after extensive testing because JD's WAF reliably blocks them on Apify's shared residential proxy pools, returning 系统繁忙 or silently redirecting to the JD homepage. We tested:
- curl_cffi with Chrome TLS impersonation — blocked
- Playwright with full Chromium + JS execution + primed cookies — blocked
- Four proxy geographies (CN / HK / SG / no-country) — all blocked
- Mobile API endpoints — return 403 (require JD mobile-app signing scheme)
The block is at the IP-reputation layer, not anything fixable client-side. Rather than ship modes that return zero items and accidentally charge buyers, v0.6 ships only the mode that works reliably.
If you specifically need search, reviews, or seller-store data from JD, contact the author about integrating a premium residential proxy pool (Bright Data / Oxylabs / Soax) with cleaner reputation against JD specifically. That work is parked as a roadmap item; one paying buyer would unlock it.
Saved tasks that still call the removed modes get a clean rejection message — no PPE charge.
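One way such a rejection could look is sketched below. The mode names come from this section; `check_mode` and the message wording are hypothetical and may not match what the Actor actually returns.

```python
SUPPORTED_MODES = {"product_detail"}
REMOVED_MODES = {"seller_store", "product_search", "product_reviews"}


def check_mode(mode: str) -> tuple[bool, str]:
    """Return (accepted, message); rejected runs would exit before any charge."""
    if mode in SUPPORTED_MODES:
        return True, ""
    if mode in REMOVED_MODES:
        return False, f"Mode '{mode}' was removed in v0.6; only 'product_detail' is available."
    return False, f"Unknown mode '{mode}'."
```

Checking the mode before any request is made is what keeps a rejected run free of PPE events.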
Use cases
- Competitor pricing intelligence: `realtimePrice` enables sub-hour competitor SKU price tracking. Pair with a daily-cron schedule via Apify's Saved Tasks.
- Catalog enrichment: pull every JD product from your brand's SKU list to refresh title, specs, current price, stock status for downstream BI / dashboards.
- JD-self-run vs marketplace audit: the `isJdSelfRun` flag identifies which of your SKUs are sold directly by JD (your authorized channel) vs by third-party merchants (potential gray-market / unauthorized resale).
- AI training data: JD product titles, specs, and category paths are clean labeled data for product-classification, product-matching, and Chinese commerce NLP tasks.
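For the self-run vs marketplace audit, a minimal sketch of splitting dataset records by the `isJdSelfRun` flag is shown here. `split_by_channel` is a hypothetical helper that assumes records shaped like the example output; records without the flag are counted as third-party in this sketch.

```python
def split_by_channel(records: list[dict]) -> tuple[list[str], list[str]]:
    """Return (self_run_ids, third_party_ids) based on the isJdSelfRun flag."""
    self_run: list[str] = []
    third_party: list[str] = []
    for record in records:
        target = self_run if record.get("isJdSelfRun") else third_party
        target.append(record["productId"])
    return self_run, third_party
```

The third-party list is the one to review for potential unauthorized resale.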
Part of the Chinese Digital Intelligence Suite
- 🆕 Chinese Brand Monitor — Cross-platform brand mention aggregator (Weibo + RedNote + Bilibili + Douban + Xueqiu in one normalized feed, sentiment + dedup, $0.045/mention). Pairs perfectly with this Actor: track your competitors' JD product pricing here, then monitor consumer sentiment about those brands across all 5 social platforms in one call.
- Weibo Scraper — public sentiment, hot search, KOL posts
- RedNote Scraper — lifestyle / consumer brand reviews
- Xueqiu Scraper — Chinese stock discussion & cashtag sentiment
- Douban Scraper — film / book / music reviews & ratings
Compliance posture
- Only public JD data — same content any anonymous browser visitor sees on item pages.
- No login bypass — does not attempt authenticated-only content.
- No personal data harvesting — only the seller-identification metadata JD itself displays publicly.
Buyers running this commercially are responsible for downstream compliance with their own jurisdiction's data laws.
Support
Found a bug? Need a field that's not extracted? Open an issue on the Actor page — typical turnaround 48 hours.
If this Actor saves you time, a 30-second review is the single biggest thing that helps — it brings the tool to other buyers and pays for continued maintenance.