Shein New Arrivals & Best-Sellers Trend Scraper
Pricing
Pay per event
Developer
BowTiedRaccoon
Maintained by Community
Scrape Shein category pages for new arrivals and best-sellers. Tracks rank, first_seen_date, rank_movement, pricing, and flash sale data per run. Designed for fashion trend forecasting and dropship research.
Powered by Bright Data Web Unlocker — the only proven path through Shein's Halo anti-bot system.
What you get
Each record represents one product's position in a Shein category listing at the time of the run:
| Field | Description |
|---|---|
| `goods_id` | Shein's internal numeric product ID |
| `product_name` | Display title from the listing page |
| `product_url` | Full canonical product detail URL |
| `image_url` | Primary listing thumbnail URL |
| `category` / `category_id` | Human-readable name + numeric `-c-<id>` extracted from the URL |
| `rank` | 1-based position in the listing |
| `first_seen_date` | ISO 8601 date this product first appeared in a run for this category + feed type |
| `rank_movement` | Position delta vs the previous run (positive = climbed, negative = fell, 0 = first seen or unchanged) |
| `sale_price` | Current sale price (numeric) |
| `retail_price` | Original retail price before discount |
| `discount_pct` | Discount percentage (0–100) |
| `currency` | ISO 4217 currency code (USD, GBP, EUR, …) |
| `is_flash_sale` | `true` if currently on a limited-time flash sale |
| `flash_sale_price` | Flash sale price if active, `null` otherwise |
| `flash_sale_end` | ISO 8601 end timestamp of the flash sale, `null` if not active |
| `region` | Storefront region derived from the category URL (us, uk, de, …) |
| `feed_type` | Which feed this product came from: `new_arrival` or `best_seller` |
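As an illustration of how these fields might be consumed downstream, the sketch below picks the products that climbed the most in one feed. It assumes the records have already been loaded as plain dictionaries with the field names from the table; `top_climbers` is a hypothetical helper, not part of the actor.

```python
from typing import Any


def top_climbers(
    records: list[dict[str, Any]], feed_type: str, n: int = 10
) -> list[dict[str, Any]]:
    """Return up to n records from one feed whose rank improved since the last run."""
    # rank_movement > 0 means the product climbed; 0 covers both
    # "first seen" and "unchanged", so those records are skipped here.
    climbed = [
        r for r in records
        if r["feed_type"] == feed_type and r["rank_movement"] > 0
    ]
    # Largest climb first; ties are broken by the better (lower) current rank.
    return sorted(climbed, key=lambda r: (-r["rank_movement"], r["rank"]))[:n]
```

Because `rank_movement` is 0 for both new and unchanged products, a consumer that wants to separate the two would also need to compare `first_seen_date` with the run date.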
How it works
Shein's Halo anti-bot system blocks Playwright and standard residential proxies. This actor uses Bright Data Web Unlocker, which handles TLS fingerprinting, browser challenge solving, and rotating exit nodes server-side — returning the full rendered HTML (~4–5 MB per category page) to the actor.
Product data is extracted from an anonymous JavaScript array embedded in the page. The actor applies a mandatory content gate: any response smaller than 400 KB, containing explicit challenge markers, or lacking goods_id signals is treated as a Halo challenge shell and retried (up to 3 times).
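The content gate described above could look roughly like the following sketch. The 400 KB threshold and the `goods_id` check come from the text; the marker string, the function names, the `fetch` callable, and the reading of "retried up to 3 times" as one initial attempt plus three retries are all assumptions made for illustration.

```python
from typing import Callable, Optional

MIN_BODY_BYTES = 400 * 1024  # 400 KB gate from the text (1 KB taken as 1024 bytes)
MAX_RETRIES = 3              # assumed to mean retries after the first attempt
# Placeholder only: the real challenge markers are not documented here.
CHALLENGE_MARKERS = ("<!-- hypothetical-challenge-marker -->",)


def looks_like_challenge_shell(html: str) -> bool:
    """Return True if the body should be treated as a Halo challenge shell."""
    if len(html.encode("utf-8")) < MIN_BODY_BYTES:
        return True  # too small to be a full category page
    if any(marker in html for marker in CHALLENGE_MARKERS):
        return True  # explicit challenge marker present
    return "goods_id" not in html  # no product data signal


def fetch_with_gate(fetch: Callable[[str], str], url: str) -> Optional[str]:
    """Call fetch(url) until a body passes the gate; None if every attempt fails."""
    for _ in range(1 + MAX_RETRIES):
        html = fetch(url)
        if not looks_like_challenge_shell(html):
            return html
    return None  # the caller would log a warning and skip this page
```

Returning `None` rather than raising matches the behaviour described later, where a page that exhausts its retries is skipped and the run continues.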
Trend state (first_seen_date, previous rank) is persisted to the run's Key-Value store so that consecutive runs compute accurate rank_movement deltas.
Inputs
Category URLs
Shein category page URLs to scrape. Must use the -c-<id>.html URL format:
https://us.shein.com/Women-Dresses-c-1727.html
https://us.shein.com/Women-Tops-c-1738.html
Leave this field empty to use the built-in default set: Women Dresses, Tops, Pants, and Sets on us.shein.com.
Do not add ?sort= manually. The sort direction is controlled by the Feed Type input.
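A minimal sketch of checking a category URL against these two rules before a run. The regular expression and the `parse_category_url` helper are illustrative assumptions, not the actor's own validation code.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Matches the "-c-<id>.html" suffix and captures the numeric category ID.
CATEGORY_PATH = re.compile(r"-c-(\d+)\.html$")


def parse_category_url(url: str) -> str:
    """Return the category_id from a Shein category URL, or raise ValueError."""
    parts = urlsplit(url)
    if "sort" in parse_qs(parts.query):
        raise ValueError("remove ?sort=; the Feed Type input controls sorting")
    match = CATEGORY_PATH.search(parts.path)
    if match is None:
        raise ValueError(f"not a -c-<id>.html category URL: {url}")
    return match.group(1)
```

For example, `parse_category_url("https://us.shein.com/Women-Dresses-c-1727.html")` returns `"1727"`, the value that appears as `category_id` in the output.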
Feed Type
Which trend feed to scrape:
- Both New Arrivals and Best Sellers (default): fetches each category URL twice, once with `sort=7` (newest) and once with `sort=8` (best-selling).
- New Arrivals only: sorted by newest additions (`sort=7`).
- Best Sellers only: sorted by sales volume / popularity (`sort=8`).
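The mapping from feed type to sort parameter can be sketched as below. The `sort=7` and `sort=8` values come from the list above; the `"both"` input value and the `feed_urls` helper are hypothetical names chosen for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Sort values as documented: 7 = newest, 8 = best-selling.
SORT_BY_FEED = {"new_arrival": 7, "best_seller": 8}


def feed_urls(category_url: str, feed_type: str = "both") -> dict[str, str]:
    """Return {feed_type: url} with the matching sort parameter applied."""
    if feed_type == "both":
        feeds = SORT_BY_FEED
    else:
        feeds = {feed_type: SORT_BY_FEED[feed_type]}
    parts = urlsplit(category_url)
    urls = {}
    for name, sort_value in feeds.items():
        # Drop any existing sort parameter, then append the one for this feed.
        query = [(k, v) for k, v in parse_qsl(parts.query) if k != "sort"]
        query.append(("sort", str(sort_value)))
        urls[name] = urlunsplit(parts._replace(query=urlencode(query)))
    return urls
```

With the default feed type, one category URL yields two fetch URLs, which is why a run on both feeds doubles the request count.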
Region
Shein storefront region for currency context. Automatically derived from the category URL domain if left unset.
| Value | Storefront |
|---|---|
| `us` | us.shein.com (USD) |
| `uk` | shein.co.uk (GBP) |
| `de` | shein.de (EUR) |
| `fr` | shein.fr (EUR) |
| `au` | shein.com.au (AUD) |
| `ca` | shein.ca (CAD) |
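Deriving the region from the category URL domain might look like the sketch below, which simply encodes the table as a lookup. The `region_for` helper and the stripping of a leading `www.` are assumptions; the actor's actual derivation logic is not shown in this document.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hostname -> (region, currency), copied from the storefront table.
STOREFRONTS = {
    "us.shein.com": ("us", "USD"),
    "shein.co.uk": ("uk", "GBP"),
    "shein.de": ("de", "EUR"),
    "shein.fr": ("fr", "EUR"),
    "shein.com.au": ("au", "AUD"),
    "shein.ca": ("ca", "CAD"),
}


def region_for(url: str) -> Optional[tuple[str, str]]:
    """Return (region, currency) for a storefront URL, or None if unknown."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):  # assumption: treat www.<host> like <host>
        host = host[len("www."):]
    return STOREFRONTS.get(host)
```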
Max Items
Maximum number of product records to emit across all pages and feeds combined. Set to 0 for no limit. Default: 100.
Example output
{
  "goods_id": "414605698",
  "product_name": "Fashionable Digital Printed Milk Silk Fabric, Cool Summer Maxi Dress",
  "product_url": "https://us.shein.com/p-414605698.html",
  "image_url": "https://img.ltwebstatic.com/images3_spmp/...",
  "category": "Women Dresses",
  "category_id": "1727",
  "rank": 1,
  "first_seen_date": "2026-07-05",
  "rank_movement": 0,
  "sale_price": 8.93,
  "retail_price": 14.99,
  "discount_pct": 40,
  "currency": "USD",
  "is_flash_sale": false,
  "flash_sale_price": null,
  "flash_sale_end": null,
  "region": "us",
  "feed_type": "new_arrival"
}
Performance and cost
Each Shein category page is fetched via Bright Data Web Unlocker. The billing model is per successful Web Unlocker request (~$1.50 per 1,000 requests). A standard run with 4 categories on both feeds makes 4 × 2 = 8 Bright Data (BD) requests.
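To make the arithmetic concrete, here is a rough estimator using the approximate rate quoted above. It counts one request per category per feed and ignores retries and any additional pages per category, so treat the result as a lower bound; `estimate_run` is a hypothetical helper.

```python
PRICE_PER_1000_REQUESTS_USD = 1.50  # approximate rate quoted in the text


def estimate_run(
    categories: int,
    feeds: int = 2,
    price_per_1000: float = PRICE_PER_1000_REQUESTS_USD,
) -> tuple[int, float]:
    """Return (request_count, estimated_cost_usd) for one run, excluding retries."""
    requests = categories * feeds
    return requests, requests * price_per_1000 / 1000
```

For the standard run, `estimate_run(4)` gives 8 requests at roughly $0.012.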
Memory: 512 MB. Concurrent BD fetches are capped at 2 to prevent OOM on large response bodies (~5 MB each).
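A cap like this is commonly implemented with a semaphore. The sketch below shows the general pattern with `asyncio`; the `fetch` coroutine and `fetch_all` are placeholders, and this is not necessarily how the actor enforces its limit.

```python
import asyncio
from typing import Awaitable, Callable

MAX_CONCURRENT_FETCHES = 2  # cap stated in the text


async def fetch_all(
    urls: list[str], fetch: Callable[[str], Awaitable[str]]
) -> list[str]:
    """Fetch every URL with at most MAX_CONCURRENT_FETCHES bodies in flight."""
    gate = asyncio.Semaphore(MAX_CONCURRENT_FETCHES)

    async def bounded(url: str) -> str:
        # Each ~5 MB body is only requested while a semaphore slot is held.
        async with gate:
            return await fetch(url)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(bounded(u) for u in urls))
```

The semaphore bounds how many large responses are being downloaded at the same moment, which is the quantity that drives peak memory in this pattern.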
Trend tracking across runs
Run this actor on a schedule (daily or hourly) for the same category URLs. On each run it:
- Reads the previous run's rank state from the Key-Value store.
- Computes `rank_movement` per product.
- Records `first_seen_date` the first time a `goods_id` appears.
- Persists updated state for the next run.
The state key is category_id:feed_type:goods_id, so multiple categories and feed types are tracked independently.
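The steps above can be sketched as a single update function over an in-memory dictionary standing in for the Key-Value store. The key format and the sign convention (positive = climbed) come from the text; the shape of the stored value and the `apply_trend_state` name are assumptions.

```python
def apply_trend_state(
    state: dict[str, dict],
    category_id: str,
    feed_type: str,
    goods_id: str,
    rank: int,
    today: str,
) -> dict:
    """Update state for one product and return its trend fields for this run."""
    key = f"{category_id}:{feed_type}:{goods_id}"
    previous = state.get(key)
    if previous is None:
        # First time this product is seen in this category + feed.
        movement = 0
        first_seen = today
    else:
        # Rank 5 -> rank 2 is a climb of 3, so previous minus current.
        movement = previous["rank"] - rank
        first_seen = previous["first_seen_date"]
    state[key] = {"rank": rank, "first_seen_date": first_seen}
    return {"rank_movement": movement, "first_seen_date": first_seen}
```

Because the key includes both `category_id` and `feed_type`, the same `goods_id` can carry a different rank history in each listing it appears in.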
Anti-bot notes
- Halo challenge shell: Shein serves a ~355 KB JavaScript challenge page that Bright Data's server-side grader sometimes marks as `ok`. The actor's content gate (size >= 400 KB + `goods_id` present) rejects these false positives and retries.
- Retry limit: Up to 3 BD retries per request. A page that fails all retries is skipped with a warning, and the run continues.
- Concurrency: `inn_max_conc: 2` limits peak concurrent BD fetches. Do not raise this above 4 without increasing memory.