MarketVantage: Global E-commerce & Variant Scraper
Pricing
$29.99/month + usage
Deep Variant Scraper for marketplaces like Temu, AliExpress, and Shein. Standard tools fail on dynamic pricing and variant mapping (size/color SKU data). This agent maps every combination, bypasses anti-bot blocks, and delivers structured, AI-ready product data for price monitoring & research.
Developer

Filip Ebert, BSc.
Last modified: 6 days ago
Global Marketplace & Variant Specialist
Extract every product variant — size, color, material — from the world's most anti-bot-protected marketplaces.
Built for dropshippers, competitor analysts, and feed managers who need clean, structured SKU data from fast-fashion and global marketplace sites — not raw HTML to parse yourself.
Supported sites
| Site | Extraction method | Anti-bot level |
|---|---|---|
| AliExpress | window.runParams (full SKU matrix) | ★★★★ |
| Temu | __NEXT_DATA__ hydration JSON | ★★★★★ |
| SHEIN | window.gbProductDetailInfo / __NEXT_DATA__ | ★★★★ |
| Etsy | JSON-LD + variation JSON blocks | ★★★ |
| DHGate | window.__initial_state__ | ★★★ |
| Wish | window.__data__ / __NEXT_DATA__ | ★★★★ |
| Joom | window.__JOOM_DATA__ | ★★★ |
| Banggood | window.productData | ★★★ |
| Gearbest | window.gbData | ★★★ |
| Any other site | JSON-LD + Open Graph meta (universal fallback) | — |
Try it now
Paste this into the Apify Actor input form and click Run:
{
  "sources": [
    "https://www.aliexpress.com/item/1005006500000001.html",
    "https://www.etsy.com/listing/123456789/handmade-ceramic-mug",
    "https://www.temu.com/goods.html?goods_id=601099512345678"
  ],
  "targetCountry": "US",
  "targetCurrency": "USD",
  "maxConcurrency": 3
}
What makes this different
Variant-First extraction
Most scrapers only grab the displayed price. This actor reads the full SKU matrix embedded in each site's hydration JSON before JavaScript renders anything — so if a blue shirt has 5 sizes, you get 5 SKU entries, each with its own price, stock count, and variant-specific image.
"variants": [{ "sku_id": "1,10", "color": "Blue", "size": "S", "sku_price": 12.99, "stock_status": "in_stock", "stock_quantity": 47, "variant_image": "https://..." },{ "sku_id": "1,11", "color": "Blue", "size": "M", "sku_price": 12.99, "stock_status": "in_stock", "stock_quantity": 23, "variant_image": "https://..." },{ "sku_id": "1,12", "color": "Blue", "size": "L", "sku_price": 13.99, "stock_status": "out_of_stock", "stock_quantity": 0, "variant_image": "https://..." }]
Shadow Pricing captured
Temu and AliExpress show different prices based on selected variant. Because this actor reads the full price list from the hydration JSON — not the rendered DOM — every variant's exact price is captured, including discounts that only appear after selection.
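Once every variant price is in the output, variant-only price differences can be surfaced with a small filter. The sketch below assumes the output shape shown in this README (a top-level price plus variants[].sku_price); `findShadowPrices` is a hypothetical name, not part of the actor.

```javascript
// Hypothetical helper: list variants whose price differs from the headline price.
function findShadowPrices(product) {
  return (product.variants || [])
    .filter((v) => typeof v.sku_price === 'number' && v.sku_price !== product.price)
    .map((v) => ({
      sku_id: v.sku_id,
      sku_price: v.sku_price,
      // Round to cents to avoid floating-point noise in the difference.
      delta: Math.round((v.sku_price - product.price) * 100) / 100,
    }));
}
```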
Stealth browser with fingerprint injection
Anti-bot systems like DataDome and Cloudflare Bot Management are defeated through:
- Fingerprint injection: randomised navigator.webdriver, canvas hash, audio context, and plugins list per request
- CDP timezone spoofing: browser timezone matches targetCountry
- Human-like behaviour: random mouse movement and delays before extraction
- Geo-targeted headers: Accept-Language and locale match the target country
AI-ready output
Every product includes a clean_text field: the description with all HTML tags stripped and entities decoded. Feed it directly into GPT-4 or Claude for competitor analysis, SEO copy generation, or product categorisation.
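For readers curious what "tags stripped and entities decoded" amounts to, here is a minimal sketch. It is not the actor's cleanText() from utils.js, which may behave differently; it handles only a handful of common entities and uses a naive regex where a real HTML parser would be more robust.

```javascript
// Illustrative only: a small subset of HTML entities.
const ENTITIES = {
  '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"', '&#39;': "'", '&nbsp;': ' ',
};

function cleanText(html) {
  return String(html)
    .replace(/<[^>]*>/g, ' ') // replace each tag with a space
    .replace(/&(?:amp|lt|gt|quot|#39|nbsp);/g, (m) => ENTITIES[m]) // decode known entities
    .replace(/\s+/g, ' ') // collapse runs of whitespace
    .trim();
}
```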
Geo-targeting & Currency
Set targetCountry: "DE" and the browser will appear to be a German user (language headers, timezone). Set targetCurrency: "EUR" and all prices are converted. Add a residential proxy from Germany for true local pricing.
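The country-to-browser mapping can be pictured as a lookup table. The profile values and the `contextOptionsFor` helper below are assumptions for illustration, not the contents of the actor's geo_config.js. The returned keys (locale, timezoneId, extraHTTPHeaders) match option names accepted by Playwright's browser.newContext(), so the object could plausibly be passed there.

```javascript
// Hypothetical country profiles; values are illustrative.
const GEO_PROFILES = {
  US: { locale: 'en-US', timezoneId: 'America/New_York', currency: 'USD' },
  DE: { locale: 'de-DE', timezoneId: 'Europe/Berlin', currency: 'EUR' },
  GB: { locale: 'en-GB', timezoneId: 'Europe/London', currency: 'GBP' },
};

function contextOptionsFor(targetCountry = 'US') {
  // Unknown countries fall back to the US profile in this sketch.
  const profile = GEO_PROFILES[targetCountry.toUpperCase()] || GEO_PROFILES.US;
  const language = profile.locale.split('-')[0];
  return {
    locale: profile.locale,
    timezoneId: profile.timezoneId,
    extraHTTPHeaders: { 'Accept-Language': `${profile.locale},${language};q=0.9` },
  };
}
```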
Output format
One JSON object per product URL:
{
  "productId": "1005006500000001",
  "title": "2024 New Summer Women Dress Floral Print",
  "description": "<p>Material: 95% Cotton...</p>",
  "clean_text": "Material: 95% Cotton 5% Spandex. Available in 5 colors and 4 sizes.",
  "price": 14.99,
  "original_price": 29.99,
  "currency": "USD",
  "availability": true,
  "quantity": 312,
  "sku_count": 20,
  "variants": [
    {
      "sku_id": "Color:Red|Size:S",
      "attributes": { "Color": "Red", "Size": "S" },
      "color": "Red",
      "size": "S",
      "sku_price": 14.99,
      "original_price": 29.99,
      "stock_status": "in_stock",
      "stock_quantity": 48,
      "variant_image": "https://ae01.alicdn.com/kf/red-s.jpg"
    }
  ],
  "images": [
    "https://ae01.alicdn.com/kf/main.jpg",
    "https://ae01.alicdn.com/kf/detail1.jpg"
  ],
  "reviews_count": 1847,
  "rating": 4.8,
  "seller_id": "123456789",
  "seller_name": "Fashion Boutique Store",
  "shipping_cost": 0,
  "source_url": "https://www.aliexpress.com/item/1005006500000001.html",
  "site_label": "aliexpress",
  "country": "US",
  "scraped_at": "2026-03-04T12:00:00.000Z"
}
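Because variants are nested, spreadsheet-style tools usually want one row per SKU. The sketch below flattens a product object of the shape shown above; `toVariantRows` is a hypothetical helper and the chosen columns are just one reasonable selection.

```javascript
// Hypothetical helper: one flat row per variant, carrying product-level context.
function toVariantRows(product) {
  return (product.variants || []).map((v) => ({
    productId: product.productId,
    title: product.title,
    sku_id: v.sku_id,
    color: v.color ?? null,
    size: v.size ?? null,
    sku_price: v.sku_price,
    currency: product.currency,
    stock_status: v.stock_status,
    stock_quantity: v.stock_quantity,
    source_url: product.source_url,
  }));
}
```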
Input parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| sources | string[] | ✅ | Product page URLs to scrape |
| targetCountry | string | - | ISO 3166-1 alpha-2 (e.g. US, DE, GB). Default: US |
| targetCurrency | string | - | Output currency (e.g. USD, EUR). Default: USD |
| proxyUrl | string | - | http://user:pass@host:port; use a residential proxy for geo-accurate pricing |
| liveRatesUrl | string | - | Exchange rate API URL (e.g. open.er-api.com). Hardcoded rates are used if omitted |
| maxConcurrency | integer | - | Parallel requests, 1–20. Use 1–3 for Temu/AliExpress. Default: 3 |
| priceSelector | string | - | CSS selector override for price (unsupported sites) |
| titleSelector | string | - | CSS selector override for title (unsupported sites) |
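Before starting a run, the constraints in the table can be checked client-side. This is a sketch of such a check, written from the table alone; `validateInput` is a hypothetical helper and the actor may validate differently (or more leniently).

```javascript
// Hypothetical pre-flight check based on the parameter table above.
function validateInput(input) {
  const errors = [];
  const { sources, targetCountry = 'US', targetCurrency = 'USD', maxConcurrency = 3 } = input || {};

  if (!Array.isArray(sources) || sources.length === 0) {
    errors.push('sources must be a non-empty array of URLs');
  } else {
    for (const s of sources) {
      try {
        new URL(s); // throws on malformed URLs
      } catch {
        errors.push(`invalid URL: ${s}`);
      }
    }
  }
  if (!/^[A-Z]{2}$/.test(targetCountry)) errors.push('targetCountry must be a 2-letter ISO code');
  if (!/^[A-Z]{3}$/.test(targetCurrency)) errors.push('targetCurrency must be a 3-letter code');
  if (!Number.isInteger(maxConcurrency) || maxConcurrency < 1 || maxConcurrency > 20) {
    errors.push('maxConcurrency must be an integer from 1 to 20');
  }
  return errors; // an empty array means the input passed every check
}
```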
Business intelligence use cases
Dropshipping & product sourcing
Find which Temu/AliExpress products have stock in every size. Filter out variants where stock_status === "out_of_stock" to avoid listing dead SKUs.
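Both checks are simple filters over the output. The helper names below are hypothetical; the field names come from the output format in this README.

```javascript
const inStock = (variant) => variant.stock_status !== 'out_of_stock';

// Products where every listed variant is available.
function fullyStocked(products) {
  return products.filter((p) => p.variants?.length > 0 && p.variants.every(inStock));
}

// The sellable SKUs of a single product.
function liveSkus(product) {
  return (product.variants || []).filter(inStock);
}
```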
Competitor price tracking
Run daily on competitor Etsy listings. Compare sku_price per variant combination to spot when they run size-specific discounts.
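Comparing two runs of the same listing comes down to matching variants by sku_id. A minimal sketch, assuming sku_id stays stable between runs (not guaranteed by this README) and using the hypothetical name `diffVariantPrices`:

```javascript
// Hypothetical helper: report variants whose price changed between two runs.
function diffVariantPrices(previous, current) {
  const before = new Map((previous.variants || []).map((v) => [v.sku_id, v.sku_price]));
  return (current.variants || [])
    .filter((v) => before.has(v.sku_id) && before.get(v.sku_id) !== v.sku_price)
    .map((v) => ({ sku_id: v.sku_id, from: before.get(v.sku_id), to: v.sku_price }));
}
```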
LLM-powered product feed enrichment
Feed clean_text into GPT-4 to auto-generate localised product descriptions, SEO meta titles, or translate listings. No preprocessing needed.
Buy-Box analysis (Etsy / DHGate)
Multiple sellers list identical products. seller_id and shipping_cost let you identify the cheapest total-cost option and track which seller holds the Buy Box.
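Total cost here is simply price plus shipping. The sketch assumes you have already grouped the scraped items so that the array contains listings of the same product in the same currency; `cheapestOffer` is a hypothetical helper.

```javascript
// Hypothetical helper: pick the listing with the lowest price + shipping.
function cheapestOffer(listings) {
  return listings
    .map((p) => ({
      seller_id: p.seller_id,
      seller_name: p.seller_name,
      total: p.price + (p.shipping_cost || 0),
    }))
    .reduce((best, offer) => (best === null || offer.total < best.total ? offer : best), null);
}
```

It returns null for an empty array, so callers can tell "no listings" apart from a zero-cost offer.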
Pricing parity monitoring
Same product, different countries. Run with targetCountry: "US" and targetCountry: "DE" (plus matching residential proxies) to detect price discrimination across markets.
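A cross-market comparison could look like the sketch below. It assumes both runs used the same targetCurrency and that sku_id values match across markets, neither of which this README guarantees; `parityGaps` is a hypothetical helper.

```javascript
// Hypothetical helper: percentage gap per variant between two market runs.
function parityGaps(runA, runB) {
  if (runA.currency !== runB.currency) {
    throw new Error('Run both markets with the same targetCurrency first');
  }
  const pricesB = new Map((runB.variants || []).map((v) => [v.sku_id, v.sku_price]));
  return (runA.variants || [])
    .filter((v) => pricesB.has(v.sku_id))
    .map((v) => {
      const priceB = pricesB.get(v.sku_id);
      return {
        sku_id: v.sku_id,
        priceA: v.sku_price,
        priceB,
        // Positive means market B is more expensive; rounded to one decimal place.
        gapPercent: Math.round(((priceB - v.sku_price) / v.sku_price) * 1000) / 10,
      };
    });
}
```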
Adding a new site
The architecture is modular. To add support for a new store:
- Create src/sites/mystore.js following this interface:
const MATCH_RE = /mystore\.com/i;
function matches(url) { return MATCH_RE.test(url); }
const LABEL = 'mystore';

function extractFromWindowVars(html) {
  // 1. Try to find window.__STORE_DATA__ or similar
  // 2. Return canonical product object, or null to fall back
  return null;
}

const CSS_SELECTORS = {
  title      : 'h1.product-title',
  price      : '.price-current',
  description: '#product-description',
  images     : '.product-gallery img',
};

module.exports = { matches, label: LABEL, extractFromWindowVars, CSS_SELECTORS };
- Register it in src/sites/index.js, inserting it before the generic entry.
That's it. The orchestration, variant normalisation, stealth browser, and currency conversion all work automatically.
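To make the extractFromWindowVars step more concrete, here is one way the stub could be filled in. Everything about the page is assumed: that the store embeds `window.__STORE_DATA__ = {...};` as the last statement of a script tag, and that the payload has product.id, product.name, and product.price. The returned fields mirror a few keys from the output format; a real module would need to map the site's actual structure.

```javascript
// Hypothetical extractor for an imaginary store's embedded state.
function extractFromWindowVars(html) {
  // Capture the object literal assigned to window.__STORE_DATA__ (assumed to end with `};</script>`).
  const match = html.match(/window\.__STORE_DATA__\s*=\s*(\{[\s\S]*?\})\s*;\s*<\/script>/);
  if (!match) return null; // no embedded state: let the pipeline fall back

  let data;
  try {
    data = JSON.parse(match[1]);
  } catch {
    return null; // embedded state was not valid JSON
  }
  if (!data.product) return null;

  return {
    productId: String(data.product.id),
    title: data.product.name,
    price: Number(data.product.price),
  };
}
```

Returning null on every failure path keeps the module compatible with the fallback behaviour described in the interface comments.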
Architecture overview
src/
├── main.js               Apify Actor entry point (concurrency + currency)
├── fetcher.js            Stealth Playwright browser + HTTP fallback
├── parser.js             Multi-strategy extraction pipeline
├── variant_extractor.js  Cartesian product SKU builder
├── stealth_browser.js    Fingerprint injection, anti-challenge helpers
├── geo_config.js         Country → locale/timezone/currency profiles
├── normalize.js          Price parsing + currency conversion
├── utils.js              cleanText(), extractASIN(), safeJsonParse()
└── sites/
    ├── index.js          Site registry + URL resolver
    ├── aliexpress.js     window.runParams extractor
    ├── temu.js           __NEXT_DATA__ extractor
    ├── shein.js          gbProductDetailInfo extractor
    ├── etsy.js           JSON-LD + variation blocks extractor
    ├── dhgate.js         __initial_state__ extractor
    ├── wish.js           __data__ extractor
    ├── joom.js           __JOOM_DATA__ extractor
    ├── banggood.js       productData extractor
    ├── gearbest.js       gbData extractor
    └── generic.js        JSON-LD + Open Graph fallback
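The "site registry + URL resolver" role of sites/index.js can be pictured as an ordered list where the first matching module wins. This is a guess at the pattern rather than the file's real contents; the inline site objects stand in for the modules that would normally be loaded with require().

```javascript
// Hypothetical registry: order matters, and the generic fallback must stay last.
const sites = [
  { label: 'mystore', matches: (url) => /mystore\.com/i.test(url) },
  { label: 'generic', matches: () => true },
];

// Return the first site module whose matches() accepts the URL.
function resolveSite(url) {
  return sites.find((site) => site.matches(url));
}

console.log(resolveSite('https://www.mystore.com/p/1').label);
```

This ordering is why a new module has to be registered before the generic entry: anything placed after a catch-all matcher would never be reached.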
Legal & ethical use
This actor reads only publicly visible product information.
- Do not use authenticated sessions or bypass paywalls
- You are responsible for compliance with each website's Terms of Service
- Use residential proxies only through legitimate proxy providers
- Respect robots.txt and crawl rate limits (maxConcurrency: 1–3 for protected sites)
Q: How often should I run it?
A: Most teams run daily or weekly.
Q: Does it work with market indexes like DRAMeXchange?
A: Not yet. Market/index sites use different data formats. Planned for v1.1.
Q: Can I export the data?
A: Yes. All results are stored in JSON format and can be exported to CSV, Google Sheets, or via API.
Get started
- Click Try for free in the Apify Console
- Paste the demo input above
- Click Run
- See product and variant data for the URLs in the demo input
Need help? Check the examples/ directory for more input samples.