Warehouse Product Parser Spider
Pricing
$0.05 / actor start
The Warehouse Product Parser Spider extracts detailed product data from The Warehouse NZ, including name, price, brand, category, and identifiers. Ideal for eCommerce analytics, catalog management, and retail intelligence with structured JSON output.
Developer

GetDataForMe
🕷️ Warehouse Product Parser Spider
The Warehouse Product Parser Spider is an Apify Actor that extracts structured product data from The Warehouse, one of New Zealand’s largest retail websites. It scrapes product pages and returns product metadata as JSON, including price, brand, description, category, identifiers (EAN, GTIN, MPN), and availability status, ready for integration into your systems.
🚀 How It Works
Provide a list of product page URLs from The Warehouse website. The Actor visits each page, extracts the product information, and returns it as structured JSON. Residential proxies are recommended because they make access more stable and reduce the chance of being blocked by anti-bot measures.
🧩 Input Configuration
Example input:
{
  "ProductUrls": [
    "https://www.thewarehouse.co.nz/p/living-co-pillow-cotton-cover-2-pack-firm-white-2-pack/R3000287.html"
  ],
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "apifyProxyCountry": "US"
  }
}
Input fields
| Field | Type | Required | Description |
|---|---|---|---|
| ProductUrls | Array of Strings | ✅ Yes | List of product page URLs from The Warehouse website. |
| proxy.useApifyProxy | Boolean | ✅ Yes | Enables the Apify Proxy for reliable scraping. Recommended: true. |
| proxy.apifyProxyGroups | Array | Optional | Specifies the proxy group(s) to use. Use ["RESIDENTIAL"] for best performance. |
| proxy.apifyProxyCountry | String | Optional | Country code for proxy routing (e.g., "US"). |
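As a rough illustration of the fields in the table above, the following Python sketch assembles an input object and rejects URLs that do not look like Warehouse product pages. The helper name `build_run_input` and the host check are assumptions made for this example; they are not part of the Actor.

```python
from urllib.parse import urlparse

# Assumption: product pages live on this host, as in the example input.
EXPECTED_HOST = "www.thewarehouse.co.nz"


def build_run_input(product_urls, proxy_country=None):
    """Build an input dict using the field names from the input table."""
    if not product_urls:
        raise ValueError("ProductUrls must contain at least one URL")
    for url in product_urls:
        parsed = urlparse(url)
        if parsed.scheme != "https" or parsed.netloc != EXPECTED_HOST:
            raise ValueError(f"Not a Warehouse product URL: {url}")

    # useApifyProxy is required; the residential group is the recommended one.
    proxy = {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]}
    if proxy_country is not None:
        proxy["apifyProxyCountry"] = proxy_country
    return {"ProductUrls": list(product_urls), "proxy": proxy}
```

Calling `build_run_input([...], proxy_country="US")` with the URL from the example input yields the same structure as the JSON shown earlier.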
📦 Example Output
Example structured JSON result:
[
  [
    {
      "product_id": "R3000287",
      "product_url": "https://www.thewarehouse.co.nz/p/living-co-pillow-cotton-cover-2-pack-firm-white-2-pack/R3000287.html",
      "price": "$32.00",
      "name": "Living & Co Pillow Cotton Cover 2 Pack Firm White 2 Pack",
      "description": "Living & Co Pillow Cotton Cover 2 Pack Firm White 2 Pack",
      "brand": "Living & Co",
      "image_url": "https://www.thewarehouse.co.nz/dw/image/v2/BDMG_PRD/on/demandware.static/-/Sites-twl-master-catalog/default/dwe7104d07/images/hi-res/B1/A6/R3000287_30.jpg?sw=292&sh=292",
      "category_id": "homegarden-homewares-bedding-pillows",
      "ean": "9401113512526",
      "manufacturer_part_number": "ITM2411-000746",
      "gtin14": "09401113512526",
      "availability": "True",
      "subclass_code": "2084",
      "marketplace_item": "false",
      "brand_external_id": "Living _and_ Co",
      "active": true,
      "disabled": false,
      "product_page_url": "https://www.thewarehouse.co.nz/p/living-co-pillow-cotton-cover-2-pack-firm-white-2-pack/R3000287.html",
      "actor_id": "3yTE283ARdyF0zPp1",
      "run_id": "6reQ8523xoLGNnLz5"
    }
  ]
]
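In the example output, `price` is a string with a currency symbol, and `availability` and `marketplace_item` are strings rather than booleans. A downstream pipeline may want typed values, so here is a minimal Python sketch of one way to normalize a record and sanity-check its barcodes. The function names are hypothetical, and the price parser assumes the `$32.00` style shown above; other formats may need different handling.

```python
from decimal import Decimal


def parse_price(text):
    """Convert a price string such as "$32.00" to a Decimal (assumed format)."""
    return Decimal(text.replace("$", "").replace(",", "").strip())


def parse_flag(value):
    """Accept real booleans as well as strings like "True" or "false"."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def gtin_is_valid(code):
    """Verify the GS1 check digit of a GTIN-8, -12, -13 (EAN), or -14 string."""
    if not code.isdigit() or len(code) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in code]
    check = digits.pop()
    # Weights alternate 3, 1, 3, ... starting from the rightmost data digit.
    total = sum(d * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(reversed(digits)))
    return (10 - total % 10) % 10 == check


def normalize(record):
    """Return a copy of the record with typed price and flag fields."""
    cleaned = dict(record)
    cleaned["price"] = parse_price(record["price"])
    for key in ("availability", "marketplace_item"):
        cleaned[key] = parse_flag(record[key])
    return cleaned
```

For the sample record, both the `ean` and `gtin14` values pass the check-digit test, which makes it a cheap guard against truncated or mistyped identifiers before loading records into a catalog.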
✨ Features
- 🧾 Extracts complete product details — including name, description, pricing, and identifiers (EAN, GTIN, MPN).
- 🌍 Supports Apify Residential Proxies for stable, region-agnostic scraping.
- ⚙️ Outputs well-structured JSON data, ideal for automation or analytics pipelines.
- 🏷️ Captures brand, category, and availability status for accurate cataloging.
- 💨 Optimized for speed, reliability, and consistency across product pages.
💡 Use Cases
- 🛒 Price monitoring – Track product prices and stock availability over time.
- 📊 E-commerce analytics – Feed extracted data into BI or product intelligence dashboards.
- 🧾 Catalog management – Sync detailed Warehouse product information with internal databases.
- 🧠 AI training datasets – Build structured retail datasets for model training or automation.
🛠️ Version
Current version: 1.0.0
Supported site: The Warehouse (NZ)
🧰 Integration Tips
- Use the Apify Dataset API to retrieve your scraped data directly in JSON, CSV, or Excel format.
- Combine this Actor with schedules or webhooks to automate product data collection.
- Add multiple URLs in "ProductUrls" for batch scraping.
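To illustrate the first tip, the sketch below builds a request URL for the Apify dataset items endpoint and downloads the items as JSON using only the Python standard library. The dataset ID and token are placeholders you would supply from your own run, and the helper names are hypothetical; treat this as a starting point rather than a reference client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id, fmt="json", token=None):
    """Build the items URL for a dataset; fmt can be e.g. json, csv, or xlsx."""
    params = {"format": fmt}
    if token:
        # Passing the token as a query parameter; private datasets need it.
        params["token"] = token
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def fetch_items(dataset_id, token=None):
    """Download all items of a dataset as parsed JSON (performs a network call)."""
    with urlopen(dataset_items_url(dataset_id, token=token)) as response:
        return json.load(response)
```

Switching `fmt` to `csv` or `xlsx` gives a URL that can be opened directly in a spreadsheet tool, while `fetch_items` suits scheduled jobs that post-process the JSON.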
💬 Support
For custom outputs, feature requests, or bug reports, contact: 📧 support@getdataforme.com 🌐 https://getdataforme.com/contact/
💡 Tip: When contacting support, please include a clear subject line — for example: Subject: Warehouse Product Parser Spider – Support Request