Warehouse Product Parser Spider

Pricing

$0.05 / actor start

The Warehouse Product Parser Spider extracts detailed product data from The Warehouse NZ, including name, price, brand, category, and identifiers. Ideal for eCommerce analytics, catalog management, and retail intelligence with structured JSON output.

Developer: GetDataForMe (maintained by Community)

🕷️ Warehouse Product Parser Spider

The Warehouse Product Parser Spider is an Apify Actor that extracts structured product data from The Warehouse, one of New Zealand's largest retail websites. It scrapes product pages and returns product metadata, including price, brand, description, category, identifiers (EAN, GTIN-14, manufacturer part number), and availability status, formatted as JSON for integration into your systems.


🚀 How It Works

Provide a list of product page URLs from The Warehouse website. The Actor visits each page, extracts the product information, and returns it as structured JSON. Residential proxies help keep access stable and reduce the chance of anti-bot blocks.


🧩 Input Configuration

Example input:

{
  "ProductUrls": [
    "https://www.thewarehouse.co.nz/p/living-co-pillow-cotton-cover-2-pack-firm-white-2-pack/R3000287.html"
  ],
  "proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ],
    "apifyProxyCountry": "US"
  }
}

Input fields

| Field | Type | Required | Description |
|---|---|---|---|
| ProductUrls | Array of Strings | ✅ Yes | List of product page URLs from The Warehouse website. |
| proxy.useApifyProxy | Boolean | ✅ Yes | Enables the Apify Proxy for reliable scraping. Recommended: true. |
| proxy.apifyProxyGroups | Array | Optional | Specifies the proxy group(s) to use. Use ["RESIDENTIAL"] for best performance. |
| proxy.apifyProxyCountry | String | Optional | Country code for proxy routing (e.g., "US"). |
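
The input fields above can be assembled and checked in code before a run is started. The following is a minimal Python sketch, not part of the Actor itself. The helper names `is_product_url` and `build_input` are hypothetical, and the URL check assumes that product pages use the `https://www.thewarehouse.co.nz/p/....html` shape seen in the example input.

```python
from urllib.parse import urlparse

# Assumption: product pages live on this host under "/p/" and end in ".html",
# as in the example URL. Adjust if the site uses other URL shapes.
ALLOWED_HOST = "www.thewarehouse.co.nz"


def is_product_url(url: str) -> bool:
    """Return True if the URL looks like a product page on The Warehouse."""
    parts = urlparse(url)
    return (
        parts.scheme == "https"
        and parts.netloc == ALLOWED_HOST
        and parts.path.startswith("/p/")
        and parts.path.endswith(".html")
    )


def build_input(product_urls, country="US", groups=("RESIDENTIAL",)):
    """Build the Actor input dict, rejecting URLs that fail the check above."""
    urls = list(product_urls)
    if not urls:
        raise ValueError("ProductUrls must contain at least one URL")
    bad = [u for u in urls if not is_product_url(u)]
    if bad:
        raise ValueError(f"Not a product page URL: {bad}")
    return {
        "ProductUrls": urls,
        "proxy": {
            "useApifyProxy": True,
            "apifyProxyGroups": list(groups),
            "apifyProxyCountry": country,
        },
    }
```

Calling `build_input([...])` with the example URL returns a dict with the same shape as the example input, which can then be serialized with `json.dumps` or passed to an API client.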

📦 Example Output

Example structured JSON result:

[
  [
    {
      "product_id": "R3000287",
      "product_url": "https://www.thewarehouse.co.nz/p/living-co-pillow-cotton-cover-2-pack-firm-white-2-pack/R3000287.html",
      "price": "$32.00",
      "name": "Living & Co Pillow Cotton Cover 2 Pack Firm White 2 Pack",
      "description": "Living & Co Pillow Cotton Cover 2 Pack Firm White 2 Pack",
      "brand": "Living & Co",
      "image_url": "https://www.thewarehouse.co.nz/dw/image/v2/BDMG_PRD/on/demandware.static/-/Sites-twl-master-catalog/default/dwe7104d07/images/hi-res/B1/A6/R3000287_30.jpg?sw=292&sh=292",
      "category_id": "homegarden-homewares-bedding-pillows",
      "ean": "9401113512526",
      "manufacturer_part_number": "ITM2411-000746",
      "gtin14": "09401113512526",
      "availability": "True",
      "subclass_code": "2084",
      "marketplace_item": "false",
      "brand_external_id": "Living _and_ Co",
      "active": true,
      "disabled": false,
      "product_page_url": "https://www.thewarehouse.co.nz/p/living-co-pillow-cotton-cover-2-pack-firm-white-2-pack/R3000287.html",
      "actor_id": "3yTE283ARdyF0zPp1",
      "run_id": "6reQ8523xoLGNnLz5"
    }
  ]
]
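
In the example output, records are nested inside two lists, `price` is a string with a currency symbol, and `availability` and `marketplace_item` are strings while `active` and `disabled` are real booleans. A consumer will probably want to normalize these before loading them into a database. The sketch below shows one possible way to do that in Python. All function names are hypothetical, and the assumption that every price looks like "$32.00" is based only on the single example record.

```python
from decimal import Decimal


def flatten(result):
    """Yield record dicts from arbitrarily nested lists of records."""
    if isinstance(result, dict):
        yield result
    else:
        for item in result:
            yield from flatten(item)


def to_bool(value):
    """Coerce true/false values that may arrive as strings ("True", "false")."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() == "true"


def gtin_is_valid(code: str) -> bool:
    """Check the GS1 mod-10 check digit of a GTIN-8/12/13/14 string."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]
    # Weights alternate 3, 1, 3, ... starting from the digit next to the check digit.
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


def normalize(record: dict) -> dict:
    """Return a copy of the record with a Decimal price and boolean flags."""
    out = dict(record)
    # Assumption: prices look like "$32.00", optionally with thousands separators.
    out["price"] = Decimal(record["price"].replace("$", "").replace(",", ""))
    for key in ("availability", "marketplace_item", "active", "disabled"):
        if key in out:
            out[key] = to_bool(out[key])
    return out
```

A typical use is `records = [normalize(r) for r in flatten(raw)]`, optionally dropping or flagging records whose `ean` or `gtin14` fails `gtin_is_valid`.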

✨ Features

  • 🧾 Extracts complete product details — including name, description, pricing, and identifiers (EAN, GTIN, MPN).
  • 🌍 Supports Apify Residential Proxies for stable, region-agnostic scraping.
  • ⚙️ Outputs well-structured JSON data, ideal for automation or analytics pipelines.
  • 🏷️ Captures brand, category, and availability status for accurate cataloging.
  • 💨 Optimized for speed, reliability, and consistency across product pages.

💡 Use Cases

  • 🛒 Price monitoring – Track product prices and stock availability over time.
  • 📊 E-commerce analytics – Feed extracted data into BI or product intelligence dashboards.
  • 🧾 Catalog management – Sync detailed Warehouse product information with internal databases.
  • 🧠 AI training datasets – Build structured retail datasets for model training or automation.

🛠️ Version

Current version: 1.0.0
Supported site: The Warehouse (NZ)


🧰 Integration Tips

  • Use the Apify Dataset API to retrieve your scraped data directly in JSON, CSV, or Excel format.
  • Combine this Actor with schedules or webhooks to automate product data collection.
  • Add multiple URLs in "ProductUrls" for batch scraping.
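
The tips above can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the developer's own tooling. `dataset_items_url` builds a URL for the Apify API dataset items endpoint, `chunked` splits a long URL list into batches, and `run_and_fetch` starts a run through the third-party `apify-client` package (`pip install apify-client`). The function names are hypothetical, and `actor_id` is a placeholder you need to copy from the Actor's page in Apify Console.

```python
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
# Export formats covered by this sketch; "xlsx" is the Excel export.
FORMATS = {"json", "csv", "xlsx"}


def dataset_items_url(dataset_id: str, fmt: str = "json", token: Optional[str] = None) -> str:
    """Build the dataset items URL for the given export format."""
    if fmt not in FORMATS:
        raise ValueError(f"Unsupported format: {fmt}")
    params = {"format": fmt}
    if token:
        params["token"] = token
    return f"{API_BASE}/datasets/{dataset_id}/items?{urlencode(params)}"


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_and_fetch(token: str, actor_id: str, run_input: dict) -> list:
    """Start one Actor run, wait for it to finish, and return its dataset items."""
    # Imported lazily so the helpers above work without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Because pricing is listed per Actor start, putting many URLs into one "ProductUrls" list means fewer starts than one run per URL. The batch size that works well in practice is not documented here and would need to be found by testing.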

💬 Support

For custom outputs, feature requests, or bug reports, contact:
📧 support@getdataforme.com
🌐 https://getdataforme.com/contact/

💡 Tip: When contacting support, please include a clear subject line — for example: Subject: Warehouse Product Parser Spider – Support Request