Woocommerce Scraper

Pricing

$19.99/month + usage

🛍️ WooCommerce Scraper (woocommerce-scraper) extracts product titles, prices, stock, variants, categories, images, reviews & URLs from WooCommerce stores. ⚡ Ideal for SEO, price tracking, competitor research & catalog import. CSV/JSON export, scheduling, scaling. 🔄

Developer

ScrapeFlow

Maintained by Community

Woocommerce Scraper

Woocommerce Scraper is a production-ready Apify actor that extracts structured data from WooCommerce stores via public REST endpoints. It automates WooCommerce data extraction at scale, covering prices, stock, categories, tags, and media, which makes it useful for marketers, developers, data analysts, and researchers. Use this WooCommerce product scraper to power SEO research, competitive monitoring, and catalog imports across multiple stores with repeatable workflows, built-in retries, and proxy fallback.

What is Woocommerce Scraper?

Woocommerce Scraper is a scalable WooCommerce scraping tool that fetches products and other resources from WooCommerce stores through public “wc/store” and “wp/v2” endpoints. It addresses the need to scrape WooCommerce products and metadata reliably (including prices, stock status, categories, tags, images, and reviews) without manual copy-paste or browser automation. Built for marketers, analysts, developers, and researchers, this WooCommerce store scraper enables large-scale data collection, filtering, and export for SEO, price tracking, competitor research, and catalog imports.
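To make the two endpoint families concrete, here is a minimal sketch of how such URLs are typically formed on a WordPress site with pretty permalinks. The mapping from the actor's resource values to REST routes is an assumption made for illustration (the actor's internal routing is not documented here), endpoint_url and ROUTES are hypothetical names, and brands is left out because its route is not confirmed.

```python
from urllib.parse import urlencode

# Assumed mapping from the actor's `resource` values to public REST routes.
# This is an illustration, not the actor's documented routing.
ROUTES = {
    "products": "wc/store/v1/products",
    "categories": "wc/store/v1/products/categories",
    "tags": "wc/store/v1/products/tags",
    "attributes": "wc/store/v1/products/attributes",
    "reviews": "wc/store/v1/products/reviews",
    "pages": "wp/v2/pages",
    "posts": "wp/v2/posts",
    "comments": "wp/v2/comments",
    "post-categories": "wp/v2/categories",
    "post-tags": "wp/v2/tags",
    "users": "wp/v2/users",
}


def endpoint_url(store: str, resource: str = "products", **params) -> str:
    """Build the REST URL for one resource on one store (hypothetical helper)."""
    base = store.rstrip("/") + "/wp-json/"
    # Drop parameters that were not set so they do not appear in the query string.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = base + ROUTES[resource]
    return f"{url}?{query}" if query else url


print(endpoint_url("https://example.com/", "products", per_page=20, page=1))
```

Sites without pretty permalinks usually expose the same routes through a rest_route query parameter instead of the /wp-json/ prefix, which this sketch does not handle.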

What data / output can you get?

Below are the primary fields the actor outputs when scraping products. Fields are returned as found in the store’s API, cleaned by default, and enriched with the source store URL and resource type.

| Data field | Description | Example value |
| --- | --- | --- |
| url | Product page URL | https://example.com/product/example-product |
| id | Product ID | 12345 |
| name | Product name | Example Product |
| slug | Product slug | example-product |
| type | Product type (simple, variable, etc.) | simple |
| sku | Stock-keeping unit | SKU-001 |
| on_sale | Whether the product is on sale | false |
| prices.price | Current price (string) | 29.99 |
| prices.regular_price | Regular price (string) | 29.99 |
| prices.currency_code | Currency code | USD |
| average_rating | Average rating (string) | 4.5 |
| review_count | Number of reviews | 10 |
| is_in_stock | Availability flag | true |
| images[] | Array of image objects | [{"id":1,"src":"https://…/image.jpg"}] |
| categories[] | Array of category terms | [{"id":12,"slug":"accessories"}] |
| tags[] | Array of tag terms | [{"id":34,"slug":"summer"}] |
| brands[] | Array of brand terms | [{"id":7,"slug":"acme"}] |
| attributes[] | Product attributes/options | [{"id":1,"name":"Color","options":["Red","Blue"]}] |
| add_to_cart | Add-to-cart metadata | {"minimum":1,"maximum":10,"url":"…"} |
| store | Source store URL | https://example.com |
| resource_type | Scraped resource | products |

Notes:

  • Additional product fields include: parent, variation, short_description, description, prices.price_range, prices.currency_symbol, prices.currency_prefix, prices.currency_suffix, prices.currency_minor_unit, prices.currency_decimal_separator, prices.currency_thousand_separator, variations, grouped_products, has_options, is_purchasable, is_on_backorder, low_stock_remaining, sold_individually, stock_availability, extensions.
  • For non-product resources (e.g., categories, reviews, pages, posts, comments, users), the actor returns the resource’s fields as-is from the API, plus store and resource_type.
  • You can export your dataset to JSON or CSV from the Apify platform.
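Because prices.price and prices.regular_price arrive as strings, downstream code usually needs to convert them before doing arithmetic. The sketch below assumes the decimal-string form shown in the table (29.99). Some WooCommerce REST responses express prices in minor units instead (for example 2999 with currency_minor_unit 2), so the in_minor_units flag is included as an option; check which form your dataset actually contains. price_to_decimal is a hypothetical helper, not part of the actor.

```python
from decimal import Decimal


def price_to_decimal(prices: dict, field: str = "price",
                     in_minor_units: bool = False) -> Decimal:
    """Convert a price string from the `prices` object to a Decimal.

    Set in_minor_units=True only if your data stores e.g. "2999" for 29.99.
    """
    value = Decimal(prices[field])
    if in_minor_units:
        # Shift the decimal point left by the currency's minor unit (assumed 2 if absent).
        value = value.scaleb(-int(prices.get("currency_minor_unit", 2)))
    return value


example = {"price": "29.99", "regular_price": "29.99", "currency_minor_unit": 2}
print(price_to_decimal(example))
```

Decimal is used rather than float so that comparisons and sums over prices do not pick up binary rounding errors.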

Key features

  • ⚡️ Automatic proxy fallback for resilience
    Built-in logic escalates from no proxy → datacenter → residential, with retries, to reduce blocking and keep your runs stable.

  • 📦 Multi-store batch processing
    Add multiple store URLs via startUrls, url, or dev_fileupload to scrape several WooCommerce catalogs in a single run.

  • 🎯 Powerful filtering & sorting
    Filter by featured and on-sale products, SKU, rating, tax class, min/max price, category, tag, product type, status, and stock. Sort by date, modified, id, title, slug, price, popularity, rating, menu_order, or comment_count.

  • 🧱 Resource flexibility beyond products
    Select from products, categories, brands, tags, attributes, reviews, pages, posts, comments, post-categories, post-tags, or users.

  • 📝 Clean, formatted descriptions
    Choose output format for descriptions and content: md (Markdown), text (plain), or html.

  • 🧩 Field selection & transformations
    Use dev_transform_fields to select only specific output fields (supports dot notation and array indices).

  • 🗂️ Custom storage & safe restarts
    Save to a custom-named dataset (dev_dataset_name), clear datasets before runs (dev_dataset_clear), and benefit from incremental pushing to avoid data loss.

  • 🌐 Custom networking controls
    Configure proxy settings (proxyConfiguration or dev_proxy_config), and optionally send custom HTTP headers/cookies.

  • 🧪 Production-ready reliability
    Includes retry logic, robust error handling, and real-time logging. Ideal for continuous price monitoring, catalog maintenance, and analytics pipelines.
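The proxy fallback described in the first feature can be pictured with a short sketch. This illustrates the general escalation pattern only; the actor's real implementation is not documented here. The fetch callable, the tier names, and fetch_with_fallback are assumptions introduced for the example.

```python
import time
from typing import Callable, Optional, Sequence

# Tier order mirrors the feature description: no proxy, then datacenter, then residential.
TIERS: Sequence[Optional[str]] = (None, "datacenter", "residential")


def fetch_with_fallback(url: str,
                        fetch: Callable[[str, Optional[str]], str],
                        retries_per_tier: int = 2,
                        delay: float = 0.0) -> str:
    """Try each proxy tier in order, retrying within a tier before escalating."""
    last_error: Optional[Exception] = None
    for tier in TIERS:
        for attempt in range(retries_per_tier):
            try:
                return fetch(url, tier)
            except Exception as exc:  # a real client would catch narrower errors
                last_error = exc
                time.sleep(delay * (attempt + 1))  # linear backoff between attempts
    raise RuntimeError(f"all proxy tiers failed for {url}") from last_error


def demo_fetch(url: str, tier: Optional[str]) -> str:
    """Stand-in for an HTTP call: pretends the store blocks everything but residential."""
    if tier != "residential":
        raise ConnectionError("blocked")
    return "ok"


print(fetch_with_fallback("https://example.com", demo_fetch))
```

Passing the fetch function in as a parameter keeps the escalation logic independent of any HTTP library, so it can be exercised without network access.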

How to use Woocommerce Scraper - step by step

  1. Create or log in to your Apify account.
  2. Open the “woocommerce-scraper” actor in Apify Console.
  3. Add your store URLs in startUrls (string list) or use url as an alternative. You can also provide a file/URL with dev_fileupload to bulk-load URLs.
  4. Choose the resource to scrape (resource), e.g., products (default), and set limit (1–1000).
  5. Configure filters as needed: featured, sale, search, sku, rating, min_price, max_price, tax_class, category, tag, product_type, status, stock, include_variations, and sorting (sort + order).
  6. Select description format (format): md, text, or html.
  7. Optionally set proxyConfiguration or dev_proxy_config and add extra headers/cookies (dev_custom_headers, dev_custom_cookies).
  8. Click Start to run the actor and monitor logs for progress.
  9. When finished, open the Dataset to view results and export to JSON or CSV.

Pro Tip: Use dev_transform_fields (e.g., url,name,prices.price) to output only the fields you need, and dev_dataset_name (e.g., data-{ACTOR}-{DATE}-{TIME}) for clean, time-stamped storage.
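The same run can be started programmatically. The following is a minimal sketch that uses only the Python standard library against the Apify API's synchronous "run and return dataset items" endpoint. The actor ID USERNAME~woocommerce-scraper and the token value are placeholders (the real actor ID is not stated in this document), and build_run_request and run_actor are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST that runs the actor and returns its dataset items in one call."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict) -> list:
    """Send the request and decode the JSON array of dataset items."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


run_input = {
    "startUrls": ["https://example-store.com"],
    "resource": "products",
    "limit": 50,
    "dev_transform_fields": "url,name,prices.price",
}
# Placeholder actor ID and token; replace both before calling run_actor().
request = build_run_request("USERNAME~woocommerce-scraper", "YOUR_APIFY_TOKEN", run_input)
print(request.get_method(), request.full_url)
```

The synchronous endpoint suits short runs. For long multi-store runs, starting the run asynchronously and reading the dataset afterwards is likely the safer choice.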

Use cases

| Use case | Description |
| --- | --- |
| SEO content enrichment | Aggregate product names, descriptions, categories, tags, and images to enrich on-site content and metadata. |
| Price monitoring & alerts | Track prices and on_sale state across stores with a reliable WooCommerce price scraper for competitive intelligence. |
| Competitor catalog analysis | Compare SKUs, product types, and inventory across competitors to inform assortment and merchandising. |
| Catalog import & PIM sync | Export product data to JSON/CSV and feed into PIM/ERP for catalog onboarding or updates. |
| Inventory tracking | Check is_in_stock and related stock availability to monitor inventory changes by category or brand. |
| Market & academic research | Collect structured datasets from multiple stores for trend analysis and research projects. |
| API pipelines & ETL | Integrate via the Apify API with your Python workflows for automated WooCommerce API data extraction. |
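As an illustration of the price monitoring use case, the sketch below compares two dataset snapshots and reports products whose price string changed. It assumes each item carries store, id, and prices.price as in the product output. index_prices and price_changes are hypothetical helpers, not part of the actor.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, int]  # (store URL, product ID)


def index_prices(items: Iterable[dict]) -> Dict[Key, str]:
    """Map (store, product id) to the current price string, skipping incomplete items."""
    index: Dict[Key, str] = {}
    for item in items:
        price = item.get("prices", {}).get("price")
        if price is not None and "store" in item and "id" in item:
            index[(item["store"], item["id"])] = price
    return index


def price_changes(previous: Iterable[dict],
                  current: Iterable[dict]) -> List[Tuple[Key, str, str]]:
    """Return (key, old price, new price) for products present in both snapshots."""
    before, after = index_prices(previous), index_prices(current)
    return [(key, before[key], after[key])
            for key in after
            if key in before and before[key] != after[key]]


yesterday = [{"store": "https://example.com", "id": 12345, "prices": {"price": "29.99"}}]
today = [{"store": "https://example.com", "id": 12345, "prices": {"price": "24.99"}}]
for (store, product_id), old, new in price_changes(yesterday, today):
    print(f"{store} #{product_id}: {old} -> {new}")
```

The comparison is on raw strings, so "10.0" and "10.00" would count as a change. Converting to Decimal first avoids that if the stores format prices inconsistently.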

Why choose Woocommerce Scraper?

Woocommerce Scraper is built for precision, automation, and scale: a robust alternative to browser extensions and fragile scripts.

  • ✅ Accurate, structured output with rich price, stock, and taxonomy data
  • 🌍 Multi-store coverage with flexible resource selection (products to posts)
  • 📈 Scales with proxy fallback and retry logic for continuous operations
  • 💻 Developer-friendly via Apify API; ideal for Python WooCommerce scraper workflows
  • 🔒 Public data only by default for safer, ethical operations
  • 💸 Cost-effective for repeated runs like SEO audits, competitor research, and catalog updates
  • 🔌 Works with your automation stack (export to CSV/JSON for tools like Make, n8n, Zapier)

In short, it’s a stable, repeatable alternative to WooCommerce product scraping plugins.

Is it legal to scrape WooCommerce data?

Yes, when done responsibly. This actor accesses publicly available WooCommerce endpoints and does not log into private accounts or bypass authentication.

Guidelines for compliant use:

  • Scrape only publicly accessible data.
  • Respect target websites’ terms of service and robots.txt.
  • Avoid collecting personal data and ensure compliance with GDPR/CCPA and similar regulations.
  • Consult your legal team for edge cases or regulated use.

Input parameters & output format

Example JSON input

{
  "startUrls": [
    "https://woocommerce.com",
    "https://example-store.com"
  ],
  "proxyConfiguration": {
    "useApifyProxy": false
  },
  "limit": 50,
  "resource": "products",
  "include_variations": false,
  "format": "md",
  "sort": "date",
  "order": "",
  "search": "shirt",
  "sku": "SKU-001,SKU-002",
  "rating": "4,5",
  "min_price": 10,
  "max_price": 200,
  "tax_class": "standard",
  "category": "12,34",
  "tag": "56",
  "product_type": "simple",
  "status": "publish",
  "stock": "instock",
  "featured": false,
  "sale": false,
  "dev_proxy_config": "socks5://example.com:9000",
  "dev_custom_headers": "[{\"name\": \"Authorization\", \"value\": \"Bearer token\"}]",
  "dev_custom_cookies": "[{\"name\": \"session\", \"value\": \"abc123\"}]",
  "dev_transform_fields": "url,name,prices.price",
  "dev_dataset_name": "data-{ACTOR}-{DATE}-{TIME}",
  "dev_dataset_clear": false,
  "dev_no_strip": false,
  "dev_fileupload": "https://example.com/store-urls.txt"
}
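Note that dev_custom_headers and dev_custom_cookies are JSON arrays encoded as strings, which is easy to get wrong when escaping by hand. A small sketch of building those values in Python follows. name_value_json is a hypothetical helper, and the header and cookie values are examples only.

```python
import json


def name_value_json(pairs: dict) -> str:
    """Encode a mapping as the JSON-array-in-a-string form used by the dev_custom_* inputs."""
    return json.dumps([{"name": name, "value": value} for name, value in pairs.items()])


run_input = {
    "startUrls": ["https://example-store.com"],
    "resource": "products",
    "dev_custom_headers": name_value_json({"Accept-Language": "en-US"}),
    "dev_custom_cookies": name_value_json({"session": "abc123"}),
}

# The outer json.dumps adds the backslash escaping seen in the example input.
print(json.dumps(run_input, indent=2))
```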

All input fields

  • startUrls (array) - 💡 Where do you want to Shop? (Also accepts 'url' as array of store URLs). Required: Yes. Default: none.
  • url (array) - Alternative to startUrls: array of store URLs to scrape. Required: No. Default: none.
  • limit (integer) - Number of results (per-query). Required: No. Default: 20 (min: 1, max: 1000).
  • resource (string) - Select resource type to scrape. One of: products, categories, brands, tags, attributes, reviews, pages, posts, comments, post-categories, post-tags, users. Required: No. Default: products.
  • include_variations (boolean) - Include product variations in results. Required: No. Default: false.
  • format (string) - Output format for Descriptions. One of: md, text, html. Required: No. Default: md.
  • sort (string) - Sort results by attribute. One of: "", date, modified, id, include, title, slug, price, popularity, rating, menu_order, comment_count. Required: No. Default: date.
  • order (string) - Order sort direction. One of: "", asc, desc. Required: No. Default: "".
  • search (string) - Limit results to those matching a string. Required: No. Default: none.
  • sku (string) - Limit result set to products with specific SKU(s). Use commas to separate. Required: No. Default: none.
  • rating (string) - Filter by product ratings. Enter comma-separated rating values (e.g., 1,2,3,4,5). Required: No. Default: none.
  • min_price (integer) - Limit result set to products based on a minimum price. Required: No. Default: none.
  • max_price (integer) - Limit result set to products based on a maximum price. Required: No. Default: none.
  • tax_class (string) - Limit result set to products with a specific tax class. One of: "", standard, reduced-rate, zero-rate. Required: No. Default: none.
  • category (string) - Product Category ID(s) separated by comma. Required: No. Default: none.
  • tag (string) - Product Tag ID(s) separated by comma. Required: No. Default: none.
  • product_type (string) - Products assigned a specific type. One of: "", simple, grouped, external, variable, wbs_bundle, variation. Required: No. Default: none.
  • status (string) - Filter by product status. One of: "", future, trash, draft, pending, private, publish. Required: No. Default: none.
  • stock (string) - Filter by stock status. One of: "", instock, outofstock, onbackorder. Required: No. Default: none.
  • featured (boolean) - Featured products. Required: No. Default: false.
  • sale (boolean) - Products on sale. Required: No. Default: false.
  • proxyConfiguration (object) - Choose which proxies to use. By default, no proxy is used. Required: No. Default: {"useApifyProxy": false}.
  • dev_proxy_config (string) - Supported protocol: HTTP(S), SOCKS5. Example format: socks5://example.com:9000. Required: No. Default: none.
  • dev_custom_headers (string) - Additional HTTP Headers as JSON array (string). Required: No. Default: none.
  • dev_custom_cookies (string) - Additional HTTP Cookies as JSON array (string). Required: No. Default: none.
  • dev_transform_fields (string) - Transform the resulting output. Enter comma-separated field paths (dot notation for objects; numeric indices for arrays). Required: No. Default: none.
  • dev_dataset_name (string) - Save results into custom named Dataset; supports {ACTOR}, {DATE}, {TIME}. Required: No. Default: none.
  • dev_dataset_clear (boolean) - Clear Dataset before insert/update. Required: No. Default: false.
  • dev_no_strip (boolean) - Disable data cleansing (keep empty values). Required: No. Default: false.
  • dev_fileupload (string) - Upload your file and paste the URL to load store URLs. Required: No. Default: none.
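Before starting a run, it can help to check an input object against the constraints listed above. The sketch below covers a handful of them (URL source, limit range, and the resource, format, and order enumerations). It is a client-side convenience under stated assumptions, not the actor's own validation. In particular, treating dev_fileupload as an acceptable URL source on its own is an assumption, and validate_input is a hypothetical name.

```python
from typing import List

RESOURCES = {
    "products", "categories", "brands", "tags", "attributes", "reviews",
    "pages", "posts", "comments", "post-categories", "post-tags", "users",
}
FORMATS = {"md", "text", "html"}
ORDERS = {"", "asc", "desc"}


def validate_input(run_input: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the input looks fine."""
    errors: List[str] = []
    # Assumption: any one of these three inputs is enough to supply store URLs.
    if not any(run_input.get(key) for key in ("startUrls", "url", "dev_fileupload")):
        errors.append("provide store URLs via startUrls, url, or dev_fileupload")
    limit = run_input.get("limit", 20)
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 1000:
        errors.append("limit must be an integer from 1 to 1000")
    if run_input.get("resource", "products") not in RESOURCES:
        errors.append("resource is not one of the supported values")
    if run_input.get("format", "md") not in FORMATS:
        errors.append("format must be md, text, or html")
    if run_input.get("order", "") not in ORDERS:
        errors.append('order must be "", asc, or desc')
    return errors


print(validate_input({"startUrls": ["https://example-store.com"], "limit": 50}))
print(validate_input({"limit": 5000, "resource": "orders"}))
```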

Example JSON output (products)

{
  "url": "https://example.com/product/example-product",
  "id": 12345,
  "name": "Example Product",
  "slug": "example-product",
  "parent": 0,
  "type": "simple",
  "variation": "",
  "sku": "SKU-001",
  "short_description": "Short summary of the product.",
  "description": "**Full description** with details and features.",
  "on_sale": false,
  "prices": {
    "price": "29.99",
    "regular_price": "29.99",
    "sale_price": "0",
    "price_range": null,
    "currency_code": "USD",
    "currency_symbol": "$",
    "currency_minor_unit": 2,
    "currency_decimal_separator": ".",
    "currency_thousand_separator": ",",
    "currency_prefix": "$",
    "currency_suffix": ""
  },
  "average_rating": "4.5",
  "review_count": 10,
  "images": [
    { "id": 1, "src": "https://example.com/wp-content/uploads/image.jpg", "alt": "Example" }
  ],
  "categories": [
    { "id": 12, "name": "Accessories", "slug": "accessories" }
  ],
  "tags": [
    { "id": 34, "name": "Summer", "slug": "summer" }
  ],
  "brands": [
    { "id": 7, "name": "Acme", "slug": "acme" }
  ],
  "attributes": [
    { "id": 1, "name": "Color", "options": ["Red", "Blue"] }
  ],
  "variations": [],
  "grouped_products": [],
  "has_options": false,
  "is_purchasable": true,
  "is_in_stock": true,
  "is_on_backorder": false,
  "low_stock_remaining": null,
  "sold_individually": false,
  "stock_availability": { "class": "in-stock", "text": "" },
  "add_to_cart": {
    "minimum": 1,
    "maximum": 10,
    "multiple_of": 1,
    "single_text": "Add to cart",
    "url": "https://example.com/?add-to-cart=12345"
  },
  "extensions": {
    "bundles": [],
    "compatibility": { "host": "", "host_slug": "", "vendor_id": 0, "version": "" },
    "dependencies": { "plugins": [], "themes": [] },
    "express": { "plans": "" },
    "marketplace": { "slug": "" }
  },
  "store": "https://example.com",
  "resource_type": "products"
}

Notes:

  • Output fields may be omitted if empty (cleaned by default). Set dev_no_strip to true to keep empty/null values.
  • For non-product resources, the actor outputs the resource fields unmodified from the API and adds store and resource_type.
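To show what dot-notation paths with array indices mean against the output above, here is a minimal sketch of path-based field selection, similar in spirit to dev_transform_fields. It is not the actor's implementation: get_path and select_fields are hypothetical names, and returning a flat dictionary keyed by path is an assumption about the result shape.

```python
from typing import Any

_MISSING = object()  # sentinel so that a legitimate None value is not treated as absent


def get_path(item: Any, path: str) -> Any:
    """Walk a dotted path such as 'prices.price' or 'images.0.src' through dicts and lists."""
    current = item
    for part in path.split("."):
        if isinstance(current, list):
            if part.isdigit() and int(part) < len(current):
                current = current[int(part)]
            else:
                return _MISSING
        elif isinstance(current, dict) and part in current:
            current = current[part]
        else:
            return _MISSING
    return current


def select_fields(item: dict, fields: str) -> dict:
    """Keep only the comma-separated paths that exist in the item."""
    selected = {}
    for path in (field.strip() for field in fields.split(",")):
        if not path:
            continue
        value = get_path(item, path)
        if value is not _MISSING:
            selected[path] = value
    return selected


product = {
    "url": "https://example.com/product/example-product",
    "name": "Example Product",
    "prices": {"price": "29.99"},
    "images": [{"id": 1, "src": "https://example.com/wp-content/uploads/image.jpg"}],
}
print(select_fields(product, "url,name,prices.price,images.0.src"))
```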

FAQ

Do I need API keys or a login to use this WooCommerce product scraper?

No. The actor fetches data from publicly accessible WooCommerce REST “store” and WordPress “wp/v2” endpoints and does not require login or private API credentials.

Which resources can I scrape besides products?

Set the resource input to scrape products, categories, brands, tags, attributes, reviews, pages, posts, comments, post-categories, post-tags, or users. The actor returns each resource’s fields as provided by the API, plus store and resource_type.

How many items can I scrape per store?

The limit parameter controls the maximum number of results per query. It defaults to 20 and accepts values from 1 to 1000.

Can I include or exclude product variations?

Yes. Set include_variations to true to include variations in results, or false to exclude them from the output.

How do I filter by price, SKU, rating, or stock?

Use min_price and max_price to constrain price, sku for SKUs, rating for rating values, and stock for stock status (instock, outofstock, onbackorder). You can also filter by featured and sale.

What happens if a store blocks my requests?

The actor includes proxy fallback logic (no proxy → datacenter → residential) with retries. You can also configure proxyConfiguration or provide a custom dev_proxy_config.

Can I change how descriptions are formatted?

Yes. Use format to choose md (Markdown), text (plain text), or html. The actor applies this to description, short_description, and content/excerpt fields where applicable.

How can I export the results?

After the run, open the Dataset in Apify Console and export your data to JSON or CSV for downstream analysis, PIM/ERP import, or ETL workflows.
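If you download the JSON export and want a CSV on your side instead, nested objects and arrays need to be flattened into columns first. The sketch below shows one way to do that with the standard library. The dotted column names are this sketch's own choice and may differ from the columns in the platform's CSV export. flatten and to_csv are hypothetical helpers.

```python
import csv
import io
from typing import Any, Dict, Iterable


def flatten(value: Any, prefix: str = "") -> Dict[str, Any]:
    """Turn nested dicts and lists into a single level with dotted keys."""
    flat: Dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}{index}."))
    else:
        flat[prefix[:-1]] = value  # drop the trailing dot added by the parent level
    return flat


def to_csv(items: Iterable[dict]) -> str:
    """Write items as CSV text, using the union of all flattened keys as columns."""
    rows = [flatten(item) for item in items]
    # dict.fromkeys preserves first-seen order while removing duplicate column names.
    columns = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


items = [
    {"id": 12345, "name": "Example Product", "prices": {"price": "29.99"}},
    {"id": 12346, "name": "Other Product", "tags": [{"slug": "summer"}]},
]
print(to_csv(items))
```

Empty objects and arrays produce no columns in this sketch, which mirrors the cleaned-by-default behaviour loosely but is a design choice of the example rather than documented actor behaviour.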

Final thoughts

Woocommerce Scraper is built for accurate, scalable WooCommerce data extraction. With robust filtering, proxy fallback, and flexible output formatting, it helps marketers, developers, analysts, and researchers automate catalog scraping, price tracking, and competitor analysis across multiple stores. Developers can integrate results via the Apify API to power Python-based pipelines and ETL workflows. Start extracting smarter WooCommerce insights today with a reliable, automation-ready scraper.