Woocommerce Scraper
Pricing
$19.99/month + usage
WooCommerce Scraper (woocommerce-scraper) extracts product titles, prices, stock, variants, categories, images, reviews & URLs from WooCommerce stores. Ideal for SEO, price tracking, competitor research & catalog import. CSV/JSON export, scheduling, scaling.
Developer
ScrapeFlow
Woocommerce Scraper
Woocommerce Scraper is a production-ready Apify actor that extracts structured data from WooCommerce stores via public REST endpoints. It solves the challenge of automating WooCommerce data extraction at scale, from prices and stock to categories, tags, and media, making it ideal for marketers, developers, data analysts, and researchers. Use this WooCommerce product scraper to power SEO research, competitive monitoring, and catalog imports across multiple stores with repeatable workflows and robust reliability.
What is Woocommerce Scraper?
Woocommerce Scraper is a scalable WooCommerce scraping tool that fetches products and other resources from WooCommerce stores through public "wc/store" and "wp/v2" endpoints. It addresses the need to scrape WooCommerce products and metadata reliably (including prices, stock status, categories, tags, images, and reviews) without manual copy-paste or browser automation. Built for marketers, analysts, developers, and researchers, this WooCommerce store scraper enables large-scale data collection, filtering, and export for SEO, price tracking, competitor research, and catalog imports.
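To make the two endpoint families concrete, the sketch below builds request URLs for a few resource types. The route mapping is an assumption based on the standard WooCommerce Store API (`/wp-json/wc/store/v1/...`) and WordPress REST API (`/wp-json/wp/v2/...`) paths; the actor's internal routing is not documented here, and `build_endpoint_url` is a hypothetical helper, not part of the actor.

```python
from urllib.parse import urlencode, urljoin

# Assumed mapping from the actor's `resource` values to public REST routes.
# Illustrative only; the actor's real routing may differ.
RESOURCE_ROUTES = {
    "products": "wp-json/wc/store/v1/products",
    "categories": "wp-json/wc/store/v1/products/categories",
    "reviews": "wp-json/wc/store/v1/products/reviews",
    "pages": "wp-json/wp/v2/pages",
    "posts": "wp-json/wp/v2/posts",
}


def build_endpoint_url(store_url: str, resource: str = "products", **params) -> str:
    """Return the public REST URL for a resource on the given store."""
    if resource not in RESOURCE_ROUTES:
        raise ValueError(f"unsupported resource: {resource!r}")
    # A trailing slash makes urljoin append the route instead of replacing a path segment.
    base = store_url.rstrip("/") + "/"
    url = urljoin(base, RESOURCE_ROUTES[resource])
    # Drop parameters that were not set so the query string stays minimal.
    query = {key: value for key, value in params.items() if value is not None}
    return f"{url}?{urlencode(query)}" if query else url


print(build_endpoint_url("https://example.com", "products", per_page=20, search="shirt"))
```

Because these routes are public, no API key is involved; a store that disables or restricts them would simply return an error for the corresponding resource.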
What data / output can you get?
Below are the primary fields the actor outputs when scraping products. Fields are returned as found in the store's API, cleaned by default, and enriched with the source store URL and resource type.
| Data field | Description | Example value |
|---|---|---|
| url | Product page URL | https://example.com/product/example-product |
| id | Product ID | 12345 |
| name | Product name | Example Product |
| slug | Product slug | example-product |
| type | Product type (simple, variable, etc.) | simple |
| sku | Stock-keeping unit | SKU-001 |
| on_sale | Whether the product is on sale | false |
| prices.price | Current price (string) | 29.99 |
| prices.regular_price | Regular price (string) | 29.99 |
| prices.currency_code | Currency code | USD |
| average_rating | Average rating (string) | 4.5 |
| review_count | Number of reviews | 10 |
| is_in_stock | Availability flag | true |
| images[] | Array of image objects | [{"id":1,"src":"https://…/image.jpg"}] |
| categories[] | Array of category terms | [{"id":12,"slug":"accessories"}] |
| tags[] | Array of tag terms | [{"id":34,"slug":"summer"}] |
| brands[] | Array of brand terms | [{"id":7,"slug":"acme"}] |
| attributes[] | Product attributes/options | [{"id":1,"name":"Color","options":["Red","Blue"]}] |
| add_to_cart | Add-to-cart metadata | {"minimum":1,"maximum":10,"url":"…"} |
| store | Source store URL | https://example.com |
| resource_type | Scraped resource | products |
Notes:
- Additional product fields include: parent, variation, short_description, description, prices.price_range, prices.currency_symbol, prices.currency_prefix, prices.currency_suffix, prices.currency_minor_unit, prices.currency_decimal_separator, prices.currency_thousand_separator, variations, grouped_products, has_options, is_purchasable, is_on_backorder, low_stock_remaining, sold_individually, stock_availability, extensions.
- For non-product resources (e.g., categories, reviews, pages, posts, comments, users), the actor returns the resource's fields as-is from the API, plus store and resource_type.
- You can export your dataset to JSON or CSV from the Apify platform.
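Since prices and ratings arrive as strings, downstream code usually converts them before doing arithmetic. The sketch below shows one way to do that for a single dataset item. It assumes price strings are decimal values, as in the example rows above; if your dataset holds minor-unit integers instead, divide by 10 to the power of `prices.currency_minor_unit`. `summarize_price` is a hypothetical helper name.

```python
from decimal import Decimal


def summarize_price(item: dict) -> dict:
    """Convert the string price fields of one product item into numbers."""
    prices = item.get("prices") or {}
    price = Decimal(prices.get("price", "0"))
    # Fall back to the current price when no regular price is present.
    regular = Decimal(prices.get("regular_price", prices.get("price", "0")))
    rating = item.get("average_rating")
    return {
        "name": item.get("name"),
        "currency": prices.get("currency_code"),
        "price": price,
        "discount": regular - price,
        "rating": float(rating) if rating else None,
    }


sample = {
    "name": "Example Product",
    "prices": {"price": "24.99", "regular_price": "29.99", "currency_code": "USD"},
    "average_rating": "4.5",
}
print(summarize_price(sample))
```

`Decimal` is used rather than `float` so that price differences stay exact when they are compared or summed across runs.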
Key features
- Automatic proxy fallback for resilience: Built-in logic escalates from no proxy → datacenter → residential, with retries, to reduce blocking and keep your runs stable.
- Multi-store batch processing: Add multiple store URLs via startUrls, url, or dev_fileupload to scrape several WooCommerce catalogs in a single run.
- Powerful filtering & sorting: Filter by featured and on-sale products, SKU, rating, tax class, min/max price, category, tag, product type, status, and stock. Sort by date, modified, id, title, slug, price, popularity, rating, menu_order, or comment_count.
- Resource flexibility beyond products: Select from products, categories, brands, tags, attributes, reviews, pages, posts, comments, post-categories, post-tags, or users.
- Clean, formatted descriptions: Choose output format for descriptions and content: md (Markdown), text (plain), or html.
- Field selection & transformations: Use dev_transform_fields to select only specific output fields (supports dot notation and array indices).
- Custom storage & safe restarts: Save to a custom-named dataset (dev_dataset_name), clear datasets before runs (dev_dataset_clear), and benefit from incremental pushing to avoid data loss.
- Custom networking controls: Configure proxy settings (proxyConfiguration or dev_proxy_config), and optionally send custom HTTP headers/cookies.
- Production-ready reliability: Includes retry logic, robust error handling, and real-time logging. Ideal for continuous price monitoring, catalog maintenance, and analytics pipelines.
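The proxy escalation described in the first feature can be pictured roughly as follows. This is a minimal sketch of the general pattern, not the actor's actual code: the tier names, the retry count, and the `fetch` callback signature are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence

# Proxy tiers in escalation order; None stands for a direct connection.
PROXY_TIERS: Sequence[Optional[str]] = (None, "datacenter", "residential")


def fetch_with_fallback(
    fetch: Callable[[str, Optional[str]], str],
    url: str,
    retries_per_tier: int = 2,
) -> str:
    """Try each proxy tier in order, retrying a tier before escalating."""
    last_error: Optional[Exception] = None
    for tier in PROXY_TIERS:
        for _attempt in range(retries_per_tier):
            try:
                return fetch(url, tier)
            except ConnectionError as exc:
                last_error = exc
    raise RuntimeError(f"all proxy tiers failed for {url}") from last_error


# Stand-in for a real HTTP call: it only succeeds through the datacenter tier.
calls = []


def fake_fetch(url: str, tier: Optional[str]) -> str:
    calls.append(tier)
    if tier != "datacenter":
        raise ConnectionError("blocked")
    return "ok"


print(fetch_with_fallback(fake_fetch, "https://example.com"))
print(calls)
```

Starting with the cheapest option and escalating only on failure keeps proxy usage low for stores that do not block direct requests.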
How to use Woocommerce Scraper - step by step
- Create or log in to your Apify account.
- Open the "woocommerce-scraper" actor in Apify Console.
- Add your store URLs in startUrls (string list) or use url as an alternative. You can also provide a file/URL with dev_fileupload to bulk-load URLs.
- Choose the resource to scrape (resource), e.g., products (default), and set limit (1 to 1000).
- Configure filters as needed: featured, sale, search, sku, rating, min_price, max_price, tax_class, category, tag, product_type, status, stock, include_variations, and sorting (sort + order).
- Select description format (format): md, text, or html.
- Optionally set proxyConfiguration or dev_proxy_config and add extra headers/cookies (dev_custom_headers, dev_custom_cookies).
- Click Start to run the actor and monitor logs for progress.
- When finished, open the Dataset to view results and export to JSON or CSV.
Pro Tip: Use dev_transform_fields (e.g., url,name,prices.price) to output only the fields you need, and dev_dataset_name (e.g., data-{ACTOR}-{DATE}-{TIME}) for clean, time-stamped storage.
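To illustrate what a field list such as url,name,prices.price selects, here is a small sketch of dot-notation path resolution. It is an assumption about the behaviour, not the actor's implementation: in particular, the `images.0.src` index syntax and the flat, path-keyed output shape are guesses, and `get_path` and `select_fields` are hypothetical helpers.

```python
from typing import Any

_MISSING = object()  # sentinel so that legitimate None values are preserved


def get_path(item: Any, path: str) -> Any:
    """Resolve a dotted path such as 'images.0.src' against nested data."""
    current = item
    for part in path.split("."):
        if isinstance(current, list) and part.isdigit() and int(part) < len(current):
            current = current[int(part)]
        elif isinstance(current, dict) and part in current:
            current = current[part]
        else:
            return _MISSING
    return current


def select_fields(item: dict, fields: str) -> dict:
    """Keep only the comma-separated field paths that exist in the item."""
    selected = {}
    for path in (field.strip() for field in fields.split(",")):
        if not path:
            continue
        value = get_path(item, path)
        if value is not _MISSING:
            selected[path] = value
    return selected


product = {
    "url": "https://example.com/product/example-product",
    "name": "Example Product",
    "prices": {"price": "29.99", "currency_code": "USD"},
    "images": [{"id": 1, "src": "https://example.com/wp-content/uploads/image.jpg"}],
}
print(select_fields(product, "url,name,prices.price,images.0.src"))
```

Run a small test scrape first to confirm how the actor names the selected fields in its own output before relying on a particular shape.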
Use cases
| Use case | Description |
|---|---|
| SEO content enrichment | Aggregate product names, descriptions, categories, tags, and images to enrich on-site content and metadata. |
| Price monitoring & alerts | Track prices and on_sale state across stores with a reliable WooCommerce price scraper for competitive intelligence. |
| Competitor catalog analysis | Compare SKUs, product types, and inventory across competitors to inform assortment and merchandising. |
| Catalog import & PIM sync | Export product data to JSON/CSV and feed into PIM/ERP for catalog onboarding or updates. |
| Inventory tracking | Check is_in_stock and related stock availability to monitor inventory changes by category or brand. |
| Market & academic research | Collect structured datasets from multiple stores for trend analysis and research projects. |
| API pipelines & ETL | Integrate via the Apify API with your Python workflows for automated WooCommerce API data extraction. |
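For the API pipeline use case, a run can be started from Python with the official `apify-client` package. The sketch below uses that client's `actor(...).call(...)` and `dataset(...).iterate_items()` methods. The actor ID `scrapeflow/woocommerce-scraper` is an assumption (copy the real ID from Apify Console), and the `APIFY_TOKEN` environment variable name is just a convention chosen here.

```python
import os


def run_woocommerce_scraper(client, actor_id: str, run_input: dict) -> list:
    """Start the actor, wait for it to finish, and return the dataset items."""
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return run details")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def main() -> None:
    # Requires `pip install apify-client`.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    items = run_woocommerce_scraper(
        client,
        "scrapeflow/woocommerce-scraper",  # assumed actor ID, replace with the real one
        {"startUrls": ["https://example-store.com"], "resource": "products", "limit": 50},
    )
    for item in items:
        print(item.get("name"), (item.get("prices") or {}).get("price"))


# Only contact the Apify API when a token is actually configured.
if __name__ == "__main__" and "APIFY_TOKEN" in os.environ:
    main()
```

Passing the client in as a parameter keeps `run_woocommerce_scraper` easy to test with a stub, which helps when the scraper is one step in a larger ETL job.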
Why choose Woocommerce Scraper?
Woocommerce Scraper is built for precision, automation, and scale: a robust alternative to browser extensions and fragile scripts.
- Accurate, structured output with rich price, stock, and taxonomy data
- Multi-store coverage with flexible resource selection (products to posts)
- Scales with proxy fallback and retry logic for continuous operations
- Developer-friendly via Apify API; ideal for Python WooCommerce scraper workflows
- Public data only by default for safer, ethical operations
- Cost-effective for repeated runs like SEO audits, competitor research, and catalog updates
- Works with your automation stack (export to CSV/JSON for tools like Make, n8n, Zapier)
In short, it's an alternative to WooCommerce product scraping plugins, built for stable, repeatable data extraction.
Is it legal / ethical to use Woocommerce Scraper?
Yes, when done responsibly. This actor accesses publicly available WooCommerce endpoints and does not log into private accounts or bypass authentication.
Guidelines for compliant use:
- Scrape only publicly accessible data.
- Respect target websites' terms of service and robots.txt.
- Avoid collecting personal data and ensure compliance with GDPR/CCPA and similar regulations.
- Consult your legal team for edge cases or regulated use.
Input parameters & output format
Example JSON input
{"startUrls": ["https://woocommerce.com","https://example-store.com"],"proxyConfiguration": {"useApifyProxy": false},"limit": 50,"resource": "products","include_variations": false,"format": "md","sort": "date","order": "","search": "shirt","sku": "SKU-001,SKU-002","rating": "4,5","min_price": 10,"max_price": 200,"tax_class": "standard","category": "12,34","tag": "56","product_type": "simple","status": "publish","stock": "instock","featured": false,"sale": false,"dev_proxy_config": "socks5://example.com:9000","dev_custom_headers": "[{\"name\": \"Authorization\", \"value\": \"Bearer token\"}]","dev_custom_cookies": "[{\"name\": \"session\", \"value\": \"abc123\"}]","dev_transform_fields": "url,name,prices.price","dev_dataset_name": "data-{ACTOR}-{DATE}-{TIME}","dev_dataset_clear": false,"dev_no_strip": false,"dev_fileupload": "https://example.com/store-urls.txt"}
All input fields
- startUrls (array): Where do you want to shop? (Also accepts 'url' as an array of store URLs.) Required: Yes. Default: none.
- url (array): Alternative to startUrls: array of store URLs to scrape. Required: No. Default: none.
- limit (integer): Number of results (per query). Required: No. Default: 20 (min: 1, max: 1000).
- resource (string): Select resource type to scrape. One of: products, categories, brands, tags, attributes, reviews, pages, posts, comments, post-categories, post-tags, users. Required: No. Default: products.
- include_variations (boolean): Include product variations in results. Required: No. Default: false.
- format (string): Output format for descriptions. One of: md, text, html. Required: No. Default: md.
- sort (string): Sort results by attribute. One of: "", date, modified, id, include, title, slug, price, popularity, rating, menu_order, comment_count. Required: No. Default: date.
- order (string): Sort direction. One of: "", asc, desc. Required: No. Default: "".
- search (string): Limit results to those matching a string. Required: No. Default: none.
- sku (string): Limit result set to products with specific SKU(s). Use commas to separate. Required: No. Default: none.
- rating (string): Filter by product ratings. Enter comma-separated rating values (e.g., 1,2,3,4,5). Required: No. Default: none.
- min_price (integer): Limit result set to products based on a minimum price. Required: No. Default: none.
- max_price (integer): Limit result set to products based on a maximum price. Required: No. Default: none.
- tax_class (string): Limit result set to products with a specific tax class. One of: "", standard, reduced-rate, zero-rate. Required: No. Default: none.
- category (string): Product category ID(s) separated by comma. Required: No. Default: none.
- tag (string): Product tag ID(s) separated by comma. Required: No. Default: none.
- product_type (string): Products assigned a specific type. One of: "", simple, grouped, external, variable, wbs_bundle, variation. Required: No. Default: none.
- status (string): Filter by product status. One of: "", future, trash, draft, pending, private, publish. Required: No. Default: none.
- stock (string): Filter by stock status. One of: "", instock, outofstock, onbackorder. Required: No. Default: none.
- featured (boolean): Featured products. Required: No. Default: false.
- sale (boolean): Products on sale. Required: No. Default: false.
- proxyConfiguration (object): Choose which proxies to use. By default, no proxy is used. Required: No. Default: {"useApifyProxy": false}.
- dev_proxy_config (string): Supported protocols: HTTP(S), SOCKS5. Example format: socks5://example.com:9000. Required: No. Default: none.
- dev_custom_headers (string): Additional HTTP headers as a JSON array (string). Required: No. Default: none.
- dev_custom_cookies (string): Additional HTTP cookies as a JSON array (string). Required: No. Default: none.
- dev_transform_fields (string): Transform the resulting output. Enter comma-separated field paths (dot notation for objects; numeric indices for arrays). Required: No. Default: none.
- dev_dataset_name (string): Save results into a custom-named dataset; supports {ACTOR}, {DATE}, {TIME}. Required: No. Default: none.
- dev_dataset_clear (boolean): Clear the dataset before insert/update. Required: No. Default: false.
- dev_no_strip (boolean): Disable data cleansing (keep empty values). Required: No. Default: false.
- dev_fileupload (string): Upload your file and paste the URL to load store URLs. Required: No. Default: none.
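When inputs are generated programmatically, it can help to check them against the constraints listed above before starting a run. The sketch below is a hypothetical client-side check covering only a handful of fields; the actor performs its own validation, and the `min_price`/`max_price` ordering rule is a common-sense assumption rather than something the field list states.

```python
ALLOWED_RESOURCES = {
    "products", "categories", "brands", "tags", "attributes", "reviews",
    "pages", "posts", "comments", "post-categories", "post-tags", "users",
}
ALLOWED_FORMATS = {"md", "text", "html"}
ALLOWED_ORDERS = {"", "asc", "desc"}


def validate_input(run_input: dict) -> list[str]:
    """Return a list of problems found; an empty list means the input looks fine."""
    problems = []
    if not any(run_input.get(key) for key in ("startUrls", "url", "dev_fileupload")):
        problems.append("provide store URLs via startUrls, url, or dev_fileupload")
    limit = run_input.get("limit", 20)
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 1000:
        problems.append("limit must be an integer between 1 and 1000")
    if run_input.get("resource", "products") not in ALLOWED_RESOURCES:
        problems.append("resource is not one of the supported types")
    if run_input.get("format", "md") not in ALLOWED_FORMATS:
        problems.append("format must be md, text, or html")
    if run_input.get("order", "") not in ALLOWED_ORDERS:
        problems.append('order must be "", asc, or desc')
    low, high = run_input.get("min_price"), run_input.get("max_price")
    # Assumed rule: a minimum above the maximum can never match anything.
    if low is not None and high is not None and low > high:
        problems.append("min_price must not exceed max_price")
    return problems


print(validate_input({"startUrls": ["https://example-store.com"], "limit": 50}))
print(validate_input({"limit": 5000, "resource": "orders"}))
```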
Example JSON output (products)
{"url": "https://example.com/product/example-product","id": 12345,"name": "Example Product","slug": "example-product","parent": 0,"type": "simple","variation": "","sku": "SKU-001","short_description": "Short summary of the product.","description": "**Full description** with details and features.","on_sale": false,"prices": {"price": "29.99","regular_price": "29.99","sale_price": "0","price_range": null,"currency_code": "USD","currency_symbol": "$","currency_minor_unit": 2,"currency_decimal_separator": ".","currency_thousand_separator": ",","currency_prefix": "$","currency_suffix": ""},"average_rating": "4.5","review_count": 10,"images": [{ "id": 1, "src": "https://example.com/wp-content/uploads/image.jpg", "alt": "Example" }],"categories": [{ "id": 12, "name": "Accessories", "slug": "accessories" }],"tags": [{ "id": 34, "name": "Summer", "slug": "summer" }],"brands": [{ "id": 7, "name": "Acme", "slug": "acme" }],"attributes": [{ "id": 1, "name": "Color", "options": ["Red", "Blue"] }],"variations": [],"grouped_products": [],"has_options": false,"is_purchasable": true,"is_in_stock": true,"is_on_backorder": false,"low_stock_remaining": null,"sold_individually": false,"stock_availability": { "class": "in-stock", "text": "" },"add_to_cart": {"minimum": 1,"maximum": 10,"multiple_of": 1,"single_text": "Add to cart","url": "https://example.com/?add-to-cart=12345"},"extensions": {"bundles": [],"compatibility": { "host": "", "host_slug": "", "vendor_id": 0, "version": "" },"dependencies": { "plugins": [], "themes": [] },"express": { "plans": "" },"marketplace": { "slug": "" }},"store": "https://example.com","resource_type": "products"}
Notes:
- Output fields may be omitted if empty (cleaned by default). Set dev_no_strip to true to keep empty/null values.
- For non-product resources, the actor outputs the resource fields unmodified from the API and adds store and resource_type.
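Because cleaned output can omit empty fields, consumer code is safer when it reads every field with a default. The sketch below flattens product items into CSV rows that way. The column choice and the `to_row` helper are illustrative assumptions, not part of the actor.

```python
import csv
import io


def to_row(item: dict) -> dict:
    """Flatten one product item, tolerating fields removed by cleaning."""
    prices = item.get("prices") or {}
    categories = item.get("categories") or []
    return {
        "id": item.get("id", ""),
        "name": item.get("name", ""),
        "sku": item.get("sku", ""),
        "price": prices.get("price", ""),
        "currency": prices.get("currency_code", ""),
        "in_stock": item.get("is_in_stock", ""),
        "categories": "|".join(category.get("slug", "") for category in categories),
        "store": item.get("store", ""),
    }


items = [
    {
        "id": 12345,
        "name": "Example Product",
        "sku": "SKU-001",
        "prices": {"price": "29.99", "currency_code": "USD"},
        "is_in_stock": True,
        "categories": [{"id": 12, "slug": "accessories"}],
        "store": "https://example.com",
    },
    # An item where cleaning removed sku, prices, and categories.
    {"id": 67890, "name": "Sparse Product", "store": "https://example.com"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(to_row({}).keys()))
writer.writeheader()
writer.writerows(to_row(item) for item in items)
print(buffer.getvalue())
```

Setting dev_no_strip to true is the alternative when a fixed schema matters more than compact output.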
FAQ
Do I need API keys or a login to use this WooCommerce product scraper?
No. The actor fetches data from publicly accessible WooCommerce REST "store" and WordPress "wp/v2" endpoints and does not require login or private API credentials.
Which resources can I scrape besides products?
You can choose the resource input to scrape products, categories, brands, tags, attributes, reviews, pages, posts, comments, post-categories, post-tags, or users. The actor returns each resource's fields as provided by the API, plus store and resource_type.
How many items can I scrape per store?
The limit parameter controls the maximum number of results per query. It defaults to 20 and accepts values from 1 to 1000.
Can I include or exclude product variations?
Yes. Set include_variations to true to include variations in results, or false to exclude them from the output.
How do I filter by price, SKU, rating, or stock?
Use min_price and max_price to constrain price, sku for SKUs, rating for rating values, and stock for stock status (instock, outofstock, onbackorder). You can also filter by featured and sale.
What happens if a store blocks my requests?
The actor includes proxy fallback logic (no proxy → datacenter → residential) with retries. You can also configure proxyConfiguration or provide a custom dev_proxy_config.
Can I change how descriptions are formatted?
Yes. Use format to choose md (Markdown), text (plain text), or html. The actor applies this to description, short_description, and content/excerpt fields where applicable.
How can I export the results?
After the run, open the Dataset in Apify Console and export your data to JSON or CSV for downstream analysis, PIM/ERP import, or ETL workflows.
Final thoughts
Woocommerce Scraper is built for accurate, scalable WooCommerce data extraction. With robust filtering, proxy fallback, and flexible output formatting, it helps marketers, developers, analysts, and researchers automate catalog scraping, price tracking, and competitor analysis across multiple stores. Developers can integrate results via the Apify API to power Python-based pipelines and ETL workflows. Start extracting smarter WooCommerce insights today with a reliable, automation-ready scraper.