# HealthWarehouse Scraper (`kawsar/healthwarehouse-scraper`) Actor

HealthWarehouse scraper that extracts product data by category, including price, stock, SKU, and URLs, so ecommerce and research teams can automate catalog tracking and monitor pharmacy market changes with a clean dataset output.

- **URL**: https://apify.com/kawsar/healthwarehouse-scraper.md
- **Developed by:** [Kawsar](https://apify.com/kawsar) (community)
- **Categories:** E-commerce, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $3.90 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
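
As a rough illustration of the pricing above, the arithmetic below estimates the result charge for a run. It is a sketch under two assumptions: that the listed starting price of $3.90 per 1,000 results applies without a plan discount, and that only result events are billed. `estimate_cost_usd` is a hypothetical helper, not part of any Apify library.

```python
from decimal import Decimal

# Assumption: the listed starting price, with no subscription discount applied.
PRICE_PER_1000_RESULTS = Decimal("3.90")


def estimate_cost_usd(result_count: int,
                      price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Estimate the per-result charge for a run, ignoring any other billed events."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = price_per_1000 * result_count / 1000
    # Round to four decimal places for display.
    return cost.quantize(Decimal("0.0001"))


if __name__ == "__main__":
    for count in (100, 1000):
        print(count, "results ->", estimate_cost_usd(count), "USD")
```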

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## HealthWarehouse Products Scraper: Extract Product Data from HealthWarehouse Categories

This HealthWarehouse scraper collects product data from `healthwarehouse.com` through the public GraphQL endpoint used by the storefront. Use it to scrape product names, SKUs, prices, stock status, and product URLs by category so you can monitor catalog changes, automate price tracking, and run pharmacy market research without manual exports.

### Use cases

- **Price monitoring**: track product price shifts in a target HealthWarehouse category over time
- **Stock tracking**: watch in-stock and out-of-stock changes for diabetic supplies, OTC items, or prescription-related listings
- **Catalog intelligence**: collect structured product metadata for competitor analysis and assortment comparisons
- **Data pipelines**: automate product scraping and send dataset output into BI tools or Sheets
- **Market research**: capture category-level snapshots for pharmacy and healthcare ecommerce analysis
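
For the price-monitoring use case, one possible approach is to compare two dataset snapshots from separate runs. The sketch below assumes each snapshot is a list of items carrying the `sku` and `minPrice` output fields; `price_changes` is a hypothetical helper and the sample values are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

Item = Dict[str, object]
PricePair = Tuple[Optional[float], Optional[float]]


def price_changes(old: List[Item], new: List[Item]) -> Dict[str, PricePair]:
    """Map SKU -> (old minPrice, new minPrice) for SKUs present in both snapshots
    whose price differs. Rows without a `sku` (for example error rows) are skipped."""
    old_prices = {item["sku"]: item.get("minPrice") for item in old if "sku" in item}
    changes: Dict[str, PricePair] = {}
    for item in new:
        sku = item.get("sku")
        if sku in old_prices and old_prices[sku] != item.get("minPrice"):
            changes[sku] = (old_prices[sku], item.get("minPrice"))
    return changes


if __name__ == "__main__":
    # Illustrative data only, not real HealthWarehouse prices.
    yesterday = [{"sku": "A", "minPrice": 100}, {"sku": "B", "minPrice": 5.5}]
    today = [{"sku": "A", "minPrice": 90}, {"sku": "B", "minPrice": 5.5}]
    print(price_changes(yesterday, today))
```

Keying on `sku` rather than `name` is a design choice for this sketch; `productId` would work the same way if SKUs turn out not to be stable.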

### Input

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `category` | string | `diabetic-supplies` | One category selected from the validated dropdown list of working `categoryUrl` slugs. |
| `searchInputs` | array of strings | `[]` | Optional multiple search queries, one per line, executed inside the selected category. |
| `state` | string | `""` | Optional US state code used by the API. |
| `maxItems` | integer | `100` | Maximum number of products to collect per category, hard-capped at 1000. |
| `pageSize` | integer | `24` | Number of products requested per API page. |
| `timeoutSecs` | integer | `300` | Total run timeout to limit runtime costs. |
| `requestTimeoutSecs` | integer | `30` | Timeout for each HTTP request. |
| `proxyConfiguration` | object | Datacenter (Anywhere) | Proxy type and location to use for requests. |

#### Example input

```json
{
    "category": "diabetic-supplies",
    "searchInputs": ["sildenafil", "lisinopril"],
    "state": "",
    "maxItems": 50,
    "pageSize": 24,
    "timeoutSecs": 300,
    "requestTimeoutSecs": 30,
    "proxyConfiguration": { "useApifyProxy": true }
}
````
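
Before starting a run, it can help to check an input object locally. The sketch below uses the category slugs and numeric ranges listed in the OpenAPI specification further down; `validate_input` is a hypothetical helper, and the platform performs its own validation regardless.

```python
from typing import List

# Category slugs and numeric ranges as listed in this Actor's input schema.
ALLOWED_CATEGORIES = {
    "diabetic-supplies", "easytouch", "fsa", "hw-logo-products", "home-medical",
    "needles-syringes", "over-the-counter", "contraceptive", "personal-care", "products",
}
LIMITS = {
    "maxItems": (1, 1000),
    "pageSize": (1, 100),
    "timeoutSecs": (30, 3600),
    "requestTimeoutSecs": (5, 120),
}


def validate_input(actor_input: dict) -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    category = actor_input.get("category")
    if category not in ALLOWED_CATEGORIES:
        problems.append(f"category {category!r} is not a validated slug")
    for key, (low, high) in LIMITS.items():
        value = actor_input.get(key)
        if value is None:
            continue  # optional field; the Actor applies its default
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            problems.append(f"{key} must be an integer between {low} and {high}")
    return problems


if __name__ == "__main__":
    print(validate_input({"category": "diabetic-supplies", "maxItems": 50, "pageSize": 24}))
    print(validate_input({"category": "pets", "maxItems": 5000}))
```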

### What data does this Actor extract?

The Actor stores each product in the dataset with normalized fields:

```json
{
    "category": "diabetic-supplies",
    "productId": 28003,
    "name": "Anipryl 15mg Tablets- 30ct",
    "sku": "*V*70047",
    "isActive": true,
    "visibility": 0,
    "pharmacy": true,
    "specialOtc": false,
    "taxClassId": null,
    "url": "https://www.healthwarehouse.com/anipryl-15mg-tablets-30ct",
    "imageUrl": "https://d3pq5rjvq8yvv1.cloudfront.net/products/Anipryl-15mg.png",
    "thumbnailUrl": "https://d3pq5rjvq8yvv1.cloudfront.net/products/Anipryl-15mg.png",
    "minPrice": 100,
    "defaultQty": null,
    "inStock": true,
    "scrapedAt": "2026-05-05T11:00:00+00:00"
}
```

| Field | Type | Description |
|-------|------|-------------|
| `category` | string | Input category slug used for this record. |
| `productId` | integer | Product identifier returned by the HealthWarehouse API. |
| `name` | string | Product title. |
| `sku` | string | Product SKU. |
| `isActive` | boolean | Product active status flag. |
| `visibility` | integer | Visibility code from the API. |
| `pharmacy` | boolean | Pharmacy product flag. |
| `specialOtc` | boolean | Special OTC flag. |
| `taxClassId` | integer/null | Tax class ID. |
| `url` | string | Product page URL. |
| `imageUrl` | string | Main product image URL. |
| `thumbnailUrl` | string | Product thumbnail URL. |
| `minPrice` | number/null | Minimum product price. |
| `defaultQty` | number/null | Default quantity value. |
| `inStock` | boolean | Current stock status. |
| `scrapedAt` | string | UTC timestamp for when the item was collected. |
| `error` | string | Error details when a request fails. |
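
Because product rows and error rows share one dataset, downstream code usually needs to separate them. The following is a minimal sketch that assumes error rows carry a non-empty `error` field and product rows do not; the exact shape of an error row is not documented here, and both helper names are hypothetical.

```python
from typing import Dict, List, Tuple

Item = Dict[str, object]


def split_rows(items: List[Item]) -> Tuple[List[Item], List[Item]]:
    """Separate product rows from error rows based on the `error` field."""
    products = [item for item in items if not item.get("error")]
    errors = [item for item in items if item.get("error")]
    return products, errors


def in_stock_only(products: List[Item]) -> List[Item]:
    """Keep only products whose `inStock` flag is exactly True."""
    return [item for item in products if item.get("inStock") is True]


if __name__ == "__main__":
    # Illustrative rows only.
    rows = [
        {"sku": "A", "inStock": True},
        {"sku": "B", "inStock": False},
        {"category": "diabetic-supplies", "error": "request failed"},
    ]
    products, errors = split_rows(rows)
    print(len(products), "products,", len(errors), "errors")
    print([p["sku"] for p in in_stock_only(products)])
```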

### How it works

1. Uses the selected `category` from the dropdown.
2. Runs each query in `searchInputs` separately, or runs once with an empty query if no search input is provided.
3. Sends GraphQL requests with pagination (`endCursor`, `hasNextPage`).
4. Applies `maxItems` per search query.
5. Pushes product rows to the dataset and writes an error row if one query fails.
6. If category mode fails on page 1, retries with fallback search mode.
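
The pagination in steps 3 and 4 can be sketched as a cursor loop. This is not the Actor's source code: the page shape (`items`, `endCursor`, `hasNextPage` in one dict) and the `fetch_page` callable are assumptions made for illustration, and the demo uses an in-memory list instead of the real GraphQL endpoint.

```python
from typing import Callable, Dict, List, Optional

# Assumed page shape: {"items": [...], "endCursor": str, "hasNextPage": bool}
Page = Dict[str, object]
FetchPage = Callable[[Optional[str], int], Page]


def collect_products(fetch_page: FetchPage, max_items: int, page_size: int) -> List[dict]:
    """Follow endCursor/hasNextPage until max_items is reached or pages run out."""
    collected: List[dict] = []
    cursor: Optional[str] = None
    while len(collected) < max_items:
        page = fetch_page(cursor, page_size)
        collected.extend(page["items"])
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page["hasNextPage"] or not page["items"]:
            break
        cursor = page["endCursor"]
    return collected[:max_items]


def make_fake_fetch(catalog: List[dict]) -> FetchPage:
    """Build a stand-in fetcher over an in-memory list (demo only)."""
    def fetch(cursor: Optional[str], size: int) -> Page:
        start = int(cursor) if cursor else 0
        end = start + size
        return {
            "items": catalog[start:end],
            "endCursor": str(end),
            "hasNextPage": end < len(catalog),
        }
    return fetch


if __name__ == "__main__":
    catalog = [{"productId": n} for n in range(60)]
    items = collect_products(make_fake_fetch(catalog), max_items=50, page_size=24)
    print(len(items))
```

Trimming with `collected[:max_items]` after the loop keeps the cap exact even when the last page overshoots it.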

### FAQ

**Can I run multiple search terms in one run?**\
Yes. Add multiple entries in `searchInputs`, one per line.

**Which slug should I use for Pharmacy, Home Medical, and Pet Supplies?**\
Use `products` for pharmacy-wide browsing, `home-medical` for Home Medical, and `personal-care` for Pet Supplies fallback coverage.

**Why are many category slugs missing from dropdown?**\
The dropdown includes only slugs that returned valid GraphQL category results during endpoint validation. This avoids empty runs and `FAILED_SEARCH_PRODUCTS` errors.

**Does it support proxies?**\
Yes. Use `proxyConfiguration` to select Datacenter, Residential, or your own proxy setup.

**How many products can I collect?**\
Set `maxItems` to any value up to 1000 to keep the Actor cost-controlled and predictable.

**Can I use this for SEO and product research workflows?**\
Yes. The output is structured for analysis, scheduling, and integration with external tools.

### Integrations

Connect this Actor with [Apify integrations](https://apify.com/integrations) to send product data to Google Sheets, Airbyte, webhooks, Slack, Make, Zapier, or custom endpoints.

If you need a reliable HealthWarehouse scraper for category-level product extraction and automation, this Actor gives you a clean dataset that is ready for price tracking and market analysis.

# Actor input Schema

## `category` (type: `string`):

Choose one validated HealthWarehouse category slug.

## `searchInputs` (type: `array`):

Optional multiple search queries, one per line. Each query runs within the selected category.

## `state` (type: `string`):

Optional state code used by the API. Leave empty to skip state filtering.

## `maxItems` (type: `integer`):

Maximum number of products to collect.

## `pageSize` (type: `integer`):

Number of products requested per GraphQL call.

## `timeoutSecs` (type: `integer`):

Hard stop for the full run to control costs.

## `requestTimeoutSecs` (type: `integer`):

Timeout for each API request.

## `proxyConfiguration` (type: `object`):

Select proxies to use for requests. Helps avoid IP blocking and rate limits. Datacenter proxies are fastest; Residential proxies are harder to detect.

## Actor input object example

```json
{
  "category": "diabetic-supplies",
  "searchInputs": [
    "sildenafil",
    "lisinopril"
  ],
  "state": "",
  "maxItems": 100,
  "pageSize": 24,
  "timeoutSecs": 300,
  "requestTimeoutSecs": 30,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "category": "diabetic-supplies",
    "searchInputs": [],
    "state": "",
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("kawsar/healthwarehouse-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "category": "diabetic-supplies",
    "searchInputs": [],
    "state": "",
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("kawsar/healthwarehouse-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "category": "diabetic-supplies",
  "searchInputs": [],
  "state": "",
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call kawsar/healthwarehouse-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=kawsar/healthwarehouse-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "HealthWarehouse Scraper",
        "description": "HealthWarehouse scraper that extracts product data by category, including price, stock, SKU, and URLs, so ecommerce and research teams can automate catalog tracking and monitor pharmacy market changes with a clean dataset output.",
        "version": "0.0",
        "x-build-id": "h75BjawLf75LwJVMX"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/kawsar~healthwarehouse-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-kawsar-healthwarehouse-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/kawsar~healthwarehouse-scraper/runs": {
            "post": {
                "operationId": "runs-sync-kawsar-healthwarehouse-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/kawsar~healthwarehouse-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-kawsar-healthwarehouse-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "category"
                ],
                "properties": {
                    "category": {
                        "title": "Category",
                        "enum": [
                            "diabetic-supplies",
                            "easytouch",
                            "fsa",
                            "hw-logo-products",
                            "home-medical",
                            "needles-syringes",
                            "over-the-counter",
                            "contraceptive",
                            "personal-care",
                            "products"
                        ],
                        "type": "string",
                        "description": "Choose one validated HealthWarehouse category slug.",
                        "default": "diabetic-supplies"
                    },
                    "searchInputs": {
                        "title": "Search inputs",
                        "type": "array",
                        "description": "Optional multiple search queries, one per line. Each query runs within the selected category.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "state": {
                        "title": "US state code",
                        "type": "string",
                        "description": "Optional state code used by the API. Leave empty to skip state filtering.",
                        "default": ""
                    },
                    "maxItems": {
                        "title": "Maximum items",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of products to collect.",
                        "default": 100
                    },
                    "pageSize": {
                        "title": "Page size",
                        "minimum": 1,
                        "maximum": 100,
                        "type": "integer",
                        "description": "Number of products requested per GraphQL call.",
                        "default": 24
                    },
                    "timeoutSecs": {
                        "title": "Actor timeout (seconds)",
                        "minimum": 30,
                        "maximum": 3600,
                        "type": "integer",
                        "description": "Hard stop for the full run to control costs.",
                        "default": 300
                    },
                    "requestTimeoutSecs": {
                        "title": "Request timeout (seconds)",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Timeout for each API request.",
                        "default": 30
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Select proxies to use for requests. Helps avoid IP blocking and rate limits. Datacenter proxies are fastest; Residential proxies are harder to detect."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
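
For projects without a client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The sketch below uses only the Python standard library; the path, `token` query parameter, and JSON request body come from the specification, while the helper names and the assumption that the response body is a JSON array of dataset items are illustrative.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/kawsar~healthwarehouse-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request; the token goes in the query string as the spec describes."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(token: str, actor_input: dict, timeout: float = 300.0):
    """Run the Actor synchronously and return the parsed response body.

    Assumption: the response is JSON; the spec only documents a 200 "OK".
    """
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the request target without sending anything over the network.
    demo = build_request("<YOUR_API_TOKEN>", {"category": "diabetic-supplies"})
    print(demo.get_method(), demo.full_url)
```

Calling `run_and_fetch` needs a real API token and starts a billable run, so the demo above only builds the request.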
