# Walmart Reviews Scraper — Product Reviews to CSV/JSON in 2 min (`knotless_cadence/walmart-reviews-scraper`) Actor

7 runs. Backed by 951-run Trustpilot flagship + 31-actor portfolio. Walmart product reviews → CSV/JSON. Bypasses 100-review UI cap. 17 fields: stars, text, author, date, helpful, images, seller. For BI + competitor monitoring + sentiment. spinov001@gmail.com · blog.spinov.online · t.me/scraping\_ai

- **URL**: https://apify.com/knotless\_cadence/walmart-reviews-scraper.md
- **Developed by:** [Alex](https://apify.com/knotless_cadence) (community)
- **Categories:** Developer tools, E-commerce, Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Walmart Reviews Scraper

Walmart product reviews → CSV / JSON / Excel in 2 minutes. Bypasses the 100-review default UI cap on `walmart.com/reviews/product/<id>`. 17 fields per review including image URLs and seller metadata.

For: BI dashboards, competitor monitoring, sentiment analysis, product-launch QA.

### Output schema (17 fields)

| Field | Type | Notes |
|---|---|---|
| `productId` | string | Walmart numeric product ID (extracted from URL or accepted directly) |
| `page` | integer | Pagination page on which this review was found |
| `sourceUrl` | string | Exact URL fetched (page + sort included) |
| `reviewId` | string \| null | Walmart's stable review identifier |
| `stars` | integer | 1 – 5 |
| `title` | string \| null | Review headline |
| `body` | string \| null | Review body text |
| `author` | string \| null | `userNickname` — display name (Walmart obfuscates real names) |
| `date` | string \| null | Submission date (US-format `M/D/YYYY` as Walmart returns it) |
| `verifiedPurchase` | boolean \| null | Derived from `userBadges`; `null` when Walmart omits the flag |
| `helpfulCount` | integer \| null | `positiveFeedback` — number of "Helpful" votes |
| `unhelpfulCount` | integer \| null | `negativeFeedback` — number of "Not Helpful" votes |
| `recommended` | boolean \| null | Reviewer recommends product (Walmart-prompted yes/no) |
| `imageUrls` | string[] | Photos attached to this review (empty array when none) |
| `fulfilledBy` | string \| null | E.g. "Walmart" / "Marketplace seller" — fulfilment side of purchase |
| `sellerName` | string \| null | Seller of record at time of purchase |
| `scrapedAt` | integer | Unix epoch seconds — when this row was collected |
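
The two time fields use different encodings (`date` is a US-format string, `scrapedAt` is epoch seconds), so downstream code will often want to normalize them before loading rows into a warehouse. The following is a minimal sketch, assuming rows arrive as plain dicts shaped like the table above. The helper names `parse_walmart_date` and `normalize_review` and the sample row are hypothetical and not part of the Actor.

```python
from datetime import date, datetime, timezone
from typing import Any, Optional


def parse_walmart_date(value: Optional[str]) -> Optional[date]:
    """Parse the M/D/YYYY string from the `date` field; missing values stay None."""
    if not value:
        return None
    return datetime.strptime(value, "%m/%d/%Y").date()


def normalize_review(row: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a dataset row with ISO 8601 dates and a non-null image list."""
    parsed = parse_walmart_date(row.get("date"))
    scraped = datetime.fromtimestamp(row["scrapedAt"], tz=timezone.utc)
    return {
        **row,
        "date": parsed.isoformat() if parsed else None,
        "scrapedAt": scraped.isoformat(),
        "imageUrls": row.get("imageUrls") or [],
    }


# Hypothetical sample row, for illustration only.
sample = {
    "productId": "123456789",
    "stars": 4,
    "date": "3/7/2026",
    "verifiedPurchase": None,
    "imageUrls": [],
    "scrapedAt": 0,
}
normalized = normalize_review(sample)
```

Nullable fields such as `verifiedPurchase` are passed through untouched, so a `null` from Walmart stays distinguishable from `false`.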

### Inputs

* **`productInputs`** *(array, required)* — list of full Walmart product URLs OR bare numeric product IDs. Mixed input is OK. Up to 50 products per run.
* **`maxReviewsPerProduct`** *(integer, default 200, max 5000)* — hard cap. If a product has fewer reviews than the cap, the scraper stops naturally; it does NOT pad.
* **`sortBy`** *(enum, default `most-recent`)* — `most-recent` | `most-helpful` | `rating-high` | `rating-low`.
* **`useProxy`** *(boolean, default `true`)* — strongly recommended ON.
* **`proxyConfiguration`** *(object, default RESIDENTIAL)* — Apify proxy group selector.
* **`maxConcurrency`** *(integer, default 2)* — parallel products. Reviews within a single product still paginate sequentially.
* **`requestDelayMs`** *(integer, default 1500ms)* — delay between sequential page fetches within a product.
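
Checking values before starting a run avoids paying for a run that the platform rejects. The sketch below builds an input object and checks it against the bounds published in the OpenAPI specification further down this page (1 to 50 products, 10 to 5000 reviews, concurrency 1 to 10, delay 500 to 10000 ms). The function `build_input` is a hypothetical client-side helper, not something the Actor ships.

```python
SORT_OPTIONS = {"most-recent", "most-helpful", "rating-high", "rating-low"}


def build_input(product_inputs, max_reviews=200, sort_by="most-recent",
                max_concurrency=2, request_delay_ms=1500):
    """Validate values against the documented bounds and return the Actor input dict."""
    if not 1 <= len(product_inputs) <= 50:
        raise ValueError("productInputs must contain 1 to 50 entries")
    if not 10 <= max_reviews <= 5000:
        raise ValueError("maxReviewsPerProduct must be between 10 and 5000")
    if sort_by not in SORT_OPTIONS:
        raise ValueError(f"sortBy must be one of {sorted(SORT_OPTIONS)}")
    if not 1 <= max_concurrency <= 10:
        raise ValueError("maxConcurrency must be between 1 and 10")
    if not 500 <= request_delay_ms <= 10000:
        raise ValueError("requestDelayMs must be between 500 and 10000")
    return {
        "productInputs": list(product_inputs),
        "maxReviewsPerProduct": max_reviews,
        "sortBy": sort_by,
        "useProxy": True,
        # Same proxy default as the documented input schema.
        "proxyConfiguration": {
            "useApifyProxy": True,
            "apifyProxyGroups": ["RESIDENTIAL"],
        },
        "maxConcurrency": max_concurrency,
        "requestDelayMs": request_delay_ms,
    }


actor_input = build_input(["5032468035"], max_reviews=500, sort_by="rating-low")
```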

### Honest limitations

These are real, deliberate trade-offs — not bugs:

* **Walmart aggressively rate-limits direct datacenter IPs.** `useProxy=true` (RESIDENTIAL) is the default for a reason. Datacenter / no-proxy runs will see 403 / 429 within a few hundred requests.
* **Reviews-per-page is fixed at 10 by Walmart.** A 200-review product = 20 sequential page fetches = ~30 seconds at default 1500ms delay.
* **Cap behavior.** If a product has 47 reviews and `maxReviewsPerProduct=200`, you get 47 — no padding, no synthetic rows.
* **`verifiedPurchase` is best-effort.** Walmart sometimes omits the flag on older reviews. Missing → `null`.
* **No auth, no seller dashboard reviews.** Public review list only — what you'd see at `walmart.com/reviews/product/<id>` without logging in.
* **404 products are skipped.** Deactivated, regional, or ID-typoed entries log a warning and continue with the next product.
* **HTTP 403 / 429 aborts the current product** (not the whole run) and continues. Retry the failed IDs in a separate run — repeated 4xx in the same session usually means proxy IP burnout.
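
The pagination arithmetic in the list above can be sketched as follows. This is a rough planning aid, assuming 10 reviews per page and counting only the configured delay; network time and retries come on top, so treat the result as a lower bound. The function names are hypothetical.

```python
import math

REVIEWS_PER_PAGE = 10  # fixed by Walmart, per the limitations above


def estimate_pages(available_reviews: int, max_reviews_per_product: int = 200) -> int:
    """Pages needed: the cap or the product's real review count, whichever is lower."""
    wanted = min(available_reviews, max_reviews_per_product)
    return math.ceil(wanted / REVIEWS_PER_PAGE)


def estimate_delay_seconds(pages: int, request_delay_ms: int = 1500) -> float:
    """Seconds spent in the inter-request delay alone."""
    return pages * request_delay_ms / 1000


pages_full = estimate_pages(available_reviews=1000)   # capped at 200 reviews
pages_small = estimate_pages(available_reviews=47)    # product with only 47 reviews
seconds_full = estimate_delay_seconds(pages_full)
```

With the defaults, a product that has at least 200 reviews needs 20 pages and about 30 seconds of delay, matching the figure quoted above; the 47-review product needs 5 pages.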

### How it works

The scraper fetches `walmart.com/reviews/product/<id>?page=<n>&sort=<sort>` and parses the embedded `__NEXT_DATA__` JSON blob — not the rendered DOM. This is more stable across UI redesigns: as long as the Next.js page state is shipped with the HTML, the parser keeps working.

Canonical extraction path (verified 2026-04 against `walmart.com`): `props.pageProps.initialData.data.reviews.customerReviews`. If a future Walmart redesign moves the array, the parser falls back to a tree-walk that pattern-matches review records by their unique key set (`reviewId` + `reviewText` + `rating`) — so the scraper degrades gracefully rather than returning empty.
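
To make the two-stage extraction concrete, here is a minimal sketch of the idea, not the Actor's actual source. It assumes the page embeds its state in a `<script id="__NEXT_DATA__">` tag (the usual Next.js convention), tries the canonical path first, and falls back to a tree-walk that matches dicts carrying all three keys named above. All function names are hypothetical.

```python
import json
import re
from typing import Any, Iterator

NEXT_DATA_RE = re.compile(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)
CANONICAL_PATH = ("props", "pageProps", "initialData", "data", "reviews", "customerReviews")
REVIEW_KEYS = {"reviewId", "reviewText", "rating"}


def load_next_data(html: str) -> dict[str, Any]:
    """Extract and decode the embedded Next.js page state."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))


def walk_for_reviews(node: Any) -> Iterator[dict[str, Any]]:
    """Fallback: yield every dict in the tree that has all review keys."""
    if isinstance(node, dict):
        if REVIEW_KEYS.issubset(node):
            yield node
            return
        for value in node.values():
            yield from walk_for_reviews(value)
    elif isinstance(node, list):
        for value in node:
            yield from walk_for_reviews(value)


def extract_reviews(html: str) -> list[dict[str, Any]]:
    """Try the canonical path, then fall back to the key-set tree-walk."""
    data = load_next_data(html)
    node: Any = data
    for key in CANONICAL_PATH:
        if not isinstance(node, dict):
            node = None
            break
        node = node.get(key)
    if isinstance(node, list):
        return node
    return list(walk_for_reviews(data))
```

Matching on a key set rather than a fixed path is what lets the fallback survive a relocated array, at the cost of possibly picking up review-shaped objects from other parts of the page state.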

### Cost framing (Apify pricing)

* Per product, full pagination to ~100 reviews = ~10 page fetches × ~3 KB each = ~30 KB transfer.
* Walmart pages are HTML-heavy (~150–250 KB each rendered, but the embedded JSON is what matters for parsing).
* Default RESIDENTIAL proxy traffic is the dominant cost driver — budget accordingly.
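
Reading the bullets together, the ~3 KB figure appears to describe the review payload, while the 150–250 KB figure describes the page that actually travels through the proxy. Under that assumption, and assuming the full HTML is counted uncompressed, proxy transfer can be estimated roughly as below. The function name is hypothetical, and real billing depends on compression and on Apify's current rates, which are not stated here.

```python
import math


def estimate_proxy_transfer_mb(
    reviews: int, kb_per_page: tuple[float, float] = (150, 250)
) -> tuple[float, float]:
    """Return a (low, high) estimate in MB, using 10 reviews per page and 1 MB = 1000 KB."""
    pages = math.ceil(reviews / 10)
    low_kb, high_kb = kb_per_page
    return pages * low_kb / 1000, pages * high_kb / 1000


low_mb, high_mb = estimate_proxy_transfer_mb(100)
```

For ~100 reviews (10 pages) this gives roughly 1.5 to 2.5 MB per product, which is the number to multiply by product count when budgeting residential proxy traffic.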

### Use cases

* **Competitor monitoring**: track competitor product reviews daily, alert on rating drops or new negative themes.
* **Product-launch QA**: scrape your own product's reviews post-launch, build a sentiment dashboard.
* **Market research**: collect reviews across a category for thematic analysis (LDA, embeddings, GPT clustering).
* **BI integrations**: drop the dataset into BigQuery / Snowflake / DuckDB for ad-hoc analysis.
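
As an illustration of the competitor-monitoring case, the sketch below compares average `stars` per `productId` between two scraped snapshots and flags drops. It assumes rows shaped like the output schema; the function names and the 0.3-star threshold are hypothetical choices. Note that the average covers only the scraped sample (for example the 200 most recent reviews), not the product's overall Walmart rating.

```python
from collections import defaultdict
from statistics import mean


def average_stars(rows):
    """Average star rating per productId over the given rows."""
    by_product = defaultdict(list)
    for row in rows:
        by_product[row["productId"]].append(row["stars"])
    return {pid: mean(stars) for pid, stars in by_product.items()}


def rating_drops(previous, current, threshold=0.3):
    """Products whose average fell by at least `threshold` stars since the previous snapshot."""
    return {
        pid: (previous[pid], avg)
        for pid, avg in current.items()
        if pid in previous and previous[pid] - avg >= threshold
    }


# Hypothetical snapshots, for illustration only.
yesterday = average_stars([
    {"productId": "A", "stars": 5},
    {"productId": "A", "stars": 4},
])
today = average_stars([
    {"productId": "A", "stars": 5},
    {"productId": "A", "stars": 4},
    {"productId": "A", "stars": 1},
    {"productId": "A", "stars": 2},
])
alerts = rating_drops(yesterday, today)
```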

### Related Actors in this portfolio

| Tool | Adds |
|------|------|
| [Trustpilot Reviews](https://apify.com/knotless_cadence/trustpilot-review-scraper) | Cross-platform review parity (Trustpilot vs Walmart) |
| [Reddit Discussion](https://apify.com/knotless_cadence/reddit-discussion-scraper) | Off-platform sentiment (where buyers complain *about* your product) |
| [Google News](https://apify.com/knotless_cadence/google-news-scraper) | News coverage signals correlated with rating spikes |

---

### Need a custom data pipeline?

I build custom scrapers, ETL pipelines, and data-feed integrations. Pilot scope examples: a tailored parser for a specific Walmart category, a daily cron with Slack alerts, or a multi-marketplace aggregator.

📧 **Email**: spinov001@gmail.com
🌐 **Portfolio**: [blog.spinov.online](https://blog.spinov.online) · [apify.com/knotless_cadence](https://apify.com/knotless_cadence)
💬 **Tips & tutorials**: [t.me/scraping_ai](https://t.me/scraping_ai)

*Disclosure: I maintain Apify Actors related to this topic; links above point to my Apify Store profile (commercial). I am not affiliated with Walmart Inc.*

# Actor input Schema

## `productInputs` (type: `array`):

List of Walmart product URLs (e.g. `https://www.walmart.com/ip/Apple-AirPods/123456789`) OR bare numeric product IDs. Mixed input is OK; the scraper extracts the trailing `/<id>` from URLs automatically.
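
For callers who want to deduplicate or pre-validate inputs before a run, the "trailing ID" rule can be approximated client-side. This is a sketch under the assumption that the ID is the last all-digit path segment; the Actor's real extraction logic is not shown in this document, and `extract_product_id` is a hypothetical name.

```python
import re
from urllib.parse import urlparse


def extract_product_id(value: str) -> str:
    """Accept a bare numeric ID or a product URL and return the numeric ID."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        return value
    # Query strings and fragments are dropped by urlparse; only the path is inspected.
    path = urlparse(value).path.rstrip("/")
    match = re.search(r"/(\d+)$", path)
    if match is None:
        raise ValueError(f"no trailing numeric product ID in {value!r}")
    return match.group(1)


ids = [
    extract_product_id("https://www.walmart.com/ip/Apple-AirPods-Pro-2-USB-C/5032468035"),
    extract_product_id("123456789"),
]
```
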
## `maxReviewsPerProduct` (type: `integer`):

Hard cap on reviews returned per product. Walmart shows 10 reviews per page; the scraper paginates until cap or natural end. Default 200.
## `sortBy` (type: `string`):

Walmart's review-list sort. 'most-recent' is default; 'most-helpful' surfaces top-voted; 'rating-high' / 'rating-low' for polarity slices.
## `useProxy` (type: `boolean`):

Recommended ON. Walmart applies aggressive anti-bot for direct datacenter IPs — residential proxy reduces 403/429 rate.
## `proxyConfiguration` (type: `object`):

Apify proxy settings. Defaults to RESIDENTIAL group (best success rate on Walmart).
## `maxConcurrency` (type: `integer`):

How many products to scrape in parallel. Reviews within a product still paginate sequentially (rate-limit safety). Default 2.
## `requestDelayMs` (type: `integer`):

Delay between sequential page fetches within a product. Lower = faster but higher 429 risk. Default 1500ms.

## Actor input object example

```json
{
  "productInputs": [
    "https://www.walmart.com/ip/Apple-AirPods-Pro-2-USB-C/5032468035"
  ],
  "maxReviewsPerProduct": 200,
  "sortBy": "most-recent",
  "useProxy": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  },
  "maxConcurrency": 2,
  "requestDelayMs": 1500
}
````

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "productInputs": [
        "https://www.walmart.com/ip/Apple-AirPods-Pro-2-USB-C/5032468035"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("knotless_cadence/walmart-reviews-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "productInputs": ["https://www.walmart.com/ip/Apple-AirPods-Pro-2-USB-C/5032468035"] }

# Run the Actor and wait for it to finish
run = client.actor("knotless_cadence/walmart-reviews-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
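
The example above prints whatever the dataset contains, even if the run failed partway. A slightly more defensive variant is sketched below. It assumes, as in the example above, that `call()` returns a run dict (or `None`) with `status` and `defaultDatasetId` keys; the wrapper `fetch_reviews` is a hypothetical helper, and it takes the client as an argument so it can be exercised with a stub.

```python
from typing import Any

ACTOR_ID = "knotless_cadence/walmart-reviews-scraper"


def fetch_reviews(client: Any, product_inputs: list[str],
                  max_reviews: int = 200) -> list[dict[str, Any]]:
    """Run the Actor through an ApifyClient-like object and return its dataset rows."""
    run = client.actor(ACTOR_ID).call(run_input={
        "productInputs": product_inputs,
        "maxReviewsPerProduct": max_reviews,
    })
    # Fail loudly instead of silently returning a partial dataset.
    if run is None or run.get("status") != "SUCCEEDED":
        status = run.get("status") if run else "no run returned"
        raise RuntimeError(f"Actor run did not succeed: {status}")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

With the real client this would be called as `fetch_reviews(ApifyClient("<YOUR_API_TOKEN>"), ["5032468035"])`.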

## CLI example

```bash
echo '{
  "productInputs": [
    "https://www.walmart.com/ip/Apple-AirPods-Pro-2-USB-C/5032468035"
  ]
}' |
apify call knotless_cadence/walmart-reviews-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=knotless_cadence/walmart-reviews-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Walmart Reviews Scraper — Product Reviews to CSV/JSON in 2 min",
        "description": "7 runs. Backed by 951-run Trustpilot flagship + 31-actor portfolio. Walmart product reviews → CSV/JSON. Bypasses 100-review UI cap. 17 fields: stars, text, author, date, helpful, images, seller. For BI + competitor monitoring + sentiment. spinov001@gmail.com · blog.spinov.online · t.me/scraping_ai",
        "version": "0.0",
        "x-build-id": "Q1GyNzZwBFJjlz6AK"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/knotless_cadence~walmart-reviews-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-knotless_cadence-walmart-reviews-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/knotless_cadence~walmart-reviews-scraper/runs": {
            "post": {
                "operationId": "runs-sync-knotless_cadence-walmart-reviews-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/knotless_cadence~walmart-reviews-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-knotless_cadence-walmart-reviews-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "productInputs"
                ],
                "properties": {
                    "productInputs": {
                        "title": "Walmart product URLs or IDs",
                        "minItems": 1,
                        "maxItems": 50,
                        "type": "array",
                        "description": "List of Walmart product URLs (e.g. https://www.walmart.com/ip/Apple-AirPods/123456789) OR bare numeric product IDs. Mixed input is OK — the scraper extracts the trailing /<id> from URLs automatically.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxReviewsPerProduct": {
                        "title": "Max reviews per product",
                        "minimum": 10,
                        "maximum": 5000,
                        "type": "integer",
                        "description": "Hard cap on reviews returned per product. Walmart shows 10 reviews per page; the scraper paginates until cap or natural end. Default 200.",
                        "default": 200
                    },
                    "sortBy": {
                        "title": "Review sort order",
                        "enum": [
                            "most-recent",
                            "most-helpful",
                            "rating-high",
                            "rating-low"
                        ],
                        "type": "string",
                        "description": "Walmart's review-list sort. 'most-recent' is default; 'most-helpful' surfaces top-voted; 'rating-high' / 'rating-low' for polarity slices.",
                        "default": "most-recent"
                    },
                    "useProxy": {
                        "title": "Use Apify proxy",
                        "type": "boolean",
                        "description": "Recommended ON. Walmart applies aggressive anti-bot for direct datacenter IPs — residential proxy reduces 403/429 rate.",
                        "default": true
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Apify proxy settings. Defaults to RESIDENTIAL group (best success rate on Walmart).",
                        "default": {
                            "useApifyProxy": true,
                            "apifyProxyGroups": [
                                "RESIDENTIAL"
                            ]
                        }
                    },
                    "maxConcurrency": {
                        "title": "Max parallel requests",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "How many products to scrape in parallel. Reviews within a product still paginate sequentially (rate-limit safety). Default 2.",
                        "default": 2
                    },
                    "requestDelayMs": {
                        "title": "Inter-request delay (ms)",
                        "minimum": 500,
                        "maximum": 10000,
                        "type": "integer",
                        "description": "Delay between sequential page fetches within a product. Lower = faster but higher 429 risk. Default 1500ms.",
                        "default": 1500
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
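
For projects without an Apify client library, the `run-sync-get-dataset-items` path from the specification above can be called directly. The following is a minimal standard-library sketch, assuming the token is passed as the `token` query parameter as the specification describes; `build_sync_request` and `run_sync` are hypothetical helper names, and the timeout value is an arbitrary choice.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "knotless_cadence~walmart-reviews-scraper"


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request for the synchronous run endpoint without sending it."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300.0):
    """Send the request and decode the JSON response (performs a network call)."""
    request = build_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_sync_request("<YOUR_API_TOKEN>", {"productInputs": ["5032468035"]})
```

Large jobs may run longer than a synchronous HTTP call can comfortably wait; in that case the `/runs` path, which returns information about the initiated run without waiting for it, is likely the better fit.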
