Walmart Data Extractor

Pricing

Pay per event

Extract Walmart.com products from search, category, product URLs, or item IDs. Returns prices, ratings, stock, seller, variants, breadcrumbs, and specifications. Optional per-review extraction. PPE pricing — $0.003 per product, $0.002 per review.


Developer: Khadin Akbar (maintained by community)

Walmart Data Extractor — Products, Prices, Reviews & Sellers

Production-grade scraper that extracts structured product data from Walmart.com: titles, brands, current and list prices, savings, ratings, review counts, stock status, seller name and type (Walmart vs marketplace), pickup/shipping availability, images, variants, breadcrumbs, and specifications. Optionally pulls individual customer reviews. Built on Walmart's __NEXT_DATA__ JSON blob (richer and more reliable than DOM scraping) and runs through Apify Residential proxies to clear Walmart's Akamai + PerimeterX defense.

One actor, four modes — search, category, direct product URLs, or raw Walmart item IDs.

What you get per product

| Field | Description |
| --- | --- |
| itemId | Walmart numeric item ID (a.k.a. usItemId). |
| productUrl | Canonical Walmart product page URL. |
| title | Product title. |
| brand | Brand name. |
| model | Manufacturer model number, when available. |
| category | Leaf category (Laptops, Wireless Headphones, etc.). |
| breadcrumbs[] | Full category path from root → leaf. |
| currentPrice | Current selling price (USD). |
| listPrice | MSRP / strikethrough price, when shown. |
| savings | listPrice - currentPrice, when both present. |
| onSale | true if savings > 0. |
| currency | Currency code (USD). |
| rating | Average customer rating (0–5). |
| reviewCount | Total number of customer reviews. |
| inStock | Boolean parsed from availabilityStatus. |
| sellerName | Selling entity (Walmart.com or marketplace seller). |
| sellerType | walmart / marketplace / walmart_or_marketplace. |
| pickupAvailable | Local store pickup available. |
| shippingAvailable | Ships from any fulfillment channel. |
| freeShipping | Free shipping flagged. |
| imageUrl | Primary image URL. |
| images[] | Full image gallery. |
| variants[] | Color / size / configuration variants (when includeVariants). |
| specifications{} | Manufacturer spec sheet as a {key: value} map (when includeSpecifications). |
| description | Short product description. |
| badges[] | Promotional badges (Rollback, Reduced Price, Bestseller, etc.). |
| scrapedAt | ISO timestamp the record was captured. |
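
The relationship between listPrice, currentPrice, savings, and onSale can be re-derived on the consumer side, for example to sanity-check records. The following is a minimal sketch, not the actor's own code: the function name derive_price_fields is hypothetical, and treating onSale as false when either price is missing is an assumption.

```python
from typing import Optional, TypedDict


class PriceFields(TypedDict):
    savings: Optional[float]
    onSale: bool


def derive_price_fields(
    current_price: Optional[float], list_price: Optional[float]
) -> PriceFields:
    """Recompute savings and onSale from the two price fields."""
    if current_price is None or list_price is None:
        # savings is only defined when both prices are present;
        # reporting onSale as False here is an assumption of this sketch.
        return {"savings": None, "onSale": False}
    # Round to cents to avoid floating-point noise such as 19.999999999.
    savings = round(list_price - current_price, 2)
    return {"savings": savings, "onSale": savings > 0}
```

A record with currentPrice 79.99 and listPrice 99.99 yields savings of 20.0 and onSale true under this sketch.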

When includeReviews = true, the same dataset also receives review records tagged _type: "review":

| Field | Description |
| --- | --- |
| reviewId | Walmart review identifier. |
| itemId | Parent product item ID. |
| rating | Star rating (1–5). |
| title | Review title. |
| text | Review body. |
| author | Reviewer display name. |
| verifiedPurchaser | Verified buyer flag. |
| helpfulVotes / unhelpfulVotes | Community vote counts. |
| photos[] | Review-attached photos. |
| pros[] / cons[] | Bullet pros/cons (when present). |
| submittedAt | ISO date of the review. |
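
Because products and reviews share one dataset, downstream code usually needs to separate them. Here is a minimal sketch of one way to do that, assuming that any record whose _type is not "review" is a product record (the source only documents the review tag); split_records is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def split_records(items: List[Record]) -> Tuple[List[Record], Dict[str, List[Record]]]:
    """Separate product records from review records and group reviews by itemId."""
    products: List[Record] = []
    reviews_by_item: Dict[str, List[Record]] = defaultdict(list)
    for item in items:
        if item.get("_type") == "review":
            # Reviews carry the parent product's itemId.
            reviews_by_item[str(item.get("itemId"))].append(item)
        else:
            # Assumption: everything not tagged as a review is a product.
            products.append(item)
    return products, dict(reviews_by_item)
```

The items list returned by the dataset client can be passed straight in, and reviews can then be joined back to their product through itemId.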

When to use this actor

  • Walmart Marketplace sellers tracking competitor SKUs, Buy Box wins, and MAP violations.
  • Retail price-monitoring SaaS building cross-retailer feeds (pair with our Amazon, eBay, and Google Shopping actors).
  • Dropshipping / arbitrage agents comparing Walmart prices against Amazon and other retailers' pricing in real time.
  • Brand teams monitoring review sentiment and rating drift on their SKUs.
  • AI shopping agents that need current Walmart prices/stock as a tool call (apify--walmart-data-extractor via Apify MCP).

When NOT to use

  • Walmart Canada / Mexico — this actor targets walmart.com only. Walmart.ca and walmart.com.mx have different DOM/JSON shapes.
  • Bulk catalogs > 500 products per run — split into multiple runs. Walmart's anti-bot pool degrades with sustained heavy concurrency.
  • Walmart Grocery delivery — grocery uses a separate API surface; out of scope here.
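
The advice above to split catalogs larger than 500 products into multiple runs could be sketched as follows for the itemIds mode. This is an illustration only: batch_inputs is a hypothetical helper, and 500 is used as the batch size simply because it is the threshold named above.

```python
from typing import Dict, Iterator, List


def batch_inputs(item_ids: List[str], batch_size: int = 500) -> Iterator[Dict[str, object]]:
    """Yield one actor input per batch of at most batch_size item IDs."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(item_ids), batch_size):
        # Each yielded dict is a complete input for one run in itemIds mode.
        yield {"mode": "itemIds", "itemIds": item_ids[start:start + batch_size]}
```

Each yielded input can be passed to a separate run, ideally one after another rather than all at once, since sustained heavy concurrency is what the bullet warns about.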

Modes

mode: search

Run a keyword search.

{ "mode": "search", "searchQuery": "samsung 65 inch tv", "maxProducts": 50 }

mode: category

Crawl a Walmart category/listing URL.

{ "mode": "category", "categoryUrls": ["https://www.walmart.com/shop/electronics/laptops"] }

mode: productUrls

Direct product detail pages.

{ "mode": "productUrls", "productUrls": [
    "https://www.walmart.com/ip/Apple-AirPods-Pro/588402301",
    "https://www.walmart.com/ip/123456789"
] }

mode: itemIds

Bulk catalog enrichment from a list of Walmart item IDs.

{ "mode": "itemIds", "itemIds": ["588402301", "210384701"] }
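
Each of the four modes above pairs with exactly one input field. A small client-side helper can enforce that pairing before a run is started. This is a sketch under the assumption that the field names are exactly those shown in the examples; build_input is a hypothetical name and is not part of the actor.

```python
from typing import Any, Dict

# Mode name -> the input field that mode reads, as shown in the examples above.
REQUIRED_FIELD = {
    "search": "searchQuery",
    "category": "categoryUrls",
    "productUrls": "productUrls",
    "itemIds": "itemIds",
}


def build_input(mode: str, value: Any, **options: Any) -> Dict[str, Any]:
    """Build an actor input, placing value under the field that mode expects."""
    if mode not in REQUIRED_FIELD:
        raise ValueError(f"unknown mode: {mode!r}")
    if not value:
        raise ValueError(f"mode {mode!r} requires a non-empty {REQUIRED_FIELD[mode]}")
    run_input: Dict[str, Any] = {"mode": mode, REQUIRED_FIELD[mode]: value}
    # Extra options such as maxProducts or includeReviews pass through unchanged.
    run_input.update(options)
    return run_input
```

For instance, build_input("search", "samsung 65 inch tv", maxProducts=50) reproduces the search example above.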

Pricing — Pay Per Event

This actor uses Apify's pay-per-event billing — you only pay for results you actually receive, plus a tiny start fee.

| Event | Price |
| --- | --- |
| apify-actor-start | $0.00005 (one-time per run) |
| product-extracted | $0.003 per product |
| review-extracted | $0.002 per review (only when includeReviews = true) |

Typical costs:

  • 50 products: ~$0.15
  • 50 products + 10 reviews each: ~$1.15
  • 500 products: ~$1.50
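
The typical costs above follow directly from the event prices. A minimal estimator, assuming every product returns exactly the requested number of reviews (an upper bound in practice) and ignoring any pay-per-usage compute or proxy charges, might look like this; estimate_cost is a hypothetical helper.

```python
from decimal import Decimal

# Event prices copied from the pricing table above (USD).
START_FEE = Decimal("0.00005")
PRODUCT_PRICE = Decimal("0.003")
REVIEW_PRICE = Decimal("0.002")


def estimate_cost(products: int, reviews_per_product: int = 0) -> Decimal:
    """Estimate the pay-per-event cost of one run in USD."""
    if products < 0 or reviews_per_product < 0:
        raise ValueError("counts must be non-negative")
    reviews = products * reviews_per_product
    return START_FEE + products * PRODUCT_PRICE + reviews * REVIEW_PRICE
```

With these prices, estimate_cost(50, 10) comes to 1.15005, which matches the ~$1.15 figure once the start fee is rounded away. Decimal is used so the tiny start fee is not lost to floating-point rounding.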

Pay-per-usage (compute + proxy) is also enabled for heavy power-user jobs — buyers self-select at run time.

Usage examples

Via Apify console

  1. Pick a mode and supply the matching field (searchQuery, categoryUrls, productUrls, or itemIds).
  2. Optionally set includeReviews: true and maxReviewsPerProduct.
  3. Run.

Via API (Node.js)

import { ApifyClient } from 'apify-client';

// Authenticate with your Apify API token.
const client = new ApifyClient({ token: 'YOUR_TOKEN' });

// Start the actor and wait for the run to finish.
const run = await client.actor('khadinakbar/walmart-data-extractor').call({
    mode: 'search',
    searchQuery: 'wireless headphones',
    maxProducts: 50,
    includeReviews: false,
});

// Read the results from the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);

Via API (Python)

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient('YOUR_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('khadinakbar/walmart-data-extractor').call(run_input={
    'mode': 'productUrls',
    'productUrls': ['https://www.walmart.com/ip/588402301'],
    'includeReviews': True,
    'maxReviewsPerProduct': 20,
})

# Products and reviews land in the same default dataset.
items = list(client.dataset(run['defaultDatasetId']).iterate_items())
print(items)

Via Apify MCP (Claude / GPT / Gemini agents)

Tool name: apify--walmart-data-extractor. The actor's input schema is wired for AI-agent consumption — every property has a 4-sentence description with a literal example and disambiguator.

How it works

  • PlaywrightCrawler (Chromium) with realistic Chrome 120+ fingerprint pool (Windows + macOS desktop, en-US locale).
  • Apify Residential proxy pinned to US — non-US IPs see different (or missing) Walmart inventory due to geo-locking.
  • Session pool with cookie persistence — PerimeterX scores per-session, so we keep sessions warm. Sessions are retired on 403/429/503 or block-page detection.
  • Resource blocking — images, fonts, stylesheets, and media are aborted in preNavigationHooks for ~3x faster page loads (we don't need them, the data lives in __NEXT_DATA__).
  • Multi-path defensive extraction: parsers.js walks several known Next.js layouts so a single Walmart schema change rarely breaks the extractor.
  • Silent-failure guard — runs that complete with 0 products and >0 errors fail with an actionable status message rather than charging the start fee on an empty dataset.
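
To illustrate the __NEXT_DATA__ and multi-path ideas from the list above, here is a rough Python sketch. It is not the actor's parsers.js (which is JavaScript and not shown in this document), and the candidate paths are hypothetical placeholders rather than Walmart's documented structure. It pulls the JSON out of the script tag and tries several key paths until one resolves.

```python
import json
import re
from typing import Any, Optional, Sequence, Tuple

# Next.js embeds page state in a <script id="__NEXT_DATA__"> tag.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)

# Hypothetical layouts, for illustration only.
CANDIDATE_PATHS: Tuple[Tuple[str, ...], ...] = (
    ("props", "pageProps", "initialData", "data", "product"),
    ("props", "pageProps", "product"),
)


def load_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ payload, or None if absent or invalid."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


def dig(data: Any, path: Sequence[str]) -> Any:
    """Follow a key path through nested dicts, returning None on any miss."""
    node = data
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def find_product(next_data: Any, paths=CANDIDATE_PATHS) -> Any:
    """Try each known layout in order and return the first product node found."""
    for path in paths:
        node = dig(next_data, path)
        if node is not None:
            return node
    return None
```

Trying paths in order means a layout change only requires adding a new path to the tuple, which is the resilience property the bullet describes.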

Known limits

  • Walmart caps search at 25 pages (~1,000 results). Use category URLs or split keyword variants for larger catalogs.
  • The shape of review data may vary for some product types (auto, grocery, books). The extractor falls back gracefully and emits partial review records rather than none.
  • Akamai + PerimeterX rate-limit the residential pool under heavy concurrent load. If a run produces 0 items, retry in 10–15 minutes or lower maxProducts.
  • Geo-locking — Walmart shows different prices/availability per ZIP. The default US residential proxy yields the "default" US view; ZIP-aware pricing is on the roadmap for v0.2.
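
The retry advice above (wait 10 to 15 minutes when a run produces 0 items) can be wrapped in a small helper. This is a sketch only: run_with_retry and its parameters are hypothetical, and run_once stands in for whatever function starts the actor and returns its dataset items.

```python
import time
from typing import Callable, List


def run_with_retry(
    run_once: Callable[[], List[dict]],
    retries: int = 2,
    wait_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Call run_once, retrying after a pause while it returns no items."""
    items = run_once()
    for _ in range(retries):
        if items:
            break
        # Default pause is 10 minutes, the lower end of the suggested window.
        sleep(wait_seconds)
        items = run_once()
    return items
```

In practice run_once would wrap the client call from the Python usage example. The sleep parameter is injectable so the helper can be tested without actually waiting.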

This actor scrapes publicly available product information from walmart.com. By using it, you assume responsibility for your compliance with Walmart's Terms of Use and applicable scraping/data laws in your jurisdiction. The actor does not require any Walmart account login and does not extract account-only or personal data. Use at your own risk.