🛒 Walmart Data Extractor

Pricing

from $4.99 / 1,000 results

🛒 Walmart Data Extractor pulls product details, pricing, ratings & availability from Walmart for fast market research. 📊 Automate leads, monitor competitors & track trends with reliable data. 🚀 Great for B2B insights & analytics.

Developer

Scraper Engine

Maintained by Community

Extract rich, structured product data from Walmart.com at scale. Feed it category pages, search pages, or product (/ip/) URLs, or just a keyword, and get back prices, images, brand, full specifications, ratings, seller info, and more. Built for reliability with automatic proxy escalation, anti-bot browser impersonation, retries, and real-time dataset saving.

✨ Why Choose This Actor?

  • 🔗 Bulk URLs: mix category, search, and product URLs in a single run.
  • 🛡️ Smart proxy escalation: starts direct, falls back to datacenter, then residential automatically, and stays on residential once it has been needed.
  • 🧰 Anti-bot by design: uses impit browser impersonation (real TLS/HTTP fingerprints) instead of heavy headless browsers.
  • 💾 Live results: products stream into the output table as they're scraped, grouped by source section, so a mid-run stop never loses data.
  • ⭐ Reviews & specs: opt into reviews and get full idml specifications.
  • 🧩 Customizable output: reshape every record with your own Python hooks.
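
The escalation order in the list above (direct, then datacenter, then residential, with no step back down) can be pictured as a small state machine. This is a minimal sketch, not the Actor's actual code: the class name `ProxyEscalator`, the tier labels, and the `report_blocked` method are all hypothetical.

```python
from dataclasses import dataclass

# Tier order taken from the feature list; the labels themselves are illustrative.
TIERS = ("direct", "datacenter", "residential")


@dataclass
class ProxyEscalator:
    """Hypothetical helper that tracks which proxy tier to use next."""

    level: int = 0  # index into TIERS; 0 means no proxy

    @property
    def tier(self) -> str:
        return TIERS[self.level]

    def report_blocked(self) -> str:
        """Move one tier up after a block; never move back down (sticky)."""
        if self.level < len(TIERS) - 1:
            self.level += 1
        return self.tier


escalator = ProxyEscalator()
print(escalator.tier)              # direct
print(escalator.report_blocked())  # datacenter
print(escalator.report_blocked())  # residential
print(escalator.report_blocked())  # residential
```

Because the level only ever increases, every request made after the first residential fallback keeps using residential, which is one plausible reading of "sticky" here.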

🔑 Key Features

| Feature | Description |
| --- | --- |
| Category scraping | Auto-paginates browse/category pages |
| Search scraping | Search pages or a raw keyword |
| Product detail | Direct /ip/ URL extraction |
| Reviews | includeReviews / onlyReviews |
| Limits | maxItems (global) and endPage |
| Location | Best-effort zipCode targeting |
| Proxy | direct → datacenter → residential (sticky) |
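
To make the three supported URL kinds and the keyword shortcut concrete, here is a rough sketch of how a URL could be classified and how a keyword could become a search URL. The function names are hypothetical, and the path prefixes (`/ip/`, `/search`, `/browse/`) are inferred only from the example URLs in this README; the Actor's real routing may recognize more patterns.

```python
from urllib.parse import urlencode, urlparse


def keyword_to_search_url(keyword: str) -> str:
    """Build a Walmart search URL from a raw keyword (URL shape assumed from this README)."""
    return "https://www.walmart.com/search?" + urlencode({"q": keyword})


def classify_url(url: str) -> str:
    """Return 'product', 'search', 'category', or 'unknown' based on the URL path."""
    path = urlparse(url).path
    if path.startswith("/ip/"):
        return "product"
    if path.startswith("/search"):
        return "search"
    if path.startswith("/browse/"):
        return "category"
    return "unknown"


print(keyword_to_search_url("laptop"))
print(classify_url("https://www.walmart.com/browse/auto-tires/brake-pads/91083_1074765"))
```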

📥 Input

{
  "startUrls": [
    { "url": "https://www.walmart.com/browse/auto-tires/brake-pads/91083_1074765_9038935_4670095_4582920" }
  ],
  "search": "laptop",
  "maxItems": 10,
  "endPage": null,
  "zipCode": "10001",
  "includeReviews": false,
  "onlyReviews": false,
  "proxy": { "useApifyProxy": false }
}

| Field | Type | Description |
| --- | --- | --- |
| startUrls | array | Walmart category / search / product URLs (bulk). Required. |
| search | string | Keyword → converted to a search URL. |
| maxItems | integer | Cap on total products. Empty = no limit. |
| endPage | integer | Last category/search page to read. |
| zipCode | string | US ZIP for localized pricing/availability. |
| postalCode | integer | ⚠️ Deprecated: use zipCode. |
| includeReviews | boolean | Attach reviews to each product. |
| onlyReviews | boolean | Keep only reviews + identifiers. |
| extendOutputFunction | string | Python def extendOutputFunction(product) → dict merged in. |
| outputFilterFunction | string | Python def outputFilterFunction(product) → reshape/drop. |
| proxy | object | Proxy config. Default: no proxy (auto-escalates on block). |
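
The two hook fields take Python source as strings. The sketch below shows what such strings might look like, plus a small local harness for trying them before a run. Only the function names and the "dict merged in" / "reshape/drop" behavior come from the table above; the harness (`load_hook`, `apply_hooks`), the `priceCents` column, and the idea that returning `None` drops a record are assumptions made for illustration.

```python
# Hook source as it might be pasted into the input fields (illustrative only).
EXTEND_SRC = '''
def extendOutputFunction(product):
    # Add a derived column; the returned dict is merged into the record.
    return {"priceCents": round(product["price"] * 100)}
'''

FILTER_SRC = '''
def outputFilterFunction(product):
    # Assumption: returning None drops the record.
    if product.get("availabilityStatus") != "IN_STOCK":
        return None
    return {key: product[key] for key in ("name", "price", "priceCents")}
'''


def load_hook(source: str, name: str):
    """Compile hook source and return the named function (local test harness only)."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]


def apply_hooks(product: dict):
    """Merge the extend hook's dict into the record, then reshape or drop it."""
    extend = load_hook(EXTEND_SRC, "extendOutputFunction")
    reshape = load_hook(FILTER_SRC, "outputFilterFunction")
    merged = {**product, **extend(product)}
    return reshape(merged)


# The same strings would go into the run input alongside the other fields.
run_input = {
    "startUrls": [{"url": "https://www.walmart.com/search?q=laptop"}],
    "maxItems": 10,
    "extendOutputFunction": EXTEND_SRC,
    "outputFilterFunction": FILTER_SRC,
}

sample = {"name": "Brake Kit", "price": 194.99, "availabilityStatus": "IN_STOCK"}
print(apply_hooks(sample))
```

Testing hooks locally against one exported record is cheaper than discovering a typo after a paid run.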

📤 Output

Each product is pushed as one dataset row with the full Walmart product object plus convenience columns for the table view:

{
  "name": "MAX Advanced Brakes - Brake Kit ...",
  "brand": "Max Advanced Brakes",
  "priceString": "$194.99",
  "price": 194.99,
  "availabilityStatus": "IN_STOCK",
  "usItemId": "1902495893",
  "productUrl": "https://www.walmart.com/ip/.../1902495893",
  "imageUrl": "https://i5.walmartimages.com/seo/...jpeg",
  "sourceSection": "browse_auto_tires",
  "sourceUrl": "https://www.walmart.com/browse/...",
  "priceInfo": { "currentPrice": { "price": 194.99, "priceString": "$194.99" } },
  "idml": { "specifications": { }, "longDescription": "..." },
  "reviews": null
}

A structured, per-section summary (mirroring results_by_url) is also written to the key-value store as OUTPUT.
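
The exact shape of the OUTPUT record is not documented here, so the sketch below does not try to reproduce it. It only shows how exported dataset rows could be grouped by `sourceSection` on your side to get a comparable per-section view. The function name and the summary fields (`count`, `in_stock`, `min_price`) are hypothetical.

```python
from collections import defaultdict


def summarize_by_section(rows):
    """Group dataset rows by sourceSection and collect a few simple stats."""
    summary = defaultdict(lambda: {"count": 0, "in_stock": 0, "min_price": None})
    for row in rows:
        section = summary[row.get("sourceSection") or "unknown"]
        section["count"] += 1
        if row.get("availabilityStatus") == "IN_STOCK":
            section["in_stock"] += 1
        price = row.get("price")
        # Walmart may omit the price, so skip missing values.
        if price is not None and (section["min_price"] is None or price < section["min_price"]):
            section["min_price"] = price
    return dict(summary)


rows = [
    {"sourceSection": "browse_auto_tires", "price": 194.99, "availabilityStatus": "IN_STOCK"},
    {"sourceSection": "browse_auto_tires", "price": 89.5, "availabilityStatus": "OUT_OF_STOCK"},
    {"sourceSection": None, "price": None, "availabilityStatus": "IN_STOCK"},
]
print(summarize_by_section(rows))
```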

🚀 How to Use (Apify Console)

  1. Log in at https://console.apify.com → Actors.
  2. Open Walmart Data Extractor.
  3. Paste your Walmart URLs (or a keyword), set maxItems, and configure proxy.
  4. Click Start.
  5. Watch products stream into the run log and Output tab in real time.
  6. Export to JSON / CSV / XLSX when done.

🤖 Use via API

curl -X POST "https://api.apify.com/v2/acts/<ACTOR_ID>/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
-H "Content-Type: application/json" \
-d '{"startUrls":[{"url":"https://www.walmart.com/search?q=laptop"}],"maxItems":10}'
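
The same call can be made from Python with only the standard library. This is a sketch of the curl command above rather than an official client: `build_run_request` and `run_actor` are hypothetical helper names, and the 300-second timeout is an arbitrary assumption. The endpoint path and JSON body mirror the curl example.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2/acts"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for a synchronous run."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/{actor_id}/run-sync-get-dataset-items?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(actor_id: str, token: str, run_input: dict, timeout: float = 300.0) -> list:
    """Send the request and return the dataset items as a list of dicts."""
    request = build_run_request(actor_id, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `run_actor("<ACTOR_ID>", os.environ["APIFY_TOKEN"], {"startUrls": [{"url": "https://www.walmart.com/search?q=laptop"}], "maxItems": 10})`, with `<ACTOR_ID>` replaced by the real Actor ID, which this README leaves unspecified.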

🎯 Best Use Cases

  • 💰 Price monitoring & repricing
  • 📊 Catalog & assortment analysis
  • 🔎 Competitor & market research
  • 🏷️ Brand / seller tracking

💳 Pricing

This actor uses the pay-per-event model. The primary event is row_result, charged once per product saved to the dataset. Platform startup is covered by the synthetic apify-actor-start event. You only pay for the products you actually receive.
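
For budgeting, the per-result charge can be estimated with simple arithmetic. The sketch below assumes the listed "from $4.99 / 1,000 results" rate applies to every `row_result` event; your plan's rate may differ, and the `apify-actor-start` event and any platform usage are deliberately left out because their prices are not stated here. The function name is hypothetical.

```python
from decimal import Decimal

# Listed "from" price; treat it as an assumption, not a guaranteed rate.
PRICE_PER_1000_RESULTS = Decimal("4.99")


def estimate_result_cost(results: int, price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Estimate row_result charges only; start-event and other fees are excluded."""
    if results < 0:
        raise ValueError("results must be non-negative")
    cost = Decimal(results) * price_per_1000 / Decimal(1000)
    return cost.quantize(Decimal("0.0001"))


print(estimate_result_cost(10))    # 0.0499
print(estimate_result_cost(2500))  # 12.4750
```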

โ“ FAQ

Which URLs are supported? Category/browse pages, search pages, and product (/ip/) pages.

Do I need a proxy? No. The actor runs direct by default and only escalates to datacenter then residential proxies if Walmart blocks the request.

Can I limit the run? Yes. Use maxItems for a global cap and endPage to stop pagination early.
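
How the two limits could interact on a single listing is sketched below. This is an illustration of the idea, not the Actor's implementation: `paginate` and `fake_fetch` are hypothetical, and in the real Actor maxItems is described as a global cap across all start URLs, whereas this sketch covers one listing only.

```python
from typing import Callable, Iterator, List, Optional


def paginate(
    fetch_page: Callable[[int], List[dict]],
    max_items: Optional[int] = None,
    end_page: Optional[int] = None,
) -> Iterator[dict]:
    """Yield products page by page until a cap is hit or a page comes back empty."""
    emitted = 0
    page = 1
    while end_page is None or page <= end_page:
        products = fetch_page(page)
        if not products:
            return
        for product in products:
            if max_items is not None and emitted >= max_items:
                return
            yield product
            emitted += 1
        page += 1


def fake_fetch(page: int) -> List[dict]:
    """Stand-in for a real page fetch: four pages of three products each."""
    return [{"page": page, "slot": i} for i in range(3)] if page <= 4 else []


print(len(list(paginate(fake_fetch, max_items=5))))  # 5
print(len(list(paginate(fake_fetch, end_page=2))))   # 6
```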

Why are some fields null? Walmart omits fields per product; reviews are only attached when includeReviews/onlyReviews is enabled.

  • Data is collected only from publicly available Walmart pages.
  • You are responsible for compliance with Walmart's ToS and applicable laws (GDPR, CCPA, etc.). Use reasonable rate limits and scrape responsibly.

🛟 Support & Feedback

Open an issue on the Actor's Issues tab with your run ID and input, and we'll take a look.