# Instacart Scraper (`piotrv1001/instacart-scraper`) Actor

The Instacart Scraper extracts grocery product data across 70+ US retailers — Costco, Safeway, Walmart, Target and more — capturing names, brands, sizes, prices, availability, images and dietary tags from product pages and retailer storefronts. Ideal for price monitoring and competitive analysis.

- **URL**: https://apify.com/piotrv1001/instacart-scraper.md
- **Developed by:** [FalconScrape](https://apify.com/piotrv1001) (community)
- **Categories:** E-commerce, Automation
- **Stats:** 3 total users, 2 monthly users, 100.0% runs succeeded, 1 bookmark
- **User rating**: No ratings yet

## Pricing

from $5.00 / 1,000 products

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
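To make the arithmetic concrete, the sketch below estimates the charge for a run from the listed starting rate of $5.00 per 1,000 products. The function name `estimate_cost_usd` is hypothetical, and the real charge may differ if other events or pricing tiers apply, so treat the result as a rough lower bound.

```python
from decimal import Decimal

# Listed starting rate from the Pricing section: $5.00 per 1,000 products.
RATE_PER_1000_USD = Decimal("5.00")


def estimate_cost_usd(product_count: int) -> Decimal:
    """Estimate the per-product event charge for a run (hypothetical helper)."""
    if product_count < 0:
        raise ValueError("product_count must be non-negative")
    return RATE_PER_1000_USD * product_count / 1000


# 50 is the documented default for maxItems; 1,000 matches the listed pricing unit.
for count in (50, 1000):
    print(f"{count} products -> about ${estimate_cost_usd(count)}")
```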

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### 🛒 Instacart Scraper

Extract product data from [Instacart](https://www.instacart.com/) across 70+ US grocery retailers — Costco, Safeway, Walmart, Target, Sprouts, Kroger, Publix, Wegmans, Whole Foods and more. The **Instacart Scraper** pulls structured product details from individual product pages, retailer storefronts, and cross-retailer search results in one run.

### ✨ Features

-   🏷️ **Rich Product Data**: Names, brands, sizes, images, prices, sale prices, availability, dietary attributes, and category.
-   🏬 **Three Source Types**: Product detail pages (PDPs), retailer storefronts (~32 featured items per store), and cross-retailer search (~14 items per query, mixed retailers).
-   🇺🇸 **Multi-Retailer Coverage**: A single run can sweep Costco, Safeway, Walmart, Target, Sprouts and any other Instacart partner store.
-   ⚡ **Fast & Lightweight**: Plain HTTP via CheerioCrawler — no browser overhead. Each request returns a fully populated product feed item.
-   📍 **ZIP-Aware**: Records the postal code context against every item (Instacart prices vary by ZIP).

### 🛠️ How It Works

1.  **Configure the input** – Provide any mix of Instacart URLs, retailer slugs, search queries, and product IDs.
2.  **Run the Actor** – It fetches each page, parses the embedded Apollo state and JSON-LD, and emits one record per product.
3.  **Download the dataset** – Get clean JSON / CSV / Excel output ready for analysis, monitoring, or further enrichment.
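Step 2 mentions parsing JSON-LD embedded in each page. As a rough illustration of that idea only (not the Actor's own parser, which is built on CheerioCrawler), the sketch below pulls `application/ld+json` blocks out of an HTML string with the Python standard library. The class name `JsonLdExtractor` and the sample HTML are made up for this example; the shape of Instacart's real markup is an assumption.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


# Made-up page fragment; real Instacart pages may structure this differently.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@type": "Product", "name": "Great Value 2% Reduced Fat Milk",
 "offers": {"price": "3.48", "priceCurrency": "USD"}}
</script>
</head><body></body></html>
"""

parser = JsonLdExtractor()
parser.feed(SAMPLE_HTML)
products = [
    doc for doc in parser.documents
    if isinstance(doc, dict) and doc.get("@type") == "Product"
]
print(products[0]["name"], products[0]["offers"]["price"])
```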

### 📥 Input

All input fields are optional. Combine them freely — for example, scrape three retailer storefronts plus a few direct PDPs in one run.

| Field | Type | Description | Example |
|---|---|---|---|
| `startUrls` | array | Any Instacart URLs: PDPs (`/products/...`), retailer storefronts (`/store/{retailer}`), or cross-retailer search (`/store/s?k=...`). Unsupported URL families are skipped. | `[{ "url": "https://www.instacart.com/store/costco" }]` |
| `retailerSlugs` | string[] | Retailer slugs to scrape as storefronts. Equivalent to passing `/store/{slug}` start URLs. | `["costco", "safeway", "target"]` |
| `searchQueries` | string[] | Free-text queries run against cross-retailer search. Only the first SSR page is captured (~14 items per query). | `["milk", "bread"]` |
| `productIds` | string[] | Instacart product IDs fetched directly as PDPs. | `["7079", "20630136"]` |
| `defaultRetailerSlug` | string | Retailer to attach as `?retailerSlug=` when fetching `productIds`. Leave empty for Instacart's default market. | `"safeway"` |
| `seedFromHome` | boolean | If `true`, fetch `https://www.instacart.com/` and enqueue every retailer storefront found on it. | `false` |
| `postalCode` | string | US ZIP code recorded against each item. The Actor does not currently inject it into requests, so Instacart uses its default of `94105` (San Francisco). | `"10001"` |
| `maxItems` | integer | Stop pushing to the dataset after this many products. `0` = unlimited. | `50` |
| `proxyConfiguration` | object | Apify proxy settings. Instacart works from any IP for these endpoints; a US proxy gives more accurate location defaults. | `{ "useApifyProxy": false }` |
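The table notes that `retailerSlugs` is equivalent to passing `/store/{slug}` start URLs and that search uses `/store/s?k=...`. A minimal sketch of that mapping follows, in case you prefer to build `startUrls` yourself. The helper `build_start_urls` is hypothetical, and the exact product URL the Actor requests for `productIds` (here `/products/{id}` with an optional `?retailerSlug=`) is an assumption based on the field descriptions.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.instacart.com"


def build_start_urls(retailer_slugs=(), search_queries=(), product_ids=(),
                     default_retailer_slug=""):
    """Translate slugs, queries, and product IDs into startUrls entries (hypothetical helper)."""
    # Storefronts: /store/{slug}
    urls = [f"{BASE_URL}/store/{slug}" for slug in retailer_slugs]
    # Cross-retailer search: /store/s?k=... with the query URL-encoded
    urls += [f"{BASE_URL}/store/s?{urlencode({'k': query})}" for query in search_queries]
    # Product pages: assumed to work without the decorative name slug
    for product_id in product_ids:
        url = f"{BASE_URL}/products/{product_id}"
        if default_retailer_slug:
            url += "?" + urlencode({"retailerSlug": default_retailer_slug})
        urls.append(url)
    return [{"url": url} for url in urls]


start_urls = build_start_urls(
    retailer_slugs=["costco"],
    search_queries=["whole milk"],
    product_ids=["7079"],
    default_retailer_slug="safeway",
)
for entry in start_urls:
    print(entry["url"])
```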

### 📊 Sample Output Data

Every record follows the same shape regardless of source (PDP, storefront, or search). The `source` field tells you where each item came from.

```json
[
    {
        "id": "items_88059-20630136",
        "productId": "20630136",
        "retailerSlug": "walmart",
        "retailerId": null,
        "shopId": null,
        "url": "https://www.instacart.com/products/20630136-great-value-2-reduced-fat-milk-1-gal?retailerSlug=walmart",
        "name": "Great Value 2% Reduced Fat Milk",
        "brand": "Great Value",
        "size": "128 oz",
        "image": "https://d2lnr5mha7bycj.cloudfront.net/product-image/file/large_5c855497.jpeg",
        "category": "Plain Milk",
        "description": null,
        "price": 3.48,
        "priceString": "$3.48",
        "fullPriceString": null,
        "priceCurrency": "USD",
        "availability": "InStock",
        "dietaryAttributes": [],
        "source": "pdp",
        "postalCode": "94105",
        "scrapedAt": "2026-05-23T06:26:03.293Z"
    },
    {
        "id": "items_74-20110703",
        "productId": "20110703",
        "retailerSlug": "costco",
        "url": "https://www.instacart.com/products/20110703-kirkland-signature-mini-chocolate-chip-cookies-60-ct?retailerSlug=costco",
        "name": "Kirkland Signature Mini Chocolate Chip Cookies, 60-count",
        "brand": "kirkland signature",
        "size": "each",
        "image": "https://d2lnr5mha7bycj.cloudfront.net/product-image/file/large_63fb7a0f.jpeg",
        "price": 9.94,
        "priceString": "$9.94",
        "fullPriceString": "$11.94",
        "priceCurrency": "USD",
        "availability": "inStock",
        "dietaryAttributes": [],
        "source": "storefront",
        "postalCode": "94105",
        "scrapedAt": "2026-05-23T06:26:04.891Z"
    }
]
````
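For downstream analysis, a record like the ones above can be post-processed with a few small helpers. The sketch below assumes that `fullPriceString` holds the pre-sale price and `price` the current price, which matches the sample but is not stated explicitly. It also compares `availability` case-insensitively, because the sample shows both `InStock` and `inStock`. All function names are hypothetical.

```python
from decimal import Decimal


def parse_price_string(value):
    """Convert a string such as "$11.94" to Decimal; return None for missing values."""
    if not value:
        return None
    return Decimal(value.replace("$", "").replace(",", ""))


def discount_percent(record):
    """Return the percentage saved relative to fullPriceString, or None if unavailable."""
    full = parse_price_string(record.get("fullPriceString"))
    price = record.get("price")
    if full is None or price is None or full <= 0:
        return None
    current = Decimal(str(price))
    return ((full - current) / full * 100).quantize(Decimal("0.01"))


def is_in_stock(record):
    """Case-insensitive check, since the sample mixes "InStock" and "inStock"."""
    return str(record.get("availability", "")).lower() == "instock"


# Field values copied from the storefront record in the sample output.
cookies = {"price": 9.94, "fullPriceString": "$11.94", "availability": "inStock"}
print(discount_percent(cookies), is_in_stock(cookies))
```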

Build price feeds, run availability checks, and compare grocery prices across US retailers with the **Instacart Scraper** today! 🚀

# Actor input Schema

## `startUrls` (type: `array`):

Any mix of Instacart URLs: PDPs (/products/...), retailer storefronts (/store/{retailer}), or cross-retailer search (/store/s?k=...). Other Instacart URL families will be skipped.

## `retailerSlugs` (type: `array`):

List of retailer slugs (e.g. costco, safeway, target). For each one the Actor scrapes the storefront and enqueues featured products as PDPs.

## `searchQueries` (type: `array`):

List of free-text queries to run against the cross-retailer search (/store/s?k=...). Only the first SSR page is captured.

## `productIds` (type: `array`):

List of Instacart product IDs to fetch directly. The slug part of the PDP URL is decorative and is omitted.

## `defaultRetailerSlug` (type: `string`):

Retailer slug to attach as ?retailerSlug= when fetching PDPs from productIds. If empty, Instacart returns its default (SF) market.

## `seedFromHome` (type: `boolean`):

If true, fetch https://www.instacart.com/ and enqueue all retailer storefronts found on it. Useful when you want a broad sweep.

## `postalCode` (type: `string`):

US ZIP code used as location context. Instacart prices and availability vary by ZIP. The Actor records this on every item but does not currently inject it into requests (Instacart defaults to 94105 / San Francisco).

## `maxItems` (type: `integer`):

Stop pushing to the dataset after this many products. 0 = unlimited.

## `proxyConfiguration` (type: `object`):

Proxy settings. PDP, storefront and cross-retailer search work from any IP, but a US IP yields more accurate location defaults.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.instacart.com/products/7079-lucerne-dairy-farms-milk-128-oz?retailerSlug=safeway"
    },
    {
      "url": "https://www.instacart.com/store/costco"
    },
    {
      "url": "https://www.instacart.com/store/s?k=milk"
    }
  ],
  "retailerSlugs": [],
  "searchQueries": [],
  "productIds": [],
  "defaultRetailerSlug": "",
  "seedFromHome": false,
  "postalCode": "94105",
  "maxItems": 50,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```
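If you assemble the input object in code, a small client-side check can catch mistakes before a run starts. The sketch below fills in the defaults listed in the OpenAPI specification on this page and enforces three of its constraints: `maxItems` is a non-negative integer, every `startUrls` entry has a `url`, and the string arrays hold unique items. The helper `prepare_input` is hypothetical; the platform validates input on its own, so this is only a convenience.

```python
import copy

# Defaults as listed in the input schema on this page.
DEFAULTS = {
    "retailerSlugs": [],
    "searchQueries": [],
    "productIds": [],
    "defaultRetailerSlug": "",
    "seedFromHome": False,
    "postalCode": "94105",
    "maxItems": 50,
    "proxyConfiguration": {"useApifyProxy": False},
}


def prepare_input(user_input):
    """Merge user input over the defaults and run basic checks (hypothetical helper)."""
    merged = {**copy.deepcopy(DEFAULTS), **user_input}

    max_items = merged["maxItems"]
    if isinstance(max_items, bool) or not isinstance(max_items, int) or max_items < 0:
        raise ValueError("maxItems must be a non-negative integer (0 = unlimited)")

    for entry in merged.get("startUrls", []):
        if "url" not in entry:
            raise ValueError("each startUrls entry needs a 'url' key")

    # The schema marks these arrays as uniqueItems; drop duplicates, keep order.
    for key in ("retailerSlugs", "searchQueries", "productIds"):
        merged[key] = list(dict.fromkeys(merged[key]))
    return merged


run_input = prepare_input({"retailerSlugs": ["costco", "safeway", "costco"], "maxItems": 0})
print(run_input["retailerSlugs"], run_input["postalCode"])
```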

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.instacart.com/products/7079-lucerne-dairy-farms-milk-128-oz?retailerSlug=safeway"
        },
        {
            "url": "https://www.instacart.com/store/costco"
        },
        {
            "url": "https://www.instacart.com/store/s?k=milk"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("piotrv1001/instacart-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "startUrls": [
        { "url": "https://www.instacart.com/products/7079-lucerne-dairy-farms-milk-128-oz?retailerSlug=safeway" },
        { "url": "https://www.instacart.com/store/costco" },
        { "url": "https://www.instacart.com/store/s?k=milk" },
    ] }

# Run the Actor and wait for it to finish
run = client.actor("piotrv1001/instacart-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.instacart.com/products/7079-lucerne-dairy-farms-milk-128-oz?retailerSlug=safeway"
    },
    {
      "url": "https://www.instacart.com/store/costco"
    },
    {
      "url": "https://www.instacart.com/store/s?k=milk"
    }
  ]
}' |
apify call piotrv1001/instacart-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=piotrv1001/instacart-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Instacart Scraper",
        "description": "The Instacart Scraper extracts grocery product data across 70+ US retailers — Costco, Safeway, Walmart, Target and more — capturing names, brands, sizes, prices, availability, images and dietary tags from product pages and retailer storefronts. Ideal for price monitoring and competitive analysis.",
        "version": "0.0",
        "x-build-id": "dhrg5FbD43aY1MIGm"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/piotrv1001~instacart-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-piotrv1001-instacart-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/piotrv1001~instacart-scraper/runs": {
            "post": {
                "operationId": "runs-sync-piotrv1001-instacart-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/piotrv1001~instacart-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-piotrv1001-instacart-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "Any mix of Instacart URLs: PDPs (/products/...), retailer storefronts (/store/{retailer}), or cross-retailer search (/store/s?k=...). Other Instacart URL families will be skipped.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "retailerSlugs": {
                        "title": "Retailer slugs",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "List of retailer slugs (e.g. costco, safeway, target). For each one the actor scrapes the storefront and enqueues featured products as PDPs.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "searchQueries": {
                        "title": "Search queries",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "List of free-text queries to run against the cross-retailer search (/store/s?k=...). Only the first SSR page is captured.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "productIds": {
                        "title": "Product IDs",
                        "uniqueItems": true,
                        "type": "array",
                        "description": "List of Instacart product IDs to fetch directly. The slug part of the PDP URL is decorative and is omitted.",
                        "default": [],
                        "items": {
                            "type": "string"
                        }
                    },
                    "defaultRetailerSlug": {
                        "title": "Default retailer slug for product IDs",
                        "type": "string",
                        "description": "Retailer slug to attach as ?retailerSlug= when fetching PDPs from productIds. If empty, Instacart returns its default (SF) market.",
                        "default": ""
                    },
                    "seedFromHome": {
                        "title": "Seed retailers from Instacart home",
                        "type": "boolean",
                        "description": "If true, fetch https://www.instacart.com/ and enqueue all retailer storefronts found on it. Useful when you want a broad sweep.",
                        "default": false
                    },
                    "postalCode": {
                        "title": "Postal code (ZIP)",
                        "type": "string",
                        "description": "US ZIP code used as location context. Instacart prices and availability vary by ZIP. The actor records this on every item but does not currently inject it into requests (Instacart defaults to 94105 / San Francisco).",
                        "default": "94105"
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 0,
                        "type": "integer",
                        "description": "Stop pushing to the dataset after this many products. 0 = unlimited.",
                        "default": 50
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Proxy settings. PDP, storefront and cross-retailer search work from any IP, but a US IP yields more accurate location defaults.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
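For projects that call the REST API directly, the specification above can be exercised without a client library. The sketch below builds a `POST` request for the `run-sync-get-dataset-items` path using only the Python standard library. The helper `build_run_request` is hypothetical, and the commented-out call assumes the response body is JSON, which the specification's summary suggests but its `200` response does not spell out.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/piotrv1001~instacart-scraper/run-sync-get-dataset-items"


def build_run_request(token, actor_input):
    """Build (but do not send) a POST request for the synchronous run endpoint."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request("<YOUR_API_TOKEN>", {"searchQueries": ["milk"], "maxItems": 10})
print(request.get_method(), request.full_url)

# Sending the request needs a valid token and network access:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)  # assumes a JSON response body
```

Building the request separately from sending it keeps the URL and body easy to inspect or log before any charge is incurred.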
