# Tesco Product Extractor (`kawsar/tesco-product-extractor`) Actor

Extract product pricing, specifications, ratings, reviews, and active Clubcard promotions from Tesco.com using a built-in residential bypass network.

- **URL**: https://apify.com/kawsar/tesco-product-extractor.md
- **Developed by:** [Kawsar](https://apify.com/kawsar) (community)
- **Categories:** E-commerce, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded, 0 bookmarks
- **User rating**: No ratings yet

## Pricing

from $3.99 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases as your subscription plan tier increases.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
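
As a rough worked example of the per-result pricing, the sketch below estimates an upper bound on the cost of a run. It assumes that each extracted result is the only charged event and that the listed rate of $3.99 per 1,000 results applies to your plan; both are assumptions, and the function name `estimate_max_cost_usd` is purely illustrative.

```python
def estimate_max_cost_usd(num_inputs: int, max_items: int,
                          price_per_1000: float = 3.99) -> float:
    """Upper-bound cost in USD, assuming every input returns max_items results.

    price_per_1000 is the listed rate; the rate on your plan may differ.
    """
    max_results = num_inputs * max_items
    return round(max_results * price_per_1000 / 1000, 4)


# Two search queries plus one start URL, each capped at 20 items
print(estimate_max_cost_usd(num_inputs=3, max_items=20))
```

Three inputs at 20 items each give at most 60 results, which works out to roughly $0.24 at the listed rate.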

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Tesco Product Extractor: Advanced Grocery and Pricing Intelligence Solution

Tesco Product Extractor is an enterprise-grade web scraping solution designed to extract comprehensive e-commerce product data from Tesco.com. It allows you to gather real-time product catalogs, pricing structure, average user ratings, review volume, and special promotional offers directly from Tesco listings with zero complex configuration. 

Equipped with a built-in residential bypass network, the extractor handles IP rotation, browser fingerprinting, and session replication automatically. This ensures high-speed, reliable data extraction at scale without encountering blockages or CAPTCHAs.

---

### Key Features

- **Search Keyword Scraping**: Input any number of search keywords (e.g., "milk", "organic cheddar"), and the extractor automatically retrieves relevant results across all matched pages.
- **Direct Category Pagination**: Supply direct category and search URLs to extract comprehensive item lists page-by-page.
- **Single Product Detail Page Fallback**: Supply direct product detail page URLs (e.g. `https://www.tesco.com/shop/en-GB/products/...`), and the Actor falls back to parsing that product detail page directly.
- **Automated Price Parsing**: Extracts both absolute prices (in GBP) and unit pricing metrics (e.g., price per kg or price per litre) automatically.
- **Promotions & Clubcard Parsing**: Automatically detects active retail promotions, multisaver deals, and specialized Clubcard discount rates.
- **Built-in Residential Bypass Network**: Integrates a seamless, maintenance-free bypass layer to handle Tesco's anti-scraping measures automatically without requiring any external proxy configuration or custom browser profiles.
- **Resource-Optimized Limits**: Limits extraction on a per-query or per-URL basis (defaulting to 20 items per input) to keep consumption and speed optimal while allowing thorough scraping when needed.

---

### Common Use Cases

- **Competitor Price Monitoring**: Track retail prices and product unit costs on a schedule to maintain market-competitive pricing.
- **Promotional Trend Tracking**: Collect active Clubcard discounts, multisaver promotions, and category-wide pricing deals automatically.
- **Digital Shelf & Search Placement Visibility**: Identify sponsored product placements and organic search ranking across specific search terms.
- **Customer Sentiment Analysis**: Track average star ratings and review counts across targeted food and beverage categories.

---

### Configuring the Extraction

To run the Tesco Product Extractor, configure the following input parameters:

| Input Field | Type | Default | Description |
|-------------|------|---------|-------------|
| `queries` | array of strings | `["milk"]` | List of search keywords or phrases to run on Tesco. |
| `startUrls` | array of strings | `[]` | Optional direct category, search, or individual product detail page URLs to extract. |
| `maxItems` | integer | `20` | Maximum number of product items to extract **per search query or start URL**. Capped at 1000. |
| `requestTimeoutSecs` | integer | `30` | Timeout in seconds for connecting to the web server (minimum 5s, maximum 120s). |

---

### Output Dataset Schema

Every scraped item is returned in the dataset with the following fields:

| Field Name | Type | Description |
|------------|------|-------------|
| `productId` | string | The unique catalog or product identifier from Tesco. |
| `productName` | string | Full name of the product. |
| `productUrl` | string | Direct web link to the product details page. |
| `imageUrl` | string | Main image web link for the product. |
| `price` | number | Current price of the product in British Pounds (GBP). |
| `pricePerUnit` | string | Calculated unit price (e.g. price per kg, per unit, or per litre). |
| `promotionText` | string | Details of active Clubcard or multisaver promotions (e.g. "Clubcard Price"). |
| `rating` | number | Average customer rating (scale of 1.0 to 5.0). |
| `reviewsCount` | integer | Total reviews received from customers. |
| `isSponsored` | boolean | Indicates whether the product is a sponsored listing. |
| `scrapedAt` | string | ISO 8601 UTC timestamp of when the extraction was performed. |

#### Sample Output Record (JSON)

```json
{
  "productId": "262586694",
  "productName": "Yeo Valley Organic Fresh Whole Milk 2L",
  "productUrl": "https://www.tesco.com/shop/en-GB/products/262586694",
  "imageUrl": "https://digitalcontent.api.tesco.com/v2/media/ghs/9fe281fc-c5c0-41ef-8d14-e5d4b0f72f7c/ff990aaa-bd88-438a-8169-36d10d436d1e_1491621793.jpeg?h=225&w=225",
  "price": 3.15,
  "pricePerUnit": "£1.58/litre",
  "promotionText": "Any 2 for £4 Clubcard Price - Selected Dairy Products 2 Litre",
  "rating": 4.7,
  "reviewsCount": 131,
  "isSponsored": true,
  "scrapedAt": "2026-06-09T08:30:23Z"
}
````
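
Because `pricePerUnit` is returned as a string, numeric comparisons need a parsing step. The sketch below splits a value such as `"£1.58/litre"` into an amount and a unit. It assumes the `£<amount>/<unit>` shape seen in the sample record; other shapes may occur in real data, in which case the illustrative helper `parse_unit_price` returns `None` rather than guessing.

```python
import re
from typing import Optional, Tuple

# Matches strings shaped like "£1.58/litre" (shape assumed from the sample record)
_UNIT_PRICE = re.compile(r"^£(\d+(?:\.\d+)?)/(.+)$")


def parse_unit_price(text: str) -> Optional[Tuple[float, str]]:
    """Return (amount_in_gbp, unit), or None if the text has another shape."""
    match = _UNIT_PRICE.match(text.strip())
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


print(parse_unit_price("£1.58/litre"))  # (1.58, 'litre')
print(parse_unit_price("n/a"))          # None
```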

---

### Frequently Asked Questions

##### Do I need to purchase external proxies?

No. The extractor operates with an automatic built-in residential bypass network that routes and rotates requests through high-quality residential IP addresses. This means you do not need to purchase or configure external proxies.

##### How does pagination work?

The extractor walks listing pages one at a time. For each search term or category URL, it keeps requesting the next page until the per-input `maxItems` limit is reached or no results remain.
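
The stopping rule can be sketched as a simple loop. This is not the Actor's actual code: `collect_items` and the `fetch_page` callback are hypothetical names, and the fake pages stand in for real Tesco listing pages.

```python
from typing import Callable, Dict, List


def collect_items(fetch_page: Callable[[int], List[Dict]],
                  max_items: int) -> List[Dict]:
    """Request pages 1, 2, 3, ... until max_items is reached or a page is empty."""
    items: List[Dict] = []
    page = 1
    while len(items) < max_items:
        batch = fetch_page(page)
        if not batch:
            break  # no more results for this input
        # Take only as many items as still fit under the limit
        items.extend(batch[: max_items - len(items)])
        page += 1
    return items


# Stand-in fetcher: three pages with two fake products each
fake_pages = {
    1: [{"productId": "a"}, {"productId": "b"}],
    2: [{"productId": "c"}, {"productId": "d"}],
    3: [{"productId": "e"}, {"productId": "f"}],
}


def fake_fetch(page: int) -> List[Dict]:
    return fake_pages.get(page, [])


print(len(collect_items(fake_fetch, max_items=5)))   # 5 (limit reached)
print(len(collect_items(fake_fetch, max_items=20)))  # 6 (results exhausted)
```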

##### What download formats are supported?

Through the Apify platform, you can seamlessly download the scraped dataset in multiple formats, including JSON, CSV, Excel (XLSX), XML, and HTML table views.
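
Outside the Console, the same formats can be requested from the dataset items endpoint of the Apify API through its `format` query parameter. The sketch below only builds the export URL; the helper name `dataset_export_url` and the dataset ID are illustrative. Non-public datasets also need your API token, for example in an `Authorization: Bearer` header.

```python
from urllib.parse import urlencode

# Formats named in the answer above, as values of the `format` query parameter
SUPPORTED_FORMATS = {"json", "csv", "xlsx", "xml", "html"}


def dataset_export_url(dataset_id: str, fmt: str = "json") -> str:
    """Build the export URL for a dataset in the requested format."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {fmt!r}")
    query = urlencode({"format": fmt})
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{query}"


# "abc123" is a placeholder; use the run's defaultDatasetId instead
print(dataset_export_url("abc123", "csv"))
```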

##### Can I run this extractor on a schedule?

Yes. You can schedule the Actor to run automatically at specific intervals (hourly, daily, weekly, or custom cron schedules) using the Apify scheduler.
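
Scheduled runs lend themselves to price monitoring: compare the dataset from one run with the dataset from the previous run. The sketch below does this using the `productId` and `price` output fields. The helper name `price_changes` and the second price are made up for illustration.

```python
from typing import Dict, List, Tuple


def price_changes(previous: List[Dict], current: List[Dict]) -> Dict[str, Tuple[float, float]]:
    """Return {productId: (old_price, new_price)} for products whose price changed."""
    old_prices = {item["productId"]: item["price"] for item in previous}
    changes = {}
    for item in current:
        old = old_prices.get(item["productId"])
        if old is not None and old != item["price"]:
            changes[item["productId"]] = (old, item["price"])
    return changes


# Illustrative snapshots from two runs (the 2.95 price is invented)
yesterday = [{"productId": "262586694", "price": 3.15}]
today = [{"productId": "262586694", "price": 2.95}]
print(price_changes(yesterday, today))  # {'262586694': (3.15, 2.95)}
```

Products that appear in only one of the two snapshots are ignored here; a real monitor would probably report them separately.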

##### How are credentials and tokens secured?

The internal extraction engine is designed with rigorous exception boundaries. All runtime exceptions and network tracebacks are automatically scrubbed and sanitized of sensitive credentials, preventing any diagnostic log leakage.

---

### Integrations & Webhooks

Integrate Tesco Product Extractor into your daily workflows and data warehouses using Apify integrations. You can automatically sync your extracted data with Google Sheets, Slack, Zapier, Make, Airbyte, GitHub, or trigger real-time actions through custom webhooks whenever your crawl completes.
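
A webhook receiver usually needs the dataset ID of the finished run. The sketch below extracts it from a webhook payload, assuming Apify's default payload template with an `eventType` field and a `resource` object describing the run. Payload templates can be customized, so treat this shape as an assumption. `dataset_id_from_webhook` is an illustrative name.

```python
from typing import Optional


def dataset_id_from_webhook(payload: dict) -> Optional[str]:
    """Return the run's default dataset ID for successful runs, else None."""
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")


# Minimal example payload (shape assumed from the default template)
example = {
    "eventType": "ACTOR.RUN.SUCCEEDED",
    "resource": {"defaultDatasetId": "abc123"},
}
print(dataset_id_from_webhook(example))  # abc123
```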

# Actor input Schema

## `queries` (type: `array`):

List of keywords or search queries to run on Tesco (e.g., 'milk', 'organic cheddar').

## `startUrls` (type: `array`):

Optional direct Tesco search or category URLs (e.g. 'https://www.tesco.com/shop/en-GB/search?query=organic+milk').

## `maxItems` (type: `integer`):

Maximum number of product items to extract per search query or start URL.

## `requestTimeoutSecs` (type: `integer`):

Per-request timeout in seconds for reaching Tesco's servers.

## Actor input object example

```json
{
  "queries": [
    "milk",
    "cheddar cheese"
  ],
  "startUrls": [
    "https://www.tesco.com/shop/en-GB/search?query=milk"
  ],
  "maxItems": 20,
  "requestTimeoutSecs": 30
}
```
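
Checking limits before starting a run avoids a rejected input. The sketch below builds an input object and enforces the documented ranges: `maxItems` from 1 to 1000 and `requestTimeoutSecs` from 5 to 120. The helper name `build_input` is illustrative and not part of any Apify library.

```python
from typing import Dict, List, Optional


def build_input(queries: Optional[List[str]] = None,
                start_urls: Optional[List[str]] = None,
                max_items: int = 20,
                request_timeout_secs: int = 30) -> Dict:
    """Return an Actor input dict, raising ValueError on out-of-range values."""
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    if not 5 <= request_timeout_secs <= 120:
        raise ValueError("requestTimeoutSecs must be between 5 and 120")
    return {
        "queries": list(queries or []),
        "startUrls": list(start_urls or []),
        "maxItems": max_items,
        "requestTimeoutSecs": request_timeout_secs,
    }


print(build_input(queries=["milk", "cheddar cheese"], max_items=50))
```

The returned dict can be passed as the run input in the client examples in the API section.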

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "queries": [
        "milk"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("kawsar/tesco-product-extractor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "queries": ["milk"] }

# Run the Actor and wait for it to finish
run = client.actor("kawsar/tesco-product-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "queries": [
    "milk"
  ]
}' |
apify call kawsar/tesco-product-extractor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=kawsar/tesco-product-extractor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Tesco Product Extractor",
        "description": "Extract product pricing, specifications, ratings, reviews, and active Clubcard promotions from Tesco.com using a built-in residential bypass network.",
        "version": "0.0",
        "x-build-id": "zavjLzYnDnAAXXbPm"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/kawsar~tesco-product-extractor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-kawsar-tesco-product-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/kawsar~tesco-product-extractor/runs": {
            "post": {
                "operationId": "runs-sync-kawsar-tesco-product-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/kawsar~tesco-product-extractor/run-sync": {
            "post": {
                "operationId": "run-sync-kawsar-tesco-product-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "queries": {
                        "title": "Search queries",
                        "type": "array",
                        "description": "List of keywords or search queries to run on Tesco (e.g., 'milk', 'organic cheddar').",
                        "items": {
                            "type": "string"
                        }
                    },
                    "startUrls": {
                        "title": "Direct category or search URLs",
                        "type": "array",
                        "description": "Optional direct Tesco search or category URLs (e.g. 'https://www.tesco.com/shop/en-GB/search?query=organic+milk').",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of product items to extract per search query or start URL.",
                        "default": 20
                    },
                    "requestTimeoutSecs": {
                        "title": "Request timeout (seconds)",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "Per-request timeout in seconds for reaching Tesco's servers.",
                        "default": 30
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
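
For projects without a client library, the `run-sync-get-dataset-items` path from the specification can be called directly. The sketch below uses only the Python standard library and stops at building the request, so nothing is sent until you call `urlopen`. The helper name `build_sync_request` is illustrative.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/kawsar~tesco-product-extractor/run-sync-get-dataset-items"


def build_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request with the token as a query parameter and a JSON body."""
    url = f"{API_BASE}{ACTOR_PATH}?" + urlencode({"token": token})
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sync_request("<YOUR_API_TOKEN>", {"queries": ["milk"], "maxItems": 20})
print(request.get_method(), request.full_url)

# To actually run the Actor, send the request and decode the JSON response:
#   with urllib.request.urlopen(request, timeout=300) as response:
#       items = json.load(response)
```

The 300-second timeout in the comment is an arbitrary choice. Synchronous runs hold the connection open until the Actor finishes, so long runs are better started through the `/runs` path and polled.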
