# Any Website Listing Page Scraper (`mrscrapercom/mrscraper-listing`) Actor

Unblock pages and scrape listing pages from any website. It is stealthy, reliable, and scalable. Every action uses the mrscraper.com engine to extract structured listing data.

- **URL**: https://apify.com/mrscrapercom/mrscraper-listing.md
- **Developed by:** [MrScraper](https://apify.com/mrscrapercom) (community)
- **Categories:** Automation, AI, E-commerce
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $100.00 / 1,000 standard requests

This Actor is paid per event and usage. You are charged a fixed price for specific events as well as for Apify platform usage.
Because this Actor supports Apify Store discounts, the price drops as your subscription plan tier rises.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## MrScraper CBC - Scrape Any Listing or Category Page!

Unblock pages and scrape listing or category pages from any website. It is stealthy, reliable, and scalable. Every action uses the [MrScraper](https://mrscraper.com) engine to extract structured data from various types of listing content using AI-powered prompts.

### Features

- **Universal Listing Page Scraping** - Extract structured data from category pages, search results, product listings, job boards, real estate listings, and more
- **AI-Powered Extraction** - Use custom prompts to specify exactly what data you want to extract from each page
- **Multi-Page Support** - Scrape across multiple pages of results automatically
- **Pay-per-event Pricing** - Only pay for what you use with transparent event-based pricing
- **Proxy Support** - Residential & mobile proxy support with geo-targeting for accessing geo-restricted content
- **Multiple Output Formats** - Get data as JSON, plus optional HTML and Markdown output

### How It Works

1. The Actor receives input with a URL to scrape, a custom prompt defining what data to extract, and an optional page limit
2. It validates the input and charges based on the selected options (pay-per-event model)
3. The MrScraper engine scrapes the provided URL, using AI to understand and extract the requested data
4. It returns structured data in JSON format (with optional HTML and Markdown)

### Input Parameters

#### Required Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `url` | string | Yes | URL to scrape (listing pages, category pages, search results, etc.) |

#### Data Extraction Settings

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `prompt` | string | `"Extract all available data as much as possible."` | Custom prompt to guide the AI on what data to extract from the page |
| `max_pages` | integer | `1` | Maximum number of pages to scrape (useful for paginated listings) |
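
As a rough illustration of the settings above, the following sketch assembles an input object with a custom prompt and a page limit. The helper name `build_listing_input` and the example URL are hypothetical and not part of the Actor or any SDK. The `max_pages >= 1` check mirrors the minimum declared in the Actor's input schema, while the http(s) URL check is an extra assumption made only for this sketch.

```python
from urllib.parse import urlparse

# Default prompt as documented in the table above.
DEFAULT_PROMPT = "Extract all available data as much as possible."


def build_listing_input(url, prompt=DEFAULT_PROMPT, max_pages=1):
    """Build an input dict for the Actor (hypothetical helper, not part of any SDK)."""
    parsed = urlparse(url)
    # Assumption for this sketch: only absolute http(s) URLs are useful here.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be an absolute http(s) URL, got {url!r}")
    # The input schema declares max_pages as an integer with a minimum of 1.
    if isinstance(max_pages, bool) or not isinstance(max_pages, int) or max_pages < 1:
        raise ValueError("max_pages must be an integer >= 1")
    return {"url": url, "prompt": prompt, "max_pages": max_pages}


listing_input = build_listing_input(
    "https://example.com/category/laptops",  # placeholder URL
    prompt="For each product, extract the name, price, and product URL.",
    max_pages=3,
)
print(listing_input)
```

A narrower prompt such as the one shown here will probably yield more predictable fields than the catch-all default, though the exact output shape depends on the MrScraper engine.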

#### Proxy Configuration

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `proxy_use_proxy` | boolean | `true` | Routes requests through residential & mobile IPs to bypass anti-bot protection. Recommended for heavily protected sites |
| `proxy_proxy_country` | string | `""` | Route the request through a proxy in a specific country to access geo-restricted content. If not set, the request will be routed through a random country. Only applicable if residential & mobile proxy is enabled |
| `proxy_bypass_proxy` | boolean | `true` | Block images, fonts, and stylesheets from loading. Speeds up scraping and reduces bandwidth usage. Only applicable if residential & mobile proxy is enabled |
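
The three proxy fields interact: the country and resource-blocking settings only apply when the proxy is enabled. A minimal sketch of how a caller might assemble them is shown below. The helper `proxy_options` is hypothetical, and rejecting a country when the proxy is disabled is a design choice of this sketch, not documented Actor behavior. Note that `proxy_bypass_proxy` controls resource blocking, despite its name.

```python
import re


def proxy_options(use_proxy=True, country="", block_resources=True):
    """Return the three proxy_* input fields (hypothetical helper)."""
    country = country.strip().lower()
    # Shape check only: the Actor's schema accepts a fixed list of lowercase
    # two-letter codes, which this sketch does not reproduce.
    if country and not re.fullmatch(r"[a-z]{2}", country):
        raise ValueError(f"country must be a two-letter code, got {country!r}")
    # Sketch-level choice: flag a setting that would have no effect.
    if country and not use_proxy:
        raise ValueError("proxy_proxy_country has no effect when proxy_use_proxy is false")
    return {
        "proxy_use_proxy": use_proxy,
        "proxy_proxy_country": country,  # "" means a random country
        "proxy_bypass_proxy": block_resources,  # blocks images, fonts, stylesheets
    }


print(proxy_options(country="US"))
```

The returned dict can be merged into the rest of the Actor input, for example with `{**base_input, **proxy_options(country="de")}`.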

#### Output Options

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `output_html` | boolean | `false` | Include raw HTML in the output |
| `output_markdown` | boolean | `false` | Include markdown version in the output |

### Output

The Actor stores extracted data in the dataset. Each result is saved as a single row in the dataset.

### Pricing Events

This Actor uses pay-per-event pricing. Events are charged as follows:

| Event | Description |
|-------|-------------|
| `standard-request` | Charged per URL processed |
| `time-based-cost` | Charged based on time taken to scrape (per 30-second interval) |
| `bandwidth-cost` | Charged based on bandwidth used (per MB) |
| `ai-token-cost` | Charged based on AI token usage during extraction |
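
To make the event model concrete, here is a rough cost estimator. Only the `standard-request` price is taken from this page ("from $100.00 / 1,000 standard requests", so it may vary by plan). The other unit prices are not stated here, so the sketch requires the caller to supply them, and the values in the usage example are placeholders, not real prices. Rounding partial 30-second intervals up is also an assumption. Apify platform usage is billed separately and is not included.

```python
import math

# From the Pricing section: "from $100.00 / 1,000 standard requests".
STANDARD_REQUEST_USD = 100.00 / 1000


def estimate_event_cost(urls, seconds, megabytes, ai_tokens, *,
                        usd_per_interval, usd_per_mb, usd_per_token,
                        usd_per_request=STANDARD_REQUEST_USD):
    """Estimate event charges in USD for one run (hypothetical helper).

    All unit prices except usd_per_request must be supplied by the caller.
    """
    # Assumption: a partial 30-second interval is charged as a full one.
    intervals = math.ceil(seconds / 30)
    return (urls * usd_per_request
            + intervals * usd_per_interval
            + megabytes * usd_per_mb
            + ai_tokens * usd_per_token)


# Placeholder unit prices, for illustration only.
cost = estimate_event_cost(
    urls=10, seconds=45, megabytes=20, ai_tokens=1000,
    usd_per_interval=0.01, usd_per_mb=0.001, usd_per_token=0.00001,
)
print(f"Estimated event cost: ${cost:.2f}")
```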

### Use Cases

- **E-commerce** - Extract product listings, prices, and availability from category pages
- **Job Boards** - Scrape job postings from search results or category listings
- **Real Estate** - Extract property listings from real estate websites
- **News & Articles** - Collect article summaries from news category pages
- **Directories** - Scrape business listings from directory websites
- **Research** - Gather data from any listing-style page with custom extraction requirements

### Contact Us

For enterprise solutions, custom integrations, or high-volume scraping needs, please contact MrScraper directly at [https://www.mrscraper.com](https://www.mrscraper.com) or [book a call here](https://cal.com/cahyo-mrscraper/30min?user=cahyo-mrscraper). We offer tailored solutions to meet your specific business requirements.

# Actor input Schema

## `url` (type: `string`):

URL to scrape (listing pages, category pages, etc.)

## `prompt` (type: `string`):

Prompt to use for the scraper

## `max_pages` (type: `integer`):

Maximum number of pages to scrape

## `proxy_use_proxy` (type: `boolean`):

Routes requests through residential & mobile IPs to bypass anti-bot protection. Recommended for heavily protected sites.

## `proxy_proxy_country` (type: `string`):

Route the request through a proxy in a specific country to access geo-restricted content. If not set, the request will be routed through a random country. Only applicable if residential & mobile proxy is enabled.

## `proxy_bypass_proxy` (type: `boolean`):

Block images, fonts, and stylesheets from loading. Speeds up scraping and reduces bandwidth usage. Only applicable if residential & mobile proxy is enabled.

## `output_html` (type: `boolean`):

Include raw HTML in the output

## `output_markdown` (type: `boolean`):

Include markdown version in the output

## Actor input object example

```json
{
  "url": "https://www.walmart.com/shop/tech/tvs-and-home-theater-new-arrivals?povid=XCAT_NewArrivals_MerchModule_Tech_tvsandhometheatre",
  "prompt": "Extract all available data as much as possible.",
  "max_pages": 1,
  "proxy_use_proxy": true,
  "proxy_proxy_country": "",
  "proxy_bypass_proxy": true,
  "output_html": false,
  "output_markdown": false
}
````

# Actor output Schema

## `result` (type: `string`):

Response from MrScraper API
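
Because `result` is declared as a string, callers will likely need to decode it before use. The sketch below assumes the string usually holds JSON text and falls back to the raw value when it does not. That assumption is not confirmed by the schema, and both the helper `parse_result` and the sample item are hypothetical.

```python
import json


def parse_result(item):
    """Return the decoded `result` field of one dataset item.

    Assumes `result` holds JSON text; falls back to the raw value otherwise.
    """
    raw = item.get("result")
    if not isinstance(raw, str):
        return raw
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


# Hypothetical dataset item, for illustration only.
sample_item = {"result": '{"products": [{"name": "Example TV", "price": "199.00"}]}'}
print(parse_result(sample_item))
```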

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "url": "https://www.walmart.com/shop/tech/tvs-and-home-theater-new-arrivals?povid=XCAT_NewArrivals_MerchModule_Tech_tvsandhometheatre",
    "prompt": "Extract all available data as much as possible."
};

// Run the Actor and wait for it to finish
const run = await client.actor("mrscrapercom/mrscraper-listing").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "url": "https://www.walmart.com/shop/tech/tvs-and-home-theater-new-arrivals?povid=XCAT_NewArrivals_MerchModule_Tech_tvsandhometheatre",
    "prompt": "Extract all available data as much as possible.",
}

# Run the Actor and wait for it to finish
run = client.actor("mrscrapercom/mrscraper-listing").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```
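
If installing the client library is not an option, the `run-sync-get-dataset-items` endpoint listed in the OpenAPI specification can be called directly. The following is a minimal standard-library sketch: the helper names, the example URL, and the 300-second timeout are assumptions, and decoding the response body as JSON is assumed rather than stated in the specification. The request is built but not sent when the snippet runs; call `run_and_get_items` to actually start a (billable) run.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/mrscrapercom~mrscraper-listing/run-sync-get-dataset-items"


def build_run_request(token, actor_input):
    """Build (but do not send) the POST request described in the OpenAPI spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_get_items(token, actor_input, timeout=300):
    """Send the request and decode the response body as JSON (assumed format)."""
    request = build_run_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Build the request only; no network call is made here.
request = build_run_request(
    "<YOUR_API_TOKEN>",
    {"url": "https://example.com/category/laptops"},  # placeholder URL
)
print(request.get_method(), request.full_url.split("?")[0])
```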

## CLI example

```bash
echo '{
  "url": "https://www.walmart.com/shop/tech/tvs-and-home-theater-new-arrivals?povid=XCAT_NewArrivals_MerchModule_Tech_tvsandhometheatre",
  "prompt": "Extract all available data as much as possible."
}' |
apify call mrscrapercom/mrscraper-listing --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=mrscrapercom/mrscraper-listing",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Any Website Listing Page Scraper",
        "description": "Unblock pages and scrape listing page from any websites. It is stealth, reliable, and scalable. Every action uses the mrscraper.com engine to extract structured listing data.",
        "version": "0.1",
        "x-build-id": "5CKHhQD9guO9SJcXb"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/mrscrapercom~mrscraper-listing/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-mrscrapercom-mrscraper-listing",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/mrscrapercom~mrscraper-listing/runs": {
            "post": {
                "operationId": "runs-sync-mrscrapercom-mrscraper-listing",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/mrscrapercom~mrscraper-listing/run-sync": {
            "post": {
                "operationId": "run-sync-mrscrapercom-mrscraper-listing",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "url"
                ],
                "properties": {
                    "url": {
                        "title": "URL",
                        "type": "string",
                        "description": "URL to scrape (listing pages, category pages, etc.)"
                    },
                    "prompt": {
                        "title": "Prompt",
                        "type": "string",
                        "description": "Prompt to use for the scraper"
                    },
                    "max_pages": {
                        "title": "Max Pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of pages to scrape",
                        "default": 1
                    },
                    "proxy_use_proxy": {
                        "title": "Residential & Mobile Proxy",
                        "type": "boolean",
                        "description": "Routes requests through residential & mobile IPs to bypass anti-bot protection. Recommended for heavily protected sites.",
                        "default": true
                    },
                    "proxy_proxy_country": {
                        "title": "Geo Targeting",
                        "enum": [
                            "",
                            "af",
                            "al",
                            "dz",
                            "as",
                            "ad",
                            "ao",
                            "ai",
                            "ag",
                            "ar",
                            "am",
                            "aw",
                            "au",
                            "at",
                            "az",
                            "bs",
                            "bh",
                            "bd",
                            "bb",
                            "by",
                            "be",
                            "bz",
                            "bj",
                            "bm",
                            "bt",
                            "bo",
                            "bq",
                            "ba",
                            "bw",
                            "br",
                            "io",
                            "bn",
                            "bg",
                            "bf",
                            "bi",
                            "cv",
                            "kh",
                            "cm",
                            "ca",
                            "ky",
                            "td",
                            "cl",
                            "cn",
                            "co",
                            "km",
                            "cd",
                            "cg",
                            "ck",
                            "cr",
                            "hr",
                            "cu",
                            "cw",
                            "cy",
                            "cz",
                            "ci",
                            "dk",
                            "dj",
                            "dm",
                            "do",
                            "ec",
                            "eg",
                            "sv",
                            "ee",
                            "sz",
                            "et",
                            "fo",
                            "fj",
                            "fi",
                            "fr",
                            "pf",
                            "ga",
                            "gm",
                            "ge",
                            "de",
                            "gh",
                            "gi",
                            "gr",
                            "gl",
                            "gd",
                            "gp",
                            "gu",
                            "gt",
                            "gg",
                            "gn",
                            "gy",
                            "ht",
                            "va",
                            "hn",
                            "hk",
                            "hu",
                            "is",
                            "in",
                            "id",
                            "ir",
                            "iq",
                            "ie",
                            "im",
                            "il",
                            "it",
                            "jm",
                            "jp",
                            "je",
                            "jo",
                            "kz",
                            "ke",
                            "ki",
                            "kr",
                            "kw",
                            "kg",
                            "la",
                            "lv",
                            "lb",
                            "lr",
                            "ly",
                            "lt",
                            "lu",
                            "mo",
                            "mg",
                            "mw",
                            "my",
                            "mv",
                            "ml",
                            "mt",
                            "mr",
                            "mu",
                            "yt",
                            "mx",
                            "fm",
                            "md",
                            "mc",
                            "mn",
                            "me",
                            "ma",
                            "mz",
                            "mm",
                            "na",
                            "np",
                            "nl",
                            "nc",
                            "nz",
                            "ni",
                            "ne",
                            "ng",
                            "mp",
                            "no",
                            "om",
                            "pk",
                            "pw",
                            "ps",
                            "pa",
                            "pg",
                            "py",
                            "pe",
                            "ph",
                            "pl",
                            "pt",
                            "pr",
                            "qa",
                            "mk",
                            "ro",
                            "ru",
                            "rw",
                            "re",
                            "kn",
                            "lc",
                            "mf",
                            "pm",
                            "vc",
                            "ws",
                            "sm",
                            "st",
                            "sa",
                            "sn",
                            "rs",
                            "sc",
                            "sl",
                            "sg",
                            "sx",
                            "sk",
                            "si",
                            "so",
                            "za",
                            "ss",
                            "es",
                            "lk",
                            "sd",
                            "sr",
                            "se",
                            "ch",
                            "sy",
                            "tw",
                            "tj",
                            "tz",
                            "th",
                            "tl",
                            "tg",
                            "tk",
                            "to",
                            "tt",
                            "tn",
                            "tr",
                            "tm",
                            "tc",
                            "tv",
                            "ug",
                            "ua",
                            "ae",
                            "gb",
                            "us",
                            "uy",
                            "uz",
                            "ve",
                            "vn",
                            "vg",
                            "vi",
                            "ye",
                            "zm",
                            "zw"
                        ],
                        "type": "string",
                        "description": "Route the request through a proxy in a specific country to access geo-restricted content. If not set, the request will be routed through a random country. Only applicable if residential & mobile proxy is enabled.",
                        "default": ""
                    },
                    "proxy_bypass_proxy": {
                        "title": "Block Resources",
                        "type": "boolean",
                        "description": "Block images, fonts, and stylesheets from loading. Speeds up scraping and reduces bandwidth usage. Only applicable if residential & mobile proxy is enabled.",
                        "default": true
                    },
                    "output_html": {
                        "title": "Also return HTML",
                        "type": "boolean",
                        "description": "Include raw HTML in the output",
                        "default": false
                    },
                    "output_markdown": {
                        "title": "Also return markdown",
                        "type": "boolean",
                        "description": "Include markdown version in the output",
                        "default": false
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "number",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "number",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
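
The run object described by the schema above carries its storage IDs and cost breakdown in `defaultDatasetId`, `usage`, `usageUsd`, and `usageTotalUsd`. As a minimal sketch of how a caller might read those fields, the Python snippet below summarizes a run object that has already been parsed into a dictionary. The helper name `summarize_run_usage` and the sample values (including the dataset ID) are illustrative assumptions, not part of the Actor or the Apify API; only the field names come from the schema.

```python
from typing import Any, Mapping


def summarize_run_usage(run: Mapping[str, Any]) -> dict[str, Any]:
    """Collect the dataset ID, total cost, and non-zero usage entries of a run.

    Hypothetical helper: it only reads keys named in the schema above and
    tolerates missing ones, since not every field is guaranteed to be present.
    """
    usage = run.get("usage") or {}
    usage_usd = run.get("usageUsd") or {}
    return {
        "datasetId": run.get("defaultDatasetId"),
        "totalUsd": run.get("usageTotalUsd", 0.0),
        # Keep only counters and costs that are actually non-zero.
        "nonZeroUsage": {key: value for key, value in usage.items() if value},
        "nonZeroUsd": {key: value for key, value in usage_usd.items() if value},
    }


# Illustrative run object; numbers mirror the schema's "example" entries,
# and the dataset ID is a made-up placeholder.
sample_run = {
    "defaultDatasetId": "hypothetical-dataset-id",
    "usageTotalUsd": 0.00005,
    "usage": {"ACTOR_COMPUTE_UNITS": 0, "KEY_VALUE_STORE_WRITES": 1},
    "usageUsd": {"ACTOR_COMPUTE_UNITS": 0, "KEY_VALUE_STORE_WRITES": 0.00005},
}

summary = summarize_run_usage(sample_run)
print(summary)
```

The cost values in `usageUsd` can be fractional, as the `0.00005` example shows, so they are best treated as floats rather than integers when aggregating costs across runs.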
