# Dm Scraper (`kawsar/dm-scraper`) Actor

DM Product Scraper calls dm.de's search API by keyword and returns product names, prices, stock status, and image URLs so price trackers, market analysts, and e-commerce teams have clean, structured data to work with without manual browsing.

- **URL**: https://apify.com/kawsar/dm-scraper.md
- **Developed by:** [Kawsar](https://apify.com/kawsar) (community)
- **Categories:** E-commerce, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $2.99 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
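
As a rough worked example of the per-result pricing above, the sketch below estimates what a run might cost. It assumes the listed starting rate of $2.99 per 1,000 results scales linearly and that each dataset item counts as one billed result. `estimate_cost` is a hypothetical helper, and the real rate depends on your subscription plan and on which events the Actor charges for.

```python
from decimal import Decimal, ROUND_HALF_UP

# Listed starting price; a discounted plan rate may be lower (assumption: linear pricing).
PRICE_PER_1000_RESULTS = Decimal("2.99")


def estimate_cost(result_count: int, price_per_1000: Decimal = PRICE_PER_1000_RESULTS) -> Decimal:
    """Return the estimated cost in USD for a number of results, rounded to cents."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = price_per_1000 * Decimal(result_count) / Decimal(1000)
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for count in (24, 200, 1000):
        print(f"{count} results: about ${estimate_cost(count)}")
```

`Decimal` is used instead of `float` so that the rounding to cents is predictable.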

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## DM Product Scraper: Extract Product Data from dm.de

DM Product Scraper calls the dm.de product search API directly and returns clean, structured data you can use right away. Give it a keyword and it pages through the entire catalog, collecting product names, brands, prices, stock states, ratings, and image URLs for every matching result.

No browser needed, no HTML parsing. The Actor calls the same JSON endpoint that dm.de's own website uses, so results are fast and consistent.

### Use cases

- **Price monitoring**: pull dm.de prices for a product category on a schedule and track changes over time
- **Competitor research**: collect product listings for brands you monitor in the German health and beauty market
- **Inventory checks**: see which products are in stock versus out of stock across a keyword search
- **Catalog building**: build or refresh a product database from dm.de search results
- **Market research**: survey what dm.de carries in a category before launching or expanding a product line

### Input

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `searchQueries` | array | | One or more keywords to search (e.g. "shampoo", "sunscreen"). Required. |
| `maxItems` | integer | 24 | Maximum products to collect per search query. Hard cap is 1000. |
| `pageSize` | integer | 24 | Products fetched per API request (12 to 48). |
| `enablePharmacy` | boolean | false | Include pharmacy and health products in results. |
| `userToken` | string | | Optional dm.de user token from browser network requests. |
| `searchToken` | string | | Optional dm.de search session token. |
| `requestTimeoutSecs` | integer | 30 | Per-request timeout in seconds. |
| `timeoutSecs` | integer | 300 | Overall Actor run timeout in seconds. |
| `proxyConfiguration` | object | Datacenter (Anywhere) | Proxy type and location for requests. Optional. |

#### Example input

```json
{
    "searchQueries": ["make up"],
    "maxItems": 200,
    "enablePharmacy": true,
    "proxyConfiguration": { "useApifyProxy": true }
}
````

#### Where to find the optional tokens

If you run searches without tokens and get empty results or errors, capture the tokens from your browser:

1. Open dm.de and search for a product
2. Open DevTools (F12) and go to the Network tab
3. Filter by `product-search.services.dmtech.com`
4. Click any request and look at the request headers for `user_token` and `x-dm-product-search-token`
5. Paste those values into the corresponding actor inputs

### What data does this Actor extract?

Each product record contains:

```json
{
    "productId": "4066447076370",
    "name": "Manhattan Lip Colour Cream",
    "brandName": "Manhattan",
    "price": 3.95,
    "currency": "EUR",
    "imageUrl": "https://media.dm.de/image/upload/t_prolist_xxl/d3/Products/...",
    "productUrl": "https://www.dm.de/manhattan-lip-colour-cream-...",
    "categoryPath": "beauty/make-up/lippenstift",
    "stockState": "available",
    "ratingAverage": 4.3,
    "reviewCount": 87,
    "searchQuery": "make up",
    "scrapedAt": "2025-05-06T10:22:13.451Z"
}
```

| Field | Type | Description |
|-------|------|-------------|
| `productId` | string | GTIN used as the unique product identifier on dm.de |
| `name` | string | Full product title |
| `brandName` | string | Brand name |
| `price` | number | Current price (null if unavailable) |
| `currency` | string | Currency code, typically EUR |
| `imageUrl` | string | Primary product image URL |
| `productUrl` | string | Direct link to the product page on dm.de |
| `categoryPath` | string | Category path returned by the dm.de API |
| `stockState` | string | Availability: `available`, `notAvailable`, `limitedAvailability` |
| `ratingAverage` | number | Average customer rating, 0-5 scale (null if none) |
| `reviewCount` | integer | Total number of customer reviews (null if none) |
| `searchQuery` | string | The keyword that produced this result |
| `scrapedAt` | string | ISO 8601 UTC timestamp |
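
To show how the fields above might be consumed downstream, here is a minimal sketch that maps dataset items to a typed record and keeps only products that are in stock and have a price. `Product` and `available_with_price` are hypothetical names, not part of the Actor, and only a subset of the documented fields is modelled. Nullable fields (`price`, `ratingAverage`, `reviewCount`) are typed as optional, following the table.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Mapping, Optional


@dataclass(frozen=True)
class Product:
    """Subset of one dataset item, using the field names documented above."""
    product_id: str
    name: str
    brand_name: str
    price: Optional[float]
    currency: str
    stock_state: str
    rating_average: Optional[float]
    review_count: Optional[int]

    @classmethod
    def from_item(cls, item: Mapping[str, Any]) -> "Product":
        # Missing optional keys fall back to None or an empty string.
        return cls(
            product_id=str(item["productId"]),
            name=item.get("name", ""),
            brand_name=item.get("brandName", ""),
            price=item.get("price"),
            currency=item.get("currency", "EUR"),
            stock_state=item.get("stockState", ""),
            rating_average=item.get("ratingAverage"),
            review_count=item.get("reviewCount"),
        )


def available_with_price(items: Iterable[Mapping[str, Any]]) -> List[Product]:
    """Keep products marked `available` that also carry a price."""
    products = [Product.from_item(item) for item in items]
    return [p for p in products if p.stock_state == "available" and p.price is not None]


if __name__ == "__main__":
    sample = {
        "productId": "4066447076370",
        "name": "Manhattan Lip Colour Cream",
        "brandName": "Manhattan",
        "price": 3.95,
        "currency": "EUR",
        "stockState": "available",
        "ratingAverage": 4.3,
        "reviewCount": 87,
    }
    for product in available_with_price([sample]):
        print(product.name, product.price, product.currency)
```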

### How it works

1. The Actor reads your search query and settings from the input
2. It calls `product-search.services.dmtech.com/de/search` with your keyword
3. Each page of results is parsed and pushed to the dataset immediately
4. Pagination continues automatically until your `maxItems` limit is reached or results run out
5. Optional proxy rotation is applied per page if you configure proxy settings
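
The paging behaviour in steps 3 and 4 can be sketched roughly as follows. This is not the Actor's source code: `fetch_page` is a hypothetical callable standing in for the dm.de request, whose real parameters are not documented here, and `make_fake_fetcher` is an in-memory stub so the loop can run without network access.

```python
from typing import Callable, Iterator, List

# Hypothetical signature: (query, page_index, page_size) -> list of product dicts.
FetchPage = Callable[[str, int, int], List[dict]]


def collect_products(query: str, fetch_page: FetchPage, max_items: int, page_size: int = 24) -> Iterator[dict]:
    """Yield products page by page until max_items is reached or results run out."""
    collected = 0
    page = 0
    while collected < max_items:
        items = fetch_page(query, page, page_size)
        if not items:  # results ran out
            return
        for item in items[: max_items - collected]:
            yield item  # each item is handed over as soon as its page is parsed
            collected += 1
        if len(items) < page_size:  # a short page means this was the last one
            return
        page += 1


def make_fake_fetcher(total: int) -> FetchPage:
    """Build an in-memory stand-in for the API that serves `total` fake products."""
    catalog = [{"productId": str(n)} for n in range(total)]

    def fetch(query: str, page: int, page_size: int) -> List[dict]:
        start = page * page_size
        return catalog[start : start + page_size]

    return fetch


if __name__ == "__main__":
    products = list(collect_products("shampoo", make_fake_fetcher(60), max_items=50))
    print(len(products))
```

Using a generator mirrors the "pushed to the dataset immediately" behaviour: the caller can store each product before the next page is requested.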

### FAQ

**Does this work without a user token or search token?**
For most keyword searches, yes. The tokens are optional and mainly matter if dm.de starts returning empty results or rate-limit errors. Try without them first.

**What is the maximum number of products I can collect?**
The Actor caps at 1000 items per run. The underlying dm.de API may return fewer results than that depending on the search term.

**Which proxy type should I use?**
Datacenter proxies (the default) work for most searches. Switch to Residential if you see repeated HTTP 403 or 429 errors.

**Can I search in a different dm.de country?**
The current version targets `dm.de` (Germany) only. The product search API path is fixed to `/de/`, so other country variants are not supported.

**How often does the data change?**
Prices and stock states on dm.de can change daily. For monitoring use cases, schedule the Actor to run on a regular interval using Apify's scheduler.
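
For the monitoring case, one possible way to compare two scheduled runs is to index each dataset by `productId` and report the prices that differ. The sketch below assumes both snapshots are lists of items in the output format described earlier. `diff_prices` is a hypothetical helper, and the IDs and prices in the demo are made up.

```python
from typing import Dict, Iterable, List, Mapping, Optional, Tuple

PriceChange = Tuple[str, Optional[float], Optional[float]]


def index_prices(items: Iterable[Mapping]) -> Dict[str, Optional[float]]:
    """Map productId -> price for one snapshot."""
    return {str(item["productId"]): item.get("price") for item in items}


def diff_prices(previous: Iterable[Mapping], current: Iterable[Mapping]) -> List[PriceChange]:
    """Return (productId, old_price, new_price) for products present in both snapshots whose price changed."""
    old = index_prices(previous)
    new = index_prices(current)
    changes: List[PriceChange] = []
    for product_id in sorted(old.keys() & new.keys()):
        if old[product_id] != new[product_id]:
            changes.append((product_id, old[product_id], new[product_id]))
    return changes


if __name__ == "__main__":
    # Made-up snapshots for illustration only.
    yesterday = [{"productId": "A", "price": 3.95}, {"productId": "B", "price": 1.45}]
    today = [{"productId": "A", "price": 3.45}, {"productId": "B", "price": 1.45}]
    for product_id, old_price, new_price in diff_prices(yesterday, today):
        print(f"{product_id}: {old_price} -> {new_price}")
```

Products that appear in only one snapshot are ignored here; a real monitor would probably report them as added or removed.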

### Integrations

Connect DM Product Scraper with other apps and services using [Apify integrations](https://apify.com/integrations). You can integrate with Make, Zapier, Slack, Airbyte, GitHub, Google Sheets, Google Drive, and many more. You can also use [webhooks](https://docs.apify.com/integrations/webhooks) to trigger actions whenever results are available.

# Actor input Schema

## `searchQueries` (type: `array`):

One or more keywords to search on dm.de. Enter each keyword on a separate line (e.g. 'make up', 'shampoo', 'sunscreen').

## `maxItems` (type: `integer`):

Maximum number of products to collect per search query. Hard cap is 1000. Total results = maxItems × number of queries.

## `pageSize` (type: `integer`):

Number of products fetched per API request (12 to 48).

## `enablePharmacy` (type: `boolean`):

When enabled, results include pharmacy and health products alongside regular dm catalog items.

## `requestTimeoutSecs` (type: `integer`):

How long to wait for each API response before timing out.

## `timeoutSecs` (type: `integer`):

Overall actor run timeout in seconds.

## `proxyConfiguration` (type: `object`):

Select proxies to use for requests. Helps avoid IP blocking and rate limits. Datacenter proxies are fastest; Residential proxies are harder to detect.

## Actor input object example

```json
{
  "searchQueries": [
    "make up",
    "shampoo",
    "sunscreen"
  ],
  "maxItems": 24,
  "pageSize": 24,
  "enablePharmacy": false,
  "requestTimeoutSecs": 30,
  "timeoutSecs": 300,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```
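
Before starting a run, it may help to check an input object against the limits stated in this schema section. The sketch below is one way to do that. `build_input` is a hypothetical helper, and it assumes the documented hard cap of 1000 and the 12 to 48 page-size range are the limits worth enforcing on the client side.

```python
from typing import Any, Dict, List


def build_input(
    search_queries: List[str],
    max_items: int = 24,
    page_size: int = 24,
    enable_pharmacy: bool = False,
) -> Dict[str, Any]:
    """Return an Actor input dict, raising ValueError if a documented limit is violated."""
    # Drop blank keywords so an accidental empty line does not become a query.
    queries = [q.strip() for q in search_queries if q.strip()]
    if not queries:
        raise ValueError("searchQueries needs at least one non-empty keyword")
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    if not 12 <= page_size <= 48:
        raise ValueError("pageSize must be between 12 and 48")
    return {
        "searchQueries": queries,
        "maxItems": max_items,
        "pageSize": page_size,
        "enablePharmacy": enable_pharmacy,
        "proxyConfiguration": {"useApifyProxy": True},
    }


if __name__ == "__main__":
    print(build_input(["make up", "shampoo"], max_items=200))
```

The returned dict can be passed as the run input to the client examples in the API section.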

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchQueries": [
        "make up"
    ],
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("kawsar/dm-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchQueries": ["make up"],
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("kawsar/dm-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchQueries": [
    "make up"
  ],
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call kawsar/dm-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=kawsar/dm-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Dm Scraper",
        "description": "DM Product Scraper calls dm.de's search API by keyword and returns product names, prices, stock status, and image URLs so price trackers, market analysts, and e-commerce teams have clean, structured data to work with without manual browsing.",
        "version": "0.0",
        "x-build-id": "mZH3bFStGcVFXQYrp"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/kawsar~dm-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-kawsar-dm-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/kawsar~dm-scraper/runs": {
            "post": {
                "operationId": "runs-sync-kawsar-dm-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/kawsar~dm-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-kawsar-dm-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "searchQueries"
                ],
                "properties": {
                    "searchQueries": {
                        "title": "Search queries",
                        "type": "array",
                        "description": "One or more keywords to search on dm.de. Enter each keyword on a separate line (e.g. 'make up', 'shampoo', 'sunscreen').",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Maximum items per query",
                        "minimum": 1,
                        "maximum": 1000000,
                        "type": "integer",
                        "description": "Maximum number of products to collect per search query. Hard cap is 1000. Total results = maxItems × number of queries.",
                        "default": 24
                    },
                    "pageSize": {
                        "title": "Page size",
                        "minimum": 12,
                        "maximum": 48,
                        "type": "integer",
                        "description": "Number of products fetched per API request (12 to 48).",
                        "default": 24
                    },
                    "enablePharmacy": {
                        "title": "Include pharmacy products",
                        "type": "boolean",
                        "description": "When enabled, results include pharmacy and health products alongside regular dm catalog items.",
                        "default": false
                    },
                    "requestTimeoutSecs": {
                        "title": "Request timeout (seconds)",
                        "minimum": 5,
                        "maximum": 120,
                        "type": "integer",
                        "description": "How long to wait for each API response before timing out.",
                        "default": 30
                    },
                    "timeoutSecs": {
                        "title": "Actor timeout (seconds)",
                        "minimum": 30,
                        "maximum": 3600,
                        "type": "integer",
                        "description": "Overall actor run timeout in seconds.",
                        "default": 300
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Select proxies to use for requests. Helps avoid IP blocking and rate limits. Datacenter proxies are fastest; Residential proxies are harder to detect."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
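
For projects that cannot use a client library, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. The following sketch only builds the request. `build_request` is a hypothetical helper, and actually sending the request needs a valid Apify token and network access.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.apify.com/v2"
PATH = "/acts/kawsar~dm-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build a POST request with the token as a query parameter and the input as a JSON body."""
    url = f"{BASE_URL}{PATH}?{urllib.parse.urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_request("<YOUR_API_TOKEN>", {"searchQueries": ["make up"]})
    print(request.get_method(), request.full_url)
    # To actually run the Actor (needs a real token and network access):
    # with urllib.request.urlopen(request, timeout=300) as response:
    #     items = json.load(response)
```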
