# dm.de Products Scraper (`shahidirfan/dm-de-products-scraper`) Actor

Scrape dm.de products, prices, descriptions & ratings at scale. Extract across categories, filters & pagination. Perfect for price monitoring, competitor analysis & market research. High-speed extraction with reliability built-in. Real-time data for beauty, health & wellness retailers.

- **URL**: https://apify.com/shahidirfan/dm-de-products-scraper.md
- **Developed by:** [Shahid Irfan](https://apify.com/shahidirfan) (community)
- **Categories:** E-commerce, Automation, Developer tools
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per usage

This Actor is paid per platform usage. The Actor itself is free to use; you pay only for the Apify platform usage it consumes, which gets cheaper on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-usage

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

Actors can be integrated into projects on any stack.
Integrations should be safe, well-documented, and production-ready.
The recommended approach depends on the project's language and tooling, as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## dm.de Product Scraper

Extract product listings from dm.de with reliable pagination and structured dataset output. Collect key product attributes such as identifiers, brand, title, pricing, ratings, images, and product URLs from category and search pages at scale.

### Features

- **Multiple URL support** — Works with dm.de category URLs, dm.de search URLs, and direct product listing URLs
- **Pagination handling** — Automatically collects products across multiple pages
- **Clean output** — Excludes empty and null values from dataset items
- **Deduplicated records** — Prevents duplicate product entries across pages and inputs
- **Configurable run size** — Control result count, pages, and page size

### Use Cases

#### Category Monitoring
Track product assortment changes in specific dm.de categories over time.

#### Price Intelligence
Collect pricing snapshots for competitive analysis and reporting workflows.

#### Product Catalog Research
Build structured product datasets for downstream analytics and enrichment.

---

### Input Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `urls` | Array | Yes | `["https://www.dm.de/haare/haarfarben"]` | One or more source URLs to scrape |
| `results_wanted` | Integer | No | `20` | Maximum number of products to store |
| `max_pages` | Integer | No | `20` | Maximum pages to fetch per URL |
| `proxyConfiguration` | Object | No | `{ "useApifyProxy": false }` | Optional proxy settings |

---

### Output Data

Each dataset item can include:

| Field | Type | Description |
|-------|------|-------------|
| `sourceUrl` | String | Input URL that produced the record |
| `page` | Number | Source page number |
| `dan` | Number | dm internal product identifier |
| `gtin` | Number | Global product identifier |
| `brandName` | String | Product brand |
| `title` | String | Product title |
| `productUrl` | String | Product detail URL |
| `imageUrls` | Array | Product image URLs |
| `price` | String | Displayed price |
| `netPrice` | String | Net price when available |
| `unitInfo` | Array | Unit/price info strings |
| `ratingValue` | Number | Rating average |
| `ratingCount` | Number | Number of ratings |
| `categories` | Array | Product categories |

---

### Usage Examples

#### Category URL

```json
{
  "urls": ["https://www.dm.de/haare/haarfarben"],
  "results_wanted": 50,
  "max_pages": 5
}
````

#### Search URL

```json
{
  "urls": ["https://www.dm.de/search?query=haarfarbe"],
  "results_wanted": 60,
  "max_pages": 4
}
```

#### Multiple URLs

```json
{
  "urls": [
    "https://www.dm.de/haare/haarfarben",
    "https://www.dm.de/search?query=haarfarbe"
  ],
  "results_wanted": 120,
  "max_pages": 6
}
```
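
When several URLs are passed in one run, the `sourceUrl` field on each item identifies which input produced it. The following is a minimal sketch of splitting a result list by source after the run; `group_by_source` is a hypothetical helper name, and the sample items are made up for illustration.

```python
from collections import defaultdict


def group_by_source(items):
    """Group dataset items by the input URL that produced them."""
    groups = defaultdict(list)
    for item in items:
        # Empty values are omitted from items, so fall back to a placeholder key.
        groups[item.get("sourceUrl", "unknown")].append(item)
    return dict(groups)


# Made-up items that only mimic the documented output shape.
sample_items = [
    {"sourceUrl": "https://www.dm.de/haare/haarfarben", "dan": 1},
    {"sourceUrl": "https://www.dm.de/search?query=haarfarbe", "dan": 2},
    {"sourceUrl": "https://www.dm.de/haare/haarfarben", "dan": 3},
]

for source, group in group_by_source(sample_items).items():
    print(source, len(group))
```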

***

### Sample Output

```json
{
  "sourceUrl": "https://www.dm.de/haare/haarfarben",
  "page": 0,
  "dan": 1620805,
  "gtin": 30178120,
  "brandName": "L'ORÉAL PARiS PRÉFÉRENCE",
  "title": "Haarkur Farbglanz Pflegebalsam, 54 ml",
  "productUrl": "https://www.dm.de/p/d/1620805/l-oreal-paris-preference-haarkur-farbglanz-pflegebalsam",
  "imageUrls": [
    "https://products.dm-static.com/images/f_auto,q_auto,c_fit,h_320,w_320/..."
  ],
  "price": "1,95 €",
  "ratingValue": 4.9028,
  "ratingCount": 72,
  "categories": ["Dauerhafte Haarfarben"]
}
```
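
The `price` field is a display string such as `"1,95 €"`, so numeric comparisons need a conversion step. The sketch below assumes German number formatting (a comma as the decimal separator, an optional dot as the thousands separator) and a trailing euro sign; `parse_price` is a hypothetical helper, not part of the Actor.

```python
from decimal import Decimal


def parse_price(text):
    """Convert a displayed price such as '1,95 €' to a Decimal amount in euros."""
    # Drop the currency symbol and any non-breaking spaces around it.
    cleaned = text.replace("€", "").replace("\u00a0", " ").strip()
    # Assumed German format: '.' groups thousands, ',' marks the decimals.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return Decimal(cleaned)


print(parse_price("1,95 €"))
```

`Decimal` is used instead of `float` so that amounts compare and sum exactly.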

***

### Tips for Best Results

#### Use Stable Source URLs

- Prefer canonical category URLs for repeatable runs.
- Use direct search URLs when you want keyword-driven output.
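
Repeatable runs against the same category URL make it possible to compare snapshots over time. As a rough sketch, two result lists can be diffed by the `dan` identifier to see which products appeared or disappeared; `diff_assortment` is a hypothetical helper, and the identifiers below are made up.

```python
def diff_assortment(previous, current):
    """Compare two snapshots (lists of dataset items) keyed by `dan`."""
    # Items without a `dan` are skipped, since empty values are omitted from output.
    prev_ids = {item["dan"] for item in previous if "dan" in item}
    curr_ids = {item["dan"] for item in current if "dan" in item}
    return {
        "added": sorted(curr_ids - prev_ids),
        "removed": sorted(prev_ids - curr_ids),
    }


# Made-up snapshots for illustration only.
earlier = [{"dan": 100}, {"dan": 200}]
later = [{"dan": 200}, {"dan": 300}]
print(diff_assortment(earlier, later))
```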

#### Tune Collection Limits

- Start with `results_wanted: 20` for quick validation.
- Increase `max_pages` only when you need deeper coverage.

#### Proxy Usage

- Enable proxies for large or frequent runs.
- Keep defaults for small development runs.

***

### Integrations

Connect your dataset with:

- **Google Sheets** — Reporting and ad-hoc analysis
- **Airtable** — Structured catalog workflows
- **Webhooks** — Push run data to internal services
- **Make** — No-code automations
- **Zapier** — Trigger actions in business tools

#### Export Formats

- **JSON** — API and app integrations
- **CSV** — Spreadsheet workflows
- **Excel** — Business reporting
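
Because empty and null values are omitted, items in one dataset may not all share the same keys. If you build a CSV yourself from JSON items rather than using the platform export, a fixed column list keeps rows aligned. This is a minimal sketch; `items_to_csv` and the chosen column order are assumptions for illustration.

```python
import csv
import io

# Assumed column selection, taken from the documented output fields.
COLUMNS = ["dan", "gtin", "brandName", "title", "price",
           "ratingValue", "ratingCount", "productUrl"]


def items_to_csv(items, columns=COLUMNS):
    """Serialize dataset items to CSV text with a fixed set of columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        restval="",              # fill fields missing from an item with an empty cell
        extrasaction="ignore",   # skip item keys that are not in `columns`
        lineterminator="\n",
    )
    writer.writeheader()
    for item in items:
        writer.writerow(item)
    return buffer.getvalue()


print(items_to_csv([{"dan": 1620805, "title": "Haarkur Farbglanz Pflegebalsam, 54 ml"}]))
```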

***

### Frequently Asked Questions

#### Does this scraper support multiple URLs in one run?

Yes. Provide one or more URLs in the `urls` array.

#### Will it stop automatically?

Yes. It stops when `results_wanted` is reached or page limits are hit.

#### Are empty values included in output?

No. Empty and null values are removed before records are stored.

#### Can I use it for dm.de search pages?

Yes. Search URLs are supported.

#### Can I control pagination depth?

Yes. Use `max_pages` to cap the number of pages fetched per URL.

***

### Support

For issues or feature requests, use the Apify Console issue channels for this Actor.

#### Resources

- [Apify Documentation](https://docs.apify.com/)
- [Apify API Reference](https://docs.apify.com/api/v2)

***

### Legal Notice

This Actor is designed for legitimate data collection use cases. You are responsible for complying with applicable laws and website terms when using extracted data.

# Actor input Schema

## `urls` (type: `array`):

One or more dm.de URLs to scrape (category URL, /search URL, or direct product-search API URL).

## `results_wanted` (type: `integer`):

Maximum number of products to save.

## `max_pages` (type: `integer`):

Maximum API pages to request per URL.

## `proxyConfiguration` (type: `object`):

Optional proxy setup.

## Actor input object example

```json
{
  "urls": [
    "https://www.dm.de/haare/haarfarben"
  ],
  "results_wanted": 20,
  "max_pages": 20,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```
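
Before starting a run, it can help to check an input object against the documented constraints: `urls` is required, and the OpenAPI schema lists a minimum of 1 for `results_wanted` and `max_pages`. The sketch below is one way to do that on the client side; `build_input` and its argument names are hypothetical.

```python
def build_input(urls, results_wanted=20, max_pages=20, use_apify_proxy=False):
    """Build an Actor input dict, applying the documented defaults and limits."""
    if isinstance(urls, str):
        urls = [urls]  # accept a single URL for convenience
    if not urls:
        raise ValueError("`urls` is required and must contain at least one URL")
    if results_wanted < 1:
        raise ValueError("`results_wanted` must be at least 1")
    if max_pages < 1:
        raise ValueError("`max_pages` must be at least 1")
    return {
        "urls": list(urls),
        "results_wanted": int(results_wanted),
        "max_pages": int(max_pages),
        "proxyConfiguration": {"useApifyProxy": bool(use_apify_proxy)},
    }


print(build_input("https://www.dm.de/haare/haarfarben"))
```

The returned dict can be passed as `run_input` to the Python client call shown in the API section.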

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "urls": [
        "https://www.dm.de/haare/haarfarben"
    ],
    "results_wanted": 20,
    "max_pages": 20
};

// Run the Actor and wait for it to finish
const run = await client.actor("shahidirfan/dm-de-products-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "urls": ["https://www.dm.de/haare/haarfarben"],
    "results_wanted": 20,
    "max_pages": 20,
}

# Run the Actor and wait for it to finish
run = client.actor("shahidirfan/dm-de-products-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "urls": [
    "https://www.dm.de/haare/haarfarben"
  ],
  "results_wanted": 20,
  "max_pages": 20
}' |
apify call shahidirfan/dm-de-products-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=shahidirfan/dm-de-products-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "dm.de Products Scraper",
        "description": "Scrape dm.de products, prices, descriptions & ratings at scale. Extract across categories, filters & pagination. Perfect for price monitoring, competitor analysis & market research. High-speed extraction with reliability built-in. Real-time data for beauty, health & wellness retailers.",
        "version": "0.0",
        "x-build-id": "KLtBxCiuVi7Ng8JCg"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/shahidirfan~dm-de-products-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-shahidirfan-dm-de-products-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/shahidirfan~dm-de-products-scraper/runs": {
            "post": {
                "operationId": "runs-sync-shahidirfan-dm-de-products-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/shahidirfan~dm-de-products-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-shahidirfan-dm-de-products-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "urls"
                ],
                "properties": {
                    "urls": {
                        "title": "URLs",
                        "type": "array",
                        "description": "One or more dm.de URLs to scrape (category URL, /search URL, or direct product-search API URL).",
                        "items": {
                            "type": "string"
                        }
                    },
                    "results_wanted": {
                        "title": "Results wanted",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of products to save.",
                        "default": 20
                    },
                    "max_pages": {
                        "title": "Max pages",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum API pages to request per URL.",
                        "default": 20
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Optional proxy setup.",
                        "default": {
                            "useApifyProxy": false
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
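
For projects that call the REST API directly, the `run-sync-get-dataset-items` path in the specification above can be reached with a plain HTTP POST. The following is a sketch using only the Python standard library; `build_request` and `run_sync` are hypothetical helper names, the timeout value is arbitrary, and the code assumes the response body is JSON.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/shahidirfan~dm-de-products-scraper/run-sync-get-dataset-items"


def build_request(token, run_input):
    """Prepare the POST request; the token goes in the query string per the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token, run_input, timeout=300):
    """Send the request and decode the response (assumed to be JSON)."""
    request = build_request(token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a real network request, so it is left commented out):
# items = run_sync("<YOUR_API_TOKEN>", {"urls": ["https://www.dm.de/haare/haarfarben"]})
```

Splitting request construction from sending keeps the URL and payload easy to inspect without making a network call.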
