# Dutch Energy Contract PDF Extractor (`alkausari_mujahid/dutch-energy-contract-pdf-extractor`) Actor

Extracts structured data from Dutch energy contract PDFs (direct PDF or Dropbox file/folder links): customer & address details, contract term, electricity/gas tariffs, totals, cashback. Supports Energiedirect.nl, Budget Energie, NLE, Essent, Huismerk Energie. Text PDFs only, no OCR.

- **URL**: https://apify.com/alkausari_mujahid/dutch-energy-contract-pdf-extractor.md
- **Developed by:** [Alkausari M](https://apify.com/alkausari_mujahid) (community)
- **Categories:** Integrations, Automation, Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $20.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
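
As a rough budgeting aid, the listed rate can be turned into a per-batch estimate. The sketch below assumes the "from $20.00 / 1,000 results" rate applies flatly and that each processed PDF counts as one billable result; neither assumption is confirmed by the pricing text above, and `estimate_cost_usd` is a hypothetical helper name.

```python
from decimal import Decimal

# Assumption: flat rate taken from the listing ("from $20.00 / 1,000 results").
PRICE_PER_1000_RESULTS = Decimal("20.00")


def estimate_cost_usd(pdf_count: int) -> Decimal:
    """Estimate the charge in USD, assuming one billable result per PDF."""
    if pdf_count < 0:
        raise ValueError("pdf_count must be non-negative")
    cost = PRICE_PER_1000_RESULTS * pdf_count / 1000
    return cost.quantize(Decimal("0.01"))


print(estimate_cost_usd(250))  # 5.00
```

Check the run's billing details in the Apify Console for the actual charge.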

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## Dutch Energy Contract PDF Extractor

**Turn batches of Dutch energy contract PDFs into clean, structured data.** Paste a **direct PDF link or a Dropbox link** — a single PDF or an entire folder — and this Actor downloads every PDF, parses each contract, and returns one tidy record per file. No opening PDFs by hand, no copy-pasting customer details, tariffs, or IBANs.

Tuned for the contract layouts of **Energiedirect.nl, Budget Energie, NLE, Essent, and Huismerk Energie**.

> Built and maintained by **Alkausari M**.

---

### ✦ Highlights

- 📥  **Direct PDF & Dropbox links** — direct `.pdf` URLs, single Dropbox files, *or* whole Dropbox folders (downloaded as a zip and unpacked automatically)
- 🇳🇱  **Dutch supplier-aware** — detects the supplier and parses personal data, address, contract terms, electricity & gas tariffs
- 📊  **Any format** — JSON, CSV, Excel, XML, or live via the Apify API
- 🗂  **One row per PDF** — 28 normalized fields, ready for a CRM, spreadsheet, or database
- ⚡  **Batch in one run** — drop a folder link and process dozens or hundreds of contracts at once
- ⏰  **Automate it** — schedule recurring runs or trigger via API when new contracts land in your shared folder

---

### ⚙ How it works

1. **Share your PDFs** — host them at a **publicly accessible** URL, or put them in a Dropbox location shared as "Anyone with the link."
2. **Paste the link(s)** — any mix of direct PDF URLs and Dropbox file/folder links.
3. **Click Start** — downloading, unzipping, and parsing are handled for you.
4. **Download or pipe** — grab results as JSON, CSV, Excel, or pull from the API.

Example input with direct PDF links:
```jsonc
{
    "source_urls": [
        "https://example.com/contracts/contract1.pdf",
        "https://example.com/contracts/contract2.pdf"
    ]
}
````

Example input with a Dropbox folder:

```jsonc
{
    "source_urls": [
        "https://www.dropbox.com/scl/fo/abc123/def456?rlkey=xxxx&dl=1"
    ]
}
```

> [!TIP]
> You don't need to change the `dl=0`/`dl=1` flag yourself — the Actor forces a direct download automatically. Just paste the share link as Dropbox gives it to you.
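
For readers curious what "forcing a direct download" can look like, here is a minimal sketch of rewriting the `dl` query parameter with the Python standard library. It illustrates the idea only; the Actor's actual implementation is not documented here, and `force_dropbox_download` is a hypothetical name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_dropbox_download(url: str) -> str:
    """Return the URL with its `dl` query parameter set to 1."""
    parts = urlsplit(url)
    # Keep every other parameter (rlkey, st, ...) and drop any existing dl value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dl"
    ]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(force_dropbox_download(
    "https://www.dropbox.com/scl/fo/abc123/def456?rlkey=xxxx&dl=0"
))
```

You do not need to run this yourself before passing links to the Actor.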

> [!IMPORTANT]
> **Links must be public** ("Anyone with the link can view"). Files that require sign-in, a password, or are shared only with specific accounts **cannot be downloaded** and will be skipped. The Actor extracts **embedded text only — it does not run OCR**, so scanned or image-only PDFs won't produce data.

***

### 📦 What you get back

Each processed PDF becomes one structured record:

```json
{
    "file_name": "contract_123.pdf",
    "Leverncier": "Essent",
    "Datum": "12-05-2026",
    "Contract": "Vast 1 jaar",
    "Geslacht": "Man",
    "Voorletters": "J.",
    "Tussenvoegsel": "de",
    "Achternaam": "Vries",
    "Geboortedatum": "01-01-1985",
    "Straat": "Hoofdstraat",
    "Huisnummer": "12",
    "Huisnummer toevoeging": "A",
    "Postcode": "1234 AB",
    "Plaats": "Amsterdam",
    "E-mailadres": "j.devries@example.com",
    "Telefoon": "0612345678",
    "IBAN rekeningnummer": "NL00BANK0123456789",
    "Gewenste startdatum / Ingangsdatum": "01-06-2026",
    "Contractduur": "12 maanden",
    "Variabele leveringskosten normaaltarief": "0,10",
    "Variabele leveringskosten daltarief": "0,09",
    "Stroom Tarief (incl.btw)": "€ 0,12",
    "Stroom Daltarief (incl.btw)": "€ 0,11",
    "Netbeheerkosten": "€ 0,00",
    "Verbruik gas": "1.200",
    "Gas Tarief": "€ 0,98",
    "Totaalkosten": "€ 1.234,56",
    "Welkomstcadeau(cashback)": "€ 50,00"
}
```
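
The date fields in the sample record are strings in `DD-MM-YYYY` form. If you need real date objects downstream, a small helper along these lines may be enough. It assumes every date field follows the sample's format, which the Actor does not guarantee; `parse_dutch_date` is a hypothetical name.

```python
from datetime import date, datetime
from typing import Optional


def parse_dutch_date(value: str) -> Optional[date]:
    """Parse a DD-MM-YYYY string; return None if it is empty or malformed."""
    try:
        return datetime.strptime(value.strip(), "%d-%m-%Y").date()
    except ValueError:
        return None


record = {"Datum": "12-05-2026", "Gewenste startdatum / Ingangsdatum": "01-06-2026"}
start = parse_dutch_date(record["Gewenste startdatum / Ingangsdatum"])
print(start.isoformat())  # 2026-06-01
```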

#### Fields captured

| Field | Description |
|---|---|
| `file_name` | Source PDF file name |
| `Leverncier` | Detected energy supplier (Energiedirect.nl, Budget Energie, NLE, Essent, Huismerk Energie) |
| `Datum` | Document date |
| `Contract` | Contract type / name |
| `Geslacht`, `Voorletters`, `Tussenvoegsel`, `Achternaam`, `Geboortedatum` | Customer personal details |
| `Straat`, `Huisnummer`, `Huisnummer toevoeging`, `Postcode`, `Plaats` | Customer address |
| `E-mailadres`, `Telefoon`, `IBAN rekeningnummer` | Contact & payment details |
| `Gewenste startdatum / Ingangsdatum`, `Contractduur` | Contract start date and duration |
| `Variabele leveringskosten normaaltarief`, `Variabele leveringskosten daltarief` | Variable electricity supply costs (normal and off-peak rates) |
| `Stroom Tarief (incl.btw)`, `Stroom Daltarief (incl.btw)` | Electricity tariffs incl. VAT |
| `Netbeheerkosten` | Grid management costs |
| `Verbruik gas`, `Gas Tarief` | Gas usage and tariff |
| `Totaalkosten` | Total first-year cost |
| `Welkomstcadeau(cashback)` | Welcome gift / cashback |

> [!NOTE]
> Sellers don't always fill every field, and some contracts use a different page layout. Any value that can't be found comes back as an empty string or `N/A` for that field — the rest of the record is still extracted.
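
Because amounts arrive as Dutch-formatted strings and missing values arrive as an empty string or `N/A`, numeric work needs a normalization step first. The following is a sketch, assuming `.` is always a thousands separator and `,` always the decimal mark, as in the sample record; `parse_dutch_amount` is a hypothetical helper.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_dutch_amount(value: str) -> Optional[Decimal]:
    """Convert a value such as '€ 1.234,56' to Decimal; None if missing."""
    cleaned = value.replace("€", "").strip()
    if cleaned in ("", "N/A"):
        return None
    # Assumed Dutch notation: '.' groups thousands, ',' marks decimals.
    normalized = cleaned.replace(".", "").replace(",", ".")
    try:
        return Decimal(normalized)
    except InvalidOperation:
        return None


print(parse_dutch_amount("€ 1.234,56"))  # 1234.56
print(parse_dutch_amount("N/A"))         # None
```

Using `Decimal` rather than `float` avoids rounding surprises when totals are summed or compared.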

***

### 📋 Input

| Field | Description | Required | Default |
|---|---|---|---|
| **PDF / Dropbox links** (`source_urls`) | Public links to PDFs: direct `.pdf` URLs and/or Dropbox file or folder links. All PDFs found are downloaded and processed. | Yes | — |

***

### 💡 Use cases

- **Bulk onboarding** — convert a folder of signed contracts into a customer spreadsheet in minutes.
- **CRM data entry** — feed structured records straight into your CRM or database instead of typing them.
- **Auditing & QA** — quickly compare tariffs, durations, and totals across many contracts.
- **Automated intake** — schedule the Actor to process new contracts as they're dropped into a shared Dropbox folder.
- **Data pipelines** — pipe results to Google Sheets, BigQuery, or webhooks via the Apify API.
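
Apify can export a dataset as CSV directly, but if you have already fetched the items in your own code, writing a spreadsheet locally takes only a few lines. This sketch assumes the items are plain dictionaries such as those returned by the API client; `records_to_csv` is a hypothetical helper.

```python
import csv
import io


def records_to_csv(records: list[dict]) -> str:
    """Serialize records to CSV text, using the union of keys as the header."""
    fieldnames: list[str] = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval fills in fields that a given record does not contain.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


sample = [
    {"file_name": "a.pdf", "Postcode": "1234 AB"},
    {"file_name": "b.pdf"},
]
print(records_to_csv(sample))
```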

***

### 💰 Cost estimation

The Actor spends most of its time downloading PDFs and parsing text locally, which is lightweight — the main cost driver is the number and size of your PDFs. A batch of a few dozen contracts typically finishes in minutes and uses only a small number of compute units. Exact usage is shown on the run's detail page in the Apify Console.

***

### ❓ FAQ

**Which suppliers are supported?**
The parser is tuned for Energiedirect.nl, Budget Energie, NLE, Essent, and Huismerk Energie. It expects the standard 3-section layout (personal details → electricity → gas). PDFs with significantly different layouts may yield partial results.

**Can it read scanned contracts?**
No. It extracts the text embedded in the PDF and does **not** perform OCR. Image-only or scanned PDFs won't produce data.

**Why is a file missing from the results?**
The most common reasons: the link isn't public, the file requires sign-in, or it's a scanned/image PDF. Check the run log — skipped files and download errors are reported there.
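
If you know which files a batch should contain, you can spot the gaps programmatically before digging through the log. The sketch below assumes the `file_name` field matches your source file names apart from letter case; `find_missing_files` is a hypothetical helper.

```python
def find_missing_files(expected_names, items):
    """Return expected PDF names that have no record in the results."""
    returned = {str(item.get("file_name", "")).lower() for item in items}
    return sorted(name for name in expected_names if name.lower() not in returned)


items = [{"file_name": "contract_123.pdf"}, {"file_name": "contract_456.pdf"}]
expected = ["contract_123.pdf", "contract_456.pdf", "contract_789.pdf"]
print(find_missing_files(expected, items))  # ['contract_789.pdf']
```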

**Are folder links supported?**
Yes — that's the recommended way. Dropbox folder links are downloaded as a single zip and unpacked automatically, so one link can deliver an entire batch of contracts.

**Is this legal / GDPR-safe?**
Only use the Actor on documents you own or are authorized to process. Extracted records contain **personal data** (names, addresses, IBANs) — handle them in accordance with GDPR and your organization's data-protection policies.
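
One practical precaution is to mask sensitive values before records reach logs or shared dashboards. The sketch below masks the `IBAN rekeningnummer` field and is only an illustration: it is not a compliance measure on its own, and `mask_iban` and `redact_record` are hypothetical helpers.

```python
def mask_iban(iban: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    compact = iban.replace(" ", "")
    if len(compact) <= visible:
        return compact
    return "*" * (len(compact) - visible) + compact[-visible:]


def redact_record(record: dict) -> dict:
    """Return a copy of the record with the IBAN masked."""
    redacted = dict(record)
    if redacted.get("IBAN rekeningnummer"):
        redacted["IBAN rekeningnummer"] = mask_iban(redacted["IBAN rekeningnummer"])
    return redacted


print(redact_record({"file_name": "a.pdf", "IBAN rekeningnummer": "NL00BANK0123456789"}))
```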

***

### 📮 Support

Bugs, feature requests, a new supplier layout, or custom work — open an issue on Apify or email **<alkausarimujahid@gmail.com>**.

***

<sub>Built by **Alkausari M**. This Actor is independent and not affiliated with, endorsed by, or sponsored by any energy supplier. Use extracted data responsibly and in accordance with GDPR and applicable law.</sub>

# Actor input Schema

## `source_urls` (type: `array`):

Public links to PDFs. Supports direct PDF URLs (https://.../file.pdf) and Dropbox file or folder links. Dropbox folders and .zip links are downloaded and unpacked automatically. All PDFs found are processed.
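
Before starting a run, it can help to sanity-check the links you are about to submit. The sketch below sorts URLs into rough categories based on the link types described above. The category labels and the `classify_source_url` helper are hypothetical, and the Actor's own detection logic may differ.

```python
from urllib.parse import urlsplit


def classify_source_url(url: str) -> str:
    """Label a link as 'dropbox', 'direct_pdf', 'zip', 'unknown', or 'invalid'."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return "invalid"
    host = parts.hostname.lower()
    if host == "dropbox.com" or host.endswith(".dropbox.com"):
        return "dropbox"
    path = parts.path.lower()
    if path.endswith(".pdf"):
        return "direct_pdf"
    if path.endswith(".zip"):
        return "zip"
    return "unknown"


for link in [
    "https://example.com/contracts/contract1.pdf",
    "https://www.dropbox.com/scl/fo/abc123/def456?rlkey=xxxx&dl=1",
]:
    print(classify_source_url(link), link)
```

Links labeled `unknown` or `invalid` are worth checking by hand before you submit them.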

## Actor input object example

```json
{
  "source_urls": [
    "https://www.dropbox.com/scl/fo/d9pe7y8agotptzvs101p8/ACeijK3eOMrJzeGs_txDLv0?rlkey=ywfsjypcht64bbn9arcnnvo0m&st=0918gtoe&dl=0"
  ]
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "source_urls": [
        "https://www.dropbox.com/scl/fo/d9pe7y8agotptzvs101p8/ACeijK3eOMrJzeGs_txDLv0?rlkey=ywfsjypcht64bbn9arcnnvo0m&st=0918gtoe&dl=0"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("alkausari_mujahid/dutch-energy-contract-pdf-extractor").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "source_urls": ["https://www.dropbox.com/scl/fo/d9pe7y8agotptzvs101p8/ACeijK3eOMrJzeGs_txDLv0?rlkey=ywfsjypcht64bbn9arcnnvo0m&st=0918gtoe&dl=0"] }

# Run the Actor and wait for it to finish
run = client.actor("alkausari_mujahid/dutch-energy-contract-pdf-extractor").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "source_urls": [
    "https://www.dropbox.com/scl/fo/d9pe7y8agotptzvs101p8/ACeijK3eOMrJzeGs_txDLv0?rlkey=ywfsjypcht64bbn9arcnnvo0m&st=0918gtoe&dl=0"
  ]
}' |
apify call alkausari_mujahid/dutch-energy-contract-pdf-extractor --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=alkausari_mujahid/dutch-energy-contract-pdf-extractor",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Dutch Energy Contract PDF Extractor",
        "description": "Extracts structured data from Dutch energy contract PDFs (direct PDF or Dropbox file/folder links): customer & address details, contract term, electricity/gas tariffs, totals, cashback. Supports Energiedirect.nl, Budget Energie, NLE, Essent, Huismerk Energie. Text PDFs only, no OCR.",
        "version": "0.0",
        "x-build-id": "cvtxJEu5hbIhqMFjL"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/alkausari_mujahid~dutch-energy-contract-pdf-extractor/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-alkausari_mujahid-dutch-energy-contract-pdf-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/alkausari_mujahid~dutch-energy-contract-pdf-extractor/runs": {
            "post": {
                "operationId": "runs-sync-alkausari_mujahid-dutch-energy-contract-pdf-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/alkausari_mujahid~dutch-energy-contract-pdf-extractor/run-sync": {
            "post": {
                "operationId": "run-sync-alkausari_mujahid-dutch-energy-contract-pdf-extractor",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "source_urls"
                ],
                "properties": {
                    "source_urls": {
                        "title": "PDF / Dropbox links",
                        "type": "array",
                        "description": "Public links to PDFs. Supports direct PDF URLs (https://.../file.pdf) and Dropbox file or folder links. Dropbox folders and .zip links are downloaded and unpacked automatically. All PDFs found are processed.",
                        "items": {
                            "type": "string"
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
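
If you prefer not to install a client library, the `run-sync-get-dataset-items` endpoint from the specification above can be called with the Python standard library. This is a minimal sketch: `build_sync_request` is a hypothetical helper, and since the specification gives no response schema, the response is assumed to be a JSON array of dataset items.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = (
    "/acts/alkausari_mujahid~dutch-energy-contract-pdf-extractor"
    "/run-sync-get-dataset-items"
)


def build_sync_request(token: str, source_urls: list[str]) -> urllib.request.Request:
    """Build the POST request; the token goes in the query string per the spec."""
    url = f"{API_BASE}{ACTOR_PATH}?{urlencode({'token': token})}"
    body = json.dumps({"source_urls": source_urls}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending the request needs a real token and network access:
# request = build_sync_request("<YOUR_API_TOKEN>", ["https://example.com/contracts/contract1.pdf"])
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)  # assumed: a JSON array of records
```

Synchronous calls hold the connection open until the run finishes, so for large batches the asynchronous `/runs` endpoint may be the safer choice.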
