Pricing

from $2.20 / 1,000 results

German Imprint Scraper with Decision Makers Names Extraction

An Actor that automatically locates and scrapes key contact details from German website imprint pages (Impressum). It extracts information such as company name, address, phone numbers, emails, and decision-makers (Entscheider, Entscheidungsträger).

Rating

4.1

(2)

Developer

Dominic M. Quaiser

German Imprint Scraper

A Python-based Apify Actor designed to find and extract contact and legal information from German imprint pages ("Impressum"). Simply provide a list of website homepages, and the actor will automatically locate the imprint page and scrape key details like company name, address, phone number, email, and commercial register number.

Beta Version Notice: This actor is currently in beta. While it's fully functional and returns results, you may encounter occasional quirks or incomplete features. I welcome your feedback! Please report any issues or suggestions you have.

💡 Features

Automatic Imprint Page Discovery: Intelligently crawls websites to find the correct imprint page from your starting URLs.
Selective Data Extraction: Choose exactly which data points you need, from basic contact info to advanced details like company decision-makers.
Dual Fetching Technology:
- HTTP Mode: A fast, lightweight method for scraping simple, server-rendered websites.
- Headless Browser Mode (Playwright): A powerful option for modern, JavaScript-heavy websites. The actor can be configured to use this mode for all sites or as an automatic fallback if the standard HTTP method fails, ensuring maximum success rates.
Proxy Support: Integrates seamlessly with Apify's proxy service to handle IP rotation and avoid blocking.
Customizable Output: Include optional metadata or error records for detailed analysis and troubleshooting.
Structured JSON Output: Delivers clean, well-structured data ready for use in your applications, databases, or CRM systems.
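
The fallback behaviour described under Dual Fetching Technology can be sketched roughly as follows. This is only an illustration of the general pattern, not the Actor's actual code: `fetch_with_fallback`, the two fetcher callables, and the `"playwright"` value for `fetch_method` are assumptions made for this example (the documented output only shows `"http"`).

```python
from typing import Callable

# A fetcher takes a URL and returns HTML, raising an exception on failure.
Fetcher = Callable[[str], str]


def fetch_with_fallback(
    url: str,
    http_fetch: Fetcher,
    browser_fetch: Fetcher,
    use_browser_for_all: bool = False,
) -> dict:
    """Hypothetical sketch: try HTTP first, fall back to the browser on error."""
    if use_browser_for_all:
        # Mirrors usePlaywright=true: skip the HTTP attempt entirely.
        html = browser_fetch(url)
        return {"html": html, "fetch_method": "playwright", "fallback_attempted": False}
    try:
        html = http_fetch(url)
        return {"html": html, "fetch_method": "http", "fallback_attempted": False}
    except Exception:
        # The HTTP attempt failed, so retry once with the headless browser.
        html = browser_fetch(url)
        return {"html": html, "fetch_method": "playwright", "fallback_attempted": True}


# Stub fetchers so the sketch runs without network access.
def failing_http(url: str) -> str:
    raise RuntimeError("simulated HTTP failure")


def fake_browser(url: str) -> str:
    return "<html><body>Impressum</body></html>"


result = fetch_with_fallback("https://muster-firma.de/", failing_http, fake_browser)
print(result["fetch_method"], result["fallback_attempted"])
```

Passing the fetchers in as callables keeps the decision logic separate from any particular HTTP or browser library.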

📥 Input Parameters

Configure the actor's behavior using these fields in the Apify Console Input tab or via API:

Field	Type	Description	Default	Required
`startUrls`	Array	Enter the homepage URLs of the websites to process.	`[{ "url": "https://www.vita-cola.de/" }]`	Yes
`fieldsToExtract`	Array	Choose the specific pieces of information you want to collect.	`["company_name", "business_address"]`	No
`usePlaywright`	Boolean	Use a headless browser for all websites. Slower but more reliable for JavaScript-heavy sites.	`false`	No
`metaData`	Boolean	Include technical details in the output.	`false`	No
`errorOutput`	Boolean	Include a row in the output for each website that failed to process.	`false`	No
`debugLog`	Boolean	Generate a verbose log for troubleshooting.	`false`	No
`proxyConfiguration`	Object	Proxy settings. Apify Proxy is recommended.	`{ "useApifyProxy": true }`	No
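
As a rough illustration of the table above, the following sketch assembles an input object in Python. The field names and defaults come from the table; the helper `build_input` is hypothetical and not part of the Actor.

```python
import copy
from typing import Any

# Defaults copied from the input table; startUrls is required and has no default here.
DEFAULTS: dict[str, Any] = {
    "fieldsToExtract": ["company_name", "business_address"],
    "usePlaywright": False,
    "metaData": False,
    "errorOutput": False,
    "debugLog": False,
    "proxyConfiguration": {"useApifyProxy": True},
}


def build_input(homepages: list[str], **overrides: Any) -> dict[str, Any]:
    """Hypothetical helper: build a run input from a list of homepage URLs."""
    if not homepages:
        raise ValueError("startUrls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown input fields: {sorted(unknown)}")
    run_input: dict[str, Any] = {"startUrls": [{"url": url} for url in homepages]}
    run_input.update(copy.deepcopy(DEFAULTS))  # start from the documented defaults
    run_input.update(overrides)                # then apply caller overrides
    return run_input


run_input = build_input(
    ["https://www.vita-cola.de/"],
    fieldsToExtract=["company_name", "emails", "vat_id"],
    usePlaywright=True,
)
print(run_input)
```

The resulting dictionary can be pasted into the Console as JSON or passed as `run_input` when starting the Actor through the Apify Python client.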

📤 Output Data Structure

The exact fields depend on your fieldsToExtract selection.

Example Output

{
  "start_url": "https://muster-firma.de/",
  "imprint_url": "https://muster-firma.de/impressum",
  "company_name": {
    "name": "Muster GmbH",
    "confidence": 1
  },
  "business_address": {
    "full_address": "Musterstraße 123, 12345 Berlin",
    "street": "Musterstraße",
    "house_number": "123",
    "postal_code": "12345",
    "city": "Berlin"
  },
  "phone_number": {
    "phone_1": "+493012345678"
  },
  "fax_number": {
    "fax_1": "+493012345679"
  },
  "emails": {
    "email_1": "kontakt@muster-firma.de"
  },
  "register_number": {
    "number": "HRB 12345 B",
    "court": "Amtsgericht Charlottenburg"
  },
  "vat_id": {
    "vat_id": "DE123456788"
  },
  "social_media": {
    "linkedin": "https://www.linkedin.com/company/muster-firma"
  },
  "decision_makers": ["Max Mustermann"],
  "metadata": {
    "domain": "muster-firma.de",
    "fetch_method": "http",
    "fallback_attempted": false,
    "scraped_at": "2025-08-28T12:04:48.003780"
  }
}

Note: Numbered entries, such as emails and phone numbers, are sorted by how likely each one is to be the company's main contact.
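
A minimal sketch of reading the numbered entries from one output record. It assumes that the lowest-numbered key (for example `email_1`) is the most likely main contact, which is how the note above reads; `ordered_values` is a hypothetical helper, and the record is a trimmed copy of the example output.

```python
import re
from typing import Optional

# Trimmed copy of the example record shown above.
record = {
    "start_url": "https://muster-firma.de/",
    "phone_number": {"phone_1": "+493012345678"},
    "emails": {"email_1": "kontakt@muster-firma.de"},
    "decision_makers": ["Max Mustermann"],
}


def ordered_values(group: Optional[dict]) -> list:
    """Return the values of keys like email_1, email_2 in ascending numeric order."""
    if not group:
        return []
    numbered = []
    for key, value in group.items():
        match = re.fullmatch(r"[a-z]+_(\d+)", key)
        if match:
            numbered.append((int(match.group(1)), value))
    # Sort on the numeric suffix so that email_10 comes after email_2.
    return [value for _, value in sorted(numbered)]


emails = ordered_values(record.get("emails"))
faxes = ordered_values(record.get("fax_number"))  # field absent in this record
primary_email = emails[0] if emails else None
print(primary_email, faxes)
```

Using `record.get(...)` keeps the code safe for fields that were not selected in `fieldsToExtract` and are therefore missing from the record.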

📊 Extractable Data in Detail

You can select any combination of the following fields for extraction:

Field	Description	Data Structure
`company_name`	Extracts the official company name. The result includes a `confidence` score indicating the likelihood of a correct match. The higher the number, the lower the confidence.	`Object`
`business_address`	Parses the full business address into structured components: `full_address`, `street`, `house_number`, `postal_code`, and `city`.	`Object`
`phone_number`	Finds and extracts one or more phone numbers from the page. Results are keyed as `phone_1`, `phone_2`, etc.	`Object`
`fax_number`	Finds and extracts one or more fax numbers from the page. Results are keyed as `fax_1`, `fax_2`, etc.	`Object`
`emails`	Finds and extracts one or more email addresses. The extractor prioritizes emails that match the website's domain.	`Object`
`register_number`	Extracts the commercial register number ("Handelsregisternummer") and the corresponding registration `court` (`Registergericht`).	`Object`
`vat_id`	Extracts the German VAT ID ("Umsatzsteuer-Identifikationsnummer") with checksum validation. Returns the single best match in the format "DE123456789".	`Object`
`social_media`	Scans for and extracts links to common social media platforms like LinkedIn, Xing, Facebook, Instagram, etc.	`Object`
`decision_makers`	(Premium) Identifies and extracts the names of key decision-makers ("Entscheidungsträger"). This feature uses an external NER (Named Entity Recognition) machine learning model to ensure accuracy.	`Array`
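
The `vat_id` row mentions checksum validation without naming the algorithm. As a sketch, the check below uses the ISO 7064 MOD 11,10 scheme that is commonly documented for German VAT IDs; whether the Actor implements exactly this check is an assumption, and `is_valid_german_vat_id` is a hypothetical name.

```python
import re


def is_valid_german_vat_id(vat_id: str) -> bool:
    """Check 'DE' plus 9 digits, where the last digit is a MOD 11,10 check digit."""
    match = re.fullmatch(r"DE(\d{9})", vat_id.replace(" ", "").upper())
    if not match:
        return False
    digits = [int(ch) for ch in match.group(1)]
    product = 10
    for digit in digits[:8]:
        total = (digit + product) % 10
        if total == 0:
            total = 10
        product = (2 * total) % 11
    check = 11 - product
    if check == 10:
        check = 0
    return check == digits[8]


# The VAT ID from the example output above.
print(is_valid_german_vat_id("DE123456788"))
```

A check like this can be useful downstream to re-validate VAT IDs after they have been edited or merged with data from other sources.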

💲 Pricing

This actor uses a pay-per-event pricing model. You are charged based on your usage, ensuring you only pay for what you need. The costs are as follows:

Actor Start: $0.10 per run
Per Website:
- Website Processed: $0.0004 for each URL from your input list
- Successful Result: $0.0026 for each website where data is successfully extracted
- Decision Maker Extracted: $0.0006 for each website where decision-makers are found (this is in addition to the successful result charge)
- Maximum Sum: $0.0036 per Website
Per 1000 Websites:
- Website Processed: $0.40 for 1000 URLs from your input list
- Successful Result: $2.60 for 1000 websites where data is successfully extracted
- Decision Maker Extracted: $0.60 for 1000 websites where decision-makers are found (this is in addition to the successful result charge)
- Maximum Sum: $3.60 per 1000 Websites
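
The price list above can be turned into a small estimator. This is a sketch using only the event prices listed; the function name `estimate_cost` is hypothetical, and it assumes the decision-maker charge applies only to websites that also count as successful results.

```python
from decimal import Decimal

# Event prices in USD, taken from the list above.
ACTOR_START = Decimal("0.10")
WEBSITE_PROCESSED = Decimal("0.0004")
SUCCESSFUL_RESULT = Decimal("0.0026")
DECISION_MAKER = Decimal("0.0006")


def estimate_cost(urls: int, successes: int, with_decision_makers: int = 0) -> Decimal:
    """Estimate the cost of one run in USD."""
    if not 0 <= with_decision_makers <= successes <= urls:
        raise ValueError("expected 0 <= with_decision_makers <= successes <= urls")
    return (
        ACTOR_START
        + urls * WEBSITE_PROCESSED
        + successes * SUCCESSFUL_RESULT
        + with_decision_makers * DECISION_MAKER
    )


# Worst case for 1000 URLs: every site succeeds and yields decision-makers.
worst_case = estimate_cost(1000, 1000, 1000)
# A more typical mix: 800 successes, 500 of them with decision-makers.
typical = estimate_cost(1000, 800, 500)
print(worst_case, typical)
```

For 1000 URLs the worst case is the $3.60 maximum from the list plus the $0.10 start fee, so $3.70 per run. `Decimal` is used instead of `float` to avoid rounding noise in the fractions of a cent.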

⚙️ Usage

Input URLs: Go to the Input tab and paste the homepage URLs of the websites you want to scrape.
Select Data: In the fieldsToExtract dropdown, select all the data points you wish to collect.
Configure Settings: Adjust settings like usePlaywright or proxyConfiguration as needed.
Start the Actor: Click the Start button.
Get Data: Once the run is finished, find your results in the Storage → Dataset tab.

🤖 Other Actors

🔗 Combine with Other Actors for Powerful Workflows.

You can enhance your data processing pipelines by combining the German Imprint Scraper with other Apify actors.

For example, you might also check out:

Gelbe Seiten (German Yellow Pages) Scraper - Extract business listings from Germany's Yellow Pages with three detail levels
Phone Number Formatter - Parse, validate, and format phone numbers in bulk across international formats

🎯 Use Cases

Lead Generation: Build targeted contact lists for sales and marketing.
Compliance & Verification: Check for legally compliant imprint information.
Market Research: Aggregate data on companies in a specific industry or region.
Data Enrichment: Enhance existing company profiles with official contact and registration details.

⚖️ Legal Disclaimer

You are solely responsible for determining the legality of your use of this actor and the data it generates. The scraping and handling of data, particularly personal information, is subject to complex legal frameworks like the General Data Protection Regulation (GDPR/DSGVO), copyright laws, and the terms of service of the websites you scrape. It is your responsibility to ensure your use case is compliant with all applicable laws. This text does not constitute legal advice.

Please be aware that the decision_makers feature uses an external API hosted on a private server in Europe for data processing.

What is Processed: The text of the imprint page is sent to this API to identify personal names.
Why: This is necessary for the Named Entity Recognition (NER) model to accurately extract decision-makers.
Data Controller: You, the user, are the data controller. The actor's developer acts as the data processor for this specific task.
Location & Compliance: All processing for this feature occurs within the EU (Germany) and is subject to GDPR (DSGVO).
Data Storage: The text is processed in-memory and is not stored or logged on the external server.
Important: This processing is external to the Apify platform and is not covered by Apify's DPA. By using this feature, you acknowledge this separate data processing activity.

🛠️ Maintainer

Author: Dominic M. Quaiser
Contact: mail@dominic-quaiser.io
Website: dominic-quaiser.io
