German Imprint Leads Scraper

Extract German Impressum legal contacts, company details, VAT IDs, HRB records, emails, and decision-makers from domains.

Pricing

Pay per event

Developer

Stas Persiianenko

Maintained by Community

Last modified: 4 days ago

Extract structured legal and contact data from German public Impressum pages.

Use this actor when you already have a list of German company domains and need CRM-ready enrichment: company name, legal form, registered address, phone numbers, emails, VAT ID, Handelsregister details, managing directors, responsible persons, social links, source snippets, and confidence flags.

What does German Imprint Leads Scraper do?

It visits each submitted domain, checks common German legal-contact pages such as /impressum, /service/impressum, /imprint, and /kontakt, follows likely footer links, and saves one structured lead record per domain.
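
As a rough illustration of the first step, the sketch below builds the list of candidate URLs for one submitted domain. The path list is taken from the description above; the function name `candidate_urls`, the ordering, and the choice to include the site root are assumptions, and the actor's real discovery logic (including footer-link scoring) may differ.

```python
from urllib.parse import urljoin, urlsplit

# Paths named in the description above; the actor's actual list and order may differ.
CANDIDATE_PATHS = ["/impressum", "/service/impressum", "/imprint", "/kontakt"]


def candidate_urls(start_url: str) -> list[str]:
    """Return the site root followed by each candidate legal-page URL."""
    # Accept bare domains such as "rewe.de" by assuming HTTPS.
    if "://" not in start_url:
        start_url = f"https://{start_url}"
    parts = urlsplit(start_url)
    root = f"{parts.scheme}://{parts.netloc}"
    # The root page is included because footer links are discovered there.
    return [root + "/"] + [urljoin(root, path) for path in CANDIDATE_PATHS]


if __name__ == "__main__":
    for url in candidate_urls("https://www.rewe.de"):
        print(url)
```

Submitting a deep link such as a product page still resolves to the same root-level candidates, because only the scheme and host are kept.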

Who is it for?

  • 🧑‍💼 Sales teams enriching German B2B account lists
  • 🧾 Compliance teams checking public company disclosures
  • 🧲 Lead-generation agencies building Germany-specific datasets
  • 🧑‍💻 Recruiters finding company decision-makers
  • 🧹 CRM operations teams normalizing German legal contacts

Why use it?

German websites often place high-value company data in the Impressum instead of on a marketing contact page. This actor targets that legal-contact workflow directly instead of returning generic page text.

What data can it extract?

| Field | Description |
| --- | --- |
| inputUrl | Submitted domain or URL |
| imprintUrl | Best Impressum/contact page found |
| companyName | Legal company name when detected |
| legalForm | GmbH, AG, KG, UG, e.K., and similar forms |
| address | Registered or legal address snippet |
| emails | Public email addresses |
| phoneNumbers | Public phone numbers |
| vatId | German VAT ID / USt-IdNr |
| registrationCourt | Amtsgericht / register court |
| registrationNumber | HRB/HRA registration number |
| managingDirectors | Geschäftsführer, Vorstand, or similar names |
| responsiblePerson | Responsible person when disclosed |
| confidenceFlags | Flags showing which important fields were found |
| sourceSnippets | Text snippets for verification |

How much does it cost to extract German Impressum leads?

The actor uses pay-per-event pricing with a small start fee and a per-result fee. Current configured pricing is:

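| Event | Free | Bronze | Silver | Gold | Platinum | Diamond |
| --- | --- | --- | --- | --- | --- | --- |
| Run start | $0.005 | $0.005 | $0.005 | $0.005 | $0.005 | $0.005 |
| Result extracted | $0.0006508 | $0.00056591 | $0.00044141 | $0.00033955 | $0.00022636 | $0.00015845 |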

Example estimates before Apify platform fees: 100 extracted domains cost about $0.070 on Free, $0.062 on Bronze, and $0.039 on Gold, including the start event. The default two-domain prefill costs about $0.0063 on Free, so it stays suitable for a quick first test.
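
The estimates above follow from one start fee plus one result fee per extracted domain. The sketch below reproduces that arithmetic; the function name `estimate_cost` is hypothetical, and it assumes exactly one "Result extracted" event per submitted domain and excludes Apify platform fees.

```python
# Fees copied from the pricing table above (USD, before Apify platform fees).
RUN_START_FEE = 0.005
RESULT_FEE = {
    "Free": 0.0006508,
    "Bronze": 0.00056591,
    "Silver": 0.00044141,
    "Gold": 0.00033955,
    "Platinum": 0.00022636,
    "Diamond": 0.00015845,
}


def estimate_cost(domains: int, tier: str = "Free") -> float:
    """Estimate run cost, assuming one result event per submitted domain."""
    return RUN_START_FEE + domains * RESULT_FEE[tier]


if __name__ == "__main__":
    for tier in ("Free", "Bronze", "Gold"):
        print(f"100 domains on {tier}: ${estimate_cost(100, tier):.3f}")
    print(f"2 domains on Free: ${estimate_cost(2):.4f}")
```

Domains that end in not_found or error may be billed differently, so treat the result as an upper-level estimate rather than an invoice.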

Input

Provide domains or URLs in startUrls.

{
    "startUrls": [
        { "url": "https://www.rewe.de" },
        { "url": "https://www.dm.de" }
    ],
    "maxPagesPerDomain": 8,
    "includeSubpages": true,
    "proxyConfiguration": { "useApifyProxy": false }
}
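
If your domains live in a spreadsheet or text file, a small helper can turn them into this input shape. The sketch below is one possible approach; `build_input` is a hypothetical name, and the HTTPS default and the de-duplication rule are assumptions rather than actor behavior.

```python
def build_input(domains: list[str], max_pages: int = 8) -> dict:
    """Build an actor input dict from raw domain strings."""
    seen: set[str] = set()
    start_urls: list[dict] = []
    for raw in domains:
        entry = raw.strip()
        if not entry:
            continue  # skip blank rows
        # Assume HTTPS when the entry has no scheme.
        has_scheme = entry.startswith(("http://", "https://"))
        url = entry if has_scheme else f"https://{entry}"
        key = url.lower()
        if key in seen:
            continue  # drop exact duplicates (case-insensitive)
        seen.add(key)
        start_urls.append({"url": url})
    return {
        "startUrls": start_urls,
        "maxPagesPerDomain": max_pages,
        "includeSubpages": True,
        "proxyConfiguration": {"useApifyProxy": False},
    }


if __name__ == "__main__":
    print(build_input(["rewe.de", " https://www.dm.de ", "", "rewe.de"]))
```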

Output

Each dataset item represents one submitted domain or URL.

{
    "inputUrl": "https://www.rewe.de",
    "inputDomain": "rewe.de",
    "imprintUrl": "https://www.rewe.de/service/impressum/",
    "status": "found",
    "companyName": "REWE Markt GmbH",
    "legalForm": "GmbH",
    "emails": ["impressum@rewe.de"],
    "vatId": "DE812706034",
    "registrationNumber": "HRB 66773",
    "confidenceFlags": ["company_name_found", "email_found"]
}
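
Before loading records into a CRM, it can help to sanity-check the shape of the extracted identifiers. The sketch below is a minimal format check, not part of the actor: a German VAT ID is "DE" followed by nine digits, and the register pattern only checks for an HRA/HRB prefix followed by digits. The function name `format_issues` is hypothetical, and neither check confirms that the number is actually registered.

```python
import re

# German VAT IDs (USt-IdNr) are "DE" plus nine digits.
VAT_ID_RE = re.compile(r"^DE\d{9}$")
# Loose check: HRA or HRB prefix followed by digits; suffixes are tolerated.
REGISTER_RE = re.compile(r"^HR[AB]\s?\d+")


def format_issues(item: dict) -> list[str]:
    """Return the names of fields whose value is present but malformed."""
    issues = []
    vat_id = (item.get("vatId") or "").replace(" ", "")
    if vat_id and not VAT_ID_RE.match(vat_id):
        issues.append("vatId")
    register = (item.get("registrationNumber") or "").strip()
    if register and not REGISTER_RE.match(register):
        issues.append("registrationNumber")
    return issues


if __name__ == "__main__":
    sample = {"vatId": "DE812706034", "registrationNumber": "HRB 66773"}
    print(format_issues(sample))
```

Missing fields are not reported as issues here, since the confidence flags already indicate which fields were found.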

How to use it

  1. Prepare a list of German domains or websites.
  2. Paste them into the Start URLs field.
  3. Keep maxPagesPerDomain low for quick enrichment.
  4. Run the actor.
  5. Export the dataset as JSON, CSV, Excel, or via API.
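
If you fetch the dataset as JSON and want your own CSV layout, list fields such as emails need flattening first. The sketch below shows one way to do that with the standard library; `items_to_csv` and the "; " separator are illustrative choices, not the format of Apify's built-in CSV export.

```python
import csv
import io


def items_to_csv(items: list[dict], columns: list[str]) -> str:
    """Render dataset items as CSV text, joining list values with '; '."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for item in items:
        row = {}
        for column in columns:
            value = item.get(column, "")
            # Flatten lists (emails, phoneNumbers, ...) into a single cell.
            row[column] = "; ".join(value) if isinstance(value, list) else value
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    items = [{"inputUrl": "https://www.rewe.de", "emails": ["impressum@rewe.de"]}]
    print(items_to_csv(items, ["inputUrl", "companyName", "emails"]))
```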

Tips for better results

  • Submit homepages, not random blog posts.
  • Keep includeSubpages enabled so footer Impressum links are followed.
  • Use no proxy first; most public legal pages are accessible directly.
  • Increase maxPagesPerDomain only for sites with unusual navigation.

Status values

  • found means an Impressum/contact page was located and parsed.
  • not_found means pages were checked but no legal-contact page scored high enough.
  • error means the domain could not be processed due to a network or parsing error.
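
A quick tally of these values shows how a run went overall. The sketch below assumes the dataset items have already been loaded as a list of dicts; `summarize_status` is a hypothetical helper name.

```python
from collections import Counter


def summarize_status(items: list[dict]) -> Counter:
    """Count dataset items per status value."""
    # Items without a status field are grouped under "unknown".
    return Counter(item.get("status", "unknown") for item in items)


if __name__ == "__main__":
    items = [
        {"inputUrl": "https://www.rewe.de", "status": "found"},
        {"inputUrl": "https://www.dm.de", "status": "found"},
        {"inputUrl": "https://example.de", "status": "not_found"},
    ]
    for status, count in summarize_status(items).most_common():
        print(f"{status}: {count}")
```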

Confidence flags

Confidence flags help filter records:

  • company_name_found
  • address_found
  • email_found
  • phone_found
  • vat_id_found
  • registration_found
  • decision_maker_found
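
To decide which flags are worth filtering on, it can be useful to see how often each one appears in a run. The sketch below computes the share of records carrying each flag; the flag names come from the list above, while `flag_coverage` is a hypothetical helper.

```python
from collections import Counter

# Flag names as listed above.
ALL_FLAGS = [
    "company_name_found",
    "address_found",
    "email_found",
    "phone_found",
    "vat_id_found",
    "registration_found",
    "decision_maker_found",
]


def flag_coverage(items: list[dict]) -> dict[str, float]:
    """Return, per flag, the fraction of items that carry it."""
    counts = Counter(
        flag for item in items for flag in set(item.get("confidenceFlags", []))
    )
    total = len(items)
    return {flag: (counts[flag] / total if total else 0.0) for flag in ALL_FLAGS}


if __name__ == "__main__":
    items = [
        {"confidenceFlags": ["company_name_found", "email_found"]},
        {"confidenceFlags": ["company_name_found"]},
    ]
    for flag, share in flag_coverage(items).items():
        print(f"{flag}: {share:.0%}")
```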

Integrations

Use the output with:

  • HubSpot or Salesforce enrichment workflows
  • Clay tables and lead-routing systems
  • Google Sheets lead lists
  • Compliance review queues
  • Internal data-quality checks

API usage: Node.js

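import { ApifyClient } from 'apify-client';

// Read the API token from the environment instead of hard-coding it.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/german-imprint-leads-scraper').call({
    startUrls: [{ url: 'https://www.rewe.de' }],
    maxPagesPerDomain: 8,
});

// The results are stored in the run's default dataset.
console.log(run.defaultDatasetId);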

API usage: Python

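from apify_client import ApifyClient

# Replace the placeholder with your Apify API token.
client = ApifyClient('YOUR_APIFY_TOKEN')

# Start the actor and wait for the run to finish.
run = client.actor('automation-lab/german-imprint-leads-scraper').call(run_input={
    'startUrls': [{'url': 'https://www.rewe.de'}],
    'maxPagesPerDomain': 8,
})

# The results are stored in the run's default dataset.
print(run['defaultDatasetId'])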

API usage: cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~german-imprint-leads-scraper/runs?token=YOUR_APIFY_TOKEN' \
    -H 'Content-Type: application/json' \
    -d '{"startUrls":[{"url":"https://www.rewe.de"}],"maxPagesPerDomain":8}'

MCP usage

Connect Apify MCP with this actor enabled:

https://mcp.apify.com/?tools=automation-lab/german-imprint-leads-scraper

Claude Code setup:

claude mcp add apify-german-imprint "https://mcp.apify.com/?tools=automation-lab/german-imprint-leads-scraper"

Claude Desktop JSON config:

{
    "mcpServers": {
        "apify-german-imprint": {
            "url": "https://mcp.apify.com/?tools=automation-lab/german-imprint-leads-scraper"
        }
    }
}

Example prompts:

  • "Extract Impressum contacts for these 20 German domains."
  • "Find VAT IDs and managing directors for this German prospect list."
  • "Check which domains have no public legal contact details."

Legality

This actor extracts publicly available business information from websites you provide. You are responsible for using the data lawfully, respecting website terms, and complying with GDPR, ePrivacy, and other applicable rules.

FAQ

Why did one domain return not_found?

The site may use a non-standard legal page URL, block automated HTTP clients, or render legal data only in JavaScript. Try submitting the exact Impressum URL or increasing maxPagesPerDomain.

Does this actor validate email deliverability?

No. It extracts public emails from pages. Use a dedicated email validation service if you need deliverability checks.

Troubleshooting

If a site returns no data, try raising maxPagesPerDomain or submitting the exact Impressum URL.

If many requests fail, enable Apify Proxy or retry later. Some sites block automated traffic intermittently.
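
Both suggestions can be combined into a second run that covers only the domains that failed. The sketch below builds such a retry input from the first run's dataset items; `build_retry_input` is a hypothetical helper, and the page limit of 16 is an arbitrary example value, not a recommended setting.

```python
RETRY_STATUSES = {"not_found", "error"}


def build_retry_input(items: list[dict], max_pages: int = 16,
                      use_proxy: bool = True) -> dict:
    """Build an actor input that re-runs only the unsuccessful domains."""
    retry_urls = [
        item["inputUrl"] for item in items if item.get("status") in RETRY_STATUSES
    ]
    return {
        "startUrls": [{"url": url} for url in retry_urls],
        # Higher page limit for sites with unusual navigation (example value).
        "maxPagesPerDomain": max_pages,
        "includeSubpages": True,
        # Proxy enabled for sites that block direct automated traffic.
        "proxyConfiguration": {"useApifyProxy": use_proxy},
    }


if __name__ == "__main__":
    first_run = [
        {"inputUrl": "https://www.rewe.de", "status": "found"},
        {"inputUrl": "https://example.de", "status": "not_found"},
    ]
    print(build_retry_input(first_run))
```

For domains where you already know the legal page, replace the entry in startUrls with the exact Impressum URL before re-running.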

Limitations

The actor uses HTTP and Cheerio for speed and low cost. Some JavaScript-only pages may expose fewer fields than a browser-based scraper.

Privacy notes

The actor does not log in, bypass paywalls, or access private systems. It only reads public pages reachable from submitted domains.

Changelog

Initial version extracts German Impressum legal-contact fields from submitted domains and URLs.

Support

If you need fields tuned for a specific German industry or CMS pattern, open an Apify issue with sample URLs and expected output.

Field reference

pagesChecked lists every URL requested for the domain. sourceSnippets contains nearby text around key legal labels so users can audit extraction quality.

Performance

HTTP-only crawling keeps runs lightweight. The default platform memory is 512 MB, and the number of pages requested per domain is capped by maxPagesPerDomain.

Data quality workflow

Use confidenceFlags to route complete leads into your CRM and send lower-confidence rows to manual review.
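
One way to implement this routing is to define which flags a record must carry to count as complete. The sketch below splits items into a CRM list and a review list; the required set shown is only an example, and `route_leads` is a hypothetical helper that you would adapt to your own completeness rules.

```python
# Example completeness rule; adjust to the fields your CRM actually needs.
REQUIRED_FLAGS = frozenset({"company_name_found", "email_found", "decision_maker_found"})


def route_leads(items: list[dict],
                required: frozenset = REQUIRED_FLAGS) -> tuple[list[dict], list[dict]]:
    """Split items into (crm_ready, needs_review) based on confidence flags."""
    crm_ready, needs_review = [], []
    for item in items:
        flags = set(item.get("confidenceFlags", []))
        # A lead is complete only if the page was found and all required flags are set.
        complete = item.get("status") == "found" and required <= flags
        (crm_ready if complete else needs_review).append(item)
    return crm_ready, needs_review


if __name__ == "__main__":
    items = [
        {"inputUrl": "https://a.de", "status": "found",
         "confidenceFlags": ["company_name_found", "email_found", "decision_maker_found"]},
        {"inputUrl": "https://b.de", "status": "found",
         "confidenceFlags": ["company_name_found"]},
    ]
    ready, review = route_leads(items)
    print(len(ready), "CRM-ready,", len(review), "for manual review")
```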