German Imprint Contact Scraper

Pricing

Pay per usage

Developed by

Dominic M. Quaiser

Maintained by Community

An Actor that automatically locates and scrapes key contact details from German website imprint pages (Impressum). It extracts information such as company name, address, phone numbers, emails, and decision-maker details.

0.0 (0)

7

Total users

120

Monthly users

61

Runs succeeded

>99%

Issues response

20 hours

Last modified

6 days ago

Imprint Scraper did not time out

Closed

gallant_zone opened this issue
10 days ago

The scraper ran for 7:41 min and kept running without ever returning a timeout notification to my HTTP request. The Actor log states otherwise, but my integration's log clearly shows infinite loops.

dominic-quaiser

Thank you for bringing this to my attention. I believe I’ve identified the issue. There was a mistake in how the actor handles skipping a page when the content can't be properly processed or when it gets stuck. I've updated the logic so that the actor will now skip the website entirely if it cannot be processed.
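The skip behavior described above can be pictured with a minimal sketch. This is not the Actor's actual code: `process_all`, `process_site`, and `demo_process` are hypothetical names, and the assumption is simply that any processing failure causes the whole website to be skipped so the run can continue.

```python
from typing import Callable, Iterable


def process_all(
    urls: Iterable[str],
    process_site: Callable[[str], dict],
) -> tuple[list[dict], list[str]]:
    """Process each site; on any failure, skip the whole site and move on."""
    results: list[dict] = []
    skipped: list[str] = []
    for url in urls:
        try:
            results.append(process_site(url))
        except Exception:
            # Assumed policy: give up on this website entirely instead of
            # retrying the same page, so one bad site cannot stall the run.
            skipped.append(url)
    return results, skipped


def demo_process(url: str) -> dict:
    # Hypothetical stand-in for the real imprint extraction.
    if "broken" in url:
        raise ValueError("content could not be processed")
    return {"url": url}
```

Calling `process_all(["https://a.example", "https://broken.example"], demo_process)` would return one result and one skipped URL instead of stopping at the failing site.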

effervescent_punch

6 days ago

LOG:

2025-07-17T09:07:00.947Z ACTOR: Pulling Docker image of build gcelmNeFtBBzBqBl8 from registry.
2025-07-17T09:07:01.575Z ACTOR: Creating Docker container.
2025-07-17T09:07:01.700Z ACTOR: Starting Docker container.
2025-07-17T09:07:03.629Z Actor is running on the Apify platform, disable_browser_sandbox was changed to True.
2025-07-17T09:07:03.861Z [apify] INFO Initializing Actor...
2025-07-17T09:07:03.864Z [apify] INFO System info ({"apify_sdk_version": "2.6.0", "apify_client_version": "1.10.0", "crawlee_version": "0.6.10", "python_version": "3.13.4", "os": "linux"})
2025-07-17T09:07:04.211Z [apify] INFO Enqueuing https://gustone.de
2025-07-17T09:07:04.344Z [apify] INFO Processing https://gustone.de
2025-07-17T09:07:05.665Z [apify] INFO Found Imprint link: https://gustone.de/Impressum:_:4.html
2025-07-17T09:07:05.838Z [apify] INFO Starting synchronous HTML processing for https://gustone.de
2025-07-17T09:07:05.866Z [apify] INFO Cleaning Imprint HTML for https://gustone.de
2025-07-17T09:07:26.291Z ACTOR: The Actor run was aborted by the user.

This was the URL that caused the error.

--
Christian Mattukat
Amazon Specialist
christian.mattukat@blankspace.eu · blankspace.eu
Tel: 030 2359662 84

Blankspace Commerce GmbH
Friedrichstr. 171, 10117 Berlin · Germany
Managing Directors: Julian Wächter, Khanh Tuong
Commercial Register: № HRB 226261 B, Amtsgericht Berlin-Charlottenburg

dominic-quaiser

The bug is now fixed. If an HTML page takes too long to process, the actor skips that website, or ends the run if it was the last one. Most other pages should be processed within seconds.

I realized too late that I could view the run that had the error.
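One way to put a time budget on synchronous HTML processing is sketched below. It is an illustration only, not the fix that was shipped: `run_with_budget`, `clean_quickly`, and `clean_slowly` are hypothetical names, and the budget values are arbitrary. The sketch runs the work in a worker thread and stops waiting once the budget is exceeded, so the caller can skip the website. Note that a Python thread cannot be forcibly stopped, so truly stuck work keeps running in the background; terminating it for real would need a separate process.

```python
import concurrent.futures
import time
from typing import Callable, Optional


def run_with_budget(
    func: Callable[[str], str],
    html: str,
    budget_s: float,
) -> Optional[str]:
    """Return func(html), or None if it does not finish within budget_s seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, html)
    try:
        return future.result(timeout=budget_s)
    except concurrent.futures.TimeoutError:
        # Budget exceeded: the caller treats None as "skip this website".
        return None
    finally:
        # Do not block on the worker; it may still be busy.
        executor.shutdown(wait=False)


def clean_quickly(html: str) -> str:
    # Hypothetical stand-in for a page that processes within the budget.
    return html.strip()


def clean_slowly(html: str) -> str:
    # Hypothetical stand-in for a page whose processing takes too long.
    time.sleep(0.5)
    return html.strip()
```

With these stand-ins, `run_with_budget(clean_quickly, " <p>Impressum</p> ", 1.0)` returns the cleaned string, while `run_with_budget(clean_slowly, "<p>Impressum</p>", 0.05)` returns `None` and the site can be skipped.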

effervescent_punch

5 days ago

Awesome! Thanks
