- Standby mode dataset limit check: Fixed an AttributeError in _check_dataset_limit() by using the correct Apify SDK method get_metadata() instead of the non-existent get_info(). This resolves the 500 Internal Server Error returned when checking dataset quota limits in standby mode.
- Proxy connection reset handling: Added ProxyError to the retry exception handling in the search loop. Transient proxy connection resets (595 ECONNRESET) now trigger automatic retries with exponential backoff instead of immediately crashing the actor.
- Improved retry logic for proxy-based requests: Enhanced the search retry mechanism to request a fresh proxy URL (new residential IP) on each retry attempt instead of reusing the same blocked IP. Increased retry attempts from 3 to 5 and implemented exponential backoff with jitter (4-35 seconds) to better handle server disconnections and rate limiting.
- Proxy configuration support: Added support for Apify proxy configuration to hide request origins and improve reliability for high-frequency scraping. Configurable via the proxyConfiguration input parameter; the default is Residential proxies from Germany.
- Comprehensive legal form support: Enhanced parser to extract decision-makers from all major German legal forms:
  - e.K. (Einzelkaufmann): Inhaber (owners)
  - AG (Aktiengesellschaft): Vorstand (board members), Vorstandsvorsitzender (chairmen)
  - SE (Europäische Gesellschaft): Geschäftsführender Direktor (managing directors)
  - GmbH/OHG: Gesellschafter (shareholders/partners)
  - Plus existing support for GmbH Geschäftsführer, KG Komplementäre/Kommanditisten
- Unified decision-makers field: All persons with representation authority are now consolidated under vertretungsberechtigte with German role names for simplified lead generation.
- Role name lookup: Integration with "GDS.Rollenbezeichnung" codelist for accurate German role designations.
- Simplified output structure: Consolidated all decision-makers under a single vertretungsberechtigte field instead of separate role-specific fields (geschaeftsfuehrer, inhaber, vorstand, etc.).
- Numeric register numbers: laufende_nummer now contains only the numeric portion (e.g., "8438" instead of "HRA 8438 P") for better data processing.
- Register information extraction: Enhanced parser to check multiple XML sources (registereintragung, aktenzeichen) with intelligent fallback logic.
- e.K. company support: Fixed missing register information for sole proprietorships and other HRA entities.
- Data quality: More accurate extraction of company decision-makers across all legal forms.
- Input validation for "mindestens ein Schlagwort enthalten." mode: Added validation to prevent submission errors when using the "at least one keyword" search option without the required additional filters. When schlagwoerter_suchoptionen = "mindestens ein Schlagwort enthalten.", the scraper now checks that at least one of registerart, registergericht, or registernummer is provided, as required by the Handelsregister website.
- Improved error messages: Users now receive a clear 400 Bad Request error explaining the missing required fields instead of encountering a cryptic German error message from the Handelsregister website.
- Updated documentation: Added warnings in both the input schema description and README to inform users about the additional filter requirement for the "mindestens ein Schlagwort enthalten." search mode.
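
The validation rule for the "mindestens ein Schlagwort enthalten." mode can be illustrated with a minimal sketch. This is not the actor's actual code: the function name `validate_search_input` and the exception class `InputValidationError` are hypothetical, and only the field names and the mode string are taken from the entries above.

```python
# Hypothetical sketch of the "at least one keyword" validation rule.
# Field names and the mode string come from the changelog; everything else is assumed.
AT_LEAST_ONE_KEYWORD = "mindestens ein Schlagwort enthalten."
REQUIRED_FILTERS = ("registerart", "registergericht", "registernummer")


class InputValidationError(ValueError):
    """Hypothetical error type that an HTTP layer could map to 400 Bad Request."""

    status_code = 400


def validate_search_input(params: dict) -> None:
    """Raise InputValidationError if the mode is used without an extra filter."""
    if params.get("schlagwoerter_suchoptionen") != AT_LEAST_ONE_KEYWORD:
        return  # other search modes have no additional requirement

    # Treat missing, None, and whitespace-only values as "not provided".
    provided = [
        name for name in REQUIRED_FILTERS
        if str(params.get(name) or "").strip()
    ]
    if not provided:
        raise InputValidationError(
            f'Search mode "{AT_LEAST_ONE_KEYWORD}" requires at least one of: '
            + ", ".join(REQUIRED_FILTERS)
        )
```

Rejecting the request before it reaches the Handelsregister website is what allows the caller to receive a clear English message rather than the site's own German error text.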
The Handelsregister Scraper actor has been released as a stable version following a successful pre-release testing period. This actor provides reliable, real-time access to German Commercial Register (Handelsregister) data through an HTTP API running in standby mode.
- Real-time API with standby mode: Sub-second response times with always-on availability and automatic scaling.
- Flexible search capabilities: Search by company name/keywords or direct register number lookup with configurable search strategies.
- Phonetic search support: Optional fuzzy matching for similar-sounding terms (e.g., "Meyer" matches "Meier", "Mayer").
- Comprehensive data extraction: Complete company information including legal form, address, business purpose, capital structure, managing directors, authorized officers, partners, and limited partners with liability contributions.
- Raw XML access: Optional storage of original XJustiz XML files in the Apify Key-Value Store for advanced processing.
- Automatic data storage: All results automatically saved to Apify Dataset for easy retrieval and analysis.
- Intelligent error handling: Automatic court name validation with fuzzy matching and helpful suggestions.
- Professional HTTP status codes: Proper error responses (400, 401, 402, 404, 500) for integration reliability.
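
The proxy retry behaviour described in the entries above (a fresh proxy URL on each attempt, up to five attempts, exponential backoff with jitter between 4 and 35 seconds) could look roughly like the sketch below. It is an illustration under assumptions, not the actor's implementation: `ProxyError` here is a stand-in for the HTTP client's exception class, `do_search` and `new_proxy_url` are hypothetical callables, the exact jitter formula is a guess, and "5" is read as total attempts.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class ProxyError(Exception):
    """Stand-in for the HTTP client's proxy error class (assumption)."""


def backoff_delay(attempt: int, low: float = 4.0, high: float = 35.0) -> float:
    """Exponential backoff with jitter, kept inside [low, high] seconds."""
    ceiling = min(high, low * 2 ** attempt)  # 4, 8, 16, 32, 35, ...
    return random.uniform(low, ceiling)


def search_with_retries(
    do_search: Callable[[str], T],
    new_proxy_url: Callable[[], str],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run do_search, requesting a new proxy URL before every attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    last_error: Optional[ProxyError] = None
    for attempt in range(max_attempts):
        proxy_url = new_proxy_url()  # never reuse a possibly blocked IP
        try:
            return do_search(proxy_url)
        except ProxyError as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))  # no wait after the final attempt
    assert last_error is not None
    raise last_error
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller would normally leave it at the default.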