Pricing

$0.40 / 1,000 results

Gelbe Seiten (German Yellow Pages) Scraper

Developed by

Azquaier

Scrape German business listings from Gelbe Seiten with flexible detail levels. This Apify Actor supports fast, basic, and deep search modes, rate limiting, proxy rotation, and index control. Ideal for lead gen, SEO, and market research. Outputs structured data to Apify datasets.

Runs succeeded

>99%

Last modified

15 days ago

A Python-based Apify Actor designed to scrape business listings from Gelbe Seiten (www.gelbeseiten.de). It offers three distinct modes for varying levels of detail extraction. Features include rate limiting, proxy configuration, and flexible index ranges for controlling pagination or resuming interrupted runs.

💡 Features

Targeted Search: Specify the service or business type (search_what) and the geographic area (search_where). Use "bundesweit" for nationwide searches.
Three Search Modes (search_mode):
- fast_search: Quickly extracts summary information directly from search result pages without visiting detail pages. Includes name, address snippet, phone, rating, primary branch, and encoded contact links (email/website). Ideal for rapid list building.
- basic_search: Visits each business profile page once to fetch essential details like full address, email, website, description, industry, phone number, and social media links. Does not include summary data like ratings from the search results page.
- deep_search: Combines the summary data from fast_search with a comprehensive detail page visit to extract all available fields, including opening hours, services, detailed company information, training opportunities, payment methods, social media links, fax number, Google Maps link, FAQs, etc. This is the most thorough mode.
Index Control: Use start_index and end_index (1-based) to define a specific range of listings to process, useful for resuming runs or targeted scraping.
Unlimited Mode: Set max_businesses to 0 to scrape all available results matching the search criteria.
Rate Limiting: Configurable requests_per_second throttle applied to API calls (all modes) and detail page requests (basic_search and deep_search) to manage load and reduce the risk of blocking.
Proxy Support: Leverages Apify's built-in proxy integration (proxyConfiguration) for reliable IP rotation during scraping, especially crucial for detail page visits.
Structured Output: Data is saved to the Apify dataset. Each record includes a scraped_at UTC timestamp and its index (overall position in the search results).
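The rate limiting described above can be pictured with a minimal asyncio sketch. This is not the Actor's own implementation: the class layout, the `jitter` parameter, and the `demo` coroutine are assumptions made for illustration. It spaces calls evenly at `requests_per_second` and optionally adds a small random delay.

```python
import asyncio
import random
import time


class RateLimiter:
    """Illustrative throttle: at most `requests_per_second` acquisitions per second."""

    def __init__(self, requests_per_second: float, jitter: float = 0.0) -> None:
        if requests_per_second <= 0:
            raise ValueError("requests_per_second must be positive")
        self._interval = 1.0 / requests_per_second
        self._jitter = jitter  # hypothetical: maximum extra random delay in seconds
        self._next_slot = 0.0
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        """Reserve the next free time slot, then sleep until it arrives."""
        async with self._lock:
            now = time.monotonic()
            slot = max(now, self._next_slot)
            self._next_slot = slot + self._interval
        delay = slot - now + random.uniform(0.0, self._jitter)
        if delay > 0:
            await asyncio.sleep(delay)


async def demo() -> float:
    """Acquire six slots at 50 requests per second and return the elapsed time."""
    limiter = RateLimiter(requests_per_second=50)
    start = time.monotonic()
    for _ in range(6):
        await limiter.acquire()  # a real scraper would send its request after this
    return time.monotonic() - start
```

Reserving the slot inside the lock and sleeping outside it keeps concurrent tasks from queuing on the lock while still spacing their requests.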

📥 Input Parameters

Configure the actor's behavior using these fields in the Apify Console Input tab or via API:

| Field | Type | Description | Default | Required |
| --- | --- | --- | --- | --- |
| `search_what` | String | The business type, profession, or service to search for (e.g., "Restaurant", "Arzt", "Hotel", "Kreditvermittlung"). | `"containerbau"` | Yes |
| `search_where` | String | The geographic location (e.g., city name like "Berlin", region, or `"bundesweit"` for nationwide). | `"bundesweit"` | Yes |
| `search_mode` | String | Extraction detail level: `fast_search` (summary only), `basic_search` (essential details from profile page), `deep_search` (summary + all profile details). | `"basic_search"` | No |
| `max_businesses` | Integer | Maximum number of listings to save. Set `0` for unlimited (scrapes all found results). | `0` | No |
| `start_index` | Integer | 1-based index of the first listing to save. Useful for resuming runs or skipping initial results. | `1` | No |
| `end_index` | Integer | 1-based index of the last listing to save (inclusive). Set `0` to ignore this limit and rely solely on `max_businesses`. | `0` | No |
| `requests_per_second` | Integer | Max requests per second. Applies to API calls (all modes) and detail page fetches (`basic`/`deep` modes). Lower values (e.g., 2-5) are safer, higher values (e.g., 10+) faster. | `12` | No |
| `proxyConfiguration` | Object | Apify proxy settings (Automatic recommended) or custom proxy configuration. | `{}` | No |
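How `start_index`, `end_index`, and `max_businesses` combine can be sketched as follows. The function name is hypothetical, and the rule that `max_businesses` counts listings from `start_index` onward is an assumption: the table does not spell out that interaction.

```python
def select_indexes(total_results, start_index=1, end_index=0, max_businesses=0):
    """Return the 1-based listing indexes that would be saved.

    Assumption: max_businesses caps the number of saved listings counted
    from start_index; a value of 0 disables the respective limit.
    """
    if start_index < 1:
        raise ValueError("start_index is 1-based and must be >= 1")
    last = total_results
    if end_index > 0:
        last = min(last, end_index)  # inclusive upper bound
    if max_businesses > 0:
        last = min(last, start_index + max_businesses - 1)
    return range(start_index, last + 1)  # empty if start_index > last


# Resuming at listing 51 with a cap of 200 covers listings 51 through 250.
window = select_indexes(total_results=1000, start_index=51, max_businesses=200)
```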

🔹 Example Input

{
  "search_what": "Hotels",
  "search_where": "Hamburg",
  "search_mode": "deep_search",
  "max_businesses": 200,
  "start_index": 51,
  "requests_per_second": 8,
  "proxyConfiguration": { "useApifyProxy": true, "apifyProxyGroups": ["RESIDENTIAL"] }
}
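To submit the same input programmatically, a request against the Apify API can be prepared as in the sketch below. The Actor ID and token are placeholders, not this Actor's real identifier, and the helper name is hypothetical. The sketch only builds the request; sending it is left to the caller.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST that runs an Actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(run_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


run_input = {
    "search_what": "Hotels",
    "search_where": "Hamburg",
    "search_mode": "deep_search",
    "max_businesses": 200,
    "start_index": 51,
    "requests_per_second": 8,
    "proxyConfiguration": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
}

# Placeholders: substitute the Actor's real ID and your own API token.
request = build_run_request("your-username~your-actor-name", "YOUR_APIFY_TOKEN", run_input)

# To execute the run:
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

The synchronous endpoint is intended for short runs; for larger jobs, starting the run asynchronously and reading the dataset afterwards is likely the safer choice.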

📤 Output Data Structure

Each record in the dataset is a JSON object. The exact fields depend on the selected search_mode.

🔹 `fast_search` Example Output

{
  "index": 1,
  "url": "https://www.gelbeseiten.de/gsbiz/abc123xyz",
  "name": "SchnellTest GmbH",
  "bewertung": 4.5,
  "bewertungen": 8,
  "besteBranche": "Testdienste",
  "telefonnummer": "040 1234567",
  "emaillink": "info@schnelltest.de", // Decoded from Base64
  "base64_emaillink": "aW5mb0BzY2huZWxsdGVzdC5kZQ==", // Raw Base64
  "webseitelink": "https://schnelltest.de", // Decoded from Base64
  "base64_webseitelink": "aHR0cHM6Ly9zY2huZWxsdGVzdC5kZQ==", // Raw Base64
  "adresse_from_search": "Teststraße 1, 20095 Hamburg Neustadt", // Address snippet from search results
  "scraped_at": "2025-04-30T09:30:00.123Z"
}

🔹 `basic_search` Example Output

{
  "index": 51, // Example if start_index was 51
  "url": "https://www.gelbeseiten.de/gsbiz/abc123xyz",
  "name": "Hotel Hanseatic", // Extracted from detail page
  "email": "info@hotel-hanseatic.de", // Extracted from detail page
  "website": "http://www.hotel-hanseatic.de", // Extracted from detail page
  "beschreibung": "Gemütliches Hotel im Herzen von St. Georg.", // Extracted from detail page
  "branche": "Hotels", // Extracted from detail page
  "social_media": { // Extracted from detail page
      "facebook": "https://facebook.com/hotelhanseatic"
   },
  "address": "Steindamm 50, 20099 Hamburg St. Georg", // Address from detail page
  "telefonnummer": "040 9876543", // Extracted from detail page
  "scraped_at": "2025-04-30T09:35:00.789Z"
}

Note: basic_search fetches data only from the detail page and does not include summary data from the search results.

🔹 `deep_search` Example Output

Combines fast_search summary data with all available detail page data.

{
  // --- Fields from fast_search (search results page) ---
  "index": 201,
  "url": "https://www.gelbeseiten.de/gsbiz/abc123xyz",
  "name": "Muster Restaurant",
  "bewertung": 4.8,
  "bewertungen": 55,
  "besteBranche": "Restaurants",
  "telefonnummer": "040 1122334",
  "emaillink": "reservierung@muster-restaurant.de",
  "base64_emaillink": "cmVzZXJ2aWVydW5nQG11c3Rlci1yZXN0YXVyYW50LmRl",
  "webseitelink": "https://www.muster-restaurant.de",
  "base64_webseitelink": "aHR0cHM6Ly93d3cubXVzdGVyLXJlc3RhdXJhbnQuZGU=",
  "adresse_from_search": "Musterweg 10, 20457 Hamburg Altstadt",
  "scraped_at": "2025-04-30T09:40:00.456Z",
  // --- Additional fields from deep_search (detail page) ---
  "email": "reservierung@muster-restaurant.de",
  "website": "https://www.muster-restaurant.de", // The same as webseitelink
  "beschreibung": "Moderne deutsche Küche mit saisonalen Zutaten.",
  "oeffnungszeiten": { // Example structure
    "Mo.": "Ruhetag",
    "Di.-Sa.": "18:00 - 23:00",
    "So.": "12:00 - 15:00"
  },
  "branche": "Restaurant; Deutsche Küche", // Can be more detailed than besteBranche
  "leistungsumfang": "Abendessen, Mittagstisch (So), Terrasse",
  "services": ["Restaurant", "Deutsche Küche", "Terrasse"],
  "unternehmensinformationen": { // Example structure
      "gründungsjahr": ["2010"],
      "parkplätze": ["vorhanden"]
   },
  "ausbildung": null, // Or text/list if available
  "zahlungsmittel": ["EC-Karte", "Kreditkarte", "Bar"],
  "social_media": {
      "instagram": "https://instagram.com/musterrestaurant"
   },
  "google_maps_url": "",
  "faxnummer": "040 1122335"
}

The fields present in each record depend on the selected search_mode. Full field reference:

| Key | Available in | Description |
| --- | --- | --- |
| `name` | fast/basic/deep | Company name from either the listing or the detail page. |
| `adresse_from_search` | fast/deep | Address snippet from the listing page. |
| `address` | basic/deep | Street address from the detail page. |
| `telefonnummer` | fast/basic/deep | Phone number from either the listing or the detail page. |
| `faxnummer` | deep | Fax number from the detail page. |
| `bewertung` | fast/deep | Average rating (numeric). |
| `bewertungen` | fast/deep | Number of reviews. |
| `besteBranche` | fast/deep | Primary branch/industry from the listing. |
| `branche` | basic/deep | Industry from the detail page. |
| `email` | basic/deep | Email provided on the detail page. |
| `emaillink` | fast/deep | Email link decoded from the listing's Base64 data attribute. |
| `base64_emaillink` | fast/deep | Raw Base64 email data attribute. |
| `webseitelink` | fast/deep | Decoded website link. |
| `base64_webseitelink` | fast/deep | Raw Base64 website data attribute. |
| `beschreibung` | basic/deep | Business description on the detail page. |
| `oeffnungszeiten` | deep | Opening hours by day. |
| `leistungsumfang` | deep | Scope of services. |
| `services` | deep | List of services offered. |
| `unternehmensinformationen` | deep | Additional company information. |
| `ausbildung` | deep | Education or training information. |
| `zahlungsmittel` | deep | Accepted payment methods. |
| `social_media` | basic/deep | Social media links. |
| `google_maps_url` | deep | Google Maps search URL. |
| `faq` | deep | List of FAQ Q&A pairs (if any). |

Note: Any field may be null if not present on the page.
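Because field names differ between modes and any field may be null, downstream code may want to normalize records first. The helper below is a hypothetical post-processing sketch, not part of the Actor. It prefers the detail page values and falls back to the decoded listing values.

```python
def best_contact(record: dict) -> dict:
    """Pick one email and one website per record, whichever mode produced it."""
    return {
        "name": record.get("name"),
        # Detail page fields (basic/deep) first, decoded listing links (fast/deep) second.
        "email": record.get("email") or record.get("emaillink"),
        "website": record.get("website") or record.get("webseitelink"),
        "phone": record.get("telefonnummer"),
    }


fast_record = {
    "name": "SchnellTest GmbH",
    "telefonnummer": "040 1234567",
    "emaillink": "info@schnelltest.de",
    "webseitelink": "https://schnelltest.de",
}
contact = best_contact(fast_record)
```

Missing or null fields come back as `None`, so the result has the same keys for every mode.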

⚙️ Usage

Configure inputs in the "Input" tab (set search_what, search_where, etc.).
Choose a proxy mode. Automatic Apify Proxy is recommended for reliability.
Click Start.
Monitor progress in the Log tab.
Access results under Storage → Dataset.

🎯 Use Cases

Lead generation and contact harvesting.
Market research and competitor analysis.
Local SEO and business directory creation.
Data enrichment pipelines on Apify.

💲 Pricing

1,000 results cost $0.40 plus platform usage. Approximate platform usage and run time per 1,000 listings:
- 1000 listings, fast_search: ≈ $0.01, ~1.20 min run.
- 1000 listings, basic_search: ≈ $0.04, ~11.00 min run.
- 1000 listings, deep_search: ≈ $0.05, ~12.00 min run.
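As a rough worked example, the figures above can be turned into a cost estimate. The sketch assumes that both the result price and the platform usage scale linearly with the number of listings, which the figures suggest but do not guarantee. The function name is hypothetical.

```python
PRICE_PER_1000_RESULTS = 0.40  # USD, from the pricing above

# Approximate platform usage in USD per 1,000 listings, from the list above.
PLATFORM_USAGE_PER_1000 = {
    "fast_search": 0.01,
    "basic_search": 0.04,
    "deep_search": 0.05,
}


def estimate_cost(results: int, search_mode: str) -> float:
    """Estimate the total cost in USD, assuming linear scaling."""
    per_1000 = PRICE_PER_1000_RESULTS + PLATFORM_USAGE_PER_1000[search_mode]
    return round(results / 1000 * per_1000, 2)


# 5,000 listings in deep_search: 5 * (0.40 + 0.05) = 2.25 USD
deep_cost = estimate_cost(5000, "deep_search")
```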

🔗 Integrations

Scheduler: automate daily/weekly runs.
Webhooks: trigger downstream workflows on completion.
API: programmatic control via Apify API.
Composer: chain with other Actors (e.g., cleaning, enrichment).

🧰 Technical Notes

Async HTTP via httpx, HTML parsing via BeautifulSoup4 + lxml.
Custom RateLimiter for throttling API & detail requests with randomized delays to reduce detectability.
The scraper deliberately bypasses robots.txt directives to ensure complete data retrieval. Use responsibly.
Gelbe Seiten Base64-encodes the email and website links in its listings. The Actor outputs both the raw and the decoded value for each.
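The Base64 step can be reproduced with Python's standard library. The sketch below is an assumption about how such decoding might look, not the Actor's actual code, and the function name is hypothetical. It uses the sample values from the `fast_search` example output.

```python
import base64
from typing import Optional


def decode_contact_link(encoded: Optional[str]) -> Optional[str]:
    """Decode a Base64 contact link; return None for empty or invalid input."""
    if not encoded:
        return None
    try:
        return base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed Base64 (binascii.Error) and non-UTF-8 payloads.
        return None


email = decode_contact_link("aW5mb0BzY2huZWxsdGVzdC5kZQ==")
website = decode_contact_link("aHR0cHM6Ly9zY2huZWxsdGVzdC5kZQ==")
```

Returning `None` instead of raising keeps a single malformed attribute from aborting a whole run, which matches the note that any field may be null.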

🛠️ Maintainer

Author: Azquaier
Contact: 📧 mail@azquaier.xyz
Website: 🌍 azquaier.xyz
