Michelin Guide Restaurant Scraper

Pricing

$4.00 / 1,000 restaurants

This Apify actor scrapes restaurant data from the Michelin Guide through the site's Algolia search API. It extracts the API configuration automatically and retrieves comprehensive restaurant information.

Developer: BML Ventures (maintained by community)

Extract comprehensive restaurant data from the Michelin Guide across 47+ countries worldwide. Get detailed information about Michelin-starred restaurants, Bib Gourmand establishments, and other recommended dining venues.

🌟 What You Get

This scraper extracts detailed information for each restaurant including:

  • Restaurant Details: Name, full description, chef information
  • Location Data: Complete address, city, country, geographic coordinates
  • Michelin Recognition: Star ratings (1-3 stars), Bib Gourmand, Green Star status
  • Cuisine Information: Cuisine types and specialties
  • Pricing: Price category indicators
  • Contact Information: Phone numbers, website links, booking links
  • Facilities: Features, accepted credit cards, opening hours
  • Direct Links: URL to the restaurant's Michelin Guide page

🌍 Global Coverage

Scrape restaurants from 47+ countries including:

  • Europe: France, Italy, Spain, Germany, UK, Netherlands, Belgium, Switzerland, Austria, Portugal, Sweden, Norway, Denmark, Finland, Iceland, Poland, Czech Republic, Greece, Croatia, Serbia, Hungary, Slovenia, Estonia, Latvia, Lithuania, Luxembourg, Malta, and more
  • Asia: Japan, China, Hong Kong, Singapore, Thailand, South Korea, Taiwan, Vietnam, Malaysia, Macau, Philippines
  • Americas: USA, Canada, Mexico, Brazil, Argentina
  • Middle East: UAE, Qatar, Turkey, Saudi Arabia

Or scrape all countries at once with a single run!

βš™οΈ Key Features

  • Fast & Efficient: Scrapes up to 2 restaurants per second with detail page information
  • Reliable: Built-in retry logic and timeout handling for stable runs
  • Flexible: Choose specific countries or get global data
  • Complete Data: Includes full detail page information by default (detail scraping can be skipped for faster runs)
  • Smart Stopping: Automatically detects when all data has been scraped
  • State Persistence: Can resume from interruptions
  • Timeout Protection: Multiple layers of timeout handling to prevent hung runs

📊 Use Cases

Perfect for:

  • Restaurant Discovery Platforms: Build comprehensive dining databases
  • Food & Travel Apps: Enrich your app with Michelin Guide data
  • Market Research: Analyze culinary trends and Michelin star distributions
  • Travel Planning: Create curated lists of top restaurants by region
  • Data Analysis: Study pricing, cuisine types, and geographic patterns
  • Content Creation: Generate content about fine dining destinations
  • Business Intelligence: Track restaurant industry trends and prestigious establishments

🚀 How to Use

Basic Usage

  1. Select Your Coverage: Choose a specific country or select "All Countries"
  2. Configure Pages (Optional): Set start/end pages if you want partial data
  3. Adjust Performance (Optional): Modify concurrency and delays as needed
  4. Run: Start the run. The scraper collects data until all pages are scraped or the run timeout (1 hour by default) approaches

Input Parameters

  • Country (default: "All Countries"): Select a specific country or scrape all countries

    • Supports 47+ countries across Europe, Asia, the Americas, and the Middle East
  • Start Page (default: 1): The page number to start scraping from

    • Useful for resuming interrupted scrapes or scraping specific sections
  • End Page (optional): The page number to stop at

    • Leave empty to scrape all available pages
    • Each page contains approximately 48 restaurants
  • Max Concurrency (default: 3): Number of concurrent detail page requests

    • Higher values = faster scraping but more resource intensive
    • Recommended: 3-5 for optimal balance
  • Delay Between Pages (default: 1 second): Delay between scraping each page

    • Helps avoid overwhelming the server
    • Can be reduced to 0 for faster scraping
  • Skip Detail Scraping (default: false): Skip fetching restaurant detail pages

    • Enable for faster scraping with less comprehensive data
    • Disable (default) for complete restaurant information
  • Output Format (default: "Full"): Choose between full or simplified output

    • Full: All available fields
    • Simplified: Key fields only (name, address, phone, description, etc.)
  • Debug Mode (default: false): Enable verbose logging for troubleshooting
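
As a rough illustration of how these parameters fit together, the sketch below builds and validates a run input as a plain dictionary using the documented defaults. The key names (`country`, `startPage`, `maxConcurrency`, and so on) and the `build_run_input` helper are assumptions made for illustration; the actor's input form defines the real field names.

```python
from typing import Any, Optional


def build_run_input(
    country: str = "All Countries",
    start_page: int = 1,
    end_page: Optional[int] = None,
    max_concurrency: int = 3,
    delay_between_pages: float = 1.0,
    skip_detail_scraping: bool = False,
    output_format: str = "Full",
    debug_mode: bool = False,
) -> dict[str, Any]:
    """Build a run input from the documented defaults.

    Key names are hypothetical; check the actor's input schema for the real ones.
    """
    if start_page < 1:
        raise ValueError("start_page must be 1 or greater")
    if end_page is not None and end_page < start_page:
        raise ValueError("end_page must not be smaller than start_page")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be 1 or greater")
    if delay_between_pages < 0:
        raise ValueError("delay_between_pages must not be negative")
    if output_format not in ("Full", "Simplified"):
        raise ValueError("output_format must be 'Full' or 'Simplified'")

    run_input: dict[str, Any] = {
        "country": country,
        "startPage": start_page,
        "maxConcurrency": max_concurrency,
        "delayBetweenPages": delay_between_pages,
        "skipDetailScraping": skip_detail_scraping,
        "outputFormat": output_format,
        "debugMode": debug_mode,
    }
    # End Page is optional: leaving it out means "scrape all available pages".
    if end_page is not None:
        run_input["endPage"] = end_page
    return run_input


# Pages 1-3 of a single country, roughly 3 * 48 = 144 restaurants.
austria_input = build_run_input(country="Austria", end_page=3)
```

Validating the input before starting a run catches mistakes such as an end page smaller than the start page without spending any run time.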

⏱️ Performance

  • Speed: ~1-2 restaurants per second with full detail scraping
  • Timeout: 1 hour default (configurable in run options)
  • Memory: 1 GB default (sufficient for most runs)
  • Data Volume: Can scrape thousands of restaurants in a single run
  • Typical Run Times:
    • Single country (100-200 restaurants): 2-5 minutes
    • Multiple countries (1000+ restaurants): 15-30 minutes
    • All countries (10,000+ restaurants): 45-60 minutes
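
These figures can be sanity-checked with simple arithmetic. The sketch below estimates the page count and run time for a given number of restaurants, using the page size of about 48 and the throughput of 1-2 restaurants per second stated in this document. The function names are illustrative, and the estimate assumes a steady rate, which real runs will not have.

```python
import math

RESTAURANTS_PER_PAGE = 48  # approximate page size stated in this document


def pages_needed(restaurant_count: int) -> int:
    """Number of listing pages needed to cover restaurant_count restaurants."""
    return math.ceil(restaurant_count / RESTAURANTS_PER_PAGE)


def estimated_minutes(restaurant_count: int, per_second: float) -> float:
    """Run time in minutes at a steady rate of per_second restaurants/second."""
    return restaurant_count / per_second / 60


# A country with about 200 restaurants at the documented 1-2 restaurants/second.
pages = pages_needed(200)              # 5 pages: 4 full pages plus a partial one
slowest = estimated_minutes(200, 1.0)  # about 3.3 minutes
fastest = estimated_minutes(200, 2.0)  # about 1.7 minutes
```

By the same arithmetic, 10,000 restaurants at 2 per second comes to roughly 83 minutes, so the all-countries estimate presumably reflects a higher effective rate. For global scrapes it may be worth raising the run timeout in the run options.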

πŸ“ Output Format

Results are stored as structured records with clearly labeled fields. You can download them as JSON, CSV, Excel, or other formats directly from Apify.

Sample Output

{
  "name": "Le Bernardin",
  "description": "A temple to seafood in all its guises...",
  "address": "155 West 51st Street, New York, NY 10019",
  "phone": "+1 212-554-1515",
  "website": "https://le-bernardin.com",
  "city": {
    "name": "New York"
  },
  "country": {
    "name": "United States",
    "slug": "united-states"
  },
  "michelin_award": "3 Stars",
  "green_star": false,
  "cuisines": [
    {
      "name": "Seafood",
      "slug": "seafood"
    },
    {
      "name": "French",
      "slug": "french"
    }
  ],
  "price_category": "$$$",
  "chef": "Eric Ripert",
  "features": [
    "Great wine list",
    "Counter dining"
  ],
  "credit_cards": [
    "American Express",
    "Visa",
    "Mastercard"
  ],
  "opening_hours": "Monday-Thursday: 5:00 PM - 10:00 PM...",
  "booking_link": "https://...",
  "url": "https://guide.michelin.com/us/en/new-york-state/new-york/restaurant/le-bernardin",
  "_geoloc": {
    "lat": 40.761868,
    "lng": -73.981363
  }
}
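
A short sketch of how a record shaped like the sample above might be consumed. It flattens the nested `city`, `country`, `cuisines`, and `_geoloc` fields into a single-level row suitable for CSV export. The `flatten_record` helper and its output column names are illustrative, and it uses `.get()` throughout on the assumption that not every restaurant carries every field.

```python
from typing import Any


def flatten_record(record: dict[str, Any]) -> dict[str, Any]:
    """Flatten one restaurant record into a single-level row."""
    geoloc = record.get("_geoloc") or {}
    return {
        "name": record.get("name"),
        "city": (record.get("city") or {}).get("name"),
        "country": (record.get("country") or {}).get("name"),
        "award": record.get("michelin_award"),
        "green_star": bool(record.get("green_star")),
        # Join cuisine names into one cell, e.g. "Seafood, French".
        "cuisines": ", ".join(c["name"] for c in record.get("cuisines") or []),
        "price_category": record.get("price_category"),
        "lat": geoloc.get("lat"),
        "lng": geoloc.get("lng"),
    }


sample = {
    "name": "Le Bernardin",
    "city": {"name": "New York"},
    "country": {"name": "United States", "slug": "united-states"},
    "michelin_award": "3 Stars",
    "green_star": False,
    "cuisines": [
        {"name": "Seafood", "slug": "seafood"},
        {"name": "French", "slug": "french"},
    ],
    "price_category": "$$$",
    "_geoloc": {"lat": 40.761868, "lng": -73.981363},
}

row = flatten_record(sample)
```

The resulting `row` can be passed to `csv.DictWriter` or collected into a list for further analysis.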

💡 Tips

  • Start with a single country to test the output format and understand the data structure
  • Use the default settings for optimal balance of speed and reliability
  • For large global scrapes, consider splitting the work into several runs by region if you need results sooner
  • Monitor the logs to track scraping progress and identify any issues
  • The scraper automatically retries failed pages to ensure data completeness
  • Empty page handling: The scraper intelligently detects the end of results vs. temporary page load failures
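
The empty-page handling in the last tip could look roughly like the sketch below. It is a guess at the logic, not the actor's actual code: a page that comes back empty is fetched again, and only if both attempts are empty is it treated as the end of the results. The two-attempt count comes from the Error Recovery section; `fetch_page` is a hypothetical callable that returns the list of restaurants on a page.

```python
from typing import Callable

EMPTY_PAGE_ATTEMPTS = 2  # attempts before a page is treated as truly empty


def fetch_with_empty_retry(fetch_page: Callable[[int], list], page: int) -> list:
    """Return the page's restaurants, retrying if the page comes back empty."""
    for _ in range(EMPTY_PAGE_ATTEMPTS):
        hits = fetch_page(page)
        if hits:
            return hits
    return []  # empty on every attempt: assume the end of the results


def scrape_all(fetch_page: Callable[[int], list], start_page: int = 1) -> list:
    """Walk pages from start_page until a page is confirmed empty."""
    results: list = []
    page = start_page
    while True:
        hits = fetch_with_empty_retry(fetch_page, page)
        if not hits:
            break
        results.extend(hits)
        page += 1
    return results


# Stand-in for a real fetch: three pages of data, with page 2 failing once.
calls: dict[int, int] = {}


def flaky_fetch(page: int) -> list:
    calls[page] = calls.get(page, 0) + 1
    if page == 2 and calls[page] == 1:
        return []  # simulated temporary load failure
    if page <= 3:
        return [f"restaurant-{page}-{i}" for i in range(2)]
    return []


restaurants = scrape_all(flaky_fetch)
```

In this run the temporary failure on page 2 is absorbed by the retry, and scraping stops only after page 4 is empty twice in a row.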

🔧 Advanced Configuration

Resuming Interrupted Runs

The scraper automatically saves its state and can resume from where it left off if interrupted. The state includes:

  • Current page number
  • Total restaurants scraped
  • Timestamp of last update
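
A minimal sketch of what saving and restoring such a state could look like. It writes the three listed values to a local JSON file as a stand-in for whatever storage the actor really uses; the `ScrapeState` class, its field names, and the file location are all assumptions.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class ScrapeState:
    current_page: int   # next page to scrape
    total_scraped: int  # restaurants collected so far
    updated_at: str     # ISO 8601 timestamp of the last update


def save_state(path: Path, current_page: int, total_scraped: int) -> ScrapeState:
    """Write the current progress to disk and return the saved state."""
    state = ScrapeState(
        current_page, total_scraped, datetime.now(timezone.utc).isoformat()
    )
    path.write_text(json.dumps(asdict(state)), encoding="utf-8")
    return state


def load_state(path: Path) -> ScrapeState:
    """Return the saved state, or a fresh one if nothing was saved yet."""
    if not path.exists():
        return ScrapeState(current_page=1, total_scraped=0, updated_at="")
    return ScrapeState(**json.loads(path.read_text(encoding="utf-8")))


# Six pages of 48 restaurants are done; an interrupted run resumes at page 7.
state_file = Path(tempfile.mkdtemp()) / "state.json"
save_state(state_file, current_page=7, total_scraped=288)
resumed = load_state(state_file)
```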

Timeout Handling

The scraper includes multiple layers of timeout protection:

  • Page fetch timeout: 30 seconds per page
  • Detail scraping timeout: 15 seconds per restaurant
  • Batch timeout: 60 seconds per batch
  • Global runtime limit: Automatically stops 5 minutes before platform timeout
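
One way such layered timeouts can be arranged with `asyncio` is sketched below, using the limits listed above. This is an illustration under assumptions rather than the actor's implementation: `fetch` is a hypothetical coroutine function that downloads one URL, and the helper names are made up for the example.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

PAGE_TIMEOUT = 30.0     # seconds per listing page (wrapped the same way as details)
DETAIL_TIMEOUT = 15.0   # seconds per restaurant detail page
BATCH_TIMEOUT = 60.0    # seconds per batch of detail pages
SAFETY_MARGIN = 5 * 60  # stop this many seconds before the platform timeout

Fetch = Callable[[str], Awaitable[Any]]


async def fetch_detail(url: str, fetch: Fetch, timeout: float = DETAIL_TIMEOUT) -> Optional[Any]:
    """Fetch one detail page; return None rather than hang on a slow page."""
    try:
        return await asyncio.wait_for(fetch(url), timeout=timeout)
    except asyncio.TimeoutError:
        return None


async def fetch_batch(
    urls: list[str],
    fetch: Fetch,
    detail_timeout: float = DETAIL_TIMEOUT,
    batch_timeout: float = BATCH_TIMEOUT,
) -> list:
    """Fetch a batch under both the per-item and the per-batch limit."""
    tasks = [fetch_detail(url, fetch, detail_timeout) for url in urls]
    try:
        return await asyncio.wait_for(asyncio.gather(*tasks), timeout=batch_timeout)
    except asyncio.TimeoutError:
        return []  # the whole batch ran out of time


def should_stop(elapsed_seconds: float, platform_timeout: float = 3600.0) -> bool:
    """True once the run is within SAFETY_MARGIN of the platform timeout."""
    return elapsed_seconds >= platform_timeout - SAFETY_MARGIN


# Stand-in fetch: URLs containing "slow" take longer than the detail timeout.
async def fake_fetch(url: str) -> dict:
    await asyncio.sleep(0.5 if "slow" in url else 0)
    return {"url": url}


results = asyncio.run(
    fetch_batch(["a", "slow-b", "c"], fake_fetch, detail_timeout=0.05, batch_timeout=1.0)
)
```

Here the slow page is dropped (its slot is `None`) while the other two results are kept, so a single hung request does not cost the whole batch.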

Error Recovery

  • Automatic retry for failed requests (up to 3 attempts with exponential backoff)
  • Failed URLs are tracked and skipped to prevent repeated failures
  • Empty page retry logic (2 attempts before considering truly empty)
  • Comprehensive error logging for debugging
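
Retrying with exponential backoff, as described in the first bullet, can be sketched as follows. The three-attempt limit comes from this document; the one-second base delay and the `retry_with_backoff` helper are assumptions. The `sleep` parameter exists only so the example can run without actually waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, waiting base_delay * 2**n seconds after the n-th failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller record the failed URL
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("max_attempts must be at least 1")


# Stand-in operation that fails twice, then succeeds on the third attempt.
attempts: list[int] = []
delays: list[float] = []


def flaky() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = retry_with_backoff(flaky, sleep=delays.append)
```

With these settings the waits are 1 second and then 2 seconds; a third failure would be re-raised so the caller can add the URL to its list of failed URLs.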

📋 Data Fields

Basic Information

  • name: Restaurant name
  • description / full_description: Full restaurant description from detail page
  • chef: Head chef name
  • url: Link to Michelin Guide page

Location

  • address / detailed_address: Complete street address
  • city: City information (name, slug)
  • country: Country information (name, slug, code)
  • region: Regional information
  • _geoloc: Geographic coordinates (lat, lng)

Michelin Recognition

  • michelin_award: Award level (1 Star, 2 Stars, 3 Stars, Bib Gourmand, etc.)
  • green_star: Boolean for Green Star sustainability award
  • new_table: Boolean for newly added restaurants

Cuisine & Pricing

  • cuisines: Array of cuisine types
  • price_category: Price indicator of one to four currency symbols (for example €€€ or $$$)
  • currency: Local currency code
  • currency_symbol: Currency symbol

Contact & Booking

  • phone / phone_number: Contact phone number
  • website: Restaurant website
  • booking_link: Online booking link
  • online_booking: Boolean for online booking availability

Facilities & Services

  • features: Array of restaurant features/facilities
  • credit_cards: Accepted credit card types
  • opening_hours: Operating hours
  • take_away: Boolean for takeaway service
  • delivery: Boolean for delivery service

Images

  • main_image: Main restaurant image
  • images: Array of additional images
  • image: Featured image
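
Several fields above are listed as pairs (`description` / `full_description`, `address` / `detailed_address`, `phone` / `phone_number`). A small helper like the one below can read whichever variant is present. Trying the longer-named variant first is an assumption about which one is more complete, and `pick_field` is a hypothetical name.

```python
from typing import Any, Optional

# Alias pairs listed in this section; the longer-named variant is tried first.
FIELD_ALIASES = {
    "description": ("full_description", "description"),
    "address": ("detailed_address", "address"),
    "phone": ("phone_number", "phone"),
}


def pick_field(record: dict[str, Any], field: str) -> Optional[Any]:
    """Return the first non-empty value among a field's aliases, else None."""
    for key in FIELD_ALIASES.get(field, (field,)):
        value = record.get(key)
        if value not in (None, "", []):
            return value
    return None


record = {
    "description": "Short summary",
    "full_description": "",
    "phone_number": "+1 212-554-1515",
}

description = pick_field(record, "description")  # falls back to "description"
phone = pick_field(record, "phone")              # found under "phone_number"
address = pick_field(record, "address")          # neither variant present: None
```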

🛟 Support

  • All configuration options include helpful descriptions in the input form
  • Built-in state persistence for interrupted runs
  • Detailed logging to track scraping progress
  • Automatic empty page detection to know when scraping is complete
  • Comprehensive error messages for troubleshooting

⚠️ Important Notes

  • This scraper is designed for legitimate data collection purposes
  • Please respect the Michelin Guide's terms of service
  • Use the data responsibly and in compliance with applicable laws
  • The scraper includes polite delays and rate limiting to avoid overwhelming servers
  • Consider the ethical implications of web scraping and use the data appropriately

🔄 Updates & Maintenance

This scraper is actively maintained and updated to handle:

  • Changes to the Michelin Guide website structure
  • New countries added to the Michelin Guide
  • Performance optimizations
  • Bug fixes and stability improvements

📞 Getting Help

If you encounter any issues:

  1. Check the logs: The scraper provides detailed logging of its operations
  2. Verify your input: Ensure country codes and page numbers are valid
  3. Review the output: Check if partial data was collected before the error
  4. Enable debug mode: Turn on debug logging for more detailed information
  5. Contact support: Reach out through the Apify platform for assistance

🎯 Performance Benchmarks

Based on typical usage:

  • Austria (~200 restaurants): 3-4 minutes
  • France (~600 restaurants): 8-12 minutes
  • USA (~400 restaurants): 6-10 minutes
  • All Countries (~15,000 restaurants): 50-60 minutes

Note: Times may vary based on network conditions and server response times.


Technical Details

How It Works

  1. Configuration Extraction: Automatically extracts Algolia API configuration from Michelin Guide pages
  2. API Integration: Uses the Algolia search API to efficiently retrieve restaurant listings
  3. Detail Scraping: Fetches comprehensive information from individual restaurant pages
  4. Data Processing: Structures and formats data for easy consumption
  5. State Management: Tracks progress and handles interruptions gracefully
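
To make step 2 more concrete, the sketch below builds (but does not send) a search request in the shape used by Algolia's REST search endpoint. The application ID, API key, index name, and the `country.slug` filter attribute are placeholders: the actor extracts the real configuration from the Michelin Guide pages, and none of those values are given in this document. Algolia numbers pages from 0, so the sketch assumes the scraper's 1-based page numbers are shifted by one.

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode


def build_algolia_request(
    app_id: str,
    api_key: str,
    index_name: str,
    page: int,
    hits_per_page: int = 48,
    country_slug: Optional[str] = None,
) -> dict[str, Any]:
    """Build the URL, headers, and body for one page of search results."""
    params: dict[str, Any] = {
        "query": "",             # empty query: list everything
        "page": page - 1,        # Algolia pages are zero-based
        "hitsPerPage": hits_per_page,
    }
    if country_slug:
        # Hypothetical filter attribute; the real index may name it differently.
        params["filters"] = f"country.slug:{country_slug}"
    return {
        "method": "POST",
        "url": f"https://{app_id}-dsn.algolia.net/1/indexes/{index_name}/query",
        "headers": {
            "X-Algolia-Application-Id": app_id,
            "X-Algolia-API-Key": api_key,
            "Content-Type": "application/json",
        },
        "body": json.dumps({"params": urlencode(params)}),
    }


# Placeholder credentials and index name, for illustration only.
request = build_algolia_request(
    "APP_ID", "SEARCH_KEY", "restaurants_index", page=1, country_slug="austria"
)
```

Keeping request construction separate from sending makes it easy to log or inspect exactly what would be queried before any network traffic happens.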

Architecture

  • Built with Python 3.11 and the Apify SDK
  • Asynchronous requests using aiohttp for optimal performance
  • BeautifulSoup for HTML parsing
  • Implements connection pooling and timeout management
  • Event-driven state persistence for reliability
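
The asynchronous design pairs naturally with the Max Concurrency setting. The sketch below shows one common way to cap the number of in-flight requests with `asyncio.Semaphore`. It uses a stand-in coroutine in place of real aiohttp calls so that it runs without third-party packages, and `gather_limited` is a hypothetical helper, not part of the actor.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def gather_limited(
    urls: list[str],
    fetch: Callable[[str], Awaitable[Any]],
    max_concurrency: int = 3,
) -> list:
    """Run fetch for every URL with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url: str) -> Any:
        async with semaphore:
            return await fetch(url)

    # gather preserves input order, so results line up with urls.
    return await asyncio.gather(*(worker(url) for url in urls))


# Stand-in for a detail page request; records how many calls overlap.
stats = {"active": 0, "peak": 0}


async def fake_fetch(url: str) -> str:
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)
    stats["active"] -= 1
    return url.upper()


results = asyncio.run(gather_limited([f"r{i}" for i in range(10)], fake_fetch))
```

With ten URLs and the default limit of 3, no more than three stand-in requests overlap at any moment, which mirrors the trade-off between speed and resource use described under Input Parameters.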

Ready to start? Simply configure your desired settings and click Run! The scraper will handle the rest, providing you with comprehensive Michelin Guide restaurant data in a clean, structured format.