Swiss Grocery Scraper

Under maintenance

Pricing

Pay per usage

Scrapes weekly offers from Swiss grocery retailers (Aldi, Migros, Coop, Denner, Lidl). Uses Crawlee + Docling for web and PDF extraction. Outputs structured product data with prices, discounts, and categories.

Developer: Niklas Wichter (maintained by Community)

Last modified: 8 days ago

Scrapes weekly offers and promotions from the five major Swiss grocery retailers:

  • Aldi Suisse: PDF flyer extraction via Docling
  • Migros: OCR of the Issuu Wochenflyer (weekly flyer) plus the aktionen (promotions) HTML page
  • Coop: aktionen (promotions) HTML page, rendered via Playwright
  • Denner: aktionen (promotions) HTML page plus the Issuu Wochenprospekt (weekly brochure)
  • Lidl Schweiz: PDF flyer discovery and extraction

Output

Each scraped product is pushed to the default dataset with these fields:

| Field | Type | Description |
| --- | --- | --- |
| retailer | string | Retailer name (aldi, migros, coop, denner, lidl) |
| name | string | Product name |
| price | number | Price in CHF |
| discount_pct | number | Discount percentage (if available) |
| image_url | string | URL to product image (if available) |
| category | string | Product category |
| region | string | Swiss region |
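
For consumers who want to type-check dataset items, the fields above could be modelled roughly as follows. This is a sketch, not the Actor's own code: `ProductRecord` and `make_record` are hypothetical names, and representing a missing `discount_pct` or `image_url` as `None` is an assumption (the Actor may omit those keys instead).

```python
from typing import Optional, TypedDict

# Retailer identifiers as listed in the output table.
RETAILERS = {"aldi", "migros", "coop", "denner", "lidl"}


class ProductRecord(TypedDict):
    """One dataset item; field names follow the output table."""
    retailer: str
    name: str
    price: float                    # price in CHF
    discount_pct: Optional[float]   # assumption: None when not available
    image_url: Optional[str]        # assumption: None when not available
    category: str
    region: str


def make_record(
    retailer: str,
    name: str,
    price: float,
    category: str,
    region: str,
    discount_pct: Optional[float] = None,
    image_url: Optional[str] = None,
) -> ProductRecord:
    """Build a record, rejecting retailer names outside the documented set."""
    if retailer not in RETAILERS:
        raise ValueError(f"unknown retailer: {retailer!r}")
    return ProductRecord(
        retailer=retailer,
        name=name,
        price=float(price),
        discount_pct=None if discount_pct is None else float(discount_pct),
        image_url=image_url,
        category=category,
        region=region,
    )
```

The product values passed to `make_record` in any example are placeholders; real names, prices, and categories come from the retailers' pages and flyers.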

Input

| Field | Default | Description |
| --- | --- | --- |
| retailers | all five | Array of retailer names to scrape |
| region | zurich | Swiss region (zurich, bern, basel) |
| maxItems | 200 | Max items per retailer (1-1000) |
| webhookUrl | | Optional webhook URL for completion notification |
| webhookApiKey | | Optional API key for webhook auth header |
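
To illustrate how `webhookUrl` and `webhookApiKey` might fit together, here is a minimal sketch that builds (but does not send) a completion notification. The header name `X-API-Key`, the JSON payload shape, and the function name `build_webhook_request` are all assumptions; the source only states that the key is sent in an auth header.

```python
import json
import urllib.request
from typing import Optional


def build_webhook_request(
    url: str, summary: dict, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Prepare a POST request carrying a run summary as JSON."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the actual header name used by the Actor is not documented.
        headers["X-API-Key"] = api_key
    body = json.dumps(summary).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call; a receiver should check the Actor's real payload before relying on this shape.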

Example Input

{
  "retailers": ["migros", "coop"],
  "region": "zurich",
  "maxItems": 50
}
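
The defaults and ranges from the input table can be applied to a raw input object along these lines. This is a sketch under assumptions: `normalize_input` is a hypothetical helper, and rejecting invalid values with `ValueError` is one possible policy, not necessarily what the Actor does.

```python
ALL_RETAILERS = ["aldi", "migros", "coop", "denner", "lidl"]
REGIONS = {"zurich", "bern", "basel"}


def normalize_input(raw: dict) -> dict:
    """Fill in documented defaults and validate the documented ranges."""
    retailers = list(raw.get("retailers") or ALL_RETAILERS)
    unknown = [r for r in retailers if r not in ALL_RETAILERS]
    if unknown:
        raise ValueError(f"unknown retailers: {unknown}")

    region = raw.get("region", "zurich")
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region!r}")

    max_items = int(raw.get("maxItems", 200))
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")

    return {
        "retailers": retailers,
        "region": region,
        "maxItems": max_items,
        "webhookUrl": raw.get("webhookUrl"),        # optional, no default
        "webhookApiKey": raw.get("webhookApiKey"),  # optional, no default
    }
```

With the example input above, the result keeps the two requested retailers and `maxItems` of 50, and leaves both webhook fields as `None`.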

How It Works

  1. Scrapes all requested retailers in parallel
  2. Uses Crawlee (Playwright/BeautifulSoup) for HTML pages
  3. Uses Docling for PDF/image OCR extraction
  4. Pushes structured product data to the dataset
  5. Stores a run summary in the key-value store
  6. Optionally sends a webhook notification on completion
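
The parallel step and the run summary can be sketched with plain `asyncio`. Everything here is illustrative: `scrape_retailer` is a hypothetical stand-in that returns placeholder data instead of calling Crawlee or Docling, and the summary shape is an assumption. In a real Actor the items and summary would presumably be written with the Apify SDK (for example `Actor.push_data` and `Actor.set_value`) rather than returned.

```python
import asyncio


async def scrape_retailer(retailer: str, region: str, max_items: int) -> list[dict]:
    """Hypothetical per-retailer scraper; returns placeholder items only."""
    await asyncio.sleep(0)  # stands in for network and OCR work
    items = [{
        "retailer": retailer,
        "name": f"{retailer} sample offer",
        "price": 1.0,
        "category": "sample",
        "region": region,
    }]
    return items[:max_items]


async def run_all(retailers: list[str], region: str, max_items: int):
    """Scrape all retailers concurrently and build a per-retailer summary."""
    results = await asyncio.gather(
        *(scrape_retailer(r, region, max_items) for r in retailers),
        return_exceptions=True,  # one failing retailer does not abort the rest
    )
    items: list[dict] = []
    summary: dict[str, dict] = {}
    for retailer, result in zip(retailers, results):
        if isinstance(result, BaseException):
            summary[retailer] = {"ok": False, "count": 0, "error": str(result)}
        else:
            summary[retailer] = {"ok": True, "count": len(result)}
            items.extend(result)
    return items, summary


if __name__ == "__main__":
    all_items, run_summary = asyncio.run(run_all(["migros", "coop"], "zurich", 50))
    print(run_summary)
```

Using `return_exceptions=True` is a design choice that keeps a single retailer's failure (a changed page layout, a missing flyer) from discarding the results of the others.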

Technology

  • Python 3.12, Apify SDK v3, Crawlee v1.5
  • Docling for document/PDF/image extraction
  • Playwright for JavaScript-rendered pages
  • BeautifulSoup for static HTML pages