strube.de scraper
Pricing
$10.00/month + usage
The strube.de scraper gathers product URLs from all listing pages and extracts detailed information from each product page.
Developer

youssef farhan
8 days ago
Last modified
Strube.de Product Scraper
This Apify Actor extracts product data from the Strube Verlag webshop. It is built for the specifics of music publishing data, including product variations, multi-format media samples (PDF and MP3), and quantity-based tiered pricing.
🚀 What This Scraper Does
The scraper automates the collection of sheet music, books, and media data. Unlike a standard crawler, it performs a "deep dive" into product attributes and dynamic elements that are often missed by simpler tools.
🔑 Key Features
Deep Metadata Extraction
Captures:
- Titles
- Composers (Komponist)
- Detailed descriptions
- Multi-level categories
Media Link Harvesting
Automatically identifies and extracts URLs for:
- PDF samples (Table of Contents, Sheet Music)
- MP3 audio previews
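As a rough illustration of how media links might be sorted by type, the sketch below groups URLs by file extension. The function name `classify_media_links` and the `example.com` URL are assumptions made for this example; the Actor's actual detection logic is not documented here.

```python
from posixpath import splitext
from urllib.parse import urlparse


def classify_media_links(urls):
    """Group URLs into PDF samples and MP3 previews by file extension."""
    grouped = {"pdf": [], "mp3": []}
    for url in urls:
        # Look only at the URL path so a query string cannot hide the extension.
        extension = splitext(urlparse(url).path)[1].lower().lstrip(".")
        if extension in grouped and url not in grouped[extension]:
            grouped[extension].append(url)
    return grouped


links = [
    "https://www.strube.de/wp-content/uploads/2024/08/T2099.mp3",
    "https://www.strube.de/wp-content/uploads/2019/08/B2099.GIF",
    "https://example.com/samples/contents.PDF?download=1",  # hypothetical URL
]
media = classify_media_links(links)
```

Matching on the extension alone cannot tell a table of contents from a sheet music sample, so a real implementation would presumably also rely on where the link appears on the page.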
Variation Handling
Intelligently identifies product variations (e.g., different instrumentations or editions) and generates distinct data rows for each.
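One way to picture the "one row per variation" behaviour is the minimal sketch below. The helper `expand_variations`, the variation names, and the article numbers are hypothetical; the "Standard" fallback mirrors the example output in this README but is an assumption about how products without variations are labelled.

```python
def expand_variations(product, variations):
    """Return one row per variation, copying the shared product fields."""
    if not variations:
        # Assumption: products without variations still yield a single row.
        return [{**product, "Instrument / Ausgabe": "Standard"}]
    rows = []
    for variation in variations:
        row = dict(product)      # copy shared fields (title, composer, ...)
        row.update(variation)    # variation fields override the shared ones
        rows.append(row)
    return rows


shared = {"Title": "Example Title", "Komponist": "Example Composer"}
# Hypothetical variations, for illustration only.
variations = [
    {"Instrument / Ausgabe": "Partitur", "Artikelnummer": "0000-A"},
    {"Instrument / Ausgabe": "Chorpartitur", "Artikelnummer": "0000-B"},
]
rows = expand_variations(shared, variations)
```

Each resulting row repeats the shared metadata, which keeps CSV and Excel exports flat at the cost of some duplication.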
🛠 How It Works
1️⃣ Discovery
Provide Start URLs (category pages or specific products).
The scraper automatically handles pagination to find every relevant item.
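The discovery step can be sketched as a loop that follows "next page" links until none remain. Everything here is an assumption for illustration: `collect_product_urls` and `fetch_listing` are hypothetical names, and the in-memory `FAKE_SITE` stands in for real HTTP requests.

```python
def collect_product_urls(start_url, fetch_listing, max_pages=100):
    """Follow 'next page' links from start_url and gather product URLs.

    fetch_listing(url) is a caller-supplied function (hypothetical) that
    returns (product_urls, next_page_url_or_None) for one listing page.
    """
    seen_pages = set()
    product_urls = []
    url = start_url
    # Stop on a missing next link, a page loop, or the safety limit.
    while url and url not in seen_pages and len(seen_pages) < max_pages:
        seen_pages.add(url)
        page_products, url = fetch_listing(url)
        for product_url in page_products:
            if product_url not in product_urls:
                product_urls.append(product_url)
    return product_urls


# In-memory stand-in for a webshop: two listing pages, one repeated product.
FAKE_SITE = {
    "https://example.com/shop/": (
        ["https://example.com/p/a", "https://example.com/p/b"],
        "https://example.com/shop/page/2/",
    ),
    "https://example.com/shop/page/2/": (
        ["https://example.com/p/b", "https://example.com/p/c"],
        None,
    ),
}
urls = collect_product_urls("https://example.com/shop/", FAKE_SITE.__getitem__)
```

Tracking visited pages guards against listing pages that link back to each other, and `max_pages` bounds the crawl if the site structure is unexpected.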
2️⃣ Parsing
Uses BeautifulSoup to dissect the HTML and extract structured metadata.
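A minimal sketch of this kind of BeautifulSoup parsing is shown below. The HTML snippet and the CSS selectors (`h1.product_title`, `.sku`, `.posted_in a`) are assumptions for illustration; the real product pages may use different markup, and the Actor's own selectors are not documented here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real product pages may use other tags and classes.
SAMPLE_HTML = """
<div class="product">
  <h1 class="product_title">Posaunen-Choralbuch. Ausg. Bayern/Thüringen</h1>
  <span class="sku">2099</span>
  <span class="posted_in"><a>Bläsermusik</a><a>Bläserchor allgemein</a></span>
</div>
"""


def text_of(soup, selector):
    """Return the stripped text of the first match, or '' if nothing matches."""
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node else ""


def parse_product(html):
    """Extract a few metadata fields into a flat dictionary."""
    soup = BeautifulSoup(html, "html.parser")
    categories = [a.get_text(strip=True) for a in soup.select(".posted_in a")]
    return {
        "Title": text_of(soup, "h1.product_title"),
        "Artikelnummer": text_of(soup, ".sku"),
        "Kategorien": ", ".join(categories),
    }


record = parse_product(SAMPLE_HTML)
```

Returning an empty string for missing elements keeps every row on the same set of keys, which matches the empty PDF fields in the example output.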
3️⃣ Dynamic Fetching
For products with bulk discounts, the scraper triggers secondary background calls to retrieve exact price-per-quantity tiers.
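Once the tiers have been retrieved, they could be flattened into numbered columns like the "Price 1" / "Quantity 1" pair in the example output. The sketch below assumes the tiers arrive as `(min_quantity, unit_price)` pairs; that input shape, the helper name `flatten_price_tiers`, and the second tier's values are hypothetical.

```python
def flatten_price_tiers(tiers):
    """Turn (min_quantity, unit_price) pairs into 'Price N' / 'Quantity N' columns."""
    columns = {}
    # Sort by quantity so tier 1 is always the smallest order size.
    for index, (quantity, price) in enumerate(sorted(tiers), start=1):
        columns[f"Price {index}"] = f"{price:.2f}"  # string, as in the example output
        columns[f"Quantity {index}"] = quantity
    return columns


# Hypothetical tiers; only the (1, 29.5) pair matches the example output.
columns = flatten_price_tiers([(10, 27.0), (1, 29.5)])
```

Numbered columns keep each product on a single row, so products with more tiers simply contribute more columns to the export.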
4️⃣ Storage
Cleaned data is pushed to your Apify Dataset, ready for export in:
- JSON
- CSV
- Excel
📦 Output Data Structure
For every product or variation, you receive a structured object.
Example Output (Choral Book)
{
  "Product URL": "https://www.strube.de/produkt/posaunen-choralbuch-ausg-bayern-thueringen/",
  "Title": "Posaunen-Choralbuch. Ausg. Bayern/Thüringen",
  "Subtitle": "Hrsg. von Barbara Barsch, Erhard Frieß und Karl-Heinz Saretzki",
  "Kategorien": "Bläsermusik, Bläserchor allgemein",
  "Komponist": "Evangelischer Posaunendienst",
  "Beschreibung": "Dieses Standardwerk bietet zu allen Melodien... jeweils eine Intonation und einen vierstimmigen Begleitsatz.",
  "Inhaltsverzeichnisse (one or more PDF-URL)": "",
  "Notenbeispiele (one or more PDF-URLs)": "",
  "Klangbeispiele (one or more MP3-URLs)": "https://www.strube.de/wp-content/uploads/2024/08/T2099.mp3",
  "Image Links": "https://www.strube.de/wp-content/uploads/2019/08/B2099.GIF",
  "Artikelnummer": "2099",
  "Gewicht": "0.722",
  "Instrument / Ausgabe": "Standard",
  "Price 1": "29.50",
  "Quantity 1": 1
}
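When consuming the dataset, note that prices and weight arrive as strings while quantities are numbers. The sketch below shows one possible way to read the numbered price columns back into typed pairs; the helper `price_tiers` is a hypothetical name, and the record is a shortened copy of the example output.

```python
import re


def price_tiers(record):
    """Collect (quantity, price) pairs from 'Quantity N' / 'Price N' keys."""
    tiers = []
    for key, value in record.items():
        match = re.fullmatch(r"Price (\d+)", key)
        if match and f"Quantity {match.group(1)}" in record:
            quantity = int(record[f"Quantity {match.group(1)}"])
            tiers.append((quantity, float(value)))
    return sorted(tiers)


# Shortened copy of the example record.
record = {
    "Artikelnummer": "2099",
    "Gewicht": "0.722",
    "Price 1": "29.50",
    "Quantity 1": 1,
}
tiers = price_tiers(record)
weight = float(record["Gewicht"])
```

Keeping "Artikelnummer" as a string is deliberate in this sketch, since article numbers are identifiers and may not always be purely numeric.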