Fragrantica.com Scraper
Pricing
from $4.99 / 1,000 results
Developer: Scraper Engine (maintained by Community)
Fragrantica.com Scraper
The fastest, friendliest way to pull rich, structured perfume data from Fragrantica.com (accords, note pyramids, longevity, sillage, gender breakouts, ratings, real user reviews, similar perfumes and more) straight into a clean JSON dataset.
Built for the Apify platform with a Cloudflare-aware HTTP layer (impit + browser TLS impersonation), a Playwright-powered renderer for Vue.js sections, and a sticky proxy fallback that automatically escalates only when needed.
Why Choose This Scraper?
- Anti-bot ready: impit + Chrome TLS impersonation, Playwright fallback, and a shared cooldown gate that prevents retry storms.
- Smart proxy fallback: starts direct, escalates to datacenter, then residential (3 retries), and sticks with whichever tier works.
- Live results: every scraped perfume is pushed to your dataset as soon as it's ready, so a crash mid-run never loses work.
- Deeply structured output: main accords, top/middle/base note pyramid, longevity/sillage/price-value breakouts, season/gender/relation breakouts, pros & cons, real user reviews.
- 6 dataset views: Overview, Accords, Pyramid, Ratings, Reviews, and Similar Perfumes, each laid out in its own clean table.
- Bulk or query: paste a list of perfume URLs or just give it a search query (or both).
- Pay per result: you only pay for perfumes that actually get scraped.
Key Features
- Bulk URL input (Fragrantica perfume pages or search URLs)
- Built-in Algolia search by free-text query
- Brand, gender, description, primary image and full photo gallery
- Main accords with named colors + RGB + intensity
- Top / Middle / Base note pyramid (with images and links)
- Longevity, sillage, price-value & rating breakouts (counts + weighted average)
- Season + gender + I-have/had/want relation breakouts
- Pros & cons with like/dislike vote counts
- Real user reviews with normalized publish date, comment and structured vote tags
- "People who like this also like" + "Reminds me of" recommendations
- Sticky proxy fallback (direct → datacenter → residential)
- Heavy-resource blocking for fast browser renders
- Pay-per-event pricing: predictable, transparent
Input
| Field | Type | Description |
|---|---|---|
| startUrls | array | Fragrantica perfume URLs or search URLs |
| query | string | Free-text search query (optional / fallback) |
| maxItems | integer | Max perfumes to scrape |
| proxyConfiguration | object | Proxy mode; defaults to no proxy and falls back automatically when blocked |
| allReviews | boolean | Scroll & paginate review section |
| maxItemsReviews | integer | Cap reviews per perfume |
| omitFields | array | Top-level fields to drop |
| concurrency | integer | Parallel page fetches |
| minRequestIntervalSeconds | number | Floor delay between requests |
| requestJitterSeconds | number | Random jitter on top |
| requestTimeoutSeconds | integer | Single-request timeout |
| maxRetries | integer | Retries per blocked/failed URL |
| retryBackoffSeconds | number | Exponential backoff base |
| rateLimitCooldownSeconds | number | Shared cooldown when 429/503 seen |
| useBrowser | boolean | Use Playwright for Vue.js sections |
| browserHeadless | boolean | Run Chromium headless |
| blockHeavyResources | boolean | Block ads/trackers/fonts/media |
| logLevel | string | DEBUG / INFO / WARNING / ERROR |
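To make the pacing fields more concrete, here is a minimal Python sketch of how a floor delay, jitter and exponential backoff could combine. The formulas are assumptions (uniform jitter added to the floor, and a base that doubles on each retry); the Actor's exact calculation is not documented here, and the function names are hypothetical.

```python
import random


def request_delay(min_interval: float, jitter: float, rng: random.Random) -> float:
    """Delay before the next request: the fixed floor plus uniform random jitter (assumed)."""
    return min_interval + rng.uniform(0.0, jitter)


def retry_backoff(base: float, attempt: int) -> float:
    """Wait before retry number `attempt` (1-based), assuming the base doubles each time."""
    return base * (2 ** (attempt - 1))


# Hypothetical values standing in for minRequestIntervalSeconds,
# requestJitterSeconds and retryBackoffSeconds.
rng = random.Random(0)
delay = request_delay(1.0, 0.5, rng)
waits = [retry_backoff(2.0, attempt) for attempt in (1, 2, 3)]
print(delay, waits)
```

Under these assumptions, a 1-second floor with 0.5 seconds of jitter always waits between 1.0 and 1.5 seconds, and a backoff base of 2 waits 2, 4 and 8 seconds across three retries.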
Example input
    {
      "startUrls": [
        { "url": "https://www.fragrantica.com/perfume/Avon/Avon-Smile-5892.html" }
      ],
      "query": "Avon",
      "maxItems": 50,
      "proxyConfiguration": { "useApifyProxy": false },
      "allReviews": true,
      "maxItemsReviews": 20
    }
Output
Each dataset item has this shape (abridged):

    {
      "id": "5892",
      "url": "https://www.fragrantica.com/perfume/Avon/Avon-Smile-5892.html",
      "title": "Avon Smile Avon perfume - a fragrance for women",
      "description": "Avon Smile by Avon is a Floral fragrance for women. The fragrance features Mandarin Orange, Freesia and Mimosa.",
      "primaryImageUrl": "https://fimgs.net/mdimg/perfume-thumbs/375x500.5892.jpg",
      "images": ["…"],
      "brandName": "Avon",
      "brandUrl": "https://www.fragrantica.com/designers/Avon.html",
      "brandLogo": "https://fimgs.net/mdimg/dizajneri/m.7.jpg",
      "mainAccords": [
        { "accord": "citrus", "rgb": "rgb(249, 255, 82)", "hex": "#f9ff52", "value": 100.0 }
      ],
      "pyramid": { "type": "full", "topNotes": [], "middleNotes": [], "baseNotes": [] },
      "longevityBreakout": [{"very weak": 3}, {"weak": 12}, {"moderate": 6}, {"long lasting": 2}, {"eternal": 0}],
      "longevityAverage": 2.3043,
      "sillageBreakout": [],
      "priceValueBreakout": [],
      "perfumeRating": 3.4,
      "ratingCount": 25,
      "gender": "female",
      "genderBreakout": { "female": 18, "unisex": 4, "male": 1 },
      "seasonBreakout": { "spring": 16, "summer": 12 },
      "relationBreakout": { "have": 5, "had": 3, "want": 7 },
      "perfumers": [],
      "peopleWhoLikeThisAlsoLike": [],
      "thisPerfumeRemindsMeOf": [],
      "reviews": []
    }
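If you want to sanity-check a weighted average such as longevityAverage, the sketch below recomputes it from the breakout list. It assumes each bucket is weighted by its 1-based position (very weak = 1 up to eternal = 5). That weighting is an assumption, not documented behavior, but it does reproduce the 2.3043 in the sample item above. The function name is hypothetical.

```python
from typing import Dict, List, Optional


def weighted_average(breakout: List[Dict[str, int]]) -> Optional[float]:
    """Average bucket position, weighting the i-th bucket (1-based) by its vote count."""
    total_votes = 0
    weighted_sum = 0
    for position, bucket in enumerate(breakout, start=1):
        for count in bucket.values():
            total_votes += count
            weighted_sum += position * count
    if total_votes == 0:
        return None  # an empty breakout has no meaningful average
    return round(weighted_sum / total_votes, 4)


longevity = [
    {"very weak": 3},
    {"weak": 12},
    {"moderate": 6},
    {"long lasting": 2},
    {"eternal": 0},
]
print(weighted_average(longevity))  # 2.3043, i.e. 53 weighted points over 23 votes
```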
How to Use (Apify Console)
- Log in at https://console.apify.com → Actors.
- Open this Actor (fragrantica-com-scraper).
- Paste perfume URLs into Perfume URLs or Search URLs, or set a Search query.
- (Optional) Tweak maxItems, concurrency, allReviews, etc.
- Click Start.
- Watch logs stream in real time; every saved perfume is announced.
- Open the Output tab to see results split into clean views:
  - Overview
  - Accords
  - Pyramid
  - Ratings & Breakouts
  - Reviews
  - Similar Perfumes
- Export to JSON / CSV / XLSX.
Use via API
Start a run synchronously and get the dataset items back:

    curl -X POST "https://api.apify.com/v2/acts/<USERNAME>~fragrantica-com-scraper/run-sync-get-dataset-items?token=$APIFY_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"startUrls": [{"url": "https://www.fragrantica.com/perfume/Avon/Avon-Smile-5892.html"}], "maxItems": 10}'

Asynchronous (returns a run ID):

    curl -X POST "https://api.apify.com/v2/acts/<USERNAME>~fragrantica-com-scraper/runs?token=$APIFY_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"query": "Tom Ford", "maxItems": 25}'
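The same synchronous call can be made from Python with only the standard library. This is a minimal sketch: the endpoint and payload mirror the curl example above, while the helper names (run_sync_url, build_request, run_sync) and the 300-second timeout are illustrative assumptions. The username is a placeholder you must supply yourself.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

API_BASE = "https://api.apify.com/v2/acts"


def run_sync_url(username: str, token: str) -> str:
    """Build the run-sync-get-dataset-items URL for this Actor."""
    actor_id = f"{username}~fragrantica-com-scraper"
    query = urllib.parse.urlencode({"token": token})
    return f"{API_BASE}/{actor_id}/run-sync-get-dataset-items?{query}"


def build_request(username: str, token: str, run_input: Dict[str, Any]) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        run_sync_url(username, token),
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(username: str, token: str, run_input: Dict[str, Any], timeout: float = 300.0) -> List[Dict[str, Any]]:
    """Send the request and return the dataset items as parsed JSON."""
    request = build_request(username, token, run_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling run_sync with your username, the value of APIFY_TOKEN, and an input such as {"query": "Tom Ford", "maxItems": 25} should return a list of items in the output shape described earlier. Keeping build_request separate makes it easy to inspect the URL and body before anything is sent.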
Proxy Strategy (the smart fallback)
This Actor uses a sticky escalating proxy chain:
- Direct: start with no proxy.
- Datacenter: if Fragrantica blocks, fall back to Apify datacenter proxies.
- Residential: if datacenter also fails, fall back to residential and retry up to 3 times.
- Once any fallback engages, the Actor sticks with that tier for the rest of the run, with no flapping between tiers.
All escalations are logged with clear emoji-tagged messages, so you can always see which tier is in play.
If you explicitly choose a proxy tier (e.g. residential only), that choice is honored and automatic escalation is disabled.
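The escalation rules above can be sketched as a small state machine. This is an illustration, not the Actor's source: the class name, the attempt callback, and the choice of one attempt per non-residential tier are all assumptions.

```python
from typing import Callable, Optional

TIERS = ("direct", "datacenter", "residential")
RESIDENTIAL_ATTEMPTS = 3  # "retry up to 3 times" on the last tier


class StickyProxyChain:
    """Tracks the current proxy tier and only ever moves forward."""

    def __init__(self, pinned_tier: Optional[str] = None) -> None:
        # An explicitly chosen tier disables automatic escalation.
        self.pinned = pinned_tier is not None
        self.index = TIERS.index(pinned_tier) if pinned_tier is not None else 0

    @property
    def tier(self) -> str:
        return TIERS[self.index]

    def fetch(self, url: str, attempt: Callable[[str, str], bool]) -> bool:
        """Call attempt(url, tier) until it succeeds or every allowed tier is exhausted."""
        while True:
            tries = RESIDENTIAL_ATTEMPTS if self.tier == "residential" else 1
            for _ in range(tries):
                if attempt(url, self.tier):
                    return True
            if self.pinned or self.index == len(TIERS) - 1:
                return False  # nothing left to escalate to
            self.index += 1  # sticky: later URLs start from this tier
```

Because the tier index lives on the chain object and is never reset, a URL fetched after an escalation starts directly on the tier that last worked, which is the sticky behavior described above.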
Best Use Cases
- Building a perfume catalog or price-comparison product
- Brand / niche analysis with longevity, sillage and price-value breakouts
- Recommendation engines ("people who like this also like…")
- Note & accord research for fragrance creation
- Sentiment analysis on real user reviews
- Academic / consumer-research datasets
Pricing
This Actor uses the Pay Per Event model:
| Event | What it bills |
|---|---|
| apify-actor-start | One-time start fee per run (synthetic) |
| result-item | Each successfully scraped perfume pushed to your dataset |

You only pay for perfumes that actually get returned. If Fragrantica is unreachable or blocks the run end-to-end, you won't be billed for empty results. Set maxItems to cap your spend predictably.
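As a rough budgeting aid, the upper bound for a run can be estimated from maxItems. The sketch assumes the listed starting price of $4.99 per 1,000 results applies to every result-item event. The start fee is not stated on this page, so it is left as a parameter with a placeholder default of 0; check the Actor's pricing tab for the real values.

```python
def estimate_max_cost(max_items: int, price_per_1000: float = 4.99, start_fee: float = 0.0) -> float:
    """Upper bound in USD: one start fee plus one result-item event per perfume."""
    if max_items < 0:
        raise ValueError("max_items must be non-negative")
    return round(start_fee + max_items * price_per_1000 / 1000, 4)


# Example: the maxItems value from the sample input.
print(estimate_max_cost(50))
```

With maxItems set to 50 and no start fee, the bound works out to about $0.25. Actual spend can only be lower, since perfumes that fail to scrape are not billed.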
Frequently Asked Questions
Why does the run start with no proxy?
Fragrantica often responds happily to direct requests with Chrome TLS impersonation. Skipping the proxy makes the first batch faster and cheaper. The Actor auto-escalates the moment it sees a block.
My run says "Residential proxy retry 2/3". Is that bad?
No, it just means datacenter wasn't enough and the Actor is trying residential. After 3 residential attempts in a row that still fail, the Actor surfaces a clear error.
Reviews come back empty β why?
Either allReviews is off, the perfume genuinely has no reviews, or the browser couldn't render the review section (very rare). Set allReviews: true and maxItemsReviews: 20 to get a healthy sample.
Can I scrape a search results page instead of perfume pages?
Yes. Paste a search URL like https://www.fragrantica.com/search/?query=oud. The Actor reads the query parameter and resolves it to perfume URLs via the catalog.
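Reading the query parameter from a search URL takes only a few lines with Python's standard library. This sketch covers just that first step; the function name is hypothetical, and the catalog lookup that turns the query into perfume URLs is not shown.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_search_query(url: str) -> Optional[str]:
    """Return the `query` parameter of a Fragrantica search URL, or None if absent."""
    parsed = urlparse(url)
    # Only treat paths ending in /search (with or without a trailing slash) as search URLs.
    if not parsed.path.rstrip("/").endswith("/search"):
        return None
    values = parse_qs(parsed.query).get("query")
    return values[0] if values else None


print(extract_search_query("https://www.fragrantica.com/search/?query=oud"))  # oud
```

A perfume page URL has no /search path, so the function returns None for it and the URL can be treated as a direct perfume link instead.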
Is the data complete on every run?
Not always. The Actor saves partial fields rather than failing the whole record, so a crash mid-page leaves the static fields intact even if browser enrichment couldn't finish.
Caution & Legal
- All data is collected from publicly available Fragrantica pages.
- Honor Fragrantica's Terms of Service, robots.txt and rate limits.
- The end user is responsible for compliance with GDPR, CCPA and any other applicable regulations.
Support & Feedback
Open an issue or contact the maintainer via the Actor's Apify Store page. Bug reports with the run ID, input JSON and a description of what looked wrong are the easiest to fix.