Fragrantica Scraper - Perfume Data Extractor
Pricing
from $5.00 / 1,000 results
$5 per 1,000 listings. Scrape fragrance data from Fragrantica.com, including perfume names, brands, notes, ratings, images, and similar perfumes. Fast, structured data extraction from the largest perfume community.
Developer
Faheem Ahmed
Maintained by Community
Extract structured perfume data from Fragrantica detail pages. This Apify Actor is built for direct Fragrantica perfume URLs and returns clean JSON records with perfume names, brands, fragrance notes, main accords, ratings, images, launch year, perfumer data, and related perfume links.
Use this Fragrantica scraper for perfume catalog building, fragrance market research, competitive analysis, product enrichment, recommendation systems, and structured fragrance data pipelines.
Key Features
- Scrapes specific Fragrantica perfume detail URLs
- Extracts perfume name, brand, gender, rating, vote count, image, and launch year
- Extracts top notes, middle notes, base notes, and main accords
- Extracts perfumer or nose information when available
- Returns one structured JSON item per input URL
- Includes status and error fields for failed URLs
- Supports static and browser-backed Scrapling fetchers
- Optional proxy URL support for more reliable extraction
- Designed for Apify Dataset output and downstream database workflows
What This Actor Scrapes
This Actor focuses on Fragrantica perfume detail pages, for example:
https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html
It does not crawl search pages, brand pages, pagination, or category listings. Provide the exact perfume URLs you want to extract.
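Because only detail pages are supported, it can help to filter a URL list before submitting it. The sketch below is a minimal pre-check, not part of the Actor. The path pattern is an assumption inferred from the single example URL above, and `is_perfume_detail_url` is a hypothetical helper name.

```python
import re
from urllib.parse import urlparse

# Assumed pattern, inferred from the example URL above:
# /perfume/<Brand>/<Name>-<numeric id>.html
_DETAIL_PATH = re.compile(r"^/perfume/[^/]+/[^/]+-\d+\.html$")


def is_perfume_detail_url(url: str) -> bool:
    """Return True if the URL looks like a Fragrantica perfume detail page."""
    parsed = urlparse(url)
    return (
        parsed.scheme in ("http", "https")
        and parsed.netloc.lower() in ("fragrantica.com", "www.fragrantica.com")
        and _DETAIL_PATH.match(parsed.path) is not None
    )
```

URLs that fail this check (search, brand, or category pages) can be dropped or logged before the run starts.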
Why Use This Fragrantica Scraper?
Fragrantica contains detailed perfume data, fragrance notes, ratings, accords, brand information, and community-driven perfume metadata. This Actor turns those detail pages into structured JSON that can be used for analysis, enrichment, search indexes, internal catalogs, dashboards, or AI and recommendation workflows.
The scraper is intentionally direct and predictable: each input URL produces one output record, making it easier to track source URLs, retry failures, and map results back to your own product or research data.
Who Is This Actor For?
- Fragrance researchers analyzing perfume notes, accords, and ratings
- E-commerce teams enriching perfume product catalogs
- Data teams building structured fragrance datasets
- Developers building perfume search or recommendation tools
- Market researchers comparing brands, launches, and fragrance profiles
- Collectors and fragrance communities organizing perfume metadata
Input Schema
The Actor expects a JSON input with one or more Fragrantica perfume detail URLs.
{
  "urls": ["https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html"],
  "maxConcurrency": 2,
  "requestDelaySecs": 2,
  "fetcher": "auto",
  "timeoutSecs": 30,
  "retries": 2,
  "proxyUrl": "",
  "includeRawHtml": false
}
Input Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| urls | Array of strings | Yes | Specific Fragrantica perfume detail URLs to scrape. |
| maxConcurrency | Number | No | Number of URLs processed at the same time. Default is 2. |
| requestDelaySecs | Number | No | Delay before each request, in seconds. Default is 2. |
| fetcher | String | No | Fetch mode: auto, static, browser, dynamic, or stealthy. Default is auto. |
| timeoutSecs | Number | No | Request timeout in seconds. Default is 30. |
| retries | Number | No | Retry attempts per URL. Default is 2. |
| proxyUrl | String | No | Optional proxy URL passed to Scrapling. |
| includeRawHtml | Boolean | No | Include raw fetched HTML in output records. Default is false. |
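The parameters above can be assembled programmatically. The following is a minimal sketch of a client-side helper that fills in the documented defaults and rejects unknown fields. `build_input` is a hypothetical name, and the checks here are an illustration, not the Actor's own validation.

```python
# Defaults copied from the parameter table and the example input above.
DEFAULTS = {
    "maxConcurrency": 2,
    "requestDelaySecs": 2,
    "fetcher": "auto",
    "timeoutSecs": 30,
    "retries": 2,
    "proxyUrl": "",
    "includeRawHtml": False,
}
FETCHER_MODES = {"auto", "static", "browser", "dynamic", "stealthy"}


def build_input(urls, **overrides):
    """Merge user overrides onto the documented defaults."""
    if not urls:
        raise ValueError("urls is required and must not be empty")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {"urls": list(urls), **DEFAULTS, **overrides}
    if run_input["fetcher"] not in FETCHER_MODES:
        raise ValueError(f"unsupported fetcher: {run_input['fetcher']!r}")
    return run_input
```

For example, `build_input(urls, fetcher="static", maxConcurrency=1)` yields a complete input object with only those two fields changed.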
Fetcher Modes
| Mode | Description |
|---|---|
| auto | Tries static fetching first, then falls back to browser-backed fetching if needed. |
| static | Uses Scrapling static HTTP fetching. Fastest option when the page is accessible. |
| browser | Uses a lightweight browser fetch that waits for DOM content instead of full page load. Best fallback for heavy Fragrantica pages. |
| dynamic | Uses browser-backed fetching for pages that need rendering. |
| stealthy | Uses Scrapling stealthy browser fetching. Useful when static or dynamic fetching is insufficient. |
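The auto mode describes a try-then-fall-back strategy. The sketch below illustrates that general pattern only; it is not the Actor's code and does not call Scrapling. The `Fetcher` type and `fetch_with_fallback` function are hypothetical, and real fetchers would be passed in by the caller.

```python
from typing import Callable, Optional

# Hypothetical type: a fetcher takes a URL and returns HTML, or None when the
# page was blocked or unusable.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(url: str, fetchers: dict, order: list) -> tuple:
    """Try each named fetcher in order; return (name, html) for the first success."""
    errors = []
    for name in order:
        try:
            html = fetchers[name](url)
        except Exception as exc:  # a failed fetcher should not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if html:
            return name, html
        errors.append(f"{name}: empty response")
    raise RuntimeError("all fetchers failed: " + "; ".join(errors))
```

Returning the fetcher name alongside the HTML mirrors the idea behind the `fetcherUsed` output field: the record states which mode actually produced it.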
Proxy Note
Fragrantica may challenge or block some runtime IPs. If requests fail or return challenge pages, use a reliable proxy through proxyUrl and keep concurrency conservative.
Output Schema
Each dataset item represents one Fragrantica perfume URL.
{
  "sourceUrl": "https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html",
  "canonicalUrl": "https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html",
  "scrapedAt": "2026-05-24T11:09:18.203793+00:00",
  "fetcherUsed": "dynamic",
  "name": "Bleu de Chanel",
  "brand": "Chanel",
  "gender": "men",
  "imageUrl": "https://fimgs.net/mdimg/perfume-social-cards/en-social-9099.jpeg",
  "rating": 4.18,
  "ratingCount": 20607,
  "year": 2010,
  "perfumers": ["Jacques Polge"],
  "mainAccords": ["citrus", "woody", "fresh spicy", "aromatic", "amber"],
  "topNotes": ["Grapefruit", "Lemon", "Mint", "Pink Pepper"],
  "middleNotes": ["Ginger", "Nutmeg", "Jasmine", "Iso E Super"],
  "baseNotes": ["Incense", "Cedar", "Vetiver", "Sandalwood", "Patchouli", "Labdanum", "White Musk"],
  "description": "Bleu de Chanel by Chanel is a Woody Aromatic fragrance for men.",
  "concentration": "Cologne",
  "seasonVotes": {},
  "longevityVotes": {},
  "sillageVotes": {},
  "genderVotes": {},
  "priceValueVotes": {},
  "similarPerfumes": [
    {
      "name": "Dior Sauvage",
      "url": "https://www.fragrantica.com/perfume/Dior/Sauvage-31861.html"
    }
  ],
  "breadcrumbs": [],
  "status": "ok",
  "error": null
}
Output Fields
| Field | Description |
|---|---|
| sourceUrl | Original input URL. |
| canonicalUrl | Canonical Fragrantica perfume URL when available. |
| scrapedAt | ISO timestamp for the extraction. |
| fetcherUsed | Scrapling fetcher that produced the accepted record. |
| name | Perfume name. |
| brand | Perfume brand or house. |
| gender | Gender/category text inferred from the page. |
| imageUrl | Main image URL. |
| rating | Fragrantica rating value when available. |
| ratingCount | Number of rating votes when available. |
| year | Launch year when available. |
| perfumers | List of perfumers or noses. |
| mainAccords | Main fragrance accords as text labels. |
| topNotes | Top note names. |
| middleNotes | Middle or heart note names. |
| baseNotes | Base note names. |
| description | Main perfume description text. |
| concentration | Detected concentration text where possible. |
| seasonVotes | Season vote data when present in fetched markup. |
| longevityVotes | Longevity vote data when present in fetched markup. |
| sillageVotes | Sillage vote data when present in fetched markup. |
| genderVotes | Gender vote data when present in fetched markup. |
| priceValueVotes | Price value vote data when present in fetched markup. |
| similarPerfumes | Related perfume links found on the detail page. |
| breadcrumbs | Breadcrumb labels when available. |
| status | ok or failed. |
| error | Error message for failed URLs. |
| rawHtml | Raw fetched HTML, only when includeRawHtml is true. |
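Since every input URL yields a record with `status` and `error` fields, downstream code can separate usable records from ones to retry. A minimal sketch, assuming the dataset items have already been loaded as a list of dictionaries; `split_by_status` is a hypothetical helper name.

```python
def split_by_status(items):
    """Return (ok_records, retry_urls) from a list of dataset items."""
    ok_records = [item for item in items if item.get("status") == "ok"]
    # Anything not explicitly "ok" is treated as failed, including items
    # where the status field is missing.
    retry_urls = [
        item["sourceUrl"]
        for item in items
        if item.get("status") != "ok" and item.get("sourceUrl")
    ]
    return ok_records, retry_urls
```

The returned `retry_urls` list can be passed back as the `urls` input of a follow-up run, possibly with a different `fetcher` or a proxy.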
Example Use Cases
- Build a perfume database from a list of Fragrantica URLs
- Enrich fragrance product pages with notes, accords, ratings, and launch data
- Compare perfume brands, ratings, perfumers, and fragrance profiles
- Create datasets for fragrance recommendation engines
- Monitor specific perfumes for rating or metadata changes
- Prepare structured fragrance data for analytics, BI, or AI workflows
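For the monitoring use case, two runs over the same URL can be compared field by field. The sketch below is one simple way to do that, assuming both records follow the output schema above. The `WATCHED_FIELDS` selection and the `diff_records` name are illustrative choices, not something the Actor provides.

```python
# Illustrative choice of fields to watch; adjust to the fields you care about.
WATCHED_FIELDS = ("rating", "ratingCount", "year", "concentration")


def diff_records(old, new, fields=WATCHED_FIELDS):
    """Return {field: (old_value, new_value)} for watched fields that changed."""
    return {
        field: (old.get(field), new.get(field))
        for field in fields
        if old.get(field) != new.get(field)
    }
```

An empty result means none of the watched fields changed between the two snapshots.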
Running Locally
Install dependencies:
$ pip install -r requirements.txt
Run with Apify CLI:
$ apify run -i input.json
Example input.json:
{"urls": ["https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html"],"fetcher": "auto"}
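For longer URL lists, the input file can be generated instead of written by hand. A minimal sketch, assuming a plain text file with one URL per line; the file layout and the `urls_to_input_file` helper are assumptions made for illustration.

```python
import json
from pathlib import Path


def urls_to_input_file(url_list_path, input_path="input.json", fetcher="auto"):
    """Read one URL per line, drop blanks and duplicates, write an input file."""
    lines = Path(url_list_path).read_text(encoding="utf-8").splitlines()
    # dict.fromkeys keeps the first occurrence of each URL and preserves order.
    urls = list(dict.fromkeys(line.strip() for line in lines if line.strip()))
    payload = {"urls": urls, "fetcher": fetcher}
    Path(input_path).write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload
```

Deduplicating up front avoids paying for the same perfume twice, since each input URL produces one result.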
FAQ
Does this Actor scrape Fragrantica search results?
No. This Actor is for specific Fragrantica perfume detail URLs. It does not crawl search pages or pagination.
Can I scrape multiple perfumes at once?
Yes. Add multiple perfume URLs to the urls array. The Actor returns one dataset item per URL.
Does it extract perfume notes?
Yes. It extracts top notes, middle notes, base notes, and main accords where available.
Does it scrape reviews?
No. Review scraping is not part of the current implementation.
What happens when a URL fails?
The Actor still pushes a dataset item with status: "failed" and an error message.
Should I use a proxy?
Use a proxy if Fragrantica blocks or challenges the runtime IP. Keep concurrency low for better reliability.