Fragrantica Scraper - Perfume Data Extractor

Pricing

from $5.00 / 1,000 results

$5 per 1,000 listings. Scrape fragrance data from Fragrantica.com, including perfume names, brands, notes, accords, ratings, images, and links to similar perfumes. Fast, structured data extraction from the largest perfume community.

Developer

Faheem Ahmed

Maintained by Community


Extract structured perfume data from Fragrantica detail pages. This Apify Actor is built for direct Fragrantica perfume URLs and returns clean JSON records with perfume names, brands, fragrance notes, main accords, ratings, images, launch year, perfumer data, and related perfume links.

Use this Fragrantica scraper for perfume catalog building, fragrance market research, competitive analysis, product enrichment, recommendation systems, and structured fragrance data pipelines.


Key Features

  • Scrapes specific Fragrantica perfume detail URLs
  • Extracts perfume name, brand, gender, rating, vote count, image, and launch year
  • Extracts top notes, middle notes, base notes, and main accords
  • Extracts perfumer or nose information when available
  • Returns one structured JSON item per input URL
  • Includes status and error fields for failed URLs
  • Supports static and browser-backed Scrapling fetchers
  • Optional proxy URL support for more reliable extraction
  • Designed for Apify Dataset output and downstream database workflows

What This Actor Scrapes

This Actor focuses on Fragrantica perfume detail pages, for example:

https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html

It does not crawl search pages, brand pages, pagination, or category listings. Provide the exact perfume URLs you want to extract.
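
Because only detail pages are supported, it can help to filter a URL list before starting a run. The following is a minimal sketch, not part of the Actor: the path pattern is an assumption inferred from the example URL above, and `is_perfume_detail_url` and `split_urls` are hypothetical helper names.

```python
import re
from urllib.parse import urlparse

# Assumed URL shape, inferred from the example above:
# /perfume/<Brand>/<Name>-<numeric id>.html
_DETAIL_PATH = re.compile(r"^/perfume/[^/]+/[^/]+-\d+\.html$")
_HOSTS = {"www.fragrantica.com", "fragrantica.com"}


def is_perfume_detail_url(url: str) -> bool:
    """Return True if the URL looks like a Fragrantica perfume detail page."""
    parts = urlparse(url)
    return (
        parts.scheme in {"http", "https"}
        and parts.netloc.lower() in _HOSTS
        and bool(_DETAIL_PATH.match(parts.path))
    )


def split_urls(urls):
    """Split candidate URLs into (accepted, rejected), preserving order."""
    accepted = [u for u in urls if is_perfume_detail_url(u)]
    rejected = [u for u in urls if not is_perfume_detail_url(u)]
    return accepted, rejected
```

Rejected URLs (search, brand, or category pages) can be logged and dropped so that they do not consume a paid result slot as failed items.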


Why Use This Fragrantica Scraper?

Fragrantica contains detailed perfume data, fragrance notes, ratings, accords, brand information, and community-driven perfume metadata. This Actor turns those detail pages into structured JSON that can be used for analysis, enrichment, search indexes, internal catalogs, dashboards, or AI and recommendation workflows.

The scraper is intentionally direct and predictable: each input URL produces one output record, making it easier to track source URLs, retry failures, and map results back to your own product or research data.


Who Is This Actor For?

  • Fragrance researchers analyzing perfume notes, accords, and ratings
  • E-commerce teams enriching perfume product catalogs
  • Data teams building structured fragrance datasets
  • Developers building perfume search or recommendation tools
  • Market researchers comparing brands, launches, and fragrance profiles
  • Collectors and fragrance communities organizing perfume metadata

Input Schema

The Actor expects a JSON input with one or more Fragrantica perfume detail URLs.

{
  "urls": [
    "https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html"
  ],
  "maxConcurrency": 2,
  "requestDelaySecs": 2,
  "fetcher": "auto",
  "timeoutSecs": 30,
  "retries": 2,
  "proxyUrl": "",
  "includeRawHtml": false
}

Input Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| urls | Array of strings | Yes | Specific Fragrantica perfume detail URLs to scrape. |
| maxConcurrency | Number | No | Number of URLs processed at the same time. Default is 2. |
| requestDelaySecs | Number | No | Delay before each request, in seconds. Default is 2. |
| fetcher | String | No | Fetch mode: auto, static, browser, dynamic, or stealthy. Default is auto. |
| timeoutSecs | Number | No | Request timeout in seconds. Default is 30. |
| retries | Number | No | Retry attempts per URL. Default is 2. |
| proxyUrl | String | No | Optional proxy URL passed to Scrapling. |
| includeRawHtml | Boolean | No | Include raw fetched HTML in output records. Default is false. |
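
The parameters above can be assembled programmatically before a run. This is a rough sketch under stated assumptions: `build_run_input` is a hypothetical helper, the defaults are copied from the table, and the empty `proxyUrl` default is taken from the example input rather than from a documented default. The Actor may validate input differently.

```python
import json

# Defaults copied from the Input Parameters table; proxyUrl "" comes from the example input.
DEFAULTS = {
    "maxConcurrency": 2,
    "requestDelaySecs": 2,
    "fetcher": "auto",
    "timeoutSecs": 30,
    "retries": 2,
    "proxyUrl": "",
    "includeRawHtml": False,
}
FETCHERS = {"auto", "static", "browser", "dynamic", "stealthy"}


def build_run_input(urls, **overrides):
    """Merge overrides onto the documented defaults and check basic constraints."""
    if not urls:
        raise ValueError("urls must contain at least one perfume detail URL")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    run_input = {"urls": list(urls), **DEFAULTS, **overrides}
    if run_input["fetcher"] not in FETCHERS:
        raise ValueError(f"fetcher must be one of {sorted(FETCHERS)}")
    return run_input


if __name__ == "__main__":
    payload = build_run_input(
        ["https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html"],
        fetcher="static",
        maxConcurrency=1,
    )
    # Print JSON that could be saved as input.json for a local run.
    print(json.dumps(payload, indent=2))
```

Rejecting unknown field names early catches typos such as `maxConcurency` before a run is started.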

Fetcher Modes

| Mode | Description |
| --- | --- |
| auto | Tries static fetching first, then falls back to browser-backed fetching if needed. |
| static | Uses Scrapling static HTTP fetching. Fastest option when the page is accessible. |
| browser | Uses a lightweight browser fetch that waits for DOM content instead of full page load. Best fallback for heavy Fragrantica pages. |
| dynamic | Uses browser-backed fetching for pages that need rendering. |
| stealthy | Uses Scrapling stealthy browser fetching. Useful when static or dynamic fetching is insufficient. |
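
The `auto` mode's static-then-browser order can be illustrated with a small sketch. This is not the Actor's source code: the fetch functions are injected placeholders rather than Scrapling calls, the condition for falling back (an empty or missing result) is an assumption, and the returned labels are illustrative.

```python
from typing import Callable, Optional, Tuple

# A fetch function returns HTML text, or None when the result is unusable.
Fetch = Callable[[str], Optional[str]]


def fetch_auto(url: str, static_fetch: Fetch, browser_fetch: Fetch) -> Tuple[Optional[str], str]:
    """Try the cheap static fetch first; fall back to the browser-backed fetch."""
    html = static_fetch(url)
    if html:
        return html, "static"
    # Static fetching produced nothing usable, so pay for a browser fetch.
    return browser_fetch(url), "browser"
```

Trying the static path first keeps most runs fast and reserves the slower browser path for pages that need it.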

Proxy Note

Fragrantica may challenge or block some runtime IPs. If requests fail or return challenge pages, use a reliable proxy through proxyUrl and keep concurrency conservative.


Output Schema

Each dataset item represents one Fragrantica perfume URL.

{
  "sourceUrl": "https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html",
  "canonicalUrl": "https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html",
  "scrapedAt": "2026-05-24T11:09:18.203793+00:00",
  "fetcherUsed": "dynamic",
  "name": "Bleu de Chanel",
  "brand": "Chanel",
  "gender": "men",
  "imageUrl": "https://fimgs.net/mdimg/perfume-social-cards/en-social-9099.jpeg",
  "rating": 4.18,
  "ratingCount": 20607,
  "year": 2010,
  "perfumers": ["Jacques Polge"],
  "mainAccords": [
    "citrus",
    "woody",
    "fresh spicy",
    "aromatic",
    "amber"
  ],
  "topNotes": ["Grapefruit", "Lemon", "Mint", "Pink Pepper"],
  "middleNotes": ["Ginger", "Nutmeg", "Jasmine", "Iso E Super"],
  "baseNotes": [
    "Incense",
    "Cedar",
    "Vetiver",
    "Sandalwood",
    "Patchouli",
    "Labdanum",
    "White Musk"
  ],
  "description": "Bleu de Chanel by Chanel is a Woody Aromatic fragrance for men.",
  "concentration": "Cologne",
  "seasonVotes": {},
  "longevityVotes": {},
  "sillageVotes": {},
  "genderVotes": {},
  "priceValueVotes": {},
  "similarPerfumes": [
    {
      "name": "Dior Sauvage",
      "url": "https://www.fragrantica.com/perfume/Dior/Sauvage-31861.html"
    }
  ],
  "breadcrumbs": [],
  "status": "ok",
  "error": null
}

Output Fields

| Field | Description |
| --- | --- |
| sourceUrl | Original input URL. |
| canonicalUrl | Canonical Fragrantica perfume URL when available. |
| scrapedAt | ISO timestamp for the extraction. |
| fetcherUsed | Scrapling fetcher that produced the accepted record. |
| name | Perfume name. |
| brand | Perfume brand or house. |
| gender | Gender/category text inferred from the page. |
| imageUrl | Main image URL. |
| rating | Fragrantica rating value when available. |
| ratingCount | Number of rating votes when available. |
| year | Launch year when available. |
| perfumers | List of perfumers or noses. |
| mainAccords | Main fragrance accords as text labels. |
| topNotes | Top note names. |
| middleNotes | Middle or heart note names. |
| baseNotes | Base note names. |
| description | Main perfume description text. |
| concentration | Detected concentration text where possible. |
| seasonVotes | Season vote data when present in fetched markup. |
| longevityVotes | Longevity vote data when present in fetched markup. |
| sillageVotes | Sillage vote data when present in fetched markup. |
| genderVotes | Gender vote data when present in fetched markup. |
| priceValueVotes | Price value vote data when present in fetched markup. |
| similarPerfumes | Related perfume links found on the detail page. |
| breadcrumbs | Breadcrumb labels when available. |
| status | ok or failed. |
| error | Error message for failed URLs. |
| rawHtml | Raw fetched HTML, only when includeRawHtml is true. |
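
For downstream database workflows, the records can be flattened into rows. The sketch below is one possible approach using Python's built-in `sqlite3` module: the `perfumes` table layout, the choice of `sourceUrl` as the key, and the `to_row` and `save` helpers are all assumptions for illustration, and list fields are stored as JSON text.

```python
import json
import sqlite3

LIST_FIELDS = ("perfumers", "mainAccords", "topNotes", "middleNotes", "baseNotes")


def to_row(item: dict) -> dict:
    """Flatten one dataset item into scalar columns; list fields become JSON text."""
    row = {
        "sourceUrl": item["sourceUrl"],
        "name": item.get("name"),
        "brand": item.get("brand"),
        "rating": item.get("rating"),
        "ratingCount": item.get("ratingCount"),
        "year": item.get("year"),
        "status": item.get("status"),
    }
    for field in LIST_FIELDS:
        row[field] = json.dumps(item.get(field) or [])
    return row


def save(conn: sqlite3.Connection, items: list) -> None:
    """Create the table if needed and upsert rows keyed by sourceUrl."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS perfumes ("
        "sourceUrl TEXT PRIMARY KEY, name TEXT, brand TEXT, rating REAL, "
        "ratingCount INTEGER, year INTEGER, status TEXT, perfumers TEXT, "
        "mainAccords TEXT, topNotes TEXT, middleNotes TEXT, baseNotes TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO perfumes VALUES ("
        ":sourceUrl, :name, :brand, :rating, :ratingCount, :year, :status, "
        ":perfumers, :mainAccords, :topNotes, :middleNotes, :baseNotes)",
        [to_row(i) for i in items],
    )
    conn.commit()
```

Keying on `sourceUrl` means a re-scrape of the same perfume overwrites the earlier row instead of duplicating it.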

Example Use Cases

  • Build a perfume database from a list of Fragrantica URLs
  • Enrich fragrance product pages with notes, accords, ratings, and launch data
  • Compare perfume brands, ratings, perfumers, and fragrance profiles
  • Create datasets for fragrance recommendation engines
  • Monitor specific perfumes for rating or metadata changes
  • Prepare structured fragrance data for analytics, BI, or AI workflows
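
The monitoring use case above could be approached by comparing two runs over the same URL list. The following sketch assumes both snapshots are lists of dataset items as described in the Output Schema; `rating_changes` is a hypothetical helper, not something the Actor provides.

```python
def rating_changes(previous, current):
    """Report perfumes whose rating or vote count differs between two snapshots."""
    # Index the older snapshot by sourceUrl, ignoring failed records.
    old = {i["sourceUrl"]: i for i in previous if i.get("status") == "ok"}
    changes = []
    for item in current:
        before = old.get(item["sourceUrl"])
        if item.get("status") != "ok" or before is None:
            continue  # nothing reliable to compare against
        before_pair = (before.get("rating"), before.get("ratingCount"))
        after_pair = (item.get("rating"), item.get("ratingCount"))
        if before_pair != after_pair:
            changes.append(
                {
                    "sourceUrl": item["sourceUrl"],
                    "name": item.get("name"),
                    "rating": before_pair[0:1] + after_pair[0:1],
                    "ratingCount": before_pair[1:2] + after_pair[1:2],
                }
            )
    return changes
```

Each reported change carries `(old, new)` pairs, so small rating drifts can be filtered out by the caller if only larger movements matter.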

Running Locally

Install dependencies:

$ pip install -r requirements.txt

Run with Apify CLI:

$ apify run -i input.json

Example input.json:

{
  "urls": [
    "https://www.fragrantica.com/perfume/Chanel/Bleu-de-Chanel-9099.html"
  ],
  "fetcher": "auto"
}

FAQ

Does this Actor scrape Fragrantica search results?

No. This Actor is for specific Fragrantica perfume detail URLs. It does not crawl search pages or pagination.

Can I scrape multiple perfumes at once?

Yes. Add multiple perfume URLs to the urls array. The Actor returns one dataset item per URL.

Does it extract perfume notes?

Yes. It extracts top notes, middle notes, base notes, and main accords where available.

Does it scrape reviews?

No. Review scraping is not part of the current implementation.

What happens when a URL fails?

The Actor still pushes a dataset item with status: "failed" and an error message.
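
Since failed URLs still appear in the dataset, a follow-up run can be built from them. A minimal sketch, assuming the items follow the Output Schema: `partition_results` and `retry_input` are hypothetical helpers, and retrying with the `stealthy` fetcher at a concurrency of 1 is an illustrative choice rather than a documented recommendation.

```python
def partition_results(items):
    """Split dataset items into (ok, failed) lists based on the status field."""
    ok = [i for i in items if i.get("status") == "ok"]
    failed = [i for i in items if i.get("status") != "ok"]
    return ok, failed


def retry_input(failed, fetcher="stealthy", max_concurrency=1):
    """Build an Actor input that re-runs only the failed source URLs."""
    # dict.fromkeys removes duplicate URLs while keeping their original order.
    urls = list(dict.fromkeys(i["sourceUrl"] for i in failed))
    return {"urls": urls, "fetcher": fetcher, "maxConcurrency": max_concurrency}
```

The resulting dictionary can be saved as `input.json` and passed to a new run, optionally with a `proxyUrl` added if the failures were caused by blocking.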

Should I use a proxy?

Use a proxy if Fragrantica blocks or challenges the runtime IP. Keep concurrency low for better reliability.