Leafly Scraper

Extract comprehensive dispensary and product data from Leafly.com with zero configuration. This Actor uses pre-configured extraction patterns to scrape dispensary details, product catalogs, pricing, strain information, and special offers — all without requiring external API keys or manual setup.

Pricing: Pay per event

Rating: 5.0 (1)

Developer: Paradox Analytics

Maintained by: Community


Leafly Dispensary Scraper - Apify Actor

Extract comprehensive dispensary and product data from Leafly.com with enhanced pagination and flexible scraping modes.

🚀 Quick Start

Default Mode (Fast - 2-5 minutes)

{
  "mode": "single_url",
  "dispensaryUrl": "https://www.leafly.com/dispensary-info/curaleaf---western",
  "proxyType": "residential"
}

State Discovery Mode

{
  "mode": "state",
  "state": "nevada",
  "maxStores": 10,
  "proxyType": "residential"
}
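
The same inputs can be submitted from code. The following is a minimal sketch using the `apify-client` Python package. `build_input` and `run_actor` are hypothetical helper names, `<ACTOR_ID>` is a placeholder for the ID shown on the Actor's store page, and reading the token from `APIFY_TOKEN` is an assumption about your environment.

```python
import os


def build_input(mode, **options):
    """Assemble an input dict using the parameter names documented in this README."""
    run_input = {"mode": mode, "proxyType": "residential"}
    run_input.update(options)
    return run_input


def run_actor(actor_id, run_input, token=None):
    """Start the Actor, wait for it to finish, and return the default dataset items."""
    # Imported lazily so build_input() works without the package installed.
    from apify_client import ApifyClient  # pip install apify-client

    client = ApifyClient(token or os.environ["APIFY_TOKEN"])
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


single = build_input(
    "single_url",
    dispensaryUrl="https://www.leafly.com/dispensary-info/curaleaf---western",
)
state = build_input("state", state="nevada", maxStores=10)

# <ACTOR_ID> is a placeholder; copy the real value from the Actor's store page.
# items = run_actor("<ACTOR_ID>", single)
```

The network call is left commented out so the snippet can be run without credentials; uncomment it once the Actor ID and token are in place.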

✨ Features

V2.0 Updates (Latest)

  • Single Dispensary URL Mode: Scrape specific dispensaries in 2-5 minutes
  • Enhanced Pagination: Burst windows + fixpoint convergence (98-99% capture)
  • Duplicate Detection: No more infinite loops
  • Apify Proxy Integration: Native residential/datacenter proxy support
  • 5-Minute Compliance: Default mode completes within Apify's 5-minute test limit

Core Features

  • 🏪 Dispensary Details: Name, address, hours, ratings, license info
  • 🌿 Product Catalog: Full menu with THC/CBD, pricing, strains
  • 💰 Special Offers: Deals, discounts, promotions
  • 📊 Strain Information: Effects, terpenes, genetics, ratings
  • 🔄 Multi-State Support: US states with legal cannabis markets (see Supported States)

📊 Input Configuration

Scraping Modes

| Mode | Description | Completion Time | Use Case |
|------|-------------|-----------------|----------|
| single_url | Scrape one dispensary | 2-5 minutes | Testing, monitoring specific stores |
| state | Discover multiple dispensaries | 10+ minutes | Market research, full state data |

Required Parameters

  • mode (string): "single_url" or "state"
  • dispensaryUrl (string): Direct Leafly dispensary URL (required for single_url mode)

Optional Parameters

  • state (string): US state slug (only for state mode) - Default: "nevada"
  • maxStores (integer): Max dispensaries to scrape (state mode) - Default: 5
  • taskCount (integer): Parallel workers (state mode) - Default: 3
  • proxyType (string): "residential", "datacenter", or "none" - Default: "residential"
  • includeOffers (boolean): Extract deals - Default: true
  • includeStrainData (boolean): Detailed strain info - Default: true
  • outputFormat (string): "dataset", "csv", or "both" - Default: "dataset"
  • debugMode (boolean): Verbose logging - Default: false
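
To catch configuration mistakes before paying for a run, the parameters above can be checked locally. This is a sketch only: `normalize_input` is a hypothetical helper, not part of the Actor, and it assumes that state-mode keys are simply ignored when `mode` is `"single_url"`.

```python
# Defaults and allowed values copied from the parameter lists above.
DEFAULTS = {
    "state": "nevada",
    "maxStores": 5,
    "taskCount": 3,
    "proxyType": "residential",
    "includeOffers": True,
    "includeStrainData": True,
    "outputFormat": "dataset",
    "debugMode": False,
}
ALLOWED = {
    "mode": {"single_url", "state"},
    "proxyType": {"residential", "datacenter", "none"},
    "outputFormat": {"dataset", "csv", "both"},
}


def normalize_input(raw):
    """Return a copy of raw with documented defaults applied, or raise ValueError."""
    if "mode" not in raw:
        raise ValueError("mode is required")
    merged = {**DEFAULTS, **raw}
    for key, allowed in ALLOWED.items():
        if merged[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}, got {merged[key]!r}")
    # Assumption: state-mode defaults are harmless in single_url mode.
    if merged["mode"] == "single_url" and not merged.get("dispensaryUrl"):
        raise ValueError("dispensaryUrl is required when mode is 'single_url'")
    return merged


config = normalize_input({"mode": "state", "state": "california"})
```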

📤 Output Data

Named Datasets

Dispensaries Dataset (dispensaries)

{
  "name": "Curaleaf - Western",
  "address": "5905 S Eastern Ave, Las Vegas, NV 89119",
  "rating": 4.6,
  "review_count": 2847,
  "license_type": "recreational",
  "delivery_available": true,
  "hours_today": "9:00 AM - 12:00 AM"
}

Products Dataset (products)

{
  "product_name": "Zkittlez - Live Rosin",
  "brand": "Qualcan",
  "category": "concentrates",
  "thc_percent": 78.5,
  "cbd_percent": 0.2,
  "price": 45.0,
  "weight_grams": 1.0,
  "strain_type": "indica",
  "in_stock": true
}
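
Because each product record carries both `price` and `weight_grams`, a per-gram price is easy to derive for comparisons across package sizes. A minimal sketch, assuming either field may be missing in real records (`price_per_gram` is a hypothetical helper):

```python
def price_per_gram(product):
    """Return price divided by weight in grams, or None if either is missing or zero."""
    price = product.get("price")
    grams = product.get("weight_grams")
    if price is None or not grams:
        return None
    return round(price / grams, 2)


sample = {"product_name": "Zkittlez - Live Rosin", "price": 45.0, "weight_grams": 1.0}
print(price_per_gram(sample))  # 45.0
```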

Offers Dataset (offers)

{
  "dispensary_name": "Curaleaf - Western",
  "offer_title": "20% Off All Concentrates",
  "deal_type": "percentage",
  "discount_percent": 20,
  "valid_until": "2025-11-30"
}

Default Dataset

Combines the records from the named datasets. Each record carries a type field, which makes filtering easy and keeps result counts accurate.
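
Records from the default dataset can be split back into groups by their `type` field. A minimal sketch: the README does not list the exact `type` values, so the labels in the sample data (`"dispensary"`, `"product"`) are assumptions, and the function groups on whatever values it finds.

```python
from collections import defaultdict


def split_by_type(items):
    """Group combined dataset records by their 'type' field."""
    groups = defaultdict(list)
    for item in items:
        # Records without a type are kept under "unknown" rather than dropped.
        groups[item.get("type", "unknown")].append(item)
    return dict(groups)


# Sample records; the type labels here are illustrative, not confirmed values.
combined = [
    {"type": "dispensary", "name": "Curaleaf - Western"},
    {"type": "product", "product_name": "Zkittlez - Live Rosin"},
    {"type": "product", "product_name": "Another Product"},
]
groups = split_by_type(combined)
counts = {kind: len(rows) for kind, rows in groups.items()}
```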

Key-Value Store

  • summary: Run metadata and statistics
  • CSV files (if requested via outputFormat)

🌐 Supported States

  • Nevada, California, Colorado, Oregon, Washington
  • Arizona, Michigan, Illinois, Massachusetts
  • New York, Maine, Montana, New Jersey
  • New Mexico, Oklahoma, Pennsylvania

🔧 Advanced Configuration

Example: Maximum Data Extraction

{
  "mode": "state",
  "state": "california",
  "maxStores": 50,
  "taskCount": 10,
  "proxyType": "residential",
  "includeOffers": true,
  "includeStrainData": true,
  "outputFormat": "both",
  "debugMode": false
}

Example: Fast Testing

{
  "mode": "single_url",
  "dispensaryUrl": "https://www.leafly.com/dispensary-info/curaleaf---western",
  "proxyType": "datacenter",
  "includeStrainData": false,
  "outputFormat": "dataset"
}

💰 Cost Optimization

Actor Usage Pricing

The Actor uses a per-event pricing model:

  • Actor Start: $0.20 per run (first 5 seconds waived)
  • Browser Open: $0.10 per run
  • Page Scraped: $0.002 per page
  • Results: $0.005 per item in dataset

Estimated Costs:

  • Single Dispensary: about $1.30 plus proxy costs (assuming ~200 products)
  • 5 Dispensaries: about $5.40 plus proxy costs (assuming ~1000 products)
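
The estimates follow from the per-event prices listed above. The sketch below reproduces the arithmetic. It does not model the waived first 5 seconds or proxy usage, and the page count of 50 in the second call is an assumption chosen to match the $5.40 figure, since the README does not state how many pages a run scrapes.

```python
from decimal import Decimal

# Per-event prices in USD, as listed in this section.
PRICES = {
    "actor_start": Decimal("0.20"),
    "browser_open": Decimal("0.10"),
    "page_scraped": Decimal("0.002"),
    "result_item": Decimal("0.005"),
}


def estimate_cost(items, pages=0):
    """Event charges in USD for one run, excluding proxy usage."""
    return (
        PRICES["actor_start"]
        + PRICES["browser_open"]
        + PRICES["page_scraped"] * pages
        + PRICES["result_item"] * items
    )


print(estimate_cost(items=200))             # 1.300
print(estimate_cost(items=1000, pages=50))  # 5.400
```

`Decimal` is used instead of floats so that the cent-level sums come out exact.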

Proxy Credits

  • Single URL Mode: ~200-500 proxy requests per run
  • State Mode (5 stores): ~1000-2500 proxy requests per run
  • Residential Proxies: Higher quality, better success rate, higher cost
  • Datacenter Proxies: Lower cost, higher block rate

Runtime

  • Single URL: 2-5 minutes, so compute costs are minimal
  • State Mode: roughly maxStores × 2-3 minutes
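
The state-mode rule of thumb (maxStores × 2-3 minutes) can be turned into a quick range when choosing an Actor timeout. A rough sketch only: `state_mode_runtime_minutes` is a hypothetical helper, and it does not account for any speed-up from `taskCount` parallel workers.

```python
def state_mode_runtime_minutes(max_stores, per_store=(2, 3)):
    """Return a (low, high) runtime estimate in minutes for a state-mode run."""
    low, high = per_store
    return max_stores * low, max_stores * high


print(state_mode_runtime_minutes(10))  # (20, 30)
```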

🛠️ Technical Details

Architecture

  • Browser: Camoufox (anti-detection) + Playwright fallback
  • Extraction: Pre-configured patterns (no OpenAI API key required)
  • Pagination: Burst windows with duplicate detection
  • Anti-Blocking: Session warming, proxy rotation, delays
  • Data Quality: 98-99% product capture rate

Performance

  • ~200-500 products per dispensary
  • ~1-2 minutes per dispensary
  • Zero duplicate products
  • Proper pagination handling

🐛 Troubleshooting

Common Issues

Actor times out at 5 minutes

  • Increase the Actor timeout in the Apify Console run settings
  • Use single_url mode for faster runs
  • Reduce maxStores in state mode

No products extracted

  • Check if dispensary URL is valid
  • Enable debugMode to see detailed logs
  • Verify proxy type is set correctly

Proxy errors

  • Check Apify proxy subscription status
  • Try switching between residential/datacenter
  • Ensure sufficient proxy credits

📚 Documentation

  • ./DEPLOYMENT_GUIDE_V2.md - Full deployment instructions
  • ./ACTOR_DESCRIPTION.md - Detailed feature list
  • Apify SDK Docs - Python SDK reference

🎯 Use Cases

  1. Market Research: Analyze product availability across regions
  2. Price Monitoring: Track pricing trends and competitive analysis
  3. Inventory Tracking: Monitor stock availability
  4. Compliance: Track licensed dispensaries and products
  5. Deal Aggregation: Collect special offers and promotions

📞 Support

For issues or questions:

  1. Check the Troubleshooting section
  2. Enable debugMode and review the logs
  3. Contact the developer via Apify Actor support

📄 License

This Actor is provided as-is for data extraction purposes. Users are responsible for complying with Leafly's Terms of Service and applicable laws regarding data scraping.


🔄 Version History

V2.0 (Current)

  • Added single dispensary URL mode
  • Enhanced pagination with burst windows
  • Fixed infinite loop issues
  • Integrated production-json improvements
  • Optimized for Apify's 5-minute test requirement

V1.0

  • Initial release
  • State-based scraping only
  • Basic pagination

Ready to extract cannabis data at scale! 🌿