Cannabis License Scraper - National Aggregator
Pricing
Pay per event
National cannabis license database. Federates state cannabis boards into one normalized dataset: license number, type, status, business name, address, expiration. For cannabis B2B sales: software, payments, packaging, labs, insurance, M&A intel.
Developer
BowTiedRaccoon
Cannabis License Scraper - National License Aggregator
Aggregate state cannabis-control-board license rosters into one normalized dataset. Returns license number, type, status, business name, address (with coordinates where the source publishes them), primary contact, equity flags, and adult-use vs medical authorization across roughly 6,500 US cannabis license records.
Cannabis License Aggregator Features
- Federates six jurisdictions out of the box. New York OCM (adult-use), New York hemp, Massachusetts CCC, Connecticut DCP, Boulder County Colorado, and Utah DOH.
- Normalizes a dozen distinct status codes and license-type taxonomies into a single nine-value enum, so you can filter license_status = 'active' and have it mean the same thing in every state.
- Round-robin pagination across jurisdictions. You see records from every requested state before any one dominates the budget.
- Filters by license type (retailer, cultivator, manufacturer, processor, testing-lab, microbusiness, distributor, delivery, hemp), active-only status, adult-use only, or medical only.
- Flags social-equity / minority-owned licensees where the source state publishes that signal.
- Pure API access — no browser, no proxies, no CAPTCHA. Just JSON from public open-data portals.
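The status normalization described in the list above can be pictured as a lookup table with a fallback. The sketch below is illustrative only: the canonical values come from the output-field table, but the alias entries and the `normalize_status` helper are assumptions, not the actor's actual mapping.

```python
from typing import Optional

# Canonical statuses as listed in the output-field table.
CANONICAL_STATUSES = {
    "active", "expired", "suspended", "revoked", "pending",
    "inactive", "surrendered", "denied", "other",
}

# Hypothetical raw labels; real state portals may publish different strings.
STATUS_ALIASES = {
    "license revoked": "revoked",
    "application pending": "pending",
    "voluntarily surrendered": "surrendered",
}


def normalize_status(raw: Optional[str]) -> str:
    """Map a raw status string onto the canonical enum, defaulting to 'other'."""
    key = (raw or "").strip().lower()
    if key in CANONICAL_STATUSES:
        return key
    return STATUS_ALIASES.get(key, "other")
```

Falling back to other rather than raising keeps an unexpected label from one state from failing the whole run; the raw string stays available in license_status_raw.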
Who Uses Cannabis License Data?
- Cannabis B2B sales (seed-to-sale, payments, packaging) — Pull active retailers and manufacturers by state, feed them into a CRM, run an outbound campaign by Friday.
- Testing labs and ancillary services — Identify cultivators and processors by state with cultivation-tier and environment fields where available.
- Cannabis-only insurance and finance — Active-license filter plus expiration date plus operational status — the inputs an underwriting model actually needs.
- M&A intelligence and market research — Track license issuance, expirations, and equity-program participation across states without rebuilding the same scraper six times.
- Compliance and license verification — Cross-reference a business name or license number against the source state's official roster.
How the Aggregator Works
- Pick jurisdictions by slug (ny-ocm, ma-ccc, ct-dcp, ...) or by state code (NY, MA, CT). Leave both empty to pull from every supported jurisdiction.
- Optionally filter by license type, active-only, adult-use, or medical. The filter runs after normalization, so license_type = 'retailer' matches dispensaries in NY, retailers in MA, hybrid retailers in CT, and pharmacies in UT in one query.
- The aggregator round-robins across jurisdictions, fetching a page from each in turn. Snapshot-style sources (Massachusetts) are loaded once and sliced client-side.
- Records come back in a single flat schema, ready for a warehouse, a CRM, or a CSV export.
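The round-robin step can be sketched as a generator that takes one page from each source per pass until the item budget is spent. This is a minimal illustration, not the actor's code: the `round_robin` name and the shape of `pages_by_source` (slug mapped to an iterable of pages) are assumptions.

```python
from typing import Dict, Iterable, Iterator, List


def round_robin(pages_by_source: Dict[str, Iterable[List[dict]]],
                max_items: int) -> Iterator[dict]:
    """Yield records one page per source per pass, stopping at max_items."""
    iterators = {slug: iter(pages) for slug, pages in pages_by_source.items()}
    emitted = 0
    while iterators and emitted < max_items:
        for slug in list(iterators):
            if emitted >= max_items:
                return
            page = next(iterators[slug], None)
            if page is None:
                # This source is exhausted; drop it from the rotation.
                del iterators[slug]
                continue
            for record in page:
                if emitted >= max_items:
                    return
                yield record
                emitted += 1
```

With a small budget, every source contributes its first page before any source contributes a second, which is the behavior the feature list describes.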
Input
{
  "jurisdictions": ["ny-ocm", "ma-ccc"],
  "licenseTypes": ["retailer"],
  "onlyActive": true,
  "maxItems": 500
}
| Field | Type | Default | Description |
|---|---|---|---|
| jurisdictions | array[string] | (all) | Jurisdiction slugs. See supported list below. |
| states | array[string] | (none) | 2-letter state codes. Expands to every supported jurisdiction in the state. |
| licenseTypes | array[string] | (none) | Filter to specific normalized types: retailer, cultivator, manufacturer, processor, testing-lab, microbusiness, distributor, delivery, hemp. |
| onlyActive | boolean | false | Drop expired, suspended, revoked, surrendered, denied, and pending licenses. |
| onlyAdultUse | boolean | false | Filter to recreational / adult-use licensees only. |
| onlyMedical | boolean | false | Filter to medical licensees only. Hybrid licensees pass both adult-use and medical filters. |
| maxItems | integer | 15 | Maximum records to return across all jurisdictions. |
| proxyConfiguration | object | (off) | Optional Apify proxy. Not needed for any supported portal. |
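To make the filter semantics concrete, here is a rough sketch of how the four filter inputs could be applied to an already-normalized record. The `passes_filters` helper is hypothetical, and the handling of onlyAdultUse and onlyMedical set together (only hybrid licensees pass) is an assumption that the source does not spell out.

```python
from typing import Optional, Sequence


def passes_filters(record: dict,
                   license_types: Optional[Sequence[str]] = None,
                   only_active: bool = False,
                   only_adult_use: bool = False,
                   only_medical: bool = False) -> bool:
    """Return True when a normalized record survives every requested filter."""
    if license_types and record.get("license_type") not in license_types:
        return False
    if only_active and record.get("license_status") != "active":
        return False
    if only_adult_use and not record.get("is_adult_use"):
        return False
    if only_medical and not record.get("is_medical"):
        return False
    return True
```

Because a hybrid record carries both is_adult_use and is_medical set to true, it passes either flag on its own, matching the note in the onlyMedical row.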
Supported Jurisdictions
| Slug | Jurisdiction | State | Records (approx.) |
|---|---|---|---|
| ny-ocm | New York Office of Cannabis Management (Adult-Use) | NY | 2,800 |
| ny-hemp | New York OCM (Cannabinoid Hemp) | NY | 2,700 |
| ma-ccc | Massachusetts Cannabis Control Commission | MA | 960 |
| ct-dcp | Connecticut DCP (Hybrid + Medical) | CT | 44 |
| co-boulder | Boulder County, Colorado | CO | 33 |
| ut-doh | Utah Department of Health (Medical Pharmacies) | UT | 10 |
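The way the jurisdictions and states inputs expand against this table can be sketched as follows. The slug-to-state map is copied from the table; the `resolve_jurisdictions` helper, the union behavior when both inputs are given, and the silent skipping of unknown slugs are assumptions for illustration.

```python
from typing import List, Sequence

# Slug -> state code, taken from the supported-jurisdictions table.
SUPPORTED = {
    "ny-ocm": "NY",
    "ny-hemp": "NY",
    "ma-ccc": "MA",
    "ct-dcp": "CT",
    "co-boulder": "CO",
    "ut-doh": "UT",
}


def resolve_jurisdictions(jurisdictions: Sequence[str] = (),
                          states: Sequence[str] = ()) -> List[str]:
    """Expand slugs and state codes into a list of supported slugs."""
    if not jurisdictions and not states:
        return list(SUPPORTED)  # both empty: every supported jurisdiction
    wanted_states = {code.upper() for code in states}
    return [slug for slug, state in SUPPORTED.items()
            if slug in jurisdictions or state in wanted_states]
```

For example, passing states=["NY"] selects both ny-ocm and ny-hemp, which is why a New York run can return hemp licensees unless licenseTypes narrows it.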
Example: pull every active retailer in New York
{
  "states": ["NY"],
  "licenseTypes": ["retailer"],
  "onlyActive": true,
  "maxItems": 1000
}
Example: cultivators across all jurisdictions, active only
{
  "licenseTypes": ["cultivator"],
  "onlyActive": true,
  "maxItems": 500
}
Cannabis License Output Fields
{
  "license_number": "OCM-RETL-25-000306",
  "business_name": "100 North 3rd Ltd",
  "dba_name": "7 Leaf Clover",
  "license_type": "retailer",
  "license_type_raw": "Adult-Use Retail Dispensary License",
  "license_status": "active",
  "license_status_raw": "Active",
  "license_issued_date": "2025-03-24",
  "license_effective_date": "2025-03-24",
  "license_expiration_date": "2027-03-24",
  "application_number": "OCMRETL-2023-000090",
  "primary_contact_name": "Jennifer Babaian",
  "phone": "",
  "email": "",
  "website": "",
  "address": "132 Metropolitan Ave",
  "city": "Brooklyn",
  "state": "NY",
  "zip": "11249",
  "county": "Kings",
  "region": "Brooklyn",
  "lat": null,
  "lng": null,
  "is_adult_use": true,
  "is_medical": false,
  "is_social_equity": true,
  "priority_status": "Women-Owned Business, Minority-Owned Business",
  "operational_status": "Active",
  "commence_operations_date": "",
  "cultivation_environment": "",
  "cultivation_tier": "",
  "license_fee_amount": null,
  "source_jurisdiction": "ny-ocm",
  "source_state": "NY",
  "source_url": "https://data.ny.gov/Government-Finance/Current-OCM-Licenses/jskf-tt3q"
}
| Field | Type | Description |
|---|---|---|
| license_number | string | State-assigned license number. Primary key within the source. |
| business_name | string | Legal business / entity name. |
| dba_name | string | Doing-business-as / trade name when distinct from legal name. |
| license_type | string | Normalized type: retailer, cultivator, manufacturer, processor, testing-lab, microbusiness, distributor, delivery, hemp, other. |
| license_type_raw | string | Raw type string from the source; useful when you need the state's original taxonomy. |
| license_status | string | Normalized status: active, expired, suspended, revoked, pending, inactive, surrendered, denied, other. |
| license_status_raw | string | Raw status string from the source. |
| license_issued_date | string | Date the license was originally issued (YYYY-MM-DD). |
| license_effective_date | string | Date the current term took effect (YYYY-MM-DD). |
| license_expiration_date | string | Date the license expires (YYYY-MM-DD). |
| application_number | string | Original application identifier when supplied. |
| primary_contact_name | string | Primary contact / responsible party name when supplied. |
| phone | string | Business phone number when published. |
| email | string | Business email when published (Massachusetts only, in practice). |
| website | string | Business website URL when published. |
| address | string | Establishment street address. |
| city | string | Establishment city. |
| state | string | Two-letter state code. |
| zip | string | ZIP / postal code. |
| county | string | County when published. |
| region | string | Sub-state region label (e.g. NY OCM exposes Brooklyn, Hudson Valley). |
| lat | number | Latitude (WGS84) when published. |
| lng | number | Longitude (WGS84) when published. |
| is_adult_use | boolean | true when the license authorizes recreational / adult-use sales. |
| is_medical | boolean | true when the license authorizes medical cannabis activities. |
| is_social_equity | boolean | true when the licensee qualifies under a social-equity / minority-owned program. |
| priority_status | string | Equity / priority program label from the source state. |
| operational_status | string | Operational status (e.g. Active, Operating, Non-Operational). |
| commence_operations_date | string | Date the licensee commenced operations (Massachusetts publishes this). |
| cultivation_environment | string | Indoor, Outdoor, or mixed, when the source publishes it. |
| cultivation_tier | string | Canopy tier (state-specific). |
| license_fee_amount | number | License fee paid in USD when published. |
| source_jurisdiction | string | Slug of the jurisdiction that produced this record. |
| source_state | string | Two-letter source state code. |
| source_url | string | URL of the source dataset on the state open-data portal. |
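As one example of consuming these fields downstream, the sketch below picks out licenses that expire within a given window, the kind of check an underwriting or renewal-outreach workflow might run. The `expiring_within` helper is hypothetical; it relies only on the documented YYYY-MM-DD format of license_expiration_date and skips records where that field is empty or malformed.

```python
from datetime import date, timedelta
from typing import Iterable, List, Optional


def expiring_within(records: Iterable[dict], days: int,
                    today: Optional[date] = None) -> List[dict]:
    """Return records whose license expires between today and today + days."""
    today = today or date.today()
    cutoff = today + timedelta(days=days)
    hits = []
    for record in records:
        raw = record.get("license_expiration_date") or ""
        try:
            expires = date.fromisoformat(raw)
        except ValueError:
            continue  # empty or unparseable date: no expiration published
        if today <= expires <= cutoff:
            hits.append(record)
    return hits
```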
FAQ
How do I scrape cannabis licenses across multiple US states?
Cannabis License Aggregator ships with six jurisdictions across five states and normalizes them into a single schema. Pick jurisdictions by slug, by state code, or leave both empty to hit every supported portal in one run.
How much does Cannabis License Aggregator cost to run?
Cannabis License Aggregator runs on pay-per-event pricing: $0.10 per actor start plus $0.00125 per record. A full sweep of every supported jurisdiction (~6,500 records) is about $8.
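The arithmetic behind that estimate is simple enough to check directly. The rates below are the ones quoted above; the `estimate_cost` helper is an illustrative name and not part of the actor.

```python
START_FEE_USD = 0.10       # charged once per actor start
PER_RECORD_USD = 0.00125   # charged per returned record


def estimate_cost(records: int, starts: int = 1) -> float:
    """Estimate the pay-per-event charge in USD for a given record count."""
    return round(starts * START_FEE_USD + records * PER_RECORD_USD, 4)
```

For 6,500 records in a single run this works out to 0.10 + 6,500 × 0.00125 = 8.225 USD, which is where the "about $8" figure comes from.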
Does Cannabis License Aggregator need proxies?
No. All supported sources are public state-government open-data portals (Socrata SODA and direct JSON snapshots) designed for third-party consumption. The actor defaults to direct requests.
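For readers unfamiliar with Socrata, SODA endpoints are plain HTTPS URLs paged with the `$limit` and `$offset` query parameters. The sketch below only builds such a URL. The dataset ID is taken from the source_url in the sample record, and whether the actor queries this exact endpoint is an assumption; the `soda_page_url` helper is hypothetical.

```python
from urllib.parse import urlencode


def soda_page_url(domain: str, dataset_id: str, limit: int, offset: int) -> str:
    """Build a Socrata SODA JSON URL for one page of a dataset."""
    # safe="$" keeps the leading dollar sign of SODA parameters readable.
    query = urlencode({"$limit": limit, "$offset": offset}, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


# Third page of 1,000 rows from the New York OCM dataset (assumed endpoint).
url = soda_page_url("data.ny.gov", "jskf-tt3q", limit=1000, offset=2000)
```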
Can I filter to only active licenses?
Yes. onlyActive: true drops every record whose normalized status is anything other than active — that means no expired, no suspended, no revoked, no pending applications. This is what you want for outbound sales lists.
What's the difference between license_type and license_type_raw?
license_type is the normalized value, one of ten canonical values (nine named categories plus other), so a query for retailer matches dispensaries in NY, retailers in MA, hybrid retailers in CT, and pharmacies in UT. license_type_raw is whatever the source state shipped, for when you need the original taxonomy.
Which states are coming next?
The most-requested additions are California (DCC), Washington (LCB), Oregon (OLCC), Illinois, and Michigan. California publishes licensee data behind CloudFront, Washington publishes only Excel files, and Oregon publishes PDFs. Each requires a per-state adapter. File a request for the state you need.
Need More Features?
Need a state that isn't in the registry, owner-history fields, or violations data? File an issue or get in touch.
Why Use the Cannabis License Aggregator?
- One schema, six jurisdictions: Query once, get normalized results from NY, MA, CT, CO, and UT. No per-state ETL.
- Built for cannabis B2B sales: The licenseTypes filter, onlyActive status, and the social-equity flag cover the screens that seed-to-sale, payments, packaging, and insurance vendors ask for first.
- Cheap to run: $0.00125 per record. A national sweep is coffee money, not a budget line item.