Luma/Cerebral Valley
Pricing
from $10.00 / 1,000 results
Scrapes Luma and Cerebral Valley for events by city, date range, and keywords. Returns a normalized dataset: title, URL, start/end time, venue, city, organizer, description, price. Deduplicated and filtered so results match your chosen location and dates.
Rating: 0.0 (0)
Developer: Petros Hong
Actor stats
- Bookmarked: 0
- Total users: 7
- Monthly active users: 3
- Issues response: 25 days
- Last modified: 25 days ago
Events Agent – Apify Actor
Apify Actor that finds events by location, date window, and keywords from Luma and Cerebral Valley, then outputs a normalized, deduplicated dataset. Use separate actors for Meetup or Google if needed.
Sources
- Luma (lu.ma) – city-specific event pages (SF, NYC, London, etc.)
- Cerebral Valley – cerebralvalley.ai/events, pre-filtered by city
- Extra URLs (optional) – any event or list page URLs you add in `sources.extraUrls`
How location filtering works
| Layer | What it does |
|---|---|
| Luma city page | Builds a city-specific URL (e.g. lu.ma/sf) so only local events are listed |
| CV list pre-filter | Reads event-card text on the global CV page and drops cards that mention a conflicting city or "Online" |
| Post-crawl city filter | Rejects events whose location fields clearly belong to a different city |
| Post-crawl country filter | Rejects events from a different country (US/UK/CA normalized) |
| Online detection | Online-only events are excluded when a physical city is requested; hybrid events pass through |
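The post-crawl layers in the table can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: the `EventLocation` shape, the `attendanceType` values, the alias table, and the function names are all assumptions made for the example.

```typescript
// Hypothetical event shape; field names follow the output section,
// but the allowed attendanceType values are an assumption.
interface EventLocation {
  city?: string;
  country?: string;
  isOnline?: boolean;
  attendanceType?: "in_person" | "online" | "hybrid";
}

// Assumed alias table for the US/UK/CA normalization mentioned above.
const COUNTRY_ALIASES: Record<string, string> = {
  "usa": "US", "us": "US", "united states": "US",
  "uk": "UK", "united kingdom": "UK", "great britain": "UK",
  "ca": "CA", "canada": "CA",
};

function normalizeCountry(raw: string): string {
  const key = raw.trim().toLowerCase();
  return COUNTRY_ALIASES[key] ?? raw.trim().toUpperCase();
}

function sameText(a: string, b: string): boolean {
  return a.trim().toLowerCase() === b.trim().toLowerCase();
}

function passesLocationFilter(
  event: EventLocation,
  wanted: { city?: string; country?: string },
): boolean {
  // Online-only events are dropped when a physical city is requested; hybrid passes.
  if (wanted.city && event.isOnline && event.attendanceType !== "hybrid") return false;
  // Reject only when the event clearly names a different city.
  if (wanted.city && event.city && !sameText(event.city, wanted.city)) return false;
  // Secondary filter: reject a clearly different country after normalization.
  if (wanted.country && event.country &&
      normalizeCountry(event.country) !== normalizeCountry(wanted.country)) return false;
  return true;
}
```

Events with missing location fields pass through in this sketch, which matches the "clearly belong to a different city" wording: only a positive mismatch rejects an event.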
Input schema (summary)
| Field | Type | Default |
|---|---|---|
| `location` | `{ city?, region?, country? }` | `{}` – city filters by city; country adds secondary filter |
| `keywords` | `string[]` | `[]` – annotates events with `matchedKeywords` |
| `startDate` | date string | today (America/Los_Angeles) |
| `endDate` | date string | start + 14 days |
| `sources` | `{ luma, cerebralValley, extraUrls[] }` | both on, no extras |
| `maxEventsPerSource` | integer, 1–50 | 25 |
| `headless` | boolean | `true` |
| `proxyConfiguration` | Apify proxy object | optional |
| `debug` | boolean | `false` |
Date formats accepted: YYYY-MM-DD, MM/DD/YYYY, DD.MM.YYYY. Start date must be on or before end date.
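The date rules above (three accepted formats, end defaulting to start + 14 days, start on or before end) can be sketched as below. The function names are hypothetical and the code is an illustration rather than the actor's parser. To keep the example small, "today" is passed in by the caller, so the America/Los_Angeles default is not modeled here.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Builds a UTC calendar date, returning null for impossible dates.
// Date.UTC rolls invalid days over (Feb 30 becomes Mar 2), so check the round trip.
function buildUtcDate(y: number, m: number, d: number): Date | null {
  const date = new Date(Date.UTC(y, m - 1, d));
  const ok = date.getUTCFullYear() === y && date.getUTCMonth() === m - 1 && date.getUTCDate() === d;
  return ok ? date : null;
}

// Accepts YYYY-MM-DD, MM/DD/YYYY, and DD.MM.YYYY; anything else yields null.
function parseDateInput(raw: string): Date | null {
  const s = raw.trim();
  const iso = /^(\d{4})-(\d{2})-(\d{2})$/.exec(s);
  if (iso) return buildUtcDate(Number(iso[1]), Number(iso[2]), Number(iso[3]));
  const us = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(s);
  if (us) return buildUtcDate(Number(us[3]), Number(us[1]), Number(us[2]));
  const eu = /^(\d{1,2})\.(\d{1,2})\.(\d{4})$/.exec(s);
  if (eu) return buildUtcDate(Number(eu[3]), Number(eu[2]), Number(eu[1]));
  return null;
}

// Applies the defaults and the start <= end check.
function resolveDateWindow(today: Date, startRaw?: string, endRaw?: string): { start: Date; end: Date } {
  const start = startRaw ? parseDateInput(startRaw) : today;
  if (!start) throw new Error(`Unrecognized startDate: ${startRaw}`);
  const end = endRaw ? parseDateInput(endRaw) : new Date(start.getTime() + 14 * DAY_MS);
  if (!end) throw new Error(`Unrecognized endDate: ${endRaw}`);
  if (start.getTime() > end.getTime()) throw new Error("startDate must be on or before endDate");
  return { start, end };
}
```

Because the separator differs per format (`-`, `/`, `.`), `03/01/2025` is read as March 1 and `01.03.2025` as 1 March, so the two never collide.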
Run locally
- Install and build

  ```bash
  npm install
  npm run build
  ```

- Run with local storage

  ```bash
  # Linux/Mac
  export APIFY_LOCAL_STORAGE_DIR=./storage
  # Windows
  set APIFY_LOCAL_STORAGE_DIR=./storage
  npm start
  ```

  Put your input at `./storage/key_value_stores/default/INPUT.json` or use the Apify CLI.

- Example input (see `.actor/input.json`)

  ```json
  {
    "location": { "city": "San Francisco", "region": "CA", "country": "USA" },
    "keywords": ["AI", "startup", "tech"],
    "sources": { "luma": true, "cerebralValley": true, "extraUrls": [] },
    "maxEventsPerSource": 25
  }
  ```
Run timeout and event count
Default run options: timeout 3000s (50 min), memory 2048 MB.
- Quick run: `maxEventsPerSource: 5` → ~12 requests, finishes in a few minutes.
- Standard: `maxEventsPerSource: 25` → ~52 requests, well within 3000s.
- Maximum: `maxEventsPerSource: 50` → ~102 requests, fits within 3000s.
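The three request counts above are consistent with one list-page request per source plus one detail-page request per event, with both sources enabled. That reading is an inference from the numbers, not something the actor documents, and `estimateRequests` is a hypothetical helper. Extra URLs would add to the total.

```typescript
// Rough request estimate: (1 list page + N event pages) per enabled source.
// Assumption inferred from the figures 12, 52, and 102 quoted above.
function estimateRequests(maxEventsPerSource: number, enabledSources: number = 2): number {
  return enabledSources * (maxEventsPerSource + 1);
}

for (const n of [5, 25, 50]) {
  console.log(`maxEventsPerSource=${n} -> ~${estimateRequests(n)} requests`);
}
```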
Output (normalized event)
Results are deduplicated (stable ID + fuzzy title/time/city match) and filtered to real events only.
Each dataset item includes:
- `url` – event page URL
- `title` – event title
- `date` – YYYY-MM-DD from start time
- `time` – HH:mm from start time
- `company` – primary company (organizer or first company involved)
- `description` – event description
Plus when available: startTime, endTime, venueName, address, city, region, country, lat, lon, isOnline, attendanceType, organizerName, companiesInvolved[], speakers[], tags, matchedKeywords, imageUrl, scrapedFromUrls, and more.
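The deduplication step described above (stable ID plus a fuzzy title/time/city match) might look something like the sketch below. It is a simplified stand-in: the ID is derived from the URL, and the "fuzzy" match is a normalized key rather than a real similarity score. The `EventItem` shape, the function names, and the assumption that `startTime` is an ISO 8601 string are all illustrative.

```typescript
interface EventItem {
  url: string;
  title: string;
  startTime?: string; // assumed ISO 8601, e.g. "2025-03-01T18:00:00-08:00"
  city?: string;
}

// Assumed stable ID: URL without scheme, "www.", query string, fragment, or trailing slash.
function stableId(url: string): string {
  return url
    .replace(/[?#].*$/, "")
    .replace(/^https?:\/\/(www\.)?/i, "")
    .replace(/\/+$/, "")
    .toLowerCase();
}

// Assumed fuzzy key: punctuation-insensitive title + start time to the minute + city.
function fuzzyKey(e: EventItem): string {
  const title = e.title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
  const minute = e.startTime ? e.startTime.slice(0, 16) : "";
  const city = (e.city ?? "").trim().toLowerCase();
  return `${title}|${minute}|${city}`;
}

// Keeps the first occurrence of each event; later copies matching by ID or key are dropped.
function dedupe(events: EventItem[]): EventItem[] {
  const seenIds = new Set<string>();
  const seenKeys = new Set<string>();
  const out: EventItem[] = [];
  for (const e of events) {
    const id = stableId(e.url);
    const key = fuzzyKey(e);
    if (seenIds.has(id) || seenKeys.has(key)) continue;
    seenIds.add(id);
    seenKeys.add(key);
    out.push(e);
  }
  return out;
}
```

Checking both sets catches the two common cases: the same page reached through different tracking URLs, and the same event listed on both Luma and Cerebral Valley under different URLs.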
Scripts
- `npm run build` – compile TypeScript to `dist/`
- `npm start` – run `dist/main.js`
- `npm run lint` – ESLint on `src/`
- `npm test` – run Jest unit tests
Extending
Add `src/sources/<name>.ts` with `enqueue*` and `parse*Page` functions, then wire them in `main.ts`.
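As a rough picture of what such a source module could contain, here is a skeleton for a made-up source called "example". Everything in it is hypothetical: the `CrawlRequest` and `ParsedEvent` shapes, the `example.com` URLs, the label, and the exact function signatures. The real signatures are whatever the existing files in `src/sources/` use, so copy one of those as the actual starting point.

```typescript
// src/sources/example.ts (hypothetical file; a real module would export both functions)

interface CrawlRequest {
  url: string;
  label: string; // assumed routing label so main.ts can pick the right parser
}

interface ParsedEvent {
  url: string;
  title: string;
}

// enqueue*: turn the requested city into the list-page request(s) for this source.
function enqueueExample(city?: string): CrawlRequest[] {
  const slug = city ? city.trim().toLowerCase().replace(/\s+/g, "-") : "";
  const url = slug ? `https://example.com/events/${slug}` : "https://example.com/events";
  return [{ url, label: "EXAMPLE_LIST" }];
}

// parse*Page: extract a normalized event from one fetched page, or null if none is found.
// A <title> regex keeps the sketch dependency-free; a real parser would read structured data.
function parseExamplePage(url: string, html: string): ParsedEvent | null {
  const match = /<title[^>]*>([^<]*)<\/title>/i.exec(html);
  const title = match ? match[1].trim() : "";
  return title ? { url, title } : null;
}
```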