Luma/Cerebral Valley

Pricing: from $10.00 / 1,000 results

Scrapes Luma and Cerebral Valley for events by city, date range, and keywords. Returns a normalized dataset: title, URL, start/end time, venue, city, organizer, description, price. Deduplicated and filtered so results match your chosen location and dates.

Rating: 0.0 (0)

Developer: Petros Hong (maintained by Community)

Actor stats

  • Bookmarked: 0
  • Total users: 7
  • Monthly active users: 3
  • Issues response: 25 days
  • Last modified: 25 days ago

Events Agent – Apify Actor

Apify Actor that finds events by location, date window, and keywords from Luma and Cerebral Valley, then outputs a normalized, deduplicated dataset. Meetup and Google are not covered; use separate Actors for those sources if needed.

Sources

  • Luma (lu.ma) – city-specific event pages (SF, NYC, London, etc.)
  • Cerebral Valley (cerebralvalley.ai/events) – pre-filtered by city
  • Extra URLs (optional) – any event or list page URLs you add in sources.extraUrls

How location filtering works

| Layer | What it does |
| --- | --- |
| Luma city page | Builds a city-specific URL (e.g. lu.ma/sf) so only local events are listed |
| CV list pre-filter | Reads event-card text on the global CV page and drops cards that mention a conflicting city or "Online" |
| Post-crawl city filter | Rejects events whose location fields clearly belong to a different city |
| Post-crawl country filter | Rejects events from a different country (US/UK/CA normalized) |
| Online detection | Online-only events are excluded when a physical city is requested; hybrid events pass through |
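
The two post-crawl filters and the online check can be pictured roughly as follows. This is a minimal sketch, not the Actor's actual code: the type names, the `attendanceType` values, the alias table, and the function `passesLocationFilter` are all assumptions made for illustration.

```typescript
// Hypothetical shapes; the Actor's real types may differ.
interface EventLocation {
  city?: string;
  country?: string;
  isOnline?: boolean;
  attendanceType?: "in_person" | "online" | "hybrid"; // assumed values
}

interface RequestedLocation {
  city?: string;
  country?: string;
}

// Assumed alias table standing in for the US/UK/CA normalization.
const COUNTRY_ALIASES = new Map<string, string>([
  ["usa", "US"], ["us", "US"], ["united states", "US"],
  ["uk", "UK"], ["united kingdom", "UK"], ["great britain", "UK"],
  ["ca", "CA"], ["canada", "CA"],
]);

function normalizeCountry(raw: string): string {
  const key = raw.trim().toLowerCase();
  return COUNTRY_ALIASES.get(key) ?? raw.trim().toUpperCase();
}

function normalizeCity(raw: string): string {
  return raw.trim().toLowerCase();
}

function passesLocationFilter(event: EventLocation, wanted: RequestedLocation): boolean {
  // Online-only events are dropped when a physical city is requested; hybrid events pass.
  if (wanted.city && event.isOnline && event.attendanceType !== "hybrid") return false;
  // Reject only on a clear conflict: a missing field is not treated as a mismatch.
  if (wanted.city && event.city && normalizeCity(event.city) !== normalizeCity(wanted.city)) {
    return false;
  }
  if (
    wanted.country && event.country &&
    normalizeCountry(event.country) !== normalizeCountry(wanted.country)
  ) {
    return false;
  }
  return true;
}
```

Treating a missing field as "no conflict" mirrors the wording "clearly belong to a different city": an event with no city data is kept rather than rejected.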

Input schema (summary)

| Field | Type | Default |
| --- | --- | --- |
| location | { city?, region?, country? } | {} (city filters by city; country adds a secondary filter) |
| keywords | string[] | [] (annotates events with matchedKeywords) |
| startDate | date string | today (America/Los_Angeles) |
| endDate | date string | start + 14 days |
| sources | { luma, cerebralValley, extraUrls[] } | both on, no extras |
| maxEventsPerSource | number, 1–50 | 25 |
| headless | boolean | true |
| proxyConfiguration | Apify proxy object | optional |
| debug | boolean | false |
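
As the keywords row notes, keywords annotate events rather than filter them. One plausible way to compute `matchedKeywords` is a case-insensitive whole-word match, sketched below. The function names and the choice of whole-word matching are assumptions; the Actor may match differently (for example by substring).

```typescript
// Escape regex metacharacters so a keyword is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: returns the keywords that occur in the text as whole words.
// Whole-word matching keeps "AI" from matching inside "Paid" or "Email".
// Note: \b only works around word characters, so a keyword like "C++" would need special handling.
function matchKeywords(text: string, keywords: string[]): string[] {
  return keywords
    .filter((kw) => kw.trim().length > 0)
    .filter((kw) => new RegExp(`\\b${escapeRegExp(kw.trim())}\\b`, "i").test(text));
}

// Usage: annotate an event from its title and description.
const annotated = {
  title: "Paid AI startup mixer",
  matchedKeywords: matchKeywords("Paid AI startup mixer", ["AI", "startup", "tech"]),
};
```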

Date formats accepted: YYYY-MM-DD, MM/DD/YYYY, DD.MM.YYYY. Start date must be on or before end date.
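
A minimal sketch of how these three formats and the start/end rule could be handled. All names here (`parseInputDate`, `resolveDateRange`) are hypothetical, two-digit day and month are assumed, and dates are treated as UTC midnight for simplicity. The default start ("today" in America/Los_Angeles) is passed in by the caller rather than computed.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns [year, month, day] for the first format that matches, else null.
function extractYmd(s: string): [number, number, number] | null {
  let m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(s); // YYYY-MM-DD
  if (m) return [Number(m[1]), Number(m[2]), Number(m[3])];
  m = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(s); // MM/DD/YYYY
  if (m) return [Number(m[3]), Number(m[1]), Number(m[2])];
  m = /^(\d{2})\.(\d{2})\.(\d{4})$/.exec(s); // DD.MM.YYYY
  if (m) return [Number(m[3]), Number(m[2]), Number(m[1])];
  return null;
}

function parseInputDate(raw: string): Date | null {
  const ymd = extractYmd(raw.trim());
  if (!ymd) return null;
  const [y, m, d] = ymd;
  const date = new Date(Date.UTC(y, m - 1, d));
  // Round-trip check rejects overflowed dates such as 2025-02-31.
  const valid =
    date.getUTCFullYear() === y && date.getUTCMonth() === m - 1 && date.getUTCDate() === d;
  return valid ? date : null;
}

function resolveDateRange(
  startRaw: string | undefined,
  endRaw: string | undefined,
  today: Date,
): { start: Date; end: Date } {
  const start = startRaw ? parseInputDate(startRaw) : today;
  if (!start) throw new Error(`Unrecognized startDate: ${startRaw}`);
  // Default window: start + 14 days.
  const end = endRaw ? parseInputDate(endRaw) : new Date(start.getTime() + 14 * DAY_MS);
  if (!end) throw new Error(`Unrecognized endDate: ${endRaw}`);
  if (start.getTime() > end.getTime()) {
    throw new Error("startDate must be on or before endDate");
  }
  return { start, end };
}
```

Note that MM/DD/YYYY and DD.MM.YYYY are told apart only by their separator, so "03/01/2025" and "01.03.2025" both mean 1 March 2025.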

Run locally

  1. Install and build

    npm install
    npm run build
  2. Run with local storage

    # Linux/Mac
    export APIFY_LOCAL_STORAGE_DIR=./storage
    # Windows
    set APIFY_LOCAL_STORAGE_DIR=./storage
    npm start

    Put your input at ./storage/key_value_stores/default/INPUT.json or use the Apify CLI.

  3. Example input (see .actor/input.json)

    {
      "location": { "city": "San Francisco", "region": "CA", "country": "USA" },
      "keywords": ["AI", "startup", "tech"],
      "sources": { "luma": true, "cerebralValley": true, "extraUrls": [] },
      "maxEventsPerSource": 25
    }

Run timeout and event count

Default run options: timeout 3000s (50 min), memory 2048 MB.

  • Quick run: maxEventsPerSource: 5 → ~12 requests, finishes in a few minutes.
  • Standard: maxEventsPerSource: 25 → ~52 requests, well within 3000s.
  • Maximum: maxEventsPerSource: 50 → ~102 requests, fits within 3000s.
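
The three request counts above are consistent with one list page plus one detail page per event for each of the two built-in sources. That reading is an inference from the numbers, not something the README states, and it ignores any `extraUrls`. Under that assumption, the estimate can be written as:

```typescript
// Hypothetical estimator: each enabled source costs 1 list request + N detail requests.
function estimateRequests(maxEventsPerSource: number, enabledSources: number = 2): number {
  return enabledSources * (1 + maxEventsPerSource);
}

const quick = estimateRequests(5);     // 2 * (1 + 5)  = 12
const standard = estimateRequests(25); // 2 * (1 + 25) = 52
const maximum = estimateRequests(50);  // 2 * (1 + 50) = 102
```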

Output (normalized event)

Results are deduplicated (stable ID + fuzzy title/time/city match) and filtered to real events only.
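
A rough sketch of two-level deduplication in this spirit. It is not the Actor's implementation: the stable ID here is just a cleaned-up URL, and the "fuzzy" step is approximated by an exact match on normalized title, start minute, and city. The names `stableId`, `fuzzyKey`, and `dedupe` are hypothetical.

```typescript
interface EventLike {
  url: string;
  title: string;
  startTime?: string; // assumed to be an ISO 8601 string
  city?: string;
}

// Stable ID: URL without scheme, "www.", query string, fragment, or trailing slash.
// Lowercasing the whole URL is a simplification that could merge case-sensitive paths.
function stableId(url: string): string {
  return url
    .trim()
    .toLowerCase()
    .replace(/^https?:\/\//, "")
    .replace(/^www\./, "")
    .replace(/[?#].*$/, "")
    .replace(/\/+$/, "");
}

// Normalized title + start minute + city. Returns null without a start time,
// so events are never merged on title alone.
function fuzzyKey(e: EventLike): string | null {
  if (!e.startTime) return null;
  const norm = (s: string | undefined): string =>
    (s ?? "").toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
  // First 16 chars of an ISO timestamp are "YYYY-MM-DDTHH:mm".
  return [norm(e.title), e.startTime.slice(0, 16), norm(e.city)].join("|");
}

// Keeps the first occurrence of each event, dropping later ID or key matches.
function dedupe<T extends EventLike>(events: T[]): T[] {
  const seen = new Set<string>();
  const out: T[] = [];
  for (const e of events) {
    const id = "id:" + stableId(e.url);
    const fk = fuzzyKey(e);
    const key = fk === null ? null : "key:" + fk;
    if (seen.has(id) || (key !== null && seen.has(key))) continue;
    seen.add(id);
    if (key !== null) seen.add(key);
    out.push(e);
  }
  return out;
}
```

The second level is what catches the same event listed on both Luma and Cerebral Valley under different URLs.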

Each dataset item includes:

  • url – event page URL
  • title – event title
  • date – YYYY-MM-DD from start time
  • time – HH:mm from start time
  • company – primary company (organizer or first company involved)
  • description – event description

Plus when available: startTime, endTime, venueName, address, city, region, country, lat, lon, isOnline, attendanceType, organizerName, companiesInvolved[], speakers[], tags, matchedKeywords, imageUrl, scrapedFromUrls, and more.
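
The `date` and `time` fields are described as derived from the start time. Assuming `startTime` is an ISO 8601 string and that the event's own wall-clock time is wanted (both assumptions, since the README does not specify a timezone policy), the derivation could be as simple as slicing the string:

```typescript
// Hypothetical helper: takes "YYYY-MM-DD" and "HH:mm" straight from an ISO 8601 timestamp,
// preserving the wall-clock time written in the string rather than converting timezones.
function deriveDateAndTime(startTime: string): { date: string; time: string } | null {
  if (!/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}/.test(startTime)) return null;
  return { date: startTime.slice(0, 10), time: startTime.slice(11, 16) };
}

const derived = deriveDateAndTime("2025-03-01T18:30:00-08:00");
// derived is { date: "2025-03-01", time: "18:30" }
```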

Scripts

  • npm run build – compile TypeScript to dist/
  • npm start – run dist/main.js
  • npm run lint – ESLint on src/
  • npm test – run Jest unit tests

Extending

Add src/sources/<name>.ts with enqueue* and parse*Page functions, then wire them in main.ts.
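
A skeleton of what such a source module might look like, following the enqueue*/parse*Page naming. Everything here is hypothetical: the source name "example", the `example.com` URL, the request label, and the minimal types, which stand in for whatever request and event types the project actually uses. In a real module these functions would be exported and then wired into main.ts.

```typescript
// Hypothetical minimal types for illustration only.
interface QueuedRequest {
  url: string;
  label: string;
}

interface ParsedEvent {
  url: string;
  title: string;
  startTime?: string;
}

// src/sources/example.ts: builds the list-page request for a city.
function enqueueExample(citySlug: string): QueuedRequest[] {
  return [
    {
      url: `https://example.com/events/${encodeURIComponent(citySlug)}`,
      label: "EXAMPLE_LIST",
    },
  ];
}

// Parses one event detail page; returns null when no title can be found.
// Regex extraction keeps the sketch dependency-free; a real parser would use the crawler's DOM access.
function parseExamplePage(url: string, html: string): ParsedEvent | null {
  const title = /<title>([^<]*)<\/title>/i.exec(html);
  if (!title) return null;
  const start = /<time[^>]*\bdatetime="([^"]+)"/i.exec(html);
  return {
    url,
    title: (title[1] ?? "").trim(),
    startTime: start ? start[1] : undefined,
  };
}
```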