Event Scraper Pro

Developed by

BarriereFix

Maintained by Community

Event Scraper Pro - Multi-Platform Event Aggregator

Professional Apify Actor that aggregates upcoming events from Eventbrite, Meetup, and Lu.ma with attendee counts, organizer info, and unified data model.

Apify TypeScript Pay-per-Event

Features

  • 🎯 Multi-Platform Aggregation: Scrapes events from Eventbrite, Meetup, and Lu.ma in one unified Actor
  • 📊 RSVP Counts: Captures publicly available RSVP numbers from Meetup and Lu.ma events
  • 🏢 Organizer Intelligence: Extracts organizer profiles, follower counts, and contact information for lead generation
  • 🔄 Smart Deduplication: Fuzzy matching across platforms to eliminate duplicate events
  • 🏷️ Industry Classification: Map events to custom industry categories for targeted filtering
  • 📅 ICS Calendar Export: Generate .ics calendar files for easy import to Google Calendar, Outlook, etc.
  • 🔗 n8n Webhook Integration: Push results to n8n workflows for automated prospecting
  • 💾 State Persistence: Tracks seen events across runs to avoid duplicates
  • 💰 Pay-per-Event Pricing: Only pay for events ingested, not compute time

Use Cases

  • Event Discovery: Find relevant events in your target cities and industries
  • Lead Generation: Identify high-engagement event organizers with follower counts
  • Market Research: Analyze event trends, topics, and attendance patterns
  • Community Outreach: Build targeted lists for partnership and sponsorship opportunities
  • Competitive Intelligence: Track competitor events and audience sizes

Data Sources

| Platform | RSVP Data | Organizer Data | Data Quality |
|---|---|---|---|
| Meetup | ✅ RSVP counts (always public) | Group members, profile URL | Excellent |
| Lu.ma | ✅ Guest counts (usually public) | Host profile, event history | Excellent |
| Eventbrite | ❌ Not available | Organizer followers, profile | Good |

Input Configuration

Basic Example

{
  "queries": [
    {
      "keywords": ["tech", "startup"],
      "locations": ["San Francisco, US"],
      "dateFrom": "2026-01-01",
      "dateTo": "2026-03-01",
      "platforms": ["eventbrite", "meetup", "luma"]
    }
  ],
  "minAttendees": 20,
  "generateICS": true,
  "webhookUrl": "https://n8n.yourdomain.com/webhook/events"
}

Note on minAttendees: This filter uses rsvpCount as the metric. Only Meetup and Lu.ma provide RSVP counts, so Eventbrite events will be included regardless of this setting.
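
To make the note concrete, here is a minimal sketch of how such a filter could behave. The `EventRecord` shape and the `passesMinAttendees` helper are hypothetical names for illustration, and treating a missing count as "keep the event" is an assumption drawn from the note, not the Actor's confirmed implementation.

```typescript
// Hypothetical minimal record shape; only the fields this filter needs.
interface EventRecord {
  platform: "eventbrite" | "meetup" | "luma";
  rsvpCount: number | null;
}

// Keeps an event when its RSVP count is unknown (null) or meets the minimum.
// Passing null through is an assumption that matches the note above:
// Eventbrite records carry no rsvpCount, so the filter cannot exclude them.
function passesMinAttendees(event: EventRecord, minAttendees: number): boolean {
  if (minAttendees <= 0) return true;        // 0 means "include everything"
  if (event.rsvpCount === null) return true; // no RSVP data, nothing to compare
  return event.rsvpCount >= minAttendees;
}

const sample: EventRecord[] = [
  { platform: "meetup", rsvpCount: 85 },
  { platform: "luma", rsvpCount: 12 },
  { platform: "eventbrite", rsvpCount: null },
];

// With minAttendees = 20 the Lu.ma event (12 RSVPs) is dropped,
// while the Eventbrite event stays because its count is unknown.
const kept = sample.filter((e) => passesMinAttendees(e, 20));
console.log(kept.map((e) => e.platform));
```

If Eventbrite events should also be size-filtered, a downstream step would need a different signal, since rsvpCount is not available for that platform.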

Advanced Example with Industry Mapping

{
  "queries": [
    {
      "keywords": ["devops", "kubernetes", "docker"],
      "locations": ["San Francisco, US", "New York, US"],
      "dateFrom": "2026-01-01",
      "dateTo": "2026-04-01",
      "platforms": ["meetup", "luma"],
      "onlineOnly": false
    }
  ],
  "minAttendees": 50,
  "includeFree": true,
  "includePaid": true,
  "industryMapping": {
    "DevOps": ["devops", "kubernetes", "docker", "ci/cd"],
    "Cloud": ["aws", "azure", "gcp", "cloud native"]
  },
  "fuzzyTitleThreshold": 0.82,
  "timeWindowMinutes": 90,
  "distanceMeters": 800,
  "webhookUrl": "https://n8n.yourdomain.com/webhook/tech-events",
  "webhookHeaders": {
    "Authorization": "Bearer YOUR_TOKEN"
  },
  "generateICS": true,
  "maxItemsPerPlatform": 1000
}
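
As a rough illustration of what an industryMapping like the one above could do, the sketch below assigns industries by keyword. The `classify` function is hypothetical, and case-insensitive substring matching against the title and topics is an assumption; the README does not describe the Actor's actual matching rule.

```typescript
type IndustryMapping = Record<string, string[]>;

// Returns every industry whose keyword list matches the title or a topic.
// Assumption: case-insensitive substring matching on title + topics.
function classify(title: string, topics: string[], mapping: IndustryMapping): string[] {
  const haystack = [title, ...topics].join(" ").toLowerCase();
  return Object.entries(mapping)
    .filter(([, keywords]) => keywords.some((k) => haystack.includes(k.toLowerCase())))
    .map(([industry]) => industry);
}

const mapping: IndustryMapping = {
  DevOps: ["devops", "kubernetes", "docker", "ci/cd"],
  Cloud: ["aws", "azure", "gcp", "cloud native"],
};

// The title mentions both "kubernetes" and "aws", so both industries match.
const industries = classify("Kubernetes on AWS: a hands-on workshop", ["containers"], mapping);
console.log(industries);
```

Plain substring matching is simple but can over-match short keywords (for example "aws" inside "laws"); matching on word boundaries would be a stricter variant.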

Input Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| queries | array | ✅ Yes | - | Array of search queries with keywords, locations, dates, and platforms |
| minAttendees | number | No | 0 | Minimum RSVP count to include events (0 = all, only applies to Meetup/Lu.ma) |
| includeFree | boolean | No | true | Include free events in results |
| includePaid | boolean | No | true | Include paid/ticketed events |
| maxItemsPerPlatform | number | No | 1000 | Max events per platform (controls cost/duration) |
| useApifyProxy | boolean | No | true | Enable Apify proxy for scraping |
| proxyGroups | array | No | ["DATACENTER"] | Proxy groups to use (DATACENTER works well) |
| fuzzyTitleThreshold | number | No | 0.82 | Similarity threshold for deduplication (0-1) |
| timeWindowMinutes | number | No | 90 | Time delta for fuzzy matching (minutes) |
| distanceMeters | number | No | 800 | Geo distance threshold for deduplication |
| industryMapping | object | No | - | Map event topics to custom industry categories |
| webhookUrl | string | No | - | POST results to this webhook (n8n integration) |
| webhookHeaders | object | No | - | Custom headers for webhook requests |
| generateICS | boolean | No | true | Generate .ics calendar file |
| storeRawData | boolean | No | false | Include raw platform JSON (debugging) |
| maxConcurrency | number | No | 10 | Max concurrent requests |
| dryRun | boolean | No | false | Test mode (no save/webhooks) |
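
For callers who build the input programmatically, the parameter table can be mirrored as a TypeScript type. This is a sketch only: `ActorInput`, `DEFAULTS`, and `applyDefaults` are hypothetical names, the default values are copied from the table, and the non-empty check on queries is an assumption.

```typescript
type Platform = "eventbrite" | "meetup" | "luma";

interface Query {
  keywords: string[];
  locations: string[];
  dateFrom: string; // YYYY-MM-DD
  dateTo: string;   // YYYY-MM-DD
  platforms: Platform[];
  onlineOnly?: boolean;
}

interface ActorInput {
  queries: Query[];
  minAttendees?: number;
  includeFree?: boolean;
  includePaid?: boolean;
  maxItemsPerPlatform?: number;
  useApifyProxy?: boolean;
  proxyGroups?: string[];
  fuzzyTitleThreshold?: number;
  timeWindowMinutes?: number;
  distanceMeters?: number;
  industryMapping?: Record<string, string[]>;
  webhookUrl?: string;
  webhookHeaders?: Record<string, string>;
  generateICS?: boolean;
  storeRawData?: boolean;
  maxConcurrency?: number;
  dryRun?: boolean;
}

// Defaults as listed in the parameter table.
const DEFAULTS = {
  minAttendees: 0,
  includeFree: true,
  includePaid: true,
  maxItemsPerPlatform: 1000,
  useApifyProxy: true,
  proxyGroups: ["DATACENTER"],
  fuzzyTitleThreshold: 0.82,
  timeWindowMinutes: 90,
  distanceMeters: 800,
  generateICS: true,
  storeRawData: false,
  maxConcurrency: 10,
  dryRun: false,
};

// Validates the one required field and fills in the documented defaults.
function applyDefaults(input: ActorInput) {
  if (!Array.isArray(input.queries) || input.queries.length === 0) {
    throw new Error("queries is required (assumed here to be non-empty)");
  }
  return { ...DEFAULTS, ...input };
}

const resolved = applyDefaults({
  queries: [{
    keywords: ["tech"],
    locations: ["San Francisco, US"],
    dateFrom: "2026-01-01",
    dateTo: "2026-03-01",
    platforms: ["meetup", "luma"],
  }],
  minAttendees: 50,
});
console.log(resolved.minAttendees, resolved.fuzzyTitleThreshold);
```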

Output Schema

Each event record includes:

{
  "id": "01HQZX9K3P2VQWE8RTGBNM4567",
  "canonicalKey": "abc123def456",
  "platform": "meetup",
  "event": {
    "title": "AI & Machine Learning Meetup",
    "descriptionHtml": "...",
    "topics": ["ai", "machine-learning", "deep-learning"],
    "category": "AI",
    "startsAt": "2025-11-15T18:00:00.000Z",
    "endsAt": "2025-11-15T21:00:00.000Z",
    "timezone": "Europe/Berlin",
    "isOnline": false,
    "recurrence": null
  },
  "venueName": "Tech Hub Berlin",
  "address": "Hauptstraße 123",
  "city": "Berlin",
  "country": "DE",
  "latitude": 52.5200,
  "longitude": 13.4050,
  "currency": "EUR",
  "priceMin": 0,
  "priceMax": 0,
  "ticketStatus": "free",
  "rsvpCount": 85,
  "capacity": 100,
  "organizerName": "Berlin AI Community",
  "organizerUrl": "https://www.meetup.com/berlin-ai",
  "organizerFollowers": 3500,
  "coverImageUrl": "https://...",
  "eventUrl": "https://www.meetup.com/...",
  "ticketUrl": null,
  "discoveredAt": "2025-10-02T12:00:00.000Z"
}

Location Fields Explained:

  • city: City name, or "Online" for virtual events
  • country: ISO country code (e.g., "DE", "US", "GB"), or null for online events
  • latitude/longitude: Geographic coordinates (null for online events or when unavailable)

This structure makes it easy to filter events by location while clearly distinguishing online vs. physical events.
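
A small sketch of such filtering, assuming records that follow the location conventions listed above. The helper names (`isOnlineEvent`, `inCountry`, `hasCoordinates`) are hypothetical.

```typescript
// Only the location fields described above.
interface LocationFields {
  city: string;
  country: string | null;
  latitude: number | null;
  longitude: number | null;
}

// Virtual events are marked with city === "Online" per the field description.
const isOnlineEvent = (e: LocationFields): boolean => e.city === "Online";

// Physical events in a given ISO country code; online events have country null.
function inCountry<T extends LocationFields>(events: T[], code: string): T[] {
  return events.filter((e) => e.country === code.toUpperCase());
}

// Coordinates may be null even for physical events, so check before mapping.
const hasCoordinates = (e: LocationFields): boolean =>
  e.latitude !== null && e.longitude !== null;

const records: LocationFields[] = [
  { city: "Berlin", country: "DE", latitude: 52.52, longitude: 13.405 },
  { city: "Online", country: null, latitude: null, longitude: null },
  { city: "Hamburg", country: "DE", latitude: null, longitude: null },
];

console.log(inCountry(records, "de").map((e) => e.city));
console.log(records.filter(isOnlineEvent).length);
console.log(records.filter(hasCoordinates).length);
```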

Output Formats

  1. Dataset: Normalized JSON records in Apify dataset
  2. ICS Calendar: Saved to Key-Value Store as calendar.ics
  3. Webhook: POST to your n8n workflow with stats and first 100 events
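
For readers who want to see what the calendar output roughly contains, below is a minimal iCalendar sketch built from a few Output Schema fields. It is not the Actor's exporter: `toICS`, the PRODID value, and the `@example.invalid` UID suffix are placeholders, and line folding for long lines is omitted.

```typescript
// Hypothetical subset of an event record needed for a calendar entry.
interface CalendarEvent {
  id: string;
  title: string;
  startsAt: string; // ISO 8601
  endsAt: string;   // ISO 8601
  venueName?: string;
}

// 2025-11-15T18:00:00.000Z -> 20251115T180000Z (UTC date-time form).
const icsDate = (iso: string): string =>
  new Date(iso).toISOString().replace(/[-:]/g, "").replace(/\.\d{3}/, "");

// Escape backslash, comma, semicolon, and newlines in text values.
const icsText = (s: string): string =>
  s.replace(/\\/g, "\\\\").replace(/([,;])/g, "\\$1").replace(/\r?\n/g, "\\n");

function toICS(events: CalendarEvent[], now: Date = new Date()): string {
  const lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//event-export//EN"];
  for (const e of events) {
    lines.push(
      "BEGIN:VEVENT",
      `UID:${e.id}@example.invalid`,
      `DTSTAMP:${icsDate(now.toISOString())}`,
      `DTSTART:${icsDate(e.startsAt)}`,
      `DTEND:${icsDate(e.endsAt)}`,
      `SUMMARY:${icsText(e.title)}`,
    );
    if (e.venueName) lines.push(`LOCATION:${icsText(e.venueName)}`);
    lines.push("END:VEVENT");
  }
  lines.push("END:VCALENDAR");
  return lines.join("\r\n") + "\r\n"; // iCalendar uses CRLF line endings
}

const ics = toICS([{
  id: "01HQZX9K3P2VQWE8RTGBNM4567",
  title: "AI & Machine Learning Meetup",
  startsAt: "2025-11-15T18:00:00.000Z",
  endsAt: "2025-11-15T21:00:00.000Z",
  venueName: "Tech Hub Berlin",
}]);
console.log(ics);
```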

Deduplication Strategy

Events are deduplicated using multi-factor fuzzy matching:

  1. Canonical Key: Hash of (normalized title + start time UTC + normalized city)
  2. Fuzzy Title Matching: Jaro-Winkler similarity ≥ 82% (configurable)
  3. Time Window: Events within 90 minutes of each other (configurable)
  4. Geo Distance: Events within 800 meters (configurable)
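
The four factors above can be sketched as follows. Several details are assumptions because the README does not specify them: the normalization rule, the hash (FNV-1a is used here purely as a stand-in), and how the factors combine (an exact key match, or title AND time AND distance). Missing coordinates skip the distance check in this sketch.

```typescript
// Hypothetical record shape; field names follow the Output Schema.
interface DedupEvent {
  title: string;
  startsAt: string; // ISO 8601
  city: string;
  latitude: number | null;
  longitude: number | null;
}

// Assumed normalization: lowercase, non-alphanumeric runs become one space.
const normalize = (s: string): string =>
  s.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();

// FNV-1a (32-bit) as a stand-in; the README does not name the hash.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

function canonicalKey(e: DedupEvent): string {
  const startUtc = new Date(e.startsAt).toISOString();
  return fnv1a(`${normalize(e.title)}|${startUtc}|${normalize(e.city)}`);
}

// Standard Jaro-Winkler similarity in [0, 1] (prefix scale 0.1, prefix cap 4).
function jaroWinkler(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length === 0 || b.length === 0) return 0;
  const range = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aHit = new Array<boolean>(a.length).fill(false);
  const bHit = new Array<boolean>(b.length).fill(false);
  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const hi = Math.min(b.length - 1, i + range);
    for (let j = Math.max(0, i - range); j <= hi; j++) {
      if (!bHit[j] && a[i] === b[j]) {
        aHit[i] = true;
        bHit[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;
  let transpositions = 0;
  let k = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aHit[i]) continue;
    while (!bHit[k]) k++;
    if (a[i] !== b[k]) transpositions++;
    k++;
  }
  const jaro =
    (matches / a.length + matches / b.length + (matches - transpositions / 2) / matches) / 3;
  let prefix = 0;
  while (prefix < 4 && prefix < a.length && prefix < b.length && a[prefix] === b[prefix]) prefix++;
  return jaro + prefix * 0.1 * (1 - jaro);
}

// Great-circle distance in meters (Haversine, mean Earth radius).
function haversineMeters(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const toRad = (d: number): number => (d * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * 6371000 * Math.asin(Math.sqrt(h));
}

function isDuplicate(a: DedupEvent, b: DedupEvent, titleThreshold = 0.82,
                     timeWindowMinutes = 90, distanceMeters = 800): boolean {
  if (canonicalKey(a) === canonicalKey(b)) return true;
  const titleOk = jaroWinkler(normalize(a.title), normalize(b.title)) >= titleThreshold;
  const minutesApart = Math.abs(Date.parse(a.startsAt) - Date.parse(b.startsAt)) / 60000;
  let geoOk = true; // assumption: skip the geo check when coordinates are missing
  if (a.latitude !== null && a.longitude !== null && b.latitude !== null && b.longitude !== null) {
    geoOk = haversineMeters(a.latitude, a.longitude, b.latitude, b.longitude) <= distanceMeters;
  }
  return titleOk && minutesApart <= timeWindowMinutes && geoOk;
}

const meetup: DedupEvent = { title: "AI & Machine Learning Meetup", startsAt: "2025-11-15T18:00:00.000Z", city: "Berlin", latitude: 52.52, longitude: 13.405 };
const luma: DedupEvent = { title: "AI & Machine Learning Meet-up!", startsAt: "2025-11-15T18:30:00.000Z", city: "Berlin", latitude: 52.5203, longitude: 13.4049 };
console.log(isDuplicate(meetup, luma));
```

The defaults mirror fuzzyTitleThreshold, timeWindowMinutes, and distanceMeters from the input parameters. Note that this normalization drops non-ASCII letters, which a production version would likely handle differently.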

RSVP Count Availability

| Platform | Availability | Field Used |
|---|---|---|
| Meetup | ✅ Always public | rsvpCount |
| Lu.ma | ✅ Usually public | rsvpCount (guest_count) |
| Eventbrite | ❌ Not available | rsvpCount (always null) |

Note: For Eventbrite events, consider using organizerFollowers as an alternative audience signal when available.
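
One way to apply this fallback downstream is sketched below. `audienceSignal` is a hypothetical helper; it records which field the number came from, because follower counts and RSVP counts are not directly comparable.

```typescript
interface AudienceFields {
  rsvpCount: number | null;
  organizerFollowers: number | null;
}

type AudienceSignal =
  | { value: number; source: "rsvpCount" | "organizerFollowers" }
  | null;

// Prefer the RSVP count; fall back to organizer followers when it is null.
function audienceSignal(e: AudienceFields): AudienceSignal {
  if (e.rsvpCount !== null) return { value: e.rsvpCount, source: "rsvpCount" };
  if (e.organizerFollowers !== null) {
    return { value: e.organizerFollowers, source: "organizerFollowers" };
  }
  return null;
}

// Eventbrite-style record: no RSVP count, followers available.
const signal = audienceSignal({ rsvpCount: null, organizerFollowers: 3500 });
console.log(signal);
```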

n8n Integration

Webhook Payload Structure

{
  "runId": "abc123",
  "datasetId": "xyz789",
  "stats": {
    "total": 150,
    "byPlatform": {
      "meetup": 80,
      "luma": 45,
      "eventbrite": 25
    },
    "avgAttendees": 65,
    "dedupedCount": 12
  },
  "events": [ /* first 100 events */ ]
}
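
For consumers that want to type the payload, here is a sketch of the structure and of how the stats could be derived. `buildPayload` is a hypothetical helper; computing avgAttendees as the rounded mean of non-null rsvpCount values is an assumption, since the README does not define it.

```typescript
interface PayloadEvent {
  platform: string;
  rsvpCount: number | null;
}

interface WebhookPayload<T> {
  runId: string;
  datasetId: string;
  stats: {
    total: number;
    byPlatform: Record<string, number>;
    avgAttendees: number;
    dedupedCount: number;
  };
  events: T[]; // first 100 events only
}

function buildPayload<T extends PayloadEvent>(
  runId: string,
  datasetId: string,
  events: T[],
  dedupedCount: number,
): WebhookPayload<T> {
  const byPlatform: Record<string, number> = {};
  for (const e of events) byPlatform[e.platform] = (byPlatform[e.platform] ?? 0) + 1;

  // Assumption: average over events that actually have an RSVP count.
  const counts = events.map((e) => e.rsvpCount).filter((n): n is number => n !== null);
  const avgAttendees = counts.length
    ? Math.round(counts.reduce((sum, n) => sum + n, 0) / counts.length)
    : 0;

  return {
    runId,
    datasetId,
    stats: { total: events.length, byPlatform, avgAttendees, dedupedCount },
    events: events.slice(0, 100),
  };
}

const payload = buildPayload("abc123", "xyz789", [
  { platform: "meetup", rsvpCount: 80 },
  { platform: "luma", rsvpCount: 50 },
  { platform: "eventbrite", rsvpCount: null },
], 12);
console.log(JSON.stringify(payload.stats));
```

Because only the first 100 events are embedded, a receiver that needs everything would fetch the full dataset using datasetId.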

Example n8n Workflow

1. [Webhook] POST /webhook/apify-events
2. [HTTP Request] GET Apify dataset (all events)
3. [Function] Filter by industry categories
4. [Function] Map to prospect list (company, url, city, audience)
5. [Deduplicate] By company+city hash
6. [Branch A] → Google Sheets / Airtable
7. [Branch B] → CRM API / Email campaign
8. [Slack] Notification with run stats
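
Steps 3 to 5 of the workflow could look roughly like the following logic. It is written as standalone TypeScript rather than n8n node code, and the names (`toProspects`, `Prospect`) are hypothetical. A plain "company|city" string key stands in for the hash mentioned in step 5, and keeping the record with the largest audience is an assumed tie-break.

```typescript
// Hypothetical subset of the Output Schema used for prospecting.
interface ScrapedEvent {
  event: { category: string | null };
  city: string;
  organizerName: string;
  organizerUrl: string | null;
  rsvpCount: number | null;
  organizerFollowers: number | null;
}

interface Prospect {
  company: string;
  url: string | null;
  city: string;
  audience: number | null;
}

function toProspects(events: ScrapedEvent[], industries: string[]): Prospect[] {
  const wanted = new Set(industries.map((i) => i.toLowerCase()));
  const byKey = new Map<string, Prospect>();
  for (const e of events) {
    // Step 3: keep only events in the requested industry categories.
    const category = e.event.category;
    if (category === null || !wanted.has(category.toLowerCase())) continue;
    // Step 4: map the event to a prospect row.
    const audience = e.rsvpCount ?? e.organizerFollowers;
    const prospect: Prospect = {
      company: e.organizerName, url: e.organizerUrl, city: e.city, audience,
    };
    // Step 5: deduplicate by company + city, keeping the largest audience.
    const key = `${e.organizerName.trim().toLowerCase()}|${e.city.trim().toLowerCase()}`;
    const existing = byKey.get(key);
    if (!existing || (audience ?? 0) > (existing.audience ?? 0)) byKey.set(key, prospect);
  }
  return Array.from(byKey.values());
}

const prospects = toProspects([
  { event: { category: "AI" }, city: "Berlin", organizerName: "Berlin AI Community",
    organizerUrl: "https://www.meetup.com/berlin-ai", rsvpCount: 40, organizerFollowers: 3500 },
  { event: { category: "AI" }, city: "Berlin", organizerName: "Berlin AI Community",
    organizerUrl: "https://www.meetup.com/berlin-ai", rsvpCount: 85, organizerFollowers: 3500 },
  { event: { category: "Cooking" }, city: "Berlin", organizerName: "Supper Club",
    organizerUrl: null, rsvpCount: 30, organizerFollowers: null },
], ["AI"]);
console.log(prospects);
```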

Pricing

This Actor uses pay-per-event pricing:

  • Event Name: event-ingested
  • Price: ~$0.01 per event (configure in Apify Console)
  • No platform fees if published before March 31, 2025 (0% commission for 6 months)

Cost Examples

| Scenario | Events | Estimated Cost |
|---|---|---|
| Small run (2 cities, 1 keyword, 30 days) | ~100-300 | $1-3 |
| Medium run (5 cities, 3 keywords, 60 days) | ~500-1000 | $5-10 |
| Large run (10 cities, 5 keywords, 90 days) | ~2000-5000 | $20-50 |
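
The estimates in the table are simply the event count multiplied by the per-event price. A small sketch, assuming the approximate $0.01 price quoted above (the actual price is whatever is configured in Apify Console); `estimateCost` is a hypothetical helper.

```typescript
// Estimated charge in USD for a number of ingested events.
// pricePerEventUsd defaults to the approximate $0.01 quoted above.
function estimateCost(eventCount: number, pricePerEventUsd = 0.01): number {
  if (!Number.isFinite(eventCount) || eventCount < 0) {
    throw new RangeError("eventCount must be a non-negative number");
  }
  // Round to whole cents to avoid floating-point noise.
  return Math.round(eventCount * pricePerEventUsd * 100) / 100;
}

// Upper and lower bounds of the "Large run" row: 2000 to 5000 events.
console.log(estimateCost(2000), estimateCost(5000));
```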

Platform-Specific Notes

Meetup

  • Data Source: __NEXT_DATA__ JSON in page HTML
  • RSVP Counts: Always publicly visible (yes_rsvp_count)
  • Best For: Community events, tech meetups, networking
  • Proxy: Datacenter proxy works fine

Lu.ma

  • Data Source: API endpoint /discover/get-events
  • Guest Counts: Usually publicly visible (guest_count)
  • Best For: Creator events, workshops, online events
  • Proxy: Datacenter proxy works fine

Eventbrite

  • Data Source: JSON-LD structured data + search results
  • Attendee Counts: Rarely visible on public pages, so rsvpCount is always null for Eventbrite events
  • Organizer Followers: Often visible (use as audience signal)
  • Best For: Professional conferences, large events, ticketed events
  • Proxy: Datacenter proxy works fine

Development & Testing

Local Testing

# Install dependencies
npm install
# Build TypeScript
npm run build
# Test with pay-per-event simulation
npm run test:ppe
# Regular dev run
npm run dev

Pay-Per-Event Testing

ACTOR_TEST_PAY_PER_EVENT=true \
ACTOR_USE_CHARGING_LOG_DATASET=true \
npm run dev

Check the charging-log dataset to see all billing events triggered.

Limitations

  • LinkedIn Events: Not included (requires residential proxy + complex anti-bot handling)
  • Eventbrite Attendee Counts: Rarely public on event pages
  • Rate Limits: Respects platform rate limits with automatic backoff
  • Geo Accuracy: Depends on platform-provided coordinates
  • Historical Events: Only finds future events (past events not scraped)

Error Handling

The Actor implements robust error handling:

  • 3 retries with exponential backoff per request
  • Platform-level circuit breakers (fail-fast if >10% error rate)
  • Graceful degradation (continues with other platforms if one fails)
  • Error logging with sample HTML snapshots for debugging
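
The first two points can be sketched as follows. This is an illustration under assumptions, not the Actor's code: `withRetry` and `CircuitBreaker` are hypothetical names, the base delay of 500 ms and the minimum sample size of 20 are invented for the example, and only the 3 retries and the 10% threshold come from the list above.

```typescript
// Retries fn up to `retries` times after the first attempt,
// doubling the delay each time (baseDelayMs, 2x, 4x, ...).
async function withRetry<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500, // assumed value
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === retries) break;
      await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}

// Per-platform breaker: opens once the error rate exceeds maxErrorRate.
class CircuitBreaker {
  private total = 0;
  private failed = 0;
  private readonly maxErrorRate: number;
  private readonly minSamples: number;

  constructor(maxErrorRate = 0.1, minSamples = 20) {
    this.maxErrorRate = maxErrorRate;
    this.minSamples = minSamples; // assumed: avoid tripping on the first failure
  }

  record(ok: boolean): void {
    this.total++;
    if (!ok) this.failed++;
  }

  get isOpen(): boolean {
    return this.total >= this.minSamples && this.failed / this.total > this.maxErrorRate;
  }
}

// Usage: stop scheduling requests for a platform once its breaker opens,
// while other platforms keep running (graceful degradation).
const breaker = new CircuitBreaker();
let calls = 0;
const flaky = async (): Promise<string> => {
  calls++;
  if (calls < 3) throw new Error("temporary failure");
  return "ok";
};
const result = withRetry(flaky, 3, 1).then((value) => {
  breaker.record(true);
  return value;
});
result.then((value) => console.log(value, "after", calls, "attempts"));
```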

Performance

Typical run times (with default concurrency):

  • 2 cities, 2 keywords, 60 days, 3 platforms: 5-10 minutes
  • 5 cities, 3 keywords, 60 days, 3 platforms: 10-15 minutes
  • 10 cities, 5 keywords, 90 days, 3 platforms: 20-30 minutes

Memory usage: < 200 MB

Concurrency: 10 concurrent requests (configurable)

Roadmap

  • V1.1: Geocoding normalization (Google Maps API for missing coordinates)
  • V1.2: ML-based classification (fastText/TF-Lite for smarter categorization)
  • V1.3: Language filtering & i18n date parsing
  • V2.0: City geo-radius search (Haversine distance queries)

Support & Issues

License

MIT License - see LICENSE file for details


Built with ❤️ by Barrierefix | Apify Store | Documentation