Multi Source Event Scraper
Pricing
Pay per event
Normalize and dedupe events from Eventbrite, Meetup, and Luma into a single canonical feed. Outputs clean JSON/CSV/iCal with source links, tags, and “last verified” timestamps, perfect for directories, newsletters, and automation.
Developer

Alex Pavlov
Multi-Source Event Extractor
An Apify Actor that extracts events from Eventbrite, Meetup, and Luma, normalizes them to a canonical schema, deduplicates across sources, and outputs clean data in multiple formats.
Features
- Multi-platform extraction: Eventbrite, Meetup, and Luma
- Two operation modes: Direct URLs or location-based discovery
- Smart deduplication: Fingerprint-based matching to identify duplicate events across platforms
- Multiple export formats: JSON (Dataset), CSV, and iCal (.ics)
- Canonical schema: All events normalized to a consistent output format
- Geo-coordinates: Extracts venue latitude/longitude when available
- Date filtering: Filter events by start date range
Input Configuration
Basic Example (URLs Mode)
```json
{
  "mode": "urls",
  "sources": ["eventbrite", "meetup", "luma"],
  "startUrls": [
    { "url": "https://www.eventbrite.com/o/your-organizer-page" },
    { "url": "https://www.meetup.com/your-meetup-group/" },
    { "url": "https://lu.ma/your-event" }
  ],
  "maxItems": 100
}
```
Discovery Mode Example
```json
{
  "mode": "discover",
  "sources": ["eventbrite", "meetup", "luma"],
  "discover": {
    "locationQuery": "San Francisco",
    "keywords": ["tech", "ai"],
    "dateRange": "this_week",
    "priceRange": "free"
  },
  "maxItems": 50
}
```
Input Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `mode` | string | Yes | - | `"urls"` for direct URLs, `"discover"` for location search |
| `sources` | array | No | All | Platforms to extract from: `eventbrite`, `meetup`, `luma` |
| `startUrls` | array | Required for `urls` mode | - | List of URLs to process |
| `discover` | object | Required for `discover` mode | - | Discovery configuration |
| `maxItems` | number | No | 0 (unlimited) | Maximum events to extract |
| `minStartDate` | string | No | - | ISO date string for minimum event start |
| `maxStartDate` | string | No | - | ISO date string for maximum event start |
| `debug` | boolean | No | false | Enable verbose logging |
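The required-field rules in the table can be checked on the caller's side before a run is started. The following is a minimal sketch under that assumption: `ActorInput` and `validateInput` are hypothetical names for illustration and are not part of the Actor, and the check that `minStartDate` precedes `maxStartDate` is an assumption the table does not state.

```typescript
// Hypothetical input shape mirroring the parameter table; not the Actor's own source.
interface ActorInput {
  mode?: string;
  sources?: string[];
  startUrls?: Array<{ url: string }>;
  discover?: { locationQuery?: string };
  maxItems?: number;
  minStartDate?: string;
  maxStartDate?: string;
  debug?: boolean;
}

const KNOWN_SOURCES: string[] = ["eventbrite", "meetup", "luma"];

// Returns human-readable problems; an empty array means the input passed these checks.
function validateInput(input: ActorInput): string[] {
  const errors: string[] = [];

  if (input.mode !== "urls" && input.mode !== "discover") {
    errors.push('mode must be "urls" or "discover"');
  }
  if (input.mode === "urls" && (!input.startUrls || input.startUrls.length === 0)) {
    errors.push("startUrls is required in urls mode");
  }
  if (input.mode === "discover" && !input.discover) {
    errors.push("discover is required in discover mode");
  }
  for (const source of input.sources || []) {
    if (KNOWN_SOURCES.indexOf(source) === -1) {
      errors.push("unknown source: " + source);
    }
  }
  if (input.maxItems !== undefined && input.maxItems < 0) {
    errors.push("maxItems must be 0 (unlimited) or a positive number");
  }

  // Date.parse returns NaN for strings it cannot interpret as a date.
  const min = input.minStartDate === undefined ? undefined : Date.parse(input.minStartDate);
  const max = input.maxStartDate === undefined ? undefined : Date.parse(input.maxStartDate);
  if (min !== undefined && isNaN(min)) errors.push("minStartDate is not a valid ISO date");
  if (max !== undefined && isNaN(max)) errors.push("maxStartDate is not a valid ISO date");
  // Assumption: an inverted window is treated as a mistake rather than an empty result.
  if (min !== undefined && max !== undefined && min > max) {
    errors.push("minStartDate must not be after maxStartDate");
  }
  return errors;
}
```

Returning a list of problems instead of throwing on the first one lets a caller report every issue in a single pass.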
Discover Configuration
| Parameter | Type | Description |
|---|---|---|
| `locationQuery` | string | City or location to search (e.g., "San Francisco", "New York") |
| `keywords` | array | Search keywords (e.g., `["tech", "networking"]`) |
| `dateRange` | string | `today`, `tomorrow`, `this_week`, `this_weekend`, `next_week`, `this_month`, `custom` |
| `dateFrom` | string | Start date for custom range (ISO format) |
| `dateTo` | string | End date for custom range (ISO format) |
| `priceRange` | string | `free`, `paid`, or `all` |
| `sortBy` | string | `date`, `relevance`, or `popularity` |
| `onlineOnly` | boolean | Only include online events |
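Because `dateFrom` and `dateTo` only apply to a `custom` range, a quick consistency check can catch mismatched settings. This sketch types the discover object from the table above; `DiscoverConfig` and `checkDateRange` are hypothetical names, and the rules that a custom range needs both bounds and that the bounds must be ordered are assumptions rather than documented behavior.

```typescript
type DateRange =
  | "today" | "tomorrow" | "this_week" | "this_weekend"
  | "next_week" | "this_month" | "custom";

// Hypothetical type built from the table; every field is treated as optional here.
interface DiscoverConfig {
  locationQuery?: string;
  keywords?: string[];
  dateRange?: DateRange;
  dateFrom?: string; // ISO format, used with dateRange "custom"
  dateTo?: string;   // ISO format, used with dateRange "custom"
  priceRange?: "free" | "paid" | "all";
  sortBy?: "date" | "relevance" | "popularity";
  onlineOnly?: boolean;
}

// Returns a list of problems with the date settings; empty means consistent.
function checkDateRange(config: DiscoverConfig): string[] {
  const problems: string[] = [];
  const hasBounds = config.dateFrom !== undefined || config.dateTo !== undefined;

  if (config.dateRange === "custom") {
    if (config.dateFrom === undefined || config.dateTo === undefined) {
      problems.push('dateRange "custom" needs both dateFrom and dateTo');
    } else {
      const from = Date.parse(config.dateFrom);
      const to = Date.parse(config.dateTo);
      if (isNaN(from) || isNaN(to)) {
        problems.push("dateFrom and dateTo must be ISO dates");
      } else if (from > to) {
        problems.push("dateFrom must not be after dateTo");
      }
    }
  } else if (hasBounds) {
    // Assumption: bounds combined with a preset range are a configuration mistake.
    problems.push('dateFrom/dateTo are only used when dateRange is "custom"');
  }
  return problems;
}
```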
Output Schema
Each event is output in the CanonicalEvent format:
```typescript
interface CanonicalEvent {
  canonicalId: string; // Stable fingerprint-based ID
  title: string;
  startTime: string; // ISO 8601
  endTime?: string; // ISO 8601
  timezone?: string;
  url: string; // Primary URL
  platform: "eventbrite" | "meetup" | "luma" | "multi";
  venue?: {
    name?: string;
    address?: string;
    city?: string;
    region?: string;
    country?: string;
    lat?: number;
    lng?: number;
  };
  organizer?: {
    name?: string;
    url?: string;
  };
  price?: {
    isFree: boolean;
    min?: number;
    max?: number;
    currency?: string;
    raw?: string;
  };
  images?: string[];
  descriptionSnippet?: string; // Max 500 chars
  sourceListings: Array<{ // All sources where event was found
    source: string;
    url: string;
    extractedAt: string;
    sourceEventId?: string;
  }>;
  dedupe?: {
    fingerprint: string;
    mergedCount: number;
  };
  meta: {
    extractedAt: string;
  };
}
```
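As an illustration of consuming this schema, the sketch below picks free events that start inside a time window, for example to build a newsletter section. `EventSummary` and `freeEventsBetween` are hypothetical names, and treating events without a `price` object as "not known to be free" is an assumption of this example.

```typescript
// Subset of the CanonicalEvent fields that this example reads.
interface EventSummary {
  title: string;
  startTime: string; // ISO 8601
  url: string;
  price?: { isFree: boolean };
}

// Returns free events whose startTime lies in [minStart, maxStart], earliest first.
function freeEventsBetween(
  events: EventSummary[],
  minStart: string,
  maxStart: string
): EventSummary[] {
  const min = Date.parse(minStart);
  const max = Date.parse(maxStart);
  return events
    // Events with no price information are skipped (assumption of this sketch).
    .filter((event) => event.price !== undefined && event.price.isFree)
    .filter((event) => {
      const start = Date.parse(event.startTime);
      return start >= min && start <= max;
    })
    // filter() already returned a new array, so sorting does not touch the input.
    .sort((a, b) => Date.parse(a.startTime) - Date.parse(b.startTime));
}
```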
Output Formats
1. Dataset (JSON)
Events are pushed to the default Apify Dataset. Access via:
- Apify Console: Dataset tab in run details
- API: `https://api.apify.com/v2/datasets/{datasetId}/items`
2. iCal Export
Automatically exported to Key-Value Store as events.ics:
- Import directly into Google Calendar, Apple Calendar, Outlook
- Includes location, description, and geo-coordinates
- Download from Key-Value Store tab or API
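To make the mapping from an event to a calendar entry concrete, here is a minimal sketch that renders one event as an iCalendar `VEVENT` following RFC 5545. It is not the Actor's exporter: `CalendarEvent`, `toVEvent`, and the choice of `canonicalId` as the `UID` are assumptions for illustration, and line folding at 75 octets is left out for brevity.

```typescript
// Subset of the CanonicalEvent fields used for the calendar entry.
interface CalendarEvent {
  canonicalId: string;
  title: string;
  startTime: string; // ISO 8601
  endTime?: string;  // ISO 8601
  url: string;
  venue?: { name?: string; address?: string; lat?: number; lng?: number };
  descriptionSnippet?: string;
}

// Escapes backslash, semicolon, comma, and newlines as required for iCalendar TEXT values.
function escapeText(value: string): string {
  return value
    .replace(/\\/g, "\\\\")
    .replace(/;/g, "\\;")
    .replace(/,/g, "\\,")
    .replace(/\r?\n/g, "\\n");
}

// Converts an ISO 8601 timestamp to the UTC form YYYYMMDDTHHMMSSZ.
function toIcalUtc(iso: string): string {
  return new Date(iso).toISOString().replace(/[-:]/g, "").replace(/\.\d{3}/, "");
}

function toVEvent(event: CalendarEvent, stampIso: string): string {
  const lines: string[] = [
    "BEGIN:VEVENT",
    "UID:" + event.canonicalId, // assumption: the stable ID doubles as the UID
    "DTSTAMP:" + toIcalUtc(stampIso),
    "DTSTART:" + toIcalUtc(event.startTime),
  ];
  if (event.endTime) lines.push("DTEND:" + toIcalUtc(event.endTime));
  lines.push("SUMMARY:" + escapeText(event.title));
  if (event.descriptionSnippet) {
    lines.push("DESCRIPTION:" + escapeText(event.descriptionSnippet));
  }
  const venue = event.venue;
  if (venue) {
    const location = [venue.name, venue.address].filter((part) => !!part).join(", ");
    if (location) lines.push("LOCATION:" + escapeText(location));
    // GEO is latitude and longitude separated by a semicolon.
    if (venue.lat !== undefined && venue.lng !== undefined) {
      lines.push("GEO:" + venue.lat + ";" + venue.lng);
    }
  }
  lines.push("URL:" + event.url);
  lines.push("END:VEVENT");
  return lines.join("\r\n"); // iCalendar content lines end with CRLF
}
```

A complete `.ics` file would wrap one or more such entries in a `VCALENDAR` block with `VERSION` and `PRODID` lines.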
Tip: For CSV export, use Apify's native Dataset export: add `?format=csv` to the Dataset API URL.
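The download URLs can be assembled with a small helper. This is a sketch: `datasetItemsUrl` and `icalRecordUrl` are hypothetical names, and the key-value store record path follows Apify's public API rather than anything spelled out in this README. Non-public storages also need an API token, which is omitted here.

```typescript
const API_BASE = "https://api.apify.com/v2";

// Builds the Dataset items URL, optionally asking for a specific export format.
function datasetItemsUrl(datasetId: string, format?: "json" | "csv"): string {
  const base = API_BASE + "/datasets/" + encodeURIComponent(datasetId) + "/items";
  return format ? base + "?format=" + format : base;
}

// Builds the URL of the events.ics record in a run's Key-Value Store.
function icalRecordUrl(storeId: string): string {
  return API_BASE + "/key-value-stores/" + encodeURIComponent(storeId) + "/records/events.ics";
}
```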
Deduplication
The Actor automatically identifies and merges duplicate events across platforms using fingerprint-based matching:
- Title matching: Normalized (lowercase, no special characters)
- Time window: Events within 15 minutes are considered the same
- Venue matching: Geographic bucket (~1km) or city-level fallback
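The three criteria above can be combined into a single fingerprint string. The sketch below is one plausible reading, not the Actor's actual algorithm: it assumes the time window applies to start times and approximates it with fixed 15-minute buckets, and it approximates the ~1km geographic bucket by rounding coordinates to two decimals (about 1.1 km of latitude). All function names are hypothetical.

```typescript
interface FingerprintInput {
  title: string;
  startTime: string; // ISO 8601
  venue?: { city?: string; lat?: number; lng?: number };
}

const WINDOW_MS = 15 * 60 * 1000;

// Lowercases and replaces every run of non-alphanumeric characters with one space.
// Note: this ASCII-only rule also strips accented and non-Latin letters.
function normalizeTitle(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Index of the fixed 15-minute bucket that contains the start time.
function timeBucket(startTime: string): number {
  return Math.floor(Date.parse(startTime) / WINDOW_MS);
}

// Coordinate bucket when lat/lng exist, otherwise a city-level fallback.
function placeBucket(venue?: { city?: string; lat?: number; lng?: number }): string {
  if (venue && venue.lat !== undefined && venue.lng !== undefined) {
    return venue.lat.toFixed(2) + "," + venue.lng.toFixed(2);
  }
  if (venue && venue.city) {
    return "city:" + venue.city.toLowerCase().trim();
  }
  return "unknown";
}

function fingerprint(event: FingerprintInput): string {
  return [
    normalizeTitle(event.title),
    timeBucket(event.startTime),
    placeBucket(event.venue),
  ].join("|");
}
```

Fixed buckets are simple but imperfect: two listings a few minutes or a few meters apart can still land on opposite sides of a bucket boundary, so a production matcher would likely also compare neighboring buckets.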
When duplicates are found, events are merged with:
- Platform precedence: Eventbrite > Meetup > Luma
- All source URLs preserved in `sourceListings`
- `platform` set to `"multi"` if from multiple sources
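The merge rules can be sketched as follows. This is an illustration under assumptions, not the Actor's code: `MergeableEvent` and `mergeDuplicates` are hypothetical names, "precedence" is read as "the highest-ranked platform's fields win", and `mergedCount` is assumed to be the number of events in the duplicate group.

```typescript
type Platform = "eventbrite" | "meetup" | "luma" | "multi";

interface Listing { source: string; url: string; extractedAt: string }

interface MergeableEvent {
  title: string;
  url: string;
  platform: Platform;
  sourceListings: Listing[];
  dedupe?: { fingerprint: string; mergedCount: number };
}

// Earlier entries win: Eventbrite > Meetup > Luma.
const PRECEDENCE: Platform[] = ["eventbrite", "meetup", "luma"];

function rank(platform: Platform): number {
  const index = PRECEDENCE.indexOf(platform);
  return index === -1 ? PRECEDENCE.length : index;
}

function mergeDuplicates(group: MergeableEvent[], fingerprint: string): MergeableEvent {
  if (group.length === 0) throw new Error("cannot merge an empty group");

  // The event from the highest-precedence platform supplies the base fields.
  const primary = group.reduce((best, event) =>
    rank(event.platform) < rank(best.platform) ? event : best
  );

  // Keep every distinct source URL from every event in the group.
  const seenUrls: Record<string, boolean> = {};
  const sources: Record<string, boolean> = {};
  const listings: Listing[] = [];
  for (const event of group) {
    for (const listing of event.sourceListings) {
      if (!seenUrls[listing.url]) {
        seenUrls[listing.url] = true;
        sources[listing.source] = true;
        listings.push(listing);
      }
    }
  }

  const fromSeveralSources = Object.keys(sources).length > 1;
  return {
    ...primary,
    platform: fromSeveralSources ? "multi" : primary.platform,
    sourceListings: listings,
    dedupe: { fingerprint: fingerprint, mergedCount: group.length },
  };
}
```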
Supported URL Patterns
Eventbrite
- Event pages: `eventbrite.com/e/{event-slug}`
- Organizer pages: `eventbrite.com/o/{organizer}`
- Search results: `eventbrite.com/d/{location}/{category}`
Meetup
- Event pages: `meetup.com/{group}/events/{event-id}`
- Group pages: `meetup.com/{group}`
- Search: `meetup.com/find/`
Luma
- Event pages: `lu.ma/{event-slug}`
- City pages: `lu.ma/sf`, `lu.ma/nyc`
- Category pages: `lu.ma/tech`, `lu.ma/ai`
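A start URL can be checked against these patterns before it is submitted. The sketch below is a hypothetical classifier, not the Actor's router. Luma city and category pages share the same one-segment shape as event pages, so the example tells them apart with short lookup lists containing only the slugs named above; those lists are assumptions and certainly incomplete. Regional Eventbrite domains are not handled.

```typescript
type PageKind = "event" | "organizer" | "group" | "search" | "city" | "category";

interface Classified {
  platform: "eventbrite" | "meetup" | "luma";
  kind: PageKind;
}

// Only the slugs mentioned in the README; real lists would be longer.
const LUMA_CITIES: string[] = ["sf", "nyc"];
const LUMA_CATEGORIES: string[] = ["tech", "ai"];

// Returns the platform and page kind, or null when no documented pattern matches.
function classifyUrl(raw: string): Classified | null {
  let parsed: URL;
  try {
    parsed = new URL(raw);
  } catch {
    return null; // not an absolute URL
  }
  const host = parsed.hostname.toLowerCase().replace(/^www\./, "");
  const parts = parsed.pathname.split("/").filter((part) => part.length > 0);
  const first = parts[0];

  if (host === "eventbrite.com") {
    if (first === "e" && parts.length >= 2) return { platform: "eventbrite", kind: "event" };
    if (first === "o" && parts.length >= 2) return { platform: "eventbrite", kind: "organizer" };
    if (first === "d" && parts.length >= 3) return { platform: "eventbrite", kind: "search" };
    return null;
  }
  if (host === "meetup.com") {
    if (first === "find") return { platform: "meetup", kind: "search" };
    if (parts.length >= 3 && parts[1] === "events") return { platform: "meetup", kind: "event" };
    if (parts.length === 1) return { platform: "meetup", kind: "group" };
    return null;
  }
  if (host === "lu.ma") {
    if (parts.length !== 1 || first === undefined) return null;
    const slug = first.toLowerCase();
    if (LUMA_CITIES.indexOf(slug) !== -1) return { platform: "luma", kind: "city" };
    if (LUMA_CATEGORIES.indexOf(slug) !== -1) return { platform: "luma", kind: "category" };
    return { platform: "luma", kind: "event" };
  }
  return null;
}
```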
Usage Examples
Extract from Specific Organizers
```json
{
  "mode": "urls",
  "sources": ["eventbrite"],
  "startUrls": [
    { "url": "https://www.eventbrite.com/o/techcrunch-8868586981" }
  ],
  "maxItems": 20
}
```
Discover Free Tech Events in a City
```json
{
  "mode": "discover",
  "sources": ["eventbrite", "meetup", "luma"],
  "discover": {
    "locationQuery": "New York",
    "keywords": ["tech", "startup", "networking"],
    "dateRange": "this_month",
    "priceRange": "free"
  },
  "maxItems": 100
}
```
Extract Free Events This Weekend
```json
{
  "mode": "discover",
  "sources": ["eventbrite", "luma"],
  "discover": {
    "locationQuery": "Los Angeles",
    "dateRange": "this_weekend",
    "priceRange": "free"
  }
}
```
Local Development
```bash
# Install dependencies
npm install

# Build
npm run build

# Run locally
npx apify-cli run --input-file=input.json

# Run tests
npm test
```
Proxy Configuration
For production use on the Apify platform, configure a proxy:
```json
{
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```
Limitations
- Rate limiting may apply on source platforms
- Some events may require login to access full details
- Geo-coordinates depend on venue data quality from source
- Discovery mode availability varies by platform and location
Cost Estimation
Resource usage depends on:
- Number of URLs/events to process
- JavaScript rendering (always enabled, so it adds to the cost of every run)
- Network conditions and retries
Typical runs:
- 50 events from URLs: ~0.05 compute units
- Discovery mode (100 events): ~0.15 compute units
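For a rough budget, the two typical figures can be turned into per-event rates and scaled. This is a back-of-the-envelope sketch: it assumes usage grows linearly with event count, which the README does not claim, and `estimateComputeUnits` is a hypothetical helper. Retries and network conditions can move real usage in either direction.

```typescript
// Per-event rates derived from the typical runs listed above.
const CU_PER_EVENT: Record<"urls" | "discover", number> = {
  urls: 0.05 / 50,      // ~0.001 compute units per event
  discover: 0.15 / 100, // ~0.0015 compute units per event
};

// Linear extrapolation; treat the result as an order-of-magnitude guide only.
function estimateComputeUnits(mode: "urls" | "discover", eventCount: number): number {
  return CU_PER_EVENT[mode] * eventCount;
}
```

Under these assumptions, 200 events from direct URLs would come to roughly 0.2 compute units.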
License
Apache 2.0