Yelp Events

Fetches Yelp's public events.

Pricing

from $0.01 / 1,000 results

Developer

Payam Pirooznia

Maintained by Community

Yelp Events Apify Actor

This is a standalone Apify actor that extracts public event listings from Yelp city events pages. It is designed for BFU-Feeder ingestion, but the repository is intended to be developed, tested, and deployed independently.

What It Does

| Step | Description |
| --- | --- |
| 1. Navigate | Load Yelp events browse pages (e.g., /events/austin-tx-us/browse?official=0) |
| 1b. Throttle | Enforce a 500 ms same-domain delay between Yelp page requests to reduce firewall/challenge risk |
| 2. Classify | Detect page type: events list, event detail, blocked/challenge, unsupported layout |
| 3. Parse list | Extract event cards using multi-strategy selectors |
| 4. Enrich (optional) | Follow event detail links for richer data |
| 5. Map & emit | Merge card + detail data into YelpEventRecord aligned with BFU-Feeder RawEventContract |
| 6. Diagnostics | Save per-source and global run metrics to key-value store |
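
The throttle in step 1b can be pictured as a small per-host scheduler. The following is only a sketch under assumptions: `DomainThrottle`, `reserve`, and `throttledVisit` are hypothetical names that do not come from the actor's source; only the 500 ms same-domain delay is taken from the table above.

```typescript
// Hypothetical helper: names are illustrative and not taken from the actor's source.
const SAME_DOMAIN_DELAY_MS = 500; // value stated in the step table

class DomainThrottle {
  // Earliest timestamp (ms) at which each hostname may be requested again.
  private readonly nextAllowedAt = new Map<string, number>();
  private readonly minGapMs: number;

  constructor(minGapMs: number = SAME_DOMAIN_DELAY_MS) {
    this.minGapMs = minGapMs;
  }

  // Reserves the next slot for the URL's host and returns how long to wait (ms).
  reserve(url: string, nowMs: number): number {
    const host = new URL(url).hostname;
    const allowedAt = this.nextAllowedAt.get(host) ?? nowMs;
    const startAt = Math.max(allowedAt, nowMs);
    this.nextAllowedAt.set(host, startAt + this.minGapMs);
    return startAt - nowMs;
  }
}

// Waits for the reserved slot, then runs the caller-supplied page visit.
async function throttledVisit(
  throttle: DomainThrottle,
  url: string,
  visit: (target: string) => Promise<void>,
): Promise<void> {
  const waitMs = throttle.reserve(url, Date.now());
  if (waitMs > 0) {
    await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
  }
  await visit(url);
}
```

Passing the clock value into `reserve` keeps the scheduling logic deterministic, so it can be unit-tested without real timers.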

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| startUrls | StartUrl[] | required | Yelp events page URLs |
| maxItems | integer | 200 | Max total events (0 = unlimited) |
| maxEventsPerSource | integer | 100 | Max events per source URL (0 = unlimited) |
| categories | string[] | undefined | Optional category filter (see below) |
| includeDetailPages | boolean | true | Follow event detail links for enrichment |
| proxyConfiguration | object | undefined | Apify proxy settings |
| browserType | "chromium" \| "firefox" \| "webkit" | "chromium" | Browser engine used by Playwright |
| browserUserAgents | string[] | built-in pool | Optional custom user-agent pool; one is selected randomly per run |
| browserViewportWidth | integer | 1366 | Browser viewport width in pixels |
| browserViewportHeight | integer | 768 | Browser viewport height in pixels |
| browserJavaScriptEnabled | boolean | true | Enable JavaScript in browser context |
| browserIgnoreHTTPSErrors | boolean | true | Ignore TLS/HTTPS certificate errors |
| debugMode | boolean | false | Verbose logging |
| saveDebugHtml | boolean | false | Save raw HTML to key-value store |
| saveDebugScreenshots | boolean | false | Save page screenshots |
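
As a rough illustration of how these fields and defaults fit together, the input could be typed and normalized as below. Field names and default values are taken from the table; the names `ActorInput` and `applyInputDefaults` are assumptions, and the `{ url }` shape of `StartUrl` is inferred from the local run example later in this README. The actual definitions live in src/types.ts and may differ.

```typescript
type BrowserType = "chromium" | "firefox" | "webkit";

interface StartUrl {
  url: string;
}

// Field names follow the input table; the interface name itself is assumed.
interface ActorInput {
  startUrls: StartUrl[];
  maxItems?: number;
  maxEventsPerSource?: number;
  categories?: string[];
  includeDetailPages?: boolean;
  proxyConfiguration?: Record<string, unknown>;
  browserType?: BrowserType;
  browserUserAgents?: string[];
  browserViewportWidth?: number;
  browserViewportHeight?: number;
  browserJavaScriptEnabled?: boolean;
  browserIgnoreHTTPSErrors?: boolean;
  debugMode?: boolean;
  saveDebugHtml?: boolean;
  saveDebugScreenshots?: boolean;
}

// Hypothetical normalizer: fills in the documented defaults for omitted fields.
function applyInputDefaults(input: ActorInput): ActorInput {
  if (!Array.isArray(input.startUrls) || input.startUrls.length === 0) {
    throw new Error("startUrls is required and must contain at least one URL");
  }
  return {
    ...input,
    maxItems: input.maxItems ?? 200,
    maxEventsPerSource: input.maxEventsPerSource ?? 100,
    includeDetailPages: input.includeDetailPages ?? true,
    browserType: input.browserType ?? "chromium",
    browserViewportWidth: input.browserViewportWidth ?? 1366,
    browserViewportHeight: input.browserViewportHeight ?? 768,
    browserJavaScriptEnabled: input.browserJavaScriptEnabled ?? true,
    browserIgnoreHTTPSErrors: input.browserIgnoreHTTPSErrors ?? true,
    debugMode: input.debugMode ?? false,
    saveDebugHtml: input.saveDebugHtml ?? false,
    saveDebugScreenshots: input.saveDebugScreenshots ?? false,
  };
}

const resolvedInput = applyInputDefaults({
  startUrls: [{ url: "https://www.yelp.com/events/austin-tx-us/browse?official=0" }],
  maxItems: 10,
});
```

Here `categories`, `proxyConfiguration`, and `browserUserAgents` are left untouched when omitted, matching their documented defaults of undefined or the built-in pool.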

Example Start URLs

https://www.yelp.com/events/mission-viejo-ca-us/browse?official=0
https://www.yelp.com/events/austin-tx-us/browse?official=0
https://www.yelp.com/events/san-diego-ca-us/browse?official=0

Category Filter

The optional categories field filters events by Yelp's event category. When provided, each start URL is expanded into one request per selected category by appending &c=<id>.

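| Category | ID |
| --- | --- |
| Other | 0 |
| Music | 1 |
| Visual Arts | 2 |
| Performing Arts | 3 |
| Film | 4 |
| Lectures & Books | 5 |
| Fashion | 6 |
| Food & Drink | 7 |
| Festivals & Fairs | 8 |
| Charities | 9 |
| Sports & Active life | 10 |
| Nightlife | 11 |
| Kids & Family | 12 |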

Example: 2 start URLs with categories ["Music", "Food & Drink"] produce 4 requests (2 URLs × 2 categories).
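
The expansion rule can be sketched as follows. The category IDs come from the table above; `CATEGORY_IDS` and `expandStartUrls` are hypothetical names, and the handling of URLs without a query string (using `?` instead of `&`) is an added assumption. The actual logic lives in src/url_utils.ts and may differ.

```typescript
// Category IDs as listed in the table above.
const CATEGORY_IDS: Record<string, number> = {
  "Other": 0,
  "Music": 1,
  "Visual Arts": 2,
  "Performing Arts": 3,
  "Film": 4,
  "Lectures & Books": 5,
  "Fashion": 6,
  "Food & Drink": 7,
  "Festivals & Fairs": 8,
  "Charities": 9,
  "Sports & Active life": 10,
  "Nightlife": 11,
  "Kids & Family": 12,
};

// Hypothetical helper: one request URL per (start URL, category) pair.
function expandStartUrls(startUrls: string[], categories?: string[]): string[] {
  if (categories === undefined || categories.length === 0) {
    return [...startUrls]; // no filter: start URLs are used unchanged
  }
  const expanded: string[] = [];
  for (const url of startUrls) {
    for (const categoryName of categories) {
      const id = CATEGORY_IDS[categoryName];
      if (id === undefined) {
        throw new Error(`Unknown category: ${categoryName}`);
      }
      // Assumption: fall back to "?" when the URL has no query string yet.
      const separator = url.includes("?") ? "&" : "?";
      expanded.push(`${url}${separator}c=${id}`);
    }
  }
  return expanded;
}

const expandedRequests = expandStartUrls(
  [
    "https://www.yelp.com/events/austin-tx-us/browse?official=0",
    "https://www.yelp.com/events/san-diego-ca-us/browse?official=0",
  ],
  ["Music", "Food & Drink"],
);
// expandedRequests holds 4 URLs, the first ending in "&c=1" and the second in "&c=7".
```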

Browser Runtime Defaults

  • Default browser type: chromium
  • Supported browser types: chromium, firefox, webkit
  • Default viewport: 1366x768
  • Default javaScriptEnabled: true
  • Default ignoreHTTPSErrors: true
  • User-Agent selection:
    • If browserUserAgents is provided and non-empty, one value is chosen randomly per run.
    • Otherwise, one value is chosen randomly from a built-in fallback pool.

When browserType is chromium, the actor always applies fixed launch arguments:

  • --disable-gpu
  • --no-sandbox
  • --disable-dev-shm-usage
  • --disable-extensions
  • --disable-popup-blocking
  • --disable-blink-features=AutomationControlled
  • --disable-web-security
  • --disable-background-networking
  • --disable-notifications
  • --disable-infobars
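
Put together, the runtime defaults above could be assembled roughly like this. It is a sketch, not the contents of src/browser_runtime.ts: the function names are hypothetical, the fallback user-agent strings are placeholders (the real built-in pool is not listed here), and the shape of the returned object is assumed. The option names `args`, `userAgent`, `viewport`, `javaScriptEnabled`, and `ignoreHTTPSErrors` mirror Playwright's launch and context options.

```typescript
type BrowserType = "chromium" | "firefox" | "webkit";

// Fixed Chromium launch arguments, as listed above.
const CHROMIUM_LAUNCH_ARGS: readonly string[] = [
  "--disable-gpu",
  "--no-sandbox",
  "--disable-dev-shm-usage",
  "--disable-extensions",
  "--disable-popup-blocking",
  "--disable-blink-features=AutomationControlled",
  "--disable-web-security",
  "--disable-background-networking",
  "--disable-notifications",
  "--disable-infobars",
];

// Placeholder values only: the actor's real built-in pool is not documented here.
const FALLBACK_USER_AGENTS: readonly string[] = [
  "ExampleBrowser/1.0 (placeholder A)",
  "ExampleBrowser/1.0 (placeholder B)",
];

// Picks one user agent; the random source is injectable for testing.
function pickUserAgent(
  custom: readonly string[] | undefined,
  random: () => number = Math.random,
): string {
  const pool = custom !== undefined && custom.length > 0 ? custom : FALLBACK_USER_AGENTS;
  const index = Math.min(pool.length - 1, Math.floor(random() * pool.length));
  const chosen = pool[index];
  if (chosen === undefined) throw new Error("User-agent pool is empty");
  return chosen;
}

interface RuntimeInput {
  browserType?: BrowserType;
  browserUserAgents?: string[];
  browserViewportWidth?: number;
  browserViewportHeight?: number;
  browserJavaScriptEnabled?: boolean;
  browserIgnoreHTTPSErrors?: boolean;
}

// Hypothetical builder: called once per run, so the user agent is fixed for the run.
function buildBrowserRuntime(input: RuntimeInput = {}, random: () => number = Math.random) {
  const browserType: BrowserType = input.browserType ?? "chromium";
  return {
    browserType,
    launchOptions: {
      args: browserType === "chromium" ? [...CHROMIUM_LAUNCH_ARGS] : [],
    },
    contextOptions: {
      userAgent: pickUserAgent(input.browserUserAgents, random),
      viewport: {
        width: input.browserViewportWidth ?? 1366,
        height: input.browserViewportHeight ?? 768,
      },
      javaScriptEnabled: input.browserJavaScriptEnabled ?? true,
      ignoreHTTPSErrors: input.browserIgnoreHTTPSErrors ?? true,
    },
  };
}
```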

Output

Each dataset item is a YelpEventRecord with these key fields:

| Field | Type | Description |
| --- | --- | --- |
| source_family | "yelp_events" | Fixed identifier |
| platform | "yelp" | Fixed identifier |
| raw_title | string \| null | Event title |
| raw_description | string \| null | Event description text (line breaks preserved, including `<br>`-derived breaks) |
| raw_start_text | string \| null | Start date/time text (ISO when available) |
| raw_end_text | string \| null | End date/time text (ISO when available) |
| raw_location_text | string \| null | Venue/location text (line breaks, including `<br>`, normalized to comma-space) |
| location | string \| null | Venue/business name extracted from the event page biz-name link |
| zip_code | string \| null | 5-digit ZIP extracted from raw_location_text when pattern is present |
| raw_image_url | string \| null | Event image URL |
| event_url | string \| null | Direct event page URL |
| external_event_id | string \| null | Yelp event slug |
| raw_payload_json | object | Card HTML, detail HTML, strategy, version |
| is_cancelled_hint | boolean \| null | Cancellation detection |
| starts_at_hint | string \| null | ISO datetime if parseable from start text |
| ends_at_hint | string \| null | ISO datetime if parseable from end text |
| fetched_at | string | ISO timestamp |

Full schema: see src/types.ts YelpEventRecord interface.
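
To make the raw_location_text and zip_code rules concrete, here is a minimal sketch of both normalizations. The function names are hypothetical and the regular expressions are assumptions; in particular, preferring the last 5-digit group (so that a street number is not mistaken for a ZIP) is a design choice of this sketch, not a documented behavior of the actor.

```typescript
// Hypothetical helper: turns <br> and newline breaks into ", " separators.
function normalizeLocationText(rawHtml: string): string | null {
  const text = rawHtml
    .replace(/<br\s*\/?>/gi, "\n") // treat <br> as a line break
    .replace(/<[^>]+>/g, "") // drop any remaining tags
    .split(/\r?\n/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join(", ");
  return text.length > 0 ? text : null;
}

// Hypothetical helper: returns the 5-digit ZIP when the pattern is present.
function extractZipCode(locationText: string | null): string | null {
  if (locationText === null) return null;
  const matches = locationText.match(/\b\d{5}(?:-\d{4})?\b/g);
  if (matches === null) return null;
  // Assumption: the ZIP is the last 5-digit group; a ZIP+4 suffix is dropped.
  const last = matches[matches.length - 1];
  return last === undefined ? null : last.slice(0, 5);
}

const sampleLocation = normalizeLocationText("123 Main St<br>Austin, TX 78701");
const sampleZip = extractZipCode(sampleLocation);
// sampleLocation is "123 Main St, Austin, TX 78701" and sampleZip is "78701".
```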

BFU-Feeder Integration

This repository's output aligns with BFU-Feeder's RawEventContract:

| Actor field | Contract field |
| --- | --- |
| raw_title | raw_title |
| raw_start_text | raw_start_text |
| raw_location_text | raw_location_text |
| zip_code | zip_code |
| event_url | raw_url |
| raw_image_url | raw_image_url |
| raw_description | raw_description |
| raw_end_text | raw_end_text |
| ends_at_hint | ends_at_hint |
| raw_payload_json | raw_payload_json |
| is_cancelled_hint | is_cancelled_hint |
| external_event_id | external_event_id (prefixed with apify- by mapper) |
| fetched_at | fetched_at |
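
On the consuming side, the mapping amounts to one rename and one prefix. The sketch below is illustrative only: the type and function names are hypothetical, the types cover just the fields in the table, and it assumes that a null external_event_id stays null rather than becoming "apify-null". Which component performs the prefixing is not specified here beyond "by mapper".

```typescript
// Subset of the actor's output covering only the fields in the mapping table.
interface ActorRecordSubset {
  raw_title: string | null;
  raw_start_text: string | null;
  raw_location_text: string | null;
  zip_code: string | null;
  event_url: string | null;
  raw_image_url: string | null;
  raw_description: string | null;
  raw_end_text: string | null;
  ends_at_hint: string | null;
  raw_payload_json: Record<string, unknown>;
  is_cancelled_hint: boolean | null;
  external_event_id: string | null;
  fetched_at: string;
}

// Assumed contract shape: identical except event_url is renamed to raw_url.
type RawEventContractSketch = Omit<ActorRecordSubset, "event_url"> & {
  raw_url: string | null;
};

// Hypothetical mapper applying the rename and the apify- prefix.
function toRawEventContract(record: ActorRecordSubset): RawEventContractSketch {
  const { event_url, external_event_id, ...shared } = record;
  return {
    ...shared,
    raw_url: event_url,
    // Assumption: a missing ID is passed through as null, not prefixed.
    external_event_id: external_event_id === null ? null : `apify-${external_event_id}`,
  };
}

const sampleContract = toRawEventContract({
  raw_title: "Live Music Night",
  raw_start_text: null,
  raw_location_text: "123 Main St, Austin, TX 78701",
  zip_code: "78701",
  event_url: "https://www.yelp.com/events/austin-live-music-night",
  raw_image_url: null,
  raw_description: null,
  raw_end_text: null,
  ends_at_hint: null,
  raw_payload_json: {},
  is_cancelled_hint: null,
  external_event_id: "austin-live-music-night",
  fetched_at: "2024-01-01T00:00:00.000Z",
});
```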

Development

Prerequisites

  • Node.js 18+
  • npm 9+

Setup

npm install

Build

npm run build # TypeScript → dist/
make build # Same as npm run build

Test

npm test # Run all tests with vitest
npm run test:coverage # Run tests with coverage output
npm run test:watch # Watch mode
make test # Run tests with coverage output
make test-watch # Watch mode via make

Common Make targets

make help # List available targets
make install # Install dependencies
make build # Compile TypeScript
make test # Run tests with coverage report
make test-watch # Run Vitest in watch mode
make run # Run the actor locally with Apify CLI
make clean # Remove dist/, coverage/, and local runtime artifacts

Run locally

# Using Apify CLI
apify run --input='{"startUrls": [{"url": "https://www.yelp.com/events/austin-tx-us/browse?official=0"}], "maxItems": 10}'

Architecture

src/
├── main.ts # Actor entry point + PlaywrightCrawler orchestration
├── types.ts # All TypeScript interfaces + category map
├── url_utils.ts # URL expansion for category filter
├── navigation.ts # Page classification (list, detail, blocked, unsupported)
├── list_parser.ts # Multi-strategy event card extraction from browse pages
├── detail_parser.ts # Event detail page enrichment
├── output_mapper.ts # Merge card + detail → YelpEventRecord
├── browser_runtime.ts # Browser runtime normalization and launch settings
└── diagnostics.ts # Run metrics tracker
tests/
├── url_utils.test.ts
├── list_parser.test.ts
├── detail_parser.test.ts
├── output_mapper.test.ts
├── browser_runtime.test.ts
└── diagnostics.test.ts

Parser Strategies

The list parser tries multiple selector strategies in order:

  1. classic-events-list: ul.events-index_events-list li and similar containers
  2. card-layout: [class*="event-card"] and data-testid patterns
  3. anchor-fallback: links to /events/ detail pages with container inference
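
The ordered fallback can be sketched as "first strategy that yields cards wins". Everything below is an assumption made for illustration: the `ListStrategy` interface, the function names, and the regex-based anchor fallback (the actual parser in src/list_parser.ts presumably works on the DOM and also infers the surrounding container). Only the strategy names come from the list above.

```typescript
interface EventCard {
  title: string | null;
  url: string | null;
}

interface ListStrategy {
  name: string;
  extract: (pageHtml: string) => EventCard[];
}

// Tries each strategy in order and stops at the first one that finds cards.
function parseWithStrategies(
  pageHtml: string,
  strategies: ListStrategy[],
): { strategy: string | null; cards: EventCard[] } {
  for (const candidate of strategies) {
    const cards = candidate.extract(pageHtml);
    if (cards.length > 0) {
      return { strategy: candidate.name, cards };
    }
  }
  return { strategy: null, cards: [] };
}

// Simplified stand-in for the anchor fallback: collects unique /events/ links.
const anchorFallback: ListStrategy = {
  name: "anchor-fallback",
  extract: (pageHtml) => {
    const cards: EventCard[] = [];
    const seen = new Set<string>();
    const linkPattern = /href="(\/events\/[^"?#]+)"/g;
    let match: RegExpExecArray | null;
    while ((match = linkPattern.exec(pageHtml)) !== null) {
      const path = match[1];
      // Skip browse pages and links that were already collected.
      if (path === undefined || path.includes("/browse") || seen.has(path)) continue;
      seen.add(path);
      cards.push({ title: null, url: `https://www.yelp.com${path}` });
    }
    return cards;
  },
};

// Stub standing in for a selector-based strategy that finds nothing on this page.
const classicEventsList: ListStrategy = {
  name: "classic-events-list",
  extract: () => [],
};

const parsedList = parseWithStrategies(
  '<a href="/events/austin-tx-us/browse?official=0">All</a>' +
    '<a href="/events/austin-live-music-night">Live</a>' +
    '<a href="/events/austin-live-music-night">Details</a>',
  [classicEventsList, anchorFallback],
);
```

Recording the winning strategy name alongside the cards matches the "strategy" entry that raw_payload_json is documented to carry.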

Page Classification

| Classification | Meaning |
| --- | --- |
| yelp_events_list | Yelp events browse page with event cards |
| yelp_event_detail | Individual event detail page |
| blocked_challenge | CAPTCHA, rate-limit block, or "unusual activity" page |
| unsupported_layout | Page loaded but no recognizable events layout |
| error | Navigation or timeout error |
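
A classifier producing these labels might look like the sketch below. The five labels are the ones in the table; the function name and every heuristic (keyword checks, URL path patterns, treating a null page body as a navigation failure) are assumptions for illustration, and the real rules in src/navigation.ts are likely more thorough.

```typescript
type PageClassification =
  | "yelp_events_list"
  | "yelp_event_detail"
  | "blocked_challenge"
  | "unsupported_layout"
  | "error";

// Hypothetical classifier: pageHtml is null when navigation failed or timed out.
function classifyPage(url: string, pageHtml: string | null): PageClassification {
  if (pageHtml === null) return "error";

  let path: string;
  try {
    path = new URL(url).pathname;
  } catch {
    return "error"; // unparseable URL
  }

  // Assumed block markers, based on the wording in the table above.
  const lower = pageHtml.toLowerCase();
  if (lower.includes("captcha") || lower.includes("unusual activity")) {
    return "blocked_challenge";
  }

  // Assumed URL shapes: /events/<city>/browse for lists, /events/<slug> for details.
  if (/^\/events\/[^/]+\/browse\/?$/.test(path)) {
    const hasEventLinks = /href="\/events\/[^"]+"/.test(pageHtml);
    return hasEventLinks ? "yelp_events_list" : "unsupported_layout";
  }
  if (/^\/events\/[^/]+\/?$/.test(path)) {
    return "yelp_event_detail";
  }
  return "unsupported_layout";
}
```

Checking for a block page before inspecting the layout matters, because a challenge page served at a browse URL would otherwise be reported as an unsupported layout.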

Diagnostics

Run diagnostics are saved to the key-value store under the key run-diagnostics and include:

  • Pages visited, events found/emitted/skipped
  • Detail page attempt/success counts
  • Partial failure count (events emitted with warnings)
  • Per-source breakdown (URL, classification, counts, warnings)
  • Global parser warnings and unsupported layout notes
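
For orientation, a tracker that accumulates the metrics listed above could be shaped like this. The interface and method names are hypothetical, and the exact JSON stored under run-diagnostics may be structured differently; see src/diagnostics.ts for the real tracker.

```typescript
interface SourceDiagnostics {
  url: string;
  classification: string | null;
  eventsFound: number;
  eventsEmitted: number;
  eventsSkipped: number;
  warnings: string[];
}

interface RunDiagnostics {
  pagesVisited: number;
  eventsFound: number;
  eventsEmitted: number;
  eventsSkipped: number;
  detailAttempts: number;
  detailSuccesses: number;
  partialFailures: number;
  sources: SourceDiagnostics[];
  parserWarnings: string[];
}

// Hypothetical tracker: per-source counters plus global totals.
class DiagnosticsTracker {
  private readonly sources = new Map<string, SourceDiagnostics>();
  private readonly parserWarnings: string[] = [];
  private pagesVisited = 0;
  private detailAttempts = 0;
  private detailSuccesses = 0;
  private partialFailures = 0;

  private sourceFor(url: string): SourceDiagnostics {
    let entry = this.sources.get(url);
    if (entry === undefined) {
      entry = { url, classification: null, eventsFound: 0, eventsEmitted: 0, eventsSkipped: 0, warnings: [] };
      this.sources.set(url, entry);
    }
    return entry;
  }

  recordListPage(url: string, classification: string, eventsFound: number): void {
    const entry = this.sourceFor(url);
    entry.classification = classification;
    entry.eventsFound += eventsFound;
    this.pagesVisited += 1;
  }

  // An event emitted with warnings counts as a partial failure.
  recordEvent(url: string, outcome: "emitted" | "skipped", warnings: string[] = []): void {
    const entry = this.sourceFor(url);
    if (outcome === "emitted") entry.eventsEmitted += 1;
    else entry.eventsSkipped += 1;
    entry.warnings.push(...warnings);
    if (outcome === "emitted" && warnings.length > 0) this.partialFailures += 1;
  }

  recordDetail(success: boolean): void {
    this.detailAttempts += 1;
    if (success) this.detailSuccesses += 1;
  }

  addParserWarning(message: string): void {
    this.parserWarnings.push(message);
  }

  // Plain object that could be saved under the run-diagnostics key.
  snapshot(): RunDiagnostics {
    const sources = Array.from(this.sources.values());
    const sum = (pick: (s: SourceDiagnostics) => number): number =>
      sources.reduce((total, s) => total + pick(s), 0);
    return {
      pagesVisited: this.pagesVisited,
      eventsFound: sum((s) => s.eventsFound),
      eventsEmitted: sum((s) => s.eventsEmitted),
      eventsSkipped: sum((s) => s.eventsSkipped),
      detailAttempts: this.detailAttempts,
      detailSuccesses: this.detailSuccesses,
      partialFailures: this.partialFailures,
      sources,
      parserWarnings: [...this.parserWarnings],
    };
  }
}
```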