Webinar Event Discovery Scraper

Pricing

Pay per event

Discover public webinars, workshops, demos, and B2B event landing pages from domains or URLs with dates, speakers, CTAs, and evidence.

Developer

Stas Persiianenko

Maintained by Community

Discover public webinars, workshops, virtual events, product demos, summits, and on-demand event landing pages from company domains or known event URLs.

The actor crawls public B2B marketing pages, follows same-domain event links, extracts structured data when available, and saves normalized records with dates, CTAs, speakers, topics, evidence snippets, and confidence scores.

Use it when you need repeatable event intelligence for lead generation, competitive monitoring, partnership research, content calendars, or CRM enrichment.

What does Webinar Event Discovery Scraper do?

It turns a list of public websites into a clean dataset of webinar and event opportunities.

  • 🔎 Starts from supplied URLs or domains.
  • 🧭 Follows same-domain links that look like webinars, events, demos, workshops, summits, conferences, or training pages.
  • 🧱 Reads structured Event data from JSON-LD when websites publish it.
  • 📝 Falls back to page titles, metadata, headings, body text, CTAs, and speaker blocks when structured data is missing.
  • 📅 Normalizes dates and status hints.
  • 🎯 Finds registration URLs and CTA text.
  • 🧪 Adds evidence snippets and confidence scores so humans can review borderline pages quickly.
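To make the JSON-LD step concrete, here is a minimal sketch of how structured Event data could be read from a page using only the Python standard library. It is an illustration, not the actor's actual implementation: the names `JsonLdCollector` and `extract_events` are hypothetical, and the rule "any `@type` ending in `Event`" is an assumption chosen to also cover schema.org subtypes such as `BusinessEvent`.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_events(html):
    """Return JSON-LD objects whose @type looks like a schema.org Event."""
    collector = JsonLdCollector()
    collector.feed(html)
    events = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD is skipped rather than failing the page
        # A block may hold a single object, a list, or an @graph wrapper.
        if isinstance(data, dict):
            candidates = data.get("@graph", [data])
        elif isinstance(data, list):
            candidates = data
        else:
            candidates = []
        for item in candidates:
            if not isinstance(item, dict):
                continue
            types = item.get("@type", [])
            types = [types] if isinstance(types, str) else types
            # Assumption: treat Event and its subtypes (names ending in "Event") alike.
            if any(isinstance(t, str) and t.endswith("Event") for t in types):
                events.append(item)
    return events
```

Pages without such a block return an empty list, which is where the fallback to titles, headings, and body text would take over.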

Who is it for?

Demand generation teams

Track upcoming webinars across target accounts, partners, competitors, and industry publishers.

Sales development teams

Find timely campaign hooks and company initiatives to use in outbound messaging.

Competitive intelligence teams

Monitor competitor webinars, demos, workshops, and product education pages.

Partner marketing teams

Build shared event calendars from partner and ecosystem websites.

Content operations teams

Audit old and new event assets across many public resource centers.

Why use it?

Manual webinar research is repetitive, and events are easy to miss.

This actor helps you:

  • Save hours of manual browsing.
  • Find registration pages before events happen.
  • Detect on-demand webinars that can become lead magnets.
  • Standardize messy event pages into one schema.
  • Feed CRM, spreadsheets, Slack alerts, or BI dashboards.
  • Re-run the same sources every week or month.

What data can it extract?

| Field | Description |
| --- | --- |
| sourceUrl | Page where the event was found |
| eventTitle | Webinar or event title |
| eventType | Structured type or inferred event type |
| status | upcoming, on-demand, past, or unknown |
| startDate | ISO start date when detected |
| endDate | ISO end date when detected |
| startTime | Time portion from the start date |
| endTime | Time portion from the end date |
| timezone | Timezone hint when present |
| hostCompany | Organizer, publisher, site name, or domain |
| speakers | Speaker or presenter names detected on the page |
| description | Metadata or structured event description |
| agenda | Short agenda-like evidence line |
| topics | Topic keywords inferred from the page |
| registrationUrl | Best registration, watch, or CTA URL |
| ctaText | CTA anchor/button text |
| gatedFormProvider | Hints such as Marketo, HubSpot, ON24, Zoom, or Cvent |
| evidenceSnippets | Short snippets proving why the page matched |
| pageHash | Stable hash for change detection |
| discoveredAt | Extraction timestamp |
| confidence | 0-1 score for review prioritization |

How much does it cost to discover webinar event leads?

The actor uses pay-per-event pricing.

You pay a small start fee for each run and then a per-record event fee for saved webinar/event records.

This is useful for monitoring workflows because small tests stay cheap while larger recurring crawls scale with the amount of useful data returned.
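As a rough illustration of how the two fees combine, the sketch below adds a per-run start fee to a per-record fee. The function name `estimate_run_cost` and the fee values in the usage line are placeholders, not the actor's real prices; check the pricing tab on Apify for current figures.

```python
def estimate_run_cost(saved_records, start_fee_usd, per_record_fee_usd):
    """Estimated cost of one run: start fee plus a fee per saved record."""
    if saved_records < 0:
        raise ValueError("saved_records must be non-negative")
    return round(start_fee_usd + saved_records * per_record_fee_usd, 4)


# Hypothetical fees, for illustration only.
small_test = estimate_run_cost(saved_records=10, start_fee_usd=0.01, per_record_fee_usd=0.002)
weekly_crawl = estimate_run_cost(saved_records=100, start_fee_usd=0.01, per_record_fee_usd=0.002)
```

With these made-up fees the start fee dominates a small test, while the per-record fee dominates a larger crawl.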

How to use it

  1. Open the actor on Apify.
  2. Add public event hub URLs or company domains.
  3. Keep the default crawl caps for the first run.
  4. Optionally set include or exclude keywords.
  5. Run the actor.
  6. Export the dataset as CSV, JSON, Excel, or via API.
  7. Schedule it weekly or monthly for monitoring.

Input options

Start URLs

Use this for known event hubs, resource centers, webinar listing pages, or landing pages.

Examples:

  • https://www.salesforce.com/events/webinars/
  • https://www.hubspot.com/resources/webinars
  • https://example.com/events

Domains

Use this when you know the company but not the exact event URL.

Examples:

  • salesforce.com
  • hubspot.com
  • example.com

The actor starts at the homepage and follows event-like same-domain links.
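The idea of starting at a homepage and staying on the same domain can be sketched as follows. This is an assumption about reasonable behavior, not the actor's confirmed logic: the helpers `homepage_url` and `is_same_domain` are hypothetical, and treating `www.example.com` and `example.com` as the same host (while excluding other subdomains) is a choice made for this example.

```python
from urllib.parse import urljoin, urlsplit


def homepage_url(domain):
    """Turn a bare domain such as 'example.com' into a homepage URL."""
    domain = domain.strip().lower()
    if "://" not in domain:
        domain = "https://" + domain
    parts = urlsplit(domain)
    return f"{parts.scheme}://{parts.netloc}/"


def _bare_host(url):
    """Hostname without a leading 'www.' so both variants compare equal."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_same_domain(base_url, link):
    """True when link (absolute or relative) stays on base_url's host."""
    absolute = urljoin(base_url, link)  # resolves relative links like '/events'
    return _bare_host(absolute) == _bare_host(base_url)
```

Links such as `mailto:` addresses have no hostname, so they fall outside the crawl scope under this check.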

Maximum pages per domain

Caps how many pages the crawler visits on each domain, which keeps crawls from sprawling.

For a quick test, use 5-15 pages per domain.

For a deeper crawl, use 50-100 pages per domain.

Maximum event records

Stops the run after this many saved records.

Use a low value for testing and a higher value for scheduled discovery.

Include keywords

The default keywords target webinars, events, workshops, demos, summits, conferences, training pages, and on-demand pages.

Add vertical terms like security, AI, data, or CRM when you want a tighter dataset.

Exclude keywords

Use this to drop irrelevant pages.

Common examples:

  • careers
  • investor
  • press release
  • sponsorship
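One way to picture how include and exclude keywords interact is a simple substring check over a link's URL and anchor text. The sketch below is an assumption for illustration: the function `matches_keywords` is hypothetical, and the rule that an exclude match always wins may differ from what the actor does.

```python
def matches_keywords(url, link_text, include, exclude):
    """Keep a link if it hits an include keyword and no exclude keyword."""
    haystack = f"{url} {link_text}".lower()
    # Assumption: exclude keywords take priority over include keywords.
    if any(term.lower() in haystack for term in exclude):
        return False
    return any(term.lower() in haystack for term in include)


include = ["webinar", "event", "workshop", "demo"]
exclude = ["careers", "investor", "press release", "sponsorship"]
```

Under this rule a careers page that mentions events is dropped, because the exclude match is checked first.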

Status filters

Keep upcoming events, on-demand assets, past events, or unknown pages.

Unknown is useful because many landing pages hide dates inside images or scripts.

Date filters

Use fromDate and toDate when you only want records with detected dates in a specific window.

Date filters compare only detected dates. Records with no detected date are kept, because there is nothing to compare against the window.
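The date-window rule can be sketched as a small predicate. This is an illustrative reading of the behavior described above, not the actor's code: `passes_date_filter` is a hypothetical helper, and comparing only the calendar-date part of `startDate` (ignoring the time and timezone) is a simplification.

```python
from datetime import date


def passes_date_filter(record, from_date=None, to_date=None):
    """Apply a fromDate/toDate window; undated records always pass."""
    raw = record.get("startDate")
    if not raw:
        return True  # no detected date, so there is nothing to compare
    # Simplification: use only the YYYY-MM-DD prefix of the ISO timestamp.
    start = date.fromisoformat(raw[:10])
    if from_date and start < from_date:
        return False
    if to_date and start > to_date:
        return False
    return True
```

If you want strictly dated results, filter out records whose `startDate` is empty after export.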

Proxy option

Public B2B marketing pages usually work without a proxy.

Enable Apify Proxy only when a target blocks normal requests.

Output example

{
  "sourceUrl": "https://www.example.com/events/ai-webinar",
  "eventTitle": "AI Automation Webinar",
  "eventType": "Webinar",
  "status": "upcoming",
  "startDate": "2026-07-10T17:00:00.000Z",
  "endDate": null,
  "startTime": "17:00:00",
  "endTime": null,
  "timezone": "UTC",
  "hostCompany": "example.com",
  "speakers": ["Jane Smith"],
  "description": "Join our AI automation webinar for marketing teams.",
  "agenda": "Learn how to automate webinar follow-up workflows.",
  "topics": ["webinar", "ai", "automation", "marketing"],
  "registrationUrl": "https://www.example.com/events/ai-webinar/register",
  "ctaText": "Register now",
  "gatedFormProvider": "marketo",
  "evidenceSnippets": ["Register now for our AI automation webinar"],
  "pageHash": "abc123def4567890",
  "discoveredAt": "2026-06-25T00:00:00.000Z",
  "confidence": 0.9
}

Tips for better results

  • Start with known webinar or event hub URLs when possible.
  • Use domains for discovery, then save high-quality event hubs for recurring runs.
  • Keep page caps conservative for large enterprise websites.
  • Add industry keywords to reduce noisy general event pages.
  • Keep unknown status enabled if you care about pages where dates are hidden in scripts.
  • Use pageHash to detect changed pages over time.

Common workflows

Competitive webinar monitoring

Add competitor domains and schedule the actor weekly.

Export upcoming and on-demand records to a spreadsheet or CRM.

Account-based marketing research

Add target account domains.

Find recent webinars and demos that reveal strategic priorities.

Partner event calendar

Add partner domains.

Collect upcoming workshops and webinars in one dataset.

Lead magnet discovery

Keep on-demand status enabled.

Find gated webinars and resource pages for content intelligence.

Integrations

The dataset can be sent to:

  • Google Sheets for campaign planning.
  • Airtable for editorial calendars.
  • HubSpot or Salesforce for account enrichment.
  • Slack for new event alerts.
  • Snowflake, BigQuery, or S3 for historical analysis.
  • Zapier or Make for no-code automations.
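Spreadsheet-style destinations expect flat rows, while fields such as speakers and topics are lists. The sketch below shows one way to flatten records into CSV text with the standard library. The helper `records_to_csv`, the chosen column set, and the `"; "` list separator are assumptions for this example, and the sample record (including the second speaker) is made up.

```python
import csv
import io

# Illustrative subset of output fields; adjust to the columns you need.
COLUMNS = ["eventTitle", "status", "startDate", "hostCompany",
           "speakers", "registrationUrl", "confidence"]


def records_to_csv(records, columns=COLUMNS):
    """Serialize records to CSV text, joining list fields with '; '."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for record in records:
        row = {}
        for col in columns:
            value = record.get(col)
            if isinstance(value, list):
                value = "; ".join(str(v) for v in value)
            row[col] = "" if value is None else value
        writer.writerow(row)
    return buffer.getvalue()


sample = [{
    "eventTitle": "AI Automation Webinar",
    "status": "upcoming",
    "startDate": "2026-07-10T17:00:00.000Z",
    "hostCompany": "example.com",
    "speakers": ["Jane Smith", "Sam Lee"],  # second name is invented sample data
    "registrationUrl": "https://www.example.com/events/ai-webinar/register",
    "confidence": 0.9,
}]
csv_text = records_to_csv(sample)
```

Apify's built-in CSV export covers the common case; a helper like this is mainly useful when you post-process records before loading them elsewhere.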

API usage

Node.js

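import { ApifyClient } from 'apify-client';

// Read the API token from the environment rather than hard-coding it.
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('automation-lab/webinar-event-discovery-scraper').call({
    startUrls: [{ url: 'https://www.salesforce.com/events/webinars/' }],
    maxPagesPerDomain: 20,
    maxResults: 100,
});

// The run's default dataset holds the saved event records.
console.log(run.defaultDatasetId);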

Python

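from apify_client import ApifyClient

# Replace the placeholder with your Apify API token.
client = ApifyClient("<APIFY_TOKEN>")

# Start the actor and wait for the run to finish.
run = client.actor("automation-lab/webinar-event-discovery-scraper").call(run_input={
    "domains": ["salesforce.com"],
    "maxPagesPerDomain": 20,
    "maxResults": 100,
})

# The run's default dataset holds the saved event records.
print(run["defaultDatasetId"])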

cURL

curl -X POST "https://api.apify.com/v2/acts/automation-lab~webinar-event-discovery-scraper/runs?token=$APIFY_TOKEN" \
-H "Content-Type: application/json" \
-d '{"startUrls":[{"url":"https://www.salesforce.com/events/webinars/"}],"maxResults":100}'

MCP usage

Use the Apify MCP server with this actor in Claude Code, Claude Desktop, or compatible MCP clients.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/webinar-event-discovery-scraper

Claude Code setup:

claude mcp add --transport http apify-webinar-events "https://mcp.apify.com/?tools=automation-lab/webinar-event-discovery-scraper"

Claude Desktop JSON config:

{
  "mcpServers": {
    "apify-webinar-events": {
      "url": "https://mcp.apify.com/?tools=automation-lab/webinar-event-discovery-scraper"
    }
  }
}

Example prompts:

  • "Find upcoming webinars from these five competitor domains and summarize the topics."
  • "Run the webinar event discovery scraper for this partner list and export registration URLs."
  • "Compare this week's discovered webinar pages with last week's page hashes."

Scheduling

Schedule the actor weekly for competitor monitoring or monthly for broad account research.

Use a stable source list and compare pageHash, eventTitle, and startDate between runs.
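A comparison between two runs could look like the sketch below, which keys records by `sourceUrl` and flags a change when `pageHash`, `eventTitle`, or `startDate` differs. The function `diff_runs` is a hypothetical helper, and keying by `sourceUrl` alone assumes each page yields one record.

```python
def _fingerprint(record):
    """Fields compared between runs to decide whether a page changed."""
    return (record.get("pageHash"), record.get("eventTitle"), record.get("startDate"))


def diff_runs(previous, current):
    """Split the current run into new and changed records; list removed ones."""
    prev_by_url = {r["sourceUrl"]: r for r in previous}
    current_urls = {r["sourceUrl"] for r in current}
    new, changed = [], []
    for record in current:
        old = prev_by_url.get(record["sourceUrl"])
        if old is None:
            new.append(record)
        elif _fingerprint(old) != _fingerprint(record):
            changed.append(record)
    removed = [r for r in previous if r["sourceUrl"] not in current_urls]
    return {"new": new, "changed": changed, "removed": removed}
```

The `new` list is a natural source for alerts, while `changed` often signals a rescheduled date or an event that moved to on-demand.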

Data quality notes

Websites publish event data inconsistently.

The actor combines structured data and heuristics.

Confidence scores help you decide which records need review.

A missing date does not always mean the page is irrelevant.

Some pages hide dates in images, third-party widgets, or scripts.
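To show how confidence scores might drive review, the sketch below splits records into an accepted set and a review queue. The helper `split_by_confidence` and the 0.7 threshold are assumptions for this example; pick a cutoff that matches how your own data behaves.

```python
def split_by_confidence(records, threshold=0.7):
    """Return (accepted, review); review is sorted with the highest scores first."""
    accepted = [r for r in records if r.get("confidence", 0) >= threshold]
    review = [r for r in records if r.get("confidence", 0) < threshold]
    # Near-threshold records are the most likely to be valid, so show them first.
    review.sort(key=lambda r: r.get("confidence", 0), reverse=True)
    return accepted, review
```

Records with a missing date but a strong evidence snippet often land in the review queue, which is where a quick manual check pays off.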

FAQ

What websites work best?

Public B2B webinar hubs, event pages, demo pages, and resource centers work best. Private portals, login walls, and heavily scripted widgets may need exact URLs or a future browser fallback.

Can I monitor the same sources repeatedly?

Yes. Schedule the actor and compare sourceUrl, eventTitle, startDate, and pageHash across runs.

Troubleshooting

The run saved fewer records than expected

Increase maxPagesPerDomain, keep unknown status enabled, and add broader include keywords such as event, demo, and training.

A page is skipped

The page may be non-HTML, blocked, private, or outside the same domain crawl scope.

Try adding the exact landing page as a start URL.

Dates are missing

The site may render dates with JavaScript or images.

Use the evidence snippets and source URL to review those records manually.

The crawler visits too many pages

Lower maxPagesPerDomain, use more specific start URLs, and add include keywords.

Legality and responsible use

This actor is designed for public pages only.

Do not use it to bypass login walls, private communities, paywalls, or access controls.

Review each target site's terms and your local laws before running large crawls.

Respect robots.txt, rate limits, and data privacy rules.

Related Automation Lab actors that can complement this workflow:

  • Website Contact Finder for company contact enrichment.
  • Domain to company enrichment actors for account lists.
  • Search result scrapers for finding new webinar hubs.
  • LinkedIn or company profile scrapers where allowed by their terms.

Limitations

  • Generic crawling cannot guarantee every event page on every website.
  • JavaScript-only widgets may not be parsed until a browser fallback is added.
  • Registration forms may be embedded by third-party providers.
  • Some sites use regional redirects or consent walls.
  • Speaker extraction depends on page structure.

Best practices

  • Use high-quality seed URLs.
  • Run a small sample first.
  • Review confidence scores.
  • Export and deduplicate by source URL and title.
  • Re-run on a schedule for monitoring.
  • Keep source lists organized by segment or competitor set.
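Deduplicating by source URL and title can be sketched as below. The helper `dedupe_records` is hypothetical, and two choices are assumptions made for this example: trailing slashes and title case are ignored when building the key, and the record with the highest confidence wins when duplicates collide.

```python
def dedupe_records(records):
    """Keep one record per (source URL, title), preferring higher confidence."""
    best = {}
    for record in records:
        key = (
            (record.get("sourceUrl") or "").rstrip("/"),
            (record.get("eventTitle") or "").strip().lower(),
        )
        kept = best.get(key)
        if kept is None or record.get("confidence", 0) > kept.get("confidence", 0):
            best[key] = record
    # Dicts preserve insertion order, so output follows first appearance of each key.
    return list(best.values())
```

Running this over the combined export of several scheduled runs gives a single list without repeated pages.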

Example source list

{
  "startUrls": [
    { "url": "https://www.salesforce.com/events/webinars/" },
    { "url": "https://www.hubspot.com/resources/webinars" }
  ],
  "maxPagesPerDomain": 25,
  "maxResults": 100,
  "statuses": ["upcoming", "on-demand", "unknown"]
}

Changelog

0.1

Initial private build for public webinar and event discovery.

Support

If a target website consistently blocks public access, reduce crawl depth, add exact event URLs, or enable proxy for that site.

For best commercial results, keep runs focused on public B2B marketing and event pages.