Pricing

Pay per event

Webinar Landing Page Extractor

Extract webinar titles, dates, speakers, registration links, platform hints, and evidence from public event hubs and landing pages.

Developer

Stas Persiianenko

Last modified: 24 days ago

What does Webinar Landing Page Extractor do?

Webinar Landing Page Extractor crawls public pages that promote webinars, demos, virtual events, workshops, and on-demand sessions.

It reads the HTML, JSON-LD event metadata, headings, CTA links, and visible page text.

Then it returns normalized webinar records that are ready for spreadsheets, CRM enrichment, competitive research, and content calendars.

The actor does not bypass logins, private forms, paywalls, or registrant lists.

It only extracts information visible on public landing pages.

Who is it for?

Demand generation teams use it to monitor competitor webinar programs.

Revenue operations teams use it to build a repeatable feed of upcoming registration pages.

SDR teams use it to find topical events and timing signals for outreach.

Content marketers use it to audit webinar hubs and repurpose event topics.

Agencies use it to compare webinar calendars across many client competitors.

Analysts use it to normalize public event pages from many different website templates.

Why use it?

Public webinar pages are inconsistent.

One site uses JSON-LD Event markup.

Another hides the date in body copy.

Another uses a generic event hub with registration cards.

This actor combines structured extraction with resilient heuristics and evidence snippets, so you can review ambiguous records quickly.

What data can it extract?

| Field | Description |
| --- | --- |
| `title` | Webinar or event title |
| `host` | Website, organization, or event organizer |
| `dateText` | Raw date/time text found on the page |
| `startDateIso` | Parsed structured start date when available |
| `timezone` | Timezone abbreviation when visible |
| `status` | `upcoming`, `on-demand`, `past`, or `unknown` |
| `registrationUrl` | Best registration, watch, reserve, or join CTA link |
| `ctaText` | Visible text for the registration CTA |
| `speakers` | Speaker or presenter names when visible |
| `topics` | Headings and topic snippets from the page |
| `platformHints` | Zoom, ON24, Webex, Airmeet, BrightTALK, and similar hints |
| `confidence` | Extraction confidence score from 0 to 1 |
| `evidenceText` | Raw evidence used to support the record |

How much does it cost to extract webinar landing pages?

This actor uses pay-per-event pricing.

You pay a small start fee per run, plus one per-record event charge for each webinar record saved.

The default input is intentionally small so first tests stay inexpensive.

For large webinar hubs, run a small smoke test first, then increase `maxPagesPerStartUrl` and `maxItems`.
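The pricing model above is simple arithmetic, so a budget check can be sketched in a few lines. The prices below are placeholders chosen only for illustration, not the actor's real rates, and `estimate_run_cost` is a hypothetical helper name. Check the actor's pricing tab for the actual fees.

```python
def estimate_run_cost(records_saved: int, start_fee: float, per_record_fee: float) -> float:
    """Estimate one run's charge: one start fee plus one event per saved record."""
    if records_saved < 0:
        raise ValueError("records_saved must be non-negative")
    return start_fee + records_saved * per_record_fee


# Placeholder prices (assumptions, NOT the actor's published rates).
example_cost = estimate_run_cost(records_saved=20, start_fee=0.01, per_record_fee=0.002)
print(f"Estimated cost: ${example_cost:.3f}")
```

Because `maxItems` caps the number of saved records, it also caps the per-record part of this estimate.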

Quick start

1. Open the actor on Apify.
2. Paste one or more public webinar, event, or registration URLs into `startUrls`.
3. Keep `discoverLinks` enabled if the URL is a hub page.
4. Set `maxItems` to the number of webinar records you want.
5. Run the actor.
6. Export the dataset as JSON, CSV, or Excel, or connect it to your workflow.

Input options

`startUrls`

Use public webinar landing pages, webinar hubs, event pages, or registration pages.

Examples:

- https://www.salesforce.com/resources/webinars/
- https://www.semrush.com/webinars/
- A competitor webinar registration URL
- A product demo event page

`maxItems`

Stops the run after this many webinar records are saved.

Use a small number for testing.

Use a larger number for complete hub extraction.

`discoverLinks`

When enabled, the actor follows same-domain links that look like webinar, event, demo, workshop, register, or on-demand pages.

Disable this when you only want the exact URLs you provided.

`maxPagesPerStartUrl`

Caps how many pages the actor fetches for each start URL.

This protects your run budget and avoids crawling an entire website.

`includeKeywords`

Optional keywords that must appear in the extracted title, host, or evidence.

Use this for topic-specific monitoring, such as AI, security, or SEO.

`excludeKeywords`

Optional keywords that exclude matching records.

Use this to remove careers pages, unrelated conferences, or archived content.
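The two keyword options can be pictured as a filter over each extracted record. The following is a minimal sketch, not the actor's actual code. It assumes case-insensitive substring matching, that a record passes when any one include keyword matches, and that exclude keywords are checked against the same three fields. The function name `passes_keyword_filters` is hypothetical.

```python
def passes_keyword_filters(record, include_keywords=(), exclude_keywords=()):
    """Return True when a record survives the include/exclude keyword checks."""
    # Fields named in the includeKeywords description: title, host, evidence.
    haystack = " ".join(
        str(record.get(field) or "") for field in ("title", "host", "evidenceText")
    ).lower()
    # Any exclude keyword rejects the record outright.
    if any(keyword.lower() in haystack for keyword in exclude_keywords):
        return False
    # With include keywords set, at least one must match (assumption: "any", not "all").
    if include_keywords and not any(k.lower() in haystack for k in include_keywords):
        return False
    return True


sample = {"title": "AI Demo Webinar", "host": "Example", "evidenceText": "Register now"}
print(passes_keyword_filters(sample, include_keywords=["ai"]))
```

Plain substring matching means a short keyword such as "ai" also matches inside longer words, so longer or more specific keywords tend to give cleaner results.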

Output example

{
  "sourceUrl": "https://www.example.com/webinars/",
  "pageUrl": "https://www.example.com/webinars/ai-demo",
  "title": "AI Demo Webinar",
  "host": "Example",
  "dateText": "July 30, 2026 at 2 PM EST",
  "status": "upcoming",
  "registrationUrl": "https://www.example.com/register/ai-demo",
  "speakers": [{ "name": "Jane Doe", "role": "VP Marketing" }],
  "topics": ["AI Demo Webinar", "How teams automate workflows"],
  "platformHints": ["zoom"],
  "confidence": 0.85,
  "evidenceText": "AI Demo Webinar | July 30, 2026 at 2 PM EST | Register now"
}
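Records like the one above contain nested lists, which spreadsheets handle poorly. As an illustration, this sketch flattens `speakers`, `topics`, and `platformHints` into semicolon-separated strings and writes CSV with the standard library. The helper names are hypothetical, and the handling of speakers as either plain names or `{name, role}` objects is an assumption based on the example.

```python
import csv
import io


def _join(values):
    """Join non-empty values with '; ' for a single spreadsheet cell."""
    return "; ".join(str(value) for value in values if value)


def flatten_record(record):
    """Return a copy of the record with list fields collapsed to strings."""
    flat = dict(record)
    speakers = record.get("speakers") or []
    names = [s.get("name", "") if isinstance(s, dict) else s for s in speakers]
    flat["speakers"] = _join(names)
    flat["topics"] = _join(record.get("topics") or [])
    flat["platformHints"] = _join(record.get("platformHints") or [])
    return flat


def records_to_csv(records):
    """Serialize records to CSV text, using first-seen key order for columns."""
    rows = [flatten_record(record) for record in records]
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


example_record = {
    "title": "AI Demo Webinar",
    "speakers": [{"name": "Jane Doe", "role": "VP Marketing"}],
    "topics": ["AI Demo Webinar", "How teams automate workflows"],
    "platformHints": ["zoom"],
}
print(records_to_csv([example_record]))
```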

Discovery mode

Discovery mode is designed for webinar hub pages.

The actor fetches the hub page first.

Then it follows same-domain links whose URL or anchor text suggests webinar, event, demo, workshop, registration, or on-demand content.

It does not crawl off-domain links during discovery.

This keeps the run focused on the website you provided.
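The link-following rule described above can be sketched as a small predicate. This is an illustration under assumptions, not the actor's implementation: "same-domain" is treated here as an exact host match, the hint list is a guess based on the words named in this section, and `is_discovery_candidate` is a hypothetical name.

```python
from urllib.parse import urljoin, urlparse

# Assumed hint words, taken from the content types listed in this README.
DISCOVERY_HINTS = (
    "webinar", "event", "demo", "workshop", "register", "registration", "on-demand",
)


def is_discovery_candidate(hub_url, href, anchor_text=""):
    """Return True for same-host links whose path or anchor text looks event-related."""
    absolute = urlparse(urljoin(hub_url, href))
    hub_host = urlparse(hub_url).netloc.lower()
    # Reject non-HTTP links (mailto:, javascript:) and anything off-domain.
    if absolute.scheme not in ("http", "https") or absolute.netloc.lower() != hub_host:
        return False
    text = f"{absolute.path} {anchor_text}".lower()
    return any(hint in text for hint in DISCOVERY_HINTS)


hub = "https://www.example.com/webinars/"
print(is_discovery_candidate(hub, "/webinars/ai-demo", "Register now"))
```

An exact host match treats `www.example.com` and `example.com` as different sites. A real crawler may normalize these, so this strictness is a simplification.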

Date and timezone handling

The actor prefers structured JSON-LD event dates when a page provides them.

If no structured date exists, it looks for visible date and time text.

Date parsing can be ambiguous across regions and page templates.

For that reason, the actor always includes `dateText` and `evidenceText` so you can audit important records.
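The preference order described above (structured date first, visible text as a fallback) might look like the following sketch. It is an assumption-laden illustration: the regular expression only covers English "Month D, YYYY at H PM TZ" phrasing, the JSON-LD handling is minimal, and `extract_date` is a hypothetical name.

```python
import json
import re

MONTHS = (
    r"(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December)"
)
# Matches e.g. "July 30, 2026" with an optional " at 2 PM EST" suffix.
DATE_RE = re.compile(
    r"\b" + MONTHS + r"\s+\d{1,2},\s+\d{4}"
    r"(?:\s+at\s+\d{1,2}(?::\d{2})?\s*(?:AM|PM|am|pm)(?:\s+[A-Z]{2,4}\b)?)?"
)


def _is_event(item):
    """True when a JSON-LD object declares an Event type (or subtype ending in Event)."""
    types = item.get("@type")
    types = types if isinstance(types, list) else [types]
    return any(str(t).endswith("Event") for t in types)


def extract_date(json_ld_blocks, visible_text):
    """Prefer a JSON-LD startDate; always report any visible date text as evidence."""
    start_iso = None
    for block in json_ld_blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # Malformed JSON-LD is common on public pages; skip it.
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and _is_event(item) and item.get("startDate"):
                start_iso = str(item["startDate"])
                break
        if start_iso:
            break
    match = DATE_RE.search(visible_text)
    return {"startDateIso": start_iso, "dateText": match.group(0) if match else None}


print(extract_date([], "AI Demo Webinar | July 30, 2026 at 2 PM EST | Register now"))
```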

Speaker extraction

Speaker data is extracted from JSON-LD performer/speaker fields when available.

The actor also checks common speaker and presenter sections in the HTML.

Because every website template is different, speaker fields may be empty for some pages.

Use `evidenceText` and `topics` to review ambiguous pages.
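To make the JSON-LD path concrete, here is a minimal sketch of reading speaker names from an already-parsed event object. It assumes the `performer` and `speaker` keys may hold a string, an object, or a list of either, and that a person's role lives in the schema.org `jobTitle` property. The function name and the output shape (`name`, `role`) are modeled on the output example and are not confirmed internals.

```python
def speakers_from_json_ld(event):
    """Collect unique speakers from the performer/speaker fields of a JSON-LD event."""
    speakers = []
    for key in ("performer", "speaker"):
        value = event.get(key)
        if value is None:
            continue
        entries = value if isinstance(value, list) else [value]
        for entry in entries:
            if isinstance(entry, str):
                name, role = entry.strip(), None
            elif isinstance(entry, dict):
                name, role = str(entry.get("name") or "").strip(), entry.get("jobTitle")
            else:
                continue
            # Skip blanks and names already collected from an earlier field.
            if name and all(existing["name"] != name for existing in speakers):
                speakers.append({"name": name, "role": role})
    return speakers


example_event = {
    "@type": "Event",
    "performer": {"@type": "Person", "name": "Jane Doe", "jobTitle": "VP Marketing"},
    "speaker": ["Jane Doe", "John Roe"],
}
print(speakers_from_json_ld(example_event))
```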

Registration link extraction

The actor searches visible links and buttons for labels such as:

- Register
- Save my spot
- Reserve
- Sign up
- Watch now
- View webinar
- On demand
- Join

The best matching URL is returned as `registrationUrl`.
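This README does not say how "best matching" is decided. One plausible approach, shown here purely as an assumption, ranks links by the order of the labels listed above and keeps the highest-ranked match. `pick_registration_link` is a hypothetical helper.

```python
# Labels from this README, in the order listed; using that order as priority is an assumption.
CTA_LABELS = [
    "register", "save my spot", "reserve", "sign up",
    "watch now", "view webinar", "on demand", "join",
]


def pick_registration_link(links):
    """Pick the link whose visible text matches the highest-priority CTA label.

    `links` is an iterable of (visible_text, url) pairs.
    """
    best = None
    for text, url in links:
        label = " ".join(text.lower().split())  # Normalize case and whitespace.
        for rank, phrase in enumerate(CTA_LABELS):
            if phrase in label:
                if best is None or rank < best[0]:
                    best = (rank, text.strip(), url)
                break
    if best is None:
        return None
    return {"ctaText": best[1], "registrationUrl": best[2]}


page_links = [
    ("Join our newsletter", "https://www.example.com/newsletter"),
    ("Register now", "https://www.example.com/register/ai-demo"),
    ("About", "https://www.example.com/about"),
]
print(pick_registration_link(page_links))
```

The newsletter link above matches "join" but loses to "Register now", which shows why some ordering or scoring is needed when a page has several CTA-like links.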

Platform hints

The actor scans public page text and links for common webinar platform hints.

Examples include Zoom, ON24, GoToWebinar, Webex, Microsoft Teams, Airmeet, BrightTALK, Demio, Livestorm, BigMarker, Hopin, Goldcast, and Bizzabo.

These hints are useful for routing events into the right operational workflow.
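A simple way to picture this scan is a case-insensitive search for platform names in the page text and link URLs. The sketch below is an illustration only. It assumes whole-token matching on the names listed above and lowercase output values like the `"zoom"` in the output example. `detect_platform_hints` is a hypothetical name.

```python
import re

# Platform names as listed in this README.
PLATFORMS = [
    "Zoom", "ON24", "GoToWebinar", "Webex", "Microsoft Teams", "Airmeet",
    "BrightTALK", "Demio", "Livestorm", "BigMarker", "Hopin", "Goldcast", "Bizzabo",
]


def detect_platform_hints(page_text, link_urls=()):
    """Return lowercase platform names found as whole tokens in text or URLs."""
    haystack = " ".join([page_text, *link_urls]).lower()
    hints = []
    for name in PLATFORMS:
        # Require non-alphanumeric boundaries so "zoom" does not match "zoomed".
        pattern = r"(?<![a-z0-9])" + re.escape(name.lower()) + r"(?![a-z0-9])"
        if re.search(pattern, haystack):
            hints.append(name.lower())
    return hints


print(detect_platform_hints("Join us live on Zoom", ["https://event.on24.com/wcc/r/123"]))
```

Name matching can still produce false positives, for example a page that says "zoom in on the data", which is one reason these values are called hints.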

Confidence score

The `confidence` field is a simple extraction-quality score from 0 to 1.

It increases when the actor finds a title, date, registration CTA, speakers, structured event metadata, and platform hints.

Low-confidence records are still saved because public pages can be messy.

Use the score to prioritize manual review.
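The exact scoring formula is not published here. As a rough mental model, an additive score over the signals named above could look like this sketch. The weights are invented for illustration and will not reproduce the actor's actual values. `score_record` is a hypothetical name.

```python
# Hypothetical weights (assumptions); they sum to 1.0 so the score stays in [0, 1].
WEIGHTS = {
    "title": 0.25,
    "date": 0.25,
    "registrationUrl": 0.2,
    "speakers": 0.1,
    "structured": 0.1,
    "platformHints": 0.1,
}


def score_record(record, has_structured_metadata=False):
    """Add up the weight of every signal present in the record."""
    signals = {
        "title": bool(record.get("title")),
        "date": bool(record.get("dateText") or record.get("startDateIso")),
        "registrationUrl": bool(record.get("registrationUrl")),
        "speakers": bool(record.get("speakers")),
        "structured": has_structured_metadata,
        "platformHints": bool(record.get("platformHints")),
    }
    total = sum(WEIGHTS[name] for name, present in signals.items() if present)
    return round(min(total, 1.0), 2)


partial = {"title": "AI Demo Webinar", "dateText": "July 30, 2026 at 2 PM EST"}
print(score_record(partial))
```

Downstream, a threshold on the real `confidence` value (for example, reviewing everything below a cutoff you choose) is usually enough to split records into "trust" and "check by hand" queues.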

Tips for better results

Start with official webinar hubs rather than generic homepages.

Keep `maxPagesPerStartUrl` modest for first runs.

Use `includeKeywords` when monitoring a narrow product category.

Use `excludeKeywords` to remove archived topics or unrelated event pages.

Review `evidenceText` for any record that will trigger an automated action.

Integrations

Send extracted webinar records to Google Sheets for editorial calendars.

Push upcoming events into a CRM enrichment queue.

Sync registration links to Slack alerts for competitive-intel teams.

Store webinar topics in a warehouse for trend analysis.

Feed public webinar evidence into an LLM workflow for summarization.

API usage

Node.js

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('automation-lab/webinar-landing-page-extractor').call({
  startUrls: [{ url: 'https://www.salesforce.com/resources/webinars/' }],
  maxItems: 20,
  discoverLinks: true,
});
console.log(run.defaultDatasetId);

Python

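import os

from apify_client import ApifyClient

# Read the API token from the environment instead of hard-coding it.
client = ApifyClient(os.environ['APIFY_TOKEN'])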
run = client.actor('automation-lab/webinar-landing-page-extractor').call(run_input={
    'startUrls': [{'url': 'https://www.salesforce.com/resources/webinars/'}],
    'maxItems': 20,
    'discoverLinks': True,
})
print(run['defaultDatasetId'])

cURL

curl -X POST 'https://api.apify.com/v2/acts/automation-lab~webinar-landing-page-extractor/runs?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"startUrls":[{"url":"https://www.salesforce.com/resources/webinars/"}],"maxItems":20,"discoverLinks":true}'

MCP usage

Use the Apify MCP server with Claude Code, Claude Desktop, or another MCP client.

MCP URL:

https://mcp.apify.com/?tools=automation-lab/webinar-landing-page-extractor

Claude Code setup:

claude mcp add --transport http apify-webinar-extractor "https://mcp.apify.com/?tools=automation-lab/webinar-landing-page-extractor"

Claude Desktop JSON config:

{
  "mcpServers": {
    "apify-webinar-extractor": {
      "url": "https://mcp.apify.com/?tools=automation-lab/webinar-landing-page-extractor"
    }
  }
}

Example prompts:

"Extract upcoming webinars from these competitor event hubs and return registration URLs."
"Find AI-related webinars on these public marketing sites."
"Summarize the speakers and dates from this webinar dataset."

Legality and compliance

This actor extracts publicly visible webpage content.

Do not use it to bypass login walls, gated forms, private registrant lists, or access controls.

Review the target website terms and applicable laws before using scraped data in production.

Avoid storing personal data unless you have a lawful basis and clear business need.

FAQ

Can it extract private attendee or registrant lists?

No. It only extracts public landing-page information and does not bypass forms, accounts, or registration gates.

Troubleshooting

Why is `dateText` filled but `startDateIso` empty?

Some websites publish human-readable dates without machine-readable dates.

The actor keeps the raw date evidence so you can review or parse it downstream.
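If you want to parse `dateText` downstream, a small parser for one common phrasing can be a starting point. This sketch handles only English strings shaped like "July 30, 2026 at 2 PM EST". The timezone offset table is a deliberately short assumption, since abbreviations such as CST are ambiguous across regions. `parse_date_text` is a hypothetical helper.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed fixed UTC offsets (hours) for a few North American abbreviations.
TZ_OFFSETS = {
    "UTC": 0, "GMT": 0, "EST": -5, "EDT": -4, "CST": -6,
    "CDT": -5, "MST": -7, "MDT": -6, "PST": -8, "PDT": -7,
}
DATE_TEXT_RE = re.compile(
    r"\s*([A-Za-z]+ \d{1,2}, \d{4}) at (\d{1,2})(?::(\d{2}))? ?([AaPp][Mm])"
    r"(?: ([A-Z]{2,4}))?\s*"
)


def parse_date_text(date_text):
    """Return an ISO 8601 string, or None when the text does not fit the pattern."""
    match = DATE_TEXT_RE.fullmatch(date_text)
    if not match:
        return None
    day, hour, minute, meridiem, tz_name = match.groups()
    try:
        parsed = datetime.strptime(
            f"{day} {hour}:{minute or '00'} {meridiem.upper()}", "%B %d, %Y %I:%M %p"
        )
    except ValueError:
        return None  # e.g. an unknown month name or an hour outside 1-12.
    if tz_name is None:
        return parsed.isoformat()  # Naive timestamp: no timezone was visible.
    if tz_name not in TZ_OFFSETS:
        return None  # Refuse to guess an unknown abbreviation.
    offset = timezone(timedelta(hours=TZ_OFFSETS[tz_name]))
    return parsed.replace(tzinfo=offset).isoformat()


print(parse_date_text("July 30, 2026 at 2 PM EST"))
```

Returning `None` for anything unrecognized keeps ambiguous dates visible for manual review instead of silently storing a wrong timestamp.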

Why are speakers empty?

Some landing pages hide speakers in images, scripts, or late-loaded widgets.

Try using a specific webinar detail URL instead of a high-level hub page.

Why did a hub return unrelated topics?

Hub pages contain navigation, product names, and general marketing copy.

Use `includeKeywords`, `excludeKeywords`, or a lower `maxPagesPerStartUrl` to focus the run.

Related scrapers

Explore other Automation Lab actors at https://apify.com/automation-lab/ for lead research, content extraction, website auditing, and market-intelligence workflows.

Use this actor alongside generic website crawlers when you need normalized event fields instead of raw page text.

Changelog

Initial version extracts public webinar landing-page data with HTTP, Cheerio, JSON-LD parsing, heuristic date/CTA/speaker detection, link discovery, confidence scoring, and evidence snippets.
