Betalist Scraper with Contacts | $4 / 1k

Pricing

$3.99 / 1,000 results

Extract structured BetaList startup profiles with contact data, founder details, company descriptions, topics, and website links. Built for enterprise-grade startup sourcing, market intelligence, lead generation, and automated CRM or analytics pipelines.

Developer

Fatih Tahta

Maintained by Community

"BetaList Scraper"

Slug: fatihtahta/betalist-scraper

Overview

This actor is built upon the ValidatedMails.com architecture for contact enrichment workflows.

BetaList Scraper collects structured startup listing data from https://betalist.com, including startup names, canonical URLs, short descriptions, longer summaries, topics, images, featured dates, website links, and optional contact details when available. It supports BetaList search pages, topic pages, individual startup pages, and keyword-based discovery in a single workflow. BetaList is a widely used startup discovery platform, which makes its data valuable for tracking emerging products, categories, and market activity. The actor turns that information into consistent JSON records that are easier to analyze, enrich, export, and monitor over time. Automating collection reduces manual research work and helps teams keep datasets current.

Why Use This Actor

  • Market research and analytics teams: map startup categories, compare themes, and track emerging companies across searches such as fintech, remote work, or developer tools.
  • Product and content teams: source examples, spot trends, and build reports, landing pages, or editorial calendars around new products and market themes.
  • Developers and data engineering teams: send structured BetaList data into ETL pipelines, analytics warehouses, internal tools, and downstream APIs.
  • Lead generation and enrichment teams: build prospect lists with startup names, websites, topics, and optional public contact signals for qualification workflows.
  • Monitoring and competitive intelligence teams: schedule repeat runs to watch for new listings, category movement, and startup visibility over time.

Input Parameters

Provide any combination of URLs, queries, and filters to control what the actor collects.

Each parameter is listed as name (type, default): description.

  • startUrls (string[], no default): One or more BetaList URLs to collect directly. Supported URL types include search result pages, topic pages, and individual startup profile pages. Use this when you already know the BetaList pages you want to cover.
  • queries (string[], no default): One or more keywords to search on BetaList, such as startup names, product categories, industries, use cases, or broader market themes. Use this when you want the actor to discover matching startups automatically.
  • limit (integer, default 50000): Maximum number of startup listings to save per query. Minimum value: 10. Choose a smaller number for sampling and validation, or a larger number for broader market mapping and lead generation.
  • getContacts (boolean, default false): When enabled, enriches records with additional public contact signals such as email addresses, phone numbers, and social profile links when available. Allowed values: true, false.
  • proxyConfiguration (object, default Apify Proxy with RESIDENTIAL group): Optional connection settings for larger or repeated runs. The default configuration uses Apify Proxy with the RESIDENTIAL group selected.
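
The constraints in the parameter list can be checked before a run is started. The following is a minimal sketch, not part of the actor: `build_input` is a hypothetical helper, and the rule that at least one URL or query must be present is an assumption rather than documented behavior.

```python
from __future__ import annotations

from typing import Any, List, Optional

MIN_LIMIT = 10         # minimum value stated for `limit`
DEFAULT_LIMIT = 50000  # default value stated for `limit`


def build_input(
    start_urls: Optional[List[str]] = None,
    queries: Optional[List[str]] = None,
    limit: int = DEFAULT_LIMIT,
    get_contacts: bool = False,
) -> dict[str, Any]:
    """Assemble an actor input object and reject out-of-range values."""
    start_urls = start_urls or []
    queries = queries or []
    # Assumption: an input with neither URLs nor queries has nothing to collect.
    if not start_urls and not queries:
        raise ValueError("provide at least one start URL or query")
    if limit < MIN_LIMIT:
        raise ValueError(f"limit must be at least {MIN_LIMIT}")

    run_input: dict[str, Any] = {"limit": limit, "getContacts": get_contacts}
    if start_urls:
        run_input["startUrls"] = start_urls
    if queries:
        run_input["queries"] = queries
    return run_input
```

Because `limit` applies per query, an input with several queries can return several times that many records.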

Example Input

{
  "startUrls": [
    "https://betalist.com/search?q=meeting",
    "https://betalist.com/topics/video-on-demand"
  ],
  "queries": [
    "fintech",
    "ai agents"
  ],
  "limit": 500,
  "getContacts": true
}
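
An input like the one above can also be submitted programmatically. The sketch below assumes the `apify-client` Python package and uses the slug shown on the actor page. `fetch_startups` is a hypothetical helper, and the client is passed in so the function can be tried with a stub.

```python
from typing import Any, Dict, Iterator

ACTOR_ID = "fatihtahta/betalist-scraper"  # slug shown on the actor page


def fetch_startups(client: Any, run_input: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Run the actor, wait for it to finish, and yield its dataset items.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("the actor call did not return a run object")
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()


# Usage (requires `pip install apify-client` and an API token):
#   from apify_client import ApifyClient
#   client = ApifyClient("<YOUR_API_TOKEN>")
#   for item in fetch_startups(client, {"queries": ["fintech"], "limit": 500}):
#       print(item["name"], item.get("website_url"))
```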

Output

Output destination

The actor writes results to an Apify dataset as JSON records. The records are structured so that analytics tools, ETL pipelines, and downstream APIs can consume them directly, without post-processing.

Record envelope (all items)

Every record includes these stable identifiers:

  • type (string, required): Record category.
  • id (number, required): Stable BetaList identifier for the entity.
  • url (string, required): Canonical BetaList URL for the entity.

Recommended idempotency key: type + ":" + id

Use this key for deduplication and upserts when the same startup appears across multiple inputs or repeated runs.
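
A minimal sketch of deduplication with this key follows. The helper names are illustrative, and keeping the most recently seen record for each key is a design choice rather than something the actor prescribes.

```python
from typing import Any, Dict, Iterable, List


def idempotency_key(record: Dict[str, Any]) -> str:
    """Build the recommended key: type + ":" + id."""
    return f'{record["type"]}:{record["id"]}'


def dedupe(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the last record seen for each key, ordered by first appearance."""
    latest: Dict[str, Dict[str, Any]] = {}
    for record in records:
        latest[idempotency_key(record)] = record
    return list(latest.values())
```

The same key can serve as the primary key for upserts into a database table.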

Examples

Example: startup (type = "startup")

{
  "type": "startup",
  "id": 3772,
  "url": "https://betalist.com/startups/pride",
  "record_type": "startup",
  "startup_id": 3772,
  "slug": "pride",
  "title": "Pride",
  "name": "Pride",
  "one_liner": "A mobile collaboration app for business teams",
  "short_description": "A mobile collaboration app for business teams",
  "visit_url": "https://betalist.com/startups/pride/visit",
  "boosted": false,
  "description": "Pride is a free, simple mobile collaboration tool for workgroups. Pride lets you keep your entire team up-to-date on customers and projects; build a culture of radical transparency in your workplace; and reduce email, meetings and duplication of effort.",
  "featured_at": "2012-06-15T12:01:02Z",
  "topics": [
    {
      "name": "Enterprise Software",
      "url": "https://betalist.com/topics/enterprise-software",
      "slug": "enterprise-software"
    },
    {
      "name": "Location Based Services",
      "url": "https://betalist.com/topics/location-based-services",
      "slug": "location-based-services"
    },
    {
      "name": "Collaboration",
      "url": "https://betalist.com/topics/collaboration",
      "slug": "collaboration"
    },
    {
      "name": "Mobile Enterprise",
      "url": "https://betalist.com/topics/mobile-enterprise",
      "slug": "mobile-enterprise"
    }
  ],
  "topic_names": [
    "Enterprise Software",
    "Location Based Services",
    "Collaboration",
    "Mobile Enterprise"
  ],
  "image_urls": [
    "https://resize.imagekit.co/placeholder-image-001/h:300/dpr:2/bg:ffffff/plain/s3://betalist-production/placeholder-image-object"
  ],
  "primary_image_url": "https://resize.imagekit.co/placeholder-image-001/h:300/dpr:2/bg:ffffff/plain/s3://betalist-production/placeholder-image-object",
  "website_url": "https://placeholder-startup-example.com",
  "contacts_lookup_url": "https://placeholder-startup-example.com",
  "contacts": {
    "email": {
      "values": [
        "placeholder123@cvent.com"
      ]
    },
    "phone_number": {
      "values": [
        "+1-800-555-0147"
      ]
    },
    "social_media": {
      "facebook": "https://www.facebook.com/placeholderbrand",
      "twitter": "https://twitter.com/placeholderbrand",
      "linkedin": "https://www.linkedin.com/company/placeholderbrand",
      "youtube": "https://www.youtube.com/@placeholderbrand",
      "instagram": "https://www.instagram.com/placeholderbrand"
    }
  },
  "source": {
    "betalist_url": "https://betalist.com/startups/pride",
    "seed_value": "https://betalist.com/search?q=meeting",
    "source_url": "https://betalist.com/startups/pride",
    "scraped_time": "2026-03-20T11:28:07Z"
  }
}

Field reference

Startup fields (type = "startup")

  • type (string, required): Record category.
  • id (number, required): Stable startup identifier.
  • url (string, required): Canonical BetaList startup URL.
  • record_type (string, optional): Source-specific record category value.
  • startup_id (number, optional): Startup identifier repeated as a convenience field.
  • slug (string, optional): URL-friendly startup slug.
  • title (string, optional): Startup title as displayed.
  • name (string, optional): Startup name.
  • one_liner (string, optional): Short product summary.
  • short_description (string, optional): Compact description text.
  • visit_url (string, optional): Outbound visit link associated with the startup.
  • boosted (boolean, optional): Whether the listing is marked as boosted.
  • description (string, optional): Longer startup description.
  • featured_at (string, optional): Featured timestamp in ISO 8601 format.
  • topics (array[object], optional): Topic objects attached to the startup.
  • topics.name (string, optional): Topic display name.
  • topics.url (string, optional): Topic page URL.
  • topics.slug (string, optional): Topic slug.
  • topic_names (array[string], optional): Flat list of topic names.
  • image_urls (array[string], optional): Startup image URLs.
  • primary_image_url (string, optional): Main image URL.
  • website_url (string, optional): Startup website URL.
  • contacts_lookup_url (string, optional): Public URL used for contact enrichment.
  • contacts (object, optional): Contact details when enrichment is enabled and data is available.
  • contacts.email.values (array[string], optional): Public email addresses.
  • contacts.phone_number.values (array[string], optional): Public phone numbers.
  • contacts.social_media.facebook (string, optional): Facebook profile URL.
  • contacts.social_media.twitter (string, optional): Twitter or X profile URL.
  • contacts.social_media.linkedin (string, optional): LinkedIn profile URL.
  • contacts.social_media.youtube (string, optional): YouTube profile URL.
  • contacts.social_media.instagram (string, optional): Instagram profile URL.
  • source (object, optional): Source and collection metadata.
  • source.betalist_url (string, optional): BetaList startup page URL.
  • source.seed_value (string, optional): Original input value that led to the record.
  • source.source_url (string, optional): Source page where the record was collected.
  • source.scraped_time (string, optional): Extraction timestamp in ISO 8601 format.

Data guarantees & handling

  • Best-effort extraction: fields may vary by region, session, availability, or UI experiments.
  • Optional fields: any field marked optional may be missing or null, so null-check it in downstream code.
  • Deduplication: use type + ":" + id as the deduplication key.
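
One way to handle optional and nested fields is a small null-safe accessor. The following is a sketch under the assumption that missing fields may be either absent or null; `dig` is a hypothetical helper, not part of the actor's output contract.

```python
from typing import Any


def dig(record: Any, *path: str, default: Any = None) -> Any:
    """Walk nested dicts by key and return `default` if any step is missing or null."""
    current = record
    for key in path:
        if not isinstance(current, dict):
            return default
        current = current.get(key)
    return default if current is None else current


# Usage: emails is an empty list when contact enrichment is off or found nothing.
#   emails = dig(record, "contacts", "email", "values", default=[])
```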

How to Run on Apify

  1. Open the actor in Apify Console.
  2. Configure your search parameters by adding BetaList URLs, search queries, or both.
  3. Set the maximum number of outputs to collect for each query.
  4. Choose whether to enrich records with contact details.
  5. Click Start and wait for the run to finish.
  6. Download results in JSON, CSV, Excel, or other supported formats.
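
If results are downloaded as JSON, nested fields such as topics and contacts usually need flattening before they fit a CRM or spreadsheet. The sketch below uses only the standard library. The column selection and the "; " separator are arbitrary choices for illustration.

```python
import csv
import io
from typing import Any, Dict, Iterable

COLUMNS = ["id", "name", "one_liner", "website_url", "featured_at", "topic_names", "emails"]


def to_row(record: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one startup record into a single-level row."""
    contacts = record.get("contacts") or {}
    emails = (contacts.get("email") or {}).get("values") or []
    return {
        "id": record.get("id"),
        "name": record.get("name"),
        "one_liner": record.get("one_liner"),
        "website_url": record.get("website_url"),
        "featured_at": record.get("featured_at"),
        "topic_names": "; ".join(record.get("topic_names") or []),
        "emails": "; ".join(emails),
    }


def to_csv(records: Iterable[Dict[str, Any]]) -> str:
    """Render records as CSV text with a header row; missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for record in records:
        writer.writerow(to_row(record))
    return buffer.getvalue()
```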

Scheduling & Automation

Scheduling

Automated Data Collection

You can schedule recurring runs to keep your BetaList dataset fresh without manual work. This is useful for ongoing market tracking, reporting, enrichment, and competitive monitoring.

  • Navigate to Schedules in Apify Console
  • Create a new schedule (daily, weekly, or custom cron)
  • Configure input parameters
  • Enable notifications for run completion
  • Optional: add webhooks for automated processing
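
For monitoring use cases, a scheduled run is most useful when compared against the previous one. The following minimal sketch reports records that appear only in the newer run, keyed on type and id. How the two record lists are stored and loaded is left open, and `new_listings` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List


def new_listings(
    previous: Iterable[Dict[str, Any]],
    current: Iterable[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return records from `current` whose type:id key is absent from `previous`."""
    seen = {f'{r["type"]}:{r["id"]}' for r in previous}
    return [r for r in current if f'{r["type"]}:{r["id"]}' not in seen]
```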

Integration Options

  • Webhooks: Trigger downstream actions when a run completes
  • Zapier: Connect to 5,000+ apps without coding
  • Make (Integromat): Build multi-step automation workflows
  • Google Sheets: Export results to a spreadsheet
  • Slack/Discord: Receive notifications and summaries
  • Email: Send automated reports via email
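
A webhook receiver typically needs only the dataset ID of the finished run. The sketch below assumes the request body follows Apify's default webhook payload template, with an eventType field and a resource object describing the run. Verify the field names against the Apify webhook documentation before relying on them.

```python
import json
from typing import Optional


def dataset_id_from_webhook(body: str) -> Optional[str]:
    """Return the run's default dataset ID for a successful run, else None."""
    payload = json.loads(body)
    # Assumption: successful runs are reported as ACTOR.RUN.SUCCEEDED.
    if payload.get("eventType") != "ACTOR.RUN.SUCCEEDED":
        return None
    resource = payload.get("resource") or {}
    return resource.get("defaultDatasetId")
```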

Performance

Estimated run times:

  • Small runs (< 1,000 outputs): ~2-3 minutes
  • Medium runs (1,000-5,000 outputs): ~5-15 minutes
  • Large runs (5,000+ outputs): ~15-30 minutes

Execution time varies based on filters, result volume, and how much information is returned per record.
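
For rough planning, the listed price of $3.99 per 1,000 results and the run-time ranges above can be turned into a quick estimate. This sketch assumes cost scales linearly with the number of results and treats exactly 5,000 outputs as a medium run. Neither assumption is confirmed by the listing.

```python
from decimal import Decimal

PRICE_PER_1000 = Decimal("3.99")  # listed price per 1,000 results


def estimate_cost(results: int) -> Decimal:
    """Estimated charge in USD, assuming linear per-result pricing."""
    return (PRICE_PER_1000 * results / 1000).quantize(Decimal("0.01"))


def estimate_runtime(results: int) -> str:
    """Map an output count to the run-time range quoted for its size class."""
    if results < 1000:
        return "~2-3 minutes"
    if results <= 5000:
        return "~5-15 minutes"
    return "~15-30 minutes"
```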

Compliance & Ethics

Responsible Data Collection

This actor collects publicly available product listing information from https://betalist.com for legitimate business purposes, including:

  • startup ecosystem research and market analysis
  • prospect list building and enrichment
  • competitive monitoring and trend tracking

Users are responsible for ensuring their use of collected data complies with applicable laws, regulations, and the target site's terms. This section is informational and not legal advice.

Best Practices

  • Use collected data in accordance with applicable laws, regulations, and the target site's terms
  • Respect individual privacy and personal information
  • Use data responsibly and avoid disruptive or excessive collection
  • Do not use this actor for spamming, harassment, or other harmful purposes
  • Follow relevant data protection requirements where applicable (e.g., GDPR, CCPA)

Support

For help, use the Issues section or the actor page in Apify Console. Include the input you used with sensitive values redacted, the run ID, a short description of expected versus actual behavior, and, if helpful, a small output sample.