Agency Lead Intelligence

Pricing: from $20.00 / 1,000 results

Production-ready Agency Vista scraper for extracting high-quality B2B agency leads, contact details, websites, services, ratings, and CRM-ready business data for sales, recruiting, and outreach workflows.

Rating: 0.0 (0)

Developer: Rohith S (maintained by Community)

Actor stats: 0 bookmarked, 2 total users, 1 monthly active user, last modified 4 days ago

Agency Lead Intelligence 🚀

Production-ready B2B lead-generation Apify Actor that extracts agency leads from public directories and exports CRM-ready datasets for sales teams, recruiters, outreach agencies, and lead-generation businesses.

🎯 What This Actor Does

Agency Lead Intelligence crawls agency directories and extracts structured B2B lead data at scale. It supports:

  • agencyvista

Other directories are planned for future releases.

The actor:

  • 🔎 discovers agencies from listing and search pages
  • 🏢 opens agency profile pages and extracts structured data
  • 🌐 optionally visits agency websites for lightweight enrichment
  • 📤 exports results for spreadsheets and CRM workflows

Perfect For

  • 🏢 Sales Teams: Find and contact agencies in target markets
  • 🤝 Recruiters: Build agency outreach and partnership pipelines
  • 📧 Outreach Agencies: Scale prospecting with cleaner contact data
  • 💼 Lead-Gen Businesses: Build curated agency datasets
  • 🔍 Prospecting Workflows: Move structured results into CRMs quickly

✨ Key Features

| Feature | Details |
| --- | --- |
| 🔎 Directory Support | Agency Vista only for now |
| ⚙️ Smart Search Inputs | Keywords, categories, locations, services, ratings |
| 🌐 Website Enrichment | Extract additional emails, phones, socials, contact page links |
| 📤 CRM-Ready Exports | JSON, CSV, HubSpot, Salesforce, Apollo-style CSV outputs |
| 🧹 Deduplication | Domain/profile/name-based deduplication logic |
| 🛡️ Anti-Blocking Basics | Proxy support, randomized delays, browser-like headers |
| 📈 Scalable Runs | Configurable concurrency, retries, graceful failure handling |
| 💸 Low Overhead | Lightweight HTTP-first crawl flow using Crawlee + gotScraping |

📊 Extracted Data Fields

Each lead can include fields such as:

  • company_name
  • agency_profile_url
  • website
  • domain
  • emails
  • phone_numbers
  • location
  • categories
  • services
  • rating
  • review_count
  • description
  • employee_range
  • founded_year
  • linkedin_url
  • twitter_url
  • facebook_url
  • instagram_url
  • contact_page_url
  • quality_score
  • source_directory
  • scraped_at
  • enriched
  • enriched_at

Example dataset item:

{
  "company_name": "Rocket Digital Agency",
  "agency_profile_url": "https://agencyvista.com/agency/rocket-digital/summary",
  "website": "https://rocketdigital.com",
  "domain": "rocketdigital.com",
  "emails": ["hello@rocketdigital.com", "contact@rocketdigital.com"],
  "phone_numbers": ["+1-555-123-4567"],
  "location": {
    "city": "New York",
    "state": "NY",
    "country": "United States",
    "full_address": "123 Broadway, New York, NY 10001"
  },
  "categories": ["Digital Marketing", "SEO", "PPC"],
  "services": ["Google Ads", "SEO Audits", "Content Marketing", "Social Media"],
  "rating": 4.8,
  "review_count": 47,
  "description": "Full-service digital marketing agency specializing in ROI-driven campaigns.",
  "employee_range": "10-50",
  "founded_year": 2015,
  "linkedin_url": "https://linkedin.com/company/rocket-digital",
  "twitter_url": "https://twitter.com/rocketdigital",
  "facebook_url": "https://facebook.com/rocketdigital",
  "instagram_url": "https://instagram.com/rocketdigital",
  "contact_page_url": "https://rocketdigital.com/contact",
  "source_directory": "agencyvista",
  "quality_score": 88,
  "enriched": true,
  "scraped_at": "2026-05-12T14:22:45.000Z",
  "enriched_at": "2026-05-12T14:23:00.000Z"
}
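
For readers consuming the dataset from TypeScript, the fields above could be typed roughly as follows. This is only a sketch: which fields are optional is an assumption based on the phrase "can include", and hasContactChannel is a hypothetical helper that is not part of the actor.

```typescript
// Shape of one dataset item, following the example item above.
// Optionality is an assumption; the actor's real types may differ.
interface LeadLocation {
    city?: string;
    state?: string;
    country?: string;
    full_address?: string;
}

interface AgencyLead {
    company_name: string;
    agency_profile_url: string;
    website?: string;
    domain?: string;
    emails: string[];
    phone_numbers: string[];
    location?: LeadLocation;
    categories: string[];
    services: string[];
    rating?: number;
    review_count?: number;
    description?: string;
    employee_range?: string;
    founded_year?: number;
    linkedin_url?: string;
    twitter_url?: string;
    facebook_url?: string;
    instagram_url?: string;
    contact_page_url?: string;
    quality_score?: number;
    source_directory: string;
    scraped_at: string; // ISO 8601 timestamp
    enriched: boolean;
    enriched_at?: string; // ISO 8601 timestamp, set only after enrichment
}

// Hypothetical helper: true when the lead has at least one direct contact channel.
function hasContactChannel(lead: AgencyLead): boolean {
    return (
        lead.emails.length > 0 ||
        lead.phone_numbers.length > 0 ||
        Boolean(lead.contact_page_url)
    );
}
```

A filter such as leads.filter(hasContactChannel) would then keep only rows that are usable for outreach.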

🏗️ How It Works

Agency Lead Intelligence
├── Search discovery
│   ├── Build search URLs from filters
│   └── Crawl directory listing/search pages
├── Profile extraction
│   ├── Discover profile URLs
│   └── Extract structured agency data
├── Website enrichment
│   ├── Visit agency websites when enabled
│   └── Try to find emails, phones, socials, contact pages
└── Output
    ├── Save dataset items
    └── Generate optional export files

Current runtime characteristics:

  • Uses BasicCrawler with gotScraping
  • Uses Agency Vista-specific parsing
  • Enrichment is lightweight and HTTP-first
  • Other directory adapters are planned, but not active yet
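
The stage order above can be sketched as a small pipeline. This is a simplified, sequential illustration and not the actor's implementation, which runs on Crawlee's BasicCrawler with concurrency. The Stages interface, runPipeline, and limitReached are hypothetical names, and error handling is omitted for brevity.

```typescript
interface PipelineLead {
    agency_profile_url: string;
    website?: string;
    enriched: boolean;
}

// Hypothetical stage functions, injected so the flow can be read in isolation.
interface Stages {
    discoverProfileUrls(searchUrl: string): Promise<string[]>;
    extractProfile(profileUrl: string): Promise<PipelineLead>;
    enrichFromWebsite(lead: PipelineLead): Promise<PipelineLead>;
    saveItem(lead: PipelineLead): Promise<void>;
}

// maxResults of 0 means "no explicit limit", as in the input parameters.
function limitReached(saved: number, maxResults: number): boolean {
    return maxResults > 0 && saved >= maxResults;
}

async function runPipeline(
    searchUrls: string[],
    stages: Stages,
    options: { maxResults: number; enableEnrichment: boolean },
): Promise<number> {
    let saved = 0;
    for (const searchUrl of searchUrls) {
        // Search discovery: listing page -> profile URLs.
        const profileUrls = await stages.discoverProfileUrls(searchUrl);
        for (const profileUrl of profileUrls) {
            if (limitReached(saved, options.maxResults)) return saved;
            // Profile extraction.
            let lead = await stages.extractProfile(profileUrl);
            // Website enrichment, only when enabled and a website is known.
            if (options.enableEnrichment && lead.website) {
                lead = await stages.enrichFromWebsite(lead);
            }
            // Output.
            await stages.saveItem(lead);
            saved += 1;
        }
    }
    return saved;
}
```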

⚙️ Input Configuration

{
  "keywords": ["digital marketing", "SEO agency"],
  "locations": ["New York", "California"],
  "maxResults": 200,
  "exportFormat": "hubspot"
}

Full Configuration

{
  "keywords": ["digital marketing"],
  "categories": ["SEO", "PPC", "Social Media"],
  "locations": ["United States"],
  "services": ["Google Ads", "content marketing"],
  "minRating": 4,
  "maxResults": 500,
  "maxConcurrency": 5,
  "enableEnrichment": true,
  "deepEnrichment": false,
  "exportFormat": "salesforce",
  "targetDirectory": "agencyvista",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

Input Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| keywords | string[] | [] | Search keywords |
| categories | string[] | [] | Agency category filters |
| locations | string[] | [] | City, state, country filters |
| services | string[] | [] | Service offering filters |
| minRating | number | 0 | Minimum rating, from 0 to 5 |
| maxResults | number | 100 | Max leads to extract. 0 means no explicit limit |
| maxConcurrency | number | 5 | Concurrent requests. Current implementation clamps to 1-5 |
| enableEnrichment | boolean | true | Visit agency websites for more data |
| deepEnrichment | boolean | false | Present in input schema, but not actively used in the current runtime flow |
| exportFormat | string | json | One of json, csv, hubspot, salesforce, apollo |
| startUrls | array | [] | Optional custom search URLs. Overrides generated search URLs |
| targetDirectory | string | agencyvista | Directory to scrape. Currently only agencyvista is active |
| proxyConfiguration | object | Apify proxy prefill | Proxy settings for larger production runs |
| requestTimeoutSecs | number | 30 | Request timeout in seconds, normalized to 10-120 |
| maxRetries | number | 3 | Max retries, normalized to 0-10 |
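
The defaults and ranges in the table could be applied with a normalization step like the one below. It is a minimal sketch assuming simple clamping; normalizeInput and clamp are hypothetical names, and the actor's actual handling of edge cases (for example non-numeric values) is not documented here.

```typescript
interface RawInput {
    maxResults?: number;
    maxConcurrency?: number;
    requestTimeoutSecs?: number;
    maxRetries?: number;
}

interface NormalizedInput {
    maxResults: number;
    maxConcurrency: number;
    requestTimeoutSecs: number;
    maxRetries: number;
}

// Restrict a value to the inclusive range [min, max].
function clamp(value: number, min: number, max: number): number {
    return Math.min(max, Math.max(min, value));
}

// Apply the documented defaults and ranges. Rounding down to whole numbers is an assumption.
function normalizeInput(input: RawInput): NormalizedInput {
    return {
        // 0 means "no explicit limit"; negative values are treated as 0 here.
        maxResults: Math.max(0, Math.floor(input.maxResults ?? 100)),
        maxConcurrency: clamp(Math.floor(input.maxConcurrency ?? 5), 1, 5),
        requestTimeoutSecs: clamp(input.requestTimeoutSecs ?? 30, 10, 120),
        maxRetries: clamp(Math.floor(input.maxRetries ?? 3), 0, 10),
    };
}
```

With this sketch, an input of { "maxConcurrency": 20, "requestTimeoutSecs": 5 } would run with a concurrency of 5 and a 10-second timeout.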

📤 Export Formats

JSON (Default)

Full structured data in the default dataset. Best for custom pipelines, APIs, and downstream processing.

Generic CSV

Universal CSV format compatible with spreadsheets and generic CRM imports.

HubSpot CSV

Pre-mapped for HubSpot-style imports, including fields such as:

  • Company Name
  • Website URL
  • Phone Number
  • Email
  • City
  • State/Region
  • Country/Region

Salesforce CSV

Pre-mapped for Salesforce-style account imports, including:

  • Account Name
  • Website
  • Phone
  • BillingCity
  • BillingState
  • BillingCountry
  • Industry

Apollo CSV

Pre-mapped for Apollo-style company imports.

When exportFormat is not json, the actor also writes a file to the default key-value store:

  • leads.csv
  • leads.hubspot
  • leads.salesforce
  • leads.apollo
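
To illustrate how a lead might map onto the HubSpot columns listed above, here is a hedged sketch. The actor uses csv-stringify with custom field mappers; this example swaps in a small hand-written CSV escaper to stay dependency-free. Taking the first email and phone number per lead is an assumption, and exportKey, toHubspotRow, and toHubspotCsv are hypothetical names.

```typescript
type ExportFormat = "json" | "csv" | "hubspot" | "salesforce" | "apollo";

interface ExportableLead {
    company_name: string;
    website?: string;
    phone_numbers: string[];
    emails: string[];
    location?: { city?: string; state?: string; country?: string };
}

const HUBSPOT_COLUMNS: string[] = [
    "Company Name", "Website URL", "Phone Number", "Email",
    "City", "State/Region", "Country/Region",
];

// Key-value store record name per format; JSON results live only in the dataset.
function exportKey(format: ExportFormat): string | null {
    return format === "json" ? null : `leads.${format}`;
}

// Assumption: only the first phone number and email are exported.
function toHubspotRow(lead: ExportableLead): string[] {
    return [
        lead.company_name,
        lead.website ?? "",
        lead.phone_numbers[0] ?? "",
        lead.emails[0] ?? "",
        lead.location?.city ?? "",
        lead.location?.state ?? "",
        lead.location?.country ?? "",
    ];
}

// Quote a field when it contains a comma, quote, or line break (RFC 4180 style).
function csvField(value: string): string {
    return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

function toHubspotCsv(leads: ExportableLead[]): string {
    const rows = [HUBSPOT_COLUMNS, ...leads.map(toHubspotRow)];
    return rows.map((row) => row.map(csvField).join(",")).join("\n");
}
```

The Salesforce mapping would follow the same pattern with the Account Name, Website, Phone, and Billing* columns.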

🛡️ Anti-Blocking Strategy

  • 🔁 Rotating Proxies: Supports Apify proxy configuration
  • 🕒 Adaptive Delays: Randomized delays between requests
  • 🧾 Realistic Headers: Browser-like request headers
  • ♻️ Retry Logic: Failed core requests are retried through the crawl flow
  • ⚠️ Graceful Enrichment Failure: If a website is unreachable, the base profile lead is still preserved
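
Two of the points above, randomized delays and graceful enrichment failure, can be sketched as follows. The delay bounds, the injectable random source, and the function names are all assumptions made for illustration; the actor's own timing values are not documented here.

```typescript
// Pick a delay uniformly from [minMs, maxMs]. The rng parameter is injectable for testing.
function randomDelayMs(minMs: number, maxMs: number, rng: () => number = Math.random): number {
    return Math.floor(minMs + rng() * (maxMs - minMs + 1));
}

function sleep(ms: number): Promise<void> {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// Try to enrich a lead after a randomized pause. If enrichment throws
// (for example because the website is unreachable), return the base lead unchanged.
async function enrichGracefully<T>(
    lead: T,
    enrich: (lead: T) => Promise<T>,
    minDelayMs = 500, // hypothetical bounds
    maxDelayMs = 1500,
): Promise<T> {
    await sleep(randomDelayMs(minDelayMs, maxDelayMs));
    try {
        return await enrich(lead);
    } catch {
        return lead;
    }
}
```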

📋 Use Cases

Sales Prospecting

"Find digital marketing agencies in New York with strong ratings and export them for outreach."

Recruiter Pipeline

"Build a list of web design agencies in California for partnership or recruiting outreach."

Competitive Intelligence

"Map SEO agencies in the UK with employee counts, founding years, and social profiles."

Lead Operations

"Generate structured agency lead datasets for CRM import and internal prospecting workflows."


🚀 Quick Start

Prerequisites

  • Node.js 18+ with npm
  • Apify CLI (needed only for apify run)

Install

npm install

Run locally

npm run build
npm start

Development scripts

npm run start:dev
npm run dev
npm run type-check
npm run lint

Apify CLI

apify run

🔧 Technical Details

  • Runtime: Node.js 18+ with TypeScript
  • Crawler Core: Crawlee BasicCrawler
  • HTTP Layer: gotScraping
  • SDK: Apify SDK v3
  • Directory Parsing: Adapter-based extraction per supported directory
  • Deduplication: Domain, normalized profile URL, then company-name hash fallback
  • Exports: csv-stringify with custom field mappers

⚠️ Current Notes

  • deepEnrichment is visible in the input schema, but the current runtime does not use a separate live Playwright enrichment flow.
  • Website enrichment is lightweight and does not guarantee JavaScript-rendered contact data.
  • Dataset output can contain both a base profile item and a later enriched item for the same agency. If you need one final row per company, deduplicate by dedup_key, domain, or agency_profile_url.
  • Directory markup changes over time, so selector updates may occasionally be required.
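
If one final row per company is needed, a post-processing step along these lines could collapse base and enriched items. This is a sketch under assumptions: it recomputes a key in the documented order (domain, normalized profile URL, then a company-name hash) instead of reading the dataset's dedup_key, it uses a simple FNV-1a hash as a stand-in for whatever hash the actor uses, and all function names are hypothetical.

```typescript
interface DatasetItem {
    company_name: string;
    domain?: string;
    agency_profile_url?: string;
    enriched: boolean;
}

// Lowercase, drop protocol, "www.", query/fragment, and trailing slashes.
function normalizeProfileUrl(url: string): string {
    return url
        .trim()
        .toLowerCase()
        .replace(/^https?:\/\//, "")
        .replace(/^www\./, "")
        .replace(/[?#].*$/, "")
        .replace(/\/+$/, "");
}

// 32-bit FNV-1a hash of the normalized company name (stand-in hash, an assumption).
function hashName(name: string): string {
    const normalized = name.trim().toLowerCase().replace(/\s+/g, " ");
    let hash = 0x811c9dc5;
    for (let i = 0; i < normalized.length; i++) {
        hash ^= normalized.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193);
    }
    return (hash >>> 0).toString(16);
}

function dedupKey(item: DatasetItem): string {
    if (item.domain) return `domain:${item.domain.toLowerCase()}`;
    if (item.agency_profile_url) return `profile:${normalizeProfileUrl(item.agency_profile_url)}`;
    return `name:${hashName(item.company_name)}`;
}

// Keep one item per key, preferring the enriched version when both exist.
function collapseToFinalRows<T extends DatasetItem>(items: T[]): T[] {
    const byKey = new Map<string, T>();
    for (const item of items) {
        const key = dedupKey(item);
        const existing = byKey.get(key);
        if (!existing || (item.enriched && !existing.enriched)) {
            byKey.set(key, item);
        }
    }
    return Array.from(byKey.values());
}
```

Running collapseToFinalRows over the downloaded dataset before a CRM import avoids creating duplicate company records.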

📝 License

Apache 2.0. See LICENSE for details.


🤝 Support


Built with Crawlee and Apify SDK v3 ✨