Agency Lead Intelligence

Deprecated

Production-ready Agency Vista scraper for extracting high-quality B2B agency leads, contact details, websites, services, ratings, and CRM-ready business data for sales, recruiting, and outreach workflows.

Pricing: Pay per usage
Rating: 0.0 (0)
Developer: Rohith S
Last modified: a month ago

Agency Lead Intelligence 🚀

Production-ready B2B lead-generation Apify Actor that extracts agency leads from public directories and exports CRM-ready datasets for sales teams, recruiters, outreach agencies, and lead-generation businesses.

🎯 What This Actor Does

Agency Lead Intelligence crawls agency directories and extracts structured B2B lead data at scale. It currently supports one directory:

agencyvista

Other directories are planned for future releases.

The actor:

🔎 discovers agencies from listing and search pages
🏢 opens agency profile pages and extracts structured data
🌐 optionally visits agency websites for lightweight enrichment
📤 exports results for spreadsheets and CRM workflows

Perfect For

🏢 Sales Teams: Find and contact agencies in target markets
🤝 Recruiters: Build agency outreach and partnership pipelines
📧 Outreach Agencies: Scale prospecting with cleaner contact data
💼 Lead-Gen Businesses: Build curated agency datasets
🔍 Prospecting Workflows: Move structured results into CRMs quickly

✨ Key Features

| Feature | Details |
| --- | --- |
| 🔎 Directory Support | Agency Vista only for now |
| ⚙️ Smart Search Inputs | Keywords, categories, locations, services, ratings |
| 🌐 Website Enrichment | Extract additional emails, phones, socials, contact page links |
| 📤 CRM-Ready Exports | JSON, CSV, HubSpot, Salesforce, Apollo-style CSV outputs |
| 🧹 Deduplication | Domain/profile/name-based deduplication logic |
| 🛡️ Anti-Blocking Basics | Proxy support, randomized delays, browser-like headers |
| 📈 Scalable Runs | Configurable concurrency, retries, graceful failure handling |
| 💸 Low Overhead | Lightweight HTTP-first crawl flow using Crawlee + `gotScraping` |

📊 Extracted Data Fields

Each lead can include fields such as:

company_name
agency_profile_url
website
domain
emails
phone_numbers
location
categories
services
rating
review_count
description
employee_range
founded_year
linkedin_url
twitter_url
facebook_url
instagram_url
contact_page_url
quality_score
source_directory
scraped_at
enriched
enriched_at

Example dataset item:

{
  "company_name": "Rocket Digital Agency",
  "agency_profile_url": "https://agencyvista.com/agency/rocket-digital/summary",
  "website": "https://rocketdigital.com",
  "domain": "rocketdigital.com",
  "emails": ["hello@rocketdigital.com", "contact@rocketdigital.com"],
  "phone_numbers": ["+1-555-123-4567"],
  "location": {
    "city": "New York",
    "state": "NY",
    "country": "United States",
    "full_address": "123 Broadway, New York, NY 10001"
  },
  "categories": ["Digital Marketing", "SEO", "PPC"],
  "services": ["Google Ads", "SEO Audits", "Content Marketing", "Social Media"],
  "rating": 4.8,
  "review_count": 47,
  "description": "Full-service digital marketing agency specializing in ROI-driven campaigns.",
  "employee_range": "10-50",
  "founded_year": 2015,
  "linkedin_url": "https://linkedin.com/company/rocket-digital",
  "twitter_url": "https://twitter.com/rocketdigital",
  "facebook_url": "https://facebook.com/rocketdigital",
  "instagram_url": "https://instagram.com/rocketdigital",
  "contact_page_url": "https://rocketdigital.com/contact",
  "source_directory": "agencyvista",
  "quality_score": 88,
  "enriched": true,
  "scraped_at": "2026-05-12T14:22:45.000Z",
  "enriched_at": "2026-05-12T14:23:00.000Z"
}
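For typed downstream pipelines, the item above could be modeled roughly as follows. This is a sketch, not the Actor's own type definitions: the names `AgencyLead`, `LeadLocation`, and `primaryContact` are hypothetical, and which fields are optional is an assumption based on the phrase "can include".

```typescript
// Hypothetical types; field names come from the list above,
// optionality is an assumption.
interface LeadLocation {
  city?: string;
  state?: string;
  country?: string;
  full_address?: string;
}

interface AgencyLead {
  company_name: string;
  agency_profile_url: string;
  website?: string;
  domain?: string;
  emails: string[];
  phone_numbers: string[];
  location?: LeadLocation;
  categories: string[];
  services: string[];
  rating?: number;
  review_count?: number;
  description?: string;
  employee_range?: string;
  founded_year?: number;
  linkedin_url?: string;
  twitter_url?: string;
  facebook_url?: string;
  instagram_url?: string;
  contact_page_url?: string;
  quality_score?: number;
  source_directory: string;
  scraped_at: string; // ISO 8601 timestamp
  enriched: boolean;
  enriched_at?: string; // ISO 8601 timestamp
}

// Illustrative helper: picks the first email and phone as the primary contact.
function primaryContact(lead: AgencyLead): { email: string | null; phone: string | null } {
  return {
    email: lead.emails.length > 0 ? lead.emails[0] : null,
    phone: lead.phone_numbers.length > 0 ? lead.phone_numbers[0] : null,
  };
}
```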

🏗️ How It Works

Agency Lead Intelligence
├── Search discovery
│   ├── Build search URLs from filters
│   └── Crawl directory listing/search pages
├── Profile extraction
│   ├── Discover profile URLs
│   └── Extract structured agency data
├── Website enrichment
│   ├── Visit agency websites when enabled
│   └── Try to find emails, phones, socials, contact pages
└── Output
    ├── Save dataset items
    └── Generate optional export files

Current runtime characteristics:

Uses BasicCrawler with gotScraping
Uses Agency Vista-specific parsing
Enrichment is lightweight and HTTP-first
Other directory adapters are planned, but not active yet
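The enrichment step ("try to find emails, phones, socials") can be illustrated with a small HTML scan. This is a minimal sketch under stated assumptions, not the Actor's actual parser: `extractContactSignals`, the regular expressions, and the choice to read phones only from `tel:` links are all hypothetical.

```typescript
interface ContactSignals {
  emails: string[];
  phones: string[];
  socials: string[];
}

// Assumed patterns; a production parser would need stricter validation.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const SOCIAL_RE = /https?:\/\/(?:www\.)?(?:linkedin|twitter|facebook|instagram)\.com\/[^\s"'<>]+/gi;

// Removes duplicates while keeping first-seen order.
function unique(values: string[]): string[] {
  const out: string[] = [];
  for (const value of values) {
    if (out.indexOf(value) === -1) out.push(value);
  }
  return out;
}

function extractContactSignals(html: string): ContactSignals {
  const rawEmails: string[] = html.match(EMAIL_RE) || [];
  const rawSocials: string[] = html.match(SOCIAL_RE) || [];

  // Phones are read only from explicit tel: links to limit false positives.
  const phones: string[] = [];
  const telRe = /href=["']tel:([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = telRe.exec(html)) !== null) {
    phones.push(match[1].trim());
  }

  return {
    emails: unique(rawEmails.map((email) => email.toLowerCase())),
    phones: unique(phones),
    socials: unique(rawSocials),
  };
}
```

Because this only inspects the raw HTML response, it shares the limitation of any HTTP-first flow: contact details injected by client-side JavaScript are not visible to it.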

⚙️ Input Configuration

Basic Search

{
  "keywords": ["digital marketing", "SEO agency"],
  "locations": ["New York", "California"],
  "maxResults": 200,
  "exportFormat": "hubspot"
}

Full Configuration

{
  "keywords": ["digital marketing"],
  "categories": ["SEO", "PPC", "Social Media"],
  "locations": ["United States"],
  "services": ["Google Ads", "content marketing"],
  "minRating": 4,
  "maxResults": 500,
  "maxConcurrency": 5,
  "enableEnrichment": true,
  "deepEnrichment": false,
  "exportFormat": "salesforce",
  "targetDirectory": "agencyvista",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}

Input Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `keywords` | `string[]` | `[]` | Search keywords |
| `categories` | `string[]` | `[]` | Agency category filters |
| `locations` | `string[]` | `[]` | City, state, country filters |
| `services` | `string[]` | `[]` | Service offering filters |
| `minRating` | `number` | `0` | Minimum rating from `0-5` |
| `maxResults` | `number` | `100` | Max leads to extract. `0` means no explicit limit |
| `maxConcurrency` | `number` | `5` | Concurrent requests. Current implementation clamps to `1-5` |
| `enableEnrichment` | `boolean` | `true` | Visit agency websites for more data |
| `deepEnrichment` | `boolean` | `false` | Present in input schema, but not actively used in the current runtime flow |
| `exportFormat` | `string` | `json` | `json`, `csv`, `hubspot`, `salesforce`, `apollo` |
| `startUrls` | `array` | `[]` | Optional custom search URLs. Overrides generated search URLs |
| `targetDirectory` | `string` | `agencyvista` | Directory to scrape. Currently only `agencyvista` is active |
| `proxyConfiguration` | `object` | Apify proxy prefill | Proxy settings for larger production runs |
| `requestTimeoutSecs` | `number` | `30` | Request timeout, normalized to `10-120` |
| `maxRetries` | `number` | `3` | Max retries, normalized to `0-10` |
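The clamping and normalization described in the table could look roughly like this. The ranges and defaults are taken from the table; the function names, the rounding behavior, and the use of `Infinity` to represent "no explicit limit" are assumptions made for illustration.

```typescript
interface RawInput {
  maxResults?: number;
  maxConcurrency?: number;
  requestTimeoutSecs?: number;
  maxRetries?: number;
}

interface NormalizedInput {
  resultLimit: number; // Infinity when maxResults is 0 (assumed representation)
  maxConcurrency: number;
  requestTimeoutSecs: number;
  maxRetries: number;
}

// Rounds to an integer and clamps into [min, max]; falls back when missing or NaN.
function clampInt(value: number | undefined, min: number, max: number, fallback: number): number {
  if (typeof value !== "number" || isNaN(value)) return fallback;
  return Math.min(max, Math.max(min, Math.round(value)));
}

function normalizeInput(input: RawInput): NormalizedInput {
  const maxResults =
    typeof input.maxResults === "number" && input.maxResults >= 0
      ? Math.floor(input.maxResults)
      : 100;

  return {
    resultLimit: maxResults === 0 ? Infinity : maxResults,
    maxConcurrency: clampInt(input.maxConcurrency, 1, 5, 5),
    requestTimeoutSecs: clampInt(input.requestTimeoutSecs, 10, 120, 30),
    maxRetries: clampInt(input.maxRetries, 0, 10, 3),
  };
}
```

With this sketch, an input of `{ "maxConcurrency": 50, "requestTimeoutSecs": 1 }` would resolve to a concurrency of 5 and a timeout of 10 seconds.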

📤 Export Formats

JSON (Default)

Full structured data in the default dataset. Best for custom pipelines, APIs, and downstream processing.

Generic CSV

Universal CSV format compatible with spreadsheets and generic CRM imports.

HubSpot CSV

Pre-mapped for HubSpot-style imports, including fields such as:

Company Name
Website URL
Phone Number
Email
City
State/Region
Country/Region

Salesforce CSV

Pre-mapped for Salesforce-style account imports, including:

Account Name
Website
Phone
BillingCity
BillingState
BillingCountry
Industry

Apollo CSV

Pre-mapped for Apollo-style company imports.

When `exportFormat` is not `json`, the actor also writes a file to the default key-value store:

leads.csv
leads.hubspot
leads.salesforce
leads.apollo
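A field mapper for the HubSpot-style export might be sketched as below. The column names come from the HubSpot list above; everything else is an assumption, including the helper names, the choice of the first email and phone per lead, and the hand-rolled CSV escaping (the Actor itself is described as using csv-stringify, which this dependency-free sketch does not call).

```typescript
interface ExportLocation {
  city?: string;
  state?: string;
  country?: string;
}

interface ExportableLead {
  company_name: string;
  website?: string;
  phone_numbers?: string[];
  emails?: string[];
  location?: ExportLocation;
}

const HUBSPOT_COLUMNS: string[] = [
  "Company Name",
  "Website URL",
  "Phone Number",
  "Email",
  "City",
  "State/Region",
  "Country/Region",
];

// Maps one lead to a row in HUBSPOT_COLUMNS order; missing values become "".
function toHubSpotRow(lead: ExportableLead): string[] {
  const loc: ExportLocation = lead.location || {};
  return [
    lead.company_name,
    lead.website || "",
    (lead.phone_numbers || [])[0] || "",
    (lead.emails || [])[0] || "",
    loc.city || "",
    loc.state || "",
    loc.country || "",
  ];
}

// Quotes a cell when it contains a comma, quote, or line break.
function csvEscape(value: string): string {
  return /[",\r\n]/.test(value) ? '"' + value.replace(/"/g, '""') + '"' : value;
}

function toHubSpotCsv(leads: ExportableLead[]): string {
  const rows: string[][] = [HUBSPOT_COLUMNS].concat(leads.map((lead) => toHubSpotRow(lead)));
  return rows.map((row) => row.map(csvEscape).join(",")).join("\n");
}
```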

🛡️ Anti-Blocking Strategy

🔁 Rotating Proxies: Supports Apify proxy configuration
🕒 Adaptive Delays: Randomized delays between requests
🧾 Realistic Headers: Browser-like request headers
♻️ Retry Logic: Failed core requests are retried through the crawl flow
⚠️ Graceful Enrichment Failure: If a website is unreachable, the base profile lead is still preserved
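Randomized delays and retries can be combined as in the sketch below. It is illustrative only: per the list above, the Actor retries through its crawl flow, whereas `withRetries` here is a standalone hypothetical helper, and the 500 to 1500 ms delay window is an assumed value.

```typescript
// Returns an integer delay in [minMs, maxMs]; rng is injectable for testing.
function randomDelayMs(minMs: number, maxMs: number, rng: () => number = Math.random): number {
  const lo = Math.min(minMs, maxMs);
  const hi = Math.max(minMs, maxMs);
  return Math.floor(lo + rng() * (hi - lo + 1));
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });
}

// Runs task up to maxRetries + 1 times, pausing a random interval between attempts.
async function withRetries<T>(task: () => Promise<T>, maxRetries: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        await sleep(randomDelayMs(500, 1500)); // assumed delay window
      }
    }
  }
  throw lastError;
}
```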

📋 Use Cases

Sales Prospecting

"Find digital marketing agencies in New York with strong ratings and export them for outreach."

Recruiter Pipeline

"Build a list of web design agencies in California for partnership or recruiting outreach."

Competitive Intelligence

"Map SEO agencies in the UK with employee counts, founding years, and social profiles."

Lead Operations

"Generate structured agency lead datasets for CRM import and internal prospecting workflows."

🚀 Quick Start

Prerequisites

Node.js 18+
Apify CLI for local Actor runs

Install

npm install

Run locally

npm run build
npm start

Development scripts

npm run start:dev
npm run dev
npm run type-check
npm run lint

Apify CLI

apify run

🔧 Technical Details

Runtime: Node.js 18+ with TypeScript
Crawler Core: Crawlee BasicCrawler
HTTP Layer: gotScraping
SDK: Apify SDK v3
Directory Parsing: Adapter-based extraction per supported directory
Deduplication: Domain, normalized profile URL, then company-name hash fallback
Exports: csv-stringify with custom field mappers
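The deduplication fallback order (domain, then normalized profile URL, then a company-name hash) could be expressed as a key function like the one below. This is a sketch under assumptions: the key prefixes, the URL normalization rules, and the FNV-1a hash are illustrative choices, not the Actor's confirmed implementation.

```typescript
interface KeyableLead {
  domain?: string;
  agency_profile_url?: string;
  company_name?: string;
}

// Lowercases and strips protocol, "www.", query/fragment, and trailing slashes.
function normalizeProfileUrl(url: string): string {
  return url
    .trim()
    .toLowerCase()
    .replace(/^https?:\/\//, "")
    .replace(/^www\./, "")
    .replace(/[?#].*$/, "")
    .replace(/\/+$/, "");
}

// 32-bit FNV-1a over UTF-16 code units, returned as hex (assumed hash choice).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

function dedupKey(lead: KeyableLead): string {
  if (lead.domain) {
    return "domain:" + lead.domain.trim().toLowerCase().replace(/^www\./, "");
  }
  if (lead.agency_profile_url) {
    return "profile:" + normalizeProfileUrl(lead.agency_profile_url);
  }
  const name = (lead.company_name || "").toLowerCase().replace(/\s+/g, " ").trim();
  return "name:" + fnv1a(name);
}
```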

⚠️ Current Notes

`deepEnrichment` is visible in the input schema, but the current runtime does not use a separate live Playwright enrichment flow.
Website enrichment is lightweight and does not guarantee JavaScript-rendered contact data.
Dataset output can contain both a base profile item and a later enriched item for the same agency. If you need one final row per company, deduplicate by `dedup_key`, `domain`, or `agency_profile_url`.
Directory markup changes over time, so selector updates may occasionally be required.
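To collapse base and enriched items into one row per company, a post-processing step along these lines may help. It is a sketch, not part of the Actor: `collapseLeads` is a hypothetical helper, and the rule of preferring an enriched item over a base item with the same key is an assumption.

```typescript
interface LeadRow {
  dedup_key?: string;
  domain?: string;
  agency_profile_url?: string;
  enriched?: boolean;
  [field: string]: unknown;
}

// Keeps one row per key (dedup_key, else domain, else agency_profile_url),
// replacing a base row when an enriched row with the same key appears.
function collapseLeads(rows: LeadRow[]): LeadRow[] {
  const byKey = new Map<string, LeadRow>();
  const withoutKey: LeadRow[] = [];

  for (const row of rows) {
    const key = row.dedup_key || row.domain || row.agency_profile_url;
    if (!key) {
      withoutKey.push(row); // nothing to match on, so keep the row as-is
      continue;
    }
    const existing = byKey.get(key);
    if (!existing || (row.enriched === true && existing.enriched !== true)) {
      byKey.set(key, row);
    }
  }

  return Array.from(byKey.values()).concat(withoutKey);
}
```

Rows keep the position of their first appearance, so the output order stays stable when an enriched item replaces its base item.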

📝 License

Apache 2.0. See LICENSE for details.

🤝 Support

📧 Found a bug? Open an issue or leave a detailed review note.
📖 Check the Apify Documentation
💬 Join the Apify Discord Community

Built with Crawlee and Apify SDK v3 ✨
