Linkedin Profile Scraper
Under maintenance

Pricing

$1.00 / 1,000 results

An Apify actor that automates scraping of LinkedIn Sales Navigator search results using Playwright browser automation.

Rating

5.0 (3)

Developer

Yurii Lypnyi

Maintained by Community

Actor stats

  • Bookmarked: 0
  • Total users: 6
  • Monthly active users: 6
  • Last modified: 11 days ago

LinkedIn Sales Navigator Scraper

An Apify actor that automates scraping of LinkedIn Sales Navigator search results using Playwright browser automation. Extracts lead information including names, job titles, companies, locations, profile URLs, and connection degrees. Supports authentication via LinkedIn session cookies, configurable delays, proxy settings, and multi-page pagination.

Features

  • LinkedIn Authentication - Inject session cookies for authenticated access to Sales Navigator
  • Lead Data Extraction - Parse and extract structured lead information from search results
  • Intelligent Scrolling - Automatically detect and scroll search result containers to load all results on a page
  • Multi-Page Scraping - Automatically handle pagination with configurable page limits
  • Proxy Support - Optional Apify proxy integration with residential IP support
  • Request Delays - Configurable min/max delays between requests to avoid detection
  • Error Handling - Comprehensive error handling with detailed logging
  • Debug Mode - Automatic capture of HTML and screenshots for troubleshooting
  • User Agent Spoofing - Custom user agent support for browser identification
  • Apify SDK - Full Apify integration for cloud execution
  • Playwright - Browser automation for JavaScript-rendered content

Configuration

Input Parameters

  • cookies (required, array) - Array of LinkedIn session cookies for authentication. Each cookie should have:

    • name - Cookie name
    • value - Cookie value
    • domain - Cookie domain (typically .linkedin.com)
    • Other standard cookie properties (path, expires, secure, httpOnly, sameSite)
  • searchUrl (required, string) - LinkedIn Sales Navigator search URL to scrape

    • Example: https://www.linkedin.com/sales/search/people?query=...
  • userAgent (optional, string) - Browser user agent string

    • Default: Modern Chrome user agent
  • maxPages (optional, integer) - Maximum number of search result pages to scrape

    • 0 = unlimited pagination
    • Default: 0
  • minDelay (optional, integer) - Minimum delay between page requests in seconds

    • Default: 5
  • maxDelay (optional, integer) - Maximum delay between page requests in seconds

    • Default: 20
  • proxy (optional, object) - Proxy configuration

    • useApifyProxy - Enable Apify proxy (boolean)
    • apifyProxyGroups - Proxy groups to use, e.g., ["RESIDENTIAL"]
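
The parameter list above can be checked before a run with a small validator. This is a minimal sketch, not the actor's own code: the function name validate_input and the EXAMPLE_INPUT placeholder values are hypothetical, and only the field names and defaults are taken from the list above.

```python
from typing import Any

# Placeholder values only; real cookies come from your own browser session.
EXAMPLE_INPUT = {
    "cookies": [
        {"name": "cookie_name", "value": "cookie_value", "domain": ".linkedin.com"}
    ],
    "searchUrl": "https://www.linkedin.com/sales/search/people?query=...",
    "maxPages": 3,
    "minDelay": 5,
    "maxDelay": 20,
    "proxy": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
}


def validate_input(actor_input: dict[str, Any]) -> dict[str, Any]:
    """Check the required fields and fill in the documented defaults."""
    cookies = actor_input.get("cookies")
    if not isinstance(cookies, list) or not cookies:
        raise ValueError("cookies must be a non-empty array")
    for cookie in cookies:
        if not isinstance(cookie, dict):
            raise ValueError("each cookie must be an object")
        missing = {"name", "value", "domain"} - set(cookie)
        if missing:
            raise ValueError(f"cookie is missing fields: {sorted(missing)}")

    search_url = actor_input.get("searchUrl")
    if not isinstance(search_url, str) or not search_url:
        raise ValueError("searchUrl must be a non-empty string")

    config = {
        "cookies": cookies,
        "searchUrl": search_url,
        "userAgent": actor_input.get("userAgent"),   # None = use the actor's default
        "maxPages": actor_input.get("maxPages", 0),  # 0 = unlimited pagination
        "minDelay": actor_input.get("minDelay", 5),
        "maxDelay": actor_input.get("maxDelay", 20),
        "proxy": actor_input.get("proxy"),           # None = no proxy configured
    }
    if config["maxPages"] < 0:
        raise ValueError("maxPages must be 0 (unlimited) or a positive integer")
    if config["minDelay"] > config["maxDelay"]:
        raise ValueError("minDelay must not exceed maxDelay")
    return config
```

Calling validate_input(EXAMPLE_INPUT) returns the input with defaults applied; a missing cookies array or an inverted delay range raises ValueError.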

Output Data

The actor extracts the following lead information for each profile:

{
  "name": "Full name",
  "firstName": "First name",
  "lastName": "Last name",
  "headline": "Current position headline",
  "jobTitle": "Job title",
  "companyName": "Current company",
  "location": "Location",
  "profileUrl": "Direct Sales Navigator profile URL",
  "profileId": "LinkedIn profile ID",
  "profilePictureUrl": "URL to profile picture",
  "connectionType": "1st"
}

Connection Types:

  • "1st" = 1st degree connection
  • "2nd" = 2nd degree connection
  • "3rd" = 3rd+ degree connection or not connected
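
When post-processing results, the connectionType values can be tallied or filtered directly. The helpers below are a hypothetical sketch that works on lead records shaped like the output above; the function names are illustrative and not part of the actor.

```python
from collections import Counter

# Labels copied from the "Connection Types" list above.
CONNECTION_LABELS = {
    "1st": "1st degree connection",
    "2nd": "2nd degree connection",
    "3rd": "3rd+ degree connection or not connected",
}


def count_by_connection(leads):
    """Count leads per connectionType; records without the field count as 'unknown'."""
    return Counter(lead.get("connectionType", "unknown") for lead in leads)


def filter_by_connection(leads, connection_type):
    """Return only the leads with the given connectionType, e.g. '2nd'."""
    if connection_type not in CONNECTION_LABELS:
        raise ValueError(f"unknown connection type: {connection_type!r}")
    return [lead for lead in leads if lead.get("connectionType") == connection_type]
```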

Summary Data

The actor also outputs a summary with scraping statistics:

{
  "totalResults": 25,
  "pages": 1,
  "timestamp": "2025-11-16T13:26:01.283Z"
}

This summary is stored in the Actor's key-value store under the summary key and can be used to track scraping progress and performance.
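
A summary of this shape could be assembled as follows. This is a sketch under assumptions: build_summary is a hypothetical helper, and the timestamp format is inferred from the example value above. In an Apify Python actor the result would presumably be written with the SDK's Actor.set_value("summary", summary), which is not shown here.

```python
from datetime import datetime, timezone


def build_summary(leads, pages, now=None):
    """Build the summary record: result count, page count, UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    # Millisecond precision with a trailing "Z", matching the example timestamp.
    timestamp = (
        now.astimezone(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {"totalResults": len(leads), "pages": pages, "timestamp": timestamp}
```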

Getting Started

Local Testing

  1. Update apify_storage/key_value_stores/default/INPUT.json or INPUT.json with your LinkedIn cookies and search URL
  2. Run the actor locally:
cd Linkedin-Profile-Scraper
apify run --input-file=INPUT.json

Note: The actor will try to load input from the Actor API first, then fall back to apify_storage, and finally to local INPUT.json.
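
The fallback order in the note can be pictured as a list of input sources tried in turn. The sketch below is an illustration only, not the actor's implementation; load_input and json_file_source are hypothetical names, and the Actor API lookup is represented by any callable that returns a dict or None.

```python
import json
from pathlib import Path
from typing import Any, Callable, Iterable, Optional

InputSource = Callable[[], Optional[dict[str, Any]]]


def json_file_source(path) -> InputSource:
    """Return a source that reads a JSON file, or yields None if it is absent."""
    def read() -> Optional[dict[str, Any]]:
        file_path = Path(path)
        if not file_path.is_file():
            return None
        return json.loads(file_path.read_text(encoding="utf-8"))
    return read


def load_input(sources: Iterable[InputSource]) -> dict[str, Any]:
    """Try each source in order and return the first non-empty input."""
    for source in sources:
        try:
            data = source()
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or malformed: fall through to the next source
        if data:
            return data
    raise RuntimeError("no input found in any source")
```

With this helper, the documented order would be a list holding the Actor API lookup first, then json_file_source("apify_storage/key_value_stores/default/INPUT.json"), then json_file_source("INPUT.json").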

Extracting LinkedIn Cookies

  1. Open LinkedIn Sales Navigator in your browser
  2. Open Developer Tools (F12)
  3. Go to Application → Cookies
  4. Export cookies (you can use browser extensions or manually copy them)
  5. Format as JSON array and add to INPUT.json
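
If you copy cookies manually as a single "name=value; name2=value2" string, a small script can turn it into the JSON array from step 5. This is a hypothetical helper, not part of the actor; it fills in only name, value, domain, and path, and assumes the .linkedin.com domain mentioned under Input Parameters.

```python
import json


def cookies_from_header(header: str, domain: str = ".linkedin.com") -> list[dict]:
    """Convert a 'name=value; name2=value2' string into a list of cookie objects."""
    cookies = []
    for part in header.split(";"):
        # Split on the first "=" only, since cookie values may contain "=".
        name, sep, value = part.strip().partition("=")
        if not sep or not name:
            continue  # skip empty or malformed fragments
        cookies.append({"name": name, "value": value, "domain": domain, "path": "/"})
    return cookies


def cookies_to_json(header: str) -> str:
    """Render the cookie array as indented JSON, ready to paste into INPUT.json."""
    return json.dumps(cookies_from_header(header), indent=2)
```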

Project Structure

src/
├── main.py - Main actor entry point and orchestration
├── auth.py - LinkedIn authentication and cookie handling
├── parser.py - HTML parsing and lead data extraction
├── proxy.py - Proxy management and configuration
├── pagination.py - Multi-page navigation handling
├── delays.py - Request delays with jitter and exponential backoff
└── errors.py - Error handling and logging utilities

Deployment to Apify

Method 1: Git Repository

  1. Go to the Actor creation page
  2. Click the Link Git Repository button
  3. Configure and deploy

Method 2: Local Push

apify login
apify push

Error Handling

The actor includes comprehensive error handling for:

  • Authentication Errors - Invalid or expired cookies
  • Network Errors - Connection timeouts and failures
  • Parsing Errors - HTML structure changes
  • Navigation Errors - Page navigation failures
  • Proxy Errors - Proxy connection issues

All errors are logged with context for debugging.
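
One way to organize these categories is an exception hierarchy combined with a retry helper that uses exponential backoff and jitter, as mentioned for delays.py. The class names, the choice of which errors are retryable, and the backoff formula below are all assumptions for illustration; the source does not show the contents of errors.py or delays.py.

```python
import random
import time


class ScraperError(Exception):
    """Base class for the error categories listed above (hypothetical names)."""

class AuthenticationError(ScraperError): pass
class NetworkError(ScraperError): pass
class ParsingError(ScraperError): pass
class NavigationError(ScraperError): pass
class ProxyError(ScraperError): pass


# Assumption: transient failures are worth retrying; expired cookies and
# changed HTML structure are not, because a retry cannot fix them.
RETRYABLE = (NetworkError, NavigationError, ProxyError)


def retry_with_backoff(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call action(); on a retryable error wait base_delay * 2**n plus jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return action()
        except RETRYABLE:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            jitter = random.uniform(0, base_delay)
            sleep(base_delay * 2 ** attempt + jitter)
```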

Viewing Results

In Apify Platform

When the actor runs on Apify, results are available in multiple places:

  1. Dataset - Browse all scraped lead records

    • Go to your run's detail page → Dataset tab
    • View results as table, JSON, or export to CSV/Excel
  2. Summary - View scraping statistics

    • Go to your run's detail page → Key-value store tab
    • Look for the summary key with totals and timestamp
  3. Logs - Monitor execution progress

    • Go to your run's detail page → Logs tab
    • Shows page-by-page progress and error messages

Local Results

When running locally:

  • Dataset output - Results are stored in storage/datasets/default/
  • Summary - The summary is stored in storage/key_value_stores/default/ under the summary key
  • Debug files - HTML and screenshots stored in debug/ directory
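
Local dataset output can be loaded back into Python for inspection. The sketch below assumes each record is written as its own .json file in the dataset directory and that files whose names start with "__" are metadata; both are assumptions about the local storage layout, so verify them against your own storage/ directory. load_local_dataset is a hypothetical helper.

```python
import json
from pathlib import Path


def load_local_dataset(directory="storage/datasets/default"):
    """Read every record file in the dataset directory, in filename order."""
    items = []
    for path in sorted(Path(directory).glob("*.json")):
        if path.name.startswith("__"):
            continue  # assumed to be storage metadata, not a lead record
        items.append(json.loads(path.read_text(encoding="utf-8")))
    return items
```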

Debug Output

On each run, the actor automatically creates debug files in the debug/ directory:

  • debug_page_{N}.html - Raw HTML content of each scraped page (useful for troubleshooting parsing issues)
  • page_{N}.png - Full-page screenshots of each scraped page

These files are helpful for debugging parsing issues or verifying that the page loaded correctly.
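
Capturing these two files with Playwright could look like the sketch below. It assumes the async Playwright API and a hypothetical helper name, save_debug_artifacts; the file names follow the patterns listed above, and page.content() and page.screenshot(path=..., full_page=True) are standard Playwright calls.

```python
from pathlib import Path


async def save_debug_artifacts(page, page_number, debug_dir="debug"):
    """Save the page's HTML and a full-page screenshot for one result page."""
    out = Path(debug_dir)
    out.mkdir(parents=True, exist_ok=True)

    html_path = out / f"debug_page_{page_number}.html"
    png_path = out / f"page_{page_number}.png"

    # page is expected to be a Playwright async Page (or a compatible object).
    html_path.write_text(await page.content(), encoding="utf-8")
    await page.screenshot(path=str(png_path), full_page=True)
    return html_path, png_path
```

Inside a scraping loop this would be awaited once per page, right after the results have finished loading.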

Rate Limiting

The actor implements configurable delays to avoid detection:

  • A random delay of minDelay + 2 to maxDelay + 5 seconds between page transitions (7 to 25 seconds with the default settings)
  • Variable scroll intervals (700-1200ms between scrolls)
  • Graceful handling of dynamic content loading
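
The delay ranges above can be expressed in a few lines. This is a sketch assuming uniformly distributed delays; the source states only the ranges, not the distribution, and the function names are hypothetical.

```python
import random


def page_delay_seconds(min_delay=5, max_delay=20, rng=random):
    """Delay before the next page: between minDelay + 2 and maxDelay + 5 seconds."""
    return rng.uniform(min_delay + 2, max_delay + 5)


def scroll_interval_seconds(rng=random):
    """Pause between scrolls: 700 to 1200 ms, expressed in seconds."""
    return rng.uniform(0.7, 1.2)
```

Passing a seeded random.Random instance as rng makes the delays reproducible in tests.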

Important Notes

  • ⚠️ Ensure LinkedIn session cookies are valid and not expired
  • ⚠️ Respect LinkedIn's Terms of Service
  • ⚠️ Use appropriate delays to avoid account restrictions
  • 💡 Consider using Apify's Residential Proxy for better success rates

Resources