Linkedin Profile Scraper
Pricing
$1.00 / 1,000 results
Rating
5.0 (3)
Developer
Yurii Lypnyi
LinkedIn Sales Navigator Scraper
An Apify actor that automates scraping of LinkedIn Sales Navigator search results using Playwright browser automation. Extracts lead information including names, job titles, companies, locations, profile URLs, and connection degrees. Supports authentication via LinkedIn session cookies, configurable delays, proxy settings, and multi-page pagination.
Features
- LinkedIn Authentication - Inject session cookies for authenticated access to Sales Navigator
- Lead Data Extraction - Parse and extract structured lead information from search results
- Intelligent Scrolling - Automatically detect and scroll search result containers to load all results on a page
- Multi-Page Scraping - Automatically handle pagination with configurable page limits
- Proxy Support - Optional Apify proxy integration with residential IP support
- Request Delays - Configurable min/max delays between requests to avoid detection
- Error Handling - Comprehensive error handling with detailed logging
- Debug Mode - Automatic capture of HTML and screenshots for troubleshooting
- User Agent Spoofing - Custom user agent support for browser identification
- Apify SDK - Full Apify integration for cloud execution
- Playwright - Browser automation for JavaScript-rendered content
Configuration
Input Parameters
- cookies (required, array) - Array of LinkedIn session cookies for authentication. Each cookie should have:
  - name - Cookie name
  - value - Cookie value
  - domain - Cookie domain (typically .linkedin.com)
  - Other standard cookie properties (path, expires, secure, httpOnly, sameSite)
- searchUrl (required, string) - LinkedIn Sales Navigator search URL to scrape
  - Example: https://www.linkedin.com/sales/search/people?query=...
- userAgent (optional, string) - Browser user agent string
  - Default: Modern Chrome user agent
- maxPages (optional, integer) - Maximum number of search result pages to scrape
  - 0 = unlimited pagination
  - Default: 0
- minDelay (optional, integer) - Minimum delay between page requests in seconds
  - Default: 5
- maxDelay (optional, integer) - Maximum delay between page requests in seconds
  - Default: 20
- proxy (optional, object) - Proxy configuration
  - useApifyProxy - Enable Apify proxy (boolean)
  - apifyProxyGroups - Proxy groups to use, e.g., ["RESIDENTIAL"]
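The parameter rules above can be sketched as a small validation step. This is an illustration only, not the actor's own code: validate_input is a hypothetical helper, the "/sales/search/" URL check is an assumed heuristic, and li_at is used purely as an example cookie name.

```python
import json
from typing import Any

# Defaults taken from the parameter list above.
DEFAULTS = {"maxPages": 0, "minDelay": 5, "maxDelay": 20}


def validate_input(raw: dict[str, Any]) -> dict[str, Any]:
    """Check required fields and fill in documented defaults (hypothetical helper)."""
    cookies = raw.get("cookies")
    if not isinstance(cookies, list) or not cookies:
        raise ValueError("cookies must be a non-empty array")
    for cookie in cookies:
        missing = {"name", "value", "domain"} - set(cookie)
        if missing:
            raise ValueError(f"cookie is missing fields: {sorted(missing)}")

    search_url = raw.get("searchUrl")
    # Assumption: Sales Navigator search URLs contain "/sales/search/".
    if not isinstance(search_url, str) or "/sales/search/" not in search_url:
        raise ValueError("searchUrl must be a Sales Navigator search URL")

    config = {**DEFAULTS, **raw}
    if config["maxPages"] < 0:
        raise ValueError("maxPages must be 0 (unlimited) or a positive integer")
    if config["minDelay"] > config["maxDelay"]:
        raise ValueError("minDelay must not exceed maxDelay")
    return config


example_input = {
    "cookies": [
        {"name": "li_at", "value": "<your session cookie value>", "domain": ".linkedin.com"}
    ],
    "searchUrl": "https://www.linkedin.com/sales/search/people?query=...",
    "maxPages": 3,
    "proxy": {"useApifyProxy": True, "apifyProxyGroups": ["RESIDENTIAL"]},
}

config = validate_input(example_input)
print(json.dumps(config, indent=2))
```

Running a check like this before launching a browser surfaces a malformed INPUT.json early, before any LinkedIn request is made.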
Output Data
The actor extracts the following lead information for each profile:
{
  "name": "Full name",
  "firstName": "First name",
  "lastName": "Last name",
  "headline": "Current position headline",
  "jobTitle": "Job title",
  "companyName": "Current company",
  "location": "Location",
  "profileUrl": "Direct Sales Navigator profile URL",
  "profileId": "LinkedIn profile ID",
  "profilePictureUrl": "URL to profile picture",
  "connectionType": "1st"
}
Connection Types:
- "1st" = 1st degree connection
- "2nd" = 2nd degree connection
- "3rd" = 3rd+ degree connection or not connected
Summary Data
The actor also outputs a summary with scraping statistics:
{
  "totalResults": 25,
  "pages": 1,
  "timestamp": "2025-11-16T13:26:01.283Z"
}
This summary is stored in the Actor's key-value store under the summary key and can be used to track scraping progress and performance.
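A summary of this shape could be assembled as in the sketch below. build_summary is a hypothetical helper, not taken from the actor's source; it only reproduces the three fields and the millisecond "Z" timestamp format shown in the sample. In an actor built on the Apify SDK for Python, the resulting dict would typically be saved with something like await Actor.set_value("summary", summary).

```python
from datetime import datetime, timezone
from typing import Optional


def build_summary(leads: list[dict], pages: int, now: Optional[datetime] = None) -> dict:
    """Build the summary record from the scraped leads (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # Millisecond precision with a trailing "Z", matching the sample timestamp.
    timestamp = (
        now.astimezone(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {"totalResults": len(leads), "pages": pages, "timestamp": timestamp}


if __name__ == "__main__":
    print(build_summary([{"name": "Example Lead"}], pages=1))
```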
Getting Started
Local Testing
- Update apify_storage/key_value_stores/default/INPUT.json or INPUT.json with your LinkedIn cookies and search URL
- Run the actor locally:
  cd Linkedin-Profile-Scraper
  apify run --input-file=INPUT.json
Note: The actor will try to load input from the Actor API first, then fall back to apify_storage, and finally to local INPUT.json.
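The fallback order in the note can be sketched as follows. load_input and FALLBACK_PATHS are hypothetical names used for illustration; api_input stands for whatever the Actor API returned (for example, the result of await Actor.get_input() in the Apify SDK for Python), and the actor's real lookup logic may differ.

```python
import json
from pathlib import Path
from typing import Any, Optional

# Local fallback locations, in the order described in the note above.
FALLBACK_PATHS = (
    Path("apify_storage/key_value_stores/default/INPUT.json"),
    Path("INPUT.json"),
)


def load_input(api_input: Optional[dict[str, Any]], base_dir: Path = Path(".")) -> dict[str, Any]:
    """Return the first available input: API input, then each local file in order."""
    if api_input:
        return api_input
    for relative in FALLBACK_PATHS:
        candidate = base_dir / relative
        if candidate.is_file():
            return json.loads(candidate.read_text(encoding="utf-8"))
    raise FileNotFoundError("No actor input found in the API or local INPUT.json files")
```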
Extracting LinkedIn Cookies
- Open LinkedIn Sales Navigator in your browser
- Open Developer Tools (F12)
- Go to Application → Cookies
- Export cookies (you can use browser extensions or manually copy them)
- Format as JSON array and add to INPUT.json
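Cookies exported by browser extensions often need light cleanup before a browser automation tool accepts them. The sketch below assumes an export that uses expirationDate and lowercase sameSite values such as "no_restriction" (an assumption about the extension, not something this actor documents) and maps it to the fields Playwright's add_cookies expects, where sameSite must be "Strict", "Lax", or "None". normalize_cookie is a hypothetical helper.

```python
from typing import Any

# Assumed mapping from extension export values to Playwright's sameSite values.
SAME_SITE_MAP = {"strict": "Strict", "lax": "Lax", "none": "None", "no_restriction": "None"}


def normalize_cookie(cookie: dict[str, Any]) -> dict[str, Any]:
    """Keep only the fields a Playwright cookie needs (hypothetical helper)."""
    normalized: dict[str, Any] = {
        "name": cookie["name"],
        "value": cookie["value"],
        "domain": cookie.get("domain", ".linkedin.com"),
        "path": cookie.get("path", "/"),
    }
    # Extensions commonly call the expiry "expirationDate"; Playwright calls it "expires".
    expires = cookie.get("expires", cookie.get("expirationDate"))
    if expires is not None:
        normalized["expires"] = float(expires)
    for flag in ("secure", "httpOnly"):
        if flag in cookie:
            normalized[flag] = bool(cookie[flag])
    same_site = SAME_SITE_MAP.get(str(cookie.get("sameSite", "")).lower())
    if same_site:
        normalized["sameSite"] = same_site  # unknown values are dropped
    return normalized


exported = {
    "name": "li_at",
    "value": "<your session cookie value>",
    "domain": ".linkedin.com",
    "expirationDate": 1793885161.5,
    "sameSite": "no_restriction",
    "secure": True,
    "hostOnly": False,
}
print(normalize_cookie(exported))
```

The normalized list can then be written to the cookies field of INPUT.json.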
Project Structure
src/
├── main.py        - Main actor entry point and orchestration
├── auth.py        - LinkedIn authentication and cookie handling
├── parser.py      - HTML parsing and lead data extraction
├── proxy.py       - Proxy management and configuration
├── pagination.py  - Multi-page navigation handling
├── delays.py      - Request delays with jitter and exponential backoff
└── errors.py      - Error handling and logging utilities
Deployment to Apify
Method 1: Git Repository
- Go to Actor creation page
- Click Link Git Repository button
- Configure and deploy
Method 2: Local Push
apify login
apify push
Error Handling
The actor includes comprehensive error handling for:
- Authentication Errors - Invalid or expired cookies
- Network Errors - Connection timeouts and failures
- Parsing Errors - HTML structure changes
- Navigation Errors - Page navigation failures
- Proxy Errors - Proxy connection issues
All errors are logged with context for debugging.
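For transient failures such as network timeouts, one common pattern is retrying with exponential backoff and jitter, which the project structure mentions for delays.py. The sketch below is a generic version of that pattern under assumed parameters (four attempts, a 2-second base delay, a 60-second cap); retry_with_backoff is a hypothetical name and the actor's own retry policy is not documented here.

```python
import logging
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("scraper")


def retry_with_backoff(
    operation: Callable[[], T],
    *,
    attempts: int = 4,
    base_delay: float = 2.0,
    max_delay: float = 60.0,
    retry_on: tuple[type[BaseException], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying the listed exception types with jittered backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as exc:
            if attempt == attempts:
                logger.error("Giving up after %d attempts: %r", attempt, exc)
                raise
            # Exponential ceiling (2s, 4s, 8s, ...) capped at max_delay, with full jitter.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            delay = random.uniform(0, ceiling)
            logger.warning("Attempt %d failed (%r); retrying in %.1fs", attempt, exc, delay)
            sleep(delay)
    raise ValueError("attempts must be at least 1")  # only reached when attempts < 1
```

Errors outside retry_on, such as a parsing failure caused by an HTML structure change, propagate immediately, since retrying would not help.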
Viewing Results
In Apify Platform
When the actor runs on Apify, results are available in multiple places:
- Dataset - Browse all scraped lead records
  - Go to your run's detail page → Dataset tab
  - View results as table, JSON, or export to CSV/Excel
- Summary - View scraping statistics
  - Go to your run's detail page → Key-value store tab
  - Look for the summary key with totals and timestamp
- Logs - Monitor execution progress
  - Go to your run's detail page → Logs tab
  - Shows page-by-page progress and error messages
Local Results
When running locally:
- Dataset output - Results are stored in storage/datasets/default/
- Summary - Summary stored in storage/key_value_stores/default/ under the summary key
- Debug files - HTML and screenshots stored in the debug/ directory
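Local dataset output can be inspected with a few lines of Python. The sketch below assumes each lead is stored as its own JSON file in storage/datasets/default/ and that files whose names start with "__" are storage metadata. Both are assumptions about the local storage layout, so adjust them if your directory looks different. load_local_leads and count_by_connection are hypothetical helpers.

```python
import json
from collections import Counter
from pathlib import Path


def load_local_leads(dataset_dir: Path = Path("storage/datasets/default")) -> list[dict]:
    """Read every lead record from the local dataset directory."""
    leads = []
    for path in sorted(dataset_dir.glob("*.json")):
        if path.name.startswith("__"):
            continue  # assumed metadata file, not a lead record
        leads.append(json.loads(path.read_text(encoding="utf-8")))
    return leads


def count_by_connection(leads: list[dict]) -> Counter:
    """Tally leads by their connectionType field ("1st", "2nd", "3rd")."""
    return Counter(lead.get("connectionType", "unknown") for lead in leads)


if __name__ == "__main__":
    leads = load_local_leads()
    print(len(leads), "leads", dict(count_by_connection(leads)))
```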
Debug Output
When the actor runs, it automatically creates debug files in the debug/ directory:
- debug_page_{N}.html - Raw HTML content of each scraped page (useful for troubleshooting parsing issues)
- page_{N}.png - Full-page screenshots of each scraped page
These files are helpful for debugging parsing issues or verifying that the page loaded correctly.
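The file naming above can be sketched as a small helper. save_debug_artifacts is a hypothetical function, not the actor's code; it takes the HTML and PNG bytes as plain arguments so it can run without a browser. With Playwright these would typically come from page.content() and page.screenshot(full_page=True).

```python
from pathlib import Path


def save_debug_artifacts(
    page_number: int,
    html: str,
    screenshot_png: bytes,
    debug_dir: Path = Path("debug"),
) -> tuple[Path, Path]:
    """Write debug_page_{N}.html and page_{N}.png for one scraped page."""
    debug_dir.mkdir(parents=True, exist_ok=True)
    html_path = debug_dir / f"debug_page_{page_number}.html"
    png_path = debug_dir / f"page_{page_number}.png"
    html_path.write_text(html, encoding="utf-8")
    png_path.write_bytes(screenshot_png)
    return html_path, png_path
```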
Rate Limiting
The actor implements configurable delays to avoid detection:
- Random delay of minDelay + 2 to maxDelay + 5 seconds between page transitions
- Variable scroll intervals (700-1200 ms between scrolls)
- Graceful handling of dynamic content loading
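The delay ranges above can be sketched as two small functions. The uniform distribution is an assumption, since the actor's exact sampling method is not documented, and both function names are hypothetical.

```python
import random
from typing import Optional


def page_delay_seconds(
    min_delay: int = 5, max_delay: int = 20, rng: Optional[random.Random] = None
) -> float:
    """Pick a wait between page transitions: minDelay + 2 to maxDelay + 5 seconds."""
    rng = rng or random.Random()
    return rng.uniform(min_delay + 2, max_delay + 5)


def scroll_interval_seconds(rng: Optional[random.Random] = None) -> float:
    """Pick a pause between scrolls: 700 to 1200 ms, returned in seconds."""
    rng = rng or random.Random()
    return rng.uniform(700, 1200) / 1000


if __name__ == "__main__":
    print(f"next page in {page_delay_seconds():.1f}s, next scroll in {scroll_interval_seconds():.2f}s")
```

With the default minDelay of 5 and maxDelay of 20, page transitions wait between 7 and 25 seconds.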
Important Notes
- ⚠️ Ensure LinkedIn session cookies are valid and not expired
- ⚠️ Respect LinkedIn's Terms of Service
- ⚠️ Use appropriate delays to avoid account restrictions
- 💡 Consider using Apify's Residential Proxy for better success rates