Linkedin Url Scrape

mikepowers/linkedin-url-scrape

Scrape Unlimited LinkedIn Profile URLs

.actor/Dockerfile

# First, specify the base Docker image.
# You can see the Docker images from Apify at https://hub.docker.com/r/apify/.
# You can also use any other image from Docker Hub.
FROM apify/actor-python:3.11

# Second, copy just requirements.txt into the Actor image,
# since it should be the only file that affects the dependency install in the next step,
# in order to speed up the build
COPY requirements.txt ./

# Install the packages specified in requirements.txt,
# Print the installed Python version, pip version
# and all installed packages with their versions for debugging
RUN echo "Python version:" \
 && python --version \
 && echo "Pip version:" \
 && pip --version \
 && echo "Installing dependencies:" \
 && pip install -r requirements.txt \
 && echo "All installed Python packages:" \
 && pip freeze

# Next, copy the remaining files and directories with the source code.
# Since we do this after installing the dependencies, quick build will be really fast
# for most source file changes.
COPY . ./

# Use compileall to ensure the runnability of the Actor Python code.
RUN python3 -m compileall -q .

# Specify how to launch the source code of your Actor.
# By default, the "python3 -m src" command is run
CMD ["python3", "-m", "src"]

.actor/actor.json

{
    "actorSpecification": 1,
    "name": "my-actor-15",
    "title": "Scrape single page in Python",
    "description": "Scrape data from single page with provided URL.",
    "version": "0.0",
    "meta": {
        "templateId": "python-start"
    },
    "input": "./input_schema.json",
    "dockerfile": "./Dockerfile"
}

.actor/input_schema.json

{
    "title": "Scrape LinkedIn profiles based on keywords",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "keywords": {
            "title": "Search Keywords",
            "type": "array",
            "description": "Enter the keywords to search for LinkedIn profiles, e.g., job titles, industries, locations.",
            "editor": "stringList",
            "items": {
                "type": "string"
            },
            "prefill": ["chief product officer", "united states", "insurance"]
        },
        "numPages": {
            "title": "Number of Pages",
            "type": "integer",
            "description": "The number of pages to scrape (each page corresponds to a set of search results).",
            "editor": "number",
            "minimum": 1,
            "default": 1
        }
    },
    "required": ["keywords", "numPages"]
}
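
For reference, the object this schema validates is just a keywords list plus a page count. Below is a minimal sketch of the prefilled input as `Actor.get_input()` would return it, and of how `numPages` maps to the Google `start` offsets used in src/main.py further down (the 10-results-per-page step is an assumption mirroring that loop):

# Hypothetical run input matching .actor/input_schema.json (prefill values shown)
example_input = {
    "keywords": ["chief product officer", "united states", "insurance"],
    "numPages": 2,
}

# numPages maps to Google result offsets in steps of 10, as in src/main.py
offsets = list(range(0, example_input["numPages"] * 10, 10))
print(offsets)  # [0, 10]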

src/__main__.py

1"""
2This module serves as the entry point for executing the Apify Actor. It handles the configuration of logging
3settings. The `main()` coroutine is then executed using `asyncio.run()`.
4
5Feel free to modify this file to suit your specific needs.
6"""
7
8import asyncio
9import logging
10
11from apify.log import ActorLogFormatter
12
13from .main import main
14
15# Configure loggers
16handler = logging.StreamHandler()
17handler.setFormatter(ActorLogFormatter())
18
19apify_client_logger = logging.getLogger('apify_client')
20apify_client_logger.setLevel(logging.INFO)
21apify_client_logger.addHandler(handler)
22
23apify_logger = logging.getLogger('apify')
24apify_logger.setLevel(logging.DEBUG)
25apify_logger.addHandler(handler)
26
27# Execute the Actor main coroutine
28asyncio.run(main())

src/main.py

import asyncio
from bs4 import BeautifulSoup
import requests
from apify import Actor
import re

async def main() -> None:
    async with Actor() as actor:
        actor_input = await actor.get_input() or {}
        keywords = actor_input.get('keywords', ["chief product officer", "united states", "insurance"])
        num_pages = actor_input.get('numPages', 1)  # Get the number of pages from input

        base_url = 'https://www.google.com/search?q=site%3Alinkedin.com%2Fin%2F+'
        formatted_keywords = '+'.join(f'(%22{keyword.replace(" ", "+")}%22)' for keyword in keywords)

        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'
        }

        linkedin_urls = []

        # Loop through pages based on `num_pages`, incrementing `start` by 10 for each page
        for page_start in range(0, num_pages * 10, 10):
            url = f"{base_url}{formatted_keywords}&start={page_start}"
            print(url)
            response = requests.get(url, headers=headers)
            soup = BeautifulSoup(response.text, 'html.parser')
            links = soup.find_all('a', href=True)

            # Extract LinkedIn URLs
            for link in links:
                match = re.search(r'(https?://www\.linkedin\.com/in/[^&]+)', link['href'])
                if match:
                    linkedin_url = match.group(1)
                    if linkedin_url not in linkedin_urls:  # Avoid duplicates
                        linkedin_urls.append(linkedin_url)

        # Output the LinkedIn URLs
        for url in linkedin_urls:
            await actor.push_data({"LinkedIn URL": url})

        actor.log.info(f"Found and saved {len(linkedin_urls)} LinkedIn URLs based on the keywords across {num_pages} pages.")

if __name__ == '__main__':
    asyncio.run(main())
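
To see what the Actor actually requests and extracts, the query building and URL extraction above can be exercised outside the Apify runtime. A minimal sketch, with the keyword formatting and regex copied from main.py and a made-up sample href for illustration:

# Standalone sketch of the query building and URL extraction from src/main.py
import re

keywords = ["chief product officer", "united states", "insurance"]
base_url = 'https://www.google.com/search?q=site%3Alinkedin.com%2Fin%2F+'
formatted_keywords = '+'.join(f'(%22{keyword.replace(" ", "+")}%22)' for keyword in keywords)

# First results page (start=0)
print(f"{base_url}{formatted_keywords}&start=0")

# The same regex main.py applies to every <a href> in the results page;
# the href below is invented purely for illustration.
sample_href = "/url?q=https://www.linkedin.com/in/jane-doe-12345&sa=U"
match = re.search(r'(https?://www\.linkedin\.com/in/[^&]+)', sample_href)
if match:
    print(match.group(1))  # https://www.linkedin.com/in/jane-doe-12345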

.dockerignore

# configurations
.idea

# crawlee and apify storage folders
apify_storage
crawlee_storage
storage

# installed files
.venv

# git folder
.git

.editorconfig

root = true

[*]
indent_style = space
indent_size = 4
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
end_of_line = lf

.gitignore

# This file tells Git which files shouldn't be added to source control

.idea
.DS_Store

apify_storage
storage/*
!storage/key_value_stores
storage/key_value_stores/*
!storage/key_value_stores/default
storage/key_value_stores/default/*
!storage/key_value_stores/default/INPUT.json

.venv/
.env/
__pypackages__
dist/
build/
*.egg-info/
*.egg

__pycache__

.mypy_cache
.dmypy.json
dmypy.json
.pytest_cache
.ruff_cache

.scrapy
*.log

requirements.txt

# Feel free to add your Python dependencies below. For formatting guidelines, see:
# https://pip.pypa.io/en/latest/reference/requirements-file-format/

apify ~= 1.6.0
beautifulsoup4 ~= 4.12.2
httpx ~= 0.25.2
types-beautifulsoup4 ~= 4.12.0.7
requests ~= 2.28.1
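
Note that httpx is pinned here even though src/main.py fetches with the blocking requests library inside an async coroutine. If you wanted the fetch to be non-blocking, a hedged sketch of an httpx-based replacement follows (not part of this Actor's code; it assumes the same url and headers values as main.py):

# Optional sketch: async fetch with httpx instead of the blocking requests.get
import asyncio
import httpx

async def fetch(url: str, headers: dict) -> str:
    async with httpx.AsyncClient(headers=headers, follow_redirects=True) as client:
        response = await client.get(url)
        return response.text

# Example usage with a placeholder URL:
# html = asyncio.run(fetch("https://www.google.com/search?q=example", {"User-Agent": "Mozilla/5.0"}))
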
Maintained by Community
Actor metrics
  • 116 monthly users
  • 8 stars
  • 100.0% runs succeeded
  • 77 days response time
  • Created in Mar 2024
  • Modified 4 months ago