Login with Selenium

Deprecated

Pricing

Pay per usage

Performs a simple login using Selenium and returns the resulting cookies.

Rating

0.0

(0)

Developer

José Eduardo Piña Castro

Maintained by Community

Last modified: 2 months ago

.actor/actor.json

{
    "actorSpecification": 1,
    "name": "my-actor",
    "title": "Getting started with Python and Selenium",
    "description": "Scrapes titles of websites using Selenium.",
    "version": "0.0",
    "buildTag": "latest",
    "meta": {
        "templateId": "python-selenium"
    },
    "input": "./input_schema.json",
    "dockerfile": "./Dockerfile"
}

.actor/input_schema.json

{
    "title": "Python Selenium Scraper",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "start_urls": {
            "title": "Start URLs",
            "type": "array",
            "description": "URLs to start with",
            "prefill": [
                { "url": "https://apify.com" }
            ],
            "editor": "requestListSources"
        },
        "max_depth": {
            "title": "Maximum depth",
            "type": "integer",
            "description": "Depth to which to scrape to",
            "default": 1
        }
    },
    "required": ["start_urls"]
}
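The schema above declares only `start_urls` and `max_depth`, while `src/main.py` also reads a `credentials` object with `user`, `password`, `submit`, and optional `extra` entries. The following is a minimal sketch of what such an input might look like, together with a hypothetical helper (`find_missing_login_fields`, not part of the Actor) that reports which required fields are absent. The URL, credential values, and XPaths are placeholders, not values from the original Actor.

```python
from typing import Any


def find_missing_login_fields(actor_input: dict[str, Any]) -> list[str]:
    """Return dotted paths of required login fields that are absent or empty."""
    missing = []
    if not actor_input.get("start_urls"):
        missing.append("start_urls")

    credentials = actor_input.get("credentials") or {}
    # These are the fields that src/main.py checks before it starts the browser.
    required = [
        ("user", "value"),
        ("user", "xpath"),
        ("password", "value"),
        ("password", "xpath"),
        ("submit", "xpath"),
    ]
    for section, key in required:
        if not (credentials.get(section) or {}).get(key):
            missing.append(f"credentials.{section}.{key}")
    return missing


# Hypothetical example input; every value below is a placeholder.
example_input = {
    "start_urls": [{"url": "https://example.com/login"}],
    "credentials": {
        "user": {"value": "alice", "xpath": "//input[@name='username']"},
        "password": {"value": "s3cret", "xpath": "//input[@name='password']"},
        "submit": {"xpath": "//button[@type='submit']"},
        # Optional settings: wait timeout in seconds and a post-login marker.
        "extra": {"wait_time": 10, "ok_page_xpath": "//a[@href='/logout']"},
    },
}

print(find_missing_login_fields(example_input))
```

Running a check like this before launching Chrome would let the Actor report every missing field at once instead of exiting on the first one.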

src/__init__.py

src/__main__.py

import asyncio

from .main import main

# Execute the Actor entry point.
asyncio.run(main())

src/main.py

"""This module defines the main entry point for the Apify Actor.

Feel free to modify this file to suit your specific needs.

To build Apify Actors, utilize the Apify SDK toolkit, read more at the official documentation:
https://docs.apify.com/sdk/python
"""

import asyncio
from urllib.parse import urljoin

from apify import Actor, Request
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager

# To run this Actor locally, you need to have the Selenium Chromedriver installed.
# Follow the installation guide at:
# https://www.selenium.dev/documentation/webdriver/getting_started/install_drivers/
# When running on the Apify platform, the Chromedriver is already included
# in the Actor's Docker image.


async def main() -> None:
    """Log in to a site with Selenium and push the resulting cookies to the dataset.

    This coroutine is executed using `asyncio.run()`, so it must remain an asynchronous function for proper execution.
    Asynchronous execution is required for communication with Apify platform, and it also enhances performance in
    the field of web scraping significantly.
    """
    # Enter the context of the Actor.
    async with Actor:
        # Retrieve the Actor input, and use default values if not provided.
        actor_input = await Actor.get_input() or {}
        start_urls = actor_input.get('start_urls')
        credentials = actor_input.get('credentials')

        if not start_urls:
            Actor.log.error('No start_urls specified in actor input, exiting...')
            await Actor.exit()
        login_url = start_urls[0]

        if not login_url:
            Actor.log.error('No login url specified in actor input, exiting...')
            await Actor.exit()

        if not credentials:
            Actor.log.error('No credentials specified in actor input, exiting...')
            await Actor.exit()

        user_info = credentials.get("user", {})
        password_info = credentials.get("password", {})
        submit_info = credentials.get("submit", {})
        extra_info = credentials.get("extra", {})

        user = user_info.get("value")
        password = password_info.get("value")

        if not (user and password):
            Actor.log.error('User/Password info is not complete, exiting...')
            await Actor.exit()

        # XPath locators for the username field, password field, and submit button
        xpath_user = user_info.get("xpath")
        xpath_password = password_info.get("xpath")
        xpath_submit = submit_info.get("xpath")

        if not (xpath_user and xpath_password and xpath_submit):
            Actor.log.error('XPath info for user/password/submit is not complete, exiting...')
            await Actor.exit()

        Actor.log.debug("All info ok")

        # Optionals: wait time (in seconds) and OK page marker
        wait_time = extra_info.get("wait_time", 5)
        if isinstance(wait_time, (str, float)):
            # Go through float() so that strings such as "5.5" are accepted too.
            wait_time = int(float(wait_time))

        ok_page_xpath = extra_info.get("ok_page_xpath")

        # We got all we need, so we can start

        # Enqueue the login URL in a named request queue
        queue_name = "my-login-queue"
        request_queue = await Actor.open_request_queue(name=queue_name)

        # Delete everything in the queue to start clean
        await request_queue.drop()  # WARNING: deletes the queue
        request_queue = await Actor.open_request_queue(name=queue_name)

        # Every request uses the same unique key, so the queue deduplicates them
        # and only the first start URL is actually processed.
        for start_url in start_urls:
            url = start_url.get('url')
            Actor.log.info(f'Enqueuing {url} ...')
            new_request = Request.from_url(url, user_data={'depth': 0}, unique_key="login")
            await request_queue.add_request(new_request)

        # Launch a new Selenium Chrome WebDriver
        Actor.log.info('Launching Chrome Headless WebDriver...')
        chrome_options = Options()
        chrome_options.add_argument('--headless')
        chrome_options.add_argument('--no-sandbox')
        chrome_options.add_argument('--disable-dev-shm-usage')
        driver = webdriver.Chrome(options=chrome_options)

        # Process the requests in the queue one by one
        while request := await request_queue.fetch_next_request():
            url = request.url
            Actor.log.info(f'Login check: {url} ...')

            try:
                # Open the URL in the Selenium WebDriver
                driver.get(url)
                Actor.log.info(f'Waiting up to {wait_time} seconds for the login form ...')
                # Wait for the login form to appear
                wait = WebDriverWait(driver, wait_time)

                # Find elements
                username_input = wait.until(EC.presence_of_element_located((By.XPATH, xpath_user)))
                password_input = wait.until(EC.presence_of_element_located((By.XPATH, xpath_password)))
                submit_button = wait.until(EC.element_to_be_clickable((By.XPATH, xpath_submit)))

                # Populate
                username_input.send_keys(user)
                password_input.send_keys(password)

                # Click and wait
                submit_button.click()
                wait = WebDriverWait(driver, wait_time)
                if ok_page_xpath:
                    # Wait until the marker element of the post-login page is present
                    wait.until(EC.presence_of_element_located((By.XPATH, ok_page_xpath)))
                else:
                    # No marker given: pause without blocking the event loop
                    await asyncio.sleep(wait_time)
                    Actor.log.info("Wake up!")

                # Store the cookies of the logged-in session in the default dataset
                await Actor.push_data({'url': url, 'cookies': driver.get_cookies()})
            except Exception:
                Actor.log.exception(f'Cannot login: URL is {url}.')
            finally:
                await request_queue.mark_request_as_handled(request)

        driver.quit()
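The Actor pushes the list returned by `driver.get_cookies()`, where each cookie is a dict with keys such as `name`, `value`, and `domain`. As a rough sketch of how a consumer might reuse that output, the helpers below (`cookies_for_host` and `cookies_to_header`, both hypothetical names that are not part of the Actor) pick the cookies that apply to a host and join them into a `Cookie` header value. The sample cookies are made up for illustration, and the domain matching is deliberately simplified: it ignores path, expiry, and the `secure` flag.

```python
def cookies_for_host(cookies: list[dict], host: str) -> list[dict]:
    """Keep cookies whose domain equals the host or is a parent domain of it."""
    selected = []
    for cookie in cookies:
        # Selenium may report the domain with a leading dot, e.g. ".example.com".
        domain = cookie.get("domain", "").lstrip(".")
        if host == domain or host.endswith("." + domain):
            selected.append(cookie)
    return selected


def cookies_to_header(cookies: list[dict]) -> str:
    """Join cookie dicts into a single Cookie request header value."""
    return "; ".join(f"{cookie['name']}={cookie['value']}" for cookie in cookies)


# Made-up sample in the shape produced by driver.get_cookies().
sample_cookies = [
    {"name": "sessionid", "value": "abc123", "domain": ".example.com", "path": "/"},
    {"name": "theme", "value": "dark", "domain": "other.org", "path": "/"},
]

header = cookies_to_header(cookies_for_host(sample_cookies, "app.example.com"))
print(header)
```

The resulting string could be sent as the `Cookie` header of a plain HTTP request, which avoids starting a browser for every later call. Whether the target site accepts such replayed cookies is an assumption that has to be checked per site.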

src/py.typed

.dockerignore

.git
.mise.toml
.nvim.lua
storage

# The rest is copied from https://github.com/github/gitignore/blob/main/Python.gitignore

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
.python-version

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/

.gitignore

.mise.toml
.nvim.lua
storage

# The rest is copied from https://github.com/github/gitignore/blob/main/Python.gitignore

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
.python-version

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/

requirements.txt

# Feel free to add your Python dependencies below. For formatting guidelines, see:
# https://pip.pypa.io/en/latest/reference/requirements-file-format/

apify >= 1.7.0
selenium ~= 4.14.0
webdriver_manager
