Deprecated

Pricing

Pay per usage

X scrapper python

Easy and cheap to use.

Rating

0.0

(0)

Developer

Mv 07

Last modified

a year ago

.actor/Dockerfile

# First, specify the base Docker image.
# You can see the Docker images from Apify at https://hub.docker.com/r/apify/.
# You can also use any other image from Docker Hub.
FROM apify/actor-python:3.13

# Second, copy just requirements.txt into the Actor image,
# since it should be the only file that affects the dependency install in the next step,
# in order to speed up the build
COPY requirements.txt ./

# Install the packages specified in requirements.txt,
# Print the installed Python version, pip version
# and all installed packages with their versions for debugging
RUN echo "Python version:" \
 && python --version \
 && echo "Pip version:" \
 && pip --version \
 && echo "Installing dependencies:" \
 && pip install -r requirements.txt \
 && echo "All installed Python packages:" \
 && pip freeze

# Next, copy the remaining files and directories with the source code.
# Since we do this after installing the dependencies, quick build will be really fast
# for most source file changes.
COPY . ./

# Use compileall to ensure the runnability of the Actor Python code.
RUN python3 -m compileall -q .

# Create and run as a non-root user.
RUN useradd --create-home apify && \
    chown -R apify:apify ./
USER apify

# Specify how to launch the source code of your Actor.
# By default, the "python3 -m src" command is run
CMD ["python3", "-m", "src"]

.actor/actor.json

{
    "actorSpecification": 1,
    "name": "my-actor",
    "title": "Scrape single page in Python",
    "description": "Scrape data from single page with provided URL.",
    "version": "0.0",
    "buildTag": "latest",
    "meta": {
        "templateId": "python-start"
    },
    "input": "./input_schema.json",
    "dockerfile": "./Dockerfile"
}

.actor/input_schema.json

{
    "title": "Scrape data from a web page",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "url": {
            "title": "URL of the page",
            "type": "string",
            "description": "The URL of website you want to get the data from.",
            "editor": "textfield",
            "prefill": "https://www.apify.com/"
        }
    },
    "required": ["url"]
}
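
Note that this schema declares only `url`, whereas `src/main.py` reads `keyword` and `maxTweets` from the Actor input. The following is a minimal sketch of how the Actor code could normalize whatever input it receives, assuming those two key names and the defaults used in `src/main.py`. The helper `normalize_input` and its validation rules are hypothetical and not part of the original Actor.

```python
from typing import Any, Optional

# Defaults mirror the values used in src/main.py.
DEFAULT_KEYWORD = "Timnas Indonesia"
DEFAULT_MAX_TWEETS = 50


def normalize_input(raw: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Return a dict with a non-empty keyword and a positive integer maxTweets.

    Hypothetical helper: the key names come from src/main.py, the validation
    rules are an assumption made for illustration.
    """
    raw = raw or {}

    # Fall back to the default when the keyword is missing or only whitespace.
    keyword = str(raw.get("keyword") or DEFAULT_KEYWORD).strip() or DEFAULT_KEYWORD

    # Accept ints or numeric strings; anything else falls back to the default.
    try:
        max_tweets = int(raw.get("maxTweets", DEFAULT_MAX_TWEETS))
    except (TypeError, ValueError):
        max_tweets = DEFAULT_MAX_TWEETS
    if max_tweets < 1:
        max_tweets = DEFAULT_MAX_TWEETS

    return {"keyword": keyword, "maxTweets": max_tweets}
```

Under these assumptions, `main()` could call `normalize_input(await Actor.get_input())` once and use the returned values instead of calling `.get()` with inline defaults.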

src/__init__.py

src/__main__.py

import asyncio

from .main import main

# Execute the Actor entry point.
asyncio.run(main())

src/main.py

import json
import subprocess

from apify import Actor


async def main():
    # Initialize the Actor.
    await Actor.init()

    # Read the user input.
    input_data = await Actor.get_input() or {}
    keyword = input_data.get("keyword", "Timnas Indonesia")
    max_tweets = input_data.get("maxTweets", 50)

    # Scrape tweets with the snscrape CLI. Passing the arguments as a list
    # (no shell) keeps quotes in the keyword from breaking the command.
    cmd = ["snscrape", "--jsonl", "--max-results", str(max_tweets), "twitter-search", keyword]
    result = subprocess.run(cmd, capture_output=True, text=True).stdout

    tweets = []
    for line in result.splitlines():
        tweet = json.loads(line)
        tweets.append({
            "text": tweet["content"],
            "user": tweet["user"]["username"],
            "timestamp": tweet["date"],
        })

    # Save the data to the Apify dataset.
    await Actor.push_data(tweets)

    # Done: shut the Actor down (counterpart of Actor.init()).
    await Actor.exit()
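
The parsing loop above assumes that every line of the snscrape output is valid JSON with `content`, `user.username`, and `date` fields. Below is a minimal sketch of a more defensive version that can be run without network access. The helper name `parse_tweets` and the choice to silently skip malformed lines are assumptions, and the sample input is made up for illustration.

```python
import json
from typing import Any


def parse_tweets(jsonl: str) -> list[dict[str, Any]]:
    """Convert JSON Lines text into the record shape pushed by the Actor."""
    tweets = []
    for line in jsonl.splitlines():
        line = line.strip()
        if not line:
            continue  # Ignore blank lines.
        try:
            tweet = json.loads(line)
            tweets.append({
                "text": tweet["content"],
                "user": tweet["user"]["username"],
                "timestamp": tweet["date"],
            })
        except (json.JSONDecodeError, KeyError, TypeError):
            continue  # Skip lines that are not in the expected shape.
    return tweets


# Made-up sample input: one valid record, one blank line, one broken line.
sample = (
    '{"content": "hello", "user": {"username": "alice"}, "date": "2024-01-01T00:00:00+00:00"}\n'
    "\n"
    "not json\n"
)
print(parse_tweets(sample))
```

Skipping bad lines keeps one malformed record from failing the whole run; logging or counting the skipped lines would be a reasonable alternative if silent data loss is a concern.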

src/py.typed

.dockerignore

.git
.mise.toml
.nvim.lua
storage

# The rest is copied from https://github.com/github/gitignore/blob/main/Python.gitignore

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
.python-version

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/

.gitignore

.mise.toml
.nvim.lua
storage

# The rest is copied from https://github.com/github/gitignore/blob/main/Python.gitignore

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
.python-version

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/

requirements.txt

# Feel free to add your Python dependencies below. For formatting guidelines, see:
# https://pip.pypa.io/en/latest/reference/requirements-file-format/

apify < 3.0
beautifulsoup4[lxml]
httpx
types-beautifulsoup4
snscrape
