TikTok Keyword Scraper
Under maintenance
Scrapes TikTok video search results by keyword using Playwright, with persistent browser profiles, CAPTCHA solving, and an optional Apify residential proxy that can be fully disabled to run direct (no proxy).
- Pricing: Pay per usage
- Rating: 5.0 (2)
- Developer: Yara Mohamed
- Maintained by: Community
- Bookmarked: 1
- Total users: 7
- Monthly active users: 5
- Last modified: a day ago
TikTok Keyword Scraper — Apify Actor
Scrapes TikTok video search results by keyword using Playwright, with persistent browser profiles, CAPTCHA solving, proxy rotation via a local relay, and an optional Apify proxy that can be fully disabled to run with no proxy at all.
This package runs two ways from the same codebase:
- As an Apify Actor (.actor/ + src/): the intended way to deploy on Apify's cloud.
- As a local FastAPI server (server.py): for local development, debugging, or running outside Apify.
Project Structure
    .
    ├── .actor/
    │   ├── actor.json                  # Actor metadata
    │   ├── INPUT_SCHEMA.json           # Input fields shown in Apify Console (incl. "Use proxy" toggle)
    │   └── Dockerfile                  # apify/actor-python-playwright:3.11 base image
    ├── src/
    │   ├── __init__.py
    │   ├── __main__.py                 # Actor entrypoint (`python -m src`)
    │   └── main.py                     # Reads input, runs keywords concurrently, pushes to dataset
    ├── browser.py                      # Browser/context factory, cookies, scroll/nav helpers
    ├── captcha.py                      # CAPTCHA detection + solving (SadCaptcha → SolveCaptcha)
    ├── scrapers.py                     # scrape_tiktok_search / hashtag / profile / download-url
    ├── data_helpers.py                 # Video cleaning, API parsing, shared scroll loop
    ├── config.py                       # Constants, proxy pool (now toggleable), selectors
    ├── profiles.py / profile_pool.py   # Persistent browser-profile pool + proxy assignment
    ├── proxy_relay.py                  # Local TCP relay (works around Chromium proxy-auth bugs)
    ├── downloader.py                   # Streaming video downloader
    ├── server.py                       # FastAPI server for local/non-Apify use
    ├── tiktok_cookies.json             # Bundled fallback cookie set (see "Cookie fallback" below)
    ├── requirements.txt
    └── .dockerignore
Running With No Proxy at All
The Actor input has a "Use proxy" checkbox (useProxy, default true).
- ON (default): every browser session routes through the Apify residential proxy pool (BUYPROXIES94952 group), same as before.
- OFF: every session connects directly, with no proxy. This is useful for local testing, debugging, or if your Apify plan has no proxy quota left.
How it works under the hood: src/main.py sets the environment variable
USE_PROXY=true|false from that checkbox before any scrape starts.
config.get_proxy_pool() checks USE_PROXY on every call (it's not cached at
import time), and returns an empty list when proxying is off. With an empty
pool, profiles.get_proxy_for_profile() returns None, and
browser.make_browser_and_context() already had a "no proxy configured"
direct-connection branch — so turning the toggle off requires no other code
changes anywhere in the scraper.
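The toggle described above can be sketched roughly as follows. This is a minimal illustration, not the Actor's actual code: the proxy URLs, the `_env_flag` helper, and the round-robin assignment in `get_proxy_for_profile` are assumptions made for the example. Only the behaviour taken from the description is intended to match: `USE_PROXY` is re-read on every call, an empty pool is returned when proxying is off, and an empty pool yields `None`.

```python
import os

# Hypothetical stand-in for the pool defined in config.py.
_CONFIGURED_PROXIES = [
    "http://proxy-a.example:8000",
    "http://proxy-b.example:8000",
]


def _env_flag(name, default=True):
    """Parse a boolean environment variable (helper assumed for this sketch)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def get_proxy_pool():
    """Re-read USE_PROXY on every call, so a runtime toggle takes effect."""
    if not _env_flag("USE_PROXY", default=True):
        return []
    return list(_CONFIGURED_PROXIES)


def get_proxy_for_profile(profile_index):
    """Return a proxy for the profile, or None when the pool is empty."""
    pool = get_proxy_pool()
    if not pool:
        return None  # the caller then takes its direct-connection branch
    # Round-robin assignment is an assumption, not the documented behaviour.
    return pool[profile_index % len(pool)]
```

Because nothing is cached at import time, setting the variable after the module is loaded still changes what the next call returns.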
You can also flip this manually outside the Actor input by setting the
USE_PROXY env var directly (e.g. for local CLI/server runs):
    $ USE_PROXY=false python server.py
There's also an optional proxyPassword input field (marked secret) if you
want to override the Apify proxy password baked into config.py with your
own, without editing code.
Cookie Fallback (Apify /tmp wipe fix)
Apify's containers are ephemeral — /tmp (and therefore every per-profile
cookies.json under /tmp/tiktok_profiles/) is wiped between separate Actor
runs. A brand-new container's first navigation then has zero session cookies,
which is what causes TikTok to silently route searches to the Users tab
instead of the Videos tab.
browser.load_cookies() now falls back to the bundled tiktok_cookies.json
at the project root whenever a profile has no (or expired) per-profile cookie
file yet, giving every fresh container at least one valid baseline session
instead of a completely cold one. This file is copied into the Docker image
by COPY . ./ in the Dockerfile, so it ships with every build.
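A fallback of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the code in browser.py: the function signature, the `_read_cookies` and `_is_live` helpers, and the rule that a profile file counts as expired when it holds no unexpired cookies are all hypothetical. The cookie shape follows Playwright's storage format, where `expires` is a Unix timestamp and `-1` marks a session cookie.

```python
import json
import time
from pathlib import Path


def _read_cookies(path):
    """Return the cookie list stored at path, or [] if missing or unreadable."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return []
    return data if isinstance(data, list) else []


def _is_live(cookie, now):
    """Session cookies (expires == -1) never expire; others must be in the future."""
    expires = cookie.get("expires", -1)
    return expires == -1 or expires > now


def load_cookies(profile_cookie_path, bundled_cookie_path, now=None):
    """Prefer the per-profile cookies; fall back to the bundled baseline set."""
    now = time.time() if now is None else now
    live = [c for c in _read_cookies(profile_cookie_path) if _is_live(c, now)]
    if live:
        return live
    # No usable per-profile cookies (fresh container or all expired).
    return [c for c in _read_cookies(bundled_cookie_path) if _is_live(c, now)]
```

The returned list could then be passed to Playwright's `context.add_cookies(...)` before the first navigation.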
Deploying to Apify
    npm install -g apify-cli
    apify login
    cd tiktok-apify-actor/
    apify push
apify push uploads the project to Apify, where the Docker image is built
from .actor/Dockerfile. COPY . ./ in the Dockerfile copies the whole repo
root into the image, including src/, the scraper modules, and the bundled
cookie file.
Actor Input Example
    {
      "keywords": ["funny cats", "cooking recipe"],
      "maxResults": 50,
      "maxConcurrency": 3,
      "dateFilter": 0,
      "scrollPause": 3.0,
      "headless": true,
      "useProxy": true
    }
Run with no proxy at all:
    {
      "keywords": ["funny cats"],
      "useProxy": false
    }
Running via REST API
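    import requests

    APIFY_TOKEN = "your_token_here"
    # In API URLs the username and Actor name are joined with "~", not "/".
    ACTOR_ID = "your_username~tiktok-keyword-scraper"

    # Start a run with the given input.
    response = requests.post(
        f"https://api.apify.com/v2/acts/{ACTOR_ID}/runs",
        headers={"Content-Type": "application/json"},
        params={"token": APIFY_TOKEN},
        json={"keywords": ["python tutorial"], "maxResults": 30, "useProxy": True},
    )
    response.raise_for_status()  # fail early on auth or input errors
    run_id = response.json()["data"]["id"]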
Results land in the run's default dataset (one row per video, plus a
search_keyword field), and a run-level summary is written to the
key-value store under OUTPUT.
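Once a run has finished, its rows and summary can be read back through the Apify API. The sketch below uses only the standard library (rather than requests) and is a rough outline, not part of the Actor. The helper names are hypothetical, it assumes the OUTPUT record is stored as JSON, and it does not poll, so it should be called after the run has ended. The endpoints are the Apify API v2 routes for runs, dataset items, and key-value store records.

```python
import json
import urllib.parse
import urllib.request
from collections import defaultdict

API_BASE = "https://api.apify.com/v2"


def run_url(run_id, token):
    return f"{API_BASE}/actor-runs/{run_id}?" + urllib.parse.urlencode({"token": token})


def dataset_items_url(dataset_id, token):
    query = urllib.parse.urlencode({"token": token, "format": "json"})
    return f"{API_BASE}/datasets/{dataset_id}/items?{query}"


def output_record_url(store_id, token):
    query = urllib.parse.urlencode({"token": token})
    return f"{API_BASE}/key-value-stores/{store_id}/records/OUTPUT?{query}"


def get_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def group_by_keyword(rows):
    """Group dataset rows by their search_keyword field."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row.get("search_keyword", "")].append(row)
    return dict(grouped)


def fetch_results(run_id, token):
    """Return (rows grouped by keyword, run summary) for a finished run."""
    run = get_json(run_url(run_id, token))["data"]
    if run["status"] != "SUCCEEDED":
        raise RuntimeError(f"run {run_id} has status {run['status']}")
    rows = get_json(dataset_items_url(run["defaultDatasetId"], token))
    summary = get_json(output_record_url(run["defaultKeyValueStoreId"], token))
    return group_by_keyword(rows), summary
```

Grouping by `search_keyword` is convenient when one run covers several keywords, since all rows share a single dataset.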
Running Locally (FastAPI server, unchanged)
    pip install -r requirements.txt
    playwright install chromium
    python server.py
    # → http://localhost:8000/docs
| Method | Path | Description |
|---|---|---|
| POST | /search | Search by keyword |
| POST | /batch-search | Up to 100 keywords at once |
| GET | /job/{job_id} | Poll job status / results |
| GET | /proxy-test | Test every proxy in the pool |
| POST | /download | Download a video on demand |
| GET | /health | Health check |
Notes
- maxConcurrency is capped at PROFILE_POOL_SIZE (10): each parallel job needs its own browser-profile slot.
- dateFilter matches TikTok's "Posted" search filter: 0=all, 1=24h, 7=week, 30=month, 90=3mo, 180=6mo.
- This tool is for educational/research purposes only.
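The two input constraints in the notes above could be enforced with a small normalisation step like this one. It is a sketch only: `normalize_input`, the default concurrency of 1, and the choice to reject unknown dateFilter values (rather than fall back to 0) are assumptions, and the Actor may handle these cases differently.

```python
PROFILE_POOL_SIZE = 10  # value stated in the notes above

# dateFilter values and the TikTok "Posted" ranges they correspond to.
DATE_FILTER_LABELS = {0: "all", 1: "24h", 7: "week", 30: "month", 90: "3mo", 180: "6mo"}


def normalize_input(raw):
    """Validate Actor input and clamp maxConcurrency to the profile pool size."""
    keywords = [k.strip() for k in raw.get("keywords", []) if k and k.strip()]
    if not keywords:
        raise ValueError("at least one non-empty keyword is required")

    date_filter = raw.get("dateFilter", 0)
    if date_filter not in DATE_FILTER_LABELS:
        allowed = sorted(DATE_FILTER_LABELS)
        raise ValueError(f"dateFilter must be one of {allowed}, got {date_filter!r}")

    # Default of 1 is an assumption for this sketch.
    requested = int(raw.get("maxConcurrency", 1))
    max_concurrency = max(1, min(requested, PROFILE_POOL_SIZE))

    return {
        "keywords": keywords,
        "dateFilter": date_filter,
        "dateFilterLabel": DATE_FILTER_LABELS[date_filter],
        "maxConcurrency": max_concurrency,
    }
```

For example, an input requesting `maxConcurrency: 25` would be clamped to 10, since only ten browser-profile slots exist.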