TikTok Profile, Video URL & Keyword Scraper

Pricing

from $1.49 / 1,000 results

Scrape public TikTok profiles, video URLs, hashtags, and keyword results into clean JSON. Get captions, transcripts when available, views, likes, comment counts (not comment text), shares, saves, music, media links, and account context. Built for trend tracking, influencer research, and analytics.

Rating

0.0

(0)

Developer

Inus Grobler

Maintained by Community

Actor stats

  • Bookmarked: 1
  • Total users: 32
  • Monthly active users: 5
  • Last modified: a day ago

TikTok Scraper for Profiles, Hashtags, Keywords, Videos & Transcripts

Scrape public TikTok profiles, hashtags, keyword search results, and individual video URLs into clean structured datasets. Use this TikTok profile scraper, TikTok hashtag scraper, TikTok keyword scraper, TikTok video scraper, and TikTok transcript scraper to collect video metadata, engagement counts, captions, transcripts when available, media links, hashtags, music data, and account context without building or maintaining your own TikTok data pipeline.

This Actor is built for teams that need TikTok data for trend tracking, influencer research, brand monitoring, social listening, content analysis, market research, AI enrichment, and analytics workflows. Results are written to an Apify dataset, so you can export TikTok data to JSON, CSV, Excel, XML, RSS, or connect it to your API, warehouse, dashboard, spreadsheet, CRM, or automation stack.

What This TikTok Scraper Does

  • Scrapes public videos from TikTok usernames, @handles, and profile URLs
  • Enriches specific TikTok video URLs with available public metadata
  • Discovers videos from TikTok keyword and hashtag searches
  • Extracts captions, hashtags, music title, duration, timestamps, and author context
  • Collects engagement metrics including views, likes, comment counts, shares, and saves when available
  • Extracts subtitle transcript text when TikTok exposes transcript or caption metadata
  • Falls back to cleaned caption text when a transcript track is not available
  • Streams results to the dataset during the run to reduce memory usage on larger jobs
  • Adds quality flags so downstream systems can separate complete rows from partial rows

This Actor does not scrape TikTok comment text, comment authors, replies, or comment threads. It only returns comment_count as a video-level metric.

Best Use Cases

  • Monitor TikTok accounts for new public videos
  • Track hashtags, trends, products, brands, and creator niches
  • Build influencer discovery and creator research datasets
  • Enrich known TikTok video URLs with metrics, captions, media links, and metadata
  • Feed TikTok video data into BI dashboards, spreadsheets, CRMs, or data warehouses
  • Collect TikTok captions and available transcripts for content analysis or AI enrichment
  • Schedule repeat runs for lightweight TikTok monitoring

Who Uses This Actor

  • Social listening teams tracking TikTok conversations, hashtags, and campaigns
  • Brand and reputation monitoring teams watching products, events, competitors, and creators
  • Influencer marketing teams researching creator activity, engagement, and content themes
  • Market researchers collecting public TikTok videos for trend and audience analysis
  • AI and data teams extracting captions and available transcripts for classification, search, and summarization
  • Journalists, researchers, and analysts monitoring public narratives and emerging topics

| Goal | Input to use | Typical output |
| --- | --- | --- |
| Track TikTok hashtags | keywords: ["#skincare", "#aitools"] | Videos, captions, transcripts, metrics, authors, hashtags |
| Monitor creators | users: ["tiktok", "khaby.lame"] | Public videos, account context, engagement counts, captions |
| Enrich known videos | urls: ["https://www.tiktok.com/@.../video/..."] | One row per video with metadata, caption, transcript, media links |
| Build influencer lists | keywords first, then scrape discovered users | Creators, videos, engagement signals, content themes |
| Low-cost discovery | keywordYtdlpEnrichmentLimit: 0 | Faster keyword results with lower enrichment cost |
| Full research export | outputMode: "full" | Rich JSON rows for analysis, AI workflows, and databases |

What You Get

| Data category | Example fields |
| --- | --- |
| Video identity | video_id, video_url, source_input_url, search_keyword |
| Text content | caption_text, transcript, has_transcript, has_caption_text_fallback |
| Engagement metrics | play_count, like_count, comment_count, share_count, save_count |
| Creator data | author_username, author_id, account |
| Discovery context | hashtags, music_title, create_time, create_time_epoch |
| Media data | media_links, direct playback URL when available, thumbnail links when available |
| Quality checks | metadata_quality, missing_fields, transcript_detail |

Use the output for TikTok trend monitoring, TikTok influencer research, TikTok hashtag tracking, TikTok social listening, competitor research, campaign reporting, content classification, and AI analysis pipelines.

Simple Input

Provide any combination of users, urls, and keywords. The Actor applies sensible defaults: enriched video output, transcript discovery, caption fallback, engagement counts, profile context, media links, dataset streaming, and run diagnostics.

Scrape TikTok Users

{
  "users": ["tiktok", "khaby.lame"],
  "maxItems": 20
}

Enrich TikTok Video URLs

{
  "urls": [
    "https://www.tiktok.com/@tiktok/video/1234567890123456789"
  ],
  "maxItems": 10
}

Search TikTok Keywords Or Hashtags

{
  "keywords": ["ai tools", "fitness tips", "#skincare"],
  "maxItems": 20
}

Combine Inputs In One Run

{
  "users": ["tiktok"],
  "urls": ["https://www.tiktok.com/@tiktok/video/1234567890123456789"],
  "keywords": ["creator economy"],
  "maxItems": 10
}

Input Reference

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| users | array[string] | - | TikTok usernames, @handles, or profile URLs. |
| urls | array[string] | - | TikTok video URLs to enrich directly. |
| keywords | array[string] | - | TikTok keyword or hashtag searches. |
| maxItems | integer | 20 | Maximum videos per user or keyword, and maximum video URLs to process. |
| keywordYtdlpEnrichmentLimit | integer | -1 | Maximum keyword result rows to enrich with yt-dlp after browser discovery. Use 0 to return keyword browser results faster, or -1 to enrich every eligible row. |
| keywordYtdlpEnrichmentTimeout | integer | 0 | Per-video timeout in seconds for keyword yt-dlp enrichment. Use 0 to reuse the normal yt-dlp timeout setting. |
| skipProfileScrape | boolean | false | Skip browser profile-page scraping for user runs to reduce runtime when profile fields are not required. |
| ytdlpTranscriptSubtitleDownloadLimit | integer | dynamic | Maximum videos per user/keyword to retry by downloading subtitle files. Lower values reduce runtime when transcript yield is low. |
| ytdlpDetailEnrichLimit | integer | dynamic | Maximum videos per user to enrich with full yt-dlp detail metadata. |
| ytdlpTranscriptDetailEnrichLimit | integer | dynamic | Maximum videos per user to retry with full detail payloads for transcript discovery. |

Provide at least one of users, urls, or keywords.
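
The "at least one source" rule can be checked on the client side before a run is started. The sketch below is only an illustration: build_run_input is a hypothetical helper, not something the Actor ships, and it simply assembles the fields from the Input Reference into a dict.

```python
from typing import Any, Optional, Sequence


def build_run_input(
    users: Optional[Sequence[str]] = None,
    urls: Optional[Sequence[str]] = None,
    keywords: Optional[Sequence[str]] = None,
    max_items: int = 20,
    **extra: Any,
) -> dict:
    """Assemble an Actor input dict (hypothetical helper, not part of the Actor)."""
    sources = {"users": users, "urls": urls, "keywords": keywords}
    # Keep only the sources that actually contain entries.
    run_input: dict = {name: list(values) for name, values in sources.items() if values}
    if not run_input:
        raise ValueError("Provide at least one of users, urls, or keywords.")
    run_input["maxItems"] = max_items
    # Optional tuning fields from the Input Reference pass through unchanged.
    run_input.update(extra)
    return run_input


low_cost = build_run_input(
    keywords=["#skincare"],
    max_items=5,
    keywordYtdlpEnrichmentLimit=0,
)
print(low_cost)
```

The resulting dict can be passed as run_input to the API client or pasted into the Actor's JSON input editor.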

Dataset Output

Each dataset item is one TikTok video record. The output is designed for analysis, search indexing, enrichment, and automated workflows.

Common fields include:

  • video_id
  • video_url
  • caption_text
  • transcript
  • has_transcript
  • has_caption_text_fallback
  • create_time
  • create_time_epoch
  • author_username
  • author_id
  • hashtags
  • play_count
  • like_count
  • comment_count
  • share_count
  • save_count
  • duration_s
  • music_title
  • media_links
  • account
  • search_keyword
  • metadata_quality
  • missing_fields

Example dataset item:

{
  "video_id": "1234567890123456789",
  "video_url": "https://www.tiktok.com/@creator/video/1234567890123456789",
  "caption_text": "Example TikTok caption #trend",
  "transcript": "Transcript text when TikTok exposes captions or subtitles.",
  "has_transcript": true,
  "has_caption_text_fallback": false,
  "author_username": "creator",
  "hashtags": ["trend"],
  "play_count": 125000,
  "like_count": 8400,
  "comment_count": 320,
  "share_count": 180,
  "save_count": 95,
  "duration_s": 24,
  "music_title": "Original sound",
  "metadata_quality": "full",
  "missing_fields": []
}
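
As one example of downstream analysis, the engagement counts in a row can be folded into a single ratio. The formula below (likes, comments, shares, and saves divided by views) is a common convention and an assumption of this sketch, not a metric the Actor computes. The numbers are taken from the example item above.

```python
from typing import Optional

# Count fields documented in the dataset output.
ENGAGEMENT_FIELDS = ("like_count", "comment_count", "share_count", "save_count")


def engagement_rate(item: dict) -> Optional[float]:
    """Interactions per view, or None when the view count is missing or zero."""
    plays = item.get("play_count")
    if not plays:
        return None
    # Treat missing or null counts as zero so partial rows still produce a value.
    interactions = sum(item.get(field) or 0 for field in ENGAGEMENT_FIELDS)
    return interactions / plays


example = {
    "play_count": 125000,
    "like_count": 8400,
    "comment_count": 320,
    "share_count": 180,
    "save_count": 95,
}
print(f"{engagement_rate(example):.4f}")  # 8995 / 125000 -> 0.0720
```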

Transcript Handling

TikTok does not expose transcripts for every video. When transcript or subtitle metadata is available, the Actor returns it in transcript and marks has_transcript as true.

When no transcript track is available, the Actor uses cleaned caption text as a fallback and marks has_caption_text_fallback as true. Caption fallback is useful for search and classification, but it is not the same as a spoken-word transcript.
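
Downstream code may want to record which kind of text it is working with. The following sketch uses the two flags described above. It assumes the fallback text may arrive either in transcript or in caption_text, since the field that carries it is not stated here, and analysis_text is a hypothetical helper name.

```python
from typing import Optional, Tuple


def analysis_text(item: dict) -> Tuple[Optional[str], str]:
    """Return (text, source); source is 'transcript', 'caption_fallback', or 'none'."""
    if item.get("has_transcript") and item.get("transcript"):
        return item["transcript"], "transcript"
    if item.get("has_caption_text_fallback"):
        # Assumption: fallback text may be stored in `transcript`; otherwise use `caption_text`.
        text = item.get("transcript") or item.get("caption_text")
        if text:
            return text, "caption_fallback"
    return None, "none"


rows = [
    {"has_transcript": True, "transcript": "Spoken words from the video."},
    {"has_transcript": False, "has_caption_text_fallback": True,
     "caption_text": "Morning routine #skincare"},
    {"has_transcript": False, "has_caption_text_fallback": False},
]
for row in rows:
    print(analysis_text(row))
```

Keeping the source label next to the text makes it easy to exclude caption fallbacks from analyses that need spoken-word content.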

Quality Signals

TikTok availability can vary by region, account, video type, and current platform behavior. To make the output easier to trust, each row includes quality fields:

  • metadata_quality: full, partial, or url_only
  • missing_fields: important fields that were not available for that row
  • transcript_detail: transcript status and source details when available

These fields help you filter results before sending TikTok data into analytics, AI workflows, or databases.
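
A minimal filtering sketch, assuming missing_fields holds the names of the absent output fields (the exact values are not specified here). filter_rows is a hypothetical helper, and the sample rows are invented for illustration.

```python
from typing import Iterable, List, Sequence


def filter_rows(
    items: Iterable[dict],
    accepted_quality: Sequence[str] = ("full",),
    required_fields: Sequence[str] = (),
) -> List[dict]:
    """Keep rows with an accepted metadata_quality and none of the required fields missing."""
    kept = []
    for item in items:
        if item.get("metadata_quality") not in accepted_quality:
            continue
        # Assumption: missing_fields lists the names of fields that were unavailable.
        missing = set(item.get("missing_fields") or [])
        if missing.intersection(required_fields):
            continue
        kept.append(item)
    return kept


sample_rows = [
    {"video_id": "1", "metadata_quality": "full", "missing_fields": []},
    {"video_id": "2", "metadata_quality": "partial", "missing_fields": ["transcript"]},
    {"video_id": "3", "metadata_quality": "partial", "missing_fields": ["play_count"]},
    {"video_id": "4", "metadata_quality": "url_only", "missing_fields": ["play_count"]},
]
usable = filter_rows(
    sample_rows,
    accepted_quality=("full", "partial"),
    required_fields=("play_count",),
)
print([row["video_id"] for row in usable])  # ['1', '2']
```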

Run Summary

Every run stores an OUTPUT_SUMMARY record in the default key-value store. Useful fields include:

  • status
  • videos_scraped
  • videos_with_transcript
  • videos_with_caption_text_fallback
  • metadata_quality_counts
  • dataset_items_pushed
  • dataset_push_batches
  • streaming_enabled
  • memory_peak_mb
  • health
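
The summary can be read back after a run and turned into a simple health check. In this sketch, fetch_output_summary expects an ApifyClient instance and the run dict returned by the client's call method. transcript_yield is a hypothetical helper, it assumes the two counters are integers, and the numbers in the demo are illustrative only.

```python
from typing import Any, Optional


def fetch_output_summary(client: Any, run: dict) -> Optional[dict]:
    """Read OUTPUT_SUMMARY from the run's default key-value store."""
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record("OUTPUT_SUMMARY")
    # get_record returns None when the key does not exist.
    return record["value"] if record else None


def transcript_yield(summary: dict) -> Optional[float]:
    """Share of scraped videos that came back with a transcript."""
    scraped = summary.get("videos_scraped") or 0
    if scraped == 0:
        return None
    return (summary.get("videos_with_transcript") or 0) / scraped


# Illustrative values only, not real run output.
demo_summary = {"status": "ok", "videos_scraped": 40, "videos_with_transcript": 10}
print(transcript_yield(demo_summary))  # 0.25
```

A low yield across several runs is a hint that the transcript retry limits can be reduced without losing much.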

Performance And Scaling

Rows are streamed to the dataset after each user, video URL, or keyword is processed. This keeps memory usage lower than holding the full run output until the end.

For best results:

  • Start with a small maxItems value when testing a new query
  • Use urls when you already know the exact TikTok videos to enrich
  • Use users for account monitoring
  • Use keywords for discovery, trends, hashtags, and topic research
  • Schedule repeat runs for ongoing monitoring instead of running very large one-off jobs
  • Put multiple users in one run when possible. The Actor processes and streams each user separately, so batching small accounts reduces fixed actor-start overhead without holding all results in memory.
  • For lower-cost keyword discovery, set keywordYtdlpEnrichmentLimit to a small number or 0, then scrape the discovered authors with users
  • For lower-cost user monitoring, set skipProfileScrape to true and reduce transcript/detail enrichment limits when full transcript discovery is not required
  • If pay-per-event pricing is enabled, keyword searches can charge the custom keyword-search-started event once per keyword before discovery begins.
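
The two-stage pattern from the tips above (cheap keyword discovery first, then a users run for the discovered authors) can be sketched as follows. followup_user_input is a hypothetical helper, and the sample rows stand in for items from a keyword run. It is an assumption that unenriched keyword rows include author_username, so rows without it are skipped.

```python
from typing import Iterable, List


def followup_user_input(
    keyword_items: Iterable[dict],
    max_items: int = 10,
    skip_profile_scrape: bool = True,
) -> dict:
    """Build a users-run input from keyword-run rows, de-duplicating authors in order."""
    seen = set()
    users: List[str] = []
    for item in keyword_items:
        username = item.get("author_username")
        # Skip rows where the author was not captured during discovery.
        if username and username not in seen:
            seen.add(username)
            users.append(username)
    return {
        "users": users,
        "maxItems": max_items,
        "skipProfileScrape": skip_profile_scrape,
    }


# Stage 1 input: keyword discovery with yt-dlp enrichment switched off.
discovery_input = {"keywords": ["#skincare"], "maxItems": 20, "keywordYtdlpEnrichmentLimit": 0}

# Stand-in rows for the stage 1 dataset.
discovered = [
    {"author_username": "creator_a", "video_id": "1"},
    {"author_username": "creator_b", "video_id": "2"},
    {"author_username": "creator_a", "video_id": "3"},
    {"video_id": "4"},
]
print(followup_user_input(discovered))
```

Check that the users list is not empty before starting the second run, since the Actor needs at least one input source.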

The Actor defaults to 2048 MB of memory and a 4-hour timeout, with up to 4096 MB available for unusually heavy runs. The 2048 MB default gives browser-based user and keyword workflows enough headroom, while streamed dataset output keeps long multi-input runs from accumulating all results in memory.

API Usage

Python

from apify_client import ApifyClient

# Authenticate with your Apify API token.
client = ApifyClient("YOUR_APIFY_TOKEN")

run_input = {
    "users": ["tiktok"],
    "keywords": ["ai tools"],
    "maxItems": 5,
}

# Start the Actor and wait for the run to finish.
actor = client.actor("thescrapelab/tiktok-scraper-2-0")
run = actor.call(run_input=run_input)

# Read the video records from the run's default dataset.
dataset = client.dataset(run["defaultDatasetId"])
for item in dataset.iterate_items():
    print(item["video_url"], item.get("play_count"), item.get("transcript"))

HTTP

curl -sS -X POST \
  "https://api.apify.com/v2/acts/thescrapelab~tiktok-scraper-2-0/run-sync-get-dataset-items?format=json&clean=true" \
  -H "Authorization: Bearer YOUR_APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "keywords": ["ai tools"],
    "maxItems": 5
  }'

Notes And Limitations

  • The Actor collects public TikTok data that is available during the run.
  • Some videos, accounts, or search results can be unavailable because of region, age restrictions, login challenges, or TikTok platform changes.
  • Transcript tracks are only returned when TikTok exposes caption or subtitle metadata.
  • Direct media URLs can be signed and short-lived. Store stable video_url values for long-term references.
  • This Actor is not affiliated with TikTok.