Tiktok Video Text Extractor

Pricing

from $0.90 / 1,000 videos

Extract all visible on-screen text from public TikTok videos — overlays, captions, stickers, hashtags, and watermarks — with timestamps, type, position, and language. Use for content intelligence, brand monitoring, creator analytics, and turning short-form video text into searchable structured data.

Developer

rainminer

Maintained by Community

The TikTok Video Text Extractor is an Apify Actor that reads public TikTok videos and returns every piece of visible on-screen text as structured data. Creators and brands overlay text, captions, stickers, hashtags, and watermarks directly onto their videos — this Actor unlocks that content for search, analysis, and database ingestion without manual review.

Each video is processed by AI vision and the results are returned as a flat, structured dataset of text segments — each with its timestamp, type, screen position, language, and confidence rating.


Key Features

  • Full on-screen text extraction: Captures animated overlays, editor captions, TikTok stickers, watermarks, hashtags, and @mentions.
  • Timestamp-aware: Records the MM:SS timestamp when each text element first appears.
  • Type classification: Distinguishes overlay, caption, sticker, watermark, hashtag, mention, and other text types.
  • Multilingual: Detects the ISO 639-1 language code for each segment from the text itself.
  • Position detection: Classifies text placement as top, center, or bottom of screen.
  • Confidence rating: High/medium/low rating based on text clarity in the video frame.
  • No login required: Works with any public TikTok video.

Why Extract Text from TikTok Videos?

TikTok is a primary publishing surface for businesses, creators, and brands. Product drops, event announcements, hiring notices, and pricing updates are routinely shared only as video overlays — never structured or indexed. This Actor makes that content machine-readable for:

  • Content intelligence and brand monitoring: tracking what competitors publish on TikTok.
  • Food and hospitality: capturing daily specials and seasonal menus announced via video.
  • Event aggregators: extracting event names, dates, and lineup text from promotional videos.
  • Retail and e-commerce: indexing product drops, discount codes, and launch dates.
  • Market research: tracking pricing, offers, and messaging trends across accounts.
  • Accessibility tools: converting visual video text to readable formats.

Who Is It For?

  • Marketing and analytics teams monitoring brand or competitor TikTok content at scale.
  • Product and data teams building structured datasets from TikTok video content.
  • Developers integrating TikTok text extraction into discovery or monitoring pipelines.
  • Researchers studying visual communication trends in short-form video.

Input Schema

{
  "videoUrls": ["https://www.tiktok.com/@tiktok/video/7018846970735028738"],
  "maxItems": 10
}

videoUrls is required. All other fields are optional.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| videoUrls | Array of strings | (none, required) | Public TikTok video URLs (@user/video/..., vm.tiktok.com/...) |
| maxItems | Integer | 10 | Maximum number of videos to process in a single run |
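
A minimal sketch of starting a run from Python with the `apify-client` package. The Actor ID `rainminer/tiktok-video-text-extractor` and the `APIFY_TOKEN` environment variable are assumptions for illustration; copy the real Actor ID from the Store page. `build_run_input` and `run_extractor` are hypothetical helper names, and the input they build contains only the two documented fields.

```python
import os


def build_run_input(video_urls, max_items=10):
    """Assemble the Actor input from the two documented fields."""
    # Drop empty entries and surrounding whitespace before sending.
    urls = [u.strip() for u in video_urls if u and u.strip()]
    if not urls:
        raise ValueError("videoUrls is required and must not be empty")
    if max_items < 1:
        raise ValueError("maxItems must be a positive integer")
    return {"videoUrls": urls, "maxItems": int(max_items)}


def run_extractor(video_urls, max_items=10,
                  actor_id="rainminer/tiktok-video-text-extractor"):  # assumed ID
    """Start a run, wait for it to finish, and return its dataset items.

    Requires `pip install apify-client`.
    """
    # Imported lazily so build_run_input stays usable without the package.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])  # assumed env var with your API token
    run = client.actor(actor_id).call(
        run_input=build_run_input(video_urls, max_items)
    )
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Calling `run_extractor(["https://www.tiktok.com/@tiktok/video/7018846970735028738"])` should return a list of items shaped like the output schema, one per processed video.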

Output Schema

Each dataset item represents one video and all the on-screen text found in it:

{
  "videoUrl": "https://www.tiktok.com/@tiktok/video/7018846970735028738",
  "videoId": "7018846970735028738",
  "duration": 42.5,
  "textSegments": [
    {
      "text": "Summer sale — 50% off today only",
      "timestamp": "00:02",
      "type": "overlay",
      "position": "center",
      "language": "en",
      "confidence": "high"
    }
  ],
  "scrapedAt": "2026-06-01T08:15:42.000Z"
}

| Field | Description |
| --- | --- |
| videoUrl | Normalized canonical URL of the video |
| videoId | TikTok numeric video ID or short-link identifier |
| duration | Video length in seconds (from the downloaded file) |
| textSegments | Array of all on-screen text elements found |
| textSegments[].text | The visible text content as it appears on screen |
| textSegments[].timestamp | MM:SS when the text first appears; null if indeterminate |
| textSegments[].type | One of overlay, caption, sticker, watermark, hashtag, mention, other |
| textSegments[].position | top, center, or bottom (vertical screen position); null if it moves |
| textSegments[].language | ISO 639-1 language code detected from the text, e.g. "en", "es"; null if indeterminate |
| textSegments[].confidence | high, medium, or low; extraction confidence based on text clarity |
| scrapedAt | ISO timestamp of when this video was processed |
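
Because each dataset item nests its segments, downstream search or database ingestion often wants one row per segment. The sketch below flattens items shaped like the example above, converts the MM:SS timestamp to seconds, and filters by confidence. The function names and the `seconds` output key are illustrative, not part of the Actor, and treating an unrecognized confidence value as the lowest rank is an assumption.

```python
from typing import Any, Dict, Iterable, Iterator, Optional

# Ordering used to compare the three documented confidence ratings.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def timestamp_to_seconds(ts: Optional[str]) -> Optional[int]:
    """Convert an "MM:SS" timestamp to seconds; pass None through unchanged."""
    if ts is None:
        return None
    minutes, seconds = ts.split(":")
    return int(minutes) * 60 + int(seconds)


def flatten_segments(items: Iterable[Dict[str, Any]],
                     min_confidence: str = "low") -> Iterator[Dict[str, Any]]:
    """Yield one flat row per text segment at or above min_confidence."""
    threshold = CONFIDENCE_RANK[min_confidence]
    for item in items:
        for seg in item.get("textSegments", []):
            # Assumption: a missing or unknown rating counts as the lowest rank.
            if CONFIDENCE_RANK.get(seg.get("confidence"), 0) < threshold:
                continue
            yield {
                "videoId": item.get("videoId"),
                "text": seg.get("text"),
                "seconds": timestamp_to_seconds(seg.get("timestamp")),
                "type": seg.get("type"),
                "position": seg.get("position"),
                "language": seg.get("language"),
                "confidence": seg.get("confidence"),
            }
```

Passing `min_confidence="medium"` drops segments rated low, which can help when the source videos contain small or fast-moving text.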

How It Works

  1. Validate inputs — each URL is checked against accepted TikTok video URL patterns and normalized to a canonical form.
  2. Download (retriable) — each video is fetched via a dedicated crawler step using Apify Proxy. Failed downloads are retried automatically with a different proxy (up to 5 attempts).
  3. Process — the downloaded file is analyzed by AI vision to extract all visible on-screen text in a single pass.
  4. Structured output — each text segment is classified by type, position, language, and confidence.
  5. Push to dataset — one dataset row is pushed per video containing all its text segments.
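
Steps 1 and 2 can be sketched as follows. The Actor's actual URL patterns, canonical form, and download code are not published, so the regular expressions, the canonical URL shapes, and the `fetch` callable here are all assumptions chosen to match the URL forms listed in the input schema and the "up to 5 attempts" retry limit.

```python
import re
from typing import Callable, Optional, Sequence

# Assumed patterns for the two URL forms named in the input schema.
LONG_URL = re.compile(
    r"^https?://(?:www\.)?tiktok\.com/@(?P<user>[\w.]+)/video/(?P<id>\d+)/?(?:\?.*)?$"
)
SHORT_URL = re.compile(
    r"^https?://vm\.tiktok\.com/(?P<code>[A-Za-z0-9]+)/?(?:\?.*)?$"
)


def normalize_video_url(url: str) -> Optional[str]:
    """Return an assumed canonical form of the URL, or None if it is not accepted."""
    url = url.strip()
    match = LONG_URL.match(url)
    if match:
        return f"https://www.tiktok.com/@{match['user']}/video/{match['id']}"
    match = SHORT_URL.match(url)
    if match:
        return f"https://vm.tiktok.com/{match['code']}/"
    return None


def fetch_with_retries(fetch: Callable[[str, str], bytes], url: str,
                       proxies: Sequence[str], max_attempts: int = 5) -> bytes:
    """Call fetch(url, proxy), moving to the next proxy after each failure."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        proxy = proxies[attempt % len(proxies)]  # rotate through the proxy list
        try:
            return fetch(url, proxy)
        except Exception as exc:  # a real crawler would catch narrower errors
            last_error = exc
    raise RuntimeError(f"gave up on {url} after {max_attempts} attempts") from last_error
```

Returning `None` for rejected URLs lets the caller log and skip bad inputs instead of aborting the whole run.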

Notes and Limitations

  • Public videos only: Private accounts and videos that require login to view are not supported.
  • Video availability: Deleted or expired videos will fail to download and are skipped with a warning.
  • OCR accuracy: Fast-moving, small-font, or low-contrast text may yield lower-confidence extractions.
  • Rate limiting: TikTok may rate-limit requests at high volume. Reduce concurrency or add delays between runs if you encounter failures.
  • Video size: Very large videos are skipped automatically.
  • Audio not included: Spoken content is intentionally excluded — only text visually rendered on screen is extracted.
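
One way to follow the rate-limiting advice is to split a long URL list into small runs and pause between them. The sketch below is client-side only. The batch size of 10 mirrors the default maxItems, the 30-second delay is an arbitrary placeholder rather than a recommended value, and `run_batch` is a hypothetical callable that starts one Actor run and returns its items.

```python
import time
from typing import Any, Callable, Iterator, List, Sequence


def batches(urls: Sequence[str], size: int = 10) -> Iterator[List[str]]:
    """Split urls into consecutive chunks of at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


def run_in_batches(urls: Sequence[str],
                   run_batch: Callable[[List[str]], List[Any]],
                   size: int = 10,
                   delay_seconds: float = 30.0,  # placeholder value, tune as needed
                   sleep: Callable[[float], None] = time.sleep) -> List[Any]:
    """Process urls one batch at a time, pausing between consecutive batches."""
    results: List[Any] = []
    chunks = list(batches(urls, size))
    for index, chunk in enumerate(chunks):
        results.extend(run_batch(chunk))
        if index < len(chunks) - 1:
            sleep(delay_seconds)  # pause between runs, not after the last one
    return results
```

The `sleep` parameter is injectable so the pacing logic can be checked without actually waiting.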