# TikTok Transcript Scraper · Pay Per Found Caption (`aticode/tiktok-transcript-scraper`) Actor

Extract TikTok video transcripts from native captions in seconds. Fair pricing: only charged for videos that actually have captions. No AI cost, no login.

- **URL**: https://apify.com/aticode/tiktok-transcript-scraper.md
- **Developed by:** [Attila](https://apify.com/aticode) (community)
- **Categories:** AI, Social media, Videos
- **Stats:** 2 total users, 0 monthly users, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $1.00 / 1,000 delivered transcripts

This Actor is paid per event and per usage. You are charged the fixed price for specific events plus the cost of Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows (PowerShell)
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## TikTok Transcript Scraper — TikTok Captions to Text API

**Extract the transcript of any public TikTok video** in seconds. This TikTok transcript scraper pulls the spoken-word text straight from TikTok's **native captions** (subtitles) — no slow AI transcription, no audio download, no watermark removal tools, no login. You get clean JSON: plain-text transcript, timestamped segments, language, and video metadata.

Keywords: tiktok transcript, tiktok captions, tiktok subtitles, tiktok to text, tiktok transcript api, tiktok subtitle extractor.

### How to extract a transcript from a TikTok video

1. Paste one or more TikTok video URLs into `videoUrls` (standard links and `vm.`/`vt.` short links both work).
2. Run the Actor.
3. Get one result row per video — each with the full `transcript` text plus timestamped `segments`.

That's it. No API key from TikTok, no browser, no manual subtitle download.

### Fair pricing — pay only for transcripts you actually get

**You are only charged for results that contain a transcript.** Videos without captions, deleted videos or temporary errors still produce a clean result row — **for free**. Most competing TikTok transcript scrapers charge for every row, including empty ones.

| Event | Price | When |
|---|---|---|
| Transcript delivered | $0.001 | Only when `hasCaption: true` |
| Caption-less / deleted / error rows | free | Always documented, never charged |
| Actor start | $0.00005 | Platform-standard start fee, once per run |
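As a rough sanity check on the table above, the event charges for a run can be estimated in a few lines. This is only a sketch: `estimate_event_cost` is a hypothetical helper, the two prices are copied from the table, and Apify platform usage (billed separately, per the Pricing section) is not included.

```python
from decimal import Decimal

# Prices copied from the pricing table above (USD).
PRICE_PER_TRANSCRIPT = Decimal("0.001")
PRICE_PER_ACTOR_START = Decimal("0.00005")


def estimate_event_cost(rows, runs=1):
    """Estimate event charges for a list of result rows (hypothetical helper).

    Only rows with hasCaption == True count as a delivered transcript.
    Platform usage costs are not part of this estimate.
    """
    delivered = sum(1 for row in rows if row.get("hasCaption") is True)
    return delivered * PRICE_PER_TRANSCRIPT + runs * PRICE_PER_ACTOR_START


# Two captioned rows and one caption-less row, fetched in a single run.
rows = [{"hasCaption": True}, {"hasCaption": False}, {"hasCaption": True}]
print(estimate_event_cost(rows))
```

`Decimal` is used instead of floats so that very small per-event prices add up without rounding noise.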

### What the TikTok transcript output looks like

```json
{
    "url": "https://www.tiktok.com/@mrbeast/video/7629032018113809695",
    "videoId": "7629032018113809695",
    "author": "mrbeast",
    "description": "...",
    "durationSec": 34,
    "playCount": 12400000,
    "likeCount": 980000,
    "commentCount": 4100,
    "createTime": "2026-05-28T17:02:11.000Z",
    "hasCaption": true,
    "language": "eng-US",
    "captionSource": "ASR",
    "transcript": "Drop the wrecking ball! ... that didn't work. Let's try wood. Drop it!",
    "segments": [
        { "start": 0, "end": 2.1, "text": "Drop the wrecking ball!" }
    ],
    "availableLanguages": ["eng-US", "vie-VN"]
}
````

- **`transcript`** — clean plain text, ASR duplicates removed
- **`segments`** — timestamped lines for subtitle / SRT / caption workflows, included at no extra cost
- **`captionSource`** — `ASR` (original spoken language) or `MT` (machine-translated)
- **Video metadata included** — author, stats, duration, creation date
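Since `segments` is meant for subtitle workflows, here is a minimal sketch of turning it into SRT text on the client side. It assumes `start` and `end` are seconds, as the sample output suggests; `format_srt_time` and `segments_to_srt` are illustrative names, not part of the Actor.

```python
def format_srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Convert a list of {start, end, text} dicts into SRT-formatted text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{format_srt_time(seg['start'])} --> {format_srt_time(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{seg['text']}\n")
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)


# The single segment from the sample output above.
sample = [{"start": 0, "end": 2.1, "text": "Drop the wrecking ball!"}]
print(segments_to_srt(sample))
```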

### What you can build with TikTok transcripts

- **Content analysis at scale** — feed transcripts into your LLM or AI agent pipelines (clean, MCP-friendly JSON)
- **Brand & trend monitoring** — search what is actually being *said* in videos, not just hashtags
- **Content repurposing** — turn TikToks into blog posts, newsletters, captions, SRT subtitles
- **Research & journalism** — quote and document spoken video content reliably
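For the brand and trend monitoring case, a simple keyword search over the timestamped segments may be enough to start with. The sketch below assumes result rows shaped like the sample output; `find_mentions` is a hypothetical helper.

```python
def find_mentions(rows, keyword):
    """Yield (videoId, start, text) for every segment that mentions keyword."""
    needle = keyword.casefold()
    for row in rows:
        # Rows without captions carry no transcript to search.
        if not row.get("hasCaption"):
            continue
        for seg in row.get("segments", []):
            if needle in seg["text"].casefold():
                yield row["videoId"], seg["start"], seg["text"]


rows = [
    {
        "videoId": "7629032018113809695",
        "hasCaption": True,
        "segments": [{"start": 0, "end": 2.1, "text": "Drop the wrecking ball!"}],
    },
    {"videoId": "0", "hasCaption": False},
]
for video_id, start, text in find_mentions(rows, "wrecking"):
    print(video_id, start, text)
```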

### Input

```json
{
    "videoUrls": [
        "https://www.tiktok.com/@mrbeast/video/7629032018113809695",
        "https://vm.tiktok.com/ZMabc123/"
    ],
    "preferredLanguage": "eng",
    "useResidentialFallback": false
}
```

- `videoUrls` — standard links and `vm.`/`vt.` short links are supported
- `preferredLanguage` — optional (`eng-US`, or prefix like `eng`); falls back to the original spoken-language track
- `useResidentialFallback` — optional; retries blocked requests via residential proxy
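When the URLs come from user input or another scraper, it can help to filter them before starting a run. The following is a sketch under assumptions: `build_input` is a hypothetical helper, and the host list is inferred from the description above (standard links plus `vm.`/`vt.` short links); the Actor may accept other forms.

```python
from urllib.parse import urlparse

# Assumed host list, inferred from the input description above.
ALLOWED_HOSTS = {"www.tiktok.com", "tiktok.com", "vm.tiktok.com", "vt.tiktok.com"}


def build_input(urls, preferred_language="", use_residential_fallback=False):
    """Build the Actor input, keeping only URLs that look like TikTok links."""
    valid = []
    for url in urls:
        cleaned = url.strip()
        parsed = urlparse(cleaned)
        if parsed.scheme in ("http", "https") and parsed.hostname in ALLOWED_HOSTS:
            valid.append(cleaned)
    if not valid:
        raise ValueError("no valid TikTok video URLs provided")
    return {
        # dict.fromkeys drops duplicates while preserving order.
        "videoUrls": list(dict.fromkeys(valid)),
        "preferredLanguage": preferred_language,
        "useResidentialFallback": use_residential_fallback,
    }


actor_input = build_input(
    [
        "https://www.tiktok.com/@mrbeast/video/7629032018113809695",
        "https://vm.tiktok.com/ZMabc123/",
        "https://example.com/not-tiktok",
    ],
    preferred_language="eng",
)
print(actor_input)
```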

### Frequently asked questions

**How do I get the transcript of a TikTok video?**
Paste the video URL into `videoUrls` and run this Actor. It returns the spoken-word transcript as plain text plus timestamped segments, pulled from TikTok's native captions.

**Does every TikTok video have a transcript?**
Most spoken-word videos carry auto-generated (ASR) captions, which this scraper extracts. Music-only or speechless videos have no captions — those rows are returned for free (`hasCaption: false`), so you never pay for an empty result.

**Can I get timestamped subtitles / SRT from TikTok?**
Yes. Every transcript includes a `segments` array with start/end times, ready to convert into SRT or WebVTT subtitles — at no extra cost.

**Do I need a TikTok API key or login?**
No. The Actor reads publicly available caption data and needs no TikTok account, API key, or browser session.

**How much does it cost to scrape TikTok transcripts?**
$0.001 per delivered transcript ($1 per 1,000). Videos without captions are free. There is a $0.00005 platform start fee per run.

**Can AI agents use this via MCP?**
Yes. The output is structured JSON with named fields (`transcript`, `language`, `segments`, `videoId`), so LLM agents can call it through the Apify MCP server and consume results directly.

### Honest limitations

- Works with TikTok's **native caption tracks**. Videos without any captions (music-only, no speech) return `hasCaption: false` — for free. In practice the large majority of spoken-word videos have ASR captions.
- No AI fallback transcription (yet) — that keeps the price at $0.001 instead of $0.01+.
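Because caption-less videos still produce a row, downstream code usually needs to separate the two kinds of results. A minimal sketch, assuming rows shaped like the sample output (`split_by_caption` is a hypothetical helper):

```python
def split_by_caption(rows):
    """Split result rows into (with_captions, without_captions)."""
    with_captions, without_captions = [], []
    for row in rows:
        target = with_captions if row.get("hasCaption") else without_captions
        target.append(row)
    return with_captions, without_captions


rows = [
    {"url": "https://www.tiktok.com/@mrbeast/video/7629032018113809695", "hasCaption": True},
    {"url": "https://vm.tiktok.com/ZMabc123/", "hasCaption": False},
]
captioned, uncaptioned = split_by_caption(rows)
# URLs without native captions could be routed to another transcription tool.
print([row["url"] for row in uncaptioned])
```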

### Legal & privacy note (GDPR)

This Actor processes **publicly available subtitle data** that TikTok serves with every video page. It collects no private data, requires no login and does not download videos. You are responsible for ensuring that your use of the extracted data complies with applicable laws (including GDPR when processing personal data contained in transcripts) and TikTok's Terms of Service in your jurisdiction.

### Roadmap

- Optional AI fallback transcription for videos without native captions
- Timestamped SRT / WebVTT export as a one-click output format
- On-screen text extraction (OCR) as an optional premium event

***

Questions or a feature request? Open an issue on the Actor page — feedback directly shapes the roadmap.

# Actor input schema

## `videoUrls` (type: `array`):

List of TikTok video URLs (standard or vm./vt. short links). Each URL produces exactly one result row — you are only charged for rows that contain a transcript.

## `preferredLanguage` (type: `string`):

Optional language code such as 'eng-US' or prefix 'eng'. If the video has no caption track in this language, the original (ASR) track is used. Leave empty for automatic selection.
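The fallback rule described above can be pictured roughly as follows. This is an illustration of the documented behavior, not the Actor's actual code: `pick_language` and its parameters are hypothetical, and the matching order (exact code, then prefix, then original track) is an assumption.

```python
def pick_language(available, preferred="", original=None):
    """Pick a caption language code (illustrative only)."""
    if preferred:
        # 1) Exact match, e.g. "eng-US".
        if preferred in available:
            return preferred
        # 2) Prefix match, e.g. "eng" matches "eng-US".
        for code in available:
            if code.startswith(preferred + "-"):
                return code
    # 3) Otherwise fall back to the original spoken-language (ASR) track.
    return original


available = ["eng-US", "vie-VN"]
print(pick_language(available, preferred="eng", original="vie-VN"))
print(pick_language(available, preferred="deu", original="eng-US"))
```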

## `useResidentialFallback` (type: `boolean`):

If datacenter proxies get blocked, retry once via residential proxy (higher proxy cost on your plan, more reliable for high volumes).

## `proxyConfiguration` (type: `object`):

Apify Proxy settings for fetching video pages. Datacenter proxies with rotation are used by default.

## Actor input object example

```json
{
  "videoUrls": [
    "https://www.tiktok.com/@mrbeast/video/7629032018113809695",
    "https://www.tiktok.com/@theanh28entertainment/video/7640319697522642184"
  ],
  "preferredLanguage": "",
  "useResidentialFallback": false,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "videoUrls": [
        "https://www.tiktok.com/@mrbeast/video/7629032018113809695",
        "https://www.tiktok.com/@theanh28entertainment/video/7640319697522642184"
    ],
    "preferredLanguage": "",
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("aticode/tiktok-transcript-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "videoUrls": [
        "https://www.tiktok.com/@mrbeast/video/7629032018113809695",
        "https://www.tiktok.com/@theanh28entertainment/video/7640319697522642184",
    ],
    "preferredLanguage": "",
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("aticode/tiktok-transcript-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "videoUrls": [
    "https://www.tiktok.com/@mrbeast/video/7629032018113809695",
    "https://www.tiktok.com/@theanh28entertainment/video/7640319697522642184"
  ],
  "preferredLanguage": "",
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call aticode/tiktok-transcript-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=aticode/tiktok-transcript-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "TikTok Transcript Scraper · Pay Per Found Caption",
        "description": "Extract TikTok video transcripts from native captions in seconds. Fair pricing: only charged for videos that actually have captions. No AI cost, no login.",
        "version": "0.1",
        "x-build-id": "T78oCFr3oSoeWukXC"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/aticode~tiktok-transcript-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-aticode-tiktok-transcript-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/aticode~tiktok-transcript-scraper/runs": {
            "post": {
                "operationId": "runs-sync-aticode-tiktok-transcript-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/aticode~tiktok-transcript-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-aticode-tiktok-transcript-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "videoUrls"
                ],
                "properties": {
                    "videoUrls": {
                        "title": "TikTok video URLs",
                        "type": "array",
                        "description": "List of TikTok video URLs (standard or vm./vt. short links). Each URL produces exactly one result row — you are only charged for rows that contain a transcript.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "preferredLanguage": {
                        "title": "Preferred caption language",
                        "type": "string",
                        "description": "Optional language code such as 'eng-US' or prefix 'eng'. If the video has no caption track in this language, the original (ASR) track is used. Leave empty for automatic selection.",
                        "default": ""
                    },
                    "useResidentialFallback": {
                        "title": "Residential proxy fallback",
                        "type": "boolean",
                        "description": "If datacenter proxies get blocked, retry once via residential proxy (higher proxy cost on your plan, more reliable for high volumes).",
                        "default": false
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Apify Proxy settings for fetching video pages. Datacenter proxies with rotation are used by default.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
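
If you prefer not to install a client library, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. This is a sketch, not an official example: the helper names and the 300-second timeout are assumptions, and the response is assumed to be a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "aticode~tiktok-transcript-scraper"


def build_run_sync_request(token, actor_input):
    """Build the POST request for the run-sync-get-dataset-items endpoint."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_ID}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token, actor_input, timeout=300):
    """Send the request and parse the response body as JSON (assumed format)."""
    request = build_run_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not contact the API; call run_sync() to execute it.
request = build_run_sync_request(
    "<YOUR_API_TOKEN>",
    {"videoUrls": ["https://www.tiktok.com/@mrbeast/video/7629032018113809695"]},
)
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes it easy to inspect or log the call before spending any credits.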
