# TikTok Transcript Scraper (`automation-lab/tiktok-transcript-scraper`) Actor

Extract TikTok video transcripts, timestamped caption segments, hashtags, and engagement metadata for AI analysis, content repurposing, and viral hook research.

- **URL**: https://apify.com/automation-lab/tiktok-transcript-scraper.md
- **Developed by:** [Stas Persiianenko](https://apify.com/automation-lab) (community)
- **Categories:** Social media
- **Stats:** 8 total users, 5 monthly users, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.
Because this Actor supports Apify Store discounts, the price decreases on higher subscription plans.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## TikTok Transcript Scraper

Extract TikTok video transcripts, caption segments, and video metadata from public TikTok video URLs.

### What does TikTok Transcript Scraper do?

TikTok Transcript Scraper turns a list of TikTok video links into structured transcript data.
It fetches public TikTok video pages, looks for TikTok caption/subtitle metadata, downloads available subtitle files, and saves one dataset row per video.

### Who is it for?

- 📣 Social media managers who repurpose TikTok videos into blog posts, captions, and newsletters
- 🧠 AI workflow builders who need transcript text for summarization or classification
- 📊 Brand researchers who monitor creator messaging and campaign language
- 🎬 Video editors who need text from multiple TikTok clips
- 📰 Journalists and analysts who review public short-form video content

### Why use this Actor?

Manual transcript collection is slow and inconsistent.
This Actor normalizes transcript availability, subtitle sources, video metadata, engagement counters, and errors into a predictable dataset.

### Data you can extract

| Field | Description |
| --- | --- |
| `url` | Submitted TikTok URL |
| `finalUrl` | Final URL after redirects |
| `videoId` | TikTok video ID |
| `authorUsername` | Creator username when available |
| `authorDisplayName` | Creator display name |
| `description` | TikTok caption text |
| `hashtags` | Hashtags parsed from TikTok metadata/caption |
| `publishedAt` | Publish timestamp when TikTok exposes it |
| `durationSeconds` | Video duration |
| `likeCount` | Like count |
| `commentCount` | Comment count |
| `shareCount` | Share count |
| `playCount` | Play/view count |
| `transcriptAvailable` | Whether readable public captions were found |
| `transcriptText` | Full transcript text |
| `transcriptSegments` | Timed transcript segments |
| `transcriptLanguage` | Transcript language or language code |
| `subtitleSources` | Raw subtitle source metadata |
| `error` | Reason transcript/metadata was unavailable |
| `scrapedAt` | Timestamp of extraction |
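
For downstream code, it can help to give each dataset row a type. The sketch below is one possible Python representation of a subset of the fields in the table above. The names `Segment`, `TranscriptRow`, and `parse_row` are hypothetical and not part of the Actor, and the segment keys `startTime`, `endTime`, and `text` are taken from the output example in this README rather than from a published schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Segment:
    start_time: float
    end_time: float
    text: str


@dataclass
class TranscriptRow:
    url: str
    video_id: Optional[str] = None
    transcript_available: bool = False
    transcript_text: Optional[str] = None
    segments: list[Segment] = field(default_factory=list)
    error: Optional[str] = None


def parse_row(item: dict[str, Any]) -> TranscriptRow:
    """Map one raw dataset item onto the typed row; missing keys fall back to defaults."""
    segments = [
        Segment(float(s["startTime"]), float(s["endTime"]), str(s["text"]))
        for s in item.get("transcriptSegments") or []
    ]
    return TranscriptRow(
        url=item["url"],
        video_id=item.get("videoId"),
        transcript_available=bool(item.get("transcriptAvailable")),
        transcript_text=item.get("transcriptText"),
        segments=segments,
        error=item.get("error"),
    )
```

Using `.get()` for everything except `url` reflects the table's "when available" wording: optional fields may be absent or `null` in a given row.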

### How much does it cost to scrape TikTok transcripts?

The Actor uses pay-per-event pricing.
Each run is charged a small start event, plus one per-video item event for every saved dataset row.
Exact tiered prices are shown on the Actor's Apify Store page and may include free monthly usage for eligible Apify users.
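
As a rough illustration of the pricing model described above, the following sketch adds one start event to one item event per saved row. The function name `estimate_run_cost` is hypothetical, and the prices in the usage line are placeholders, not the Actor's real prices.

```python
def estimate_run_cost(saved_rows: int, start_price_usd: float, item_price_usd: float) -> float:
    """Estimated cost of one run: one start event plus one item event per saved dataset row."""
    if saved_rows < 0:
        raise ValueError("saved_rows must be non-negative")
    return start_price_usd + saved_rows * item_price_usd


# Placeholder prices for illustration only; check the Actor page for real values.
print(round(estimate_run_cost(100, start_price_usd=0.005, item_price_usd=0.002), 4))
```

Because item events are tied to saved dataset rows, rows saved without a transcript presumably count toward the total as well.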

### How to use TikTok Transcript Scraper

1. Open the Actor on Apify.
2. Paste TikTok video URLs into **TikTok video URLs**.
3. Set **Maximum videos** for the run.
4. Keep **Save videos without transcripts** enabled if you want rows for missing captions.
5. Start the run.
6. Download results as JSON, CSV, Excel, XML, RSS, or via API.

### Input example

```json
{
  "startUrls": [
    { "url": "https://www.tiktok.com/@tedtoks/video/7295065135788477742" }
  ],
  "maxItems": 1,
  "includeMetadataOnly": true
}
````

### Output example

```json
{
  "url": "https://www.tiktok.com/@example/video/1234567890",
  "videoId": "1234567890",
  "authorUsername": "example",
  "description": "Public TikTok caption #example",
  "transcriptAvailable": true,
  "transcriptText": "This is the transcript text.",
  "transcriptSegments": [
    { "startTime": 0, "endTime": 2.1, "text": "This is the transcript text." }
  ],
  "error": null
}
```
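
Timed segments like those in the example can be turned into a subtitle file. The sketch below renders `transcriptSegments` as SRT cues. It assumes `startTime` and `endTime` are expressed in seconds, which the example values suggest but this README does not state explicitly. The helper names are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render transcriptSegments as SRT cues numbered from 1."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["startTime"])
        end = format_timestamp(seg["endTime"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text']}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

If the Actor turns out to report times in another unit, only `format_timestamp` needs to change.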

### Transcript availability

Not every TikTok video has public captions.
Some creators disable captions, some videos have no speech, and some videos are private, removed, age-gated, or geo-restricted.
When captions are missing and `includeMetadataOnly` is enabled, the Actor still saves a row with `transcriptAvailable: false` and a clear `error` value.
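
A downstream step can separate the two kinds of rows before any text processing. This is a minimal sketch that relies only on the `transcriptAvailable` field; the function name `split_by_transcript` is hypothetical.

```python
def split_by_transcript(rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (rows with a transcript, rows without one), preserving input order."""
    with_transcript = [r for r in rows if r.get("transcriptAvailable")]
    without_transcript = [r for r in rows if not r.get("transcriptAvailable")]
    return with_transcript, without_transcript
```

Rows that lack the field entirely are treated as having no transcript, which keeps the split safe if a row is malformed.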

### Tips for best results

- Use direct TikTok video URLs rather than profile URLs.
- Keep the first run small to confirm the URLs are public.
- Enable the metadata-only fallback (`includeMetadataOnly: true`) when auditing a large list.
- Disable the metadata-only fallback (`includeMetadataOnly: false`) when you only want rows with transcripts.
- Expect some TikTok videos to return no captions.
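
To apply the first tip before a run, a list of links can be pre-filtered locally. The sketch below is a heuristic, not the Actor's own validation: the path pattern `/@<username>/video/<numeric id>` and the accepted hostnames are assumptions based on the example URL in this README and the note in the input schema about `vm.tiktok.com` short links.

```python
import re
from urllib.parse import urlparse

# Assumed shape of a direct video path: /@<username>/video/<numeric id>
VIDEO_PATH = re.compile(r"^/@[^/]+/video/\d+/?$")


def looks_like_video_url(url: str) -> bool:
    """Heuristically check whether a URL points at a single TikTok video."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    if host == "vm.tiktok.com":
        # Short links cannot be checked offline; only require a non-empty path.
        return parsed.path.strip("/") != ""
    if host in ("tiktok.com", "www.tiktok.com"):
        return VIDEO_PATH.match(parsed.path) is not None
    return False
```

Profile URLs such as `https://www.tiktok.com/@tedtoks` fail the path check and can be reported back to the user instead of being sent to the run.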

### Integrations

Use the Actor in workflows such as:

- TikTok-to-blog summarization pipelines
- UGC campaign monitoring dashboards
- Creator content research notebooks
- Brand safety review queues
- AI prompt generation from short-form videos
- Cross-platform content repurposing systems

### API usage with Node.js

```js
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.actor('automation-lab/tiktok-transcript-scraper').call({
  startUrls: [{ url: 'https://www.tiktok.com/@tedtoks/video/7295065135788477742' }],
  maxItems: 1,
  includeMetadataOnly: true,
});

const { items } = await client.dataset(run.defaultDatasetId).listItems();
console.log(items);
```

### API usage with Python

```python
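import os

from apify_client import ApifyClient

# Read the API token from the environment, as in the Node.js example
client = ApifyClient(os.environ['APIFY_TOKEN'])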
run = client.actor('automation-lab/tiktok-transcript-scraper').call(run_input={
    'startUrls': [{'url': 'https://www.tiktok.com/@tedtoks/video/7295065135788477742'}],
    'maxItems': 1,
    'includeMetadataOnly': True,
})
items = client.dataset(run['defaultDatasetId']).list_items().items
print(items)
```

### API usage with cURL

```bash
curl -X POST 'https://api.apify.com/v2/acts/automation-lab~tiktok-transcript-scraper/runs?token=YOUR_APIFY_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{"startUrls":[{"url":"https://www.tiktok.com/@tedtoks/video/7295065135788477742"}],"maxItems":1,"includeMetadataOnly":true}'
```

### MCP usage

Connect this Actor to Claude Desktop, Claude Code, or another MCP client through the Apify MCP server.
Use this MCP URL:

```text
https://mcp.apify.com/?tools=automation-lab/tiktok-transcript-scraper
```

Add it to Claude Code with:

```bash
claude mcp add --transport http apify-tiktok-transcript "https://mcp.apify.com/?tools=automation-lab/tiktok-transcript-scraper"
```

For Claude Desktop or other JSON-based MCP clients, add:

```json
{
  "mcpServers": {
    "apify-tiktok-transcript": {
      "url": "https://mcp.apify.com/?tools=automation-lab/tiktok-transcript-scraper"
    }
  }
}
```

Example prompts:

- "Extract transcripts from these TikTok URLs and summarize common themes."
- "Find the hashtags and transcript text for this list of campaign videos."
- "Turn these TikTok transcripts into newsletter snippets."

### Legal and ethical notes

This Actor is designed for public TikTok video pages.
You should only process content you are allowed to access and use.
Respect TikTok terms, copyright, privacy rights, and applicable laws.
Do not use transcript data for harassment, spam, discrimination, or other harmful activity.

### Limitations

- Private, deleted, age-gated, or geo-restricted videos may fail.
- TikTok may not expose captions for every video.
- Auto-generated captions can contain mistakes.
- Engagement counters may be missing or rounded depending on TikTok response data.
- Upstream TikTok page structure can change.

### FAQ

#### Can this Actor transcribe videos without TikTok captions?

No. The current version only extracts the public subtitle/caption files that TikTok exposes for a video. It does not run paid speech-to-text on videos with no captions.

#### Does it need TikTok login cookies?

No. The Actor is designed for public video URLs and does not request your TikTok account credentials.

### Troubleshooting

#### Why is `transcriptAvailable` false?

The most likely causes are that the video has no public caption file, the creator disabled captions, the video contains no speech, or TikTok did not expose subtitle metadata for the request.
Keep `includeMetadataOnly` enabled so the row is saved and you can inspect its `error` field.
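
When many rows come back without a transcript, grouping them by `error` shows which cause dominates. This is a minimal sketch with a hypothetical function name. The actual error strings are whatever the Actor returns, and the `"unknown"` label is introduced here for rows whose `error` is empty.

```python
from collections import Counter


def summarize_errors(rows: list[dict]) -> Counter:
    """Count rows without a transcript, grouped by their `error` value."""
    return Counter(
        row.get("error") or "unknown"
        for row in rows
        if not row.get("transcriptAvailable")
    )
```

Calling `summarize_errors(items).most_common()` on the dataset items lists the reasons from most to least frequent.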

#### Why did a TikTok URL return only an error?

The URL may be private, removed, invalid, region-limited, or blocked by TikTok.
Try opening it in a logged-out browser and confirm it is a direct video URL.

### Related scrapers

Other automation-lab Actors you may use with this Actor:

- https://apify.com/automation-lab/tiktok-scraper
- https://apify.com/automation-lab/tiktok-comments-scraper
- https://apify.com/automation-lab/youtube-transcript-scraper
- https://apify.com/automation-lab/video-transcript-scraper

### Performance

This Actor is HTTP-based and, in its current version, does not launch a browser.
That keeps runs lightweight for batches of direct video URLs.

### Privacy

The Actor does not ask for your TikTok login or cookies.
It works from public URLs and saves only data returned by public TikTok pages and subtitle files.

### Changelog

- Initial version: public TikTok URL transcript extraction with metadata fallback.

### Support

If a public video URL fails unexpectedly, share the run and URL through Apify support so we can inspect the current TikTok response shape.

### Final notes

TikTok transcript availability varies by video.
For reliable analysis pipelines, keep the `error` and `transcriptAvailable` fields in downstream processing so missing captions can be handled gracefully.

# Actor input Schema

## `startUrls` (type: `array`):

Paste one or more public TikTok video URLs. Short vm.tiktok.com links are accepted if TikTok redirects them.

## `videoUrls` (type: `array`):

Optional API-friendly array of TikTok video URL strings. Use this instead of startUrls if calling programmatically.

## `maxItems` (type: `integer`):

Maximum number of supplied TikTok URLs to process in this run.

## `includeMetadataOnly` (type: `boolean`):

If enabled, the actor saves a metadata row with transcriptAvailable=false when TikTok exposes no public captions.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.tiktok.com/@tedtoks/video/7295065135788477742"
    }
  ],
  "videoUrls": [
    "https://www.tiktok.com/@tedtoks/video/7295065135788477742"
  ],
  "maxItems": 3,
  "includeMetadataOnly": true
}
```
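
When calling the Actor from code, the input object can be assembled from plain URL strings through the `videoUrls` field. The sketch below is one way to do that. The helper name `build_input` is hypothetical, the 1 to 1000 range and the defaults mirror the OpenAPI input schema in this document, and the de-duplication and the "at least one URL" guard are choices of this sketch rather than documented Actor behavior.

```python
from typing import Any


def build_input(
    urls: list[str],
    max_items: int = 3,
    include_metadata_only: bool = True,
) -> dict[str, Any]:
    """Build an Actor input object from plain URL strings using the `videoUrls` field."""
    if not 1 <= max_items <= 1000:
        raise ValueError("maxItems must be between 1 and 1000")
    # Drop blanks and duplicates while preserving the original order.
    unique_urls = list(dict.fromkeys(u.strip() for u in urls if u.strip()))
    if not unique_urls:
        raise ValueError("at least one URL is required")
    return {
        "videoUrls": unique_urls,
        "maxItems": max_items,
        "includeMetadataOnly": include_metadata_only,
    }
```

The returned dictionary can be passed as `run_input` to the Python client's `call()` method.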

# Actor output Schema

## `overview` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.tiktok.com/@tedtoks/video/7295065135788477742"
        }
    ],
    "videoUrls": [
        "https://www.tiktok.com/@tedtoks/video/7295065135788477742"
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("automation-lab/tiktok-transcript-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "startUrls": [{ "url": "https://www.tiktok.com/@tedtoks/video/7295065135788477742" }],
    "videoUrls": ["https://www.tiktok.com/@tedtoks/video/7295065135788477742"],
}

# Run the Actor and wait for it to finish
run = client.actor("automation-lab/tiktok-transcript-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.tiktok.com/@tedtoks/video/7295065135788477742"
    }
  ],
  "videoUrls": [
    "https://www.tiktok.com/@tedtoks/video/7295065135788477742"
  ]
}' |
apify call automation-lab/tiktok-transcript-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=automation-lab/tiktok-transcript-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "TikTok Transcript Scraper",
        "description": "Extract TikTok video transcripts, timestamped caption segments, hashtags, and engagement metadata for AI analysis, content repurposing, and viral hook research.",
        "version": "0.1",
        "x-build-id": "tmYikuxLSAOcookIb"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/automation-lab~tiktok-transcript-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-automation-lab-tiktok-transcript-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/automation-lab~tiktok-transcript-scraper/runs": {
            "post": {
                "operationId": "runs-sync-automation-lab-tiktok-transcript-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/automation-lab~tiktok-transcript-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-automation-lab-tiktok-transcript-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "TikTok video URLs",
                        "type": "array",
                        "description": "Paste one or more public TikTok video URLs. Short vm.tiktok.com links are accepted if TikTok redirects them.",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "videoUrls": {
                        "title": "TikTok video URLs (API array)",
                        "type": "array",
                        "description": "Optional API-friendly array of TikTok video URL strings. Use this instead of startUrls if calling programmatically.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxItems": {
                        "title": "Maximum videos",
                        "minimum": 1,
                        "maximum": 1000,
                        "type": "integer",
                        "description": "Maximum number of supplied TikTok URLs to process in this run.",
                        "default": 3
                    },
                    "includeMetadataOnly": {
                        "title": "Save videos without transcripts",
                        "type": "boolean",
                        "description": "If enabled, the actor saves a metadata row with transcriptAvailable=false when TikTok exposes no public captions.",
                        "default": true
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
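
The `run-sync-get-dataset-items` path in the specification can also be called without a client library. The sketch below uses only the Python standard library and follows the specification in passing the token as the `token` query parameter. The helper names `build_request` and `run_sync` are hypothetical, and the response is assumed to be a JSON array of dataset items, which the specification does not spell out.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/automation-lab~tiktok-transcript-scraper/run-sync-get-dataset-items"


def build_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request described by the specification."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """Send the request and decode the body (assumed to be a JSON array of items)."""
    request = build_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the URL and body easy to inspect or log (with the token redacted) before any network call is made.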
