TikTok Video Transcriber & Translator

This Apify actor downloads a public TikTok video, extracts its audio, and uses the OpenAI Whisper API to either transcribe it into its original language or translate it into English. It's built to be robust, automatically handling video-to-audio conversion and compression to stay within API limits.

It supports all the latest features from the OpenAI Audio API, including multiple models (whisper-1, gpt-4o-transcribe), various output formats (JSON, SRT, VTT), and word-level timestamp generation.
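
For context, the underlying transcription request corresponds roughly to the OpenAI Audio API call sketched below. This is a minimal illustration using the official openai Python SDK, not the actor's actual code; the API key and audio file name are placeholders.

from openai import OpenAI

client = OpenAI(api_key="sk-...")  # your OpenAI API key

# Transcribe a previously extracted audio file with word-level timestamps.
# verbose_json and timestamp_granularities are only supported by whisper-1.
with open("tiktok_audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        response_format="verbose_json",
        timestamp_granularities=["word"],
    )

print(transcript.text)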

Key Features

  • Direct Transcription from URL: Simply provide a public TikTok video URL.
  • Transcription & Translation: Choose to transcribe in the original language or translate directly to English.
  • Multiple OpenAI Models: Supports whisper-1, gpt-4o-transcribe, and gpt-4o-mini-transcribe.
  • Automatic Audio Handling: The actor automatically extracts the audio from the .mp4 video and compresses it to an .mp3 so the upload stays under OpenAI's 25 MB file size limit (see the sketch after this list).
  • Word-Level Timestamps: Get precise start and end times for each word in the transcript (requires whisper-1).
  • Rich Output Formats: Get your transcript back as plain text, structured JSON, or subtitle formats like SRT and VTT.
  • Advanced Control: Customize the output with optional parameters like language hints and temperature settings.
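
As noted under Automatic Audio Handling, the actor extracts and compresses the audio before uploading it. The sketch below shows one way this step can be done with ffmpeg; the exact flags the actor uses are an assumption, only the general approach is documented.

import subprocess

def extract_compressed_audio(video_path: str, audio_path: str) -> None:
    """Extract the audio track and re-encode it as a small mono MP3.

    A low bitrate keeps even long videos well under OpenAI's 25 MB upload limit.
    """
    subprocess.run(
        [
            "ffmpeg",
            "-y",              # overwrite the output file if it exists
            "-i", video_path,  # input .mp4 downloaded from TikTok
            "-vn",             # drop the video stream
            "-ac", "1",        # mono audio is enough for speech
            "-b:a", "64k",     # modest bitrate to keep the file small
            audio_path,        # output .mp3
        ],
        check=True,
    )

extract_compressed_audio("tiktok_video.mp4", "tiktok_audio.mp3")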

Cost of Usage

This actor uses two services that may have associated costs:

  1. Apify Platform: You will be charged for Apify Compute Units (CUs) while the actor runs. The consumption is generally low for this task, as it primarily waits for API responses.
  2. OpenAI API: The primary cost will come from the OpenAI API. You must provide your own OpenAI API key, and you will be billed by OpenAI for the audio processing. For the most current pricing, please refer to the OpenAI Pricing Page.

Input Configuration

The actor accepts the following input fields; tiktokUrl and openaiApiKey are required.

  • tiktokUrl (String, required): The full URL of the public TikTok video you want to process.
  • openaiApiKey (String, required): Your secret API key from OpenAI. It's highly recommended to set this as a secret environment variable for security.
  • task (String): The task to perform, either transcription (default) or translation. Translation always outputs English.
  • model (String): The OpenAI model to use. whisper-1 (default) supports all features; gpt-4o-transcribe and gpt-4o-mini-transcribe are newer but have some limitations. Note: translation and timestamps require whisper-1.
  • language (String, optional): The language of the audio in ISO-639-1 format (e.g., en, es, zh). Supplying this can improve accuracy.
  • prompt (String, optional): A text prompt to guide the model's style or to correct specific words and acronyms that are often misrecognized.
  • response_format (String): The output format of the transcript. Default is json. Note: the gpt-4o models only support json or text; timestamps require verbose_json.
  • temperature (String, optional): A value between 0 and 1. Higher values (e.g., 0.8) make the output more random; lower values (e.g., 0.2) make it more focused.
  • timestamp_granularities (Array, optional): Request word- or segment-level timestamps. Requires the whisper-1 model and the verbose_json response format.
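
Putting these fields together, a typical input object might look like the following; the URL and API key are placeholders.

{
  "tiktokUrl": "https://www.tiktok.com/@username/video/1234567890123456789",
  "openaiApiKey": "sk-...",
  "task": "transcription",
  "model": "whisper-1",
  "response_format": "verbose_json",
  "timestamp_granularities": ["word"]
}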

Output

The actor will save its results to the default Apify dataset. The output is a JSON object that contains the task and model used, along with the full result from the OpenAI API.

Example Output (verbose_json with word timestamps)

[{
  "task": "transcription",
  "model": "whisper-1",
  "result": {
    "text": "This is a test of the TikTok transcriber.",
    "segments": [
      {
        "id": 0,
        "seek": 0,
        "start": 0,
        "end": 3.5,
        "text": " This is a test of the TikTok transcriber.",
        "tokens": [ ... ],
        "temperature": 0,
        "avg_logprob": -0.25,
        "compression_ratio": 1.2,
        "no_speech_prob": 0.1
      }
    ],
    "words": [
      { "word": "This", "start": 0.5, "end": 0.7 },
      { "word": "is", "start": 0.7, "end": 0.8 },
      { "word": "a", "start": 0.8, "end": 0.9 },
      { "word": "test", "start": 0.9, "end": 1.2 },
      ...
    ],
    "language": "english"
  }
}]
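
If you request word-level timestamps, you can post-process the words array of a dataset item like the one above. The helper below is purely illustrative and not part of the actor.

def words_to_lines(result: dict, max_words: int = 6) -> list[str]:
    """Group word-level timestamps from a Whisper result into short caption lines."""
    words = result.get("words", [])
    lines = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w["word"] for w in chunk)
        lines.append(f"[{chunk[0]['start']:.2f}s - {chunk[-1]['end']:.2f}s] {text}")
    return lines

# Example with the first few words from the sample output above.
sample = {"words": [
    {"word": "This", "start": 0.5, "end": 0.7},
    {"word": "is", "start": 0.7, "end": 0.8},
    {"word": "a", "start": 0.8, "end": 0.9},
    {"word": "test", "start": 0.9, "end": 1.2},
]}
print("\n".join(words_to_lines(sample)))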

How to Use

  1. Navigate to the actor on the Apify platform.
  2. Click the "Try actor" button.
  3. Enter the TikTok Video URL and your OpenAI API Key.
  4. (Optional) Adjust the other configuration options as needed.
  5. Click the Start button and wait for the run to finish.
  6. Once finished, check the Output tab in the run console to view and download your results.
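
Instead of the web UI, you can also start the actor from code. The sketch below uses the apify-client Python package; the API token, actor ID, video URL, and OpenAI key are placeholders you need to replace with your own values.

from apify_client import ApifyClient

# Placeholder Apify API token; replace with your own.
client = ApifyClient("apify_api_XXXXXXXXXXXX")

run_input = {
    "tiktokUrl": "https://www.tiktok.com/@username/video/1234567890123456789",
    "openaiApiKey": "sk-...",
    "task": "transcription",
    "model": "whisper-1",
}

# Placeholder actor ID; use the ID shown on this actor's store page.
run = client.actor("username/tiktok-whisper-transcriber").call(run_input=run_input)

# Read the transcription results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["result"]["text"])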

Limitations

  • The input TikTok video must be publicly accessible. Private or deleted videos will cause an error.