Pricing

$5.00/month + usage

Audio & Video to Text

Transcribes video and audio files into plain text and subtitle formats (TXT, SRT, VTT, TSV, JSON) using OpenAI's Whisper model. Supports preloaded tiny, base, and small models.

Pricing

$5.00/month + usage

Rating

0.0

(0)

Developer

Donjuan

Actor stats

Bookmarked

Total users

Monthly active users

10 months ago

Last modified

Categories

Social media

Developer tools

Open source

You can access the Audio & Video to Text programmatically from your own applications by using the Apify API. You can also choose the language preference from below. To use the Apify API, you’ll need an Apify account and your API token, found in Integrations settings in Apify Console.

Python

JavaScript

CLI

OpenAPI

HTTP

MCP

{
  "openapi": "3.0.1",
  "info": {
    "version": "0.0",
    "x-build-id": "YECau3w1LQ0yXbGvh"
  },
  "servers": [
    {
      "url": "https://api.apify.com/v2"
    }
  ],
  "paths": {
    "/acts/donjuan_mime~audio-video-to-text/run-sync-get-dataset-items": {
      "post": {
        "operationId": "run-sync-get-dataset-items-donjuan_mime-audio-video-to-text",
        "x-openai-isConsequential": false,
        "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
        "tags": [
          "Run Actor"
        ],
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/inputSchema"
              }
            }
          }
        },
        "parameters": [
          {
            "name": "token",
            "in": "query",
            "required": true,
            "schema": {
              "type": "string"
            },
            "description": "Enter your Apify token here"
          }
        ],
        "responses": {
          "200": {
            "description": "OK"
          }
        }
      }
    },
    "/acts/donjuan_mime~audio-video-to-text/runs": {
      "post": {
        "operationId": "runs-sync-donjuan_mime-audio-video-to-text",
        "x-openai-isConsequential": false,
        "summary": "Executes an Actor and returns information about the initiated run in response.",
        "tags": [
          "Run Actor"
        ],
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/inputSchema"
              }
            }
          }
        },
        "parameters": [
          {
            "name": "token",
            "in": "query",
            "required": true,
            "schema": {
              "type": "string"
            },
            "description": "Enter your Apify token here"
          }
        ],
        "responses": {
          "200": {
            "description": "OK",
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/runsResponseSchema"
                }
              }
            }
          }
        }
      }
    },
    "/acts/donjuan_mime~audio-video-to-text/run-sync": {
      "post": {
        "operationId": "run-sync-donjuan_mime-audio-video-to-text",
        "x-openai-isConsequential": false,
        "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
        "tags": [
          "Run Actor"
        ],
        "requestBody": {
          "required": true,
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/inputSchema"
              }
            }
          }
        },
        "parameters": [
          {
            "name": "token",
            "in": "query",
            "required": true,
            "schema": {
              "type": "string"
            },
            "description": "Enter your Apify token here"
          }
        ],
        "responses": {
          "200": {
            "description": "OK"
          }
        }
      }
    }
  },
  "components": {
    "schemas": {
      "inputSchema": {
        "type": "object",
        "required": [
          "source_url",
          "model"
        ],
        "properties": {
          "source_url": {
            "title": "Source URL",
            "type": "string",
            "description": "Direct URL to the video or audio file (e.g., .mp4, .wav). YouTube links are not supported."
          },
          "model": {
            "title": "Whisper Model",
            "type": "string",
            "description": "Whisper model to use: 'tiny', 'base', or 'small' (pre-installed). Other models require download."
          }
        }
      },
      "runsResponseSchema": {
        "type": "object",
        "properties": {
          "data": {
            "type": "object",
            "properties": {
              "id": {
                "type": "string"
              },
              "actId": {
                "type": "string"
              },
              "userId": {
                "type": "string"
              },
              "startedAt": {
                "type": "string",
                "format": "date-time",
                "example": "2025-01-08T00:00:00.000Z"
              },
              "finishedAt": {
                "type": "string",
                "format": "date-time",
                "example": "2025-01-08T00:00:00.000Z"
              },
              "status": {
                "type": "string",
                "example": "READY"
              },
              "meta": {
                "type": "object",
                "properties": {
                  "origin": {
                    "type": "string",
                    "example": "API"
                  },
                  "userAgent": {
                    "type": "string"
                  }
                }
              },
              "stats": {
                "type": "object",
                "properties": {
                  "inputBodyLen": {
                    "type": "integer",
                    "example": 2000
                  },
                  "rebootCount": {
                    "type": "integer",
                    "example": 0
                  },
                  "restartCount": {
                    "type": "integer",
                    "example": 0
                  },
                  "resurrectCount": {
                    "type": "integer",
                    "example": 0
                  },
                  "computeUnits": {
                    "type": "integer",
                    "example": 0
                  }
                }
              },
              "options": {
                "type": "object",
                "properties": {
                  "build": {
                    "type": "string",
                    "example": "latest"
                  },
                  "timeoutSecs": {
                    "type": "integer",
                    "example": 300
                  },
                  "memoryMbytes": {
                    "type": "integer",
                    "example": 1024
                  },
                  "diskMbytes": {
                    "type": "integer",
                    "example": 2048
                  }
                }
              },
              "buildId": {
                "type": "string"
              },
              "defaultKeyValueStoreId": {
                "type": "string"
              },
              "defaultDatasetId": {
                "type": "string"
              },
              "defaultRequestQueueId": {
                "type": "string"
              },
              "buildNumber": {
                "type": "string",
                "example": "1.0.0"
              },
              "containerUrl": {
                "type": "string"
              },
              "usage": {
                "type": "object",
                "properties": {
                  "ACTOR_COMPUTE_UNITS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_WRITES": {
                    "type": "integer",
                    "example": 1
                  },
                  "KEY_VALUE_STORE_LISTS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_INTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_EXTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_SERPS": {
                    "type": "integer",
                    "example": 0
                  }
                }
              },
              "usageTotalUsd": {
                "type": "number",
                "example": 0.00005
              },
              "usageUsd": {
                "type": "object",
                "properties": {
                  "ACTOR_COMPUTE_UNITS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATASET_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "KEY_VALUE_STORE_WRITES": {
                    "type": "number",
                    "example": 0.00005
                  },
                  "KEY_VALUE_STORE_LISTS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_READS": {
                    "type": "integer",
                    "example": 0
                  },
                  "REQUEST_QUEUE_WRITES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_INTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "DATA_TRANSFER_EXTERNAL_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                    "type": "integer",
                    "example": 0
                  },
                  "PROXY_SERPS": {
                    "type": "integer",
                    "example": 0
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Audio & Video to Text OpenAPI definition

OpenAPI is a standard for designing and describing RESTful APIs, allowing developers to define API structure, endpoints, and data formats in a machine-readable way. It simplifies API development, integration, and documentation.

OpenAPI is effective when used with AI agents and GPTs by standardizing how these systems interact with various APIs, for reliable integrations and efficient communication.

By defining machine-readable API specifications, OpenAPI allows AI models like GPTs to understand and use varied data sources, improving accuracy. This accelerates development, reduces errors, and provides context-aware responses, making OpenAPI a core component for AI applications.

You can download the OpenAPI definitions for Audio & Video to Text from the options below:

OpenAPI.json

If you’d like to learn more about how OpenAPI powers GPTs, read our blog post.

You can also check out our other API clients:

Audio & Video to Text API in Python

Audio & Video to Text API in JavaScript

Audio & Video to Text API through CLI

Audio & Video to Text API

Transcribe Video to Text & Audio to Text — 99+ Languages

sian.agency/INCREDIBLY-FAST-audio-transcriber

Transcribe video to text and audio to text in bulk on Apify. 99+ languages, word-level timestamps, speaker diarization, SRT/VTT export. Try free.

SIÁN OÜ

5.0

Tiktok Transcript Scraper/Downloader

scraper-mind/tiktok-transcript-scraper

Extract TikTok video transcripts, captions, and metadata fast with our TikTok Transcript Scraper. Supports batch processing, proxy fallback, and JSON export. Ideal for creators, researchers, and marketers. Just $5 per run—accurate, scalable, and reliable!

Scraper Mind

1.0

Analyze Image

akash9078/analyze-image

Unlock powerful AI analysis with just one upload. Detect objects, styles, colors, emotions, and more in seconds. Perfect for designers, researchers, and creators who want to understand images deeply.

Akash Kumar Naik

5.0

TikTok Transcript

agentx/tiktok-transcript

TikTok transcript API — pass any TikTok video URL and the response includes the spoken-word transcript with timestamps, the on-screen caption text, and an optional translation into any of 100+ languages. Works for short-form posts, longer videos, and embedded slideshow audio.

AgentX

366

5.0

Video Script Generator

powerai/video-script-generator

Generate engaging video scripts with AI-powered agent, tailored to your content type, audience, and key messages. Get professional scripts in multiple formats (Markdown, HTML, PDF) with visual theme suggestions and tone guidance.

PowerAI

Best Tiktok Transcripts Scraper

scrape-creators/best-tiktok-transcripts-scraper

Extract spoken transcripts from TikTok videos (where available) with Best TikTok Transcripts Scraper. Just enter video URLs to get transcripts. Perfect for content analysis, AI pipelines, or trend research.

Scrape Creators

1.2K

4.4

TikTok Transcript Scraper

crawlerbros/tiktok-transcript-scraper

Extract transcripts and subtitles from TikTok videos in all available languages. Returns timestamped segments plus full plain-text transcript per language.

Crawler Bros

5.0

Youtube Search Scraper

akash9078/youtube-search-scraper

Search YouTube with one or more queries via the official YouTube Data API and export structured results with optional video-level statistics and durations.

Akash Kumar Naik

Google Maps Scraper Api

akash9078/google-maps-scraper-api

Extract business data from Google Maps using official Places API. Get names, addresses, phone numbers, ratings, reviews, photos, hours & more for restaurants, hotels, shops & any business.

Akash Kumar Naik

Fast YouTube Transcript Scraper

akash9078/fast-youtube-transcript

Extract accurate YouTube video transcripts and timestamped captions without an API key. Supports YouTube Shorts, live stream VODs, Premieres, embedded videos, and auto-generated subtitles in 100+ languages. Fast YouTube transcript extractor for SEO, AI, research, and content creation.

Akash Kumar Naik