# AI Audio to Text Transcriber (`jungle_synthesizer/ai-audio-to-text-transcriber`) Actor

Transcribe audio files to text using OpenAI Whisper. Accepts public audio URLs (MP3, MP4, M4A, WAV, WEBM, OGG, FLAC) and returns full transcripts with language, duration, and timed segments. BYO OpenAI key required.

- **URL**: https://apify.com/jungle_synthesizer/ai-audio-to-text-transcriber.md
- **Developed by:** [BowTiedRaccoon](https://apify.com/jungle_synthesizer) (community)
- **Categories:** AI, Developer tools, Automation
- **Stats:** 1 total user, 0 monthly users, 0.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

Pay per event

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## AI Audio to Text Transcriber

Transcribe audio files to text using OpenAI Whisper. Supply a list of public audio file URLs and your OpenAI API key. The Actor downloads each file, sends it to the Whisper API, and returns a verbatim transcript alongside the detected language, duration, and timed segments.

### What it does

- Accepts a list of public audio file URLs (MP3, MP4, M4A, WAV, WEBM, OGG, FLAC)
- Downloads each file to temporary storage (max 25 MB per file — OpenAI limit)
- Transcribes via OpenAI Whisper (`whisper-1`) with `verbose_json` output
- Returns the full text transcript, detected language, audio duration, and segment-level timestamps
- Processes up to 3 files concurrently for faster batch runs
- Saves one dataset record per file, including error records for files that fail

### Use cases

- Podcast indexing and search
- Meeting recording notes
- Compliance and call-center transcription
- Generating training data for NLP models
- Subtitles and captions for video content
- Multilingual content analysis

### Input

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `audioUrls` | Array | Yes | Public audio file URLs to transcribe |
| `openaiApiKey` | String | Yes | Your OpenAI API key (`sk-...`). Not stored. |
| `language` | String | No | ISO 639-1 hint (e.g. `en`, `es`, `ja`). Omit for auto-detect. |
| `maxItems` | Integer | No | Maximum files to transcribe per run. Default: 15. |

**Supported audio formats:** MP3, MP4, M4A, WAV, WEBM, OGG, FLAC

**Max file size:** 25 MB (OpenAI Whisper hard limit)

#### Example input

```json
{
  "audioUrls": [
    "https://example.com/podcast-episode-1.mp3",
    "https://example.com/meeting-recording.wav"
  ],
  "openaiApiKey": "sk-...",
  "language": "en",
  "maxItems": 10
}
````
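
Because each run start is billed, it can be worth checking an input object locally first. The sketch below is a hypothetical client-side helper, not part of the Actor. It assumes the audio format can be guessed from the URL's path extension (which will not hold for URLs without one) and uses the default `maxItems` of 15 from the table above.

```python
import posixpath
from urllib.parse import urlparse

# Formats listed in this README; matching on the URL extension is an assumption.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".webm", ".ogg", ".flac"}
DEFAULT_MAX_ITEMS = 15  # default from the input table


def check_input(actor_input: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    urls = actor_input.get("audioUrls") or []
    if not urls:
        problems.append("audioUrls is required and must not be empty")
    if not actor_input.get("openaiApiKey"):
        problems.append("openaiApiKey is required")
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            problems.append(f"not an http(s) URL: {url}")
            continue
        ext = posixpath.splitext(parsed.path)[1].lower()
        if ext not in SUPPORTED_EXTENSIONS:
            problems.append(f"unrecognized extension {ext!r}: {url}")
    max_items = actor_input.get("maxItems", DEFAULT_MAX_ITEMS)
    if len(urls) > max_items:
        problems.append(f"{len(urls)} URLs given but maxItems is {max_items}")
    return problems
```

The helper cannot verify the 25 MB limit or that a URL is publicly reachable; those checks would need a network request.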

### Output

One dataset record per audio file.

| Field | Type | Description |
|-------|------|-------------|
| `sourceUrl` | String | Original audio file URL |
| `transcript` | String | Full verbatim transcription text |
| `language` | String | Detected language (e.g. `english`, `spanish`) |
| `durationSeconds` | Number | Audio duration in seconds |
| `segments` | String | JSON array of timed segments `[{start, end, text}]` |
| `model` | String | Whisper model used (`whisper-1`) |
| `transcribedAt` | String | ISO timestamp |
| `status` | String | `success` or `error` |
| `errorMsg` | String | Error description on failure, `null` on success |

#### Example output record

```json
{
  "sourceUrl": "https://example.com/podcast-ep1.mp3",
  "transcript": "Welcome to today's episode. Today we're discussing the future of AI...",
  "language": "english",
  "durationSeconds": 1823.4,
  "segments": "[{\"start\":0.0,\"end\":3.2,\"text\":\"Welcome to today's episode.\"}]",
  "model": "whisper-1",
  "transcribedAt": "2026-05-26T12:00:00Z",
  "status": "success",
  "errorMsg": null
}
```
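
Since `segments` arrives as a JSON string rather than a nested array, it has to be decoded before use. As a minimal sketch tied to the subtitles use case, the hypothetical helpers below (`srt_timestamp` and `record_to_srt` are illustrative names, not part of the Actor) turn one successful record into SRT caption text, assuming each segment carries `start`, `end`, and `text` as documented above.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def record_to_srt(record: dict) -> str:
    """Build SRT caption text from one successful dataset record."""
    segments = json.loads(record["segments"])  # the field is a JSON string, not a list
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Caption blocks are separated by one blank line
    return "\n".join(blocks)
```

Error records should be filtered out first (`status == "error"`), since their `segments` field may not contain valid JSON.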

### Requirements

- **OpenAI API key** — Bring your own key at `https://platform.openai.com/api-keys`. Whisper pricing is approximately $0.006 per minute of audio (billed by OpenAI to your account).
- **Public audio URLs** — Files must be publicly accessible without authentication.

### Pricing

This Actor charges **$0.10 per start** + **$0.001 per file processed** (including error records). OpenAI Whisper API costs are separate and billed directly to your OpenAI account.
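
To see how the two bills combine, the following sketch estimates the cost of one run from the audio durations. It uses only the prices quoted in this README; the Whisper rate is approximate, and how OpenAI rounds billed duration is not covered here, so treat the result as an estimate. The function name `estimate_run_cost` is illustrative.

```python
from decimal import Decimal

ACTOR_START_USD = Decimal("0.10")          # per run start (from this README)
ACTOR_PER_FILE_USD = Decimal("0.001")      # per file processed, error records included
WHISPER_PER_MINUTE_USD = Decimal("0.006")  # approximate OpenAI rate quoted above


def estimate_run_cost(durations_seconds: list) -> dict:
    """Estimate Actor and OpenAI charges for one run, in USD."""
    files = len(durations_seconds)
    minutes = Decimal(str(sum(durations_seconds))) / Decimal(60)
    actor = ACTOR_START_USD + ACTOR_PER_FILE_USD * files
    openai = (WHISPER_PER_MINUTE_USD * minutes).quantize(Decimal("0.0001"))
    return {"actor_usd": actor, "openai_usd": openai, "total_usd": actor + openai}
```

For example, two files of 10 and 20 minutes come to $0.102 on the Apify side and about $0.18 on the OpenAI side.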

### Error handling

Files that fail to download or transcribe are not dropped: the Actor saves an error record to the dataset with `status: "error"` and a descriptive `errorMsg`. Your dataset therefore always has one row per input URL, which makes reconciliation straightforward.
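
The pattern described here, combined with the limit of 3 concurrent files mentioned earlier, can be sketched with an `asyncio.Semaphore`. This is not the Actor's source code: `process_all` and the caller-supplied `transcribe` coroutine are hypothetical, and the record contains only a subset of the documented output fields.

```python
import asyncio
from datetime import datetime, timezone

MAX_CONCURRENCY = 3  # matches the "up to 3 files concurrently" note in this README


async def process_all(urls, transcribe):
    """Run `transcribe(url)` for every URL, at most MAX_CONCURRENCY at a time.

    `transcribe` is a caller-supplied coroutine function (hypothetical here).
    Any exception it raises becomes an error record instead of aborting the batch.
    """
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)

    async def process_one(url):
        record = {
            "sourceUrl": url,
            "transcript": None,
            "status": "success",
            "errorMsg": None,
        }
        async with semaphore:
            try:
                record["transcript"] = await transcribe(url)
            except Exception as exc:  # keep one row per URL, even on failure
                record["status"] = "error"
                record["errorMsg"] = str(exc)
        record["transcribedAt"] = datetime.now(timezone.utc).isoformat()
        return record

    # gather() preserves input order, so records line up with `urls`
    return await asyncio.gather(*(process_one(url) for url in urls))
```

Catching the exception inside each task, rather than around `gather()`, is what keeps a single bad file from discarding the results of the others.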

Common errors:

- `HTTP 401` — Invalid API key
- `HTTP 429` — OpenAI rate limit exceeded (retry with fewer files or lower concurrency)
- `File exceeds 25 MB limit` — Source file too large for Whisper API
- `Download timed out` — URL not reachable within 60 seconds
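
On the consuming side, the one-row-per-URL guarantee makes it simple to match results back to the input. The sketch below is a hypothetical helper pair: `reconcile` splits records by `status`, and `worth_retrying` guesses whether a rerun might help. The substrings it matches are assumptions based on the common errors listed above, since the exact `errorMsg` wording is not specified.

```python
def reconcile(input_urls, records):
    """Split dataset records into successes, failures, and URLs with no record."""
    by_url = {r["sourceUrl"]: r for r in records}
    succeeded = [u for u in input_urls if by_url.get(u, {}).get("status") == "success"]
    failed = {
        u: by_url[u]["errorMsg"]
        for u in input_urls
        if by_url.get(u, {}).get("status") == "error"
    }
    # URLs with no record at all, e.g. if the list was longer than maxItems
    missing = [u for u in input_urls if u not in by_url]
    return succeeded, failed, missing


def worth_retrying(error_msg: str) -> bool:
    """Guess whether rerunning might help (rate limits and timeouts only)."""
    transient_markers = ("429", "timed out")  # assumed substrings, see lead-in
    return any(marker in error_msg for marker in transient_markers)
```

An invalid API key or an oversized file will fail again on retry, so only the transient cases are flagged.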

# Actor input Schema

## `sp_intended_usage` (type: `string`):

Please describe how you plan to use the data extracted by this crawler.

## `sp_improvement_suggestions` (type: `string`):

Provide any feedback or suggestions for improvements.

## `sp_contact` (type: `string`):

Provide your email address so we can get in touch with you.

## `audioUrls` (type: `array`):

List of public audio file URLs to transcribe. Supported formats: MP3, MP4, M4A, WAV, WEBM, OGG, FLAC. Max file size: 25 MB per file.

## `openaiApiKey` (type: `string`):

Your OpenAI API key (sk-...). Required. The key is used to call the Whisper transcription API. It is not stored or logged.

## `language` (type: `string`):

Optional ISO 639-1 language code (e.g. "en", "es", "fr", "ja"). Supplying this improves accuracy and speed. If omitted, Whisper auto-detects the language.

## `maxItems` (type: `integer`):

Maximum number of audio files to transcribe per run.

## Actor input object example

```json
{
  "sp_intended_usage": "Describe your intended use...",
  "sp_improvement_suggestions": "Share your suggestions here...",
  "sp_contact": "Share your email here...",
  "audioUrls": [
    "https://filesamples.com/samples/audio/mp3/sample3.mp3"
  ],
  "openaiApiKey": "<YOUR_OPENAI_API_KEY>",
  "maxItems": 5
}
```

# Actor output Schema

## `results` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "sp_intended_usage": "Describe your intended use...",
    "sp_improvement_suggestions": "Share your suggestions here...",
    "sp_contact": "Share your email here...",
    "audioUrls": [
        "https://filesamples.com/samples/audio/mp3/sample3.mp3"
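    ],
    // Required by the input schema; replace with your own OpenAI API key
    "openaiApiKey": "<YOUR_OPENAI_API_KEY>",
    "maxItems": 5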
};

// Run the Actor and wait for it to finish
const run = await client.actor("jungle_synthesizer/ai-audio-to-text-transcriber").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "sp_intended_usage": "Describe your intended use...",
    "sp_improvement_suggestions": "Share your suggestions here...",
    "sp_contact": "Share your email here...",
    "audioUrls": ["https://filesamples.com/samples/audio/mp3/sample3.mp3"],
    # Required by the input schema; replace with your own OpenAI API key
    "openaiApiKey": "<YOUR_OPENAI_API_KEY>",
    "maxItems": 5,
}

# Run the Actor and wait for it to finish
run = client.actor("jungle_synthesizer/ai-audio-to-text-transcriber").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "sp_intended_usage": "Describe your intended use...",
  "sp_improvement_suggestions": "Share your suggestions here...",
  "sp_contact": "Share your email here...",
  "audioUrls": [
    "https://filesamples.com/samples/audio/mp3/sample3.mp3"
  ],
  "openaiApiKey": "<YOUR_OPENAI_API_KEY>",
  "maxItems": 5
}' |
apify call jungle_synthesizer/ai-audio-to-text-transcriber --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=jungle_synthesizer/ai-audio-to-text-transcriber",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "AI Audio to Text Transcriber",
        "description": "Transcribe audio files to text using OpenAI Whisper. Accepts public audio URLs (MP3, MP4, M4A, WAV, WEBM, OGG, FLAC) and returns full transcripts with language, duration, and timed segments. BYO OpenAI key required.",
        "version": "0.1",
        "x-build-id": "xMLDb6disQZhjJj0d"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/jungle_synthesizer~ai-audio-to-text-transcriber/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-jungle_synthesizer-ai-audio-to-text-transcriber",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/jungle_synthesizer~ai-audio-to-text-transcriber/runs": {
            "post": {
                "operationId": "runs-sync-jungle_synthesizer-ai-audio-to-text-transcriber",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/jungle_synthesizer~ai-audio-to-text-transcriber/run-sync": {
            "post": {
                "operationId": "run-sync-jungle_synthesizer-ai-audio-to-text-transcriber",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "sp_intended_usage",
                    "sp_improvement_suggestions",
                    "audioUrls",
                    "openaiApiKey"
                ],
                "properties": {
                    "sp_intended_usage": {
                        "title": "What is the intended usage of this data?",
                        "minLength": 1,
                        "type": "string",
                        "description": "Please describe how you plan to use the data extracted by this crawler."
                    },
                    "sp_improvement_suggestions": {
                        "title": "How can we improve this crawler for you?",
                        "minLength": 1,
                        "type": "string",
                        "description": "Provide any feedback or suggestions for improvements."
                    },
                    "sp_contact": {
                        "title": "Contact Email",
                        "minLength": 1,
                        "type": "string",
                        "description": "Provide your email address so we can get in touch with you."
                    },
                    "audioUrls": {
                        "title": "Audio URLs",
                        "type": "array",
                        "description": "List of public audio file URLs to transcribe. Supported formats: MP3, MP4, M4A, WAV, WEBM, OGG, FLAC. Max file size: 25 MB per file.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "openaiApiKey": {
                        "title": "OpenAI API Key",
                        "type": "string",
                        "description": "Your OpenAI API key (sk-...). Required. The key is used to call the Whisper transcription API. It is not stored or logged."
                    },
                    "language": {
                        "title": "Language Hint (Optional)",
                        "type": "string",
                        "description": "Optional ISO 639-1 language code (e.g. \"en\", \"es\", \"fr\", \"ja\"). Supplying this improves accuracy and speed. If omitted, Whisper auto-detects the language."
                    },
                    "maxItems": {
                        "title": "Max Items",
                        "type": "integer",
                        "description": "Maximum number of audio files to transcribe per run.",
                        "default": 15
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
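
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be used with the standard library alone. This is a minimal sketch under two assumptions that the specification does not state: that the response body is a JSON array of dataset items, and that a 300-second client timeout is long enough for your files. The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/jungle_synthesizer~ai-audio-to-text-transcriber/run-sync-get-dataset-items"


def build_request(apify_token: str, actor_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request described by the specification."""
    query = urllib.parse.urlencode({"token": apify_token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor(apify_token: str, actor_input: dict, timeout: float = 300.0) -> list:
    """Send the request and return the parsed dataset items (assumed JSON array)."""
    request = build_request(apify_token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the URL and body easy to inspect or log (without the token) before any billed run is started.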
