# YouTube Transcript Search (`lead_maker/youtube-transcript-keyword-search`) Actor

Search YouTube video transcripts and captions for any keyword or phrase. Scan entire channels, get exact timestamps, preview clips, and export results as CSV. Find what was said, when it was said, and download the clip.

- **URL**: https://apify.com/lead_maker/youtube-transcript-keyword-search.md
- **Developed by:** [Lead Maker](https://apify.com/lead_maker) (community)
- **Categories:** Automation, Videos, Lead generation
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $6.00 per 1,000 results.

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools that run on the Apify platform and cover all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action that can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server that can be used as a website, an API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full text](https://docs.apify.com/llms-full.txt).


# README

## YouTube Transcript Search — Caption Keyword Finder & Clip Extractor

Search **YouTube video transcripts** and **captions** for any keyword or phrase. Scan entire channels to find exactly when a topic is mentioned — with **exact timestamps**, **transcript context**, and **clip download commands**.

Stop scrubbing through hours of video. This **YouTube transcript search tool** scans every video in a channel and tells you the exact second a word was spoken — across **hundreds of videos in minutes**.

### 🎯 Best For

- **Content research** — find every video that mentions your topic, with exact timecodes
- **Brand monitoring** — track **brand mentions** across YouTube channels
- **Competitive analysis** — see what competitors say about specific subjects
- **Academic research** — locate references to papers, theories, or people across lecture channels
- **Clip extraction** — get download commands for specific video segments
- **Podcast production** — find sound bites and quotes from interview channels

### 📊 What You Get

- **Exact timestamps** where the keyword appears in the transcript
- **Clickable YouTube links** that jump to that moment in the video
- **Transcript excerpts** with keywords highlighted
- **yt-dlp download commands** for each matching clip
- **Interactive HTML report** with video thumbnails and copy buttons
- **CSV export** for spreadsheets and data analysis
- **Batch download scripts** (.bat and .sh) to grab all clips at once

### ⚙️ How It Works

**Step 1 — Enter channels and keywords**

Paste **YouTube channel URLs** and the **words or phrases** to search for. Supports multiple channels and multiple keywords — comma-separated or one per line.

**Step 2 — Automatic transcript scanning**

The Actor fetches every video's transcript and searches for your keywords. **YouTube Shorts** are filtered out automatically. Results stream to your dataset in real time.

**Step 3 — Browse and download**

Open the **HTML report** to preview results visually. Click thumbnails to preview on YouTube. Copy **yt-dlp commands** to download clips. Export as **CSV**.
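
Step 2 can be pictured as a substring search over timestamped transcript segments. The following is a minimal sketch under stated assumptions, not the Actor's actual code: the segment shape (`text`, `start`, `duration`) and the function name `find_matches` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def find_matches(segments: List[Dict], keywords: List[str], context_lines: int = 3) -> List[Dict]:
    """Return one record per (segment, keyword) hit, with surrounding context.

    `segments` is assumed to be a list of dicts with "text" and "start" keys.
    """
    lowered = [k.strip().lower() for k in keywords if k.strip()]
    matches = []
    for i, seg in enumerate(segments):
        text = seg["text"].lower()
        for kw in lowered:
            if kw in text:  # case-insensitive substring match
                lo = max(0, i - context_lines)
                hi = min(len(segments), i + context_lines + 1)
                matches.append({
                    "keyword": kw,
                    "start": seg["start"],
                    # Join the neighbouring lines to form the excerpt.
                    "excerpt": " ".join(s["text"] for s in segments[lo:hi]),
                })
    return matches
```

A per-segment search like this would miss a phrase that is split across two caption lines; a fuller implementation might search a sliding window of joined segments instead.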

### 🔑 Key Features

- **YouTube transcript search** — full-text search across all video captions
- **YouTube caption search** — works with auto-generated and manual captions
- **Multiple keywords** — search for several terms at once
- **Multiple channels** — scan across several channels in one run
- **Smart clip merging** — nearby mentions combine into single clips
- **Minimum clip duration** — filter out brief passing mentions
- **Exclude Shorts** — skip videos under 60 seconds
- **HTML report** — interactive report with thumbnails and copy buttons
- **CSV export** — spreadsheet-ready results
- **Batch download** — scripts to download all clips at once
- **Abort-safe** — results saved every 25 videos
- **No API key** — works without YouTube Data API credentials
- **Residential proxies** — avoids YouTube rate limiting
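
The clip merging and minimum-duration features above can be sketched as follows. This is an illustrative guess at the logic, assuming match times are given in seconds; `build_clips` and the keys of the returned dicts are hypothetical names, and the Actor's real behaviour may differ.

```python
from typing import Dict, List


def build_clips(match_times: List[float], padding: float = 15,
                merge_gap: float = 30, min_duration: float = 0) -> List[Dict]:
    """Turn keyword match times (seconds) into padded, merged clips."""
    clips: List[Dict] = []
    for t in sorted(match_times):
        start, end = max(0, t - padding), t + padding
        # Merge into the previous clip when the mention is close enough.
        # A merge_gap of 0 keeps every mention separate.
        if clips and merge_gap > 0 and t - clips[-1]["last_match"] <= merge_gap:
            clips[-1]["end"] = end
            clips[-1]["last_match"] = t
            clips[-1]["mentions"] += 1
        else:
            clips.append({"start": start, "end": end, "last_match": t, "mentions": 1})
    # Drop clips shorter than the requested minimum length.
    return [c for c in clips if c["end"] - c["start"] >= min_duration]
```

With the defaults, matches at 100 s and 120 s fall within the 30-second gap and become one clip covering 85 s to 135 s, while a match at 300 s starts a new clip.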

### 📝 Input

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `channelUrls` | string | Yes | — | **YouTube channel URLs**, comma-separated or one per line |
| `keywords` | string | Yes | — | **Search terms**, comma-separated |
| `maxVideos` | integer | No | all | Max videos to scan per channel |
| `excludeShorts` | boolean | No | true | Skip **YouTube Shorts** |
| `minClipDurationSec` | integer | No | 0 | Minimum clip length in seconds |
| `snippetPaddingSec` | integer | No | 15 | Padding before/after match |
| `mergeGapSec` | integer | No | 30 | Merge mentions within this gap |
| `contextLines` | integer | No | 3 | Context lines around each match |
| `language` | string | No | `en` | Transcript language preference |
| `concurrency` | integer | No | 5 | Number of transcripts fetched in parallel (1 to 10) |

#### Example Input

```json
{
    "channelUrls": "https://www.youtube.com/@veritasium",
    "keywords": "quantum, relativity",
    "maxVideos": 100
}
````
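
Because `channelUrls` and `keywords` are plain strings, a caller may want to assemble them from lists or check how a value will be split. The helper below is a small sketch of that idea; `split_list` is a hypothetical name, and the exact splitting rules used by the Actor are an assumption based on the "comma-separated or one per line" description.

```python
import re
from typing import List


def split_list(raw: str) -> List[str]:
    """Split a comma- or newline-separated string into trimmed, non-empty parts."""
    return [part.strip() for part in re.split(r"[,\n]", raw) if part.strip()]


def join_list(values: List[str]) -> str:
    """Build the comma-separated string form expected by the input fields."""
    return ", ".join(v.strip() for v in values if v.strip())
```

For example, `join_list(["quantum", "relativity"])` produces the `keywords` value used in the example input.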

### 💰 Pricing

Apify charges for compute time, proxy bandwidth, and storage. Typical costs:

| Videos scanned | Approximate cost |
|---------------|-----------------|
| 10 videos | ~$0.05 |
| 50 videos | ~$0.25 |
| 100 videos | ~$0.50 |
| 500 videos | ~$2.50 |

Costs vary based on transcript length, proxy usage, and result count. Set a **max cost per run** in Apify to control spending.
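
The figures in the table scale linearly at roughly $0.005 per video scanned. A back-of-the-envelope estimator, assuming that rate holds (it is inferred from the table, not a published price), could look like this:

```python
# Assumed rate inferred from the table above: about $5.00 per 1,000 videos.
USD_PER_1000_VIDEOS = 5


def estimate_cost_usd(videos_scanned: int) -> float:
    """Rough cost estimate in USD; real costs vary with transcripts and proxy use."""
    return videos_scanned * USD_PER_1000_VIDEOS / 1000
```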

### 📤 Sample Output

```json
{
    "channelName": "veritasium",
    "videoTitle": "The Internet Was Weeks Away From Disaster and No One Knew",
    "keyword": "science",
    "timestampFormatted": "00:05:12",
    "videoUrlAtTimestamp": "https://www.youtube.com/watch?v=aoag03mSuXQ&t=312",
    "transcriptExcerpt": "...was a young computer **science** student who just happened to be building his own kernel...",
    "mentionCount": 1,
    "ytDlpCommand": "yt-dlp --download-sections \"*00:04:57-00:05:30\" -o \"The_Internet_Was_Weeks_Away_00-04-57.%(ext)s\" \"https://www.youtube.com/watch?v=aoag03mSuXQ\""
}
```
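
The relationship between the timestamp, the link, and the download command in the sample can be reproduced with a few helpers. This is a sketch for working with the output, not the Actor's own code; all function names are hypothetical. It relies only on the `--download-sections "*START-END"` option of `yt-dlp` that appears in the sample.

```python
def format_ts(seconds: int) -> str:
    """Format a number of seconds as HH:MM:SS."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def parse_ts(ts: str) -> int:
    """Convert an HH:MM:SS string back to seconds."""
    hours, minutes, secs = (int(part) for part in ts.split(":"))
    return hours * 3600 + minutes * 60 + secs


def video_url_at(video_id: str, seconds: int) -> str:
    """Build a YouTube link that starts playback at the given second."""
    return f"https://www.youtube.com/watch?v={video_id}&t={seconds}"


def yt_dlp_command(video_id: str, start: int, end: int, out_name: str) -> str:
    """Build a yt-dlp command that downloads only the start-end section."""
    section = f"*{format_ts(start)}-{format_ts(end)}"
    url = f"https://www.youtube.com/watch?v={video_id}"
    return f'yt-dlp --download-sections "{section}" -o "{out_name}.%(ext)s" "{url}"'
```

In the sample, `00:05:12` is 312 seconds, which matches `t=312` in the link, and the clip start of `00:04:57` is 15 seconds earlier, consistent with the default `snippetPaddingSec`.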

### 🏢 Built for Teams & Automation

- **API integration** — trigger runs via the Apify API
- **Scheduling** — recurring searches on any cadence
- **Webhooks** — notifications when runs complete
- **Integrations** — Google Sheets, Slack, Zapier, Make, and more

### ❓ FAQ

**Does it work with auto-generated captions?**
Yes — uses whatever transcript is available, including YouTube's **auto-generated captions**.

**What languages are supported?**
Any language with available transcripts. Set the `language` parameter.

**How long does a full channel take?**
About **15-25 minutes** for 500 videos. Use `maxVideos` for faster runs.

**What if I stop the run early?**
Results save every 25 videos — you keep everything including the HTML report and CSV.

**Can I search multiple channels?**
Yes. Separate the channel URLs with commas or put one per line.

**Do I need a YouTube API key?**
No. No Google API credentials required.

**Why are some videos skipped?**
Videos are skipped when they have no transcript, are age-restricted, or hit a temporary YouTube rate limit. The Actor logs which videos are skipped and why.

### 🔄 How Does This Compare?

**vs YouTube's built-in search** — YouTube finds videos *about* a topic. This Actor finds the exact *second* a word is spoken inside a video.

**vs full transcript scrapers** — Most scrapers dump raw text. This Actor **searches** it and returns only the relevant moments with timestamps and download commands.

**vs manual searching** — Opening each video and skimming captions takes hours. This scans an entire channel in minutes.

### 🔧 Technical Details

- `scrapetube` for channel video listing (no API key)
- `youtube-transcript-api` for timestamped transcripts
- Residential proxy rotation per request
- 2-minute timeout on channel listing
- Auto-saves every 25 videos
- 1.5s delay between requests
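
The pacing and auto-save behaviour listed above might be structured like the loop below. This is a sketch under assumptions: `scan_videos`, `process`, and `save` are hypothetical names, and the real Actor also fetches transcripts concurrently, which this sequential version leaves out.

```python
import time
from typing import Callable, List


def scan_videos(video_ids: List[str],
                process: Callable[[str], list],
                save: Callable[[list], None],
                delay: float = 1.5,
                save_every: int = 25,
                sleep: Callable[[float], None] = time.sleep) -> list:
    """Process videos one by one, pausing between requests and checkpointing."""
    results: list = []
    for n, video_id in enumerate(video_ids, start=1):
        results.extend(process(video_id))
        if n % save_every == 0:
            save(results)  # periodic checkpoint so an aborted run keeps its data
        if n < len(video_ids):
            sleep(delay)   # throttle to reduce the chance of rate limiting
    save(results)          # final save once the loop completes
    return results
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting.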

# Actor input Schema

## `channelUrls` (type: `string`):

Paste one or more YouTube channel URLs, separated by commas. Example: https://www.youtube.com/@veritasium, https://www.youtube.com/@3blue1brown

## `keywords` (type: `string`):

What to look for in the video transcripts. Separate multiple terms with commas. Example: black hole, dark matter

## `maxVideos` (type: `integer`):

Only search this many of the most recent videos. Leave empty to search the entire channel.

## `excludeShorts` (type: `boolean`):

Skip YouTube Shorts (videos under 60 seconds).

## `language` (type: `string`):

Language code for transcripts (e.g., 'en', 'es', 'de'). Falls back to English if unavailable.

## `snippetPaddingSec` (type: `integer`):

How many seconds of extra video to include before and after each match. Gives you more context around the keyword.

## `mergeGapSec` (type: `integer`):

If the keyword is mentioned multiple times within this many seconds, combine them into one clip instead of creating separate ones. Set to 0 to keep every mention separate.

## `minClipDurationSec` (type: `integer`):

Skip clips shorter than this. Useful for filtering out brief, passing mentions and only keeping longer discussions.

## `concurrency` (type: `integer`):

How many video transcripts to fetch at the same time. Higher = faster but uses more proxy bandwidth. Default 5 is a good balance.

## `contextLines` (type: `integer`):

How many lines of transcript to show before and after each match in the results.

## Actor input object example

```json
{
  "channelUrls": "https://www.youtube.com/@veritasium",
  "keywords": "science",
  "maxVideos": 10,
  "excludeShorts": true,
  "language": "en",
  "snippetPaddingSec": 15,
  "mergeGapSec": 30,
  "minClipDurationSec": 0,
  "concurrency": 5,
  "contextLines": 3
}
```
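
Before starting a run, it can help to check an input object locally. The sketch below uses the required fields and numeric bounds from the OpenAPI specification later in this document; `validate_input` and the error message wording are hypothetical, and the platform performs its own validation regardless.

```python
from typing import Dict, List

# (minimum, maximum) per integer field, taken from the OpenAPI input schema.
LIMITS = {
    "maxVideos": (1, None),
    "snippetPaddingSec": (0, 120),
    "mergeGapSec": (0, 300),
    "minClipDurationSec": (0, 600),
    "concurrency": (1, 10),
    "contextLines": (1, 20),
}
REQUIRED = ("channelUrls", "keywords")


def validate_input(run_input: Dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for key in REQUIRED:
        if not str(run_input.get(key, "")).strip():
            errors.append(f"{key} is required")
    for key, (lo, hi) in LIMITS.items():
        if key not in run_input:
            continue
        value = run_input[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append(f"{key} must be an integer")
        elif value < lo or (hi is not None and value > hi):
            errors.append(f"{key} is out of range")
    return errors
```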

# Actor output Schema

## `report` (type: `string`):

HTML report with embedded YouTube players for each match, copy-to-clipboard buttons for yt-dlp commands, and a per-video breakdown table.

## `results` (type: `string`):

Dataset with all matches including channel name, video title, keyword, timestamp, transcript excerpt, YouTube timestamp link, and yt-dlp command.

## `csv` (type: `string`):

All results as a CSV spreadsheet for use in Excel or Google Sheets.

## `summary` (type: `string`):

JSON summary with total mentions, clips, videos searched, and per-video breakdown ranked by mention count.
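
A per-video breakdown similar to the `summary` output can also be derived client-side from the dataset items. The sketch below assumes each item carries the `videoTitle` and `mentionCount` fields shown in the sample output; the keys of the returned dict are hypothetical and are not the Actor's actual summary format.

```python
from collections import Counter
from typing import Dict, List


def summarize(items: List[Dict]) -> Dict:
    """Aggregate dataset items into totals and a ranked per-video breakdown."""
    per_video: Counter = Counter()
    for item in items:
        per_video[item["videoTitle"]] += item.get("mentionCount", 1)
    return {
        "totalMentions": sum(per_video.values()),
        "clips": len(items),
        "videosWithMatches": len(per_video),
        # List of (title, mentions) pairs, highest mention count first.
        "perVideo": per_video.most_common(),
    }
```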

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "channelUrls": "https://www.youtube.com/@veritasium",
    "keywords": "science",
    "maxVideos": 10
};

// Run the Actor and wait for it to finish
const run = await client.actor("lead_maker/youtube-transcript-keyword-search").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "channelUrls": "https://www.youtube.com/@veritasium",
    "keywords": "science",
    "maxVideos": 10,
}

# Run the Actor and wait for it to finish
run = client.actor("lead_maker/youtube-transcript-keyword-search").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "channelUrls": "https://www.youtube.com/@veritasium",
  "keywords": "science",
  "maxVideos": 10
}' |
apify call lead_maker/youtube-transcript-keyword-search --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=lead_maker/youtube-transcript-keyword-search",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "YouTube Transcript Search",
        "description": "Search YouTube video transcripts and captions for any keyword or phrase. Scan entire channels, get exact timestamps, preview clips, and export results as CSV. Find what was said, when it was said, and download the clip.",
        "version": "0.1",
        "x-build-id": "1dCJe86GHq3XlSeZX"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/lead_maker~youtube-transcript-keyword-search/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-lead_maker-youtube-transcript-keyword-search",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/lead_maker~youtube-transcript-keyword-search/runs": {
            "post": {
                "operationId": "runs-sync-lead_maker-youtube-transcript-keyword-search",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/lead_maker~youtube-transcript-keyword-search/run-sync": {
            "post": {
                "operationId": "run-sync-lead_maker-youtube-transcript-keyword-search",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "channelUrls",
                    "keywords"
                ],
                "properties": {
                    "channelUrls": {
                        "title": "Channel URL(s)",
                        "type": "string",
                        "description": "Paste one or more YouTube channel URLs, separated by commas. Example: https://www.youtube.com/@veritasium, https://www.youtube.com/@3blue1brown"
                    },
                    "keywords": {
                        "title": "Search Keywords or Phrases",
                        "type": "string",
                        "description": "What to look for in the video transcripts. Separate multiple terms with commas. Example: black hole, dark matter"
                    },
                    "maxVideos": {
                        "title": "Max Videos per Channel",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Only search this many of the most recent videos. Leave empty to search the entire channel."
                    },
                    "excludeShorts": {
                        "title": "Exclude YouTube Shorts",
                        "type": "boolean",
                        "description": "Skip YouTube Shorts (videos under 60 seconds).",
                        "default": true
                    },
                    "language": {
                        "title": "Transcript Language",
                        "type": "string",
                        "description": "Language code for transcripts (e.g., 'en', 'es', 'de'). Falls back to English if unavailable.",
                        "default": "en"
                    },
                    "snippetPaddingSec": {
                        "title": "Clip Padding",
                        "minimum": 0,
                        "maximum": 120,
                        "type": "integer",
                        "description": "How many seconds of extra video to include before and after each match. Gives you more context around the keyword.",
                        "default": 15
                    },
                    "mergeGapSec": {
                        "title": "Merge Nearby Mentions",
                        "minimum": 0,
                        "maximum": 300,
                        "type": "integer",
                        "description": "If the keyword is mentioned multiple times within this many seconds, combine them into one clip instead of creating separate ones. Set to 0 to keep every mention separate.",
                        "default": 30
                    },
                    "minClipDurationSec": {
                        "title": "Minimum Clip Length",
                        "minimum": 0,
                        "maximum": 600,
                        "type": "integer",
                        "description": "Skip clips shorter than this. Useful for filtering out brief, passing mentions and only keeping longer discussions.",
                        "default": 0
                    },
                    "concurrency": {
                        "title": "Speed (Concurrent Fetches)",
                        "minimum": 1,
                        "maximum": 10,
                        "type": "integer",
                        "description": "How many video transcripts to fetch at the same time. Higher = faster but uses more proxy bandwidth. Default 5 is a good balance.",
                        "default": 5
                    },
                    "contextLines": {
                        "title": "Transcript Context",
                        "minimum": 1,
                        "maximum": 20,
                        "type": "integer",
                        "description": "How many lines of transcript to show before and after each match in the results.",
                        "default": 3
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
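
For projects that call the REST API directly, the `run-sync-get-dataset-items` path from the specification above can be used with only the Python standard library. This is a minimal sketch, assuming the endpoint returns the dataset items as a JSON array (its default format); `build_request` and `run_sync` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/lead_maker~youtube-transcript-keyword-search/run-sync-get-dataset-items"


def build_request(token: str, run_input: dict) -> urllib.request.Request:
    """Prepare the POST request; the token goes in the query string per the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(run_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(token: str, run_input: dict, timeout: float = 300) -> list:
    """Send the request and decode the response body as JSON."""
    with urllib.request.urlopen(build_request(token, run_input), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Since a full-channel scan can take many minutes, a synchronous call may not be a good fit for large runs; starting the run through the `/runs` endpoint and reading the dataset afterwards is likely the safer pattern there.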
