# Youtube Scraper (`scraper-engine/youtube-scraper`) Actor

🎥 YouTube Scraper extracts structured data from videos, channels & playlists — titles, tags, views, likes, comments, captions, thumbnails & publish dates. 🔎 Perfect for SEO, competitor analysis, research & reporting. 🚀 Export-ready for CSV/JSON pipelines.

- **URL**: https://apify.com/scraper-engine/youtube-scraper.md
- **Developed by:** [Scraper Engine](https://apify.com/scraper-engine) (community)
- **Categories:** Videos, SEO tools, Developer tools
- **Stats:** 1 total user, 0 monthly users, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $4.99 / 1,000 results

This Actor is paid per event and usage. You are charged both the fixed price for specific events and for Apify platform usage.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).
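As a rough illustration of the REST route, the sketch below builds a request for the `run-sync-get-dataset-items` endpoint listed in this Actor's OpenAPI specification, using only the Python standard library. The function names and the timeout value are illustrative assumptions, and the response is assumed to arrive in the default JSON format.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "scraper-engine~youtube-scraper"


def build_run_sync_request(token: str, actor_input: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a POST request that runs the Actor synchronously."""
    query = urllib.parse.urlencode({"token": token})
    url = f"{API_BASE}/acts/{ACTOR_PATH}/run-sync-get-dataset-items?{query}"
    body = json.dumps(actor_input).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch_items(token: str, actor_input: Dict[str, Any], timeout: float = 300.0) -> List[Any]:
    """Send the request and decode the dataset items (assumes a JSON array response)."""
    request = build_run_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_and_fetch_items("<YOUR_API_TOKEN>", {"searchTerms": ["Crawlee"]})` would block until the run finishes, so long scrapes may be better served by the asynchronous `/runs` endpoint.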

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

### Youtube Scraper

The Youtube Scraper is a production-ready Apify Actor that extracts structured data from YouTube search results and direct video URLs, including titles, views, likes, comment counts, subscriber counts, descriptions, hashtags, thumbnails, publish dates, and optional transcripts/subtitles. It collects clean, export-ready YouTube video metadata at scale without the official API, which makes it a good fit for marketers, developers, data analysts, and researchers. With robust anti-blocking, it supports reliable pipelines for competitor analysis, SEO tracking, and reporting.

### What data / output can you get?

Below are the main fields pushed to the Apify dataset by the Youtube Scraper. These map directly to the actor’s output and are ready to export as CSV, JSON, or Excel.

| Data type | Description | Example value |
| --- | --- | --- |
| title | Video title | “How to use Crawlee in 10 minutes” |
| type | Content type (video or shorts) | “video” |
| id | YouTube video ID | “dQw4w9WgXcQ” |
| url | Canonical URL (shorts/videos) | “https://www.youtube.com/watch?v=dQw4w9WgXcQ” |
| thumbnailUrl | High-quality thumbnail URL | “https://i.ytimg.com/vi/dQw4w9WgXcQ/hq720.jpg” |
| viewCount | Parsed integer view count | 1500000 |
| date | ISO-like published date (when available) | “2025-01-15T00:00:00.000Z” |
| likes | Total likes (when detected) | 50000 |
| duration | HH:MM:SS (or null if unknown) | “00:03:33” |
| channelName | Channel display name | “Channel Name” |
| channelUrl | Absolute channel URL | “https://www.youtube.com/@channelname” |
| numberOfSubscribers | Subscriber count (when available) | 1000000 |
| commentsCount | Total comments (when detected) | 12000 |
| text | Description/snippet text | “Video description…” |
| descriptionLinks | URLs and hashtag links extracted from description | [{"url":"https://example.com","text":"https://example.com"}] |
| subtitles | Available subtitle language codes (when detected) | ["en","es"] |
| hashtags | Hashtags from title/description | ["#example","#tutorial"] |
| fromYTUrl | Source YouTube results/seed URL | “https://www.youtube.com/results?search_query=crawlee” |
| order | Item index in run | 0 |
| isCreativeCommons | Creative Commons flag (best-effort) | true |
| isPurchased | Purchased/paid flag (best-effort) | false |

Bonus (when subtitles are downloaded): transcript, transcriptLanguage, transcriptFormat. Additional flags include commentsTurnedOff, isMonetized (when present). All outputs are pushed via Actor.pushData for seamless exports to CSV/JSON/Excel.

### Key features

- 🛡️ **Smart anti-blocking & proxy escalation** — Automatically escalates from direct → Apify datacenter → Apify residential with retries, then sticks to a working level for the rest of the run.
- 🧪 **Realistic HTTP fingerprinting** — Uses the impit HTTP client to impersonate modern browsers and bypass TLS/HTTP fingerprinting checks reliably.
- ⚡ **Concurrent metadata enrichment** — Batch-fetches video pages with controlled concurrency to enrich likes, commentsCount, numberOfSubscribers, subtitles, and more.
- 🎯 **Flexible search filters** — Apply post-processing filters and sorting: dateFilter, videoTypeFilter, lengthFilter, sortingOrder, and sortBy for reliable ordering and selection.
- 🎞️ **Quality & format filters** — Filter for isHD, is4K, isHDR, is360, is3D, isVR180, hasCC, hasLocation, isCreativeCommons, and isPurchased to build high-signal datasets.
- 💬 **Transcripts & subtitles** — Toggle downloadSubtitles with subtitlesLanguage, subtitlesFormat (srt, text, timestamp), and preferAutoGenerated for broader coverage.
- 🔎 **YouTube search results scraper** — Scrape videos from search terms at scale with pagination and safety limits.
- 🔗 **Direct video URL support** — Provide a list of video URLs to extract complete metadata and optional transcripts.
- 📦 **Export-ready outputs** — Structured fields for straightforward analytics, making it a robust YouTube data scraper for CSV/JSON pipelines.

### How to use Youtube Scraper - step by step

1. Create or log in to your Apify account at console.apify.com.
2. Navigate to Actors and open “Youtube Scraper”.
3. Add input:
   - searchTerms as a list of keywords, or
   - startUrls with direct video URLs.
4. Configure limits and filters:
   - maxVideos, maxShorts, maxStreams per search term.
   - Quality/features (isHD, is4K, isHDR, is360, is3D, isVR180, hasCC, hasLocation, isCreativeCommons, isPurchased).
   - Sorting and post-filters (sortingOrder, dateFilter, videoTypeFilter, lengthFilter, sortBy).
5. Subtitles & transcripts:
   - Enable downloadSubtitles, choose subtitlesLanguage and subtitlesFormat, and optionally preferAutoGenerated or saveSubtitlesToKvs.
6. Proxy setup:
   - Leave default or set proxyConfiguration; auto-fallback is built in if blocks occur.
7. Click Start to run. Monitor progress in the Log tab (you’ll see page counts, filters applied, and proxy changes).
8. Access results in the Dataset tab and export to JSON, CSV, or Excel.

Pro tip: Connect the Apify dataset to your reporting or BI stack to feed SEO dashboards and competitor tracking.
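If a BI tool expects flat CSV rather than nested JSON, dataset items can be flattened locally. The sketch below is one possible approach: the column selection and the `items_to_csv` name are assumptions, and list or object fields such as `hashtags` are serialized as JSON strings so each item stays on one row.

```python
import csv
import io
import json
from typing import Any, Iterable, Mapping

# Illustrative subset of the documented output fields.
COLUMNS = ["id", "title", "url", "viewCount", "likes", "commentsCount",
           "channelName", "date", "hashtags"]


def items_to_csv(items: Iterable[Mapping[str, Any]]) -> str:
    """Render dataset items as CSV text with a fixed set of columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = {}
        for column in COLUMNS:
            value = item.get(column)
            if value is None:
                row[column] = ""  # null fields become empty cells
            elif isinstance(value, (list, dict)):
                row[column] = json.dumps(value, ensure_ascii=False)
            else:
                row[column] = value
        writer.writerow(row)
    return buffer.getvalue()
```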

### Use cases

| Use case name | Description |
| --- | --- |
| SEO teams — video metadata tracking | Track titles, views, likes, and publish dates to benchmark performance and optimize rankings using a YouTube video metadata scraper. |
| Competitor research — content analysis | Monitor competitor uploads, extract hashtags and descriptions, and compare engagement for a YouTube competitor analysis scraper workflow. |
| Keyword discovery — search SERP mining | Use searchTerms to collect top results for queries and build a YouTube keyword scraper dataset for content planning. |
| Research & NLP — transcript collection | Enable subtitles download to power NLP pipelines or topic modeling with a YouTube transcript scraper and subtitles extractor. |
| Reporting & BI — export-ready metrics | Export structured fields to CSV/JSON/Excel for dashboards and periodic performance reporting using a YouTube data extractor. |
| Live/short-form monitoring — format filters | Filter by isLive or collect Shorts with caps (maxShorts, maxStreams) to build specialized watchlists. |

### Why choose Youtube Scraper?

The Youtube Scraper is built for precision, scale, and reliability on the Apify platform.

- 🎯 Accurate metadata parsing from search + video pages (titles, views, likes, commentsCount, subscribers, hashtags).
- 🧬 Multiformat transcripts (SRT, text, timestamp) with language selection and auto-generated fallback support.
- 🚀 Scales with concurrency and robust pagination, ideal for batch YouTube data extraction.
- 🧩 Developer-friendly outputs with consistent JSON fields for analytics and ETL workflows.
- 🛡️ Safe, production-ready anti-blocking: automatic proxy escalation and browser impersonation via impit.
- 💰 Export-ready for CSV/JSON pipelines — perfect for SEO, reporting, and research.
- 🔄 More reliable than ad-hoc scripts or extensions, thanks to stable infrastructure and structured output.

In short: a dependable YouTube web scraping tool for teams that need consistent, structured video data at scale.

### Is it legal / ethical to use Youtube Scraper?

Yes — when done responsibly. The actor collects data from publicly available YouTube pages and does not access private or password-protected content.

Guidelines for responsible use:
- Only use data from public pages.
- Respect copyright and licensing (e.g., check Creative Commons details before reuse).
- Comply with applicable regulations (e.g., GDPR, CCPA) and YouTube’s terms.
- Consult your legal team for edge cases or sensitive applications.

### Input parameters & output format

Example JSON input
```json
{
  "searchTerms": ["Crawlee", "data extraction"],
  "maxVideos": 10,
  "maxShorts": 0,
  "maxStreams": 0,
  "downloadSubtitles": true,
  "saveSubtitlesToKvs": false,
  "subtitlesLanguage": "en",
  "preferAutoGenerated": false,
  "subtitlesFormat": "srt",
  "sortingOrder": "relevance",
  "dateFilter": "",
  "videoTypeFilter": "",
  "lengthFilter": "",
  "isHD": false,
  "hasCC": false,
  "isCreativeCommons": false,
  "is3D": false,
  "isLive": false,
  "isPurchased": false,
  "is4K": false,
  "is360": false,
  "hasLocation": false,
  "isHDR": false,
  "isVR180": false,
  "publishedAfter": "",
  "sortBy": "",
  "proxyConfiguration": { "useApifyProxy": false },
  "startUrls": []
}
````

Parameters (all optional; none are required):

- searchTerms (array) — Enter one or more YouTube search keywords. Default: [].
- maxVideos (integer) — Maximum regular videos per search term. Use 0 to skip. Default: 10.
- maxShorts (integer) — Maximum Shorts per search term. Use 0 to skip. Default: 0.
- maxStreams (integer) — Maximum live/upcoming streams per search term. Use 0 to skip. Default: 0.
- startUrls (array) — Provide direct YouTube video, channel, playlist, or results page URLs to scrape without using search terms. Default: [].
- downloadSubtitles (boolean) — Download subtitles/transcripts when available. Default: false.
- saveSubtitlesToKvs (boolean) — Store each transcript in the key-value store under its own key. Default: false.
- subtitlesLanguage (string) — Preferred language for subtitles/transcripts. Default: "en".
- preferAutoGenerated (boolean) — Prefer auto-generated subtitles. Default: false.
- subtitlesFormat (string) — "srt", "text", or "timestamp". Default: "srt".
- sortingOrder (string) — Post-processing sort: "", "relevance", "date", "viewCount", "rating". Default: "".
- dateFilter (string) — "", "hour", "today", "week", "month", "year". Default: "".
- videoTypeFilter (string) — "", "video", "channel", "playlist", "movie". Default: "".
- lengthFilter (string) — "", "short", "medium", "long". Default: "".
- isHD (boolean) — Only include HD videos (>=720p). Default: false.
- hasCC (boolean) — Require at least one non-auto CC track. Default: false.
- isCreativeCommons (boolean) — Include only Creative Commons videos. Default: false.
- is3D (boolean) — Include only 3D videos. Default: false.
- isLive (boolean) — Restrict to live/live-style content. Default: false.
- isPurchased (boolean) — Best-effort filter for purchased/paid content. Default: false.
- is4K (boolean) — Include only 4K (2160p) videos. Default: false.
- is360 (boolean) — Include only 360° videos. Default: false.
- hasLocation (boolean) — Include only videos with explicit location metadata. Default: false.
- isHDR (boolean) — Include only HDR videos. Default: false.
- isVR180 (boolean) — Include only VR180 videos. Default: false.
- publishedAfter (string) — Only include videos published after YYYY-MM-DD. Default: "".
- sortBy (string) — Post-sort by "", "date", "viewCount", or "likes". Default: "".
- proxyConfiguration (object) — Proxy settings; actor escalates if blocked. Default: {}.
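Since several parameters accept only a fixed set of strings, it can help to validate input locally before starting a paid run. The sketch below checks the enumerated values listed above, plus the 0 to 9999 range that the OpenAPI specification gives for the three limit fields. The `build_input` helper is hypothetical and is not part of any Apify library.

```python
from datetime import datetime
from typing import Any, Dict, Iterable

# Allowed values copied from the parameter list above.
ALLOWED_VALUES = {
    "subtitlesFormat": {"srt", "text", "timestamp"},
    "sortingOrder": {"", "relevance", "date", "viewCount", "rating"},
    "dateFilter": {"", "hour", "today", "week", "month", "year"},
    "videoTypeFilter": {"", "video", "channel", "playlist", "movie"},
    "lengthFilter": {"", "short", "medium", "long"},
    "sortBy": {"", "date", "viewCount", "likes"},
}
LIMIT_FIELDS = ("maxVideos", "maxShorts", "maxStreams")


def build_input(search_terms: Iterable[str] = (), start_urls: Iterable[str] = (),
                **options: Any) -> Dict[str, Any]:
    """Assemble an Actor input dict, raising ValueError on obviously bad values."""
    actor_input: Dict[str, Any] = {
        "searchTerms": list(search_terms),
        "startUrls": list(start_urls),
    }
    for name, value in options.items():
        if name in ALLOWED_VALUES and value not in ALLOWED_VALUES[name]:
            raise ValueError(f"{name}={value!r} is not one of {sorted(ALLOWED_VALUES[name])}")
        if name in LIMIT_FIELDS:
            if isinstance(value, bool) or not isinstance(value, int) or not 0 <= value <= 9999:
                raise ValueError(f"{name} must be an integer between 0 and 9999")
        if name == "publishedAfter" and value:
            # strptime raises ValueError when the date is not YYYY-MM-DD.
            datetime.strptime(value, "%Y-%m-%d")
        actor_input[name] = value
    return actor_input
```

For example, `build_input(["Crawlee"], maxVideos=5, sortBy="likes")` returns a dict that can be passed as the run input, while `subtitlesFormat="vtt"` raises a `ValueError` before any request is made.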

Example JSON output

```json
{
  "title": "How to use Crawlee in 10 minutes",
  "translatedTitle": null,
  "type": "video",
  "id": "dQw4w9WgXcQ",
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "thumbnailUrl": "https://i.ytimg.com/vi/dQw4w9WgXcQ/hq720.jpg",
  "viewCount": 1500000,
  "date": "2025-01-15T00:00:00.000Z",
  "likes": 50000,
  "location": null,
  "channelName": "Channel Name",
  "channelUrl": "https://www.youtube.com/@channelname",
  "channelUsername": "channelname",
  "collaborators": null,
  "channelId": "UCxxxxxxxxxxxxxxxxxxxxxxxx",
  "numberOfSubscribers": 1000000,
  "duration": "00:03:33",
  "commentsCount": 12000,
  "text": "Video description...",
  "translatedText": null,
  "descriptionLinks": [
    { "url": "https://example.com", "text": "https://example.com" }
  ],
  "subtitles": ["en"],
  "transcript": null,
  "transcriptLanguage": "en",
  "transcriptFormat": "srt",
  "order": 0,
  "commentsTurnedOff": false,
  "fromYTUrl": "https://www.youtube.com/results?search_query=crawlee",
  "isMonetized": null,
  "hashtags": ["#example"],
  "isCreativeCommons": true,
  "isPurchased": false
}
```

Note: Some fields may be null when not present on the page or when detection is not possible (e.g., likes, commentsCount, numberOfSubscribers, subtitles, transcript).
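Downstream code should therefore treat these fields as optional. The helpers below are a minimal sketch of null-safe handling: `duration_to_seconds` converts the documented HH:MM:SS string, and `engagement_rate` uses an assumed definition, (likes + comments) / views, which is not something the Actor itself computes.

```python
from typing import Any, Mapping, Optional


def duration_to_seconds(duration: Optional[str]) -> Optional[int]:
    """Convert an HH:MM:SS string to seconds; pass through a missing duration as None."""
    if not duration:
        return None
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def engagement_rate(item: Mapping[str, Any]) -> Optional[float]:
    """Return (likes + comments) / views, or None when views or likes are unavailable."""
    views = item.get("viewCount")
    likes = item.get("likes")
    comments = item.get("commentsCount")
    if not views or likes is None:
        return None
    # A missing comment count is treated as zero rather than discarding the item.
    return (likes + (comments or 0)) / views
```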

### FAQ

#### Do I need a YouTube API key?

No. The actor scrapes public web endpoints and page data directly, so no official YouTube API key is required.

#### Can this extract transcripts or subtitles?

Yes. Enable “downloadSubtitles,” choose a “subtitlesLanguage,” and select a “subtitlesFormat” (srt, text, or timestamp). You can also “preferAutoGenerated” and optionally “saveSubtitlesToKvs.”
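For NLP work, an SRT transcript usually needs to be split into timed cues. The sketch below assumes that, with `subtitlesFormat` set to `srt`, the `transcript` field holds standard SubRip text; the exact payload has not been verified here, and `parse_srt` and `Cue` are illustrative names.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    start_seconds: float
    end_seconds: float
    text: str


# Matches a SubRip timing line such as "00:00:01,000 --> 00:00:02,500".
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_seconds(hours: str, minutes: str, seconds: str, millis: str) -> float:
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(srt_text: str) -> List[Cue]:
    """Split SubRip text into cues; blocks without a timing line are skipped."""
    cues: List[Cue] = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for index, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                parts = match.groups()
                # Everything after the timing line is the cue text.
                text = " ".join(l.strip() for l in lines[index + 1:] if l.strip())
                cues.append(Cue(_to_seconds(*parts[:4]), _to_seconds(*parts[4:]), text))
                break
    return cues
```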

#### Does it scrape comments?

It extracts commentsCount when available, but it does not scrape individual comment bodies. The output includes totals and core engagement metrics.

#### Can I scrape YouTube Shorts and live streams?

Yes. Use maxShorts and maxStreams to control how many Shorts and live/upcoming streams are included per search term. You can also filter by isLive in post-processing.

#### Does it support direct URLs?

Yes — provide direct video URLs in startUrls to fetch full metadata and optional transcripts. The actor also works as a YouTube search results scraper via searchTerms.

#### How does the actor handle blocking?

It automatically escalates through connection levels: direct → Apify datacenter → Apify residential (with retries) and continues with a working level for the rest of the run.

#### What filters and sorting are available?

You can filter by isHD, is4K, isHDR, is360, is3D, isVR180, hasCC, hasLocation, isCreativeCommons, and isPurchased, and apply dateFilter, videoTypeFilter, lengthFilter. Sorting options include sortingOrder and sortBy.

#### What formats can I export?

All results are stored in the Apify dataset, ready for export to JSON, CSV, or Excel. This makes it a reliable YouTube data extractor for analytics and reporting.

### Final thoughts

The Youtube Scraper is built to extract structured YouTube video data at scale with accuracy and reliability. With robust anti-blocking, flexible filters, and optional transcript downloads, it serves marketers, developers, data analysts, and researchers who need clean, export-ready results. Use it as a YouTube data scraper for SEO tracking, competitor monitoring, and research pipelines — and plug the dataset into your automation or BI stack to start extracting smarter insights today.

# Actor input Schema

## `searchTerms` (type: `array`):

Enter one or more YouTube search keywords (for example "Crawlee", "fitness workout"). The actor will run a full scrape for each term and collect matching videos, shorts, and streams.

💬 For custom solutions or feature requests, contact us at dev.scraperengine@gmail.com

## `maxVideos` (type: `integer`):

Set how many regular (non‑Shorts, non‑live) videos to scrape for each search term. Use 0 to skip long‑form videos completely and focus only on Shorts or streams.

## `maxShorts` (type: `integer`):

Control how many YouTube Shorts (vertical clips) to collect per keyword. Use 0 if you do not want to include Shorts in your dataset.

## `maxStreams` (type: `integer`):

Limit how many live or upcoming streams are scraped for each search term. Use 0 to ignore live content entirely.

## `startUrls` (type: `array`):

Provide direct YouTube video, channel, playlist, or results page URLs to scrape without using search terms. This is ideal for monitoring specific assets.

## `downloadSubtitles` (type: `boolean`):

Download video subtitles/transcripts when available. When enabled, the actor will try to fetch caption tracks and optionally full transcripts for each scraped video.

## `saveSubtitlesToKvs` (type: `boolean`):

When enabled, every downloaded transcript is stored in the default Apify key‑value store under its own key (e.g. "transcript-VIDEO_ID") so you can download large subtitle files separately from the main dataset.
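A stored transcript can then be read back from the run's default key-value store. In the sketch below, `client` is expected to be an `ApifyClient` instance from the `apify-client` package and `run` the dict returned by `.call()`. The key format follows the "transcript-VIDEO_ID" example above and is an assumption about the actual naming.

```python
from typing import Any, Mapping, Optional


def transcript_key(video_id: str) -> str:
    """Key name following the documented example pattern (assumed, not verified)."""
    return f"transcript-{video_id}"


def fetch_transcript(client: Any, run: Mapping[str, Any], video_id: str) -> Optional[Any]:
    """Return the stored transcript for a video, or None if no record exists."""
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record(transcript_key(video_id))
    return None if record is None else record["value"]
```

Typical usage would be `fetch_transcript(client, run, item["id"])` for each dataset item whose transcript is too large to keep inline.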

## `subtitlesLanguage` (type: `string`):

Choose the primary language for subtitles/transcripts (e.g. en, es, fr, de). The actor will look for this language first and fall back to available tracks where possible.

## `preferAutoGenerated` (type: `boolean`):

If turned on, the actor will prefer auto‑generated subtitles over manually uploaded caption tracks. This can increase coverage for less localized videos at the cost of some accuracy.

## `subtitlesFormat` (type: `string`):

Decide how transcripts should look in the output: classic SRT (with timestamps), simple plain text, or structured timestamped JSON that is easy to post‑process programmatically.

## `sortingOrder` (type: `string`):

Sort the final dataset by relevance (original order), upload date, view count, or rating. Applied as post-processing for reliable results.

## `dateFilter` (type: `string`):

Apply YouTube’s built‑in "Upload date" filter: last hour, today, this week, this month, or this year — just like clicking the filter in the YouTube interface.

## `videoTypeFilter` (type: `string`):

Filter to only standard videos (exclude Shorts). Select 'video' to keep only long-form videos. Channel/playlist/movie apply when supported.

## `lengthFilter` (type: `string`):

Use YouTube’s length presets to keep only short clips, medium‑length videos, or long‑form content over 20 minutes.

## `isHD` (type: `boolean`):

Only include HD videos (720p or higher). The actor inspects YouTube's streaming formats to verify resolution before including the video.

## `hasCC` (type: `boolean`):

Only include videos that have at least one proper closed‑caption track (not just auto‑generated). Great for accessibility‑critical workflows.

## `isCreativeCommons` (type: `boolean`):

Filter for videos marked by YouTube as Creative Commons licensed. This can help discover content that is more remix‑friendly (always check final license conditions yourself).

## `is3D` (type: `boolean`):

Keep only stereoscopic 3D videos that YouTube flags as special 3D content.

## `isLive` (type: `boolean`):

Restrict results to live or live‑style content. Combine this with maxStreams to build focused dashboards of live events or streams.

## `isPurchased` (type: `boolean`):

Best-effort filter for purchased/paid content. YouTube rarely exposes this in scraped data, so results may be limited. Use for niche use cases only.

## `is4K` (type: `boolean`):

Keep only videos that offer at least one 2160p (4K) stream in their available formats.

## `is360` (type: `boolean`):

Filter results down to immersive 360° videos (spherical / equirectangular projection) that can be explored in all directions.

## `hasLocation` (type: `boolean`):

Only keep videos where YouTube exposes explicit location metadata in the player response (for example city/country information).

## `isHDR` (type: `boolean`):

Limit the dataset to High Dynamic Range (HDR) videos, detected from color information and HDR‑specific flags in the available formats.

## `isVR180` (type: `boolean`):

Filter for VR180 immersive content suitable for VR headsets when YouTube marks the video as VR180.

## `publishedAfter` (type: `string`):

Only include videos published after this date. Pick a date in the calendar (absolute format YYYY-MM-DD). Leave empty to include all dates.

## `sortBy` (type: `string`):

After scraping, optionally sort the final dataset by a chosen field (date, viewCount, or likes) so that the default dataset view is ordered exactly how you like it.
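The same kind of ordering can be reproduced client-side on downloaded items, which is handy when you want a different field than the one used in the run. This sketch assumes descending order by default and places items with a missing value last; the Actor's own sort direction is not documented here.

```python
from typing import Any, Dict, Iterable, List


def sort_items(items: Iterable[Dict[str, Any]], field: str,
               descending: bool = True) -> List[Dict[str, Any]]:
    """Sort items by a field, keeping items where the field is null at the end."""
    materialized = list(items)
    present = [item for item in materialized if item.get(field) is not None]
    missing = [item for item in materialized if item.get(field) is None]
    return sorted(present, key=lambda item: item[field], reverse=descending) + missing
```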

## `proxyConfiguration` (type: `object`):

Select the starting proxy setup for this actor. By default it uses no proxy and, if YouTube blocks the traffic, the actor automatically escalates to Apify datacenter proxy and then to residential proxy with up to 3 retries, locking onto residential for the rest of the run.

## Actor input object example

```json
{
  "searchTerms": [
    "Crawlee"
  ],
  "maxVideos": 10,
  "maxShorts": 0,
  "maxStreams": 0,
  "startUrls": [],
  "downloadSubtitles": false,
  "saveSubtitlesToKvs": false,
  "subtitlesLanguage": "en",
  "preferAutoGenerated": false,
  "subtitlesFormat": "srt",
  "sortingOrder": "",
  "dateFilter": "",
  "videoTypeFilter": "",
  "lengthFilter": "",
  "isHD": false,
  "hasCC": false,
  "isCreativeCommons": false,
  "is3D": false,
  "isLive": false,
  "isPurchased": false,
  "is4K": false,
  "is360": false,
  "hasLocation": false,
  "isHDR": false,
  "isVR180": false,
  "publishedAfter": "",
  "sortBy": "",
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "searchTerms": [
        "Crawlee"
    ],
    "startUrls": [],
    "proxyConfiguration": {
        "useApifyProxy": false
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("scraper-engine/youtube-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "searchTerms": ["Crawlee"],
    "startUrls": [],
    "proxyConfiguration": { "useApifyProxy": False },
}

# Run the Actor and wait for it to finish
run = client.actor("scraper-engine/youtube-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "searchTerms": [
    "Crawlee"
  ],
  "startUrls": [],
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}' |
apify call scraper-engine/youtube-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=scraper-engine/youtube-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Youtube Scraper",
        "description": "🎥 YouTube Scraper extracts structured data from videos, channels & playlists — titles, tags, views, likes, comments, captions, thumbnails & publish dates. 🔎 Perfect for SEO, competitor analysis, research & reporting. 🚀 Export-ready for CSV/JSON pipelines.",
        "version": "0.1",
        "x-build-id": "hlWbE89Z8fUHCGvFB"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/scraper-engine~youtube-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-scraper-engine-youtube-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/scraper-engine~youtube-scraper/runs": {
            "post": {
                "operationId": "runs-sync-scraper-engine-youtube-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/scraper-engine~youtube-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-scraper-engine-youtube-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "searchTerms": {
                        "title": "🔍 Search terms",
                        "type": "array",
                        "description": "Enter one or more YouTube search keywords (for example \"Crawlee\", \"fitness workout\"). The actor will run a full scrape for each term and collect matching videos, shorts, and streams.\n\n💬 For custom solutions or feature requests, contact us at dev.scraperengine@gmail.com",
                        "items": {
                            "type": "string"
                        }
                    },
                    "maxVideos": {
                        "title": "🎞️ Maximum videos per search term",
                        "minimum": 0,
                        "maximum": 9999,
                        "type": "integer",
                        "description": "Set how many regular (non‑Shorts, non‑live) videos to scrape for each search term. Use 0 to skip long‑form videos completely and focus only on Shorts or streams.",
                        "default": 10
                    },
                    "maxShorts": {
                        "title": "📱 Maximum Shorts per search term",
                        "minimum": 0,
                        "maximum": 9999,
                        "type": "integer",
                        "description": "Control how many YouTube Shorts (vertical clips) to collect per keyword. Use 0 if you do not want to include Shorts in your dataset.",
                        "default": 0
                    },
                    "maxStreams": {
                        "title": "📡 Maximum streams per search term",
                        "minimum": 0,
                        "maximum": 9999,
                        "type": "integer",
                        "description": "Limit how many live or upcoming streams are scraped for each search term. Use 0 to ignore live content entirely.",
                        "default": 0
                    },
                    "startUrls": {
                        "title": "🔗 Direct URLs",
                        "type": "array",
                        "description": "Provide direct YouTube video, channel, playlist, or results page URLs to scrape without using search terms. This is ideal for monitoring specific assets.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "downloadSubtitles": {
                        "title": "💬 Download subtitles",
                        "type": "boolean",
                        "description": "Download video subtitles/transcripts when available. When enabled, the actor will try to fetch caption tracks and optionally full transcripts for each scraped video.",
                        "default": false
                    },
                    "saveSubtitlesToKvs": {
                        "title": "🗄️ Save subtitles to key‑value store",
                        "type": "boolean",
                        "description": "When enabled, every downloaded transcript is stored in the default Apify key‑value store under its own key (e.g. \"transcript-VIDEO_ID\") so you can download large subtitle files separately from the main dataset.",
                        "default": false
                    },
                    "subtitlesLanguage": {
                        "title": "🌐 Subtitle language",
                        "enum": [
                            "en",
                            "es",
                            "fr",
                            "de",
                            "pt",
                            "it",
                            "ru",
                            "ja",
                            "ko",
                            "zh",
                            "ar",
                            "hi",
                            "bn",
                            "tr",
                            "pl",
                            "nl",
                            "sv",
                            "id",
                            "th",
                            "vi"
                        ],
                        "type": "string",
                        "description": "Choose the primary language for subtitles/transcripts (e.g. en, es, fr, de). The actor will look for this language first and fall back to available tracks where possible.",
                        "default": "en"
                    },
                    "preferAutoGenerated": {
                        "title": "⚙️ Prefer automatically generated subtitles",
                        "type": "boolean",
                        "description": "If turned on, the actor will prefer auto‑generated subtitles over manually uploaded caption tracks. This can increase coverage for less localized videos at the cost of some accuracy.",
                        "default": false
                    },
                    "subtitlesFormat": {
                        "title": "📄 Subtitle format",
                        "enum": [
                            "srt",
                            "text",
                            "timestamp"
                        ],
                        "type": "string",
                        "description": "Decide how transcripts should look in the output: classic SRT (with timestamps), simple plain text, or structured timestamped JSON that is easy to post‑process programmatically.",
                        "default": "srt"
                    },
                    "sortingOrder": {
                        "title": "🧮 Sorting order",
                        "enum": [
                            "",
                            "relevance",
                            "date",
                            "viewCount",
                            "rating"
                        ],
                        "type": "string",
                        "description": "Sort the final dataset by relevance (original order), upload date, view count, or rating. Applied as post-processing for reliable results.",
                        "default": ""
                    },
                    "dateFilter": {
                        "title": "🕒 Date filter",
                        "enum": [
                            "",
                            "hour",
                            "today",
                            "week",
                            "month",
                            "year"
                        ],
                        "type": "string",
                        "description": "Apply YouTube’s built‑in \"Upload date\" filter: last hour, today, this week, this month, or this year — just like clicking the filter in the YouTube interface.",
                        "default": ""
                    },
                    "videoTypeFilter": {
                        "title": "📂 Video type filter",
                        "enum": [
                            "",
                            "video",
                            "channel",
                            "playlist",
                            "movie"
                        ],
                        "type": "string",
                        "description": "Filter to only standard videos (exclude Shorts). Select 'video' to keep only long-form videos. Channel/playlist/movie apply when supported.",
                        "default": ""
                    },
                    "lengthFilter": {
                        "title": "⏱️ Length filter",
                        "enum": [
                            "",
                            "short",
                            "medium",
                            "long"
                        ],
                        "type": "string",
                        "description": "Use YouTube’s length presets to keep only short clips, medium‑length videos, or long‑form content over 20 minutes.",
                        "default": ""
                    },
                    "isHD": {
                        "title": "📺 HD",
                        "type": "boolean",
                        "description": "Only include HD videos (720p or higher). The actor inspects YouTube's streaming formats to verify resolution before including the video.",
                        "default": false
                    },
                    "hasCC": {
                        "title": "📝 Subtitles / CC",
                        "type": "boolean",
                        "description": "Only include videos that have at least one proper closed‑caption track (not just auto‑generated). Great for accessibility‑critical workflows.",
                        "default": false
                    },
                    "isCreativeCommons": {
                        "title": "⚖️ Creative Commons",
                        "type": "boolean",
                        "description": "Filter for videos marked by YouTube as Creative Commons licensed. This can help discover content that is more remix‑friendly (always check final license conditions yourself).",
                        "default": false
                    },
                    "is3D": {
                        "title": "🕶️ 3D",
                        "type": "boolean",
                        "description": "Keep only stereoscopic 3D videos that YouTube flags as special 3D content.",
                        "default": false
                    },
                    "isLive": {
                        "title": "📺 Live only",
                        "type": "boolean",
                        "description": "Restrict results to live or live‑style content. Combine this with maxStreams to build focused dashboards of live events or streams.",
                        "default": false
                    },
                    "isPurchased": {
                        "title": "💳 Purchased content",
                        "type": "boolean",
                        "description": "Best-effort filter for purchased/paid content. YouTube rarely exposes this in scraped data, so results may be limited. Use for niche use cases only.",
                        "default": false
                    },
                    "is4K": {
                        "title": "🖥️ 4K only",
                        "type": "boolean",
                        "description": "Keep only videos that offer at least one 2160p (4K) stream in their available formats.",
                        "default": false
                    },
                    "is360": {
                        "title": "🌐 360° video",
                        "type": "boolean",
                        "description": "Filter results down to immersive 360° videos (spherical / equirectangular projection) that can be explored in all directions.",
                        "default": false
                    },
                    "hasLocation": {
                        "title": "📍 With location",
                        "type": "boolean",
                        "description": "Only keep videos where YouTube exposes explicit location metadata in the player response (for example city/country information).",
                        "default": false
                    },
                    "isHDR": {
                        "title": "🌈 HDR only",
                        "type": "boolean",
                        "description": "Limit the dataset to High Dynamic Range (HDR) videos, detected from color information and HDR‑specific flags in the available formats.",
                        "default": false
                    },
                    "isVR180": {
                        "title": "🥽 VR180 only",
                        "type": "boolean",
                        "description": "Filter for VR180 immersive content suitable for VR headsets when YouTube marks the video as VR180.",
                        "default": false
                    },
                    "publishedAfter": {
                        "title": "📆 Scrape videos published after (date)",
                        "pattern": "^(\\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\\d|3[01])$|^$",
                        "type": "string",
                        "description": "Only include videos published after this date. Pick a date in the calendar (absolute format YYYY-MM-DD). Leave empty to include all dates.",
                        "default": ""
                    },
                    "sortBy": {
                        "title": "📊 Sort by (post‑processing)",
                        "enum": [
                            "",
                            "date",
                            "viewCount",
                            "likes"
                        ],
                        "type": "string",
                        "description": "After scraping, optionally sort the final dataset by a chosen field (date, viewCount, or likes) so that the default dataset view is ordered exactly how you like it.",
                        "default": ""
                    },
                    "proxyConfiguration": {
                        "title": "🛡️ Proxy configuration & anti‑blocking",
                        "type": "object",
                        "description": "Select the starting proxy setup for this actor. By default it uses no proxy and, if YouTube blocks the traffic, the actor automatically escalates to Apify datacenter proxy and then to residential proxy with up to 3 retries, locking onto residential for the rest of the run."
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
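
As a rough illustration of how the `inputSchema` constraints above could be checked on the client side before starting a run, here is a minimal Python sketch. The `validate_input` helper and its constant names are hypothetical and not part of the Actor or the Apify client; only the numeric limits, enum values, and date pattern are copied from the schema, and only a subset of fields is covered.

```python
import re

# Limits and enum values copied from the inputSchema above.
# The helper and constant names are hypothetical, for illustration only.
INT_LIMITS = {
    "maxVideos": (0, 9999),
    "maxShorts": (0, 9999),
    "maxStreams": (0, 9999),
}
ENUMS = {
    "subtitlesFormat": ("srt", "text", "timestamp"),
    "sortingOrder": ("", "relevance", "date", "viewCount", "rating"),
    "dateFilter": ("", "hour", "today", "week", "month", "year"),
    "videoTypeFilter": ("", "video", "channel", "playlist", "movie"),
    "lengthFilter": ("", "short", "medium", "long"),
    "sortBy": ("", "date", "viewCount", "likes"),
}
# Same pattern as "publishedAfter": YYYY-MM-DD or an empty string.
DATE_PATTERN = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$|^$")


def validate_input(run_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    for key, (low, high) in INT_LIMITS.items():
        if key in run_input:
            value = run_input[key]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or not low <= value <= high:
                errors.append(f"{key} must be an integer between {low} and {high}")
    for key, allowed in ENUMS.items():
        if key in run_input and run_input[key] not in allowed:
            errors.append(f"{key} must be one of {list(allowed)}")
    if "publishedAfter" in run_input:
        value = run_input["publishedAfter"]
        if not isinstance(value, str) or not DATE_PATTERN.fullmatch(value):
            errors.append("publishedAfter must be YYYY-MM-DD or an empty string")
    return errors


if __name__ == "__main__":
    candidate = {"searchTerms": ["Crawlee"], "maxVideos": 10, "sortBy": "viewCount"}
    print(validate_input(candidate))
```

Checking locally like this is optional; it simply surfaces out-of-range values before a paid run is started.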

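The `runsResponseSchema` above nests everything under a top-level `data` object. The following sketch shows one way to pull out the fields a caller typically needs after a run. The `summarize_run` function and its output keys are hypothetical; the input field names (`id`, `status`, `defaultDatasetId`, `usageTotalUsd`, `usageUsd`) are taken from the schema, and the sample values reuse the schema's own examples.

```python
def summarize_run(response: dict) -> dict:
    """Condense a run response shaped like runsResponseSchema into a flat summary."""
    data = response.get("data", {})
    usage_usd = data.get("usageUsd", {})
    return {
        "run_id": data.get("id"),
        "status": data.get("status"),
        "dataset_id": data.get("defaultDatasetId"),
        "total_usd": data.get("usageTotalUsd", 0.0),
        # Keep only usage items with a non-zero cost.
        "billed_items": {name: cost for name, cost in usage_usd.items() if cost},
    }


if __name__ == "__main__":
    # Placeholder IDs; numeric values mirror the examples in the schema.
    sample = {
        "data": {
            "id": "RUN_ID",
            "status": "READY",
            "defaultDatasetId": "DATASET_ID",
            "usageTotalUsd": 0.00005,
            "usageUsd": {"KEY_VALUE_STORE_WRITES": 0.00005, "DATASET_READS": 0},
        }
    }
    print(summarize_run(sample))
```
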