# Youtube Scraper (`dtrungtin/youtube-scraper`) Actor

YouTube Search Scraper extracts videos and their full metadata — including comments — from any YouTube search-results page, hashtag feed, or individual video URL.

- **URL**: https://apify.com/dtrungtin/youtube-scraper.md
- **Developed by:** [Tin](https://apify.com/dtrungtin) (community)
- **Categories:** Videos
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

From $10.00 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event
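
As a rough illustration of the arithmetic, here is a minimal sketch that assumes every dataset item is billed as one result at the listed minimum rate of $10.00 per 1,000 results. The actual event types and prices are defined on the Actor's pricing page and may differ; the function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Assumption: flat rate taken from the "$10.00 / 1,000 results" line above.
PRICE_PER_1000_RESULTS_USD = Decimal("10.00")


def estimate_cost_usd(result_count: int) -> Decimal:
    """Estimate the charge for `result_count` results at the listed minimum rate."""
    if result_count < 0:
        raise ValueError("result_count must be non-negative")
    cost = PRICE_PER_1000_RESULTS_USD * result_count / 1000
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))
```

Under these assumptions, `estimate_cost_usd(250)` evaluates to `Decimal("2.50")`.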

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in a key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with a capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use the [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see the Apify documentation as a [Markdown index](https://docs.apify.com/llms.txt) or as [Markdown full text](https://docs.apify.com/llms-full.txt).


# README

**YouTube Search Scraper** extracts videos and their full metadata — including comments — from any [YouTube](https://www.youtube.com) search-results page, hashtag feed, or individual video URL. Drop a list of URLs in, and the scraper crawls each result, paginates through comments, and returns clean JSON ready for analysis, dashboards, or downstream pipelines. It runs entirely on Apify, so you get scheduling, API access, proxy rotation, integrations, monitoring, and storage out of the box — no infrastructure to set up.

### What does YouTube Search Scraper do?

This Actor takes one or more YouTube URLs (search-results, hashtag pages, or individual videos) and returns a structured JSON record for every video it finds. For each video it extracts:

- Core metadata: title, video ID, canonical URL, thumbnail, duration, upload date, description and description links.
- Engagement metrics: view count, like count, subscriber count of the channel, comment count.
- Channel info: name, channel URL.
- Subtitles: list of available caption language codes.
- Comments: a paginated, capped list of top-level comments (author, text, likes, reply count, published time).
- Status flags: `isMonetized`, `commentsTurnedOff`.

To try it, paste a YouTube search URL like `https://www.youtube.com/results?search_query=Apple` or a hashtag URL like `https://www.youtube.com/hashtag/phimmoi` into **Start URLs**, click **Start**, and view the dataset under the **Output** tab.

### Why use YouTube Search Scraper?

- **Market and trend research** — track how a topic, brand, or keyword is being covered on YouTube.
- **Competitor and channel analysis** — compare view counts, subscribers, and posting cadence across channels.
- **Sentiment analysis on comments** — pull comment threads and run NLP over real audience reactions.
- **Content discovery** — surface high-performing videos for a hashtag or keyword to inform your own content strategy.
- **Reporting and dashboards** — feed structured YouTube data into BI tools without dealing with the YouTube Data API quota.
- **No code required** — run it from the Apify Console, schedule it, or call it via the API from any language.

### How to use YouTube Search Scraper

1. Open the Actor on the Apify Console.
2. In **Start URLs**, paste one or more URLs. The scraper auto-detects the type:
   - Search: `https://www.youtube.com/results?search_query=<query>`
   - Hashtag: `https://www.youtube.com/hashtag/<tag>`
   - Single video: `https://www.youtube.com/watch?v=<id>` (or `youtu.be/<id>` / `youtube.com/shorts/<id>`)
3. Set **Max items** to control how many videos to extract per search/hashtag URL (the input schema lists a default of `5`).
4. Click **Start** and wait for the run to finish.
5. Open the **Output** tab to view the dataset, or download it as JSON, CSV, Excel, HTML, or RSS.
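
To make the three URL shapes from step 2 concrete, here is a small sketch that classifies a URL before you submit it. It is only an illustration of the patterns listed above, not the Actor's actual routing logic; the function name `classify_youtube_url` and the `"unknown"` fallback are assumptions.

```python
from urllib.parse import parse_qs, urlparse


def classify_youtube_url(url: str) -> str:
    """Return 'search', 'hashtag', 'video', or 'unknown' for a given URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parsed.path
    query = parse_qs(parsed.query)

    # Short links: youtu.be/<id>
    if host == "youtu.be":
        return "video" if path.strip("/") else "unknown"
    if host != "youtube.com":
        return "unknown"
    # Search: /results?search_query=<query>
    if path == "/results" and "search_query" in query:
        return "search"
    # Hashtag: /hashtag/<tag>
    if path.startswith("/hashtag/") and len(path) > len("/hashtag/"):
        return "hashtag"
    # Single video: /watch?v=<id> or /shorts/<id>
    if path == "/watch" and "v" in query:
        return "video"
    if path.startswith("/shorts/") and len(path) > len("/shorts/"):
        return "video"
    return "unknown"
```

A check like this can help catch typos in a long URL list before a paid run starts.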

### Input

| Field | Type | Description | Default |
| ----- | ---- | ----------- | ------- |
| `startUrls` | array | One or more YouTube URLs (search-results, hashtag, or individual video). Mixed types are supported in the same run. | `https://www.youtube.com/results?search_query=Apple` |
| `maxItems` | integer | Maximum number of video items to extract per search or hashtag URL. The scraper scrolls to load more results until this cap is hit. | `5` (per the input schema) |
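
If you build the input programmatically, a helper along these lines may be convenient. This is a sketch: `build_actor_input` is a hypothetical name, and the minimum of 1 for `maxItems` is taken from the input schema in the OpenAPI specification of this document.

```python
from urllib.parse import quote, urlencode


def build_actor_input(queries, hashtags=(), *, max_items):
    """Build an input object with `startUrls` and `maxItems` from plain strings."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    # Search URLs: the query string is URL-encoded (spaces become '+').
    start_urls = [
        {"url": "https://www.youtube.com/results?" + urlencode({"search_query": q})}
        for q in queries
    ]
    # Hashtag URLs: the tag is percent-encoded as a single path segment.
    start_urls += [
        {"url": "https://www.youtube.com/hashtag/" + quote(tag, safe="")}
        for tag in hashtags
    ]
    return {"startUrls": start_urls, "maxItems": max_items}
```

The returned dictionary has the same shape as the Actor input object example, so it can be passed to an API client as the run input.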


### Output

Output is stored as JSON in the Actor's default dataset. One item per video. You can download the dataset in **JSON, CSV, Excel, HTML, or RSS** formats from the **Output** tab.

```json
{
  "title": "Stromae - Santé (Live From The Tonight Show Starring Jimmy Fallon)",
  "id": "CW7gfrTlr0Y",
  "url": "https://www.youtube.com/watch?v=CW7gfrTlr0Y",
  "thumbnailUrl": "https://i.ytimg.com/vi/CW7gfrTlr0Y/maxresdefault.jpg",
  "viewCount": 35582192,
  "date": "2021-12-21",
  "likes": 512238,
  "location": null,
  "channelName": "StromaeVEVO",
  "channelUrl": "http://www.youtube.com/@StromaeVEVO",
  "numberOfSubscribers": 6930000,
  "duration": "00:03:17",
  "commentsCount": 14,
  "text": "Stromae - Santé (Live From The Tonight Show Starring Jimmy Fallon on NBC)...",
  "descriptionLinks": [
    { "url": "https://stromae.lnk.to/la-solassitude", "text": "https://stromae.lnk.to/la-solassitude" }
  ],
  "subtitles": ["en", "fr"],
  "comments": [
    {
      "id": "Ugw...",
      "author": "@some_user",
      "authorChannelUrl": "https://www.youtube.com/channel/UC...",
      "text": "Absolute masterpiece.",
      "publishedAt": "2 years ago",
      "likes": 1240,
      "replyCount": 3
    }
  ],
  "isMonetized": true,
  "commentsTurnedOff": false
}
````
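
For comment-level analysis, it can help to flatten each video's nested `comments` array into one row per comment. The sketch below assumes items shaped like the sample above; `flatten_comments` and the chosen row keys are illustrative, not part of the Actor.

```python
def flatten_comments(items):
    """Yield one flat row per comment, tagged with its parent video."""
    for item in items:
        # `comments` can be null (None), so fall back to an empty list.
        for comment in item.get("comments") or []:
            yield {
                "videoId": item.get("id"),
                "videoTitle": item.get("title"),
                "author": comment.get("author"),
                "text": comment.get("text"),
                "likes": comment.get("likes"),
                "replyCount": comment.get("replyCount"),
            }
```

The rows can then be written out with `csv.DictWriter` or loaded into a data frame.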

### Data fields

| Field | Type | Description |
| ----- | ---- | ----------- |
| `title` | string | Video title. |
| `id` | string | Unique YouTube video ID. |
| `url` | string | Canonical `watch?v=` URL. |
| `thumbnailUrl` | string | High-resolution thumbnail (`maxresdefault.jpg`). |
| `viewCount` | integer | Total view count. |
| `date` | string | Upload date in `YYYY-MM-DD` format. |
| `likes` | integer | Like count parsed from the video page. |
| `location` | string \| null | Geotag, when the uploader has set one (rare). |
| `channelName` | string | Display name of the channel. |
| `channelUrl` | string | Channel URL. |
| `numberOfSubscribers` | integer | Subscriber count (e.g., `6.93M` is normalized to `6930000`). |
| `duration` | string | Duration formatted as `HH:MM:SS`. |
| `commentsCount` | integer \| null | Total comments on the video; `null` when the count could not be captured. |
| `text` | string | Full description text. |
| `descriptionLinks` | array | All external links from the description (YouTube redirect wrappers are unwrapped). |
| `subtitles` | array \| null | Available caption language codes. |
| `comments` | array \| null | Top-level comments (`{ id, author, authorChannelUrl, text, publishedAt, likes, replyCount }`). |
| `isMonetized` | boolean | Heuristic — `true` when the video has ad placements. |
| `commentsTurnedOff` | boolean \| null | `true` when comments are disabled on the video. |
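
Two of the formats in the table are easy to illustrate. The sketch below shows one way an abbreviated count such as `6.93M` could be normalized to an integer, and how an `HH:MM:SS` duration converts to seconds. It is not the Actor's own parsing code; the function names and the `K`/`M`/`B` suffix set are assumptions.

```python
from decimal import Decimal

# Assumed suffixes for abbreviated counts.
_SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_abbreviated_count(text: str) -> int:
    """Convert strings like '6.93M' or '512,238' to an integer."""
    cleaned = text.strip().upper().replace(",", "")
    multiplier = 1
    if cleaned and cleaned[-1] in _SUFFIXES:
        multiplier = _SUFFIXES[cleaned[-1]]
        cleaned = cleaned[:-1]
    # Decimal avoids binary floating-point rounding (e.g. 6.93 * 1e6).
    return int(Decimal(cleaned) * multiplier)


def duration_to_seconds(duration: str) -> int:
    """Convert an 'HH:MM:SS' duration string to total seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds
```

With the sample record, `parse_abbreviated_count("6.93M")` gives `6930000` and `duration_to_seconds("00:03:17")` gives `197`.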

### Tips and advanced options

- **Mix URL types in one run.** Search URLs, hashtag URLs, and individual video URLs can all live in the same `startUrls` array. The scraper auto-routes each one.
- **Tune concurrency in the source if needed.** The crawler is configured with `maxConcurrency: 2` because each browser tab needs enough resources for YouTube's lazy-loaded comments to render. Pushing concurrency higher tends to produce empty `commentsCount` fields under load.
- **Use a proxy on the platform.** A residential or datacenter Apify Proxy (`useApifyProxy: true`) is recommended for sustained crawls to avoid rate limits.

### FAQ and disclaimers

**Is this legal?** This scraper only collects publicly visible data from YouTube — content that any signed-out user can see in their browser. You are responsible for complying with YouTube's Terms of Service and applicable law (including local data protection rules) for your specific use case.

**Why is `commentsCount` sometimes `null`?** YouTube lazy-loads the comments panel; if the page is unusually slow to render the comments header within the wait window, the field falls back to whatever the network listener captured (or `null` if nothing was captured). Lower the `maxConcurrency` or rerun with a proxy if this is frequent.
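
One way to handle this is to collect the affected videos and rerun only those. The sketch below builds a new input object from items whose `commentsCount` is `null`. It assumes that videos with `commentsTurnedOff` set to `true` are not worth retrying, and `build_retry_input` is a hypothetical helper name.

```python
def build_retry_input(items):
    """Build an input object containing only videos with a missing commentsCount."""
    retry_urls = [
        {"url": item["url"]}
        for item in items
        if item.get("url")
        and item.get("commentsCount") is None
        # Assumption: disabled comments explain the missing count, so skip those.
        and not item.get("commentsTurnedOff")
    ]
    return {"startUrls": retry_urls}
```

Because individual video URLs are accepted as start URLs, the result can be submitted as the input of a follow-up run.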

**Why is `date` an approximation on some videos?** Search-result pages only expose relative timestamps ("2 years ago"). The detail page returns the exact `publishDate`, so videos scraped via the detail handler get the precise date.

**Found a bug or want a feature?** Use the **Issues** tab on the Actor page in the Apify Console to report problems or request enhancements.

# Actor input Schema

## `startUrls` (type: `array`):

YouTube search-results URLs to scrape (e.g. `https://www.youtube.com/results?search_query=Apple`).

## `maxItems` (type: `integer`):

Maximum number of video items to extract per start URL. The scraper scrolls to load more results until this cap is reached or no more results are available.

## Actor input object example

```json
{
  "startUrls": [
    {
      "url": "https://www.youtube.com/results?search_query=Apple"
    }
  ],
  "maxItems": 5
}
```

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "startUrls": [
        {
            "url": "https://www.youtube.com/results?search_query=Apple"
        }
    ]
};

// Run the Actor and wait for it to finish
const run = await client.actor("dtrungtin/youtube-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = { "startUrls": [{ "url": "https://www.youtube.com/results?search_query=Apple" }] }

# Run the Actor and wait for it to finish
run = client.actor("dtrungtin/youtube-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "startUrls": [
    {
      "url": "https://www.youtube.com/results?search_query=Apple"
    }
  ]
}' |
apify call dtrungtin/youtube-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=dtrungtin/youtube-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "Youtube Scraper",
        "description": "YouTube Search Scraper extracts videos and their full metadata — including comments — from any YouTube search-results page, hashtag feed, or individual video URL.",
        "version": "0.0",
        "x-build-id": "oTeaavQKQBpcxcU45"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/dtrungtin~youtube-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-dtrungtin-youtube-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/dtrungtin~youtube-scraper/runs": {
            "post": {
                "operationId": "runs-sync-dtrungtin-youtube-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/dtrungtin~youtube-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-dtrungtin-youtube-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "properties": {
                    "startUrls": {
                        "title": "Start URLs",
                        "type": "array",
                        "description": "YouTube search-results URLs to scrape (e.g. https://www.youtube.com/results?search_query=Apple).",
                        "items": {
                            "type": "object",
                            "required": [
                                "url"
                            ],
                            "properties": {
                                "url": {
                                    "type": "string",
                                    "title": "URL of a web page",
                                    "format": "uri"
                                }
                            }
                        }
                    },
                    "maxItems": {
                        "title": "Max items",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Maximum number of video items to extract per start URL. The scraper scrolls to load more results until this cap is reached or no more results are available.",
                        "default": 5
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
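
If you prefer not to install a client library, the `run-sync-get-dataset-items` endpoint from the specification above can be called with the Python standard library. This is a minimal sketch: the helper names are hypothetical, and it assumes the response body is a JSON array of dataset items, which the specification does not spell out.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/dtrungtin~youtube-scraper/run-sync-get-dataset-items"


def build_run_sync_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request; the token goes in the query string, as in the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_actor_sync(token: str, actor_input: dict, timeout_secs: float = 300):
    """Send the request and decode the response (assumed to be JSON)."""
    request = build_run_sync_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout_secs) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_actor_sync("<YOUR_API_TOKEN>", {"startUrls": [{"url": "https://www.youtube.com/results?search_query=Apple"}], "maxItems": 5})` blocks until the run finishes, so the timeout may need to be raised for long runs.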
