# YouTube Comments Scraper (`cirkit/youtube-comments-scraper`) Actor

Fast YouTube comments scraper. Paste any video URLs and extract every comment plus full reply threads. Author handle, channel id, exact like count, reply count, hearted, pinned, verified, edited flags, published time. Direct InnerTube API reads, no browser. PPE pricing.

- **URL**: https://apify.com/cirkit/youtube-comments-scraper.md
- **Developed by:** [Crikit](https://apify.com/cirkit) (community)
- **Categories:** Social media
- **Stats:** 2 total users, 1 monthly user, 100.0% runs succeeded
- **User rating**: No ratings yet

## Pricing

from $1.50 / 1,000 results

This Actor is paid per event. You are not charged for Apify platform usage; you pay only a fixed price for specific events.

Learn more: https://docs.apify.com/platform/actors/running/actors-in-store#pay-per-event

## What's an Apify Actor?

Actors are software tools running on the Apify platform, for all kinds of web data extraction and automation use cases.
In Batch mode, an Actor accepts a well-defined JSON input, performs an action which can take anything from a few seconds to a few hours,
and optionally produces a well-defined JSON output, datasets with results, or files in key-value store.
In Standby mode, an Actor provides a web server which can be used as a website, API, or an MCP server.
Actors are written with capital "A".

## How to integrate an Actor?

If asked about integration, you help developers integrate Actors into their projects.
You adapt to their stack and deliver integrations that are safe, well-documented, and production-ready.
The best way to integrate Actors is as follows.

In JavaScript/TypeScript projects, use the official [JavaScript/TypeScript client](https://docs.apify.com/api/client/js.md):

```bash
npm install apify-client
```

In Python projects, use the official [Python client library](https://docs.apify.com/api/client/python.md):

```bash
pip install apify-client
```

In shell scripts, use [Apify CLI](https://docs.apify.com/cli/docs.md):

```bash
# macOS / Linux
curl -fsSL https://apify.com/install-cli.sh | bash
# Windows
irm https://apify.com/install-cli.ps1 | iex
```

In AI frameworks, you might use the [Apify MCP server](https://docs.apify.com/platform/integrations/mcp.md).

If your project is in a different language, use the [REST API](https://docs.apify.com/api/v2.md).

For usage examples, see the [API](#api) section below.

For more details, see Apify documentation as [Markdown index](https://docs.apify.com/llms.txt) and [Markdown full-text](https://docs.apify.com/llms-full.txt).


# README

## YouTube Comments Scraper

Fast, exhaustive YouTube comments scraper. Paste any YouTube video URLs and extract every comment plus the full reply thread under each one. Direct reads of YouTube's InnerTube API, no headless browser. Built in Python with `curl_cffi` for clean TLS impersonation.

### What you get

One record per comment or reply, in flat row format. Top-level comments have `replyLevel = 0`, replies have `replyLevel = 1` and a `parentCommentId` linking back to the comment they reply to.

Every field is parsed directly from YouTube's structured comment payload, including:

- `commentId`, `parentCommentId`, `replyLevel`
- `text` (plain text rendered from YouTube's run-based content)
- `publishedTimeRelative` (`"1 day ago"`, `"3 weeks ago (edited)"`)
- `isEdited`
- `likeCount` (exact integer, parsed from YouTube's accessibility label, not the `1.1K` cosmetic string)
- `replyCount` (exact integer; 0 on reply rows)
- `authorChannelId`, `authorHandle`, `authorDisplayName`, `authorThumbnailUrl`
- `authorIsVerified`, `authorIsCreator`, `authorIsArtist`
- `isHearted` (creator gave this comment the heart icon)
- `isPinned` (top-level only; pinned comments only)
- `videoId`, `videoUrl`
- `scrapedAt` (ISO-8601 UTC timestamp)

### Input

```json
{
  "videoUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://youtu.be/jNQXAC9IVRw",
    "dQw4w9WgXcQ"
  ],
  "sortBy": "TOP",
  "maxCommentsPerVideo": 500,
  "includeReplies": true,
  "maxRepliesPerComment": 50,
  "proxyConfiguration": { "useApifyProxy": true }
}
````

| Field | Type | Description |
| --- | --- | --- |
| `videoUrls` | array, required | YouTube watch URLs, Shorts URLs, youtu.be short links, or raw 11-character video IDs. Each video is scraped independently. |
| `sortBy` | enum, default `TOP` | `TOP` returns YouTube's ranked order. `NEWEST` returns reverse-chronological. |
| `maxCommentsPerVideo` | integer, optional | Hard cap on top-level comments per video. Leave empty to scrape the entire comment section. |
| `includeReplies` | boolean, default `true` | If on, fetch the full reply thread under every top-level comment. |
| `maxRepliesPerComment` | integer, optional | Hard cap on replies per top-level comment. Leave empty for unlimited. |
| `proxyConfiguration` | object | Datacenter proxy is sufficient for steady-state scraping. Use Residential only if you hit throttling at very high volume. |
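
To check entries before submitting a run, the accepted `videoUrls` forms can be normalized to a video ID on the client side. The following is a minimal sketch, not the Actor's own parser: the function name `extract_video_id`, the accepted host list, and the assumed ID alphabet (`A-Z`, `a-z`, `0-9`, `_`, `-`) are illustrative assumptions.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from the URL-safe base64 alphabet.
VIDEO_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(entry: str) -> Optional[str]:
    """Return the 11-character video ID for a supported entry, else None."""
    entry = entry.strip()
    if VIDEO_ID_RE.match(entry):
        return entry  # already a bare ID

    parsed = urlparse(entry)
    host = (parsed.hostname or "").lower()
    for prefix in ("www.", "m."):
        if host.startswith(prefix):
            host = host[len(prefix):]

    candidate = None
    if host == "youtu.be":
        # Short link: the ID is the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com":
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif parsed.path.startswith("/shorts/"):
            candidate = parsed.path.split("/")[2]

    if candidate and VIDEO_ID_RE.match(candidate):
        return candidate
    return None
```

Entries for which this returns `None` are worth reviewing by hand before the run, since the Actor may accept forms this sketch does not cover.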

### Output

One JSON object per dataset row. Example (top-level comment):

```json
{
  "videoId": "dQw4w9WgXcQ",
  "videoUrl": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "commentId": "Ugzge340dBgB75hWBm54AaABAg",
  "parentCommentId": null,
  "replyLevel": 0,
  "text": "can confirm: he never gave us up",
  "publishedTimeRelative": "1 year ago",
  "isEdited": false,
  "likeCount": 241,
  "replyCount": 960,
  "authorChannelId": "UCBR8-60-B28hp2BmDPdntcQ",
  "authorHandle": "YouTube",
  "authorDisplayName": "@YouTube",
  "authorThumbnailUrl": "https://yt3.ggpht.com/...",
  "authorIsVerified": true,
  "authorIsCreator": false,
  "authorIsArtist": false,
  "isHearted": true,
  "isPinned": true,
  "scrapedAt": "2026-05-19T01:16:49Z"
}
```

A reply has the same shape with `replyLevel: 1`, a populated `parentCommentId`, and `replyCount: 0`.
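
Because rows are flat, rebuilding threads is left to the consumer. The sketch below nests reply rows under their top-level comment using only the documented `replyLevel`, `commentId`, and `parentCommentId` fields; the helper name `group_threads` and the added `replies` key are illustrative, not part of the Actor's output.

```python
from collections import defaultdict
from typing import Any


def group_threads(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Nest reply rows (replyLevel 1) under their top-level comment (replyLevel 0)."""
    replies_by_parent: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for row in rows:
        if row["replyLevel"] == 1:
            replies_by_parent[row["parentCommentId"]].append(row)

    threads = []
    for row in rows:
        if row["replyLevel"] == 0:
            # Copy the row and attach its replies in dataset order.
            threads.append({**row, "replies": replies_by_parent.get(row["commentId"], [])})
    return threads
```

Replies whose parent is missing from `rows` (for example, when a dataset was filtered) are dropped by this sketch rather than raising an error.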

### How it works

The Actor calls YouTube's InnerTube API directly (`POST /youtubei/v1/next`) using `curl_cffi` with Chrome 131 TLS impersonation. For each input video it:

1. Fetches the watch-page HTML once to extract the current `INNERTUBE_CLIENT_VERSION` and the comments-section entry continuation token.
2. Walks comment pages 20-at-a-time until the comment section ends (or `maxCommentsPerVideo` is hit).
3. For each top-level comment with replies, fires a parallel reply-thread fetch (10 replies per page; paginates until the reply thread ends or `maxRepliesPerComment` is hit).
4. Pushes each comment to the dataset as a flat record.

No browser, no captcha challenge, no signed requests. The whole flow runs at roughly 1-2 requests per second per IP and is safe to run unattended for hours.
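
The page sizes above allow a rough request count per video. This is a back-of-the-envelope sketch based only on the figures stated here (one watch-page fetch, 20 comments per page, 10 replies per page); the function name is hypothetical, and retries or a trailing empty page could raise the real number.

```python
import math

COMMENTS_PER_PAGE = 20  # from the steps above
REPLIES_PER_PAGE = 10   # from the steps above


def estimate_requests(top_level_comments, reply_counts, max_replies_per_comment=None):
    """Rough lower bound on HTTP requests needed for one video."""
    requests = 1  # the single watch-page HTML fetch
    requests += math.ceil(top_level_comments / COMMENTS_PER_PAGE)
    for count in reply_counts:
        if max_replies_per_comment is not None:
            count = min(count, max_replies_per_comment)
        requests += math.ceil(count / REPLIES_PER_PAGE)
    return requests


# 500 top-level comments, of which three have 0, 3, and 25 replies:
estimate_requests(500, [0, 3, 25], max_replies_per_comment=50)  # 30
```

At the quoted 1-2 requests per second, 30 requests would take very roughly 15 to 30 seconds, ignoring retries and any overlap from parallel reply fetches.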

### Pricing

Pay per comment scraped (Pay Per Event). Replies and top-level comments are both billable rows. Failed videos (private, unavailable, comments disabled) emit nothing and cost nothing.
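
As a budgeting aid, the arithmetic can be sketched as follows. The default of $1.50 per 1,000 results is the "from" price listed for this Actor, so treat it as an assumption and substitute the rate shown on your account; both function names are illustrative.

```python
def billable_rows(top_level_comments, reply_counts=(), include_replies=True,
                  max_replies_per_comment=None):
    """Count dataset rows: every top-level comment and every fetched reply."""
    rows = top_level_comments
    if include_replies:
        for count in reply_counts:
            if max_replies_per_comment is not None:
                count = min(count, max_replies_per_comment)
            rows += count
    return rows


def estimate_cost_usd(rows, price_per_1000_usd=1.50):
    """Cost in USD for a number of billable rows at a per-1,000 price."""
    return rows * price_per_1000_usd / 1000


rows = billable_rows(100, reply_counts=[10, 40], max_replies_per_comment=25)
cost = estimate_cost_usd(rows)  # 135 rows at the assumed rate
```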

### Edge cases handled

- **Comments disabled.** Detected; the video is skipped and no rows are emitted.
- **Private or unavailable videos.** Detected; the video is skipped.
- **YouTube Shorts.** Same comment endpoint as regular videos; works out of the box.
- **Bumped client version.** Re-extracted from a fresh watch-page HTML on each run, so it always uses the live version YouTube expects.
- **Mid-run rate limiting.** Exponential backoff on 429 and 5xx; rotates between `chrome131` and `chrome124` TLS profiles on persistent failures.
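
For readers unfamiliar with the retry pattern in the last bullet, here is a generic sketch of exponential backoff on 429 and 5xx responses. It is not the Actor's implementation: the function names, the base delay of 1 second, the 60-second cap, and the attempt limit are all assumptions, and TLS-profile rotation is left out. Production code commonly adds random jitter to each delay.

```python
import time


def is_retryable(status: int) -> bool:
    """Treat 429 and any 5xx status as transient."""
    return status == 429 or 500 <= status < 600


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * 2 ** attempt)


def call_with_backoff(send, max_attempts: int = 5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    `send` is any zero-argument callable returning (status, body).
    `sleep` is injectable so the logic can be exercised without waiting.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if not is_retryable(status):
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    return status, body  # last retryable response after exhausting attempts
```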

### Example: scrape every comment on a single video

```json
{
  "videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
  "sortBy": "TOP",
  "includeReplies": true
}
```

### Example: TOP 100 comments per video, no replies, multiple videos

```json
{
  "videoUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://www.youtube.com/watch?v=jNQXAC9IVRw"
  ],
  "sortBy": "TOP",
  "maxCommentsPerVideo": 100,
  "includeReplies": false
}
```

### Example: NEWEST 50 comments, bare video IDs

```json
{
  "videoUrls": ["dQw4w9WgXcQ", "jNQXAC9IVRw"],
  "sortBy": "NEWEST",
  "maxCommentsPerVideo": 50,
  "includeReplies": true,
  "maxRepliesPerComment": 5
}
```

### Notes on data freshness

The comments returned reflect the state of the comment section at scrape time. Re-running an hour later will pick up any new comments posted since (and any comments that were deleted will disappear). The `scrapedAt` timestamp is your reference for when the snapshot was captured.
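
To see what changed between two runs, the snapshots can be compared by `commentId`. This is a minimal sketch with a hypothetical helper name; it assumes both snapshots cover the same video with the same input settings.

```python
def diff_snapshots(old_rows, new_rows):
    """Return rows added and removed between two scrapes, keyed by commentId."""
    old_ids = {row["commentId"] for row in old_rows}
    new_ids = {row["commentId"] for row in new_rows}
    return {
        "added": [row for row in new_rows if row["commentId"] not in old_ids],
        "removed": [row for row in old_rows if row["commentId"] not in new_ids],
    }
```

If either run used `maxCommentsPerVideo` or `maxRepliesPerComment`, a comment listed under `removed` may simply have fallen outside the cap rather than been deleted.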

# Actor input Schema

## `videoUrls` (type: `array`):

Each entry can be a YouTube watch URL (https://www.youtube.com/watch?v=...), a Shorts URL, a youtu.be short link, or a bare 11-character video ID. Each video is scraped independently.

## `sortBy` (type: `string`):

TOP returns YouTube's ranked order (default). NEWEST returns strict reverse-chronological.

## `maxCommentsPerVideo` (type: `integer`):

Hard cap on top-level comments per video. Leave empty for unlimited (the whole comment section).

## `includeReplies` (type: `boolean`):

If on, fetch the full reply thread under every top-level comment. Adds ~30% to runtime on average.

## `maxRepliesPerComment` (type: `integer`):

Hard cap on replies fetched per top-level comment. Leave empty for unlimited.

## `proxyConfiguration` (type: `object`):

Datacenter proxy is sufficient for steady-state scraping. Switch to Residential if you see 429 throttles at very high volume.

## Actor input object example

```json
{
  "videoUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
  ],
  "sortBy": "TOP",
  "maxCommentsPerVideo": 50,
  "includeReplies": true,
  "maxRepliesPerComment": 10,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
```
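
Before submitting a run, an input object can be checked locally against the constraints in the schema (the required non-empty, unique `videoUrls` array, the `sortBy` enum, and a minimum of 1 for the two caps, as listed in the OpenAPI specification at the end of this document). The sketch below is illustrative only; `validate_input` is a hypothetical helper, and the platform performs its own validation regardless.

```python
def validate_input(actor_input: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []

    urls = actor_input.get("videoUrls")
    if not isinstance(urls, list) or len(urls) < 1:
        errors.append("videoUrls must be a non-empty array")
    elif not all(isinstance(u, str) for u in urls):
        errors.append("videoUrls entries must be strings")
    elif len(set(urls)) != len(urls):
        errors.append("videoUrls entries must be unique")

    if actor_input.get("sortBy", "TOP") not in ("TOP", "NEWEST"):
        errors.append("sortBy must be TOP or NEWEST")

    for key in ("maxCommentsPerVideo", "maxRepliesPerComment"):
        value = actor_input.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if value is not None and (isinstance(value, bool) or not isinstance(value, int) or value < 1):
            errors.append(f"{key} must be an integer >= 1")

    if "includeReplies" in actor_input and not isinstance(actor_input["includeReplies"], bool):
        errors.append("includeReplies must be a boolean")

    return errors
```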

# Actor output Schema

## `items` (type: `string`):

All comments and replies extracted in the latest run.

## `itemsCsv` (type: `string`):

All comments and replies in CSV format.

## `itemsJson` (type: `string`):

All comments and replies in JSON format.

## `consoleRun` (type: `string`):

No description

# API

You can run this Actor programmatically using our API. Below are code examples in JavaScript, Python, and CLI, as well as the OpenAPI specification and MCP server setup.

## JavaScript example

```javascript
import { ApifyClient } from 'apify-client';

// Initialize the ApifyClient with your Apify API token
// Replace the '<YOUR_API_TOKEN>' with your token
const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

// Prepare Actor input
const input = {
    "videoUrls": [
        "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
    ],
    "sortBy": "TOP",
    "maxCommentsPerVideo": 50,
    "includeReplies": true,
    "maxRepliesPerComment": 10,
    "proxyConfiguration": {
        "useApifyProxy": true
    }
};

// Run the Actor and wait for it to finish
const run = await client.actor("cirkit/youtube-comments-scraper").call(input);

// Fetch and print Actor results from the run's dataset (if any)
console.log('Results from dataset');
console.log(`💾 Check your data here: https://console.apify.com/storage/datasets/${run.defaultDatasetId}`);
const { items } = await client.dataset(run.defaultDatasetId).listItems();
items.forEach((item) => {
    console.dir(item);
});

// 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/js/docs

```

## Python example

```python
from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "videoUrls": ["https://www.youtube.com/watch?v=dQw4w9WgXcQ"],
    "sortBy": "TOP",
    "maxCommentsPerVideo": 50,
    "includeReplies": True,
    "maxRepliesPerComment": 10,
    "proxyConfiguration": { "useApifyProxy": True },
}

# Run the Actor and wait for it to finish
run = client.actor("cirkit/youtube-comments-scraper").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

# 📚 Want to learn more 📖? Go to → https://docs.apify.com/api/client/python/docs/quick-start

```

## CLI example

```bash
echo '{
  "videoUrls": [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
  ],
  "sortBy": "TOP",
  "maxCommentsPerVideo": 50,
  "includeReplies": true,
  "maxRepliesPerComment": 10,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}' |
apify call cirkit/youtube-comments-scraper --silent --output-dataset

```

## MCP server setup

```json
{
    "mcpServers": {
        "apify": {
            "command": "npx",
            "args": [
                "mcp-remote",
                "https://mcp.apify.com/?tools=cirkit/youtube-comments-scraper",
                "--header",
                "Authorization: Bearer <YOUR_API_TOKEN>"
            ]
        }
    }
}

```

## OpenAPI specification

```json
{
    "openapi": "3.0.1",
    "info": {
        "title": "YouTube Comments Scraper",
        "description": "Fast YouTube comments scraper. Paste any video URLs and extract every comment plus full reply threads. Author handle, channel id, exact like count, reply count, hearted, pinned, verified, edited flags, published time. Direct InnerTube API reads, no browser. PPE pricing.",
        "version": "0.2",
        "x-build-id": "GBUOJgj1DpSfn4D9S"
    },
    "servers": [
        {
            "url": "https://api.apify.com/v2"
        }
    ],
    "paths": {
        "/acts/cirkit~youtube-comments-scraper/run-sync-get-dataset-items": {
            "post": {
                "operationId": "run-sync-get-dataset-items-cirkit-youtube-comments-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for its completion, and returns Actor's dataset items in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        },
        "/acts/cirkit~youtube-comments-scraper/runs": {
            "post": {
                "operationId": "runs-sync-cirkit-youtube-comments-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor and returns information about the initiated run in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "$ref": "#/components/schemas/runsResponseSchema"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/acts/cirkit~youtube-comments-scraper/run-sync": {
            "post": {
                "operationId": "run-sync-cirkit-youtube-comments-scraper",
                "x-openai-isConsequential": false,
                "summary": "Executes an Actor, waits for completion, and returns the OUTPUT from Key-value store in response.",
                "tags": [
                    "Run Actor"
                ],
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "$ref": "#/components/schemas/inputSchema"
                            }
                        }
                    }
                },
                "parameters": [
                    {
                        "name": "token",
                        "in": "query",
                        "required": true,
                        "schema": {
                            "type": "string"
                        },
                        "description": "Enter your Apify token here"
                    }
                ],
                "responses": {
                    "200": {
                        "description": "OK"
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "inputSchema": {
                "type": "object",
                "required": [
                    "videoUrls"
                ],
                "properties": {
                    "videoUrls": {
                        "title": "Video URLs or IDs",
                        "minItems": 1,
                        "uniqueItems": true,
                        "type": "array",
                        "description": "Each entry can be a YouTube watch URL (https://www.youtube.com/watch?v=...), a Shorts URL, a youtu.be short link, or a bare 11-character video ID. Each video is scraped independently.",
                        "items": {
                            "type": "string"
                        }
                    },
                    "sortBy": {
                        "title": "Sort order",
                        "enum": [
                            "TOP",
                            "NEWEST"
                        ],
                        "type": "string",
                        "description": "TOP returns YouTube's ranked order (default). NEWEST returns strict reverse-chronological.",
                        "default": "TOP"
                    },
                    "maxCommentsPerVideo": {
                        "title": "Max comments per video",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Hard cap on top-level comments per video. Leave empty for unlimited (the whole comment section)."
                    },
                    "includeReplies": {
                        "title": "Include replies",
                        "type": "boolean",
                        "description": "If on, fetch the full reply thread under every top-level comment. Adds ~30% to runtime on average.",
                        "default": true
                    },
                    "maxRepliesPerComment": {
                        "title": "Max replies per comment",
                        "minimum": 1,
                        "type": "integer",
                        "description": "Hard cap on replies fetched per top-level comment. Leave empty for unlimited."
                    },
                    "proxyConfiguration": {
                        "title": "Proxy configuration",
                        "type": "object",
                        "description": "Datacenter proxy is sufficient for steady-state scraping. Switch to Residential if you see 429 throttles at very high volume.",
                        "default": {
                            "useApifyProxy": true
                        }
                    }
                }
            },
            "runsResponseSchema": {
                "type": "object",
                "properties": {
                    "data": {
                        "type": "object",
                        "properties": {
                            "id": {
                                "type": "string"
                            },
                            "actId": {
                                "type": "string"
                            },
                            "userId": {
                                "type": "string"
                            },
                            "startedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "finishedAt": {
                                "type": "string",
                                "format": "date-time",
                                "example": "2025-01-08T00:00:00.000Z"
                            },
                            "status": {
                                "type": "string",
                                "example": "READY"
                            },
                            "meta": {
                                "type": "object",
                                "properties": {
                                    "origin": {
                                        "type": "string",
                                        "example": "API"
                                    },
                                    "userAgent": {
                                        "type": "string"
                                    }
                                }
                            },
                            "stats": {
                                "type": "object",
                                "properties": {
                                    "inputBodyLen": {
                                        "type": "integer",
                                        "example": 2000
                                    },
                                    "rebootCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "restartCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "resurrectCount": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "computeUnits": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "build": {
                                        "type": "string",
                                        "example": "latest"
                                    },
                                    "timeoutSecs": {
                                        "type": "integer",
                                        "example": 300
                                    },
                                    "memoryMbytes": {
                                        "type": "integer",
                                        "example": 1024
                                    },
                                    "diskMbytes": {
                                        "type": "integer",
                                        "example": 2048
                                    }
                                }
                            },
                            "buildId": {
                                "type": "string"
                            },
                            "defaultKeyValueStoreId": {
                                "type": "string"
                            },
                            "defaultDatasetId": {
                                "type": "string"
                            },
                            "defaultRequestQueueId": {
                                "type": "string"
                            },
                            "buildNumber": {
                                "type": "string",
                                "example": "1.0.0"
                            },
                            "containerUrl": {
                                "type": "string"
                            },
                            "usage": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "integer",
                                        "example": 1
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            },
                            "usageTotalUsd": {
                                "type": "number",
                                "example": 0.00005
                            },
                            "usageUsd": {
                                "type": "object",
                                "properties": {
                                    "ACTOR_COMPUTE_UNITS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATASET_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "KEY_VALUE_STORE_WRITES": {
                                        "type": "number",
                                        "example": 0.00005
                                    },
                                    "KEY_VALUE_STORE_LISTS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_READS": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "REQUEST_QUEUE_WRITES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_INTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "DATA_TRANSFER_EXTERNAL_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_RESIDENTIAL_TRANSFER_GBYTES": {
                                        "type": "integer",
                                        "example": 0
                                    },
                                    "PROXY_SERPS": {
                                        "type": "integer",
                                        "example": 0
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
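
If you prefer not to install a client library, the `run-sync-get-dataset-items` path from the specification above can be called with the Python standard library alone. This is a sketch under stated assumptions: the helper names and the 300-second timeout are illustrative, and the response is assumed to be a JSON array of dataset items.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_PATH = "/acts/cirkit~youtube-comments-scraper/run-sync-get-dataset-items"


def build_run_request(token: str, actor_input: dict) -> urllib.request.Request:
    """Build the POST request; the token goes in the query string, as in the spec."""
    query = urllib.parse.urlencode({"token": token})
    return urllib.request.Request(
        url=f"{API_BASE}{ACTOR_PATH}?{query}",
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_and_fetch(token: str, actor_input: dict, timeout: float = 300):
    """Run the Actor synchronously and return the parsed response body."""
    request = build_run_request(token, actor_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A synchronous call holds the connection open for the whole run, so for large comment sections the `/runs` path followed by a separate dataset read may be the safer choice.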
