Tiktok Comments Dataset - Research Grade cookieless
Pricing: from $1.50 / 1,000 results
Extract high-fidelity TikTok comment datasets with granular metadata, capturing hidden engagement metrics, timestamps, and user interaction fields. The output is structured and analysis-ready, suited to social media sentiment and performance research.
Developer: Surge Street
Tiktok Comments Dataset - Research Grade
Overview
This actor performs a deep extraction of TikTok comment data from specified video posts, delivering research-grade structured datasets optimized for sentiment analysis, engagement modeling, and audience behavior studies. The extraction pipeline ensures data integrity through validation checkpoints, timestamp normalization, and comprehensive metadata capture. All records include author attribution, engagement metrics, and moderation status for complete analytical coverage.
Data Dictionary
| Field Name | Data Type | Definition |
|---|---|---|
| comment_id | String | Unique identifier assigned to the comment by the extraction system |
| post_id | String | Identifier of the parent TikTok video post containing this comment |
| external_id | String | Platform-native comment identifier from TikTok's internal system |
| scraped_at | String (ISO 8601) | UTC timestamp indicating when the comment was extracted from the platform |
| content | String | Full text body of the comment as authored by the user |
| created_at | String (ISO 8601) | UTC timestamp when the comment was originally published on TikTok |
| language_code | String | ISO 639-1 two-letter language code detected from comment content |
| sentiment_score | Float | Normalized sentiment polarity score ranging from -1.0 (negative) to 1.0 (positive) |
| is_edited | Boolean | Indicates whether the comment has been modified after initial publication |
| is_hidden | Boolean | Indicates whether the comment is hidden from public view by platform or author |
| is_flagged | Boolean | Indicates whether the comment has been flagged for moderation review |
| author.user_id | String | Unique identifier for the comment author's TikTok account |
| author.username | String | Display username of the comment author |
| author.is_verified | Boolean | Indicates whether the author account has official verification status |
| author.follower_count | Integer | Total number of followers on the author's account at time of extraction |
| author.account_type | String | Classification of account type (e.g., personal, professional, business) |
| engagement.likes | Integer | Total number of likes received by the comment |
| engagement.replies | Integer | Total number of direct replies to this comment |
| engagement.shares | Integer | Total number of times the comment has been shared |
| metadata.client_type | String | Platform and device type used to post the comment (e.g., mobile_ios, web) |
| metadata.app_version | String | TikTok application version number used by the commenter |
| metadata.ip_location | String | ISO 3166-1 alpha-2 country code derived from IP geolocation |
| metadata.device_id | String | Anonymized device identifier for client tracking |
| moderation_status | String | Current moderation state (e.g., approved, pending, rejected) |
| thread_position | Integer | Ordinal position of comment within its thread hierarchy |
| parent_comment_id | String (Nullable) | Reference to parent comment ID if this is a reply; null for top-level comments |
Sample Dataset
Below is a sample of the high-fidelity JSON output:
{"comment_id": "987654321098765","post_id": "456789123456789","external_id": "c_98765432109876543210","scraped_at": "2025-12-19T15:22:31Z","content": "This analysis really helped me understand the market trends!","created_at": "2025-12-19T14:30:00Z","language_code": "en","sentiment_score": 0.87,"is_edited": false,"is_hidden": false,"is_flagged": false,"author": {"user_id": "123456789","username": "market_analyst","is_verified": true,"follower_count": 15234,"account_type": "professional"},"engagement": {"likes": 342,"replies": 28,"shares": 15},"metadata": {"client_type": "mobile_ios","app_version": "2.14.0","ip_location": "US","device_id": "iphone13_pro"},"moderation_status": "approved","thread_position": 1,"parent_comment_id": null}
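Because `author`, `engagement`, and `metadata` are nested objects in the output while the data dictionary lists them with dotted names, it can help to flatten each record before loading it into a table. The following is a minimal sketch using only the standard library; the `flatten` helper is illustrative and not part of the actor, and the record is an abbreviated copy of the sample above.

```python
import json

# Abbreviated copy of the sample record (several fields omitted for brevity).
SAMPLE = """{"comment_id": "987654321098765", "post_id": "456789123456789",
 "sentiment_score": 0.87,
 "author": {"user_id": "123456789", "username": "market_analyst", "follower_count": 15234},
 "engagement": {"likes": 342, "replies": 28, "shares": 15},
 "parent_comment_id": null}"""


def flatten(record, prefix=""):
    """Flatten nested objects into dotted keys such as 'author.username'."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted prefix.
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


row = flatten(json.loads(SAMPLE))
print(row["author.username"], row["engagement.likes"])
```

The flattened keys match the field names in the data dictionary, so a row can be passed directly to a CSV writer or a dataframe constructor.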
Configuration Parameters
The actor takes the following input parameter:
| Parameter | JSON Field Name | Data Type | Example Value | Description |
|---|---|---|---|---|
| Video ID | videoId | String | 345678765678 | TikTok video identifier from which comments will be extracted |
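As a rough illustration, the input for a run can be assembled as a small JSON object keyed by `videoId`. The `build_input` helper below is hypothetical, and the numeric check is an assumption based only on the example value in the table; adjust it if your video IDs take another form.

```python
import json


def build_input(video_id):
    """Return the actor input as a dict, using the documented videoId field."""
    video_id = str(video_id).strip()
    # Assumption: video IDs are purely numeric, as in the example value.
    if not video_id.isdigit():
        raise ValueError(f"expected a numeric video ID, got {video_id!r}")
    # The table lists the data type as String, so the ID is sent as text.
    return {"videoId": video_id}


run_input = build_input(345678765678)
print(json.dumps(run_input))
```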
Analytical Use Cases
Sentiment Analysis: Leverage sentiment_score and content fields to perform aggregate sentiment tracking across viral content, identifying emotional response patterns and audience reception trends over time.
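One simple way to track sentiment over time is to average `sentiment_score` per calendar day of `created_at`. The sketch below assumes records shaped like the sample output; the three records and the `daily_sentiment` helper are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical records carrying only the two fields this sketch needs.
comments = [
    {"created_at": "2025-12-19T14:30:00Z", "sentiment_score": 0.87},
    {"created_at": "2025-12-19T18:05:00Z", "sentiment_score": -0.25},
    {"created_at": "2025-12-20T09:00:00Z", "sentiment_score": 0.5},
]


def daily_sentiment(records):
    """Return {ISO date: mean sentiment_score} ordered by date."""
    by_day = defaultdict(list)
    for record in records:
        # Swap the trailing 'Z' so fromisoformat also works before Python 3.11.
        ts = datetime.fromisoformat(record["created_at"].replace("Z", "+00:00"))
        by_day[ts.date().isoformat()].append(record["sentiment_score"])
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


for day, score in daily_sentiment(comments).items():
    print(day, round(score, 2))
```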
Engagement Modeling: Utilize engagement metrics (likes, replies, shares) in conjunction with author.follower_count and account_type to build predictive models for comment virality and influence propagation.
Network Mapping: Construct reply-thread networks using parent_comment_id and thread_position to visualize conversation structures and identify key opinion leaders within comment ecosystems.
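A reply network can be sketched as an adjacency map from each comment to its direct replies. This example assumes that `parent_comment_id` refers to the `comment_id` field (rather than `external_id`) and that sorting by `thread_position` gives a sensible reply order; neither is confirmed by the data dictionary. The records and helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical thread: c1 is top-level, c2 and c3 reply to c1, c4 replies to c2.
comments = [
    {"comment_id": "c1", "parent_comment_id": None, "thread_position": 1},
    {"comment_id": "c2", "parent_comment_id": "c1", "thread_position": 2},
    {"comment_id": "c3", "parent_comment_id": "c1", "thread_position": 3},
    {"comment_id": "c4", "parent_comment_id": "c2", "thread_position": 4},
]


def build_reply_graph(records):
    """Map each parent ID (None for top-level) to its child comment IDs."""
    children = defaultdict(list)
    for record in sorted(records, key=lambda r: r["thread_position"]):
        children[record["parent_comment_id"]].append(record["comment_id"])
    return dict(children)


def thread_depth(children, comment_id):
    """Number of levels in the thread rooted at comment_id, counting the root."""
    replies = children.get(comment_id, [])
    return 1 + max((thread_depth(children, c) for c in replies), default=0)


graph = build_reply_graph(comments)
print(graph[None], thread_depth(graph, "c1"))
```

Comments with many direct replies, or roots of deep threads, are natural starting points when looking for influential participants.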
Longitudinal Studies: Track temporal patterns using created_at and scraped_at timestamps to analyze comment velocity, engagement decay curves, and content lifecycle dynamics.
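Since every record carries both timestamps, the age of a comment at extraction time, and a crude engagement rate, can be derived directly. The sketch below uses the timestamps and like count from the sample record; `likes_per_hour` is an illustrative metric, not something the actor outputs.

```python
from datetime import datetime


def parse_utc(ts):
    """Parse an ISO 8601 UTC timestamp such as '2025-12-19T14:30:00Z'."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def age_seconds(record):
    """Seconds between publication (created_at) and extraction (scraped_at)."""
    delta = parse_utc(record["scraped_at"]) - parse_utc(record["created_at"])
    return delta.total_seconds()


def likes_per_hour(record):
    """Average likes accumulated per hour of comment age at extraction."""
    hours = age_seconds(record) / 3600
    return record["engagement"]["likes"] / hours if hours > 0 else 0.0


sample = {
    "created_at": "2025-12-19T14:30:00Z",
    "scraped_at": "2025-12-19T15:22:31Z",
    "engagement": {"likes": 342},
}
print(age_seconds(sample), round(likes_per_hour(sample), 2))
```

Scraping the same post at several points in time and comparing these rates is one way to approximate an engagement decay curve.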
Cross-Platform Benchmarking: Compare metadata.client_type distributions to understand platform usage patterns and optimize content delivery strategies for specific device audiences.
Content Moderation Research: Analyze moderation_status, is_flagged, and is_hidden fields to study platform governance patterns and community standards enforcement.
Technical Limitations
Important Considerations:
- Comment extraction is limited to publicly accessible content; private or restricted posts will return null datasets
- Platform rate limiting may restrict extraction velocity to 100 requests per minute per IP address
- Historical comment data is available for posts up to 90 days old; older content may have incomplete engagement metrics
- Deleted comments are not recoverable and will not appear in extraction results
- Sentiment scores are algorithmically generated and should be validated against domain-specific lexicons for specialized content
- Author metadata reflects point-in-time values at extraction; follower counts and verification status may change
- Nested reply threads deeper than 5 levels may experience incomplete capture due to API pagination constraints
- Geographic IP location data provides country-level granularity only; city or region-level precision is not available
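If you issue requests from your own code alongside the actor, a client-side throttle can keep you under the stated ceiling of 100 requests per minute. The following is a minimal sliding-window sketch, not the actor's own mechanism; the `RateLimiter` class is hypothetical, and its `clock` and `sleep` parameters exist only so the behavior can be tested without real waiting.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls within any rolling window of period seconds."""

    def __init__(self, max_calls=100, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until another call is permitted, then record it."""
        now = self.clock()
        # Discard timestamps that have left the rolling window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest recorded call expires, then drop it.
            self.sleep(self.period - (now - self.calls[0]))
            self.calls.popleft()
            now = self.clock()
        self.calls.append(now)


limiter = RateLimiter()
for _ in range(3):
    limiter.wait()  # returns immediately while under the limit
print(len(limiter.calls))
```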