Tiktok Comments Dataset - Research Grade cookieless
Under maintenance

Pricing

from $1.50 / 1,000 results

Extract high-fidelity TikTok comment datasets with granular metadata, capturing hidden engagement metrics, timestamps, and user interaction fields. The output is structured and analysis-ready, intended for social media sentiment and performance research.

Rating: 0.0 (0)

Developer: Surge Street (Maintained by Community)

Actor stats

Bookmarked: 0
Total users: 1
Monthly active users: 0
Last modified: 4 days ago

Tiktok Comments Dataset - Research Grade

Overview

This actor performs a deep extraction of TikTok comment data from specified video posts, delivering research-grade structured datasets optimized for sentiment analysis, engagement modeling, and audience behavior studies. The extraction pipeline ensures data integrity through validation checkpoints, timestamp normalization, and comprehensive metadata capture. All records include author attribution, engagement metrics, and moderation status for complete analytical coverage.

Data Dictionary

| Field Name | Data Type | Definition |
| --- | --- | --- |
| comment_id | String | Unique identifier assigned to the comment by the extraction system |
| post_id | String | Identifier of the parent TikTok video post containing this comment |
| external_id | String | Platform-native comment identifier from TikTok's internal system |
| scraped_at | String (ISO 8601) | UTC timestamp indicating when the comment was extracted from the platform |
| content | String | Full text body of the comment as authored by the user |
| created_at | String (ISO 8601) | UTC timestamp when the comment was originally published on TikTok |
| language_code | String | ISO 639-1 two-letter language code detected from comment content |
| sentiment_score | Float | Normalized sentiment polarity score ranging from -1.0 (negative) to 1.0 (positive) |
| is_edited | Boolean | Indicates whether the comment has been modified after initial publication |
| is_hidden | Boolean | Indicates whether the comment is hidden from public view by platform or author |
| is_flagged | Boolean | Indicates whether the comment has been flagged for moderation review |
| author.user_id | String | Unique identifier for the comment author's TikTok account |
| author.username | String | Display username of the comment author |
| author.is_verified | Boolean | Indicates whether the author account has official verification status |
| author.follower_count | Integer | Total number of followers on the author's account at time of extraction |
| author.account_type | String | Classification of account type (e.g., personal, professional, business) |
| engagement.likes | Integer | Total number of likes received by the comment |
| engagement.replies | Integer | Total number of direct replies to this comment |
| engagement.shares | Integer | Total number of times the comment has been shared |
| metadata.client_type | String | Platform and device type used to post the comment (e.g., mobile_ios, web) |
| metadata.app_version | String | TikTok application version number used by the commenter |
| metadata.ip_location | String | ISO 3166-1 alpha-2 country code derived from IP geolocation |
| metadata.device_id | String | Anonymized device identifier for client tracking |
| moderation_status | String | Current moderation state (e.g., approved, pending, rejected) |
| thread_position | Integer | Ordinal position of comment within its thread hierarchy |
| parent_comment_id | String (Nullable) | Reference to parent comment ID if this is a reply; null for top-level comments |

Sample Dataset

Below is a sample of the high-fidelity JSON output:

{
  "comment_id": "987654321098765",
  "post_id": "456789123456789",
  "external_id": "c_98765432109876543210",
  "scraped_at": "2025-12-19T15:22:31Z",
  "content": "This analysis really helped me understand the market trends!",
  "created_at": "2025-12-19T14:30:00Z",
  "language_code": "en",
  "sentiment_score": 0.87,
  "is_edited": false,
  "is_hidden": false,
  "is_flagged": false,
  "author": {
    "user_id": "123456789",
    "username": "market_analyst",
    "is_verified": true,
    "follower_count": 15234,
    "account_type": "professional"
  },
  "engagement": {
    "likes": 342,
    "replies": 28,
    "shares": 15
  },
  "metadata": {
    "client_type": "mobile_ios",
    "app_version": "2.14.0",
    "ip_location": "US",
    "device_id": "iphone13_pro"
  },
  "moderation_status": "approved",
  "thread_position": 1,
  "parent_comment_id": null
}
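
For tabular analysis it is often convenient to flatten the nested author, engagement, and metadata objects into the dotted names used in the Data Dictionary. The following is a minimal sketch, assuming each record arrives as a JSON object shaped like the sample; the `flatten` helper is hypothetical and not part of the actor.

```python
import json

def flatten(record, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects such as "author" or "engagement".
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat

# Abbreviated copy of the sample record above.
raw = """
{
  "comment_id": "987654321098765",
  "author": {"username": "market_analyst", "follower_count": 15234},
  "engagement": {"likes": 342, "replies": 28, "shares": 15},
  "parent_comment_id": null
}
"""

row = flatten(json.loads(raw))
print(row["author.username"], row["engagement.likes"])  # market_analyst 342
```

The flattened keys match the dotted field names in the dictionary, so the rows can be written directly to CSV or loaded into a dataframe.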

Configuration Parameters

To ensure optimal data depth, configure the following:

| Parameter | JSON Field Name | Data Type | Example Value | Description |
| --- | --- | --- | --- | --- |
| Video ID | videoId | String | 345678765678 | TikTok video identifier from which comments will be extracted |
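
As an illustration, the input for a run could be assembled as below. This is a sketch only: `build_run_input` is a hypothetical helper, and the check that the ID is purely numeric is an assumption based on the example value, not a documented rule.

```python
import json

def build_run_input(video_id):
    """Return the actor input as a dict containing the single documented field."""
    video_id = str(video_id).strip()
    # Assumption: video IDs are numeric strings, as in the example value.
    if not video_id.isdigit():
        raise ValueError(f"unexpected videoId: {video_id!r}")
    # The parameter table lists videoId as a String, so it is not sent as a number.
    return {"videoId": video_id}

print(json.dumps(build_run_input(345678765678)))
```

The resulting JSON object is what would be supplied as the actor's input, whether through the Apify Console or an API client.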

Analytical Use Cases

Sentiment Analysis: Leverage sentiment_score and content fields to perform aggregate sentiment tracking across viral content, identifying emotional response patterns and audience reception trends over time.
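
A minimal sketch of aggregate sentiment tracking, assuming records shaped like the sample output. The three records below are made up for illustration, and `daily_sentiment` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def parse_ts(value):
    """Parse an ISO 8601 UTC timestamp such as 2025-12-19T14:30:00Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def daily_sentiment(records):
    """Return the mean sentiment_score per UTC calendar day."""
    buckets = defaultdict(list)
    for r in records:
        day = parse_ts(r["created_at"]).date().isoformat()
        buckets[day].append(r["sentiment_score"])
    return {day: mean(scores) for day, scores in sorted(buckets.items())}

# Illustrative records, not real data.
records = [
    {"created_at": "2025-12-19T14:30:00Z", "sentiment_score": 0.87},
    {"created_at": "2025-12-19T18:05:00Z", "sentiment_score": -0.25},
    {"created_at": "2025-12-20T09:00:00Z", "sentiment_score": 0.5},
]
for day, score in daily_sentiment(records).items():
    print(day, round(score, 2))
```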

Engagement Modeling: Utilize engagement metrics (likes, replies, shares) in conjunction with author.follower_count and account_type to build predictive models for comment virality and influence propagation.
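
As a starting point for such models, one could derive a few simple features per comment. The sketch below assumes the nested record shape from the sample; the feature names and the per-1,000-followers normalization are illustrative choices, not something the actor provides.

```python
def engagement_features(record):
    """Derive basic engagement features from one comment record."""
    eng = record["engagement"]
    author = record["author"]
    total = eng["likes"] + eng["replies"] + eng["shares"]
    followers = author["follower_count"]
    return {
        "total_interactions": total,
        # Normalize by audience size; undefined for authors with no followers.
        "interactions_per_1k_followers": 1000 * total / followers if followers else None,
        "account_type": author["account_type"],
    }

# Values taken from the sample record.
sample = {
    "author": {"follower_count": 15234, "account_type": "professional"},
    "engagement": {"likes": 342, "replies": 28, "shares": 15},
}
print(engagement_features(sample))
```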

Network Mapping: Construct reply-thread networks using parent_comment_id and thread_position to visualize conversation structures and identify key opinion leaders within comment ecosystems.
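
One way to sketch the reply-thread structure is an index from each parent to its replies. This assumes that parent_comment_id refers to the comment_id of the parent (the dictionary does not say which identifier it uses) and that thread_position orders replies within a thread; the records are made up.

```python
from collections import defaultdict

def build_reply_index(records):
    """Map each parent_comment_id to its replies, ordered by thread_position."""
    children = defaultdict(list)
    for r in records:
        # Top-level comments are grouped under the key None.
        children[r["parent_comment_id"]].append(r)
    for replies in children.values():
        replies.sort(key=lambda r: r["thread_position"])
    return children

def most_replied(records, top=3):
    """Return (comment_id, reply_count) pairs for the most-replied comments."""
    children = build_reply_index(records)
    counts = {pid: len(rs) for pid, rs in children.items() if pid is not None}
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Illustrative records, not real data.
records = [
    {"comment_id": "a", "parent_comment_id": None, "thread_position": 1},
    {"comment_id": "b", "parent_comment_id": "a", "thread_position": 2},
    {"comment_id": "c", "parent_comment_id": "a", "thread_position": 1},
    {"comment_id": "d", "parent_comment_id": "c", "thread_position": 1},
]
print(most_replied(records))
```

Comments with the most direct replies are only a rough proxy for key opinion leaders; a fuller analysis would aggregate by author.user_id.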

Longitudinal Studies: Track temporal patterns using created_at and scraped_at timestamps to analyze comment velocity, engagement decay curves, and content lifecycle dynamics.
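
A small sketch of two temporal measures: comment velocity (comments per hour) and the age of a comment at scrape time. It assumes both timestamps are UTC ISO 8601 strings as stated in the Data Dictionary; the helper names are hypothetical.

```python
from collections import Counter
from datetime import datetime

def parse_ts(value):
    """Parse an ISO 8601 UTC timestamp such as 2025-12-19T14:30:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def comments_per_hour(records):
    """Count comments per UTC hour bucket, in chronological order."""
    counts = Counter(
        parse_ts(r["created_at"]).strftime("%Y-%m-%dT%H:00Z") for r in records
    )
    return dict(sorted(counts.items()))

def age_at_scrape_seconds(record):
    """Seconds elapsed between publication and extraction."""
    delta = parse_ts(record["scraped_at"]) - parse_ts(record["created_at"])
    return delta.total_seconds()

# Timestamps taken from the sample record.
sample = {"created_at": "2025-12-19T14:30:00Z", "scraped_at": "2025-12-19T15:22:31Z"}
print(age_at_scrape_seconds(sample))  # 3151.0
```

Because engagement counts are point-in-time values, the age at scrape is useful for comparing likes or replies across comments that had different amounts of time to accumulate them.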

Cross-Platform Benchmarking: Compare metadata.client_type distributions to understand platform usage patterns and optimize content delivery strategies for specific device audiences.
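
The distribution itself is a simple frequency count. A sketch, assuming records carry the nested metadata object from the sample (the records below are made up):

```python
from collections import Counter

def client_type_shares(records):
    """Return the fraction of comments posted from each client type."""
    counts = Counter(r["metadata"]["client_type"] for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {client: n / total for client, n in counts.items()}

# Illustrative records, not real data.
records = [
    {"metadata": {"client_type": "mobile_ios"}},
    {"metadata": {"client_type": "mobile_ios"}},
    {"metadata": {"client_type": "mobile_ios"}},
    {"metadata": {"client_type": "web"}},
]
print(client_type_shares(records))
```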

Content Moderation Research: Analyze moderation_status, is_flagged, and is_hidden fields to study platform governance patterns and community standards enforcement.
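
A minimal summary of these moderation fields might look like the following sketch. It assumes the Boolean and string fields are populated as in the sample; the records are made up and `moderation_summary` is a hypothetical helper.

```python
from collections import Counter

def moderation_summary(records):
    """Summarize moderation_status counts and flagged/hidden rates."""
    n = len(records)
    if n == 0:
        return {"status_counts": {}, "flagged_rate": None, "hidden_rate": None}
    return {
        "status_counts": dict(Counter(r["moderation_status"] for r in records)),
        # Booleans sum as 0/1, so these are simple proportions.
        "flagged_rate": sum(r["is_flagged"] for r in records) / n,
        "hidden_rate": sum(r["is_hidden"] for r in records) / n,
    }

# Illustrative records, not real data.
records = [
    {"moderation_status": "approved", "is_flagged": False, "is_hidden": False},
    {"moderation_status": "approved", "is_flagged": True, "is_hidden": False},
    {"moderation_status": "pending", "is_flagged": True, "is_hidden": True},
    {"moderation_status": "rejected", "is_flagged": False, "is_hidden": True},
]
print(moderation_summary(records))
```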

Technical Limitations

Important Considerations:

  • Comment extraction is limited to publicly accessible content; private or restricted posts will return null datasets
  • Platform rate limiting may restrict extraction velocity to 100 requests per minute per IP address
  • Historical comment data is available for posts up to 90 days old; older content may have incomplete engagement metrics
  • Deleted comments are not recoverable and will not appear in extraction results
  • Sentiment scores are algorithmically generated and should be validated against domain-specific lexicons for specialized content
  • Author metadata reflects point-in-time values at extraction; follower counts and verification status may change
  • Nested reply threads deeper than 5 levels may experience incomplete capture due to API pagination constraints
  • Geographic IP location data provides country-level granularity only; city or region-level precision is not available
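
To see which comments fall under the nested-reply limitation above, one could compute each comment's depth and flag anything deeper than 5 levels as possibly incomplete. This is a sketch under two assumptions: parent_comment_id refers to the parent's comment_id, and a top-level comment counts as depth 1. The records are made up.

```python
def thread_depths(records):
    """Return {comment_id: depth}, where a top-level comment has depth 1."""
    parent_of = {r["comment_id"]: r["parent_comment_id"] for r in records}
    depths = {}
    for cid in parent_of:
        depth, current, seen = 1, cid, {cid}
        # Walk up the parent chain until a top-level or missing parent is reached.
        while parent_of.get(current) is not None:
            current = parent_of[current]
            if current in seen:  # guard against malformed, cyclic references
                break
            seen.add(current)
            depth += 1
        depths[cid] = depth
    return depths

def possibly_incomplete(records, max_depth=5):
    """List comment_ids nested deeper than max_depth levels."""
    return sorted(cid for cid, d in thread_depths(records).items() if d > max_depth)

# Illustrative chain of six nested comments: a <- b <- c <- d <- e <- f.
ids = ["a", "b", "c", "d", "e", "f"]
records = [
    {"comment_id": cid, "parent_comment_id": ids[i - 1] if i else None}
    for i, cid in enumerate(ids)
]
print(possibly_incomplete(records))  # ['f']
```

A parent that is absent from the dataset (for example, a deleted comment) still counts as one level here, so depths are a lower bound when ancestors are missing.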

Keywords & Tags: research-grade comments scraper, post comments extractor, social media comment scraping, export post comments, web scraper for comments, comment data extraction, social listening data, retrieve post comments.