Twitter (X) Post Comments Extractor (Rich Metadata) cookieless
Pricing
from $1.50 / 1,000 results
Extract high-fidelity Twitter post comments with granular metadata, capturing hidden engagement fields, user interactions, and precise timestamps. Cookieless extraction enables comprehensive sentiment analysis and strategic content optimization.
Rating: 0.0 (0)
Developer: Surge Street
Actor stats
Bookmarked: 0
Total users: 1
Monthly active users: 0
Last modified: 18 hours ago
# Twitter (X.com) Post Comments Extractor (Rich Metadata) cookieless
## Overview
This actor extracts comment-level data from Twitter (X.com) posts, capturing metadata that includes engagement metrics, author attributes, sentiment indicators, and temporal information. Extraction runs without authentication cookies. Every record follows the schema in the data dictionary below, with timestamps and nested objects for the author profile, engagement statistics, and platform metadata to support downstream analytical workflows.
## Data Dictionary
| Field Name | Data Type | Definition |
|---|---|---|
| post_id | String | Unique identifier for the parent post to which the comment belongs |
| external_id | String | Platform-generated unique identifier for the individual comment record |
| scraped_at | String (ISO 8601) | UTC timestamp indicating when the comment data was extracted from the platform |
| parent_thread_id | String | Identifier for the conversation thread containing this comment |
| content | String | Full text content of the comment as posted by the user |
| language_code | String | ISO 639-1 two-letter language code detected for the comment content |
| sentiment_score | Float | Normalized sentiment polarity score ranging from -1.0 (negative) to 1.0 (positive) |
| is_verified | Boolean | Indicates whether the comment author has platform verification status |
| is_edited | Boolean | Flag indicating if the comment has been modified after initial posting |
| comment_depth | Integer | Nesting level of the comment within the thread hierarchy (0 = top-level) |
| author.user_id | String | Unique platform identifier for the comment author |
| author.username | String | Display username or handle of the comment author |
| author.reputation_score | Integer | Numerical representation of author's platform credibility or karma |
| author.is_premium | Boolean | Indicates whether the author holds a premium or subscription account status |
| engagement_metrics.upvotes | Integer | Count of positive reactions or likes received by the comment |
| engagement_metrics.downvotes | Integer | Count of negative reactions received by the comment |
| engagement_metrics.replies | Integer | Number of direct replies to this comment |
| engagement_metrics.shares | Integer | Count of times the comment has been shared or reposted |
| metadata.client_version | String | Version identifier of the client application used to post the comment |
| metadata.platform | String | Platform interface type (web, mobile_app, api) |
| metadata.device_type | String | Device category from which the comment was posted (desktop, mobile, tablet) |
| metadata.ip_region | String | Geographic region code derived from the posting IP address |
| timestamps.created_at | String (ISO 8601) | UTC timestamp of the original comment creation |
| timestamps.last_edited | String (ISO 8601) | UTC timestamp of the most recent edit operation |
| timestamps.last_activity | String (ISO 8601) | UTC timestamp of the most recent engagement activity on the comment |
## Sample Dataset
Below is a sample of the high-fidelity JSON output:

    {
      "post_id": "p_87654321",
      "external_id": "comment_15f8a9c2d3b4e5",
      "scraped_at": "2025-12-21T15:22:33Z",
      "parent_thread_id": "thread_98765432",
      "content": "This analysis perfectly captures the market trends!",
      "language_code": "en",
      "sentiment_score": 0.87,
      "is_verified": true,
      "is_edited": false,
      "comment_depth": 2,
      "author": {
        "user_id": "u_123456789",
        "username": "market_analyst",
        "reputation_score": 856,
        "is_premium": true
      },
      "engagement_metrics": {
        "upvotes": 234,
        "downvotes": 12,
        "replies": 15,
        "shares": 8
      },
      "metadata": {
        "client_version": "2.14.5",
        "platform": "web",
        "device_type": "desktop",
        "ip_region": "EUR"
      },
      "timestamps": {
        "created_at": "2025-12-21T14:30:00Z",
        "last_edited": "2025-12-21T14:45:22Z",
        "last_activity": "2025-12-21T15:20:11Z"
      }
    }

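To load records into a tabular tool, the nested objects can be flattened into dotted keys that match the field names in the data dictionary. The following is a minimal Python sketch, not part of the actor itself: `flatten` and `parse_utc` are hypothetical helper names, and the embedded record is a trimmed copy of the sample above.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the sample record from this section.
RAW = """
{
  "post_id": "p_87654321",
  "external_id": "comment_15f8a9c2d3b4e5",
  "scraped_at": "2025-12-21T15:22:33Z",
  "sentiment_score": 0.87,
  "author": {"user_id": "u_123456789", "username": "market_analyst"},
  "engagement_metrics": {"upvotes": 234, "downvotes": 12},
  "timestamps": {"created_at": "2025-12-21T14:30:00Z"}
}
"""


def flatten(record, prefix=""):
    """Flatten nested objects into dotted keys such as 'author.user_id'."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def parse_utc(value):
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' string into a timezone-aware datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


row = flatten(json.loads(RAW))

# Seconds elapsed between comment creation and extraction.
created = parse_utc(row["timestamps.created_at"])
scraped = parse_utc(row["scraped_at"])
age_seconds = (scraped - created).total_seconds()

print(row["author.username"], row["engagement_metrics.upvotes"], age_seconds)
```

The flattened keys can be used directly as column names, for example when writing rows with `csv.DictWriter`.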
## Configuration Parameters
The actor accepts the following input parameter:
| Parameter | JSON Field Name | Data Type | Required | Description | Example Value |
|---|---|---|---|---|---|
| Tweet ID | tweetId | String | Yes | The unique identifier of the Twitter post from which to extract comments | "1738106896777699464" |
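A small helper can build the input object and catch malformed IDs before a run is started. This is a sketch under assumptions: `tweet_id_from` and `build_run_input` are hypothetical names, the actor itself is documented to accept only the ID (the URL handling here is a local convenience), and the example username in the URL is made up.

```python
import json
import re

# Matches the numeric ID in post URLs of the form .../<user>/status/<id>
STATUS_RE = re.compile(r"/status/([0-9]+)")


def tweet_id_from(value):
    """Return the numeric post ID from a bare ID or a post URL."""
    value = str(value).strip()
    if re.fullmatch(r"[0-9]+", value):
        return value
    match = STATUS_RE.search(value)
    if match is None:
        raise ValueError(f"no tweet ID found in {value!r}")
    return match.group(1)


def build_run_input(value):
    """Build the actor input; tweetId stays a string, as in the table above."""
    return {"tweetId": tweet_id_from(value)}


run_input = build_run_input(
    "https://x.com/example_user/status/1738106896777699464"
)
print(json.dumps(run_input))
```

Keeping `tweetId` as a string matters because IDs of this size exceed the range that JSON consumers using double-precision numbers can represent exactly.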
## Analytical Use Cases
Researchers and data scientists can leverage this dataset for multiple analytical workflows:
- Sentiment Analysis: Utilize the `sentiment_score` field in conjunction with `content` to perform aggregate sentiment tracking across comment threads, enabling identification of positive/negative discourse patterns.
- Engagement Pattern Analysis: Correlate `engagement_metrics` with `author.reputation_score` and `timestamps` to identify high-performing comment characteristics and optimal posting windows.
- Network Mapping: Construct social graphs using `author.user_id`, `parent_thread_id`, and `comment_depth` to visualize conversation hierarchies and identify influential community members.
- Longitudinal Studies: Track temporal evolution of discussions using `timestamps.created_at` and `timestamps.last_activity` to measure conversation decay rates and sustained engagement periods.
- Content Strategy Optimization: Analyze `language_code`, `content` length, and `engagement_metrics` to determine which comment styles drive higher interaction rates.
- Verification Impact Studies: Compare engagement patterns between verified (`is_verified: true`) and non-verified authors to quantify credibility effects on audience response.
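Two of the workflows above, aggregate sentiment per thread and the verified versus non-verified comparison, reduce to a grouped average. The sketch below assumes records shaped like the sample output; `mean_by` is a hypothetical helper and the three records are invented for illustration, not real extraction results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, reduced to the fields used below.
records = [
    {"parent_thread_id": "t1", "sentiment_score": 0.5, "is_verified": True,
     "engagement_metrics": {"upvotes": 10}},
    {"parent_thread_id": "t1", "sentiment_score": 0.25, "is_verified": False,
     "engagement_metrics": {"upvotes": 4}},
    {"parent_thread_id": "t2", "sentiment_score": -0.5, "is_verified": True,
     "engagement_metrics": {"upvotes": 30}},
]


def mean_by(items, key_fn, value_fn):
    """Group items by key_fn(item) and average value_fn(item) per group."""
    groups = defaultdict(list)
    for item in items:
        groups[key_fn(item)].append(value_fn(item))
    return {key: mean(values) for key, values in groups.items()}


# Average sentiment per conversation thread.
sentiment_by_thread = mean_by(
    records,
    lambda r: r["parent_thread_id"],
    lambda r: r["sentiment_score"],
)

# Average upvotes for verified (True) and non-verified (False) authors.
upvotes_by_verified = mean_by(
    records,
    lambda r: r["is_verified"],
    lambda r: r["engagement_metrics"]["upvotes"],
)

print(sentiment_by_thread)
print(upvotes_by_verified)
```

A plain mean is only a starting point; with real data, group sizes and outliers would need to be taken into account before drawing conclusions about credibility effects.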
## Technical Limitations
Important Considerations:
- Rate Limiting: Extraction operates within platform-imposed rate limits; high-volume requests may experience throttling or temporary access restrictions.
- Data Freshness: The `scraped_at` timestamp reflects extraction time; rapidly evolving threads may show discrepancies between `engagement_metrics` and real-time values.
- Deleted Content: Comments removed after extraction will not be retroactively flagged; implement periodic re-scraping for data integrity validation.
- Nested Thread Depth: Extremely deep comment threads (depth > 10) may experience incomplete extraction due to platform pagination constraints.
- Sentiment Accuracy: The `sentiment_score` represents algorithmic inference and may not capture nuanced sarcasm, cultural context, or domain-specific terminology.
- Geographic Precision: The `ip_region` field provides regional-level granularity only; city or precise location data is not available.
- Historical Data: Extraction is limited to currently accessible comments; platform retention policies may restrict access to older content beyond 12-18 months.
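The periodic re-scraping suggested under Deleted Content can be checked by comparing `external_id` values between two runs for the same post. This is a rough sketch: `missing_after_rescrape` is a hypothetical helper and the IDs are invented. A missing ID only marks a candidate, since it may also result from throttling or incomplete pagination rather than deletion.

```python
def missing_after_rescrape(previous, current):
    """Return external_id values seen in the earlier run but not the later one."""
    current_ids = {record["external_id"] for record in current}
    return sorted(
        {
            record["external_id"]
            for record in previous
            if record["external_id"] not in current_ids
        }
    )


# Hypothetical results of two runs against the same post.
first_run = [
    {"external_id": "comment_a"},
    {"external_id": "comment_b"},
    {"external_id": "comment_c"},
]
second_run = [
    {"external_id": "comment_a"},
    {"external_id": "comment_c"},
    {"external_id": "comment_d"},
]

candidates = missing_after_rescrape(first_run, second_run)
print(candidates)
```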
Keywords & Tags: This specification supports workflows involving post comments scraper, social media comments extractor, Instagram post comments scraping, Facebook post comments scraper, export comments from posts, comment data scraping tool, and social engagement analytics for comprehensive audience insight generation.