Twitter (X) Post Comments Extractor (Rich Metadata) cookieless

Pricing

from $1.50 / 1,000 results

Extract comments from Twitter (X) posts with granular metadata, including engagement fields, author details, and precise timestamps. Extraction runs without login cookies, and the output is structured for sentiment analysis and content optimization.

Rating

0.0

(0)

Developer

Surge Street

Maintained by Community

Actor stats

0

Bookmarked

1

Total users

0

Monthly active users

18 hours ago

Last modified

# Twitter (X.com) Post Comments Extractor (Rich Metadata) cookieless

Overview

This actor extracts comment-level data from Twitter (X.com) posts, capturing engagement metrics, author attributes, sentiment indicators, and temporal information. Extraction runs without authentication cookies, and every record follows the same schema. Each record carries timestamps and nested objects for the author profile, engagement statistics, and platform metadata, so it can feed downstream analytical workflows directly.

Data Dictionary

| Field Name | Data Type | Definition |
| --- | --- | --- |
| post_id | String | Unique identifier for the parent post to which the comment belongs |
| external_id | String | Platform-generated unique identifier for the individual comment record |
| scraped_at | String (ISO 8601) | UTC timestamp indicating when the comment data was extracted from the platform |
| parent_thread_id | String | Identifier for the conversation thread containing this comment |
| content | String | Full text content of the comment as posted by the user |
| language_code | String | ISO 639-1 two-letter language code detected for the comment content |
| sentiment_score | Float | Normalized sentiment polarity score ranging from -1.0 (negative) to 1.0 (positive) |
| is_verified | Boolean | Indicates whether the comment author has platform verification status |
| is_edited | Boolean | Flag indicating if the comment has been modified after initial posting |
| comment_depth | Integer | Nesting level of the comment within the thread hierarchy (0 = top-level) |
| author.user_id | String | Unique platform identifier for the comment author |
| author.username | String | Display username or handle of the comment author |
| author.reputation_score | Integer | Numerical representation of author's platform credibility or karma |
| author.is_premium | Boolean | Indicates whether the author holds a premium or subscription account status |
| engagement_metrics.upvotes | Integer | Count of positive reactions or likes received by the comment |
| engagement_metrics.downvotes | Integer | Count of negative reactions received by the comment |
| engagement_metrics.replies | Integer | Number of direct replies to this comment |
| engagement_metrics.shares | Integer | Count of times the comment has been shared or reposted |
| metadata.client_version | String | Version identifier of the client application used to post the comment |
| metadata.platform | String | Platform interface type (web, mobile_app, api) |
| metadata.device_type | String | Device category from which the comment was posted (desktop, mobile, tablet) |
| metadata.ip_region | String | Geographic region code derived from the posting IP address |
| timestamps.created_at | String (ISO 8601) | UTC timestamp of the original comment creation |
| timestamps.last_edited | String (ISO 8601) | UTC timestamp of the most recent edit operation |
| timestamps.last_activity | String (ISO 8601) | UTC timestamp of the most recent engagement activity on the comment |
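
The dictionary above can be turned into a lightweight type check for incoming records. The following is a minimal sketch, not part of the actor itself: the `SCHEMA` mapping, `get_path`, and `validate` are hypothetical helper names, and the sketch assumes every field is present and non-null, which the listing does not state.

```python
from typing import Any

# Dotted field paths and expected Python types, transcribed from the data dictionary.
SCHEMA: dict[str, type] = {
    "post_id": str,
    "external_id": str,
    "scraped_at": str,
    "parent_thread_id": str,
    "content": str,
    "language_code": str,
    "sentiment_score": float,
    "is_verified": bool,
    "is_edited": bool,
    "comment_depth": int,
    "author.user_id": str,
    "author.username": str,
    "author.reputation_score": int,
    "author.is_premium": bool,
    "engagement_metrics.upvotes": int,
    "engagement_metrics.downvotes": int,
    "engagement_metrics.replies": int,
    "engagement_metrics.shares": int,
    "metadata.client_version": str,
    "metadata.platform": str,
    "metadata.device_type": str,
    "metadata.ip_region": str,
    "timestamps.created_at": str,
    "timestamps.last_edited": str,
    "timestamps.last_activity": str,
}


def get_path(record: dict[str, Any], path: str) -> Any:
    """Return the value at a dotted path, raising KeyError if a segment is missing."""
    value: Any = record
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            raise KeyError(path)
        value = value[key]
    return value


def type_ok(value: Any, expected: type) -> bool:
    """Check a JSON-decoded value against the expected type."""
    if expected is bool:
        return isinstance(value, bool)
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it for numeric fields.
        return False
    if expected is float:
        # JSON may encode a whole-number float such as 1.0 as the integer 1.
        return isinstance(value, (int, float))
    return isinstance(value, expected)


def validate(record: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    problems: list[str] = []
    for path, expected in SCHEMA.items():
        try:
            value = get_path(record, path)
        except KeyError:
            problems.append(f"missing: {path}")
            continue
        if not type_ok(value, expected):
            problems.append(
                f"wrong type: {path} (expected {expected.__name__}, got {type(value).__name__})"
            )
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to log every schema deviation in a batch before deciding whether to discard a record.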

Sample Dataset

Below is a sample of the high-fidelity JSON output:

    {
      "post_id": "p_87654321",
      "external_id": "comment_15f8a9c2d3b4e5",
      "scraped_at": "2025-12-21T15:22:33Z",
      "parent_thread_id": "thread_98765432",
      "content": "This analysis perfectly captures the market trends!",
      "language_code": "en",
      "sentiment_score": 0.87,
      "is_verified": true,
      "is_edited": false,
      "comment_depth": 2,
      "author": {
        "user_id": "u_123456789",
        "username": "market_analyst",
        "reputation_score": 856,
        "is_premium": true
      },
      "engagement_metrics": {
        "upvotes": 234,
        "downvotes": 12,
        "replies": 15,
        "shares": 8
      },
      "metadata": {
        "client_version": "2.14.5",
        "platform": "web",
        "device_type": "desktop",
        "ip_region": "EUR"
      },
      "timestamps": {
        "created_at": "2025-12-21T14:30:00Z",
        "last_edited": "2025-12-21T14:45:22Z",
        "last_activity": "2025-12-21T15:20:11Z"
      }
    }
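
Because the output nests `author`, `engagement_metrics`, `metadata`, and `timestamps`, tabular tools usually need the record flattened first. The sketch below shows one way to do that, producing the same dotted names the data dictionary uses. The `flatten` helper is a hypothetical name, and the input is a trimmed copy of the sample record rather than real actor output.

```python
import json
from typing import Any


def flatten(record: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Collapse nested objects into a single level keyed by dotted paths."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted prefix.
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


# Trimmed copy of the sample record, kept short for illustration.
raw = """
{
  "external_id": "comment_15f8a9c2d3b4e5",
  "comment_depth": 2,
  "author": {"user_id": "u_123456789", "username": "market_analyst"},
  "engagement_metrics": {"upvotes": 234, "replies": 15}
}
"""

row = flatten(json.loads(raw))
for column, value in row.items():
    print(f"{column} = {value}")
```

Each flattened row can then be written out with `csv.DictWriter` or loaded into a dataframe without further reshaping.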

Configuration Parameters

The actor takes a single required input parameter:

| Parameter | JSON Field Name | Data Type | Required | Description | Example Value |
| --- | --- | --- | --- | --- | --- |
| Tweet ID | tweetId | String | Yes | The unique identifier of the Twitter post from which to extract comments | "1738106896777699464" |
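
As an illustration, the run input can be assembled and sanity-checked before a run is started. This is a minimal sketch: `build_run_input` is a hypothetical helper, and the digits-only check is an assumption based on the example value, not a rule stated by the actor.

```python
import json


def build_run_input(tweet_id: str) -> dict[str, str]:
    """Build the actor input object, keeping the tweet ID as a string."""
    cleaned = tweet_id.strip()
    # Assumption: tweet IDs consist only of ASCII digits, as in the example value.
    if not (cleaned.isascii() and cleaned.isdigit()):
        raise ValueError(f"tweetId must be a numeric string, got {tweet_id!r}")
    return {"tweetId": cleaned}


print(json.dumps(build_run_input("1738106896777699464")))
# {"tweetId": "1738106896777699464"}
```

Keeping the ID as a string matches the declared data type and avoids precision loss: the example value is larger than 2^53, so it cannot be represented exactly as a JavaScript number.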

Analytical Use Cases

Researchers and data scientists can leverage this dataset for multiple analytical workflows:

  • Sentiment Analysis: Utilize the sentiment_score field in conjunction with content to perform aggregate sentiment tracking across comment threads, enabling identification of positive/negative discourse patterns.
  • Engagement Pattern Analysis: Correlate engagement_metrics with author.reputation_score and timestamps to identify high-performing comment characteristics and optimal posting windows.
  • Network Mapping: Construct social graphs using author.user_id, parent_thread_id, and comment_depth to visualize conversation hierarchies and identify influential community members.
  • Longitudinal Studies: Track temporal evolution of discussions using timestamps.created_at and timestamps.last_activity to measure conversation decay rates and sustained engagement periods.
  • Content Strategy Optimization: Analyze language_code, content length, and engagement_metrics to determine which comment styles drive higher interaction rates.
  • Verification Impact Studies: Compare engagement patterns between verified (is_verified: true) and non-verified authors to quantify credibility effects on audience response.
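
To make two of the use cases above concrete (sentiment aggregation and the verification comparison), here is a minimal sketch using only the standard library. The function name `summarize_by_verification` is hypothetical, and the three records are made-up illustrative values that follow the output schema; only the first borrows its numbers from the sample record.

```python
from collections import defaultdict
from statistics import mean
from typing import Any


def summarize_by_verification(records: list[dict[str, Any]]) -> dict[bool, dict[str, float]]:
    """Group comments by is_verified and average sentiment and upvotes per group."""
    groups: dict[bool, list[dict[str, Any]]] = defaultdict(list)
    for record in records:
        groups[bool(record["is_verified"])].append(record)
    return {
        verified: {
            "count": len(rows),
            "mean_sentiment": mean(r["sentiment_score"] for r in rows),
            "mean_upvotes": mean(r["engagement_metrics"]["upvotes"] for r in rows),
        }
        for verified, rows in groups.items()
    }


# Illustrative records only; real input would be the actor's dataset items.
records = [
    {"is_verified": True, "sentiment_score": 0.87, "engagement_metrics": {"upvotes": 234}},
    {"is_verified": False, "sentiment_score": 0.5, "engagement_metrics": {"upvotes": 10}},
    {"is_verified": False, "sentiment_score": -0.5, "engagement_metrics": {"upvotes": 30}},
]

summary = summarize_by_verification(records)
for verified, stats in sorted(summary.items()):
    print(verified, stats)
```

A difference between the two groups in a summary like this is descriptive only; attributing it to verification would require controlling for factors such as author reputation and posting time.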

Technical Limitations

Important Considerations:

  • Rate Limiting: Extraction operates within platform-imposed rate limits; high-volume requests may experience throttling or temporary access restrictions.
  • Data Freshness: The scraped_at timestamp reflects extraction time; rapidly evolving threads may show discrepancies between engagement_metrics and real-time values.
  • Deleted Content: Comments removed after extraction will not be retroactively flagged; implement periodic re-scraping for data integrity validation.
  • Nested Thread Depth: Extremely deep comment threads (depth > 10) may experience incomplete extraction due to platform pagination constraints.
  • Sentiment Accuracy: The sentiment_score represents algorithmic inference and may not capture nuanced sarcasm, cultural context, or domain-specific terminology.
  • Geographic Precision: The ip_region field provides regional-level granularity only; city or precise location data is not available.
  • Historical Data: Extraction is limited to currently accessible comments; platform retention policies may restrict access to older content beyond 12-18 months.
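
Some of these limitations can be surfaced per record. The sketch below flags comments that were still active shortly before extraction (so their engagement counts have likely moved since) and comments deeper than the depth of 10 mentioned above. The helper names, the flag labels, and the one-hour default window are assumptions chosen for illustration.

```python
from datetime import datetime
from typing import Any

DEPTH_LIMIT = 10  # depth beyond which the limitations list warns of incomplete extraction


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp; the 'Z' suffix is rewritten for older Python versions."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def seconds_since_last_activity(record: dict[str, Any]) -> float:
    """Seconds between the last recorded engagement and the extraction time."""
    scraped = parse_utc(record["scraped_at"])
    last_activity = parse_utc(record["timestamps"]["last_activity"])
    return (scraped - last_activity).total_seconds()


def quality_flags(record: dict[str, Any], active_window_seconds: float = 3600.0) -> list[str]:
    """Return caveat labels for a record; an empty list means no caveat applies."""
    flags: list[str] = []
    if seconds_since_last_activity(record) < active_window_seconds:
        flags.append("recently_active")  # engagement counts are probably already stale
    if record["comment_depth"] > DEPTH_LIMIT:
        flags.append("deep_thread")  # siblings and children may be missing
    return flags


# Timestamps and depth taken from the sample record.
sample = {
    "scraped_at": "2025-12-21T15:22:33Z",
    "comment_depth": 2,
    "timestamps": {"last_activity": "2025-12-21T15:20:11Z"},
}

print(seconds_since_last_activity(sample))  # 142.0
print(quality_flags(sample))  # ['recently_active']
```

Records flagged `recently_active` are natural candidates for the periodic re-scraping suggested above.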

Keywords & Tags: post comments scraper, social media comments extractor, Instagram post comments scraping, Facebook post comments scraper, export comments from posts, comment data scraping tool, social engagement analytics.