Tweet Replies Scraper

Under maintenance

Developed by Giichi Arai

Maintained by Community

Fast API-powered scraper to extract replies from specific tweets on X (Twitter)

Pricing

$6.00 / 1,000 results

Tweet Replies Scraper – Fast API-powered Twitter Replies Extractor

Fast and reliable scraper for extracting replies from specific tweets on X (Twitter). Perfect for sentiment analysis, conversation monitoring, and social media research.

Features

  • 🚀 Fast & Reliable: Built with high-performance API endpoints
  • 🔄 Pagination Support: Automatically handles pagination to get all replies
  • 📊 Multiple Output Formats: Export data as JSON, CSV, or Excel
  • 🎯 Precise Targeting: Get replies from specific tweets using tweet IDs
  • 🔧 Flexible Configuration: Customize reply limits, sorting, and output formats
  • 💾 Structured Data: Clean, formatted output with consistent field names
  • 🔍 Deduplication: Automatically removes duplicate replies across tweets

Input Parameters

Required Parameters

  • Tweet IDs (tweetIds): List of tweet IDs to extract replies from
    • Format: Array of strings containing tweet IDs
    • Example: ["TWEET_ID_1", "TWEET_ID_2"]
    • How to get tweet ID: in a URL of the form https://twitter.com/user/status/TWEET_ID, the ID is the numeric TWEET_ID segment after /status/

Optional Parameters

  • Maximum Replies Per Tweet (maxRepliesPerTweet): Maximum number of replies to extract per tweet

    • Default: 100
    • Range: 1-2000
  • Sort By (sortBy): How to sort the replies

    • Options: latest, popular, mixed
    • Default: latest
  • Include Replies of Replies (includeRepliesOfReplies): Whether to include nested replies

    • Default: false

Output Data Structure

Each reply contains the following fields:

{
  "reply_id": "REPLY_ID",
  "reply_text": "This is a reply to the original tweet",
  "reply_author": "username",
  "reply_author_name": "Display Name",
  "created_at": "2022-04-28T12:00:00.000Z",
  "retweet_count": 10,
  "like_count": 25,
  "reply_count": 5,
  "quote_count": 2,
  "is_retweet": false,
  "is_reply": true,
  "is_quote": false,
  "hashtags": ["#example"],
  "mentions": ["@username"],
  "urls": ["https://example.com"],
  "lang": "en",
  "source": "Twitter for iPhone",
  "reply_url": "https://twitter.com/user/status/REPLY_ID",
  "bookmarks": 0,
  "verified": false,
  "original_tweet_id": "TWEET_ID",
  "in_reply_to_tweet_id": "TWEET_ID",
  "conversation_id": "CONVERSATION_ID"
}

Usage Examples

Basic Usage

{
  "tweetIds": ["TWEET_ID"],
  "maxRepliesPerTweet": 50
}

Multiple Tweets

{
  "tweetIds": [
    "TWEET_ID_1",
    "TWEET_ID_2",
    "TWEET_ID_3"
  ],
  "maxRepliesPerTweet": 100,
  "sortBy": "popular"
}

Advanced Configuration

{
  "tweetIds": ["TWEET_ID"],
  "maxRepliesPerTweet": 200,
  "sortBy": "latest",
  "includeRepliesOfReplies": true
}

API Endpoint

This actor uses the following API endpoint:

GET {API_BASE_URL}/twitter/tweet/{tweet_id}/replies

Parameters:

  • cursor: For pagination (automatically handled)
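The cursor-driven pagination loop can be sketched as follows. This is a minimal illustration, not the actor's actual implementation: the `fetch_page` callable and the `replies`/`next_cursor` response fields are assumed names standing in for whatever the real endpoint returns.

```python
from typing import Callable, Iterator, Optional

def iter_replies(fetch_page: Callable[[Optional[str]], dict],
                 max_replies: int) -> Iterator[dict]:
    """Follow the pagination cursor until it runs out or max_replies is reached."""
    cursor = None
    seen = 0
    while seen < max_replies:
        page = fetch_page(cursor)  # e.g. GET .../replies?cursor=<cursor>
        for reply in page.get("replies", []):
            yield reply
            seen += 1
            if seen >= max_replies:
                return
        cursor = page.get("next_cursor")
        if not cursor:  # no more pages
            return
```

The same loop works for any cursor-paginated endpoint; only `fetch_page` changes.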

Common Use Cases

  1. Sentiment Analysis: Analyze public reaction to specific tweets
  2. Customer Service Monitoring: Track responses to company announcements
  3. Social Media Research: Study conversation patterns and engagement
  4. Brand Monitoring: Monitor mentions and reactions to brand content
  5. Political Analysis: Analyze public discourse around political tweets
  6. Crisis Management: Monitor responses during crisis situations

How to Get Tweet IDs

  1. From Tweet URL:

    • URL: https://twitter.com/username/status/TWEET_ID
    • Tweet ID: TWEET_ID
  2. From Mobile App:

    • Tap "Share" → "Copy link"
    • Extract ID from the copied URL
  3. From Developer Tools:

    • Right-click on tweet → "Inspect Element"
    • Look for data-tweet-id attribute
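Extracting the ID programmatically is also straightforward. This hypothetical helper (not part of the actor) covers both twitter.com and x.com status URLs:

```python
import re
from typing import Optional

def extract_tweet_id(url: str) -> Optional[str]:
    """Return the numeric tweet ID from a twitter.com or x.com status URL,
    or None if the URL has no /status/<id> segment."""
    match = re.search(r"(?:twitter\.com|x\.com)/[^/]+/status/(\d+)", url)
    return match.group(1) if match else None
```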

Data Export Options

All data is automatically saved to the Apify Dataset in JSON format. You can export the data in multiple formats directly from the Apify Console:

  1. Go to your actor run page
  2. Click on the "Dataset" tab
  3. Select your preferred export format:
    • JSON: Raw structured data with all fields preserved
    • CSV: Comma-separated values for spreadsheet applications
    • Excel (XLSX): Microsoft Excel format with proper column headers
    • HTML: Web-friendly table format
  4. Click "Download" to get your data

Via API

You can also export data programmatically using the Apify API:

  • JSON: https://api.apify.com/v2/datasets/{datasetId}/items?format=json
  • CSV: https://api.apify.com/v2/datasets/{datasetId}/items?format=csv
  • Excel: https://api.apify.com/v2/datasets/{datasetId}/items?format=xlsx

Advanced Export Options

  • Clean items: Remove debug data (recommended)
  • Reverse order: Change result ordering
  • Offset/Limit: Export specific data ranges
  • Field selection: Export only specific fields using fields parameter
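An export URL combining these options could be assembled like this. The `build_export_url` helper is hypothetical, though the query parameters themselves (`format`, `clean`, `fields`, `offset`, `limit`) are standard Apify dataset-export options:

```python
from urllib.parse import urlencode

def build_export_url(dataset_id, fmt="json", clean=True,
                     fields=None, offset=None, limit=None):
    """Assemble an Apify dataset export URL from the options described above."""
    params = {"format": fmt}
    if clean:
        params["clean"] = "true"
    if fields:
        params["fields"] = ",".join(fields)  # export only these fields
    if offset is not None:
        params["offset"] = offset
    if limit is not None:
        params["limit"] = limit
    return f"https://api.apify.com/v2/datasets/{dataset_id}/items?{urlencode(params)}"
```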

Rate Limiting & Performance

  • The actor implements intelligent rate limiting with exponential backoff
  • Automatic retry logic for failed requests
  • Pagination is handled automatically
  • Small delays between requests to respect API limits
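Retry with exponential backoff can be sketched as below. This is illustrative only; the actor's internal retry logic may differ, and `send` stands in for any request function.

```python
import time

def request_with_backoff(send, max_retries=3, base_delay=1.0):
    """Call send(); on failure, sleep base_delay * 2**attempt and retry,
    re-raising after max_retries failed retries."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except Exception:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```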

Error Handling

The actor includes comprehensive error handling:

  • Invalid tweet ID format validation
  • API request failures with retry logic
  • Network timeout handling
  • Data parsing error recovery
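Tweet-ID validation can be as simple as checking for numeric strings, as in this hypothetical helper (shown for illustration, not the actor's code):

```python
def validate_tweet_ids(tweet_ids):
    """Split inputs into valid tweet IDs (numeric strings) and rejected values."""
    valid, invalid = [], []
    for tid in tweet_ids:
        if isinstance(tid, str) and tid.isdigit():
            valid.append(tid)
        else:
            invalid.append(tid)
    return valid, invalid
```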

Data Quality

  • Deduplication: Removes duplicate replies across multiple tweets
  • Data Validation: Ensures all required fields are present
  • Consistent Formatting: Standardizes field names and data types
  • Raw Data Preservation: Keeps original response data for reference
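Deduplication keyed on `reply_id` might look like the sketch below (a first-occurrence-wins pass, not necessarily the actor's exact approach):

```python
def deduplicate_replies(replies):
    """Drop replies whose reply_id has already been seen, keeping the first occurrence."""
    seen = set()
    unique = []
    for reply in replies:
        rid = reply["reply_id"]
        if rid not in seen:
            seen.add(rid)
            unique.append(reply)
    return unique
```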

Security Features

  • Environment Variables: API configuration via environment variables
  • Secure Logging: Structured logging with configurable verbosity levels
  • Error Handling: Error messages don't expose sensitive API details
  • Configurable Log Levels: Control logging verbosity via LOG_LEVEL environment variable

Environment Variables

The following environment variables can be configured:

  • API_BASE_URL: Base URL for the Twitter API service (required, set via environment variable)
  • LOG_LEVEL: Logging level - DEBUG, INFO, WARNING, ERROR (default: INFO)
  • REQUEST_TIMEOUT: Request timeout in seconds (default: 30)
  • MAX_RETRIES: Maximum retry attempts for failed requests (default: 3)
  • RETRY_DELAY: Delay between retry attempts in seconds (default: 5)
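Reading these variables with their documented defaults might look like this (`load_config` is a hypothetical helper; only the variable names and defaults come from the list above):

```python
import os

def load_config(env=os.environ):
    """Read actor configuration, falling back to the documented defaults."""
    return {
        "api_base_url": env["API_BASE_URL"],  # required; raises KeyError if unset
        "log_level": env.get("LOG_LEVEL", "INFO"),
        "request_timeout": int(env.get("REQUEST_TIMEOUT", "30")),
        "max_retries": int(env.get("MAX_RETRIES", "3")),
        "retry_delay": int(env.get("RETRY_DELAY", "5")),
    }
```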

Technical Details

  • Runtime: Python 3.11
  • Dependencies: requests, aiohttp, apify
  • Memory Usage: Optimized for large datasets, no additional memory overhead from export functions
  • Concurrent Processing: Handles multiple tweets efficiently
  • Security: Implements secure logging practices with configurable verbosity
  • Data Storage: Uses Apify Dataset for automatic format support

Troubleshooting

Common Issues

  1. "Invalid tweet ID format": Ensure tweet IDs are numeric strings
  2. "No replies found": Tweet might have no replies or be private
  3. "Rate limit exceeded": Wait and try again with fewer requests

Getting Help

If you encounter issues:

  1. Check the actor logs for detailed error messages
  2. Verify tweet IDs are correct and public
  3. Ensure your input parameters are valid
  4. Contact support if problems persist

Changelog

Version 1.0.0

  • Initial release
  • Support for multiple tweet IDs
  • Pagination support
  • Multiple output formats
  • Comprehensive error handling
  • Rate limiting implementation

Note: This actor respects Twitter's terms of service and implements appropriate rate limiting. Always ensure you have permission to scrape the data you're accessing.