Twitter Hashtag Tweet Scraper
Pricing
$3.00 / 1,000 results
Scrapes tweets by hashtags with comprehensive metadata extraction and intelligent rate limit handling.
Developer

Deepanshu Sharma
Twitter Hashtag Scraper
An Apify actor that scrapes tweets by hashtag using authenticated Twitter sessions. It respects rate limits, deduplicates tweets, and extracts comprehensive tweet data.
Features
- Multi-hashtag support: Scrape tweets from multiple hashtags in a single run
- Rate limit handling: Automatically handles Twitter rate limits with smart waiting
- Deduplication: Prevents duplicate tweets using unique ID tracking
- Real-time data streaming: Pushes data to Apify dataset in batches for immediate access
- Comprehensive tweet data: Extracts detailed metadata including engagement metrics
- Time-based filtering: Configurable tweet age limits
- Authentication via cookies: Uses Twitter session cookies for reliable access
Input Parameters
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| `hashtags` | Array[String] | List of hashtags to search (without the # symbol) |
| `cookies` | Array[Object] | Twitter session cookies for authentication |
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `max_tweets` | Integer | 5000 | Maximum total number of tweets to collect |
| `max_age_hint_minutes` | Integer | 1440 | Maximum age of tweets in minutes (1440 minutes = 24 hours) |
Input Example
{
  "hashtags": ["AI", "MachineLearning", "DataScience"],
  "max_tweets": 3000,
  "max_age_hint_minutes": 720,
  "cookies": [
    {"name": "auth_token", "value": "your_auth_token_here", "domain": ".x.com"},
    {"name": "ct0", "value": "your_ct0_token_here", "domain": ".x.com"}
  ]
}
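Before starting a run, it can help to check the input locally. The sketch below is not part of the actor; `validate_input` is a hypothetical helper, and the rules it enforces (non-empty lists, no leading `#`, positive integers, three keys per cookie) are assumptions inferred from the parameter tables and the cookie format in this README.

```python
from typing import Any

# Keys taken from the "Cookie Format" section of this README.
REQUIRED_COOKIE_KEYS = {"name", "value", "domain"}


def validate_input(actor_input: dict[str, Any]) -> list[str]:
    """Return a list of problems found in the input (empty if none)."""
    problems: list[str] = []

    hashtags = actor_input.get("hashtags")
    if not isinstance(hashtags, list) or not hashtags:
        problems.append("hashtags must be a non-empty list of strings")
    else:
        for tag in hashtags:
            if not isinstance(tag, str) or not tag.strip():
                problems.append(f"invalid hashtag: {tag!r}")
            elif tag.startswith("#"):
                problems.append(f"remove the leading '#': {tag!r}")

    cookies = actor_input.get("cookies")
    if not isinstance(cookies, list) or not cookies:
        problems.append("cookies must be a non-empty list of objects")
    else:
        for cookie in cookies:
            if not isinstance(cookie, dict):
                problems.append(f"cookie is not an object: {cookie!r}")
                continue
            missing = REQUIRED_COOKIE_KEYS - set(cookie)
            if missing:
                problems.append(
                    f"cookie {cookie.get('name', '?')!r} is missing {sorted(missing)}"
                )

    # Optional parameters: only checked when present.
    for key in ("max_tweets", "max_age_hint_minutes"):
        value = actor_input.get(key)
        if value is None:
            continue
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(f"{key} must be a positive integer")

    return problems
```

An empty list means the input passed these local checks; it does not guarantee that the cookies are still valid on Twitter's side.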
How to Get Twitter Cookies
- Log into Twitter/X in your browser
- Open Developer Tools (F12)
- Go to Application/Storage tab
- Find Cookies for `x.com` or `twitter.com`
- Copy required cookies:
  - `auth_token` - Main authentication token
  - `ct0` - CSRF token
  - `twid` - Twitter ID (optional but recommended)
Cookie Format
{"name": "cookie_name","value": "cookie_value","domain": ".x.com"}
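As a convenience, the cookie values copied from the browser can be assembled into this format programmatically. This is a minimal sketch; `build_cookies` is a hypothetical helper name, and the default `.x.com` domain is taken from the examples in this README.

```python
import json
from typing import Optional


def build_cookies(
    auth_token: str,
    ct0: str,
    twid: Optional[str] = None,
    domain: str = ".x.com",
) -> list[dict[str, str]]:
    """Build the `cookies` input array from raw cookie values."""
    values = {"auth_token": auth_token, "ct0": ct0}
    if twid:
        # Optional but recommended, according to the list above.
        values["twid"] = twid
    return [
        {"name": name, "value": value, "domain": domain}
        for name, value in values.items()
    ]


if __name__ == "__main__":
    cookies = build_cookies("your_auth_token_here", "your_ct0_token_here")
    print(json.dumps(cookies, indent=2))
```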
Output Data Structure
Each scraped tweet is stored as a record with the following structure:
{
  "id": "1234567890123456789",
  "text": "This is a sample tweet with #hashtag",
  "author": "John Doe",
  "username": "johndoe",
  "created_at": "2024-01-15T10:30:00Z",
  "retweet_count": 42,
  "like_count": 156,
  "reply_count": 23,
  "quote_count": 8,
  "url": "https://twitter.com/johndoe/status/1234567890123456789",
  "hashtags": ["hashtag", "example"],
  "mentions": ["mention1", "mention2"],
  "is_retweet": false,
  "language": "en",
  "user_followers": 1500,
  "user_verified": false,
  "search_hashtag": "AI",
  "scraped_at": "2024-01-15T11:00:00Z"
}
Output Fields Explanation
| Field | Description |
|---|---|
| `id` | Unique tweet identifier |
| `text` | Full tweet content |
| `author` | Display name of tweet author |
| `username` | Twitter handle (@username) |
| `created_at` | Tweet creation timestamp |
| `retweet_count` | Number of retweets |
| `like_count` | Number of likes/favorites |
| `reply_count` | Number of replies |
| `quote_count` | Number of quote tweets |
| `url` | Direct link to the tweet |
| `hashtags` | Array of hashtags found in tweet |
| `mentions` | Array of mentioned users |
| `is_retweet` | Boolean indicating if it's a retweet |
| `language` | Detected language code |
| `user_followers` | Author's follower count |
| `user_verified` | Author's verification status |
| `search_hashtag` | Which hashtag query found this tweet |
| `scraped_at` | When the tweet was scraped |
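To illustrate how these fields might be used downstream, here is a small sketch that groups records by `search_hashtag` and sums their engagement. The field names come from the table above; the helper names and the definition of "engagement" as the sum of the four counters are assumptions made for this example.

```python
from collections import defaultdict
from typing import Any, Iterable

Tweet = dict[str, Any]


def engagement(tweet: Tweet) -> int:
    """Sum of the four engagement counters in one record."""
    return (
        tweet["retweet_count"]
        + tweet["like_count"]
        + tweet["reply_count"]
        + tweet["quote_count"]
    )


def summarize_by_hashtag(tweets: Iterable[Tweet]) -> dict[str, dict[str, int]]:
    """Count tweets and total engagement per search hashtag."""
    totals: dict[str, dict[str, int]] = defaultdict(
        lambda: {"tweets": 0, "engagement": 0}
    )
    for tweet in tweets:
        bucket = totals[tweet["search_hashtag"]]
        bucket["tweets"] += 1
        bucket["engagement"] += engagement(tweet)
    return dict(totals)


if __name__ == "__main__":
    sample = {
        "retweet_count": 42,
        "like_count": 156,
        "reply_count": 23,
        "quote_count": 8,
        "search_hashtag": "AI",
    }
    print(summarize_by_hashtag([sample]))
```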
Performance & Limits
Rate Limiting
- Automatic handling: Actor automatically waits when rate limits are hit
- Smart delays: Random delays between requests to avoid detection
- Batch processing: Processes tweets in batches for efficiency
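The waiting behaviour described above could look roughly like the following sketch. The actor's real implementation is not published here, so the function names, the 1 to 3 second jitter range, and the use of a reset timestamp are all assumptions; the 9-minute cap reflects the maximum pause mentioned under Important Notes.

```python
import random
import time
from typing import Callable

# Upper bound on a single rate-limit pause (9 minutes, per this README).
MAX_RATE_LIMIT_WAIT_S = 9 * 60


def polite_delay(
    min_s: float = 1.0,
    max_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep for a random interval between requests and return its length."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


def wait_for_rate_limit(
    reset_epoch_s: float,
    now: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep until the (assumed) rate-limit reset time, capped at 9 minutes."""
    wait = min(max(reset_epoch_s - now(), 0.0), MAX_RATE_LIMIT_WAIT_S)
    sleep(wait)
    return wait
```

Passing `sleep` and `now` as parameters keeps the functions easy to test without actually waiting.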
Tweet Distribution
- Even distribution: Tweets are distributed evenly across hashtags
- Global limit: Total tweet count never exceeds `max_tweets`
- Deduplication: Duplicate tweets across hashtags are filtered out
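One plausible way to implement an even split with a global cap and ID-based deduplication is sketched below. How the actor actually allocates the budget is not documented, so `split_quota` and `dedupe` are hypothetical helpers that only illustrate the idea.

```python
from typing import Any


def split_quota(hashtags: list[str], max_tweets: int) -> dict[str, int]:
    """Split max_tweets across hashtags; the total never exceeds max_tweets."""
    base, extra = divmod(max_tweets, len(hashtags))
    # The first `extra` hashtags receive one additional tweet each.
    return {
        tag: base + (1 if index < extra else 0)
        for index, tag in enumerate(hashtags)
    }


def dedupe(tweets: list[dict[str, Any]], seen_ids: set[str]) -> list[dict[str, Any]]:
    """Keep only tweets whose id is new; seen_ids is updated in place."""
    fresh = []
    for tweet in tweets:
        if tweet["id"] not in seen_ids:
            seen_ids.add(tweet["id"])
            fresh.append(tweet)
    return fresh
```

Sharing one `seen_ids` set across all hashtags is what filters out a tweet that matches more than one hashtag query.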
Data Streaming
- Real-time updates: Data is pushed to dataset every 50 tweets
- Progress tracking: Detailed logging of collection progress
- Error recovery: Continues scraping even if individual tweets fail
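Batched pushing can be sketched as a small buffering loop. This is an illustration under assumptions: `push_in_batches` is a hypothetical name, and `push` stands in for whatever dataset call the actor really uses (in an Apify actor this would presumably wrap the SDK's dataset push).

```python
from typing import Any, Callable, Iterable

BATCH_SIZE = 50  # batch size stated in this README


def push_in_batches(
    tweets: Iterable[dict[str, Any]],
    push: Callable[[list[dict[str, Any]]], None],
    batch_size: int = BATCH_SIZE,
) -> int:
    """Buffer tweets and call push() once per full batch; return total pushed."""
    batch: list[dict[str, Any]] = []
    pushed = 0
    for tweet in tweets:
        batch.append(tweet)
        if len(batch) >= batch_size:
            push(batch)
            pushed += len(batch)
            batch = []
    if batch:
        # Flush the final partial batch so no collected tweet is lost.
        push(batch)
        pushed += len(batch)
    return pushed
```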
Advanced Configuration
Search Query Optimization
The actor automatically builds optimized search queries:
- Filters out retweets by default
- Includes time constraints based on `max_age_hint_minutes`
- Uses Twitter's "Latest" product for recent tweets
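The exact query string the actor sends is not documented. As a rough sketch, a query with these properties could be built from Twitter's search operators; the use of `-filter:retweets` and `since_time:` here is an assumption, and `build_query` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def build_query(
    hashtag: str,
    max_age_minutes: int,
    now: Optional[datetime] = None,
) -> str:
    """Build a search query for one hashtag, excluding retweets and old tweets."""
    now = now or datetime.now(timezone.utc)
    since = now - timedelta(minutes=max_age_minutes)
    tag = hashtag.lstrip("#")  # input hashtags are given without '#'
    # since_time takes a Unix timestamp in seconds (assumed operator).
    return f"#{tag} -filter:retweets since_time:{int(since.timestamp())}"


if __name__ == "__main__":
    print(build_query("AI", max_age_minutes=720))
```

Selecting the "Latest" results is a property of the search request rather than of the query text, so it is not represented in this string.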
Error Handling
- Retry logic: Up to 5 retries per hashtag on failures
- Graceful degradation: Continues with other hashtags if one fails
- Data preservation: Saves collected data even if scraping is interrupted
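The retry and graceful-degradation behaviour can be pictured with the sketch below. It is not the actor's code: `scrape_all` and `scrape_one` are hypothetical names, and the exponential backoff between attempts is an assumption added for illustration.

```python
import time
from typing import Any, Callable

MAX_RETRIES = 5  # retries per hashtag, per this README


def scrape_all(
    hashtags: list[str],
    scrape_one: Callable[[str], list[Any]],
    max_retries: int = MAX_RETRIES,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[dict[str, list[Any]], list[str]]:
    """Scrape each hashtag with retries; return (results, failed hashtags)."""
    results: dict[str, list[Any]] = {}
    failed: list[str] = []
    for tag in hashtags:
        # One initial attempt plus up to max_retries retries.
        for attempt in range(max_retries + 1):
            try:
                results[tag] = scrape_one(tag)
                break
            except Exception:
                if attempt == max_retries:
                    # Give up on this hashtag but keep going with the rest.
                    failed.append(tag)
                else:
                    sleep(min(2 ** attempt, 60))  # assumed backoff policy
    return results, failed
```

Because results are stored per hashtag as soon as they arrive, a failure on one hashtag does not discard data already collected for the others.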
Important Notes
Authentication
- Required: Valid Twitter session cookies are mandatory
- Session management: Uses cookies to maintain authenticated session
- Security: Keep your cookies secure and don't share them
Rate Limits
- Twitter limits: Respects Twitter's rate limiting policies
- Wait times: May pause for up to 9 minutes when rate limited
- Patience required: Large scraping jobs may take time
Content Policy
- Public tweets only: Only scrapes publicly available tweets
- No private data: Cannot access protected accounts
- Respect ToS: Use responsibly and respect Twitter's Terms of Service
Usage Examples
Small Scale Scraping
{"hashtags": ["startup"],"max_tweets": 100,"max_age_hint_minutes": 60,"cookies": [...]}
Multi-hashtag Research
{"hashtags": ["climate", "sustainability", "renewableenergy"],"max_tweets": 5000,"max_age_hint_minutes": 2880,"cookies": [...]}
Recent Trending Topics
{"hashtags": ["breaking", "news", "trending"],"max_tweets": 1000,"max_age_hint_minutes": 30,"cookies": [...]}
Troubleshooting
Common Issues
- Authentication Failed
  - Verify cookies are valid and recent
  - Check cookie format and domain settings
  - Ensure you're logged into Twitter in the same browser
- No Tweets Found
  - Check hashtag spelling
  - Verify hashtags exist and have recent activity
  - Adjust `max_age_hint_minutes` to include older tweets
- Rate Limited
  - Wait for the automatic rate limit handling
  - Consider reducing `max_tweets` for faster completion
  - Use fewer hashtags to reduce API calls
Performance Tips
- Start small: Test with low `max_tweets` first
- Use specific hashtags: More specific hashtags often yield better results
- Monitor progress: Check Apify logs for real-time progress updates
- Be patient: Large scraping jobs require time due to rate limits
Support
For issues or questions:
- Check the Apify logs for detailed error messages
- Verify your input format matches the examples
- Ensure your cookies are valid and up-to-date
- Contact support through Apify platform if issues persist
Note: This actor is designed for research and analysis purposes. Please ensure compliance with Twitter's Terms of Service and applicable data protection regulations.