Twitter Hashtag Scraper - X Posts by Hashtag
Pricing
Pay per usage
Scrape Twitter/X posts by hashtag using Nitter instances and web scraping. Extract tweet text, authors, handles, engagement metrics (likes, retweets, replies, quotes), images, videos, and posting dates. Includes hashtag analytics with top authors, engagement summaries, and posting frequency.
Developer

Ricardo Akiyoshi
Scrape Twitter/X posts by hashtag at scale using Nitter instances and web scraping. No Twitter API keys or authentication required. Extract full tweet metadata including text, authors, engagement metrics, images, and videos.
What does it do?
This actor searches Twitter/X for posts containing specific hashtags and extracts comprehensive data including:
- Tweet content: Full text, author name, handle, posting date
- Engagement metrics: Likes, retweets, replies, quotes
- Media: Image URLs, video URLs
- Tweet metadata: Tweet ID, direct URL, retweet/reply status
- Hashtag analytics: Top authors, engagement distribution, posting frequency, co-occurring hashtags
How it works
The scraper uses a multi-layered strategy for maximum reliability:
- Nitter instances (primary): Scrapes public Nitter mirrors that proxy Twitter data. Automatically rotates through 15+ instances and falls back to the next if one is down.
- CheerioCrawler (secondary): Uses Crawlee's built-in retry and proxy management for more resilient scraping of Nitter pages.
- Google search (fallback): When all Nitter instances are unavailable, searches Google for `site:twitter.com #hashtag` to find tweet URLs and basic metadata.
All tweets are deduplicated by tweet ID across all sources.
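The layered strategy above can be pictured as an ordered list of sources that are tried one after another. The following is a minimal sketch, not the actor's actual code: `fetch_with_fallback` and the three stub sources are hypothetical names used only for illustration, and real sources would perform HTTP requests.

```python
from typing import Callable, Iterable, List

Tweet = dict  # hypothetical record shape, e.g. {"tweetId": "...", "hashtag": "..."}
Source = Callable[[str], List[Tweet]]


def fetch_with_fallback(hashtag: str, sources: Iterable[Source]) -> List[Tweet]:
    """Try each source in order and return the first non-empty result."""
    for source in sources:
        try:
            tweets = source(hashtag)
        except Exception:
            # A failing source (instance down, timeout) moves us to the next one.
            continue
        if tweets:
            return tweets
    return []


# Stub sources standing in for the three layers (assumptions, no network access).
def nitter_down(hashtag: str) -> List[Tweet]:
    raise ConnectionError("instance unavailable")


def crawler_empty(hashtag: str) -> List[Tweet]:
    return []


def google_stub(hashtag: str) -> List[Tweet]:
    return [{"tweetId": "1", "hashtag": hashtag}]


result = fetch_with_fallback("AI", [nitter_down, crawler_empty, google_stub])
```

Here the first two stubs fail or return nothing, so the result comes from the last source in the list.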
Use Cases
Social Listening
Monitor what people are saying about your brand, product, or industry. Track hashtag conversations in real time and identify emerging sentiments before they trend.
Campaign Tracking
Measure the reach and engagement of your marketing campaigns. Track branded hashtags to see how many people are using them, who the top contributors are, and what content performs best.
Trend Analysis
Discover trending topics and conversations around specific hashtags. Analyze posting frequency, engagement patterns, and co-occurring hashtags to understand how conversations evolve.
Influencer Discovery
Find the most active and influential voices in any hashtag conversation. The built-in analytics identify top authors by tweet count and total engagement, making it easy to spot potential collaborators.
Competitive Intelligence
Monitor competitor brand hashtags, product launches, and campaign performance. Compare engagement metrics across different hashtags to benchmark your performance.
Market Research
Understand public opinion on topics relevant to your business. Analyze the language, sentiment, and themes within hashtag conversations to inform product development and positioning.
Academic Research
Collect large-scale social media datasets for research. The structured JSON output integrates easily with data analysis tools and NLP pipelines.
Content Strategy
Analyze which types of content (images, videos, text-only) get the most engagement within your target hashtags. Use co-hashtag analysis to discover related conversations to join.
Input
| Field | Type | Default | Description |
|---|---|---|---|
| `hashtags` | array | required | List of hashtags to search (without # symbol) |
| `maxTweets` | integer | 500 | Maximum tweets to collect per hashtag (1-10,000) |
| `language` | string | - | Filter by language code (e.g., "en", "es", "ja") |
| `includeReplies` | boolean | false | Whether to include reply tweets |
| `minLikes` | integer | 0 | Minimum likes threshold for filtering |
| `proxyConfiguration` | object | - | Proxy settings for large-scale scrapes |
Example: Basic hashtag search
    {
      "hashtags": ["AI", "machinelearning"],
      "maxTweets": 200
    }
Example: High-engagement English tweets only
    {
      "hashtags": ["startup", "SaaS", "indiehackers"],
      "maxTweets": 1000,
      "language": "en",
      "includeReplies": false,
      "minLikes": 10
    }
Example: Campaign tracking with replies
    {
      "hashtags": ["YourBrandName", "ProductLaunch2026"],
      "maxTweets": 5000,
      "includeReplies": true,
      "minLikes": 0
    }
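Before starting a run, it can help to check an input object against the defaults and ranges in the table above. This is a sketch under assumptions: `normalize_input` is a hypothetical helper, not part of the actor, and the actor's own validation may behave differently.

```python
def normalize_input(raw: dict) -> dict:
    """Apply the documented defaults and check the documented ranges."""
    # Hashtags are given without the # symbol, so strip one if present.
    hashtags = [h.strip().lstrip("#") for h in raw.get("hashtags", [])]
    hashtags = [h for h in hashtags if h]
    if not hashtags:
        raise ValueError("hashtags is required and must contain at least one entry")

    max_tweets = int(raw.get("maxTweets", 500))
    if not 1 <= max_tweets <= 10_000:
        raise ValueError("maxTweets must be between 1 and 10,000")

    return {
        "hashtags": hashtags,
        "maxTweets": max_tweets,
        "language": raw.get("language"),  # no default: None means no filter
        "includeReplies": bool(raw.get("includeReplies", False)),
        "minLikes": int(raw.get("minLikes", 0)),
        "proxyConfiguration": raw.get("proxyConfiguration"),
    }


basic = normalize_input({"hashtags": ["#AI", "machinelearning"], "maxTweets": 200})
```

Fields that are left out fall back to the defaults listed in the table, so `basic` ends up with `includeReplies` set to `False` and `minLikes` set to `0`.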
Output
Each tweet is saved as a structured JSON object:
    {
      "tweetId": "1895234567890123456",
      "tweetText": "The future of AI is here. These new models are changing everything we know about automation. #AI #machinelearning #tech",
      "author": "Tech Insights",
      "handle": "@techinsights",
      "date": "2026-02-28T14:30:00.000Z",
      "likes": 1523,
      "retweets": 342,
      "replies": 87,
      "quotes": 45,
      "images": ["https://pbs.twimg.com/media/example.jpg"],
      "videos": [],
      "isRetweet": false,
      "isReply": false,
      "tweetUrl": "https://twitter.com/techinsights/status/1895234567890123456",
      "hashtag": "AI",
      "mentionedHashtags": ["ai", "machinelearning", "tech"],
      "scrapedAt": "2026-03-01T10:15:00.000Z"
    }
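Records in this shape are easy to post-process once exported. As an illustration, the sketch below reapplies the `minLikes` and `includeReplies` rules to a list of downloaded records; `filter_tweets` is a hypothetical helper and the sample records are made up for the example.

```python
from typing import Iterable, List


def filter_tweets(
    tweets: Iterable[dict],
    min_likes: int = 0,
    include_replies: bool = False,
) -> List[dict]:
    """Keep tweets that meet the likes threshold and the reply setting."""
    kept = []
    for tweet in tweets:
        if tweet.get("likes", 0) < min_likes:
            continue
        if tweet.get("isReply", False) and not include_replies:
            continue
        kept.append(tweet)
    return kept


# Made-up sample records using the field names from the output example.
sample = [
    {"tweetId": "1", "likes": 1523, "isReply": False},
    {"tweetId": "2", "likes": 4, "isReply": False},
    {"tweetId": "3", "likes": 50, "isReply": True},
]

popular_originals = filter_tweets(sample, min_likes=10)
```

With these sample records only the first tweet passes: the second falls below the likes threshold and the third is a reply.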
Analytics Output
The actor also generates hashtag analytics saved to the key-value store under the ANALYTICS key:
    {
      "AI": {
        "hashtag": "AI",
        "totalTweets": 500,
        "uniqueAuthors": 312,
        "topAuthors": [
          {
            "author": "@techinsights",
            "tweetCount": 15,
            "totalLikes": 8934,
            "totalRetweets": 2156,
            "totalEngagement": 12450
          }
        ],
        "engagementSummary": {
          "likes": { "total": 125000, "average": 250, "median": 45, "max": 15000 },
          "retweets": { "total": 34000, "average": 68, "median": 12, "max": 5000 },
          "replies": { "total": 18000, "average": 36, "median": 5, "max": 2000 },
          "totalEngagement": 177000,
          "averageEngagementPerTweet": 354
        },
        "postingFrequency": {
          "totalDays": 7,
          "averagePerDay": 71,
          "peakDate": "2026-02-28",
          "peakCount": 120
        },
        "mediaStats": {
          "tweetsWithImages": 180,
          "tweetsWithVideos": 45,
          "originalTweets": 380,
          "retweets": 90,
          "replies": 30,
          "mediaPercentage": 45
        },
        "coHashtags": [
          { "hashtag": "machinelearning", "count": 120 },
          { "hashtag": "deeplearning", "count": 85 },
          { "hashtag": "tech", "count": 72 }
        ]
      }
    }
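To make the summary fields concrete, the sketch below shows one plausible way to derive a per-metric summary and a co-hashtag ranking from tweet records. It is an assumption about how such numbers could be computed, not the actor's implementation; `summarize_metric` and `co_hashtags` are hypothetical helpers, and rounding the average to a whole number is a guess based on the example values.

```python
from collections import Counter
from statistics import median
from typing import Iterable, List


def summarize_metric(values: Iterable[int]) -> dict:
    """Return total, average, median, and max for one engagement metric."""
    values = list(values)
    if not values:
        return {"total": 0, "average": 0, "median": 0, "max": 0}
    total = sum(values)
    return {
        "total": total,
        "average": round(total / len(values)),
        "median": median(values),
        "max": max(values),
    }


def co_hashtags(tweets: Iterable[dict], hashtag: str, top_n: int = 3) -> List[dict]:
    """Count hashtags that appear alongside the searched hashtag."""
    target = hashtag.lower()
    counts: Counter = Counter()
    for tweet in tweets:
        # A set ensures each hashtag is counted at most once per tweet.
        tags = {tag.lower() for tag in tweet.get("mentionedHashtags", [])}
        tags.discard(target)
        counts.update(tags)
    return [{"hashtag": tag, "count": n} for tag, n in counts.most_common(top_n)]


likes_summary = summarize_metric([10, 20, 60])
```

For the three like counts above, the total is 90, the average 30, the median 20, and the maximum 60.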
Deduplication
Tweets are deduplicated across all sources using a two-layer approach:
- Tweet ID (primary) — Each tweet has a unique numeric ID. If the same tweet appears from multiple Nitter instances, the version with the highest engagement data is kept.
- Text hash (fallback) — For tweets without extractable IDs (e.g., from Google fallback), a text hash is used to prevent duplicates.
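The two-layer approach can be sketched as a single pass that builds a key per tweet and keeps the best copy for each key. This is an illustration only: the helper names are hypothetical, and summing likes, retweets, replies, and quotes as the "engagement" score, as well as hashing whitespace-normalized lowercase text with SHA-256, are assumptions rather than documented behavior.

```python
import hashlib
from typing import Iterable, List


def engagement(tweet: dict) -> int:
    """Sum the engagement counters, treating missing values as zero."""
    return sum(tweet.get(key) or 0 for key in ("likes", "retweets", "replies", "quotes"))


def dedup_key(tweet: dict) -> str:
    """Use the tweet ID when available, otherwise a hash of the normalized text."""
    tweet_id = tweet.get("tweetId")
    if tweet_id:
        return f"id:{tweet_id}"
    normalized = " ".join(tweet.get("tweetText", "").lower().split())
    return "text:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(tweets: Iterable[dict]) -> List[dict]:
    """Keep one tweet per key, preferring the copy with the most engagement."""
    best: dict = {}
    for tweet in tweets:
        key = dedup_key(tweet)
        current = best.get(key)
        if current is None or engagement(tweet) > engagement(current):
            best[key] = tweet
    return list(best.values())


unique = deduplicate([
    {"tweetId": "1", "likes": 5},
    {"tweetId": "1", "likes": 9},          # same ID, more engagement: wins
    {"tweetText": "Hello   World"},        # no ID: falls back to text hash
    {"tweetText": "hello world"},          # same text after normalization
])
```

The four input records collapse to two: the higher-engagement copy of tweet `1` and the first of the two ID-less records.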
Rate Limiting & Reliability
- Automatic rotation through 15+ Nitter instances
- Health checks on startup to identify working instances
- 1.5-second delay between requests with random jitter
- Exponential backoff on rate limits (429 responses)
- Three-layer fallback strategy (direct fetch, CheerioCrawler, Google)
- Request timeouts to prevent hanging on unresponsive instances
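The delay and backoff behavior listed above might look roughly like the following. Only the 1.5-second base delay comes from this page; the jitter range, the doubling factor, and the 60-second cap are assumptions chosen for the sketch, and both function names are hypothetical.

```python
import random

BASE_DELAY_SECONDS = 1.5  # from the list above


def request_delay(jitter: float = 0.5, rng=random) -> float:
    """Delay between ordinary requests: base delay plus random jitter."""
    return BASE_DELAY_SECONDS + rng.uniform(0, jitter)


def backoff_delay(attempt: int, cap: float = 60.0, rng=random) -> float:
    """Delay before retry number `attempt` (0-based) after a 429 response.

    The wait doubles with each attempt up to `cap`, then up to 10% jitter
    is added so parallel workers do not retry in lockstep.
    """
    exponential = min(cap, BASE_DELAY_SECONDS * (2 ** attempt))
    return exponential + rng.uniform(0, exponential * 0.1)


# Delays for the first four retries, using a seeded generator for repeatability.
seeded = random.Random(0)
retry_schedule = [backoff_delay(attempt, rng=seeded) for attempt in range(4)]
```

Passing a seeded `random.Random` makes the schedule reproducible in tests, while the default uses the module-level generator.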
Pay Per Event Pricing
This actor uses Apify's Pay Per Event model:
| Event | Price |
|---|---|
| Tweet scraped | $0.003 |
Example costs:
- 100 tweets = $0.30
- 500 tweets = $1.50
- 1,000 tweets = $3.00
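The example costs follow directly from multiplying the tweet count by the per-event price. A small sketch, assuming the "Tweet scraped" event is the only one charged; `estimate_cost` is a hypothetical helper, and `Decimal` is used to avoid floating-point rounding in currency values.

```python
from decimal import Decimal

PRICE_PER_TWEET = Decimal("0.003")  # USD per "Tweet scraped" event, from the table


def estimate_cost(tweet_count: int) -> Decimal:
    """Estimated charge in USD for a given number of scraped tweets."""
    if tweet_count < 0:
        raise ValueError("tweet_count must not be negative")
    return PRICE_PER_TWEET * tweet_count


for count in (100, 500, 1000):
    print(f"{count} tweets = ${estimate_cost(count):.2f}")
```

Because billing is per tweet, a run across several hashtags costs the same as separate runs that collect the same total number of tweets.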
Limitations
- Nitter availability: Nitter instances go up and down frequently. The actor handles this with multi-instance rotation, but scraping volume depends on instance availability.
- Historical data: Nitter search results are limited in how far back they go. For deep historical analysis, results may be incomplete.
- Engagement accuracy: Engagement numbers from Nitter may lag behind real-time Twitter values.
- Language filtering: Language detection is advisory and depends on Nitter's rendering. Some tweets may slip through the filter.
- Google fallback: When using Google as a fallback, engagement metrics (likes, retweets) are not available.
- Rate limits: Very large scrapes (5,000+ tweets) may take longer due to rate limiting across instances.
Tips for Best Results
- Use proxies for scrapes over 500 tweets to avoid rate limiting
- Set `minLikes` to filter out low-quality or spam tweets
- Disable `includeReplies` for cleaner datasets focused on original content
- Search multiple related hashtags in one run for comprehensive topic coverage
- Check the analytics output in the key-value store for quick insights without processing raw data
Changelog
1.0.0 (2026-03-02)
- Initial release
- Multi-instance Nitter scraping with automatic rotation
- Google search fallback
- Tweet deduplication by ID and text hash
- Hashtag analytics (top authors, engagement, frequency, co-hashtags)
- Media extraction (images and videos)
- Pay Per Event billing
- CheerioCrawler integration with Crawlee