Twitter (X.com) Search Dataset Cookieless (Full History)
Pricing
from $1.50 / 1,000 results
Extract comprehensive Twitter (X) search results, including hidden metadata, engagement metrics, and granular timestamps. A high-fidelity, cookieless data extraction tool designed for precise sales intelligence and competitive analysis.
Rating: 0.0 (0)
Developer: Surge Street
Actor stats
Bookmarked: 0
Total users: 1
Monthly active users: 1
Last modified: 3 days ago
Twitter (X.com) Search Dataset Cookieless (Full History)
Overview
This actor performs a deep extraction of Twitter (X) search results, capturing comprehensive tweet metadata, author profiles, engagement metrics, and media assets without requiring authentication cookies. The extraction pipeline maintains full historical context through conversation threading and delivers structured JSON output optimized for analytical workloads. Data integrity is ensured through timestamp validation, unique composite identifiers, and schema-enforced type constraints.
Data Dictionary
| Field Name | Data Type | Definition |
|---|---|---|
| tweet_id | String | Unique Twitter-assigned identifier for the tweet object |
| external_id | String | Composite identifier combining tweet ID and extraction timestamp for deduplication |
| scraped_at | String (ISO 8601) | UTC timestamp indicating when the record was extracted from the platform |
| text | String | Full UTF-8 encoded tweet content including mentions, hashtags, and URLs |
| language_code | String | ISO 639-1 two-letter language code detected by Twitter's NLP engine |
| is_verified | Boolean | Indicates whether the author account holds verified status (blue checkmark) |
| is_retweet | Boolean | Flag identifying if the tweet is a retweet of another user's content |
| is_quote | Boolean | Flag identifying if the tweet quotes another tweet with added commentary |
| sentiment_score | Float | Normalized sentiment polarity score ranging from -1.0 (negative) to 1.0 (positive) |
| author.user_id | String | Unique Twitter-assigned identifier for the author account |
| author.username | String | Account handle without @ prefix (e.g., "tech_analyst") |
| author.display_name | String | User-defined display name shown on profile |
| author.followers_count | Integer | Total number of accounts following this user at extraction time |
| author.following_count | Integer | Total number of accounts this user follows at extraction time |
| engagement.likes | Integer | Cumulative like/favorite count at time of extraction |
| engagement.retweets | Integer | Number of times the tweet has been retweeted |
| engagement.replies | Integer | Count of direct reply tweets in the conversation thread |
| engagement.quotes | Integer | Number of quote tweets referencing this tweet |
| engagement.impressions | Integer | Estimated view count as reported by Twitter's analytics |
| media.has_media | Boolean | Indicates presence of attached media assets (images, videos, GIFs) |
| media.type | String | Media classification: "image", "video", "animated_gif", or null |
| media.count | Integer | Number of media items attached to the tweet (max 4) |
| media.urls | Array[String] | Direct URLs to media assets hosted on Twitter's CDN |
| location.place_id | String | Twitter Place ID for geotagged tweets |
| location.place_type | String | Geographic granularity: "city", "admin", "country", or "poi" |
| location.place_name | String | Human-readable location string (e.g., "San Francisco, CA") |
| conversation_id | String | Root tweet ID linking all replies in a conversation thread |
| source | String | Client application used to publish the tweet (e.g., "Twitter Web App") |
| hashtags | Array[String] | Extracted hashtags including the # symbol |
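The dictionary describes external_id as a deduplication key that includes the extraction time, so the same tweet collected in two runs may appear under two different external_id values. If one row per tweet is wanted, a minimal Python sketch is to key on tweet_id and keep the most recently scraped copy. The function name `latest_per_tweet` is hypothetical, and the sketch assumes every scraped_at value uses the same UTC ISO 8601 format as the sample record.

```python
def latest_per_tweet(records):
    """Keep one record per tweet_id, preferring the newest scraped_at."""
    latest = {}
    for rec in records:
        current = latest.get(rec["tweet_id"])
        # Same-format UTC ISO 8601 strings sort chronologically as plain text,
        # so a string comparison is enough here (assumption stated above).
        if current is None or rec["scraped_at"] > current["scraped_at"]:
            latest[rec["tweet_id"]] = rec
    return list(latest.values())
```

Keeping the newest copy means engagement counts reflect the latest observation; keep every copy instead if the goal is to study how those counts change between runs.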
Sample Dataset
Below is a sample of the high-fidelity JSON output:
{"tweet_id": "1739284756104728576","external_id": "tw_1739284756104728576_20251221","scraped_at": "2025-12-21T12:30:45Z","text": "The future of AI is not what we expected. Here's why... (1/4)","language_code": "en","is_verified": true,"is_retweet": false,"is_quote": false,"sentiment_score": 0.65,"author": {"user_id": "87654321","username": "tech_analyst","display_name": "Tech Analyst","followers_count": 45230,"following_count": 892},"engagement": {"likes": 3421,"retweets": 892,"replies": 234,"quotes": 56,"impressions": 28945},"media": {"has_media": true,"type": "image","count": 2,"urls": ["https://x.com/img1.jpg", "https://x.com/img2.jpg"]},"location": {"place_id": "01a9a39c7b349001","place_type": "city","place_name": "San Francisco, CA"},"conversation_id": "1739284756104728500","source": "Twitter Web App","hashtags": ["#AI", "#TechTrends", "#Future"]}
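Before loading records like the sample above into an analysis pipeline, it can help to check a few fields against the types listed in the Data Dictionary. The Python sketch below is one possible check, not the actor's own validation logic; the helper names `get_path` and `type_problems` are hypothetical, and only a subset of fields is covered.

```python
# Expected Python types for a subset of fields, written as dotted paths.
# Float fields accept int too, because JSON may serialize 1.0 as 1.
EXPECTED_TYPES = {
    "tweet_id": str,
    "scraped_at": str,
    "text": str,
    "is_verified": bool,
    "sentiment_score": (int, float),
    "author.username": str,
    "author.followers_count": int,
    "engagement.likes": int,
    "engagement.impressions": int,
    "media.urls": list,
    "hashtags": list,
}

# Fields that the Technical Limitations section says may be null.
NULLABLE = {"engagement.impressions"}


def get_path(record, dotted):
    """Follow a dotted path such as 'author.username'; None if absent."""
    value = record
    for key in dotted.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def type_problems(record):
    """Return a list of 'path: got <type>' strings for mismatched fields."""
    problems = []
    for path, expected in EXPECTED_TYPES.items():
        value = get_path(record, path)
        if value is None and path in NULLABLE:
            continue  # null or missing is acceptable for these fields
        # bool is a subclass of int in Python, so reject it explicitly
        # wherever a Boolean is not the expected type.
        bool_mismatch = isinstance(value, bool) and expected is not bool
        if bool_mismatch or not isinstance(value, expected):
            problems.append(f"{path}: got {type(value).__name__}")
    return problems
```

Calling `type_problems(json.loads(line))` on each output line and logging any non-empty result gives an early warning if the output shape differs from the dictionary.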
Configuration Parameters
Configure the following input parameter:
| Parameter | JSON Field Name | Data Type | Example | Description |
|---|---|---|---|---|
| Search Query | query | String | "Mr Beast" | Twitter search syntax including keywords, hashtags, mentions, operators (AND/OR), and filters (lang:, from:, since:) |
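As an illustration of the query field, the Python sketch below assembles a search string from keywords and the filters named in the table (lang:, from:, since:) and wraps it in the input object. The helper `build_query` is hypothetical, the YYYY-MM-DD date format for since: is an assumption about Twitter's search syntax, and how the actor interprets each operator should be confirmed with a small test run.

```python
import json


def build_query(terms, any_of=(), lang=None, from_user=None, since=None):
    """Compose a search string from keywords and optional filters."""
    parts = []
    for term in terms:
        # Quote multi-word phrases so they are matched as a unit.
        parts.append(f'"{term}"' if " " in term else term)
    if any_of:
        # Alternatives are grouped with the OR operator.
        parts.append("(" + " OR ".join(any_of) + ")")
    if lang:
        parts.append(f"lang:{lang}")
    if from_user:
        parts.append(f"from:{from_user.lstrip('@')}")
    if since:
        parts.append(f"since:{since}")  # assumed format: YYYY-MM-DD
    return " ".join(parts)


actor_input = {"query": build_query(["Mr Beast"], lang="en", since="2025-12-01")}
print(json.dumps(actor_input))
```

Building the string in one place keeps quoting and filter order consistent when many queries are generated programmatically.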
Analytical Use Cases
- Sentiment Analysis: Use sentiment_score and text to track sentiment over time across brand mentions, product launches, or crisis events. Weight each tweet's score by author.followers_count to account for author influence.
- Network Mapping: Construct social graphs from author.user_id, conversation_id, and engagement metrics to identify key opinion leaders, community clusters, and information diffusion patterns.
- Competitive Intelligence: Monitor competitor mentions through targeted queries, tracking engagement metrics and hashtags to benchmark share of voice and campaign performance against industry baselines.
- Lead Generation: Filter for verified accounts (is_verified: true) with high follower counts that discuss relevant topics, then use author.username to prioritize outreach by engagement velocity.
- Longitudinal Studies: Use scraped_at timestamps and external_id for temporal analysis, tracking topic evolution, hashtag lifecycles, and engagement decay curves across extended observation periods.
- Content Strategy Optimization: Analyze how media.type correlates with engagement rates, how source values are distributed, and which posting times recur among high-performing tweets in target verticals.
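The follower-weighted sentiment idea from the Sentiment Analysis use case can be sketched in Python as follows. This is one possible weighting, not a method prescribed by the actor: the function name `weighted_sentiment_by_day` is hypothetical, the weight of followers_count + 1 is an arbitrary choice so that zero-follower accounts still count, and days are bucketed by scraped_at because the Data Dictionary lists no separate posting timestamp.

```python
from collections import defaultdict


def weighted_sentiment_by_day(records):
    """Return {YYYY-MM-DD: follower-weighted mean sentiment_score}."""
    totals = defaultdict(lambda: [0.0, 0])  # day -> [weighted sum, total weight]
    for rec in records:
        score = rec.get("sentiment_score")
        if score is None:
            continue  # skip records without a sentiment score
        day = rec["scraped_at"][:10]  # date part of the ISO 8601 timestamp
        followers = rec.get("author", {}).get("followers_count") or 0
        weight = followers + 1  # assumption: +1 keeps zero-follower authors
        totals[day][0] += score * weight
        totals[day][1] += weight
    return {day: s / w for day, (s, w) in sorted(totals.items())}
```

A raw follower count lets a few very large accounts dominate the average; a logarithmic weight is a common alternative if that effect is unwanted.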
Technical Limitations
Important Considerations:
- Rate limiting applies at approximately 180 search requests per 15-minute window; implement exponential backoff for production workloads
- Historical data retrieval is constrained to Twitter's search index depth (typically 7-10 days for free-tier access)
- `impressions` data may be unavailable for tweets from accounts without analytics access; expect null values in 40-60% of records
- Deleted or protected tweets will not appear in results; dataset represents point-in-time availability only
- `sentiment_score` is algorithmically derived and should be validated against domain-specific lexicons for mission-critical applications
- Media URLs are subject to CDN expiration; archive assets externally if long-term retention is required
- Geolocation data (`location` object) is present in <2% of tweets; do not rely on this field for geographic segmentation
- Thread reconstruction via `conversation_id` may be incomplete for deeply nested reply chains (>100 replies)
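The exponential backoff recommended in the rate-limiting note above could look like the Python sketch below. It is a generic retry wrapper under stated assumptions: `RateLimited` is a hypothetical exception that your own request code would raise when it detects a rate-limit response, and the delay parameters are illustrative defaults, not values published for this actor.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller's code on a rate-limit response."""


def with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0,
                 sleep=time.sleep):
    """Run call(); on RateLimited, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter spreads out retries from concurrent workers.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` argument is injectable so the wrapper can be tested without real waiting; in production the default `time.sleep` applies.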
Keywords & Tags: twitter scraper, twitter search scraper, x search scraper, export tweets, tweet data extraction, twitter lead generation, twitter data extractor