Twitter (X.com) Search Dataset Cookieless (Full History)
Under maintenance

Pricing

from $1.50 / 1,000 results

Extract comprehensive Twitter (X) search results, including hidden metadata, engagement metrics, and granular timestamps. This cookieless, high-fidelity extraction tool is designed for sales intelligence and competitive analysis.

Rating: 0.0 (0)

Developer: Surge Street (Maintained by Community)

Actor stats

Bookmarked: 0
Total users: 1
Monthly active users: 1
Last modified: 3 days ago

Twitter (X.com) Search Dataset Cookieless (Full History)

Overview

This actor performs a deep extraction of Twitter (X) search results, capturing comprehensive tweet metadata, author profiles, engagement metrics, and media assets without requiring authentication cookies. The extraction pipeline maintains full historical context through conversation threading and delivers structured JSON output optimized for analytical workloads. Data integrity is ensured through timestamp validation, unique composite identifiers, and schema-enforced type constraints.

Data Dictionary

| Field Name | Data Type | Definition |
| --- | --- | --- |
| tweet_id | String | Unique Twitter-assigned identifier for the tweet object |
| external_id | String | Composite identifier combining tweet ID and extraction timestamp for deduplication |
| scraped_at | String (ISO 8601) | UTC timestamp indicating when the record was extracted from the platform |
| text | String | Full UTF-8 encoded tweet content including mentions, hashtags, and URLs |
| language_code | String | ISO 639-1 two-letter language code detected by Twitter's NLP engine |
| is_verified | Boolean | Indicates whether the author account holds verified status (blue checkmark) |
| is_retweet | Boolean | Flag identifying if the tweet is a retweet of another user's content |
| is_quote | Boolean | Flag identifying if the tweet quotes another tweet with added commentary |
| sentiment_score | Float | Normalized sentiment polarity score ranging from -1.0 (negative) to 1.0 (positive) |
| author.user_id | String | Unique Twitter-assigned identifier for the author account |
| author.username | String | Account handle without @ prefix (e.g., "tech_analyst") |
| author.display_name | String | User-defined display name shown on profile |
| author.followers_count | Integer | Total number of accounts following this user at extraction time |
| author.following_count | Integer | Total number of accounts this user follows at extraction time |
| engagement.likes | Integer | Cumulative like/favorite count at time of extraction |
| engagement.retweets | Integer | Number of times the tweet has been retweeted |
| engagement.replies | Integer | Count of direct reply tweets in the conversation thread |
| engagement.quotes | Integer | Number of quote tweets referencing this tweet |
| engagement.impressions | Integer | Estimated view count as reported by Twitter's analytics |
| media.has_media | Boolean | Indicates presence of attached media assets (images, videos, GIFs) |
| media.type | String | Media classification: "image", "video", "animated_gif", or null |
| media.count | Integer | Number of media items attached to the tweet (max 4) |
| media.urls | Array[String] | Direct URLs to media assets hosted on Twitter's CDN |
| location.place_id | String | Twitter Place ID for geotagged tweets |
| location.place_type | String | Geographic granularity: "city", "admin", "country", or "poi" |
| location.place_name | String | Human-readable location string (e.g., "San Francisco, CA") |
| conversation_id | String | Root tweet ID linking all replies in a conversation thread |
| source | String | Client application used to publish the tweet (e.g., "Twitter Web App") |
| hashtags | Array[String] | Extracted hashtags including the # symbol |

Sample Dataset

Below is a sample of the high-fidelity JSON output:

{
  "tweet_id": "1739284756104728576",
  "external_id": "tw_1739284756104728576_20251221",
  "scraped_at": "2025-12-21T12:30:45Z",
  "text": "The future of AI is not what we expected. Here's why... (1/4)",
  "language_code": "en",
  "is_verified": true,
  "is_retweet": false,
  "is_quote": false,
  "sentiment_score": 0.65,
  "author": {
    "user_id": "87654321",
    "username": "tech_analyst",
    "display_name": "Tech Analyst",
    "followers_count": 45230,
    "following_count": 892
  },
  "engagement": {
    "likes": 3421,
    "retweets": 892,
    "replies": 234,
    "quotes": 56,
    "impressions": 28945
  },
  "media": {
    "has_media": true,
    "type": "image",
    "count": 2,
    "urls": ["https://x.com/img1.jpg", "https://x.com/img2.jpg"]
  },
  "location": {
    "place_id": "01a9a39c7b349001",
    "place_type": "city",
    "place_name": "San Francisco, CA"
  },
  "conversation_id": "1739284756104728500",
  "source": "Twitter Web App",
  "hashtags": ["#AI", "#TechTrends", "#Future"]
}
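
A record like the one above can be sanity-checked before it enters an analysis pipeline. The following is a minimal sketch, not the actor's own validation logic: the `validate_record` helper and the `EXPECTED_TYPES` mapping are hypothetical names, and the checks only mirror what the Data Dictionary states (field types, the ISO 8601 UTC form of scraped_at, and the [-1.0, 1.0] range of sentiment_score). The embedded sample is a trimmed copy of the record shown above.

```python
import json
from datetime import datetime

# Top-level fields and the Python types they should decode to, following the
# Data Dictionary. Nested objects are only checked for being JSON objects.
EXPECTED_TYPES = {
    "tweet_id": str,
    "external_id": str,
    "scraped_at": str,
    "text": str,
    "is_retweet": bool,
    "is_quote": bool,
    "author": dict,
    "engagement": dict,
    "hashtags": list,
}


def validate_record(record):
    """Return a list of problems found in one record (empty list = no problems)."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"{field}: expected {expected.__name__}")

    # scraped_at is documented as an ISO 8601 UTC string, e.g. 2025-12-21T12:30:45Z.
    try:
        datetime.strptime(record.get("scraped_at"), "%Y-%m-%dT%H:%M:%SZ")
    except (TypeError, ValueError):
        problems.append("scraped_at: not in YYYY-MM-DDTHH:MM:SSZ form")

    # sentiment_score is documented as lying in [-1.0, 1.0].
    score = record.get("sentiment_score")
    if not isinstance(score, (int, float)) or not -1.0 <= score <= 1.0:
        problems.append("sentiment_score: missing or outside [-1.0, 1.0]")
    return problems


sample = json.loads("""
{
  "tweet_id": "1739284756104728576",
  "external_id": "tw_1739284756104728576_20251221",
  "scraped_at": "2025-12-21T12:30:45Z",
  "text": "The future of AI is not what we expected.",
  "is_retweet": false,
  "is_quote": false,
  "sentiment_score": 0.65,
  "author": {"user_id": "87654321", "followers_count": 45230},
  "engagement": {"likes": 3421, "impressions": 28945},
  "hashtags": ["#AI", "#TechTrends", "#Future"]
}
""")
print(validate_record(sample))  # prints []
```

Returning a list of problems rather than raising on the first one makes it easier to log every issue in a record during a bulk import.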

Configuration Parameters

The actor takes the following input parameter:

| Parameter | JSON Field Name | Data Type | Example | Description |
| --- | --- | --- | --- | --- |
| Search Query | query | String | "Mr Beast" | Twitter search syntax including keywords, hashtags, mentions, operators (AND/OR), and filters (lang:, from:, since:) |
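
As an illustration of how a query string for the `query` field might be assembled, here is a small sketch. The `build_query` helper is hypothetical, and the operator forms (`lang:`, `from:`, `since:`, `OR`) are the ones named in the table above; how the actor interprets each operator, and the YYYY-MM-DD date form for `since:`, are assumptions rather than documented behavior.

```python
import json


def build_query(phrase, hashtags=(), lang=None, from_user=None, since=None):
    """Assemble a search string from a phrase plus optional operators."""
    # Quote multi-word phrases so they are matched as a unit.
    parts = [f'"{phrase}"' if " " in phrase else phrase]
    if hashtags:
        tags = " OR ".join("#" + tag.lstrip("#") for tag in hashtags)
        parts.append(f"({tags})")
    if lang:
        parts.append(f"lang:{lang}")
    if from_user:
        parts.append(f"from:{from_user.lstrip('@')}")
    if since:
        parts.append(f"since:{since}")  # assumed to be YYYY-MM-DD
    return " ".join(parts)


# The actor input is a JSON object; `query` is the only field listed above.
actor_input = {"query": build_query("Mr Beast", lang="en", since="2025-12-01")}
print(json.dumps(actor_input))
```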

Analytical Use Cases

Sentiment Analysis: Use the sentiment_score and text fields to track sentiment over time across brand mentions, product launches, or crisis events. Weight each score by author.followers_count so that accounts with larger audiences count for more.
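
One way to compute a follower-weighted sentiment series is sketched below. It rests on two assumptions: the Data Dictionary lists no publication timestamp, so records are bucketed by the date part of scraped_at, and raw follower counts are used as weights (a log scale may be more appropriate when a few very large accounts dominate). The function name `weighted_daily_sentiment` is illustrative.

```python
from collections import defaultdict


def weighted_daily_sentiment(records):
    """Return {YYYY-MM-DD: follower-weighted mean sentiment} from a list of records."""
    totals = defaultdict(lambda: [0.0, 0])  # day -> [weighted score sum, weight sum]
    for record in records:
        score = record.get("sentiment_score")
        if score is None:
            continue  # skip records without a score
        day = record["scraped_at"][:10]  # date part of the ISO 8601 timestamp
        # Floor the weight at 1 so accounts with zero followers still count.
        weight = max(record["author"]["followers_count"], 1)
        totals[day][0] += score * weight
        totals[day][1] += weight
    return {day: s / w for day, (s, w) in sorted(totals.items())}


records = [
    {"scraped_at": "2025-12-21T12:30:45Z", "sentiment_score": 1.0,
     "author": {"followers_count": 100}},
    {"scraped_at": "2025-12-21T18:00:00Z", "sentiment_score": -1.0,
     "author": {"followers_count": 300}},
]
print(weighted_daily_sentiment(records))  # prints {'2025-12-21': -0.5}
```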

Network Mapping: Construct social graphs using author.user_id, conversation_id, and engagement metrics to identify key opinion leaders, community clusters, and information diffusion patterns.
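
The listed fields do not include a reply-to author, so a direct reply graph cannot be built from them. A rough alternative, sketched here as an assumption rather than the intended method, is a co-participation graph: two authors are linked once for each conversation_id in which both appear. The function name `co_participation_edges` is illustrative.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_participation_edges(records):
    """Count, for each pair of authors, the conversations both took part in."""
    authors_by_conversation = defaultdict(set)
    for record in records:
        authors_by_conversation[record["conversation_id"]].add(
            record["author"]["user_id"]
        )
    edges = Counter()
    for authors in authors_by_conversation.values():
        # Sort so that each undirected pair has a single canonical key.
        for a, b in combinations(sorted(authors), 2):
            edges[(a, b)] += 1
    return edges


records = [
    {"conversation_id": "c1", "author": {"user_id": "u1"}},
    {"conversation_id": "c1", "author": {"user_id": "u2"}},
    {"conversation_id": "c2", "author": {"user_id": "u1"}},
    {"conversation_id": "c2", "author": {"user_id": "u2"}},
    {"conversation_id": "c2", "author": {"user_id": "u3"}},
]
for pair, weight in co_participation_edges(records).most_common():
    print(pair, weight)
```

The resulting weighted edge list can be loaded into a graph library for centrality or clustering analysis.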

Competitive Intelligence: Monitor competitor mentions through targeted queries, tracking engagement metrics and hashtags to benchmark share-of-voice and campaign performance against industry baselines.

Lead Generation: Filter verified accounts (is_verified: true) with high follower counts discussing relevant topics, extracting author.username for outreach prioritization based on engagement velocity.
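
The filter described above could look like the sketch below. Engagement velocity cannot be derived from a single record, because the listed fields carry no publication time; total interactions on an author's best-performing tweet are used as a stand-in here. The `rank_leads` name and the `min_followers` threshold are illustrative assumptions.

```python
def rank_leads(records, min_followers=10_000):
    """Return usernames of verified, high-follower authors, best engagement first."""
    best_score = {}
    for record in records:
        author = record["author"]
        if not record.get("is_verified"):
            continue
        if author["followers_count"] < min_followers:
            continue
        e = record["engagement"]
        score = e["likes"] + e["retweets"] + e["replies"] + e["quotes"]
        # Keep each author's best-performing tweet as their score.
        if score > best_score.get(author["username"], -1):
            best_score[author["username"]] = score
    return sorted(best_score, key=best_score.get, reverse=True)


def _record(username, verified, followers, likes):
    """Build a minimal record for the demo below."""
    return {
        "is_verified": verified,
        "author": {"username": username, "followers_count": followers},
        "engagement": {"likes": likes, "retweets": 0, "replies": 0, "quotes": 0},
    }


records = [
    _record("a", True, 50_000, 10),
    _record("b", True, 50_000, 100),
    _record("c", False, 50_000, 500),  # not verified: excluded
    _record("d", True, 100, 500),      # too few followers: excluded
]
print(rank_leads(records))  # prints ['b', 'a']
```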

Longitudinal Studies: Utilize scraped_at timestamps and external_id for temporal analysis, tracking topic evolution, hashtag lifecycle, and engagement decay curves across extended observation periods.

Content Strategy Optimization: Analyze media.type correlation with engagement rates, source distribution patterns, and optimal posting times derived from high-performing tweets in target verticals.
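
To compare media types, one option is an engagement rate per record, averaged per media.type. The sketch below assumes the rate is defined as (likes + retweets + replies + quotes) / impressions, which is a common convention and not something this listing specifies. Records with null or zero impressions are skipped. The `engagement_rate_by_media_type` name is illustrative.

```python
from collections import defaultdict
from statistics import mean


def engagement_rate_by_media_type(records):
    """Return {media type: mean engagement rate}, ignoring records without impressions."""
    rates = defaultdict(list)
    for record in records:
        e = record["engagement"]
        impressions = e.get("impressions")
        if not impressions:
            continue  # null or zero impressions: rate is undefined
        interactions = e["likes"] + e["retweets"] + e["replies"] + e["quotes"]
        media_type = record["media"]["type"] or "none"  # null type means no media
        rates[media_type].append(interactions / impressions)
    return {media_type: mean(values) for media_type, values in rates.items()}


def _record(media_type, likes, impressions):
    """Build a minimal record for the demo below."""
    return {
        "media": {"type": media_type},
        "engagement": {"likes": likes, "retweets": 0, "replies": 0,
                       "quotes": 0, "impressions": impressions},
    }


records = [
    _record("image", 25, 100),
    _record("image", 75, 100),
    _record(None, 10, 100),
    _record("video", 50, None),  # skipped: impressions unavailable
]
print(engagement_rate_by_media_type(records))  # prints {'image': 0.5, 'none': 0.1}
```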

Technical Limitations

Important Considerations:

  • Rate limiting applies at approximately 180 search requests per 15-minute window; implement exponential backoff for production workloads
  • Historical data retrieval is constrained to Twitter's search index depth (typically 7-10 days for free-tier access)
  • impressions data may be unavailable for tweets from accounts without analytics access; expect null values in 40-60% of records
  • Deleted or protected tweets will not appear in results; dataset represents point-in-time availability only
  • sentiment_score is algorithmically derived and should be validated against domain-specific lexicons for mission-critical applications
  • Media URLs are subject to CDN expiration; archive assets externally if long-term retention is required
  • Geolocation data (location object) is present in <2% of tweets; do not rely on this field for geographic segmentation
  • Thread reconstruction via conversation_id may be incomplete for deeply nested reply chains (>100 replies)
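
The exponential backoff recommended in the first item can be sketched as a small retry wrapper. Everything named here is an assumption for illustration: `RateLimitError` stands in for whatever error the calling code sees when the limit is hit, and `fetch` is any zero-argument callable that performs one search request. Neither is part of the actor's interface.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised by the caller's request code when rate limited."""


def with_backoff(fetch, max_attempts=5, base_delay=1.0, max_delay=60.0,
                 sleep=time.sleep):
    """Call fetch(), retrying on RateLimitError with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


# Demo: a fake request that is rate limited twice before succeeding.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise RateLimitError("rate limited")
    return "ok"


delays = []
print(with_backoff(flaky_fetch, sleep=delays.append))  # prints ok
```

Passing `sleep` as a parameter keeps the wrapper testable without real waiting; production code would leave it at the default `time.sleep`.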