
All-in-One X/Twitter Scraper

Pricing

from $0.20 / 1,000 tweet results

X/Twitter scraper — 10 modes: tweets, profiles, followers, comments, timelines, lists, search & more. From $0.20/1K — up to 90% cheaper than alternatives. Premium residential proxy (~95% success rate). apidojo-compatible output. MCP-ready for AI agents.

Rating

0.0 (0)

Developer

Japi Cricket (Maintained by Community)

Actor stats

Bookmarked: 0 · Total users: 61 · Monthly active users: 19 · Last modified: 4 hours ago


All-in-One X/Twitter Scraper

What does All-in-One X/Twitter Scraper do?

Scrape tweets, profiles, followers, timelines, comments, lists & more — residential proxy included, 128 MB memory. 10 core modes + 2 advanced modes, pay-per-result from $0.24/1K. Works with AI agents (Claude, GPT, Cursor) via MCP.

Three modes work without any login cookies (posts, profiles, data extractor).

Why choose this over separate X scrapers?

  • 10 modes in one actor — tweets, profiles, search, timelines, followers, comments, lists, user search, all-in-one, data extractor — one integration to maintain
  • Residential proxy included — Evomi-routed traffic on authenticated modes at no extra per-result cost
  • 128 MB memory — runs efficiently on minimal resources
  • ≥95% field accuracy — every mode validated against fresh Chrome ground truth (most recently on 2026-04-25, across 60 runs of this actor and 22 competitor runs)
  • AI-ready — works with Claude, GPT, and Cursor via MCP protocol
  • Drop-in replacement for popular alternatives — every output item carries id / fullText / possiblySensitive / extendedEntities / protected aliases so existing pipelines keep working with zero code changes

What's New (2026-05-02 — §62 X-only audit + drift detection)

Daily-test infrastructure caught X's overnight bundle bump within 36 hours of yesterday's drift-detection ship. 8 GraphQL queryIds rotated simultaneously (TweetDetail / TweetResultByRestId / SearchTimeline / UserTweets / UserTweetsAndReplies / Followers / Following / ListLatestTweetsTimeline). Pre-flight detection fired before fan-out — the operator was flagged by email immediately, the scraper auto-rotates at runtime, and push-and-baseline closes the loop within hours of detection.

12-mode gap matrix verified: read 1-5 dataset items per X mode from past 100 successful runs, diffed against EXPECTED_FIELDS_BY_PLATFORM_MODE.x. Result: 11 of 12 modes have zero EXPECTED-field gaps. The 1 exception (x-post-scraper) returned an error item from the rotation event itself — proving the drift-detection ship works end-to-end.

Ground-truth comparator extended back to 11 fields: joinedDate re-added to compareX with a ±2-month tolerance window after a 9-handle cross-check (8/9 match). The window absorbs UTC-vs-local-rendered TZ-flip artifacts while still catching genuine semantic divergence.

4 ambiguous-handle test cases surfaced for the X-team upstream-investigation case (handles where X's UserByScreenName returns a less-famous registered owner): aaronlevie / tim / obama / satya. X enforces handle uniqueness — these are not scraper bugs, just X-side data.

Full audit: docs/audit/2026-05-02-x-only-improvements.md.

What's New (build 0.1.75 — 2026-04-26)

Sprint A (2026-04-26) appended 4 final apidojo-compat naming aliases — inReplyToId (= inReplyToTweetId), inReplyToUsername (= inReplyToUserName), quoteId (= quotedTweet?.tweetId), and inputSource (= input on every item). With these, the actor is a complete schema-level drop-in replacement for apidojo on tweet and per-item fields.

Earlier (build 0.1.73 — 2026-04-25)

After a head-to-head competitive analysis vs. the top X-scraper actors on the Apify Store (60 baseline runs + 22 competitor benchmark runs), we shipped 9 additive output fields without changing any existing behaviour — every existing integration keeps working unchanged.

Tier 1 — apidojo-compat aliases + every-item metadata

Field | Where | What it does
id | every tweet + user item | Mirrors tweetId / twitterId. Drop-in compatibility for pipelines that reference item.id.
fullText | tweet items | Mirrors text (carries the full long-form body when isLongForm: true).
possiblySensitive | tweet items | Mirrors isSensitive. Always boolean.
extendedEntities | tweet items | {media: [...]} wrapper for media. Always present (empty {media: []} when none).
protected | user items | Lowercase alias for isProtected.
scrapedAt | every item | UTC ISO 8601 stamp of when this snapshot was actually parsed from X's response. Useful for dataset cataloging, dedup-by-window, and audit trails. Cached items keep their original parse time.

Tier 2 — real new signal fields

Field | Where | What it does
fastFollowersCount | user items | legacy.fast_followers_count from X — bot-prone signal for influencer / lead-quality analysis (high ratio of fast-followers to total = suggested-follow-driven, not organic). null when X omits the field.
pinnedTweetIds | user items | Array form of pinned tweet IDs (matches the wider X scraper schema family). Empty [] when none. The legacy singular pinnedTweetId is preserved.
isPinned | tweet items | true when this tweet's tweetId matches the author's pinned tweet ID. Useful for de-emphasising pinned tweets in chronological analysis.
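The fast-follower ratio described above is a one-line derivation from two output fields. A minimal sketch (the helper name fast_follower_ratio is ours, not part of the actor):

```python
from typing import Optional


def fast_follower_ratio(user: dict) -> Optional[float]:
    """Ratio of suggested-follow ("fast") followers to total followers.

    High values suggest a suggested-follow-driven audience rather than
    organic growth. Returns None when X omitted fastFollowersCount or
    the account has no followers.
    """
    fast = user.get("fastFollowersCount")
    total = user.get("followers")
    if fast is None or not total:
        return None
    return fast / total
```

Feed it any user item from this actor's dataset; items where X omitted the field come back as None rather than a misleading zero.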

All 9 fields ship alongside the existing 35-43 output fields per item — nothing was removed or renamed.

Getting Started

  1. Click "Try for free" at the top of this page
  2. Choose a scraping mode (Post Scraper, Profile Scraper, Tweet Search, Hashtag Scraper, Timeline, Follower, Following, Comment, List, or Data Extractor)
  3. Paste tweet URLs, usernames, or enter a search query
  4. Click Start — results appear in the Dataset tab within seconds
  5. Download as JSON, CSV, or Excel — or connect via API, n8n, Make, or Zapier

Residential proxy is included — no setup required. 3 modes work without login cookies (post, profile, data-extractor).

Easiest Way to Start: Paste a URL

Just paste any X/Twitter URL into the "Start URLs" field and hit Start. The scraper auto-detects the type:

URL Pattern | Auto-Detected Mode
x.com/user/status/1234567890 | Post Scraper (tweet by URL)
x.com/username | Profile Scraper
x.com/i/lists/1234567890 | List Scraper

For Tweet Search and User Search, enter keywords in the "Search Queries" field.
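The auto-detection above amounts to pattern-matching the URL path. A rough sketch of that classification — the regexes are illustrative, not the actor's actual code:

```python
import re

# Order matters: status and list URLs would otherwise partially
# resemble a bare profile path.
PATTERNS = [
    (re.compile(r"^https?://(?:x|twitter)\.com/[^/]+/status/\d+"), "x-post-scraper"),
    (re.compile(r"^https?://(?:x|twitter)\.com/i/lists/\d+"), "x-list-scraper"),
    (re.compile(r"^https?://(?:x|twitter)\.com/[A-Za-z0-9_]+/?$"), "x-profile-scraper"),
]


def detect_mode(url: str) -> str:
    """Map an X/Twitter URL to the mode the table above documents."""
    for pattern, mode in PATTERNS:
        if pattern.match(url):
            return mode
    return "unknown"
```

Both x.com and legacy twitter.com hosts are accepted, matching the twitterUrl alias the output carries.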

10 Scraping Modes

Mode | Description | Auth Required | Best For
x-post-scraper | Scrape specific tweets by URL | None | Tweet archiving, engagement tracking
x-profile-scraper | Full user profiles with stats | None | Lead enrichment, influencer research
x-data-extractor | Lightweight tweet URL scraper | None | Quick tweet data extraction
x-search-scraper | Search tweets by keywords/phrases | Login cookies | Social listening, trend monitoring
x-hashtag-scraper | Search tweets by hashtag | Login cookies | Hashtag tracking, trend analysis
x-timeline-scraper | Scrape a user's tweet timeline | Login cookies | Content analysis, user monitoring
x-follower-scraper | Get followers of an account | Login cookies | Audience analysis, lead discovery
x-following-scraper | Get accounts a user follows | Login cookies | Network mapping, competitive research
x-comment-scraper | Get replies/conversation threads | Login cookies | Sentiment analysis, community insights
x-list-scraper | Scrape tweets from X/Twitter lists | Login cookies | Curated feed monitoring

Standard vs Authenticated Mode

3 modes work without any login cookies. 7 modes require your X/Twitter cookies for full functionality.

What Works Without Cookies (Standard Mode)

No login, no risk. Just paste and scrape:

  • Post Scraper: Full tweet data with engagement metrics, media, author info, quoted tweets
  • Profile Scraper: Full user profile with followers, bio, verification, account stats
  • Data Extractor: Same as Post Scraper (lightweight mode)

What Requires Cookies (Authenticated Mode)

Provide your auth_token and ct0 cookies to unlock search and social graph features:

Mode | What It Unlocks
Tweet Search (x-search-scraper) | Search tweets by keywords, phrases, and advanced operators
Hashtag Scraper (x-hashtag-scraper) | Search tweets by hashtag (e.g., #AI, #crypto)
Timeline (x-timeline-scraper) | Scrape a user's full tweet history
Followers (x-follower-scraper) | Get an account's follower list
Following (x-following-scraper) | Get the accounts a user follows
Comments (x-comment-scraper) | Get replies and conversation threads
List Scraper (x-list-scraper) | Scrape tweets from curated lists

When to Use Which Mode

Your Goal | Recommended Mode | Cookies Needed?
Save a specific tweet with engagement data | x-post-scraper | No
Get a user's profile info and follower count | x-profile-scraper | No
Monitor brand mentions or trending topics | x-search-scraper | Yes
Track tweets under a specific hashtag | x-hashtag-scraper | Yes
Analyze what someone has been tweeting | x-timeline-scraper | Yes
Build a list of an account's followers | x-follower-scraper | Yes
See who an account follows | x-following-scraper | Yes
Analyze public reactions to a tweet | x-comment-scraper | Yes
Monitor a curated Twitter list | x-list-scraper | Yes

Pricing — Pay Per Result, No Monthly Fee

Mode | Price / 1K results (Starter) | Price / 1K results (Scale) | Price / 1K results (Business)
x-post-scraper | $0.24 | $0.22 | $0.20
x-profile-scraper | $0.24 | $0.22 | $0.20
x-search-scraper | $0.24 | $0.22 | $0.20
x-hashtag-scraper | $0.24 | $0.22 | $0.20
x-timeline-scraper | $0.24 | $0.22 | $0.20
x-comment-scraper | $0.24 | $0.22 | $0.20
x-list-scraper | $0.24 | $0.22 | $0.20
x-follower-scraper | $0.24 | $0.22 | $0.20
x-following-scraper | $0.24 | $0.22 | $0.20
x-data-extractor | $0.30 | $0.27 | $0.25

Apify Subscription Discounts

Higher Apify subscription plans get automatic discounts on all modes:

Apify Plan | Discount
Free / Starter | none
Scale | 5% off
Business | 10% off
Enterprise | 15% off

Cost examples:

  • 1,000 tweets from search: $0.24
  • 500 user profiles: $0.12
  • 10,000 followers: $2.40
  • 100 tweet URLs: $0.024

You only pay for results delivered. Platform compute costs are included.
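The cost examples above follow directly from the per-result price. As arithmetic, at the Starter tier:

```python
def run_cost(results: int, price_per_1k: float = 0.24) -> float:
    """Pay-per-result cost at the Starter-tier price of $0.24 / 1K results."""
    return round(results / 1000 * price_per_1k, 4)
```

So 500 profiles cost run_cost(500) = $0.12 and 10,000 followers cost run_cost(10000) = $2.40, matching the examples listed above.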

Why This X/Twitter Scraper?

  • Among the cheapest on Apify — $0.24/1K results across all core modes — significantly below competitor medians
  • 10 modes in one actor — tweets, profiles, search, timelines, followers, comments, lists, user search, all-in-one, data extractor — one integration to maintain
  • Residential proxy included — Evomi traffic built in; unauthenticated modes auto-skip to datacenter for bandwidth savings
  • HTTP-only — lightweight, no browser required
  • 128 MB memory — runs efficiently on minimal resources
  • ≥95% field accuracy verified against Chrome ground truth + backed by a 26-case parser regression suite that runs on every build (build 0.1.60, 2026-04-24)
  • Active cookie rotation — multi-cookie X_LOGIN_COOKIES pools cycle automatically when X burns an entry; emits COOKIE_EXHAUSTED when all are burned
  • Cross-run cache (optional) — opt into a 24 h KV cache with useCrossRunCache: true so scheduled/polling runs skip re-scraping tweets and profiles that are already fresh
  • Sparse output (optional) — pass fields: ["tweetId","text","likeCount"] to emit only the keys you need; shrinks payloads for Google Sheets / HubSpot / CSV integrations
  • Graceful rate-limit handling — honours X's x-rate-limit-reset header; backoff waits exactly as long as the server asked
  • Self-healing on doc-ID rotation — when X rotates a GraphQL queryId, the scraper fetches the live bundle, resolves the new ID, and retries; still logs loudly for a follow-up PR
  • Circuit breaker — aborts a doomed batch after 10 same-category failures in a row (RATE_LIMITED / AUTH_EXPIRED / etc.) instead of burning CU on requests that are guaranteed to fail
  • Resume capability — pick up where failed runs left off via resumeFromDataset, or skip already-seen items via onlyIfChangedSince: {dataset, since}
  • Optional deep user hydration — hydrateFullUser: true enriches follower / following / user-search results with the fields X trims on list endpoints (canHighlightTweets, verifiedSince, full professional)
  • Completion webhook — set completionWebhookUrl to POST a summary to your own endpoint (Slack / n8n / Zapier) when the run finishes
  • Schema-versioned output — every item carries schemaVersion: "1.0.0" so downstream consumers can pin behaviour
  • Clean text output — text, description, and other user-visible string fields are HTML-entity decoded (&amp; → &, &#39; → ') and stripped of zero-width / control characters, so output pastes straight into spreadsheets / CRMs without surprises
  • Auto-batch suggestions — on small runs (<5 results) the scraper logs a one-line tip pointing at the bulk-input fields; summary.autoBatchTip carries the same data for dashboards. No pricing change; just a nudge toward profitable batch sizes
  • MCP-compatible — works with AI agents (Claude, GPT, Cursor) out of the box
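Several of the optional features above are plain input fields. A sketch of one run input combining them — the field names (fields, useCrossRunCache, completionWebhookUrl) come from the feature list above, while the values and webhook URL are placeholders:

```python
# Hypothetical run input combining sparse output, the cross-run cache,
# and a completion webhook. Values are placeholders, not real credentials.
run_input = {
    "scrapeMode": "x-search-scraper",
    "searchQueries": ["web scraping"],
    "loginCookies": "auth_token=YOUR_TOKEN; ct0=YOUR_CT0",
    "fields": ["tweetId", "text", "likeCount"],  # sparse output: emit only these keys
    "useCrossRunCache": True,                    # opt into the 24 h KV cache
    "completionWebhookUrl": "https://example.com/hooks/run-finished",
}
```

Pass the dict as run_input to the actor call shown in the per-mode Python examples below each mode section.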

Cookies: bring your own (customer-supplied)

7 of 10 modes require X/Twitter login cookies. Here's how to get them:

  1. Open x.com in your browser and log in
  2. Open Developer Tools (F12 or Cmd+Opt+I)
  3. Go to Application > Cookies > x.com
  4. Copy the values of auth_token (HttpOnly — visible in DevTools cookie panel) and ct0
  5. Paste as: auth_token=YOUR_TOKEN; ct0=YOUR_CT0

Modes that do NOT require cookies: x-post-scraper, x-profile-scraper, x-data-extractor.
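The cookie string pasted in step 5 is a plain semicolon-separated name=value list. A minimal parser, useful for sanity-checking the string before a run (the helper is ours, not part of the actor):

```python
def parse_login_cookies(cookie_str: str) -> dict:
    """Split an 'auth_token=...; ct0=...' string into a name->value dict."""
    cookies = {}
    for part in cookie_str.split(";"):
        part = part.strip()
        if "=" in part:
            name, value = part.split("=", 1)
            cookies[name.strip()] = value.strip()
    return cookies
```

If the result is missing either the auth_token or ct0 key, the authenticated modes will fail their startup health check.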

Important: this actor does NOT include any X/Twitter cookies for customer runs

When a mode requires authentication, you must paste your own cookies into the loginCookies input field. There is no shared cookie pool, no built-in fallback, and no operator-side cookie injection — every authenticated run uses the cookies you provide, attributing the activity to your X account.

This is intentional. Account-flag risk lives with the supplied cookies, not with the actor author. Top-MAU X scrapers on Apify follow the same pattern.

Customer usage — supply cookies via input

{
  "scrapeMode": "x-search-scraper",
  "searchQueries": ["...your query..."],
  "loginCookies": "auth_token=YOUR_TOKEN; ct0=YOUR_CT0"
}

Cookie rotation: for large runs, provide multiple cookies separated by |||. The scraper uses the first cookie by default and automatically cycles to the next entry whenever X burns the active one (AUTH_EXPIRED response). If every pooled cookie is burned during the same run, a COOKIE_EXHAUSTED error item is emitted (see Error Items):

auth_token=aaa; ct0=bbb|||auth_token=ccc; ct0=ddd
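Splitting such a pool is straightforward; a sketch of how the entries separate (the cookie-pool helper is ours, for illustration — the actor does this internally):

```python
def cookie_pool(login_cookies: str) -> list:
    """Split a '|||'-separated cookie pool into individual cookie strings."""
    return [entry.strip() for entry in login_cookies.split("|||") if entry.strip()]


pool = cookie_pool("auth_token=aaa; ct0=bbb|||auth_token=ccc; ct0=ddd")
# pool[0] is used first; on AUTH_EXPIRED the scraper advances to pool[1]
```

Supplying a pool of two or three burner-account cookies is usually enough to ride out a single burned session on a large run.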

Operator daily-testing only

The daily-test orchestrator passes the operator's cookies via input.loginCookies (same input field customers use). This keeps customer and operator paths uniform and prevents accidental operator-cookie leakage to customer runs. The X_LOGIN_COOKIES environment variable is no longer read by the scraper as of build 2026-05-08 — operator cookies live on the orchestrator and the cookie-health-check probe, never on this scraper actor.

Always use a dedicated burner X account for testing, never your primary personal account — X's anti-automation system can flag accounts that show sustained API access patterns.

Migration note (2026-05-08)

Earlier versions of this actor accepted an X_LOGIN_COOKIES env-var fallback when loginCookies was empty. That fallback has been removed for the customer-vs-operator cookie boundary. Customer runs without supplied cookies now fail cleanly for cookie-required modes instead of silently using whatever was in the env var.

Every authenticated run validates your auth_token + ct0 against the X API at startup. The result is surfaced in the run logs:

INFO Cookie health check: VALID (verified via @YourHandle)

If validation fails, you'll see:

ERROR Cookie health check: FAILED (<reason>)
ERROR Authenticated modes will fail. Refresh the cookies provided via input.loginCookies with fresh auth_token+ct0 from x.com.

No cookies or cookie fingerprints are stored anywhere — not in the run's key-value store, not in named KV stores. Cookies live only in the loginCookies input field per-run. This is an explicit policy: we persist no cookie-derived data, even hashed.

Typical cookie lifetime on X is 30–60 days. If your scheduled runs start failing with auth errors, refresh the cookies in your loginCookies input field from a logged-in browser session.

Residential Proxy (Included by Default)

The scraper defaults to proxyTier: "residential" — Evomi residential traffic is included in the per-result price, no extra setup. The unauthenticated modes (x-post-scraper, x-profile-scraper, x-data-extractor) automatically skip the residential proxy and use datacenter IPs because they hit syndication / guest-token endpoints that don't need residential routing — this saves bandwidth at zero accuracy risk.

Opt-out: Datacenter only

If you want every mode to run on datacenter IPs (e.g. you're scraping only tweet URLs and want absolute-minimum latency), set proxyTier to "none":

{
  "scrapeMode": "x-post-scraper",
  "tweetURLs": ["https://x.com/NASA/status/..."],
  "proxyTier": "none"
}

When Residential Matters

  • Cookies expiring fast — residential IPs look more like real users, reducing session invalidation
  • Rate limits on authenticated modes — tweet search, timeline, followers, comments
  • Large-scale runs — scraping thousands of results per session with login cookies

These are exactly the modes that run on residential by default.

Alternative: Bring Your Own Proxy

If you already have a residential proxy subscription, set proxyTier to "custom" and paste your proxy URL into the Proxy Configuration field. Apify Proxy is not supported — this actor is Evomi-only by policy.

MCP Integration for AI Agents

This scraper works with AI agents via the Model Context Protocol (MCP). Connect it to Claude Desktop, Cursor, GPT, or any MCP-compatible client.

Setup:

  1. Go to mcp.apify.com
  2. Add "All-in-One X/Twitter Scraper" to your MCP server
  3. Ask your AI: "Find the latest tweets about AI from NASA"

Example prompts for your AI agent:

  • "Scrape the profile of @elonmusk on X"
  • "Search X for tweets about web scraping from the last week"
  • "Get the followers of @apify on Twitter"
  • "Find all replies to this tweet: https://x.com/..."

Works with Claude Desktop, Cursor, GPT via MCP, and any other MCP-compatible AI client.

Integrations

n8n

  1. Add the Apify node in your n8n workflow
  2. Select "All-in-One X/Twitter Scraper" as the actor
  3. Configure the mode and input parameters
  4. Connect the output to your CRM, Google Sheets, or database

Make.com (Integromat)

  1. Add the Apify module to your scenario
  2. Select "Run Actor" and choose this scraper
  3. Map the JSON output fields to your downstream modules
  4. Use for automated tweet monitoring, lead enrichment, or CRM syncing

Zapier

  1. Create a new Zap with Apify as the trigger or action
  2. Select "Run Actor" and configure with this scraper's actor ID
  3. Map output fields to Google Sheets, HubSpot, Salesforce, or Slack
  4. Trigger on schedule or from a webhook

REST API & SDKs

Use the Apify API, JavaScript SDK, or Python SDK for programmatic access. See the Python examples in each mode section below.


Mode 1: Post Scraper (x-post-scraper)

Scrape specific tweets by URL with full engagement data, media, author info, and quoted tweets. No login cookies required.

Pricing tip: batch ≥5 tweets per run to amortize the per-run actor-start fee. Single-item runs are ~4× less cost-efficient than 10-item batches because the fixed startup overhead dominates a 1-item run.

Input Parameters

Parameter | Type | Required | Description
scrapeMode | string | Yes | "x-post-scraper"
tweetURLs | string[] | Yes | Tweet URLs or tweet IDs
maxResults | integer | No | Max results (default: 100)

Input Example

{
  "scrapeMode": "x-post-scraper",
  "tweetURLs": [
    "https://x.com/elonmusk/status/1728108619189874825",
    "https://x.com/NASA/status/1630332507265589248"
  ]
}
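Since tweetURLs accepts either full URLs or bare tweet IDs, normalizing a mixed list is a one-liner. A sketch (the helper name to_tweet_id is ours):

```python
import re


def to_tweet_id(url_or_id: str) -> str:
    """Normalize a status URL or a bare numeric ID to the tweet ID."""
    match = re.search(r"/status/(\d+)", url_or_id)
    return match.group(1) if match else url_or_id.strip()
```

Handy for de-duplicating input lists before a run, since the same tweet can arrive as an x.com URL, a twitter.com URL, or a raw ID.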

Output Fields

Field | Type | Description | Example
type | string | Always "tweet" | "tweet"
tweetId | string | Unique tweet ID | "1728108619189874825"
url | string | Full X.com URL | "https://x.com/elonmusk/status/..."
twitterUrl | string | Legacy twitter.com URL | "https://twitter.com/elonmusk/status/..."
text | string | Tweet content. HTML entities are decoded (&amp; → &, &#39; → ', etc.) and zero-width / control characters are stripped. Newlines + emoji are preserved. | "More than 10 per human on average"
retweetCount | integer | Retweet count | 8880
replyCount | integer | Reply count | 5931
likeCount | integer | Like count | 92939
quoteCount | integer | Quote tweet count | 2862
bookmarkCount | integer | Bookmark count | 605
viewCount | integer | View count | 37422895
createdAt | string | Tweet timestamp (X native format) | "Fri Nov 24 17:49:36 +0000 2023"
normalizedCreatedAt | string | Same timestamp as createdAt in ISO 8601 (UTC). Additive — the native field is kept for backward compatibility. null only when X omits the timestamp. | "2023-11-24T17:49:36.000Z"
lang | string | Language code (ISO 639-1) | "en"
source | string | Posting app/device | "Twitter for iPhone"
isReply | boolean | Is a reply | false
isRetweet | boolean | Is a retweet | false
isQuote | boolean | Is a quote tweet | true
conversationId | string | Thread root ID | "1728108619189874825"
inReplyToTweetId | string | Parent tweet ID (if reply) | null
inReplyToUserId | string | Parent user ID (if reply) | null
inReplyToUserName | string | Parent username (if reply) | null
hashtags | array | Hashtag strings | ["AI", "space"]
urls | array | URL objects | [{url, expandedUrl, displayUrl}]
userMentions | array | Mention objects | [{userName, name, twitterId}]
media | array | Media objects (photos, videos, GIFs). Each entry: {type, url, expandedUrl, videoUrl, durationMs, bitrate, videoVariants, width, height, aspectRatio, altText}. Video enrichment (duration / bitrate / aspectRatio) is populated when X provides them. | [{type:"video", url, videoUrl, durationMs:30580, bitrate:2176000, aspectRatio:1.7778}]
geo | object | Flattened geolocation when the tweet is geotagged: {lat, lng, placeName, placeType, country}. null for untagged tweets. Complements the raw place object. | {lat: 37.7749, lng: -122.4194, placeName: "San Francisco, CA", country: "United States"}
card | object | Link preview card | {type, title, description, url}
place | object | Geolocation data | null
author | object | Author profile | {userName, name, twitterId, followers, ...}
quotedTweet | object | Quoted tweet (full object) | {tweetId, text, author, ...}
isLongForm | boolean | X Premium long-form tweet (>280 chars via note_tweet). text carries the full body when true. | false
isSensitive | boolean | X-marked possibly sensitive (NSFW/graphic). Always boolean (default false) to match X's UI — true only when X explicitly flags. | false
editHistoryTweetIds | array | Tweet IDs forming the edit chain (current tweet is last entry). Contains at least [tweetId] whenever the tweet is still within its edit window; additional IDs appear after edits. null only when X does not expose any edit metadata for the tweet. | ["1728108619189874825"]
isEdited | boolean | Derived: editHistoryTweetIds.length > 1. | false
editableUntilMsecs | integer | Unix ms when the tweet stops being editable. null when not editable. | 1677538334000
communityId | string | ID of the X Community this tweet was posted in (if any). null for regular public tweets. | null
translation | object | Auto-translation when X attached one. Shape: {text, sourceLang, targetLang, translator}. null when absent. | null
article | object | X Article metadata when the tweet carries a longform article. Shape: {id, title, previewText, coverMediaKey, coverMediaUrl, wordCount}. null for non-article tweets. | null
id | string | (0.1.73 alias) Mirrors tweetId for drop-in compatibility with pipelines that reference item.id. | "1728108619189874825"
fullText | string | (0.1.73 alias) Mirrors text. Carries the full long-form body when isLongForm: true. | "More than 10 per human on average"
possiblySensitive | boolean | (0.1.73 alias) Mirrors isSensitive. Always boolean. | false
extendedEntities | object | (0.1.73) {media: [...]} wrapper. Always present (empty {media: []} when no media). | {"media": []}
isPinned | boolean | (0.1.73) true when this tweet's ID matches the author's pinned tweet ID. Always false on syndication path (no author-pinned info available). | false
scrapedAt | string | (0.1.73) UTC ISO 8601 stamp of when this snapshot was parsed. Cached items keep their original parse time. | "2026-04-25T10:40:33.131Z"
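Two of the derived fields above are easy to reproduce client-side when cross-checking output: normalizedCreatedAt is the native timestamp reparsed as ISO 8601, and isEdited is a length check on the edit chain. A sketch (helper names are ours; the ".000" milliseconds are hardcoded because X's native format carries no sub-second precision):

```python
from datetime import datetime, timezone


def normalize_created_at(native: str) -> str:
    """X native 'Fri Nov 24 17:49:36 +0000 2023' -> ISO 8601 UTC string."""
    dt = datetime.strptime(native, "%a %b %d %H:%M:%S %z %Y")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")


def is_edited(edit_history_tweet_ids) -> bool:
    """The documented derivation: more than one ID in the edit chain."""
    return bool(edit_history_tweet_ids) and len(edit_history_tweet_ids) > 1
```

Running normalize_created_at on the createdAt example above reproduces the normalizedCreatedAt example exactly.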

Output Example

{
  "type": "tweet",
  "tweetId": "1728108619189874825",
  "id": "1728108619189874825",
  "url": "https://x.com/elonmusk/status/1728108619189874825",
  "twitterUrl": "https://twitter.com/elonmusk/status/1728108619189874825",
  "text": "More than 10 per human on average",
  "fullText": "More than 10 per human on average",
  "retweetCount": 8880,
  "replyCount": 5931,
  "likeCount": 92939,
  "quoteCount": 2862,
  "bookmarkCount": 605,
  "viewCount": 37422895,
  "createdAt": "Fri Nov 24 17:49:36 +0000 2023",
  "normalizedCreatedAt": "2023-11-24T17:49:36.000Z",
  "lang": "en",
  "source": "Twitter for iPhone",
  "isReply": false,
  "isRetweet": false,
  "isQuote": true,
  "isSensitive": false,
  "possiblySensitive": false,
  "isPinned": false,
  "conversationId": "1728108619189874825",
  "hashtags": [],
  "urls": [],
  "userMentions": [],
  "media": [],
  "extendedEntities": { "media": [] },
  "author": {
    "type": "user",
    "userName": "elonmusk",
    "name": "Elon Musk",
    "twitterId": "44196397",
    "isBlueVerified": true,
    "profilePicture": "https://pbs.twimg.com/profile_images/.../image_400x400.jpg",
    "followers": 237889308,
    "following": 1308
  },
  "quotedTweet": {
    "type": "tweet",
    "tweetId": "1728107610631729415",
    "text": "The posts on X gets ~ 100 billion impressions every day.",
    "likeCount": 3396,
    "viewCount": 38475358,
    "author": { "userName": "cb_doge", "name": "DogeDesigner" }
  },
  "scrapedAt": "2026-04-25T10:40:33.131Z"
}

Use Cases

  • Tweet archiving: Save important tweets with full engagement data before they're deleted
  • Engagement tracking: Monitor how specific tweets perform over time
  • Content research: Analyze viral tweets to understand what resonates
  • Fact-checking: Capture tweet content with metadata for verification

How to Run

Python:

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("get-leads/all-in-one-x-scraper").call(run_input={
    "scrapeMode": "x-post-scraper",
    "tweetURLs": [
        "https://x.com/elonmusk/status/1728108619189874825",
        "https://x.com/NASA/status/1630332507265589248"
    ]
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{item['author']['userName']}: {item['text'][:80]} ({item['likeCount']} likes, {item['viewCount']} views)")

Mode 2: Profile Scraper (x-profile-scraper)

Scrape full X/Twitter user profiles with follower counts, bio, verification status, and account metadata. No login cookies required.

Input Parameters

Parameter | Type | Required | Description
scrapeMode | string | Yes | "x-profile-scraper"
twitterHandles | string[] | Yes | Usernames, @handles, or profile URLs
maxResults | integer | No | Max results (default: 100)

Input Example

{
  "scrapeMode": "x-profile-scraper",
  "twitterHandles": ["elonmusk", "NASA", "OpenAI"]
}

Output Fields

Field | Type | Description | Example
type | string | Always "user" | "user"
twitterId | string | User REST ID | "11348282"
userName | string | @handle | "NASA"
name | string | Display name | "NASA"
url | string | X.com URL | "https://x.com/NASA"
twitterUrl | string | Legacy twitter.com URL | "https://twitter.com/NASA"
description | string | Bio text. Same HTML-entity decoding + control-char stripping as text so the bio pastes cleanly into downstream tools. | "The Moon is the mission..."
location | string | Self-reported location | "Moonbound"
website | string | Profile website URL | "https://t.co/9NkQJKAVks"
isVerified | boolean | Legacy verified (pre-2023) | false
isBlueVerified | boolean | X Premium subscriber | true
verifiedType | string | Verification type | "Government", "Business", or null
profilePicture | string | Avatar URL (400x400) | "https://pbs.twimg.com/..."
coverPicture | string | Banner image URL | "https://pbs.twimg.com/..."
followers | integer | Follower count | 90633649
following | integer | Following count | 117
statusesCount | integer | Total tweets posted | 73634
favouritesCount | integer | Total likes given | 16673
listedCount | integer | Lists the user is on | 96785
mediaCount | integer | Total media posted | 27827
createdAt | string | Account creation date (X native format) | "Wed Dec 19 20:20:32 +0000 2007"
normalizedCreatedAt | string | Same timestamp in ISO 8601 (UTC) | "2007-12-19T20:20:32.000Z"
isProtected | boolean | Private account | false
canDm | boolean | DMs open | false
professional | object | Professional account info — {type, category:[{name, iconName, id}]} | {type: "Creator", category: []}
pinnedTweetId | string | Pinned tweet ID | "2040213736883892403"
affiliateBadge | object | Badge when account is affiliated with a verified org (e.g. @Tesla staff get a Tesla badge). Shape: {labelType, description, badgeUrl, sourceUrl, sourceUserScreenName}. null otherwise. | {"labelType":"BusinessLabel","description":"X","badgeUrl":"https://pbs.twimg.com/..."}
verifiedSince | string | Verification timestamp (ISO 8601). null when X does not expose a positive timestamp (common for legacy verifications and X-affiliate accounts). | "2017-10-05T08:46:27.580Z"
canHighlightTweets | boolean | Whether this user has the Highlights tab enabled. Populated on dedicated profile lookups (x-profile-scraper, x-scraper); null on list-mode endpoints (followers / following / user-search) where X serves a trimmed user object that omits the flag. | true
id | string | (0.1.73 alias) Mirrors twitterId for drop-in compatibility with pipelines that reference item.id. | "44196397"
protected | boolean | (0.1.73 alias) Lowercase alias for isProtected. | false
fastFollowersCount | integer | (0.1.73) legacy.fast_followers_count from X — number of accounts following this user via the suggested-follow path. High ratio of fast-followers to total followers is a useful bot-prone signal for influencer / lead-quality analysis. null when X omits the field. | 0
pinnedTweetIds | array | (0.1.73) Array form of pinned tweet IDs for cross-scraper schema parity. Currently always 0 or 1 element (X exposes one pin); array shape is forward-compat. Empty [] when none pinned. The legacy singular pinnedTweetId is preserved. | ["2047881966268117064"]
scrapedAt | string | (0.1.73) UTC ISO 8601 stamp of when this snapshot was parsed. Cached items keep their original parse time. | "2026-04-25T10:40:33.705Z"

Output Example

{
  "type": "user",
  "twitterId": "44196397",
  "id": "44196397",
  "userName": "elonmusk",
  "name": "Elon Musk",
  "url": "https://x.com/elonmusk",
  "twitterUrl": "https://twitter.com/elonmusk",
  "description": "https://t.co/dDtDyVssfm",
  "location": null,
  "website": "http://Terafab.ai",
  "isVerified": false,
  "isBlueVerified": true,
  "verifiedType": null,
  "profilePicture": "https://pbs.twimg.com/profile_images/.../image_400x400.jpg",
  "coverPicture": "https://pbs.twimg.com/profile_banners/44196397/...",
  "followers": 237890854,
  "following": 1308,
  "statusesCount": 100639,
  "favouritesCount": 221162,
  "listedCount": 168145,
  "mediaCount": 4434,
  "createdAt": "Tue Jun 02 20:12:29 +0000 2009",
  "normalizedCreatedAt": "2009-06-02T20:12:29.000Z",
  "isProtected": false,
  "protected": false,
  "canDm": false,
  "fastFollowersCount": 0,
  "professional": { "type": "Creator", "category": [] },
  "pinnedTweetId": "2047881966268117064",
  "pinnedTweetIds": ["2047881966268117064"],
  "scrapedAt": "2026-04-25T10:40:33.705Z"
}

Use Cases

  • Lead enrichment: Enrich CRM contacts with X profile data (bio, followers, verification)
  • Influencer research: Identify key accounts by follower count and verification status
  • Competitor tracking: Monitor competitor accounts for follower growth
  • Account verification: Check if accounts are verified, professional, or protected

How to Run

Python:

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("get-leads/all-in-one-x-scraper").call(run_input={
    "scrapeMode": "x-profile-scraper",
    "twitterHandles": ["elonmusk", "NASA", "OpenAI"]
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{item['userName']}: {item['followers']} followers — {item['description'][:60]}")

Mode 3: Tweet Search (x-search-scraper)

Search tweets by keywords, phrases, and advanced operators with the Query Wizard. Supports date ranges, engagement filters, media filters, and geo-targeting.

Login cookies required. Provide auth_token and ct0 cookies from x.com.

Input Parameters

Parameter | Type | Required | Description
scrapeMode | string | Yes | "x-search-scraper"
searchQueries | string[] | Yes | Search queries with advanced syntax
maxResults | integer | No | Max results per query (default: 100)
sort | string | No | "Latest", "Top", or "Latest + Top"
loginCookies | string | Yes | "auth_token=xxx; ct0=yyy"
languageFilter | string | No | ISO 639-1 language code (e.g., "en")
start | string | No | Start date ("YYYY-MM-DD")
end | string | No | End date ("YYYY-MM-DD")
minimumFavorites | integer | No | Min likes filter
author | string | No | From user (Query Wizard)

Input Example

{
  "scrapeMode": "x-search-scraper",
  "searchQueries": ["web scraping", "from:NASA #space"],
  "maxResults": 100,
  "sort": "Latest",
  "languageFilter": "en",
  "start": "2026-01-01",
  "minimumFavorites": 10,
  "loginCookies": "auth_token=xxx; ct0=yyy"
}

Output Fields

Same tweet fields as Mode 1 (Post Scraper). Plus optional searchTerm field when includeSearchTerms is enabled.

Output Example

Same format as Mode 1. Each tweet includes full engagement data, author profile, media, entities, and metadata.

Use Cases

  • Social listening: Monitor mentions of your brand, product, or industry
  • Trend monitoring: Track emerging topics and hashtags in real-time
  • Competitive analysis: Monitor what competitors' audiences are saying
  • Market research: Analyze public sentiment around events, launches, or announcements

How to Run

Python:

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("get-leads/all-in-one-x-scraper").call(run_input={
    "scrapeMode": "x-search-scraper",
    "searchQueries": ["web scraping"],
    "maxResults": 100,
    "sort": "Latest",
    "loginCookies": "auth_token=xxx; ct0=yyy"
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{item['author']['userName']}: {item['text'][:80]} ({item['likeCount']} likes)")

Mode 4: Timeline Scraper (x-timeline-scraper)

Scrape a user's tweet timeline including original tweets, retweets, and optionally replies.

Login cookies required for best results. Guest token works for basic access.

Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| scrapeMode | string | Yes | "x-timeline-scraper" |
| twitterHandles | string[] | Yes | Usernames or profile URLs |
| maxResults | integer | No | Max tweets per profile (default: 100) |
| includeReplies | boolean | No | Include replies (default: false) |
| loginCookies | string | Recommended | "auth_token=xxx; ct0=yyy" |

Input Example

{
  "scrapeMode": "x-timeline-scraper",
  "twitterHandles": ["NASA"],
  "maxResults": 50,
  "includeReplies": false,
  "loginCookies": "auth_token=xxx; ct0=yyy"
}

Output Fields

Same tweet fields as Mode 1 (Post Scraper).

Use Cases

  • Content analysis: Analyze a user's posting patterns, topics, and engagement
  • User monitoring: Track what key accounts are tweeting about
  • Research: Collect a user's tweet history for academic or business research
  • Archiving: Save a user's tweet timeline for record-keeping

How to Run

Python:

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("get-leads/all-in-one-x-scraper").call(run_input={
    "scrapeMode": "x-timeline-scraper",
    "twitterHandles": ["NASA"],
    "maxResults": 100,
    "loginCookies": "auth_token=xxx; ct0=yyy"
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['createdAt']}: {item['text'][:80]} ({item['likeCount']} likes)")

Mode 5: Follower Scraper (x-follower-scraper)

Get the list of accounts that follow a specific X/Twitter account. For the following list (accounts a user follows), use Mode 9 (x-following-scraper).

Login cookies required.

Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| scrapeMode | string | Yes | "x-follower-scraper" |
| twitterHandles | string[] | Yes | Usernames or profile URLs |
| maxResults | integer | No | Max followers per profile (default: 100) |
| loginCookies | string | Yes | "auth_token=xxx; ct0=yyy" |
| hydrateFullUser | boolean | No | Default false. Set to true to run a follow-up UserByScreenName call per returned follower, filling canHighlightTweets, verifiedSince, and the fuller professional object that X trims from list-endpoint responses. Doubles GraphQL call count for this mode. |

Input Example

{
  "scrapeMode": "x-follower-scraper",
  "twitterHandles": ["apify"],
  "maxResults": 500,
  "loginCookies": "auth_token=xxx; ct0=yyy",
  "hydrateFullUser": false
}

Output Fields

Same user profile fields as Mode 2 (Profile Scraper), plus:

| Field | Type | Description | Example |
|---|---|---|---|
| followerOf | string | Source profile | "apify" |

Use Cases

  • Audience analysis: Understand who follows a competitor or influencer
  • Lead discovery: Find potential customers from a competitor's follower list
  • Influencer marketing: Identify relevant followers for partnership outreach
  • Network mapping: Map connections between accounts

How to Run

Python:

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("get-leads/all-in-one-x-scraper").call(run_input={
    "scrapeMode": "x-follower-scraper",
    "twitterHandles": ["apify"],
    "maxResults": 500,
    "loginCookies": "auth_token=xxx; ct0=yyy"
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{item['userName']}: {item['followers']} followers — {item['description'][:50] if item.get('description') else ''}")

Mode 6: Comment Scraper (x-comment-scraper)

Get replies and conversation threads for specific tweets.

Login cookies required.

Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| scrapeMode | string | Yes | "x-comment-scraper" |
| tweetURLs | string[] | Yes | Tweet URLs to get replies for |
| maxResults | integer | No | Max replies per tweet (default: 100) |
| loginCookies | string | Yes | "auth_token=xxx; ct0=yyy" |

Input Example

{
  "scrapeMode": "x-comment-scraper",
  "tweetURLs": ["https://x.com/elonmusk/status/1728108619189874825"],
  "maxResults": 50,
  "loginCookies": "auth_token=xxx; ct0=yyy"
}

Output Fields

Same tweet fields as Mode 1, but with type: "comment" instead of type: "tweet".

Use Cases

  • Sentiment analysis: Analyze public reactions to announcements or events
  • Community insights: Understand what topics drive conversation
  • Customer feedback: Monitor replies to brand tweets for support issues
  • Trend analysis: Identify emerging opinions in reply threads

How to Run

Python:

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("get-leads/all-in-one-x-scraper").call(run_input={
    "scrapeMode": "x-comment-scraper",
    "tweetURLs": ["https://x.com/elonmusk/status/1728108619189874825"],
    "maxResults": 50,
    "loginCookies": "auth_token=xxx; ct0=yyy"
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{item['author']['userName']}: {item['text'][:80]} ({item['likeCount']} likes)")

Mode 7: Hashtag Scraper (x-hashtag-scraper)

Search X/Twitter for tweets containing a specific hashtag. Ideal for tracking trending topics, events, and campaigns by hashtag.

Login cookies required.

Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| scrapeMode | string | Yes | "x-hashtag-scraper" |
| searchQueries | string[] | Yes | Hashtags or hashtag-based queries (e.g., "#AI", "#crypto -filter:retweets") |
| maxResults | integer | No | Max tweets per query (default: 100) |
| sort | string | No | "Latest", "Top", or "Latest + Top" |
| loginCookies | string | Yes | "auth_token=xxx; ct0=yyy" |

Input Example

{
  "scrapeMode": "x-hashtag-scraper",
  "searchQueries": ["#AI", "#webdev"],
  "maxResults": 100,
  "sort": "Latest",
  "loginCookies": "auth_token=xxx; ct0=yyy"
}

Output Fields

Same tweet fields as Mode 1 (Post Scraper).

Use Cases

  • Trend tracking: Monitor what's being said under a specific hashtag in real-time
  • Event coverage: Collect tweets from live events and conferences
  • Campaign monitoring: Track your own or competitor hashtag campaigns
  • Market research: Analyze public discussion around industry hashtags

How to Run

Python:

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("get-leads/all-in-one-x-scraper").call(run_input={
    "scrapeMode": "x-hashtag-scraper",
    "searchQueries": ["#AI"],
    "maxResults": 100,
    "loginCookies": "auth_token=xxx; ct0=yyy"
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{item['author']['userName']}: {item['text'][:80]} ({item['likeCount']} likes)")

Mode 8: List Scraper (x-list-scraper)

Scrape tweets from X/Twitter lists (curated collections of accounts).

Login cookies required.

Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| scrapeMode | string | Yes | "x-list-scraper" |
| listURLs | string[] | Yes | Full list URLs — must be https://x.com/i/lists/{id} format |
| maxResults | integer | No | Max tweets per list (default: 100) |
| loginCookies | string | Yes | "auth_token=xxx; ct0=yyy" |

Note on list URL format: listURLs requires the full URL (https://x.com/i/lists/1234567890). Bare list IDs are not supported.

Input Example

{
  "scrapeMode": "x-list-scraper",
  "listURLs": ["https://x.com/i/lists/1234567890"],
  "maxResults": 100,
  "loginCookies": "auth_token=xxx; ct0=yyy"
}

Output Fields

Same tweet fields as Mode 1 (Post Scraper).

Use Cases

  • Curated feed monitoring: Track tweets from industry-specific lists
  • News aggregation: Collect tweets from journalist or media lists
  • Research: Monitor topic-specific lists for academic or market research
  • Competitive intelligence: Track curated competitor lists

How to Run

Python:

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("get-leads/all-in-one-x-scraper").call(run_input={
    "scrapeMode": "x-list-scraper",
    "listURLs": ["https://x.com/i/lists/1234567890"],
    "maxResults": 100,
    "loginCookies": "auth_token=xxx; ct0=yyy"
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{item['author']['userName']}: {item['text'][:80]}")

Mode 9: Following Scraper (x-following-scraper)

Get the list of accounts a specific X/Twitter user follows. For the reverse (getting an account's followers), use Mode 5 (x-follower-scraper).

Login cookies required.

Input Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| scrapeMode | string | Yes | "x-following-scraper" |
| twitterHandles | string[] | Yes | Usernames or profile URLs |
| maxResults | integer | No | Max accounts per profile (default: 100) |
| loginCookies | string | Yes | "auth_token=xxx; ct0=yyy" |
| hydrateFullUser | boolean | No | Default false. Set true to enrich each returned user with canHighlightTweets, verifiedSince, and the full professional object via a follow-up UserByScreenName call. |

Input Example

{
  "scrapeMode": "x-following-scraper",
  "twitterHandles": ["apify"],
  "maxResults": 500,
  "loginCookies": "auth_token=xxx; ct0=yyy"
}

Output Fields

Same user profile fields as Mode 2 (Profile Scraper), plus:

| Field | Type | Description | Example |
|---|---|---|---|
| followingOf | string | Source profile | "apify" |

Use Cases

  • Network mapping: Understand which accounts key players follow
  • Influencer research: Discover who influencers are learning from
  • Competitive intelligence: See which accounts your competitors follow
  • Lead generation: Find prospects through shared-following analysis

How to Run

Python:

from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")
run = client.actor("get-leads/all-in-one-x-scraper").call(run_input={
    "scrapeMode": "x-following-scraper",
    "twitterHandles": ["apify"],
    "maxResults": 500,
    "loginCookies": "auth_token=xxx; ct0=yyy"
})
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"@{item['userName']} ({item['name']}): {item['followers']} followers")

Mode 10: Data Extractor (x-data-extractor)

Lightweight tweet scraper for extracting data from tweet URLs. Same output as Mode 1. No login cookies required.

Pricing tip: batch ≥3 tweets per run for best MCP pricing. AI agents using MCP should prefer multi-tweet requests when possible — the per-run startup overhead is amortized across batched items.

Input Example

{
  "scrapeMode": "x-data-extractor",
  "tweetURLs": [
    "https://x.com/elonmusk/status/1728108619189874825"
  ]
}

Output Fields

Same tweet fields as Mode 1 (Post Scraper).
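The batching tip above can be sketched as a small chunking helper (illustrative only; the chunk size and generated URLs are made up, not actor defaults):

```python
def chunk_tweet_urls(urls, per_run=25):
    """Split a long URL list into per-run batches so each run amortizes
    its startup overhead; batches of at least 3 URLs price better on MCP."""
    return [urls[i:i + per_run] for i in range(0, len(urls), per_run)]

urls = [f"https://x.com/user/status/{i}" for i in range(60)]
print([len(batch) for batch in chunk_tweet_urls(urls)])  # → [25, 25, 10]
```

Each batch then becomes one run's tweetURLs input.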


Advanced Modes

Two additional scrapeMode values are exposed for power users and don't appear in the main mode table above:

x-scraper — All-in-One Auto-Router

Auto-detects URL type per item in startUrls and dispatches to the appropriate handler. Use this when you have mixed inputs (tweet URLs + profile URLs + list URLs in one batch). Each output item carries a type field (tweet or user) and its own input back-reference so you can correlate.

{
  "scrapeMode": "x-scraper",
  "startUrls": [
    "https://x.com/elonmusk/status/1728108619189874825",
    "https://x.com/NASA",
    "https://x.com/i/lists/1494198051328528385"
  ],
  "maxResults": 5
}
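Because each auto-routed item carries a type field, downstream code can split the mixed dataset back apart; a minimal sketch (the sample items are invented for illustration):

```python
def split_by_type(items):
    """Bucket all-in-one output items by their "type" field."""
    buckets = {}
    for item in items:
        buckets.setdefault(item.get("type"), []).append(item)
    return buckets

items = [
    {"type": "tweet", "tweetId": "1728108619189874825"},
    {"type": "user", "userName": "NASA"},
    {"type": "error", "error": "NOT_FOUND", "input": "deleteduser"},
]
print({k: len(v) for k, v in split_by_type(items).items()})
# → {'tweet': 1, 'user': 1, 'error': 1}
```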

x-user-search-scraper — Search Users by Keyword

Uses X's SearchTimeline with product=People to return user profiles matching a search term. Cookies required. Accepts the optional hydrateFullUser flag to enrich each result with fields X trims on search endpoints (canHighlightTweets, verifiedSince, full professional).

{
  "scrapeMode": "x-user-search-scraper",
  "searchQueries": ["AI researcher", "space journalist"],
  "maxResults": 50,
  "loginCookies": "auth_token=YOUR_TOKEN; ct0=YOUR_CT0",
  "hydrateFullUser": true
}

Output is the same user object as Mode 2 (Profile Scraper). With hydrateFullUser: true the scraper makes an extra UserByScreenName call per returned user, so cost scales with the result count — use when you need the full profile shape, leave off (default) for lightweight enumeration.
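As a back-of-envelope cost model (an assumption-laden sketch, not the actor's exact accounting): hydration adds roughly one extra GraphQL call per returned user on top of the paginated search calls.

```python
def estimated_graphql_calls(result_count, page_size=20, hydrate_full_user=False):
    """Rough call estimate: one search call per results page, plus one
    UserByScreenName call per user when hydrateFullUser is enabled.
    The page size and 1-call-per-user assumption are illustrative."""
    pages = -(-result_count // page_size)  # ceiling division
    return pages + (result_count if hydrate_full_user else 0)

print(estimated_graphql_calls(50))                          # → 3
print(estimated_graphql_calls(50, hydrate_full_user=True))  # → 53
```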


Error Items

When an input can't be resolved to a real item, the scraper emits a structured {type:"error", ...} record instead of silently dropping it. Three per-input error codes exist (NOT_FOUND, INVALID_HANDLE, SHADOW_BANNED), plus the operational telemetry codes documented at the end of this section:

NOT_FOUND — account or tweet deleted/suspended

For profile-consuming modes (profile, timeline, follower, following, all-in-one): the handle is shaped correctly but X says no such account exists (deleted, suspended, or never existed).

{
  "type": "error",
  "error": "NOT_FOUND",
  "username": "deleteduser",
  "message": "Account does not exist or has been deleted",
  "input": "deleteduser"
}

For tweet-consuming modes (post, data-extractor, all-in-one): the tweet URL/ID references a tweet that no longer exists on X.

{
  "type": "error",
  "error": "NOT_FOUND",
  "message": "Tweet does not exist or has been deleted",
  "input": "1790102578472022371"
}

INVALID_HANDLE — input failed validation before any network call

The input never resembled a scrapable X handle. X handles are at most 15 characters and only contain letters, digits, and underscores. The message field tells you why:

{
  "type": "error",
  "error": "INVALID_HANDLE",
  "input": "UnderTheRadarMag",
  "message": "HANDLE_TOO_LONG (16 chars; X handles max 15)"
}

Message values you may see:

  • HANDLE_TOO_LONG (N chars; X handles max 15)
  • INVALID_CHARACTERS (X handles allow A-Z, a-z, 0-9, underscore only)
  • INVALID_URL_FORMAT (expected https://x.com/<handle> …)
  • EMPTY_INPUT

INVALID_HANDLE error items only fire for profile-consuming modes (profile, timeline, follower, following, all-in-one). They let you distinguish typos from deleted accounts in downstream workflows.
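If you want to catch bad handles before spending a run, the documented rules are easy to mirror client-side (a sketch; the actor's own validator may differ in edge cases):

```python
import re

HANDLE_RE = re.compile(r"^[A-Za-z0-9_]{1,15}$")

def classify_handle(raw):
    """Mirror the documented INVALID_HANDLE checks: empty input,
    length over 15, or characters outside [A-Za-z0-9_]."""
    handle = raw.strip().lstrip("@")
    if not handle:
        return "EMPTY_INPUT"
    if len(handle) > 15:
        return f"HANDLE_TOO_LONG ({len(handle)} chars; X handles max 15)"
    if not HANDLE_RE.match(handle):
        return "INVALID_CHARACTERS"
    return "OK"

print(classify_handle("UnderTheRadarMag"))  # → HANDLE_TOO_LONG (16 chars; X handles max 15)
print(classify_handle("NASA"))              # → OK
```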

Non-error item types

Beyond type: "tweet" | "user" | "comment" | "error", two control-plane item types are emitted in specific situations:

| type | When it fires | Payload |
|---|---|---|
| dry-run | You set dryRun: true on input | {type, scrapeMode, inputs: {profiles, tweetIds, listIds, searchQueries, activeCookies, proxyTier, cookiePoolSize}, inputProblems} — no scraping happened |
| partial-summary | Graceful shutdown (SIGTERM / actor abort / migration) interrupted the run mid-batch | {type, reason, mode, durationSecs, stats, schemaVersion} — carries the counters collected before the interrupt |

Both are harmless for normal dataset consumers — filter them the same way you filter error items: keep only items whose type is "tweet", "user", or "comment".

SHADOW_BANNED — tweet or account is restricted / limited-visibility

X distinguishes between deleted tweets (HTTP 404 → NOT_FOUND) and tweets that are still public but intentionally hidden — either because of a visibility limit or because the posting account has been restricted. When X returns a 403 with a message mentioning "restricted", "suspended", "limited", or "visibility", the scraper emits a SHADOW_BANNED item so you can treat these cases differently from genuine deletions.

{
  "type": "error",
  "error": "SHADOW_BANNED",
  "message": "Tweet or account is restricted / limited-visibility",
  "input": "1234567890"
}

RATE_LIMITED, DOC_ID_ROTATED, FORMAT_CHANGE, BLOCKED, AUTH_EXPIRED — operational telemetry

Persistent operational failures surface once per run as structured error items so downstream workflows (n8n, Zapier, a monitoring dashboard) can react. These mirror the errorsByCategory section of the run summary:

| Code | Meaning | Typical fix |
|---|---|---|
| RATE_LIMITED | Hit X's rate-limit window. The scraper already respects the x-rate-limit-reset header internally; the item appears only when the run genuinely exhausted the window. | Rerun after the reset window, or split the batch across multiple schedules. |
| DOC_ID_ROTATED | X rotated a GraphQL queryId. The scraper auto-resolves the new ID for the current run — this error item signals a PR is needed to update src/utils/constants.js. | Run tools/rotate-doc-ids.js or the drift-detection CI that opens an issue automatically. |
| FORMAT_CHANGE | A parser couldn't read a key it expected. Rare; usually means X shipped a schema change. | Report via a GitHub issue with a sample input. |
| BLOCKED | X returned a 403 unrelated to visibility (e.g. account flagged). | Rotate cookies or cool down the source. |
| AUTH_EXPIRED | Current cookie was invalidated mid-run (single-cookie setups). | Refresh X_LOGIN_COOKIES. For multi-cookie pools, see COOKIE_EXHAUSTED. |

Each of these operational items fires at most once per run. When authenticated modes are in use and X invalidates your auth_token (cookie burn), the scraper rotates to the next cookie in X_LOGIN_COOKIES (separated by |||). If every pooled cookie is burned during the same run, a single COOKIE_EXHAUSTED item is pushed at the end so downstream workflows can alert on it.

{
  "type": "error",
  "error": "COOKIE_EXHAUSTED",
  "message": "All cookies in X_LOGIN_COOKIES were invalidated during this run. Refresh auth_token+ct0 from x.com and restart."
}

This is distinct from the startup cookie-health check — COOKIE_EXHAUSTED means cookies became invalid mid-run, not that they were bad going in. A single-cookie setup bypasses rotation entirely and will simply fail authenticated calls with the usual AUTH_EXPIRED error category.

Reacting to errors downstream

for item in dataset.iterate_items():
    if item.get("type") == "error":
        code = item["error"]
        if code == "NOT_FOUND":
            ...  # real account/tweet that no longer exists
        elif code == "INVALID_HANDLE":
            ...  # user supplied a bad handle (fix input)
        elif code == "SHADOW_BANNED":
            ...  # tweet / account restricted — NOT deleted
        elif code == "COOKIE_EXHAUSTED":
            ...  # refresh X_LOGIN_COOKIES and rerun
        elif code in {"RATE_LIMITED", "DOC_ID_ROTATED", "FORMAT_CHANGE", "BLOCKED", "AUTH_EXPIRED"}:
            ...  # operational — alert the maintainer
    else:
        ...  # successful scrape

Output Controls (optional)

Sparse output — fields

Pass an array of top-level keys to keep only those in every output item. Empty (default) returns the full object. Useful when exporting to Google Sheets, HubSpot, or CSV — smaller payloads, less downstream parsing.

{
  "scrapeMode": "x-post-scraper",
  "tweetURLs": ["https://x.com/NASA/status/..."],
  "fields": ["tweetId", "text", "likeCount", "retweetCount", "author", "normalizedCreatedAt"]
}

Always preserved regardless of the allow-list: type, input, error, message, username, searchTerm, followerOf, followingOf. That way error items are never silently truncated.
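The same allow-list behaviour can be reproduced locally when trimming datasets scraped before you adopted fields (a sketch; ALWAYS_KEPT mirrors the list above):

```python
ALWAYS_KEPT = {"type", "input", "error", "message", "username",
               "searchTerm", "followerOf", "followingOf"}

def sparse(item, fields):
    """Keep only allow-listed keys, but never drop the always-preserved ones.
    An empty fields list returns the item untouched, as the actor does."""
    if not fields:
        return item
    keep = set(fields) | ALWAYS_KEPT
    return {k: v for k, v in item.items() if k in keep}

item = {"type": "tweet", "tweetId": "1", "text": "hi", "likeCount": 5, "lang": "en"}
print(sparse(item, ["tweetId", "text"]))
# → {'type': 'tweet', 'tweetId': '1', 'text': 'hi'}
```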

Incremental scraping — onlyIfChangedSince

Skip items that are already in a prior dataset or are older than a given timestamp. Huge cost-saver for hourly monitoring.

{
  "scrapeMode": "x-timeline-scraper",
  "profiles": ["NASA"],
  "onlyIfChangedSince": { "dataset": "PREVIOUS_DATASET_ID" }
}

Accepts:

  • { "since": "2026-04-20T00:00:00Z" } — drop items older than the cutoff
  • { "dataset": "DATASET_ID" } — read tweetIds + latest createdAt from a prior dataset and filter against both
  • { "since": "...", "dataset": "..." } — combine; the newer cutoff wins
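The combined filter can be approximated locally for post-processing (a sketch, assuming ISO timestamps in normalizedCreatedAt; the actor's internal logic may differ):

```python
from datetime import datetime, timezone

def passes_incremental(item, since=None, seen_ids=None):
    """Drop items already seen in a prior dataset or older than the cutoff."""
    if seen_ids and item.get("tweetId") in seen_ids:
        return False
    if since:
        created = datetime.fromisoformat(item["normalizedCreatedAt"].replace("Z", "+00:00"))
        if created < since:
            return False
    return True

cutoff = datetime(2026, 4, 20, tzinfo=timezone.utc)
item = {"tweetId": "42", "normalizedCreatedAt": "2026-04-25T10:40:33.000Z"}
print(passes_incremental(item, since=cutoff, seen_ids={"41"}))  # → True
```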

Completion webhook — completionWebhookUrl

POST the run summary to your own endpoint when the run finishes. Non-blocking; failures are logged but never fail the run.

{
  "scrapeMode": "x-hashtag-scraper",
  "searchQueries": ["#AI"],
  "completionWebhookUrl": "https://example.com/my-webhook"
}

Payload:

{
  "event": "run.completed",
  "runId": "...",
  "actorId": "...",
  "datasetId": "...",
  "summary": { "mode": "x-hashtag-scraper", "duration": "23s", "results": 100, "errors": 0, "schemaVersion": "1.0.0", "circuitBreakerTripped": false, ... }
}
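On the receiver side, a minimal alerting check against this payload might look like the following sketch (the should_alert name and thresholds are my own, not part of the actor):

```python
def should_alert(payload):
    """Flag runs that emitted errors or tripped the circuit breaker."""
    summary = payload.get("summary", {})
    return bool(summary.get("errors", 0)) or bool(summary.get("circuitBreakerTripped", False))

payload = {
    "event": "run.completed",
    "summary": {"mode": "x-hashtag-scraper", "results": 100,
                "errors": 0, "circuitBreakerTripped": False},
}
print(should_alert(payload))  # → False
```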

Structured JSON logs — jsonLogs: true

Emit newline-delimited JSON to stdout (one event per line) instead of the default free-text logs. Drops directly into Cloudwatch / Datadog / Loki without regex parsing.

{
  "scrapeMode": "x-timeline-scraper",
  "profiles": ["NASA"],
  "jsonLogs": true
}

Sample line:

{"t":"2026-04-24T06:10:00.000Z","lvl":"INFO","runId":"ABC","actorId":"XYZ","build":"0.1.67","node":"v20.20.1","message":"Progress: 25 results emitted so far (7s elapsed)"}
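Consuming this format takes one json.loads per line; a sketch using the sample event above:

```python
import json

raw = '{"t":"2026-04-24T06:10:00.000Z","lvl":"INFO","runId":"ABC","message":"Progress: 25 results emitted so far (7s elapsed)"}'

def parse_log_lines(text):
    """Parse newline-delimited JSON log events, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

events = parse_log_lines(raw)
problems = [e for e in events if e["lvl"] in {"WARN", "ERROR"}]
print(len(events), len(problems))  # → 1 0
```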

Per-input ledger — emitPerInputLedger: true

When enabled, the run summary carries a perInput array with each input's outcome — great for batches where you scrape hundreds of handles and need to know exactly which ones succeeded.

"perInput": [
  {"input": "NASA", "status": "success", "itemsEmitted": 25, "errorCategory": null},
  {"input": "OpenAI", "status": "success", "itemsEmitted": 4, "errorCategory": null},
  {"input": "baduser1", "status": "failed", "itemsEmitted": 0, "errorCategory": "NOT_FOUND"}
]

Ledger caps at 500 entries; overflow is reported as summary.perInputOverflow. The ledger lives on the run's summary key-value record — read it via await Actor.openKeyValueStore().then(s => s.getValue('summary')) or from the Apify run UI.
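Once fetched, the ledger summarizes in a few lines (a sketch; the array shape follows the example above):

```python
from collections import Counter

def ledger_stats(per_input):
    """Tally outcomes and surface failed inputs with their error categories."""
    statuses = Counter(entry["status"] for entry in per_input)
    failed = {e["input"]: e["errorCategory"] for e in per_input if e["status"] == "failed"}
    return statuses, failed

per_input = [
    {"input": "NASA", "status": "success", "itemsEmitted": 25, "errorCategory": None},
    {"input": "baduser1", "status": "failed", "itemsEmitted": 0, "errorCategory": "NOT_FOUND"},
]
stats, failed = ledger_stats(per_input)
print(dict(stats), failed)  # → {'success': 1, 'failed': 1} {'baduser1': 'NOT_FOUND'}
```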

Dry run — dryRun: true

Validate inputs, check cookies, resolve doc IDs, emit a single {type:"dry-run", ...} item, then exit WITHOUT any network scraping. Pre-production smoke test / CI validation.

Bulk CSV input — inputCsv

Provide an HTTPS URL of a CSV file. Each non-empty row is auto-routed to the matching input array:

| Row shape | Routes to |
|---|---|
| https://x.com/&lt;user&gt;/status/&lt;id&gt; | tweetURLs |
| https://x.com/&lt;user&gt;, @user, or user | profiles |
| Anything else | searchQueries |

Header row is auto-detected (any of username, handle, tweet_url, profile_url, query, search, url). Lets you batch thousands of handles without hitting Apify's input-size limit.
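The routing rules can be previewed locally before uploading a CSV (a sketch; the regexes here are illustrative, not the actor's exact patterns):

```python
import re

TWEET_RE = re.compile(r"^https://x\.com/[^/]+/status/\d+")
PROFILE_RE = re.compile(r"^(?:https://x\.com/[A-Za-z0-9_]{1,15}/?|@?[A-Za-z0-9_]{1,15})$")

def route_row(row):
    """Mirror the documented row-shape routing: tweet URL, profile, or query."""
    row = row.strip()
    if TWEET_RE.match(row):
        return "tweetURLs"
    if PROFILE_RE.match(row):
        return "profiles"
    return "searchQueries"

print(route_row("https://x.com/NASA/status/123"))  # → tweetURLs
print(route_row("@NASA"))                          # → profiles
print(route_row("web scraping"))                   # → searchQueries
```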

Cross-run cache — useCrossRunCache

Opt-in 24 h KV cache for tweets + profiles. When enabled, scheduled/polling runs that re-scrape the same URLs within 24 h serve from cache instead of hitting X again.

{
  "scrapeMode": "x-profile-scraper",
  "profiles": ["NASA", "OpenAI"],
  "useCrossRunCache": true
}
  • Store name: x-scraper-xrun-cache (auto-created per actor).
  • TTL: 24 hours, enforced on read.
  • Cache key: tweet:<tweetId> / profile:<handle>.
  • Keys are case-insensitive for handles.
  • Summary logged at end of run: Cross-run cache: hits=2 misses=0 writes=0.
  • Default off — no behaviour change for existing callers.

Great fit for:

  • Hourly brand-monitoring runs
  • Lead-enrichment pipelines that revisit the same profiles
  • Dashboard refresh jobs

Not useful for one-off backfills or queries that need fresh engagement numbers.

Engagement Fields: null vs 0

All engagement fields (likeCount, retweetCount, viewCount, replyCount, quoteCount, bookmarkCount) follow a strict convention:

  • null — the field is not applicable for this content type. For example, retweets don't carry independent engagement counts, and pre-2023 tweets have no view tracking. A null value means the data genuinely does not exist.
  • 0 — the platform reported zero engagement. The tweet was seen/liked/retweeted exactly zero times.

This prevents false zeros from polluting your data. If a field is null, the engagement data was unavailable — not zero. Never treat null and 0 as equivalent.
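Aggregation code should therefore skip nulls while keeping genuine zeros; a minimal sketch:

```python
def mean_engagement(items, field="likeCount"):
    """Average a count field, skipping null (unavailable) values
    but keeping genuine zeros in the denominator."""
    values = [item[field] for item in items if item.get(field) is not None]
    return sum(values) / len(values) if values else None

items = [{"likeCount": 10}, {"likeCount": 0}, {"likeCount": None}]
print(mean_engagement(items))  # → 5.0 (not 3.33 — the null is excluded, the zero counts)
```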

Query Wizard

Build complex search queries without learning advanced syntax. These fields auto-append to your search queries:

| Field | What it does | Equivalent syntax |
|---|---|---|
| From User | Tweets from specific user | from:username |
| In Reply To | Replies to specific user | to:username |
| Mentioning User | Mentions of user | @username |
| Start Date | Tweets after date | since:YYYY-MM-DD |
| End Date | Tweets before date | until:YYYY-MM-DD |
| Minimum Likes | Min like threshold | min_faves:N |
| Minimum Retweets | Min RT threshold | min_retweets:N |
| Minimum Replies | Min reply threshold | min_replies:N |
| Language Filter | Filter by language | lang:xx |
| Only Verified | Verified users only | filter:verified |
| Only Images | Tweets with attached photos (strict — card-only matches are dropped) | filter:images |
| Only Videos | Tweets with attached videos or GIFs (strict — card-only matches are dropped) | filter:videos |
| Only Quotes | Quote tweets only | filter:quote |
| Geotagged Near | Location filter | near:"city" |
| Within Radius | Radius from location | within:10km |

Advanced Search Syntax

Combine operators: from:NASA #space since:2024-01-01 min_faves:100 lang:en -filter:retweets

| Operator | Example | Description |
|---|---|---|
| from:user | from:NASA | Tweets from specific user |
| to:user | to:support | Replies to specific user |
| @user | @OpenAI | Mentions of specific user |
| #hashtag | #AI | Tweets with hashtag |
| $cashtag | $TSLA | Tweets with cashtag |
| since:date | since:2024-01-01 | Tweets after date |
| until:date | until:2024-12-31 | Tweets before date |
| min_faves:N | min_faves:100 | Minimum likes |
| min_retweets:N | min_retweets:50 | Minimum retweets |
| lang:xx | lang:en | Filter by language |
| filter:media | filter:media | Only tweets with media |
| filter:images | filter:images | Only tweets with images |
| filter:videos | filter:videos | Only tweets with videos |
| filter:verified | filter:verified | Only verified users |
| filter:blue_verified | filter:blue_verified | Only X Premium users |
| -filter:retweets | -filter:retweets | Exclude retweets |
| -filter:replies | -filter:replies | Exclude replies |
| filter:quote | filter:quote | Only quote tweets |
| conversation_id:ID | conversation_id:123 | Thread replies |
| near:"city" | near:"San Francisco" | Geotagged tweets |
| within:radius | within:10km | Within radius |
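Operators compose by plain space-separated concatenation, so a tiny helper is enough to build queries programmatically (a sketch; the helper name and keyword mapping are my own):

```python
def build_query(*terms, **ops):
    """Join free-text terms and operator filters into one search query.
    from_user/to_user map to from:/to: because "from" is a Python keyword."""
    rename = {"from_user": "from", "to_user": "to"}
    parts = list(terms)
    parts += [f"{rename.get(k, k)}:{v}" for k, v in ops.items()]
    return " ".join(parts)

q = build_query("#space", from_user="NASA", since="2024-01-01", min_faves=100, lang="en")
print(q)  # → #space from:NASA since:2024-01-01 min_faves:100 lang:en
```

Negated operators such as -filter:retweets can simply be passed as extra positional terms.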

Technical Details

  • Stack: Node.js 20, Apify SDK 3, Impit (Chrome TLS fingerprinting)
  • Memory: 128 MB (HTTP-only, no browser) — peak RSS ~50–85 MB in typical runs
  • Speed: ~5–25 seconds per run depending on mode, max 5000 results per run
  • Residential proxy (Evomi) included on authenticated modes by default; unauthenticated modes (post, profile, extractor) auto-route via datacenter IPs to save egress
  • Active cookie rotationX_LOGIN_COOKIES can contain multiple |||-separated cookies; the scraper cycles to the next entry when X burns the current one mid-run, and emits a single COOKIE_EXHAUSTED item when every entry is burned
  • In-run + cross-run request caches — duplicates in the same batch reuse in-flight promises; with useCrossRunCache: true, parsed tweets + profiles are cached in a named KV store for 24 h so scheduled polling runs skip re-scraping fresh data
  • Rate-limit aware — honours x-rate-limit-reset on 429 and waits the server-requested duration (capped at 60 s per retry)
  • Self-healing doc IDs — on DOC_ID_ROTATED, the scraper fetches x.com's main JS bundle, resolves the new queryId for the failing operation, and retries once this run. Weekly GitHub Action (x-bundle-drift.yml) opens an issue when drift is detected
  • Optional deep hydrationhydrateFullUser: true on follower / following / user-search modes enriches each returned user with Q10 fields X trims on list endpoints
  • Sparse output — pass fields: ["tweetId","text","likeCount"] to emit only the specified keys (type + input + error fields are always preserved)
  • Resume capability via resumeFromDataset (skip already-scraped items from a prior dataset)
  • Custom output via customMapFunction
  • Progress log — runs emitting >25 results print a per-batch status line so long runs don't look silent
  • Parser regression suitenpm test runs 61 tests (53 unit tests against hand-crafted X response fixtures + 8 integration tests replaying a recorded live GraphQL response); every field fix shipped through every audit is pinned down by at least one test, including the 9 Tier 1 + Tier 2 field additions in build 0.1.73
  • Field accuracy: every mode ≥95% null-aware field-fill against Chrome ground truth (last verified 2026-04-25 against 60 our-runs + 22 competitor benchmark runs on build 0.1.71; build 0.1.73 then added 9 additive fields without changing parser correctness — see docs/audit/2026-04-25-x-competitive-analysis.md). Parser live-sample corpus in docs/audit/q1-q10-live-samples.json tracks 6/10 Q1–Q10 enrichment fields with organically-sourced URLs
  • Mutation testing: npm run test:mutate runs Stryker against the parsers + helpers to grade how well the test suite would catch real bugs. Surviving mutants are bug templates; target trajectory is ≥75% mutation score across the parsers
  • Predeploy gate: apify push runs npm run check:modes && npm run test:parsers && npm run check:drift — catches mode-consistency regressions, parser regressions, and X bundle drift before a build ships
  • Gzip-compressed POST bodies on authenticated GraphQL requests when the payload shrinks ≥10% — cuts upload bandwidth for search/follower/timeline runs
  • Graceful shutdown: SIGTERM / abort / migrate events trip the circuit breaker, flush caches, and emit a {type:"partial-summary", reason, mode, durationSecs, stats, schemaVersion} item before exit — you never lose accumulated state to a timeout
  • Text normalisation: every string field exposed to the user (text, description, etc.) is run through an HTML-entity decoder (named + numeric entities) and a control-character stripper before emission. Newlines and emoji are preserved. Pasting output into Google Sheets / HubSpot / a CSV no longer leaves stray &amp; or invisible zero-width characters
  • Cross-run guest-token cache: for unauthenticated modes, the IP-bound guest token is reused from a 4-hour KV cache across runs — saves the ~500 ms activation round-trip on every subsequent run that hits the same Evomi exit IP. Transparent: the underlying token still rotates per X's TTL
  • Lazy module loading: the cross-run cache module is only imported when useCrossRunCache: true is set, trimming a few milliseconds off cold-start for the typical run that doesn't enable it

Troubleshooting

Getting no results?

  • Check that your search query isn't too restrictive
  • Try "sort": "Top" instead of "Latest" — X sometimes returns fewer results with Latest
  • Remove until: date filters if getting low counts

Authentication errors?

  • Your cookies may have expired. Get fresh auth_token and ct0 from x.com
  • Both auth_token and ct0 are required
  • Format: auth_token=YOUR_TOKEN; ct0=YOUR_CT0

Missing tweets?

  • Some tweets are shadow-banned or filtered by X. This is outside our control.
  • Try different date ranges or search terms

Want to resume a failed run?

  • Copy the dataset ID from the previous run
  • Add it to resumeFromDataset — already-scraped items will be skipped

Support

Found a bug or need help? Open an issue on the Issues tab.