X/Twitter Scraper — Tweets, Profiles & Engagement Data

Pricing

$15.00 / 1,000 results

Scrape Twitter/X data at scale. Extract tweets, profiles, hashtags, trends, and engagement metrics for social media analytics.

Developer

Luan M.

Maintained by Community

Twitter/X Data Scraper

A powerful, production-ready Twitter/X Data Scraper built on Apify and Crawlee. Extracts tweets, retweets, replies, likes, user profiles, hashtags, mentions, and media URLs from Twitter/X — all with minimal configuration.

Features

  • Multi-mode Scraping — Search by keyword, scrape user timelines, or provide arbitrary X.com URLs
  • Rich Tweet Data — Extracts tweet text, timestamp, engagement metrics (likes, retweets, replies, views), tweet IDs, and URLs
  • User Profile Info — Bio, followers count, following count, location, website, join date, avatar, and banner image
  • Media Extraction — URLs for images, videos, and GIFs embedded in tweets
  • Hashtags & Mentions — Extracted automatically from each tweet
  • Reply & Retweet Filters — Optionally include or exclude replies and retweets
  • Proxy Support — Built-in Apify proxy integration with residential proxy group support
  • Configurable — Max tweets, concurrency, retries, and more
  • Headless Browser — Uses Playwright with Chromium for reliable JavaScript-rendered page extraction
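
As a rough illustration of the Hashtags & Mentions feature, the sketch below pulls both out of tweet text with regular expressions. The helper name extractEntities and the patterns are assumptions made for illustration, not the actor's actual implementation. The output shape (hashtags without "#", mentions with "@") follows the sample dataset item in this README.

```javascript
// Hypothetical helper: not taken from the actor's source code.
function extractEntities(text) {
  // A hashtag is '#' followed by letters, digits, or underscores,
  // and must not be glued to a preceding word character.
  const hashtagPattern = /(?<![\p{L}\p{N}_])#([\p{L}\p{N}_]+)/gu;
  // A handle is '@' plus 1 to 15 ASCII word characters (assumed handle rule).
  // The lookbehind skips e-mail addresses such as me@example.com.
  const mentionPattern = /(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})(?![A-Za-z0-9_])/g;

  const hashtags = [...text.matchAll(hashtagPattern)].map((m) => m[1]);
  const mentions = [...text.matchAll(mentionPattern)].map((m) => `@${m[1]}`);

  // De-duplicate while keeping first-seen order.
  return {
    hashtags: [...new Set(hashtags)],
    mentions: [...new Set(mentions)],
  };
}

const sample = extractEntities('Loving #AI and #tech with @openai, more #AI soon');
console.log(sample);
```

A scraper that reads the rendered page could also collect these from link elements instead of raw text; the regex approach is shown only because it is easy to run in isolation.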

Input Configuration

Field               Type     Default                     Description
startUrls           Array    ["https://x.com/elonmusk"]  Starting URLs (profile pages, search results)
searchQuery         String   -                           Search query (overrides startUrls). Supports operators like from:username, has:hashtags, lang:en
username            String   -                           Scrape a specific user's timeline (without @). Overrides startUrls
maxTweets           Integer  100                         Maximum tweets to scrape (0 = unlimited)
includeReplies      Boolean  false                       Include replies in user timeline scrape
includeRetweets     Boolean  true                        Include retweets in user timeline scrape
proxyConfiguration  Object   Apify RESIDENTIAL           Proxy settings to avoid IP bans
maxRequestRetries   Integer  3                           Retry limit for failed requests
maxConcurrency      Integer  5                           Concurrent browser pages
extractTweetIds     Boolean  true                        Include tweet IDs and URLs
extractUserInfo     Boolean  true                        Include author profile data
extractMedia        Boolean  true                        Extract image/video URLs
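
To make the table concrete, here is a minimal sketch that builds an input object from the documented defaults. The helper buildInput is hypothetical, and the proxyConfiguration shape is an assumption based on the standard Apify proxy input object; the actor's real input schema may differ.

```javascript
// Defaults copied from the input table; fields without a listed default are omitted.
const DEFAULT_INPUT = {
  startUrls: ['https://x.com/elonmusk'],
  maxTweets: 100,
  includeReplies: false,
  includeRetweets: true,
  // Assumed shape: the standard Apify proxy input object.
  proxyConfiguration: { useApifyProxy: true, apifyProxyGroups: ['RESIDENTIAL'] },
  maxRequestRetries: 3,
  maxConcurrency: 5,
  extractTweetIds: true,
  extractUserInfo: true,
  extractMedia: true,
};

// Hypothetical helper: merges overrides into the defaults and checks basic types.
function buildInput(overrides = {}) {
  const input = { ...DEFAULT_INPUT, ...overrides };
  for (const key of ['maxTweets', 'maxRequestRetries', 'maxConcurrency']) {
    if (!Number.isInteger(input[key]) || input[key] < 0) {
      throw new TypeError(`${key} must be a non-negative integer`);
    }
  }
  // The table asks for the username without '@', so strip one if present.
  if (typeof input.username === 'string') {
    input.username = input.username.replace(/^@/, '');
  }
  return input;
}

const input = buildInput({ searchQuery: 'from:nasa lang:en', maxTweets: 50 });
console.log(JSON.stringify(input, null, 2));
```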

Output Dataset

Each tweet is stored as a dataset item with the following structure:

{
  "text": "The future of AI is exciting!",
  "timestamp": "2025-05-30T12:00:00.000Z",
  "tweetId": "1234567890123456789",
  "tweetUrl": "https://x.com/username/status/1234567890123456789",
  "replyCount": 42,
  "retweetCount": 128,
  "likeCount": 1024,
  "viewCount": 50000,
  "isReply": false,
  "isRetweet": false,
  "hashtags": ["AI", "tech"],
  "mentions": ["@openai"],
  "mediaUrls": ["https://pbs.twimg.com/media/..."],
  "user": {
    "username": "elonmusk",
    "displayName": "Elon Musk",
    "avatarUrl": "https://pbs.twimg.com/profile_images/...",
    "profileUrl": "https://x.com/elonmusk"
  },
  "profile": {
    "username": "elonmusk",
    "displayName": "Elon Musk",
    "bio": "Technology entrepreneur",
    "followersCount": 180000000,
    "followingCount": 1500,
    "location": "Austin, TX",
    "website": "https://example.com",
    "avatarUrl": "https://pbs.twimg.com/profile_images/...",
    "bannerUrl": "https://pbs.twimg.com/profile_banners/..."
  },
  "sourceUrl": "https://x.com/elonmusk",
  "scrapedAt": "2025-05-30T12:05:00.000Z"
}
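
For analytics work, items in this shape can be post-processed with a few lines of code. The sketch below ranks original tweets by a simple engagement rate. Both helpers and the metric itself (replies plus retweets plus likes, divided by views) are assumptions chosen for illustration; the actor does not define an engagement formula.

```javascript
// Hypothetical metric: interactions per view. Not defined by the actor itself.
function engagementRate(item) {
  const interactions =
    (item.replyCount ?? 0) + (item.retweetCount ?? 0) + (item.likeCount ?? 0);
  // Return null rather than dividing by a missing or zero view count.
  return item.viewCount > 0 ? interactions / item.viewCount : null;
}

// Keep original tweets only and rank them by engagement rate, highest first.
function topOriginalTweets(items, limit = 10) {
  return items
    .filter((item) => !item.isReply && !item.isRetweet)
    .map((item) => ({ tweetUrl: item.tweetUrl, rate: engagementRate(item) }))
    .filter((row) => row.rate !== null)
    .sort((a, b) => b.rate - a.rate)
    .slice(0, limit);
}

// Made-up items that follow the dataset structure.
const items = [
  { tweetUrl: 'https://x.com/username/status/1', replyCount: 42, retweetCount: 128,
    likeCount: 1024, viewCount: 50000, isReply: false, isRetweet: false },
  { tweetUrl: 'https://x.com/username/status/2', replyCount: 1, retweetCount: 0,
    likeCount: 9, viewCount: 100, isReply: false, isRetweet: false },
  { tweetUrl: 'https://x.com/username/status/3', replyCount: 5, retweetCount: 5,
    likeCount: 5, viewCount: 10, isReply: true, isRetweet: false },
];
console.log(topOriginalTweets(items));
```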

Quick Start

# Install dependencies
npm install
# Run locally with the Apify CLI (requires Apify token; -p purges storage left by earlier runs)
npx apify run -p
# Or run the entry point directly with Node.js
node src/main.js

Deployment to Apify

  1. Push this repository to GitHub
  2. Go to Apify Console → Create Actor → Import from GitHub
  3. Set up environment variables in Apify Console as needed
  4. Build and run!

Environment Variables

Variable                 Description
APIFY_TOKEN              Your Apify API token (required for cloud proxy)
APIFY_PROXY_PASSWORD     Apify proxy password
APIFY_LOCAL_STORAGE_DIR  Local storage directory for development
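
Once the actor is deployed, APIFY_TOKEN can also be used to start a run over HTTP. The sketch below targets the Apify API's "run Actor synchronously and get dataset items" endpoint, which suits short runs. The helper names and the Actor ID 'your-username~your-actor-name' are placeholders, not values from this README.

```javascript
// Builds the HTTP request without sending it, so it can be inspected or tested.
function buildRunRequest(actorId, token, input) {
  return {
    url: `https://api.apify.com/v2/acts/${encodeURIComponent(actorId)}/run-sync-get-dataset-items`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(input),
    },
  };
}

// Sends the request with the global fetch available in Node.js 18+.
async function runActor(actorId, input) {
  const token = process.env.APIFY_TOKEN;
  if (!token) throw new Error('APIFY_TOKEN is not set');
  const { url, options } = buildRunRequest(actorId, token, input);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`Apify API returned HTTP ${response.status}`);
  return response.json(); // resolves to an array of dataset items
}

// Usage (placeholder Actor ID, so it is left commented out):
// runActor('your-username~your-actor-name', { username: 'nasa', maxTweets: 20 })
//   .then((tweets) => console.log(tweets.length));
```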

Technical Details

  • Runtime: Node.js 18+
  • Browser Engine: Chromium via Playwright
  • Crawler: Crawlee PlaywrightCrawler
  • Data Storage: Apify dataset
  • Proxy: Apify Proxy (RESIDENTIAL recommended)

Limitations & Best Practices

  • Rate Limiting: Twitter/X aggressively rate-limits scraping. Use residential proxies and keep maxConcurrency modest (the default is 5).
  • Login Walls: Some pages may require authentication. For full access, consider adding cookie-based session management.
  • DOM Changes: This scraper relies on Twitter's DOM structure (data-testid attributes). If Twitter/X updates its UI, the selectors may need adjustment.
  • Ethical Use: Respect Twitter's Terms of Service and robots.txt, and throttle your own request rate.
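
One common way to stay within rate limits is to wait progressively longer between retries. The sketch below shows exponential backoff with jitter as a generic technique. It is not how the actor schedules its retries (those are governed by maxRequestRetries), and the helper names, base delay, and cap are assumptions.

```javascript
// Delay before retry number `attempt` (0-based): doubles each time up to capMs,
// then picks a value in the upper half of that window so clients do not retry in lockstep.
function backoffDelay(attempt, baseMs = 1000, capMs = 30000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return ceiling / 2 + random() * (ceiling / 2);
}

// Runs an async task, retrying up to maxRetries times with a backoff pause in between.
async function withRetries(task, maxRetries = 3) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= maxRetries) throw err; // out of retries: surface the last error
      const delay = backoffDelay(attempt);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Print the largest possible delay for the first five retries.
for (let attempt = 0; attempt < 5; attempt += 1) {
  console.log(attempt, backoffDelay(attempt, 1000, 30000, () => 1));
}
```

With a base of 1000 ms, the maximum wait grows as 1 s, 2 s, 4 s, 8 s, 16 s and then stops at the 30 s cap.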

License

Apache 2.0