Twitter Profile Scraper

Pricing

$10.00 / 1,000 results

Developed by

Crawler Bros

Maintained by Community

Extract Twitter/X profile data and tweets, including all engagement metrics (likes, retweets, replies, quotes, bookmarks, views), profile details, media URLs, hashtags, and mentions. Includes anti-detection features and support for authenticated scraping.

5.0 (3)

Last modified

3 days ago

A robust Apify Actor for scraping tweets from Twitter/X profiles with comprehensive data extraction including all engagement metrics, media, and metadata.

⚠️ IMPORTANT: Authentication Required for Recent Tweets

Without authentication cookies, Twitter shows "Highlights" (popular old tweets) instead of recent chronological posts.

  • Without cookies: You'll get old popular tweets (sometimes years old)
  • With cookies: You'll get recent chronological tweets

To scrape recent tweets, you MUST provide authentication cookies. See the Authentication section below.

🚀 Features

Profile Information Extraction

  • Profile Details
    • Username and display name
    • Bio/description
    • Location
    • Website URL
    • Joined date
    • Verification status
    • Profile image URL
    • Profile banner URL
    • Followers count
    • Following count

Complete Tweet Data Extraction

  • Tweet Content

    • Full tweet text
    • Tweet ID and URL
    • Timestamp and creation date
    • Tweet type (regular, reply, retweet)
  • Author Information

    • Author username and display name
    • Verification status
    • Profile being scraped
  • Engagement Metrics

    • ❤️ Likes count
    • 🔄 Retweets count
    • 💬 Replies count
    • 💭 Quotes count
    • 🔖 Bookmarks count
    • 👁️ Views count
    • 📊 Total engagement
  • Media & Links

    • 📸 Media URLs (combined images and videos in one array)
    • 🔗 External URLs
    • #️⃣ Hashtags
    • @ Mentions
  • Tweet Context

    • Is reply? (with parent user)
    • Is retweet?
    • Reply-to username
    • Clean text (newlines removed)

Advanced Features

  • ✅ Authenticated Scraping - Use cookies to access private/protected profiles
  • ✅ Session Persistence - Save cookies between runs for efficiency
  • ✅ Human Behavior Simulation - Random delays, mouse movements, natural scrolling
  • ✅ Rate Limit Protection - Configurable delays to avoid detection
  • ✅ Block Detection - Automatic detection of account restrictions
  • ✅ Filter Options - Include/exclude replies and retweets
  • ✅ Multi-Profile Support - Scrape multiple profiles in one run
  • ✅ Stealth Mode - Firefox with anti-detection measures

📋 Input Parameters

Required

  • usernames (array) - List of Twitter usernames to scrape (without @)
    • Example: ["elonmusk", "openai", "github"]

Optional

  • maxTweets (integer, default: 50) - Maximum tweets per profile (1-500)
  • includeReplies (boolean, default: false) - Include reply tweets
  • includeRetweets (boolean, default: true) - Include retweets
  • cookies (string) - Twitter auth cookies in JSON format
  • sessionName (string, default: "default_session") - Session name for cookie storage
  • minDelayBetweenRequests (integer, default: 3) - Min delay between actions (1-30s)
  • maxDelayBetweenRequests (integer, default: 7) - Max delay between actions (2-60s)
  • delayBetweenProfiles (integer, default: 10) - Delay between profiles (5-300s)
  • humanizeBehavior (boolean, default: true) - Enable human-like behavior
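
The defaults and ranges listed above can be checked before a run is started. The following is a minimal sketch, not part of the actor: `build_input` is a hypothetical helper, and the rule that the minimum delay may not exceed the maximum delay is an assumption rather than documented behavior.

```python
from typing import Any

# Defaults and ranges copied from the parameter list above.
DEFAULTS: dict[str, Any] = {
    "maxTweets": 50,
    "includeReplies": False,
    "includeRetweets": True,
    "sessionName": "default_session",
    "minDelayBetweenRequests": 3,
    "maxDelayBetweenRequests": 7,
    "delayBetweenProfiles": 10,
    "humanizeBehavior": True,
}
RANGES: dict[str, tuple[int, int]] = {
    "maxTweets": (1, 500),
    "minDelayBetweenRequests": (1, 30),
    "maxDelayBetweenRequests": (2, 60),
    "delayBetweenProfiles": (5, 300),
}


def build_input(usernames: list[str], **overrides: Any) -> dict[str, Any]:
    """Hypothetical helper: merge overrides into the defaults and check ranges."""
    if not usernames:
        raise ValueError("usernames must contain at least one entry")
    unknown = set(overrides) - set(DEFAULTS) - {"cookies"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    run_input = {**DEFAULTS, **overrides}
    # The parameter list asks for usernames without the leading "@".
    run_input["usernames"] = [name.lstrip("@") for name in usernames]
    for key, (low, high) in RANGES.items():
        if not low <= run_input[key] <= high:
            raise ValueError(f"{key} must be between {low} and {high}")
    # Assumption: the minimum delay should not exceed the maximum delay.
    if run_input["minDelayBetweenRequests"] > run_input["maxDelayBetweenRequests"]:
        raise ValueError("minDelayBetweenRequests exceeds maxDelayBetweenRequests")
    return run_input
```

For example, `build_input(["@elonmusk"], maxTweets=20)` returns a complete input dictionary with the username normalized to `elonmusk`, while `maxTweets=501` raises `ValueError`.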

📊 Output Format

The scraper returns a structured JSON object with profile information and all posts:

{
  "username": "elonmusk",
  "display_name": "Elon Musk",
  "bio": "Tesla, SpaceX, Neuralink, The Boring Company",
  "location": "Texas, USA",
  "website": "https://tesla.com",
  "joined_date": "Joined June 2009",
  "birth_date": null,
  "followers_count": 228100000,
  "following_count": 1219,
  "tweets_count": 0,
  "verified": true,
  "profile_image_url": "https://pbs.twimg.com/profile_images/...",
  "profile_banner_url": "https://pbs.twimg.com/profile_banners/...",
  "scraped_at": "2024-01-15T12:00:00.000000",
  "posts": [
    {
      "profile_username": "elonmusk",
      "tweet_id": "1234567890",
      "tweet_url": "https://twitter.com/elonmusk/status/1234567890",
      "text": "Tweet content here without newlines",
      "author_name": "Elon Musk",
      "author_username": "elonmusk",
      "author_verified": true,
      "timestamp": "2024-01-15T10:30:00.000Z",
      "created_at": "2024-01-15T10:30:00+00:00",
      "likes_count": 15000,
      "retweets_count": 3000,
      "replies_count": 500,
      "quotes_count": 200,
      "bookmarks_count": 1000,
      "views_count": 500000,
      "total_engagement": 19700,
      "is_retweet": false,
      "is_reply": false,
      "reply_to": null,
      "media_urls": [
        "https://pbs.twimg.com/media/image1.jpg",
        "https://pbs.twimg.com/media/video1.mp4"
      ],
      "hashtags": ["AI", "Technology"],
      "mentions": ["openai", "github"],
      "urls": ["https://example.com"],
      "scraped_at": "2024-01-15T12:00:00.000000"
    }
  ]
}
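
In the sample above, `total_engagement` (19700) equals the sum of likes, retweets, replies, quotes, and bookmarks, with views excluded. Whether the actor always computes the field this way is an assumption drawn from that one sample. The sketch below recomputes the sum and ranks posts by it; `total_engagement` and `top_posts` are hypothetical helper names.

```python
# Fields that add up to total_engagement in the sample output (views excluded).
ENGAGEMENT_FIELDS = (
    "likes_count",
    "retweets_count",
    "replies_count",
    "quotes_count",
    "bookmarks_count",
)


def total_engagement(post: dict) -> int:
    """Sum the engagement counters, treating missing or null values as 0."""
    return sum(post.get(field) or 0 for field in ENGAGEMENT_FIELDS)


def top_posts(profile: dict, n: int = 3) -> list[dict]:
    """Return the n posts with the highest recomputed engagement."""
    posts = profile.get("posts", [])
    return sorted(posts, key=total_engagement, reverse=True)[:n]
```

Recomputing the value is also a quick way to spot posts where a counter came back null.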

🔧 Usage Examples

Basic Usage

{
  "usernames": ["elonmusk"],
  "maxTweets": 20
}

Advanced Usage with Authentication

{
  "usernames": ["elonmusk", "openai"],
  "maxTweets": 100,
  "includeReplies": true,
  "includeRetweets": false,
  "cookies": "[{\"name\":\"auth_token\",\"value\":\"your_token\",\"domain\":\".twitter.com\"}]",
  "sessionName": "my_twitter_account",
  "minDelayBetweenRequests": 5,
  "maxDelayBetweenRequests": 10,
  "humanizeBehavior": true
}

Conservative Settings (Avoid Detection)

{
  "usernames": ["elonmusk"],
  "maxTweets": 30,
  "minDelayBetweenRequests": 8,
  "maxDelayBetweenRequests": 15,
  "delayBetweenProfiles": 30,
  "humanizeBehavior": true
}
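
Any of the inputs above can also be submitted from code. This is a minimal sketch, assuming the `apify-client` Python package and an `APIFY_TOKEN` environment variable. `ACTOR_ID` is a placeholder, since this page does not state the actor's ID, and the assumption that each dataset item has the shape shown under Output Format is not confirmed by the source. `run_scraper` and `main` are hypothetical names.

```python
import os
from typing import Any, Iterator

# Placeholder: replace with the actor's real ID from its Apify Store page.
ACTOR_ID = "<username>/<actor-name>"


def run_scraper(client: Any, run_input: dict) -> Iterator[dict]:
    """Start the actor, wait for it to finish, and yield its dataset items."""
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    yield from client.dataset(run["defaultDatasetId"]).iterate_items()


def main() -> None:
    # Imported here so the helper above can be used with any compatible client.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run_input = {"usernames": ["elonmusk"], "maxTweets": 20}
    for profile in run_scraper(client, run_input):
        # Assumption: one item per profile, shaped like the Output Format sample.
        print(profile.get("username"), len(profile.get("posts", [])))
```

Call `main()` after setting `ACTOR_ID` and `APIFY_TOKEN`. Passing the client into `run_scraper` keeps the helper easy to exercise with a stub.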

🔐 Authentication (REQUIRED for Recent Tweets)

Why you need authentication:

  • Without login: Twitter shows "Highlights" tab with old popular tweets (can be years old)
  • With login: Twitter shows recent chronological tweets from the profile

Bottom line: If you want recent tweets, you MUST provide cookies.

To access private profiles or get recent chronological tweets, provide Twitter authentication cookies as described below.

How to Get Cookies

  1. Open Twitter/X in your browser
  2. Log in to your account
  3. Open Developer Tools (F12)
  4. Go to the Application tab (Chrome) or Storage tab (Firefox) → Cookies → https://twitter.com
  5. Export cookies in JSON format

Required Cookies

  • auth_token - Main authentication token (required)
  • ct0 - CSRF token (optional but recommended)

Example cookie array:

[
  {
    "name": "auth_token",
    "value": "your_auth_token_value",
    "domain": ".twitter.com",
    "path": "/",
    "expires": 1234567890,
    "httpOnly": true,
    "secure": true
  },
  {
    "name": "ct0",
    "value": "your_ct0_value",
    "domain": ".twitter.com",
    "path": "/",
    "expires": 1234567890,
    "httpOnly": false,
    "secure": true
  }
]
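
Because the `cookies` input is a string that itself contains JSON, escaping the quotes by hand is error-prone. A minimal sketch of one way to produce that string from an exported cookie list follows; `cookies_to_input_string` is a hypothetical helper, and dropping every cookie other than `auth_token` and `ct0` is an assumption based on the list above.

```python
import json

# Cookie names listed under Required Cookies.
KEEP = ("auth_token", "ct0")


def cookies_to_input_string(exported: list[dict]) -> str:
    """Keep only the listed cookies and serialize them for the cookies input."""
    selected = [cookie for cookie in exported if cookie.get("name") in KEEP]
    if not any(cookie["name"] == "auth_token" for cookie in selected):
        raise ValueError("auth_token cookie is missing from the export")
    # json.dumps yields a plain JSON string; serializing the whole actor input
    # afterwards escapes the inner quotes automatically.
    return json.dumps(selected)
```

The result can be assigned directly to the `cookies` key of the input dictionary before that dictionary is serialized.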

⚠️ Important Notes

🔴 CRITICAL: Recent Tweets Require Authentication

Without authentication cookies, you will get OLD tweets (highlights), not recent ones!

  • Twitter shows logged-out users a "Highlights" feed with popular old tweets
  • This is Twitter's deliberate behavior, not a bug in the scraper
  • Solution: Provide authentication cookies in the input (see Authentication section)
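
One way to notice that a run returned Highlights instead of recent tweets is to check the age of the newest post. This is a rough sketch under stated assumptions: it relies on `created_at` being an ISO 8601 timestamp with a UTC offset, as in the sample output, and the 90-day threshold is arbitrary. `newest_post_age_days` and `looks_like_highlights` are hypothetical names.

```python
from datetime import datetime, timezone
from typing import Optional


def newest_post_age_days(profile: dict, now: Optional[datetime] = None) -> Optional[float]:
    """Return the age in days of the most recent post, or None if there are none."""
    now = now or datetime.now(timezone.utc)
    # Assumes created_at carries a UTC offset, e.g. "2024-01-15T10:30:00+00:00".
    stamps = [
        datetime.fromisoformat(post["created_at"])
        for post in profile.get("posts", [])
        if post.get("created_at")
    ]
    if not stamps:
        return None
    return (now - max(stamps)).total_seconds() / 86400


def looks_like_highlights(
    profile: dict, max_age_days: float = 90.0, now: Optional[datetime] = None
) -> bool:
    """Heuristic: flag the result if even the newest post is older than the threshold."""
    age = newest_post_age_days(profile, now)
    return age is not None and age > max_age_days
```

An inactive account will also trip this check, so treat a positive result as a prompt to verify the cookies rather than as proof.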

Rate Limits

  • Twitter has rate limits for both authenticated and unauthenticated access
  • Use slower settings (higher delays) to avoid detection
  • Start with fewer tweets (20-50) when testing
  • Use authenticated sessions for better limits
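
To illustrate what a randomized delay between `minDelayBetweenRequests` and `maxDelayBetweenRequests` means in practice, here is a minimal sketch. The actor's internal delay logic is not shown in this document, so a uniform draw between the two bounds is an assumption, and `pick_delay` and `polite_pause` are hypothetical names.

```python
import random
import time


def pick_delay(min_delay: float, max_delay: float, rng=random) -> float:
    """Draw a delay in seconds uniformly between the two bounds."""
    if min_delay > max_delay:
        raise ValueError("min_delay must not exceed max_delay")
    return rng.uniform(min_delay, max_delay)


def polite_pause(min_delay: float = 3, max_delay: float = 7) -> float:
    """Sleep for a randomized interval and return how long the pause was."""
    delay = pick_delay(min_delay, max_delay)
    time.sleep(delay)
    return delay
```

With the default bounds of 3 and 7 seconds the average pause is about 5 seconds, which gives a rough way to estimate how run time grows as the delays are raised.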

Best Practices

  1. Use Authentication - REQUIRED for recent tweets, better access, fewer restrictions
  2. Start Small - Test with maxTweets: 10-20 first
  3. Slow Down - Higher delays = less likely to be blocked
  4. Rotate Sessions - Use different sessionName for different accounts
  5. Monitor Logs - Watch for warning/block messages about login status

Limitations

  • Cannot access private (protected) accounts unless you authenticate with an account that follows them
  • Some tweets may be hidden due to privacy settings
  • Rate limits apply (especially without authentication)
  • Twitter's structure may change, requiring updates
  • Without cookies, you get old "highlight" tweets, not recent chronological ones

🛠️ Running Locally

Prerequisites

pip install apify playwright beautifulsoup4
playwright install firefox

Create Input File

Create storage/key_value_stores/default/INPUT.json:

{
  "usernames": ["elonmusk"],
  "maxTweets": 20
}
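
The input file can also be generated from a script, which avoids hand-editing JSON when the `cookies` string is long. A minimal sketch follows, using the path given above; `write_local_input` is a hypothetical helper.

```python
import json
from pathlib import Path


def write_local_input(run_input: dict, project_dir: str = ".") -> Path:
    """Write run_input to storage/key_value_stores/default/INPUT.json."""
    path = Path(project_dir) / "storage" / "key_value_stores" / "default" / "INPUT.json"
    # Create the storage directories if this is the first local run.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(run_input, indent=2), encoding="utf-8")
    return path
```

For example, `write_local_input({"usernames": ["elonmusk"], "maxTweets": 20})` run from the project directory creates the same file as the manual step.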

Run

cd Twitter-Profile
apify run

📈 Performance Tips

For Maximum Speed

{
  "minDelayBetweenRequests": 2,
  "maxDelayBetweenRequests": 4,
  "humanizeBehavior": false
}

For Maximum Stealth

{
  "minDelayBetweenRequests": 8,
  "maxDelayBetweenRequests": 15,
  "delayBetweenProfiles": 30,
  "humanizeBehavior": true
}

For Private Profiles

{
  "cookies": "[...]",
  "sessionName": "authenticated_session"
}

🐛 Troubleshooting

No Tweets Scraped

  • Check if profile exists and is public
  • Try with authentication cookies
  • Increase wait times
  • Check logs for error messages

Account Blocked/Suspended

  • Reduce scraping speed (increase delays)
  • Use authenticated session
  • Wait 24 hours before retrying
  • Use different account

Tweets Missing Data

  • Some data may not be available publicly
  • Use authentication for better access
  • Some metrics require additional API calls

📄 License

This actor is provided as-is for educational purposes. Be aware of Twitter's Terms of Service when scraping data.

💡 Tips

  • Use for research, analytics, monitoring
  • Respect robots.txt and Terms of Service
  • Don't overload Twitter's servers
  • Consider official Twitter API for production use

Happy Scraping! 🚀