Twitter Profile Scraper

🐦 Twitter Profile Scraper extracts detailed data from X/Twitter profiles—bio, handle, username, followers/following, location, website, verified badge, profile images, join date & recent tweets. 🚀 Ideal for influencer discovery, social listening, lead gen & competitor intel.

Pricing

$19.99/month + usage

Developer

Scraply

Maintained by Community


Twitter Profile Scraper

The Twitter Profile Scraper is a production-ready tool for extracting structured public data from X/Twitter profiles at scale. It works as a profile data extractor for marketers, developers, data analysts, and researchers who need profile information and recent tweets without relying on official APIs. Built for influencer discovery, social listening, lead generation, and competitor intelligence, it turns public profile timelines into clean, exportable records for your pipelines.

What data / output can you get?

Use this Twitter profile scraping bot to collect clean, consistent fields from profiles and keyword searches. Below are the core output fields as saved to the dataset.

Data type | Description | Example value
user | Legacy user profile object for the tweet author | {"screen_name":"elonmusk","name":"Elon Musk"}
id_str | Unique tweet ID (string) | "1519480761749016577"
full_text | Full tweet text | "Next I'm buying Coca-Cola to put the cocaine back in"
created_at | Tweet creation time (string) | "Thu Apr 28 00:56:58 +0000 2022"
favorite_count | Number of likes | 4289223
retweet_count | Number of retweets | 594435
reply_count | Number of replies | 170050
quote_count | Number of quotes | 167104
bookmark_count | Number of bookmarks (defaults to 0 if missing) | 21112
conversation_id_str | Conversation/thread identifier | "1519480761749016577"
entities | Entities object (hashtags, mentions, URLs) with timestamps array | {"hashtags":[],"user_mentions":[],"urls":[],"timestamps":[]}
lang | Detected language code | "en"

Bonus fields included when available: extended_entities (media), is_quote_status, favorited, retweeted, bookmarked, user_id_str, display_text_range. Results can be exported to JSON, CSV, or Excel for downstream use.
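For downstream CSV or spreadsheet use, a dataset item with these fields can be flattened into a single row. The helper below is an illustrative sketch (not part of the actor) that uses only the field names documented above:

```python
def flatten_tweet(item):
    """Flatten one dataset item into a flat, CSV-ready dict."""
    user = item.get("user") or {}  # "user" is omitted when addUserInfo is false
    return {
        "tweet_id": item.get("id_str"),
        "text": item.get("full_text"),
        "created_at": item.get("created_at"),
        "likes": item.get("favorite_count", 0),
        "retweets": item.get("retweet_count", 0),
        "replies": item.get("reply_count", 0),
        "quotes": item.get("quote_count", 0),
        "bookmarks": item.get("bookmark_count", 0),  # defaults to 0 if missing
        "lang": item.get("lang"),
        "author": user.get("screen_name"),
        "followers": user.get("followers_count"),
    }
```

Feed each flattened dict to `csv.DictWriter` (or a DataFrame) to build the export yourself when the built-in CSV export isn't enough.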

Key features

  • 🚀 Dynamic authorization capture: Captures a Bearer authorization header on-the-fly using Playwright to access public X endpoints reliably—no manual cookie handling required.

  • 🧭 Hybrid API + browser approach: Combines X GraphQL/REST requests with Playwright fallbacks for resilient extraction when scraping Twitter public profiles and keyword timelines.

  • 🔄 Smart proxy fallback & retries: Automatically escalates from no proxy → datacenter → residential with exponential backoff when blocked (401/403/429), improving uptime on large jobs.

  • 📥 Flexible inputs (URLs, @usernames, IDs, keywords): Accepts profile URLs, screen names, numeric user IDs, or search keywords in bulk.

  • 🧩 Configurable profile data: Control whether to include legacy user info with each tweet (addUserInfo) or fetch only the user profile with no tweets (onlyUserInfo).

  • 🧪 Not-found / suspended flags: Optionally include notFound and suspended markers for inputs that cannot be resolved, making error handling explicit in your pipelines.

  • 📦 Live dataset streaming & easy export: Saves items as they’re extracted for near-real-time access; export to JSON, CSV, or Excel for analysis, enrichment, or BI workflows.

  • 🧠 Clean, legacy-style tweet objects: Output mirrors legacy Tweet fields (id_str, full_text, created_at, counts, entities, extended_entities) for easy integration with existing tooling.

  • 🧹 Search deduplication: Keyword search mode auto-deduplicates tweets by id_str to keep results tidy when scrolling through search timelines.

  • 🧰 Developer-friendly & API-ready: Built on Apify with Python + Playwright. Trigger runs via the Apify API from Python scripts or automation workflows.
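The proxy fallback behavior can be sketched roughly as follows. The tier names and backoff values here are assumptions for illustration, not the actor's exact internals:

```python
import time

# Escalation order described above: no proxy -> datacenter -> residential.
PROXY_TIERS = [None, "datacenter", "residential"]
# Status codes treated as "blocked".
BLOCK_STATUSES = {401, 403, 429}

def fetch_with_fallback(fetch, max_retries_per_tier=3, sleep=time.sleep):
    """Call fetch(proxy_tier) -> (status, body), escalating tiers when blocked."""
    for tier in PROXY_TIERS:
        delay = 1.0
        for _ in range(max_retries_per_tier):
            status, body = fetch(tier)
            if status not in BLOCK_STATUSES:
                return body
            sleep(delay)   # exponential backoff within a tier
            delay *= 2
    raise RuntimeError("blocked on every proxy tier")
```

The `sleep` parameter is injected only so the policy is easy to test; the default behaves like a normal retry loop.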

How to use Twitter Profile Scraper - step by step

  1. Create or log in to your Apify account. You can start quickly on the Apify platform.

  2. Open “Twitter Profile Scraper”. Find the actor in the Apify Store and click Try for free.

  3. Add your input data. Paste profile URLs, usernames, user IDs, or keywords into startUrls (string list), for example https://x.com/elonmusk, elonmusk, 44196397, or tesla.

  4. Configure options

    • maxTweets: Limit tweets per user/keyword (default 10; up to 100).
    • addUserInfo: Include legacy user profile data with each tweet.
    • onlyUserInfo: Fetch only the profile (no tweets).
    • addNotFoundUsersToOutput / addSuspendedUsersToOutput: Include explicit flags for unresolved or suspended profiles.
    • proxyConfiguration: Start with no proxy by default; enable Apify Proxy if you see blocks.
  5. Start the run. The actor will capture an authorization header automatically, resolve usernames to user IDs, and fetch timeline pages with cursor pagination.

  6. Monitor progress. Items are pushed to the dataset continuously; you can view the “Scraped Tweets” table in the Dataset tab.

  7. Download your results. Export the dataset in JSON, CSV, or Excel for analysis, enrichment, or loading into your data warehouse or CRM.

Pro tip: Use the Apify API to schedule runs, feed new usernames/keywords programmatically, and pipe results into your data stack or a Twitter profile scraper tool workflow.
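As a sketch of that workflow, the snippet below builds a run input matching the schema documented later on this page and triggers the actor through the official apify-client Python package. The token and actor ID are placeholders you would take from the Apify Console; build_run_input is an illustrative helper, not part of the actor:

```python
def build_run_input(targets, max_tweets=10, only_user_info=False):
    """Assemble a run-input dict matching the actor's input schema."""
    return {
        "startUrls": list(targets),          # URLs, usernames, IDs, or keywords
        "maxTweets": max_tweets,             # 1-100, default 10
        "addUserInfo": True,
        "onlyUserInfo": only_user_info,
        "proxyConfiguration": {"useApifyProxy": False},
    }

def run_actor(token, actor_id, targets):
    """Trigger a run and collect the resulting dataset items."""
    from apify_client import ApifyClient  # pip install apify-client
    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=build_run_input(targets))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Schedule this from cron or an orchestrator to keep fresh usernames and keywords flowing into your data stack.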

Use cases

Use case name | Description
Marketers – influencer discovery | Identify and track creators by scraping Twitter user info and recent posts, then filter by engagement metrics and bios for outreach.
Competitor analysis – content tracking | Monitor competitors’ timelines and quantify post volume, topics, and engagement to refine your own strategy.
Social listening – keyword timelines | Use keywords to scrape Twitter user info from public posts matching your topics and analyze trends at scale.
Lead generation – bio enrichment | Enrich leads by extracting bios, website links, and verification status from public profiles for smarter segmentation.
Research & academia – public datasets | Build reproducible datasets of public tweets (full_text, created_at, entities) for experiments and publications.
Developer pipelines – API ingestion | Call the actor via API and stream structured JSON into ETL jobs, data lakes, or LLM training corpora.
Journalism – public figure monitoring | Track public figures’ posting activity and statements with timestamped, structured records.

Why choose Twitter Profile Scraper?

This Twitter public profile scraper is engineered for precision, resilience, and clean outputs that slide straight into your workflows.

  • 🎯 Accurate, structured results: Legacy-style tweet objects and a clear “user” profile object make downstream parsing simple.
  • ⚡ Built for scale: Bulk inputs (URLs, usernames, IDs, keywords) and robust pagination handle ongoing monitoring or batch jobs.
  • 🔌 Developer access: Use the Apify API to trigger runs and export structured data for Python or automation scripts.
  • 🛡️ Resilient networking: Automatic proxy escalation and retries reduce blocks when you scrape Twitter profiles at volume.
  • 💸 Cost-effective operations: Live dataset streaming and selective inclusion of fields help you control compute and storage.
  • 🔀 Alternative to unstable tools: Avoid brittle extensions and copy-paste; get consistent, production-grade extraction instead.

Bottom line: if you need a reliable Twitter profile scraper tool and Twitter profile scraper API for automated, structured extraction, this actor delivers.

Is it legal to scrape Twitter data?

Yes—when used responsibly. This actor targets publicly available information on X/Twitter and does not access private or authenticated data.

Guidelines for compliant use:

  • Collect only publicly available information and respect platform terms.
  • Do not attempt to access private profiles or bypass security.
  • Be mindful of rate limits and operational load when you scrape Twitter public profiles.
  • Ensure your use aligns with applicable data protection laws (e.g., GDPR, CCPA).
  • Consult your legal team for edge cases or jurisdiction-specific requirements.

Input parameters & output format

Example JSON input

{
  "startUrls": [
    "https://x.com/elonmusk",
    "elonmusk",
    "44196397",
    "tesla"
  ],
  "maxTweets": 100,
  "sortOrder": "chronological",
  "addUserInfo": true,
  "onlyUserInfo": false,
  "addNotFoundUsersToOutput": false,
  "addSuspendedUsersToOutput": false,
  "proxyConfiguration": {
    "useApifyProxy": false
  }
}

Parameters (from input schema):

  • startUrls (array, required): List of Twitter profile URLs (e.g., https://x.com/username), usernames (e.g., username), user IDs (e.g., 44196397), or search keywords (e.g., tesla). Supports bulk input. Default: none (required).
  • maxTweets (integer): Maximum number of tweets to fetch per user or keyword. Minimum: 1, Maximum: 100. Default: 10.
  • sortOrder (string): Sort order for tweets. One of: "chronological", "relevance" (the API currently returns chronological order). Default: "chronological".
  • addUserInfo (boolean): Include user profile data (legacy) in each tweet output. Default: true.
  • onlyUserInfo (boolean): If enabled, only fetch user profile information without tweets. Default: false.
  • addNotFoundUsersToOutput (boolean): Include users that were not found in the output with notFound flag. Default: false.
  • addSuspendedUsersToOutput (boolean): Include suspended users in the output with suspended flag. Default: false.
  • proxyConfiguration (object): Proxy configuration. By default, no proxy is used. If Twitter blocks requests, the actor will automatically fallback to datacenter proxy, then residential proxy with retries. Default: {"useApifyProxy": false}.
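As a rough illustration, the constraints above can be checked client-side before starting a run. This validator is a sketch of the documented schema, not the platform's actual server-side validation:

```python
def validate_input(data):
    """Return a list of schema violations for a candidate run input."""
    errors = []
    urls = data.get("startUrls")
    if not isinstance(urls, list) or not urls:
        errors.append("startUrls is required and must be a non-empty list")
    mt = data.get("maxTweets", 10)           # default 10
    if not isinstance(mt, int) or not 1 <= mt <= 100:
        errors.append("maxTweets must be an integer between 1 and 100")
    if data.get("sortOrder", "chronological") not in ("chronological", "relevance"):
        errors.append('sortOrder must be "chronological" or "relevance"')
    return errors
```

An empty list means the input should be accepted; anything else tells you what to fix before paying for a run.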

Example JSON output (tweet object)

{
  "user": {
    "screen_name": "elonmusk",
    "name": "Elon Musk",
    "description": "Mars & Cars, Chips & Dips",
    "followers_count": 229033543,
    "friends_count": 1226,
    "statuses_count": 89153,
    "favourites_count": 182734,
    "listed_count": 165176,
    "created_at": "Tue Jun 02 20:12:29 +0000 2009",
    "verified": true,
    "profile_image_url": "https://pbs.twimg.com/profile_images/...",
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/...",
    "default_profile": false,
    "default_profile_image": false,
    "entities": {
      "description": {
        "urls": []
      }
    }
  },
  "id_str": "1519480761749016577",
  "full_text": "Next I'm buying Coca-Cola to put the cocaine back in",
  "created_at": "Thu Apr 28 00:56:58 +0000 2022",
  "favorite_count": 4289223,
  "retweet_count": 594435,
  "reply_count": 170050,
  "quote_count": 167104,
  "bookmark_count": 21112,
  "conversation_id_str": "1519480761749016577",
  "user_id_str": "44196397",
  "lang": "en",
  "is_quote_status": false,
  "favorited": false,
  "retweeted": false,
  "bookmarked": false,
  "display_text_range": [0, 52],
  "entities": {
    "hashtags": [],
    "symbols": [],
    "user_mentions": [],
    "urls": [],
    "timestamps": []
  },
  "extended_entities": null
}

Additional output variations:

  • Only profile mode (onlyUserInfo = true):

    {
      "user": {
        "screen_name": "mrbeast",
        "name": "MrBeast"
      }
    }

  • Not found or suspended markers (when enabled):

    { "user": { "screen_name": "nonexistent_user" }, "notFound": true }
    { "user": { "screen_name": "suspended_user" }, "suspended": true }

Note: When addUserInfo is false, the “user” object is omitted from tweet items. The “entities” object always includes a “timestamps” array.
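As a quick example of working with these items, the snippet below combines the count fields from a tweet object into a simple engagement total and rate. The rate formula (interactions divided by followers) is a common convention, not an actor output:

```python
def engagement(item):
    """Return (total interactions, interactions per follower) for a tweet item."""
    total = sum(item.get(k, 0) for k in
                ("favorite_count", "retweet_count", "reply_count", "quote_count"))
    followers = (item.get("user") or {}).get("followers_count") or 0
    rate = total / followers if followers else None  # None when no user object
    return total, rate
```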

FAQ

What is a Twitter Profile Scraper?

It’s a Twitter profile scraping bot that extracts public profile information and recent tweets from X/Twitter. This actor outputs structured JSON with legacy tweet fields and an optional user profile object.

Is scraping public Twitter data legal?

Yes, when you collect public data responsibly. Follow X/Twitter’s terms, avoid private or protected content, and ensure compliance with applicable regulations such as GDPR and CCPA.

Do I need to log in or provide cookies?

No. The actor captures an authorization header dynamically via Playwright and accesses public endpoints, so you can scrape Twitter user info from public pages without providing credentials.

What inputs are supported?

You can pass profile URLs, @usernames, numeric user IDs, or keywords in startUrls. This flexibility makes it a practical Twitter profile data extractor and X profile scraper.

Can it scrape followers?

It does not fetch follower lists. However, when addUserInfo is enabled, the legacy user profile object may include followers_count and other public metrics.

How many tweets can I scrape per profile or keyword?

The maxTweets parameter supports up to 100 per user/keyword (default 10). Use multiple inputs for broader coverage across profiles or topics.

Can I exclude or include profile data in tweet results?

Yes. Set addUserInfo to include the legacy user object with each tweet, or disable it to output tweet-only records. Use onlyUserInfo to fetch just the profile without tweets.

Does it support proxies?

Yes. By default no proxy is used, and if blocks occur the actor automatically falls back to datacenter and then residential proxies with retries. You can also configure proxyConfiguration explicitly.

Does it work with APIs or Python?

Yes. You can trigger runs and fetch datasets through the Apify API, making it easy to integrate this Twitter profile scraper API into Python scripts, ETL jobs, or automation tools.

Can I use it to scrape Twitter profiles for keyword monitoring?

Yes. Provide a keyword in startUrls to collect tweets from the public search timeline, with built-in deduplication by tweet ID.
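The deduplication described here can be sketched as an order-preserving filter on id_str (illustrative only, not the actor's code):

```python
def dedupe_by_id(items):
    """Keep the first occurrence of each tweet id, preserving order."""
    seen = set()
    out = []
    for item in items:
        tid = item.get("id_str")
        if tid and tid in seen:
            continue  # search pages can resurface the same tweet while scrolling
        if tid:
            seen.add(tid)
        out.append(item)
    return out
```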

Final thoughts

The Twitter Profile Scraper is built for teams that need structured Twitter/X profile timelines at scale. It combines resilient extraction, proxy-aware networking, and clean outputs for analysts, marketers, and developers.

Use it to automate influencer research, social listening, and competitor tracking with export-ready JSON/CSV/Excel. Developers can call the Apify API to embed this Twitter profile scraper tool into Python pipelines and automated workflows. Start extracting smarter Twitter public profile data—reliably, at scale, and in a format your stack understands.