Twitter URL Scraper
Pay $2.50 for 1,000 posts
This Actor may be unreliable while under maintenance.
Copy any Twitter URL and extract Twitter usernames, profile photos, follower count, tweets, hashtags, favorite count, and more. Export scraped datasets, run the scraper via API, schedule and monitor runs or integrate with other tools.
2024-03-07
BREAKING CHANGES
- For profiles and tweets that are not found, the dataset now includes an item with the URL property and the reason the profile or tweet was not scraped
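Because not-found profiles and tweets now appear as dataset items, downstream code may need to separate them from real results. The sketch below shows one way to do that in Python. The key names `url` and `reason`, and the sample items, are assumptions made for illustration; the changelog does not state the exact property names, so check an actual dataset before relying on them.

```python
from typing import Any

Item = dict[str, Any]


def split_results(items: list[Item]) -> tuple[list[Item], list[Item]]:
    """Separate scraped items from 'not found' placeholder items."""
    scraped: list[Item] = []
    not_found: list[Item] = []
    for item in items:
        # Assumption: only placeholder items carry a "reason" key.
        if "reason" in item:
            not_found.append(item)
        else:
            scraped.append(item)
    return scraped, not_found


# Hypothetical dataset items, not real Actor output.
items = [
    {"url": "https://twitter.com/example_user", "reason": "Profile not found"},
    {"url": "https://twitter.com/example_user/status/1", "full_text": "hello"},
]

scraped, not_found = split_results(items)
for item in not_found:
    print(f"Skipped {item['url']}: {item['reason']}")
```

Code that previously assumed every dataset item was a tweet or profile would need a filter like this after the breaking change.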
2024-01-04
Fix:
- Fixed an issue with handling tweet URLs
2023-10-17
Features:
- Support `x.com` URLs
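Supporting both domains usually comes down to treating `x.com` and `twitter.com` links as the same resource. The following Python sketch illustrates one possible normalization step. The function name, the accepted host list, and the choice of `twitter.com` as the canonical host are assumptions for illustration, not the Actor's actual implementation.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: hosts treated as equivalent for this illustration.
TWITTER_HOSTS = {
    "twitter.com",
    "www.twitter.com",
    "mobile.twitter.com",
    "x.com",
    "www.x.com",
}


def normalize_twitter_url(url: str) -> str:
    """Rewrite a twitter.com or x.com URL to one canonical form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in TWITTER_HOSTS:
        raise ValueError(f"Not a Twitter/X URL: {url}")
    # Force https, a single host, no trailing slash, and no fragment.
    path = parts.path.rstrip("/")
    return urlunsplit(("https", "twitter.com", path, parts.query, ""))


print(normalize_twitter_url("https://x.com/apify/status/123"))
```

Note that this sketch rejects inputs without a scheme (for example `x.com/apify`), because `urlsplit` does not recognize a host in that case.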
2023-08-10
Changes:
- Removed the `toDate` input option, because tweets are no longer ordered and Twitter limits results to 100 tweets per profile
2023-08-03
Changes:
- Removed the `fromDate` and `relativeFromDate` input options
2023-07-12
Fixes:
- Fixed an issue where not all search URLs were working
2023-07-10
Changes:
- You can now scrape an individual tweet, its author's user data, or both by providing the tweet URL
2023-06-30
BREAKING CHANGES:
On 30 June 2023, Twitter put all content behind login, except for single tweets, so scraping publicly available Twitter content is no longer possible.
You can have a look at https://apify.com/shanes/tweet-flash or https://apify.com/web.harvester/easy-twitter-search-scraper, Actors that don't require login.
2023-06-22
Changes:
- Deprecated the `fromDate` and `relativeFromDate` input options; they will be removed in a few weeks
2023-06-19
Changes:
- Removed the `skipPromotedTweets` option
- Updated the `repliesDepth` option to be consistent for profiles and tweets
Fixes:
- Fixed issue where the actor would get stuck on some tweets that are not available
- Fixed issue with scraping replies of replies for profiles
2023-06-05
Changes:
- Disabled `skipPromotedTweets` due to an issue
- Removed the `maxRequestRetries`, `handlePageTimeoutSecs`, `maxConcurrency`, and `maxIdleTimeoutSecs` fields from the input schema
- The actor now uses the `cheerioCrawler` by default, which is more reliable and efficient than the `puppeteerCrawler`
- Changed the option `collectOriginalTweetOnly` to `collectOriginalTweetOnly`, since it's not only for retweets, it also applies to quoted tweets
- If `tweetsDesired` is set to `0`, the actor will only scrape user info and won't scrape any tweets even if `addUserInfo` is set to false
2023-05-30
Changes:
- BREAKING CHANGES: Completely removed the `extendScraperFunction` function from the actor.
- Deprecated the `extendOutputFunction` function; it will be supported for a few weeks before being removed completely.
2023-05-29
Changes:
- Add an experimental option `useNewProfileScraper` that allows scraping multiple profiles at once faster and more efficiently
- Add an experimental option `useNewTweetsScraper` that allows scraping multiple tweets at once faster and more efficiently
- Add an option `skipPromotedTweets` that allows users to skip promoted tweets
Features:
- Add `is_thread`, `is_root_thread`, and `root_thread_url` to the output
- Add `includeThreadsOnly` option to include only threads in the output
- Add `username` and `user_id` fields to the output if `addUserInfo` is false
2023-05-09
Features:
- Add `repliesDepth` option to scrape replies of replies based on the depth provided
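To illustrate what a depth limit on replies means, the Python sketch below walks a nested reply tree and stops at a given depth. The dictionary shape (`id`, `replies`), the function name, and the convention that depth 1 means direct replies are all assumptions for illustration; the changelog does not describe how the Actor counts depth internally.

```python
from typing import Any

Tweet = dict[str, Any]


def collect_replies(tweet: Tweet, max_depth: int, depth: int = 1) -> list[str]:
    """Return the ids of replies found up to max_depth levels below the tweet."""
    if depth > max_depth:
        return []
    collected: list[str] = []
    for reply in tweet.get("replies", []):
        collected.append(reply["id"])
        # Descend one level further for replies of replies.
        collected.extend(collect_replies(reply, max_depth, depth + 1))
    return collected


# Hypothetical conversation tree, not real Actor output.
root = {
    "id": "root",
    "replies": [
        {"id": "r1", "replies": [{"id": "r1a", "replies": [{"id": "r1a-i"}]}]},
        {"id": "r2"},
    ],
}

print(collect_replies(root, max_depth=1))
print(collect_replies(root, max_depth=2))
```

With this convention, a depth of 1 collects only direct replies, and each additional level adds one more layer of replies of replies.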
Changes:
- Removed the `addTweetViewCount` option; the view count is now always scraped
2023-05-05
Features:
- For retweets, the `retweet` object will now contain the original tweet, and the `tweet` object will contain the retweet
- Add `skipRetweets` option to skip retweets
- Add `collectOriginalTweetOnly` option to collect only the original retweet
2023-04-28
Features:
- The full text of truncated tweets is now scraped
- Add `is_truncated` to the output
- Add the option `tweetsLanguage` to search tweets by language
- Add the option `keywordsSearchType` to search for tweets that contain all of the keywords, any of them, the exact phrase, or none of them
- Add `relativeToDate` and `relativeFromDate` to the input to search for tweets using relative dates instead of absolute dates
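The four keyword modes and the language filter described above correspond naturally to Twitter's search operators (space for AND, `OR`, quoted phrases, a leading `-` for exclusion, and `lang:`). The Python sketch below shows one way such a query string could be assembled. The function name and the mode values `all`, `any`, `exact`, and `none` are hypothetical; the changelog does not list the accepted values of `keywordsSearchType` or say how the Actor builds its queries.

```python
from typing import Optional


def build_query(keywords: list[str], search_type: str,
                language: Optional[str] = None) -> str:
    """Build a Twitter search query string from keywords and a match mode."""
    if search_type == "all":
        query = " ".join(keywords)                    # every keyword must match
    elif search_type == "any":
        query = "(" + " OR ".join(keywords) + ")"     # at least one keyword
    elif search_type == "exact":
        query = '"' + " ".join(keywords) + '"'        # keywords as one phrase
    elif search_type == "none":
        query = " ".join(f"-{word}" for word in keywords)  # exclude each keyword
    else:
        raise ValueError(f"Unknown search type: {search_type}")
    if language:
        query += f" lang:{language}"
    return query


print(build_query(["apify", "scraper"], "any", language="en"))
```

A query made only of exclusions is rarely useful on its own, so the `none` mode would normally be combined with other terms or filters.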
Fixes:
- Fixed an issue where quoted tweets were not always scraped
- Fixed an issue with `requestsFromUrl`
- Fixed an issue where `view_count` would be undefined for retweets
2023-04-25
Features:
- Add `quote_count` to the output
2023-04-09
Features:
- Add `video_url` or `gif_url` to the `media` object in the output
2023-04-07
Fixes:
- Fixed issue where the browser would get stuck when there are only a couple of tweets
Changes:
- Removed the `browserFallback` option and switched to a method of scraping headers that is faster, doesn't require a lot of resources, and doesn't depend on another actor
2023-04-05
Fixes:
- Fixed issue with extracting info from quote tweets that are retweets
2023-04-04
Fixes:
- Some quoted tweets were not scraped
2023-04-03
Features:
- Add `is_retweet` to the output
Fixes:
- Fixed issue where pinned tweets are not scraped
- Filter replies in advanced search
2023-03-29
Features:
- Add new option `browserFallback`, which enables a fallback to browser-based scraping if Cheerio requests fail (this uses another actor in order to function, so it needs more resources). When enabled, the actor will attempt to use the browser to retrieve tweets and provide the results to Cheerio for parsing. This process will occur automatically for every new request, improving the actor's ability to scrape tweets.
- You can now scrape tweets indefinitely: if you enable the new option `browserFallback`, the actor will try to scrape all the tweets available for the given URL and will stop when it reaches the `tweetsDesired` limit.
Changes:
- Removed the `tweetsDesired` limit; you can now scrape as many tweets as you want
- Add runId to the key-value store to prevent conflicts with other runs
Fixes:
- Fixed retries to scrape tweets that are deleted
- Handle issue where the Twitter website would get stuck
- Fixed issue where `view_count` would be undefined
- Fixed issue where the actor would get stuck on some tweets that are not available
2023-03-15
Features:
- Added `useAdvancedSearch` option to use the advanced search instead of the regular search for content type `Search`; it works with `fromDate`, `toDate`, `searchTerms` and `handles` (usernames). It's disabled by default, and it doesn't scrape retweets.
2023-03-02
Features:
- Added `replying_to_tweet` to the output, which is the link to the tweet that the current tweet is replying to
- Added `is_quote_tweet` and `quoted_tweet` to the output, which is the tweet that the current tweet is quoting
Changes:
- Small code refactoring.
Fixes:
- Fixed issue where Twitter sometimes returns no tweets for a page, which caused the request to finish without collecting all the tweets.
2023-02-02
Changes:
- The Actor now uses Cheerio Crawler for content type 'People'
- Improved `cheerioCrawler` stability.
Fixes:
- Fixed issue where the actor gets stuck on private profiles
- Fixed issue where some requests were not handled properly or not added to the queue
2023-01-25
Changes:
- Increased default `maxRequestRetries` to 6
2023-01-23
Features:
- Add `profilesDesired` option to limit the number of scraped profiles
Changes:
- Improved cheerio scraper by allowing it to scrape tweets using any content type
- Improved logging for info and errors
- The actor now uses the `cheerioCrawler` by default, which is more reliable than the `puppeteerCrawler`
- Removed the `tweetsDesired` max limit, which was 3200 tweets
2023-01-21
Changes:
- Replaced the custom infiniteScroll function with the one provided by the SDK, which is faster and more reliable
- Disabled the page's cache, which increased the total number of scraped tweets by 20-40%
Feature:
- Add `useCheerio` option to scrape tweets using cheerio crawler instead of puppeteer crawler; cheerio is more reliable, but it doesn't work when provided with login cookies
2023-01-19
Fix:
- Updated scrolling to be more efficient and fixed an issue where the scrolling would quit early
- Increased the default `maxIdleTimeoutSecs` to 60, to ensure that all tweets are scraped
Feature:
- Add `startUrl` to output
2023-01-11
Fix:
- Made scrolling slower to reduce the overload on the CPU
- Add a check to stop scrolling if the desired number of tweets is reached
- Increased the default `maxRequestRetries` to 6
2023-01-10
Feature:
- Add option `addTweetViewCount` to include tweet view count (it's hidden and enabled by default)
- Add `view_count` to the output if the option is enabled
2022-08-20
Feature:
- Don't retry non-existing profiles
- More efficient scrolling
- Max concurrency increased to 3 by default
Fix:
- Login modal blocking the scrolling
- More resilient URL inputs and normalization
2022-08-10
Feature:
- Revamp to TypeScript and Crawlee
Fix:
- Hanging timers on CPU overload
2022-02-25
Fix:
- Timeline v2 object
2022-02-10
Feature:
- Added '#sort_index' to the output
- Updated README
2022-02-03
Fix:
- Thread replies
2021-11-03
Fix:
- Search results
2021-08-09
Features:
- Update SDK 2
Bug fixes:
- User shape object for some profiles
2021-07-18
Features:
- Update to SDK 1.3.1
Changes:
- Change default timeout values
- Retiring of broken sessions
- Deals with pinned tweets
- Add debug log
Bug fixes:
- Fix thread extraction
2021-06-12
Features:
- Update to SDK 1.2.1
Fixes:
- New GraphQL format
2021-05-03
Features:
- Update to SDK 1.1.2
- Recursive "People" search
- Tweaks to wording in README and INPUT schema
Bug fixes:
- Filter cookies that lead to never loading page / 401 error
- Fetch data from GraphQL responses
2021-03-18
Features:
- Update to SDK 1.0.2
Fixes:
- Clicking on non-replies buttons
2021-02-26
Features:
- Scrape replies of replies
Fixes:
- Improve scraping stability
2021-02-04
Features:
- Add topics
- Add hashtags URLs
- Optimize end of listings
- Labels for outputScraperFunction for various scraper phases
Fixes:
- Deduplication of tweets
- Force retiring forever failing proxies
2021-01-19
- Add mentions, symbols, URLs and hashtags to output
- Add threads/status links support
2021-01-12
- BREAKING CHANGE: Format of the dataset has changed
- Search multiple terms at once, search hashtags and terms
- Enriched user profile information (some information is only available when logged in)
- Added minimum and max tweet dates
- Updated SDK version
- Custom data
- Powerful extend output / scraper function
2020-11-25
- Remove the need to provide credentials
- Update SDK version
- Allow to filter profile tweets for own tweets or include replies
- Scrape faster when there's no login information
- Accept Twitter URLs, handles or `@usernames` for better user experience
- Throws immediately if invalid handles are passed
Actor Metrics
203 monthly users
33 stars
>99% runs succeeded
38 days response time
Created in Mar 2022
Modified 2 months ago