Twitter URL Scraper

Pay $2.50 for 1,000 posts

quacker/twitter-url-scraper

Copy any Twitter URL and extract Twitter usernames, profile photos, follower count, tweets, hashtags, favorite count, and more. Export scraped datasets, run the scraper via API, schedule and monitor runs or integrate with other tools.

2024-03-07

BREAKING CHANGES

  • For profiles and tweets that are not found, you will now get an item in the dataset with the URL property and the reason this profile/tweet has not been scraped
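
As a rough illustration of how a consumer might handle these placeholder items, the sketch below separates scraped items from not-found ones. The key names `url` and `reason`, the helper `split_dataset_items`, and the sample items are assumptions made for this example; the actual property names in the dataset may differ.

```python
from typing import Any


def split_dataset_items(
    items: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Separate scraped items from not-found placeholder items."""
    scraped: list[dict[str, Any]] = []
    not_found: list[dict[str, Any]] = []
    for item in items:
        # Assumption: a placeholder item carries a "reason" key.
        # The real key name in the dataset may be different.
        if "reason" in item:
            not_found.append(item)
        else:
            scraped.append(item)
    return scraped, not_found


# Made-up sample items, for illustration only.
items = [
    {"url": "https://twitter.com/apify", "username": "apify"},
    {"url": "https://twitter.com/some_missing_profile", "reason": "not found"},
]
scraped, not_found = split_dataset_items(items)
for item in not_found:
    print(item["url"], "was skipped:", item["reason"])
```

Downstream code that assumed every dataset item is a tweet or profile would need a check like this after the change.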

2024-01-04

Fix:

  • Fixed an issue with handling tweet URLs

2023-10-17

Features:

  • Support x.com URLs
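
One way to treat x.com and twitter.com links uniformly is to rewrite the host before further processing. This is only a sketch: the function name `normalize_twitter_url` is hypothetical, and nothing here is confirmed to match how the actor handles these URLs internally.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_twitter_url(url: str) -> str:
    """Rewrite x.com links to twitter.com; leave other URLs unchanged."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Treat "www.x.com" the same as "x.com".
    if host.startswith("www."):
        host = host[len("www."):]
    if host == "x.com":
        return urlunsplit(parts._replace(netloc="twitter.com"))
    return url


print(normalize_twitter_url("https://x.com/apify/status/1"))
```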

2023-08-10

Changes:

  • Removed toDate input option because tweets are no longer ordered and Twitter now limits results to 100 tweets per profile

2023-08-03

Changes:

  • Removed fromDate and relativeFromDate input options

2023-07-12

Fixes:

  • Fixed issue where not all search URLs were working

2023-07-10

Changes:

  • You can now scrape individual tweets, user data, or both by providing the tweet URL

2023-06-30

BREAKING CHANGES: On 30 June 2023, Twitter put all content behind login, except for single tweets, so scraping publicly available Twitter content is no longer possible.

You can have a look at https://apify.com/shanes/tweet-flash or https://apify.com/web.harvester/easy-twitter-search-scraper, Actors that don't require login.

2023-06-22

Changes:

  • Deprecated fromDate and relativeFromDate input options; they will be removed in a few weeks

2023-06-19

Changes:

  • Removed skipPromotedTweets option
  • Updated repliesDepth option to be consistent for profiles and tweets

Fixes:

  • Fixed issue where the actor would get stuck on some tweets that are not available
  • Fixed issue with scraping replies of replies for profiles

2023-06-05

Changes:

  • Disabled skipPromotedTweets due to an issue
  • Removed maxRequestRetries, handlePageTimeoutSecs, maxConcurrency, and maxIdleTimeoutSecs fields from input schema
  • The actor now uses the cheerioCrawler by default, which is more reliable and efficient than the puppeteerCrawler
  • Changed the option collectOriginalTweetOnly to collectOriginalTweetOnly, since it's not only for retweets, it also applies to quoted tweets
  • If the tweetsDesired is set to 0, the actor will only scrape user info and won't scrape any tweets even if addUserInfo is set to false
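
The interaction between tweetsDesired and addUserInfo described in the last item can be summarized as a small decision rule. The sketch below is an interpretation of that changelog entry, not the actor's code: `plan_run` is a hypothetical helper, and the behavior for non-zero tweetsDesired (tweets are scraped, user info follows addUserInfo) is an assumption.

```python
def plan_run(tweets_desired: int, add_user_info: bool) -> dict[str, bool]:
    """Describe what a run would scrape for the given input values."""
    if tweets_desired == 0:
        # Per the changelog: only user info is scraped, even if
        # addUserInfo is false, and no tweets are scraped.
        return {"scrape_user_info": True, "scrape_tweets": False}
    # Assumption: otherwise tweets are scraped and addUserInfo decides
    # whether user info is included.
    return {"scrape_user_info": add_user_info, "scrape_tweets": True}


print(plan_run(tweets_desired=0, add_user_info=False))
```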

2023-05-30

Changes:

  • BREAKING CHANGES: Completely removed extendScraperFunction function from the actor.
  • Deprecated the extendOutputFunction function; it will be supported for a few weeks before being removed completely.

2023-05-29

Changes:

  • Add an experimental option useNewProfileScraper that allows scraping multiple profiles at once faster and more efficiently
  • Add an experimental option useNewTweetsScraper that allows scraping multiple Tweets at once faster and more efficiently
  • Add an option skipPromotedTweets that allows the users to skip promoted tweets

Features:

  • Add is_thread, is_root_thread, and root_thread_url to the output
  • Add includeThreadsOnly option to include only threads in the output
  • Add username and user_id fields to output, if addUserInfo is false

2023-05-09

Features:

  • Add repliesDepth option to scrape replies of replies based on the depth provided

Changes:

  • Removed addTweetViewCount option; the view count is now always scraped

2023-05-05

Features:

  • For retweets now the retweet object will contain the original tweet, and the tweet object will contain the retweet
  • Add skipRetweets option to skip retweets
  • Add collectOriginalTweetOnly option to collect only the original retweet

2023-04-28

Features:

  • The full text of truncated tweets is now scraped
  • Add is_truncated to the output
  • Add the option tweetsLanguage to search tweets by language
  • Add the option keywordsSearchType to search for tweets that contain all the keywords or any of them or the exact phrase or none of them
  • Add relativeToDate and relativeFromDate to the input to search for tweets using relative dates instead of absolute dates
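
To make the four keywordsSearchType modes concrete, the sketch below maps each one to the standard Twitter search operators (space for all, `OR` for any, quotes for an exact phrase, a leading `-` for exclusion). The mode names `all`, `any`, `exact`, and `none` and the helper `build_query` are assumptions for illustration; the option's real values and the actor's query construction may differ.

```python
def build_query(keywords: list[str], search_type: str) -> str:
    """Build a Twitter search query string for a keyword matching mode."""
    if search_type == "all":
        # Every keyword must appear.
        return " ".join(keywords)
    if search_type == "any":
        # At least one keyword must appear.
        return " OR ".join(keywords)
    if search_type == "exact":
        # The keywords must appear together as one phrase.
        return '"' + " ".join(keywords) + '"'
    if search_type == "none":
        # None of the keywords may appear.
        return " ".join(f"-{keyword}" for keyword in keywords)
    raise ValueError(f"Unknown search type: {search_type}")


for mode in ("all", "any", "exact", "none"):
    print(mode, "->", build_query(["web", "scraping"], mode))
```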

Fixes:

  • Fixed an issue where quoted tweets were not always scraped
  • Fixed an issue with requestsFromUrl
  • Fixed issue where view_count would be undefined for retweets

2023-04-25

Features:

  • Add quote_count to the output

2023-04-09

Features:

  • Add video_url or gif_url to media object in the output

2023-04-07

Fixes:

  • Fixed issue where the browser would get stuck when there are only a couple of tweets

Changes:

  • Removed browserFallback option and switched to a method of scraping headers that is faster, doesn't require a lot of resources, and doesn't depend on another actor

2023-04-05

Fixes:

  • Fixed issue with extracting info from quote tweets that are retweets

2023-04-04

Fixes:

  • Some quoted tweets were not scraped

2023-04-03

Features:

  • Add is_retweet to the output

Fixes:

  • Fixed issue where pinned tweets are not scraped
  • Filter replies in advanced search

2023-03-29

Features:

  • Add new option browserFallback, which enables a fallback to browser-based scraping if Cheerio requests fail (this relies on another actor, so it uses more resources). When enabled, the actor will attempt to use the browser to retrieve tweets and provide the results to Cheerio for parsing. This happens automatically for every new request, improving the actor's ability to scrape tweets.
  • With browserFallback enabled, you can scrape tweets indefinitely: the actor tries to scrape all the tweets available for the given URL and stops when it reaches the tweetsDesired limit.

Changes:

  • Remove tweetsDesired limit; you can now scrape as many tweets as you want
  • Add runId to the key-value store to prevent conflicts with other runs

Fixes:

  • Fixed retries when scraping tweets that are deleted
  • Handle issue where the Twitter website would get stuck
  • Fixed issue where view_count would be undefined
  • Fixed issue where the actor would get stuck on some tweets that are not available

2023-03-15

Features:

  • Added useAdvancedSearch option to use the advanced search instead of the regular search for content type Search. It works with fromDate, toDate, searchTerms, and handles (usernames). It's disabled by default, and it doesn't scrape retweets.

2023-03-02

Features:

  • Added replying_to_tweet to the output, which is the link to the tweet that the current tweet is replying to
  • Added is_quote_tweet and quoted_tweet to the output, which is the tweet that the current tweet is quoting

Changes:

  • Small code refactoring.

Fixes:

  • Fixed issue where Twitter sometimes returns no tweets for a page, which was causing the request to finish without collecting all the tweets.

2023-02-02

Changes:

  • The Actor now uses Cheerio Crawler for content type 'People'
  • Improved cheerioCrawler stability.

Fixes:

  • Fixed issue where the actor got stuck on private profiles
  • Fixed issue where some requests were not handled properly or not added to the queue

2023-01-25

Changes:

  • Increased default maxRequestRetries to 6

2023-01-23

Features:

  • Add profilesDesired option to limit the number of scraped profiles

Changes:

  • Improved cheerio scraper by allowing it to scrape tweets using any content type
  • Improved logging for info and errors
  • The actor now uses the cheerioCrawler by default, which is more reliable than the puppeteerCrawler
  • Removed the tweetsDesired max limit, which was 3200 tweets

2023-01-21

Changes:

  • Replaced the custom infiniteScroll function with the one provided by the SDK, which is faster and more reliable
  • Disabled the page's cache, which increased the total number of scraped tweets by 20-40%

Feature:

  • Add useCheerio option to scrape tweets using the cheerio crawler instead of the puppeteer crawler. Cheerio is more reliable, but it doesn't work when provided with login cookies

2023-01-19

Fix:

  • Updated scrolling to be more efficient and fixed an issue where the scrolling would quit early
  • Increased the default maxIdleTimeoutSecs to 60, to ensure that all tweets are scraped

Feature:

  • Add startUrl to output

2023-01-11

Fix:

  • Made scrolling slower to reduce the overload on the CPU
  • Add a check to stop scrolling if the desired number of tweets is reached
  • Increased the default maxRequestRetries to 6

2023-01-10

Feature:

  • Add option addTweetViewCount to include tweet view count (it's hidden and enabled by default)
  • Add view_count to the output if the option is enabled

2022-08-20

Feature:

  • Don't retry non-existing profiles
  • More efficient scrolling
  • Max concurrency increased to 3 by default

Fix:

  • Login modal blocking the scrolling
  • More resilient URL inputs and normalization

2022-08-10

Feature:

  • Revamp to Typescript and Crawlee

Fix:

  • Hanging timers on CPU overload

2022-02-25

Fix:

  • Timeline v2 object

2022-02-10

Feature:

  • Added '#sort_index' to the output
  • Updated README

2022-02-03

Fix:

  • Thread replies

2021-11-03

Fix:

  • Search results

2021-08-09

Features:

  • Update to SDK 2

Bug fixes:

  • User shape object for some profiles

2021-07-18

Features:

  • Update to SDK 1.3.1

Changes:

  • Change default timeout values
  • Retiring of broken sessions
  • Deals with pinned tweets
  • Add debug log

Bug fixes:

  • Fix thread extraction

2021-06-12

Features:

  • Update to SDK 1.2.1

Fixes:

  • New GraphQL format

2021-05-03

Features:

  • Update to SDK 1.1.2
  • Recursive "People" search
  • Tweaks to wording in README and INPUT schema

Bug fixes:

  • Filter cookies that lead to never loading page / 401 error
  • Fetch data from GraphQL responses

2021-03-18

Features:

  • Update to SDK 1.0.2

Fixes:

  • Clicking on non-replies buttons

2021-02-26

Features:

  • Scrape replies of replies

Fixes:

  • Improve scraping stability

2021-02-04

Features:

  • Add topics
  • Add hashtags URLs
  • Optimize end of listings
  • Labels for outputScraperFunction for various scraper phases

Fixes:

  • Deduplication of tweets
  • Force retiring forever failing proxies

2021-01-19

  • Add mentions, symbols, URLs and hashtags to output
  • Add threads/status links support

2021-01-12

  • BREAKING CHANGE: Format of the dataset has changed
  • Search multiple terms at once, search hashtags and terms
  • Enriched user profile information (some information is only available when logged in)
  • Added minimum and max tweet dates
  • Updated SDK version
  • Custom data
  • Powerful extend output / scraper function

2020-11-25

  • Remove the need to provide credentials
  • Update SDK version
  • Allow to filter profile tweets for own tweets or include replies
  • Scrape faster when there's no login information
  • Accept twitter URLs, handles or @usernames for better user experience
  • Throws immediately if invalid handles are passed

Developer

Maintained by Apify

Actor metrics

  • 345 monthly users
  • 90.9% runs succeeded
  • 12.1 days response time
  • Created in Mar 2022
  • Modified about 1 month ago