TikTok Discover Scraper

This Actor is paid per event

clockworks/tiktok-discover-scraper

Scrape TikTok Discover data. Just add one or more hashtags and the scraper will extract related videos, tag breadcrumbs, similar trends, and subtopics. Export scraped data, run the scraper via API, schedule and monitor runs, or integrate with other tools.

This changelog summarizes all changes of the TikTok actors provided by the Clockworks organization. The specific actors that are affected are listed for each change.

2024-11-11

Features

  • Can now download profile avatars (tiktok-profile, tiktok-paid) and music cover images (tiktok-sound, tiktok-paid)
  • Can now filter profile videos with a "From:" date filter (to scrape old videos)

2024-10-29

Fixes

  • Can now scrape slideshows from shortened links again

2024-10-27

Changes

  • Slightly decreased the number of collected hashtag videos (by up to 25%), but the cost should drop roughly threefold (tiktok-paid)

2024-10-14

Fixes

  • Profile info is now pushed when a user only has pinned videos that do not match date filters

2024-10-11

Fixes

  • Fixed rare cases of detecting a profile as "not found", despite it existing, and of collecting 0 videos

2024-10-03

Changes

  • Slightly improved waiting times when scraping profiles and post details

2024-09-30

Changes

  • Hashtag scraping logic is updated to collect more results, although at a lower speed

Features

  • Added extraction of post's "Paid partnership" status under the new isSponsored output field
    • (Works for all input options, except for search by queries, because of a limitation on TikTok's side)
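
As a rough illustration of using the new field, the sketch below splits exported items by isSponsored. The helper name and the sample items are made up; only the isSponsored field comes from the entry above, and items from search-by-query runs are assumed to simply lack it.

```python
def split_by_sponsorship(items):
    """Partition dataset items into (sponsored, organic, unknown)."""
    sponsored, organic, unknown = [], [], []
    for item in items:
        flag = item.get("isSponsored")
        if flag is True:
            sponsored.append(item)
        elif flag is False:
            organic.append(item)
        else:
            # Assumption: the field is absent when TikTok does not expose it.
            unknown.append(item)
    return sponsored, organic, unknown


# Hypothetical sample items, not real scraper output.
sample = [
    {"id": "1", "isSponsored": True},
    {"id": "2", "isSponsored": False},
    {"id": "3"},
]
sponsored, organic, unknown = split_by_sponsorship(sample)
```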

2024-09-20

Fixes

  • Profiles can now scrape more than 30 videos (tiktok-profile, tiktok-paid, tiktok-data-extractor)
  • Geo-blocking (e.g. for India) is now detected and an attempt to rotate proxies is made

2024-09-17

Fixes

  • Captcha when scraping profiles is now quickly detected and RESIDENTIAL proxy is used to bypass it (tiktok-profile)

2024-09-11

Features

  • Posts now have a profile link under authorMeta

2024-09-03

BREAKING

  • Info for empty profiles is now under the authorMeta property

Fixes

  • The input field is now correct for empty profiles

2024-08-28

Fixes

  • Mentions are now correctly extracted

2024-08-19

Fixes

  • Reduced the number of cases where fewer than 100 results were scraped from a sound page even though more videos existed (tiktok-sound)

2024-08-08

Fixes

  • Fixed a very rare case when a profile with just 1 video would get stuck in an infinite retry loop (tiktok-profile)

Features

  • A new input option to exclude pinned posts from profiles is now present (tiktok-profile, tiktok-paid, tiktok-data-extractor)

2024-08-06

Fixes

  • Profile scraping has been fixed for certain proxy groups (tiktok-profile)

2024-07-26

Changes

  • Somewhat improved speed of comment scraping (tiktok-comments-scraper)

2024-07-19

Fixes

  • Hashtag scraping is more consistent at scraping at least 100 results (if it has this many videos) (tiktok-hashtag, tiktok-paid, tiktok-data-extractor)

2024-07-16

Fixes

  • Hashtag scrape should now again scrape at least 100 results (tiktok-hashtag, tiktok-paid, tiktok-data-extractor)

2024-07-12

Fixes

  • Fixed rare cases when the whole account scrape would fail if a profile contained a faulty video (tiktok-profile)
  • Handled rare cases when profile scraping would get stuck with "No videos returned, but cursor is updated, retrying..." (tiktok-profile)
  • Handled rare cases when a profile would time out waiting for initial user details (tiktok-profile)

2024-07-11

Fixes

  • Reduced number of cases when search scrapes timed out without results (tiktok-paid, tiktok-data-extractor)
  • Pinned posts are scraped again

2024-07-04

Fixes

  • input output field is now present again for missing profiles (tiktok-profile)

2024-07-01

Fixes

  • Improved success rate for search (tiktok-paid, tiktok-data-extractor)

2024-06-26

Fixes

  • Slight improvement of pagination logic (tiktok-comments)
  • input field is back for output items (tiktok-profile)

2024-06-13

Fixes

  • Search scrapes again collect more than one page of results (tiktok-paid, tiktok-data-extractor)
  • Profiles are now scraped in full again (tiktok-profile)

2024-06-11

Fixes

  • Updated Crawlee version to avoid "TargetClosedError" error and added retries in case it happens anyway
  • Fixed cases when comment scrape would end too early (tiktok-comments)

2024-06-10

Fixes

  • Improved video download for some proxy groups
  • Fixed cases when covers would fail to be downloaded
  • Hashtag scrape will now try scraping more results: up to 800-900 for some tags as opposed to 20-50 previously (tiktok-hashtag)

2024-06-07

Fixes

  • Covers are now stored correctly in the Key-Value store
  • Eliminated rare cases where the scraper reported a video as not found even though it exists

2024-06-04

Changes

  • If you provide an incorrect URL, the actor won't fail, but will push an error result to the dataset and carry on with valid URLs

2024-05-31

Fixes

  • Fixed rare crash occurrence when search crawling fails (tiktok-search)

2024-05-29

Fixes

  • Fixed account search functionality (tiktok-paid, tiktok-data-extractor)
  • Fixed scraping of posts from shortened post URLs

2024-05-27

Changes

  • During a run, you can see periodic aggregated statistics in the status message: how many profiles/hashtags/etc. have been finished and how many media files have been downloaded.

2024-05-28

Changes

  • Downloaded media now have the author's name and date posted in the 2nd and 3rd segments of filenames

2024-05-27

Fixes

  • Correctly push results for accounts with no videos (tiktok-profile)

2024-05-24

Fixes

  • A rare occurrence when profile scraping would end prematurely is now fixed (tiktok-profile)

2024-05-20

Fixes

  • Fixed a bug when the scraper would fail if a profile contained sensitive content (tiktok-profile)
  • Fixed an early scraper finish related to the above bug (tiktok-profile)
  • Fixed scraping of slideshows with vm. URL shortened format (tiktok-video)

2024-05-07

Features

  • You can now select a proxy country under the "Proxy settings" section if you want to scrape geolocked videos. This will increase the cost of scraping (tiktok-paid)

2024-05-03

Fixes

  • Data duplication is prevented in some cases of post detail scraping (tiktok-video)

2024-05-02

Fixes

  • Video download links are now not created for slideshow posts

2024-04-30

Fixes

  • Fixed faulty video download in some cases

2024-04-16

Fixes

  • When downloading videos, fixed cases when the actor would crash from a discovered malformed URL

2024-04-08

Fixes

  • Posts with sensitive content are now correctly handled again (tiktok-video, tiktok-data-extractor, tiktok-paid)

2024-04-05

Fixes

  • Improved navigation speed for post scraping (tiktok-video, tiktok-data-extractor, tiktok-paid)

2024-04-04

Fixes

  • Fixed cases when input terms/queries would not be listed in the output items (tiktok-profile, tiktok-paid, tiktok-data-extractor)
  • Author info is collected fully, if it is missing

2024-04-03

Fixes

  • Fixed a bug when the actor would fail for some profile links that do not exist (tiktok-profile)

Features

  • Subtitle download is now supported (where it's present on the video) (tiktok-video, tiktok-paid, tiktok-free, tiktok-sound, tiktok-hashtag, tiktok-profile)

2024-04-01

Changes

  • Download logic changes to speed up the process (*)

2024-03-25

BREAKING CHANGES

  • Max memory limit is now 4GB (tiktok-data-extractor)

Features

  • If a profile doesn't have videos that match your date filter, it will push an item to the dataset anyway with account information (tiktok-profile)

2024-03-22

Fixes

  • Video downloading is restored (tiktok-profile, tiktok-hashtag)

2024-03-21

Fixes

  • Cases where search returns a captcha are now retried (previously the scraper reported 0 results) (tiktok-search)

Features

  • Erroneous outputs (e.g. empty profiles or not-found hashtags) now contain an input field that matches your exact input for that item (e.g. profile name or URL)

2024-03-19

Features

  • For comments you now can see a link to the commenter's avatar thumbnail (tiktok-comments)

2024-03-06

Fixes

  • For some proxy groups the reliability of profile scraping has been improved (tiktok-profile)

2024-03-05

BREAKING CHANGES

  • For profiles, hashtags, and posts that are not found, and for search queries that return no results, you will now get an item in the dataset with the URL property and the reason this profile/tag has not been scraped

2024-02-22

Fixes

  • Videos are now properly downloaded again (*)

2024-02-21

Fixes

  • A rare case of the Actor failing is now fixed (tiktok-comments)

2024-02-19

Fixes

  • API boosting has been unblocked

2024-02-12

Changes

  • API boosting is temporarily disabled as TikTok has introduced some blocking that is being worked on (tiktok-hashtag, tiktok-profile)

2024-01-31

Fixes

  • Hashtag name from input is now again in the output items (searchHashtag) (tiktok-hashtag)

2024-01-29

Changes

  • New feature for improved scraping speed is now applied by default for 50% of users as part of A/B testing (tiktok-profile, tiktok-hashtag)

2024-01-29

Fixes

  • URLs that contain "/photo/" in them are now correctly handled (tiktok-video)
  • Several improvements for comment scraping that prevent the actor from getting stuck (tiktok-comments)

2024-01-17

Changes

  • Input fields for enabling experimental API scraping have been removed. This behaviour will be enabled by default once it is more polished, and for now you can manually enable it by passing in "enableCheerioBoost": true in the JSON editor.
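
A minimal sketch of what such a JSON input could look like, built in Python for convenience. Only the enableCheerioBoost key comes from the entry above; the hashtags field and its value are placeholders for whatever input you already use.

```python
import json


def with_cheerio_boost(actor_input):
    """Return a copy of the input with the experimental flag switched on."""
    boosted = dict(actor_input)
    boosted["enableCheerioBoost"] = True
    return boosted


# "hashtags" is a hypothetical stand-in for your existing input fields.
base_input = {"hashtags": ["cooking"]}
boosted_input = with_cheerio_boost(base_input)

# Paste the printed JSON into the actor's JSON editor.
print(json.dumps(boosted_input, indent=2))
```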

2024-01-04

Features

  • Experimental boosting can now be applied to hashtags. As with profile boosting (see 2023-12-29), tick the option "Boost hashtag scraping" and a faster version will be deployed (tiktok-hashtag)

2023-12-29

Features

  • Boosting for profiles is back: tick the "Boost profile scraping" option and it will be faster. The downside is that TikTok might suddenly change its website, in which case this way of scraping will likely start failing. Under the hood it uses API requests as opposed to launching a full-blown browser simulation. (tiktok-profile)

2023-12-15

Fixes

  • Fixed an issue with validating profile URLs. (tiktok-profile, tiktok-comments, tiktok-paid, tiktok-free)

2023-12-15

Fixes

  • Fixed bug where waiting for page reload when blocked could crash the actor with TimeoutError. (All TikTok actors)

2023-12-14

Fixes

  • Fixed a bug where the Actor was failing to query author data. (All TikTok actors)
  • Fixed issue where the actor sometimes gets stuck (tiktok-profile, tiktok-comments, tiktok-paid, tiktok-free)

2023-12-12

Fixes

  • Fixed a bug where the Actor was failing to scrape profiles. (All TikTok actors)

Changes

  • Plain HTTP stopped working for start URLs, so a full browser is now used for everything. This will make profiles -> posts runs significantly slower and more costly. We are looking into options on how to enable plain HTTP again

2023-11-18

Fixes

  • Fixed a bug where scraping comments and profiles failed. (tiktok-profile, tiktok-comments, tiktok-paid)

2023-09-29

Features

  • You can now search for profiles (Accounts) by username. You can pass a list of "Profile Queries" (profilesQueries) and set a limit for each query by providing the "Max profiles per query" (maxProfilesPerQuery). The actor will then search for the profiles and scrape the videos of each profile. (tiktok-profile)
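
For illustration, an input using these two fields might be assembled as below. The helper name, the validation rules, and the sample queries are assumptions made for this sketch; only profilesQueries and maxProfilesPerQuery come from the entry above.

```python
def build_profile_search_input(queries, max_profiles_per_query):
    """Assemble an input dict for searching profiles by username."""
    # Assumption: blank queries are useless, so drop them up front.
    cleaned = [q.strip() for q in queries if q.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty profile query is required")
    if max_profiles_per_query < 1:
        raise ValueError("maxProfilesPerQuery must be at least 1")
    return {
        "profilesQueries": cleaned,
        "maxProfilesPerQuery": max_profiles_per_query,
    }


# Hypothetical queries: up to 5 matching profiles are requested for each.
search_input = build_profile_search_input([" chef ", "", "baker"], 5)
```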

2023-08-29

Fixes

  • Cheerio has been unblocked. You can now scrape posts quickly again if you request <= 15 videos for tiktok-hashtag or tiktok-sound, or <= 30 for tiktok-profile

Features

  • You can now disable cheerio/HTTP querying in the input by setting disableCheerioBoost and disableEnrichAuthorStats to true. This will help if the problem with quick scraping arises ever again. (tiktok-hashtag, tiktok-profile, tiktok-sound, tiktok-free, tiktok-paid)

2023-08-21

Changes

  • Boosting with Cheerio and additional querying for missing author stats have been temporarily disabled. They will be re-enabled once TikTok blocking is bypassed

2023-08-15

Fixes

  • The crawler will rotate proxies if a sound is unavailable under the current country (tiktok-sound)

2023-08-14

Features

  • If a sound is blocked in some country, the scraper will retry (tiktok-sound)

2023-08-10

Features

  • You can now download slideshow images by toggling a corresponding input (all, except for tiktok-comments)
  • Output videos now have a field telling whether they are slideshows (all, except for tiktok-comments)
  • If fast Cheerio crawler fails to load a page even with retries, the slower crawler will be used as a fallback (all, except for tiktok-comments)
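
The fallback described in the last item can be pictured roughly as follows. This is a generic sketch, not the actor's code: the function names, the PageLoadError type, and the attempt count are all invented for the example.

```python
class PageLoadError(Exception):
    """Raised by a fetcher that could not load the page (hypothetical)."""


def fetch_with_fallback(url, fast_fetch, slow_fetch, fast_attempts=3):
    """Try the fast fetcher a few times, then fall back to the slow one."""
    for _ in range(fast_attempts):
        try:
            return fast_fetch(url), "fast"
        except PageLoadError:
            continue  # retry the fast path until attempts run out
    return slow_fetch(url), "slow"


# Stand-ins for a plain HTTP fetcher and a full browser fetcher.
def always_failing_fast(url):
    raise PageLoadError(url)


def slow_browser(url):
    return f"<html>{url}</html>"


html, path_used = fetch_with_fallback(
    "https://www.tiktok.com/tag/example", always_failing_fast, slow_browser
)
```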

2023-07-28

Features

  • Output now contains submittedVideoUrl in addition to webVideoUrl. It copies the post url in the input and may differ from the webVideoUrl, but both would lead to the same post. Will be present if you input direct post URLs. (tiktok-video, tiktok-comments-scraper)
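
One use for the pair of fields is mapping results back to the links you submitted. The sketch below assumes items are plain dicts; the sample URLs are invented and only the two field names come from the entry above.

```python
def index_by_submitted_url(items):
    """Map each submitted post URL to the canonical webVideoUrl it resolved to."""
    return {
        item["submittedVideoUrl"]: item["webVideoUrl"]
        for item in items
        if "submittedVideoUrl" in item  # only present for direct post URLs
    }


# Hypothetical items: one from a direct post URL, one from another input type.
items = [
    {
        "submittedVideoUrl": "https://vt.tiktok.com/EXAMPLE/",
        "webVideoUrl": "https://www.tiktok.com/@someuser/video/1234567890",
    },
    {"webVideoUrl": "https://www.tiktok.com/@other/video/987654321"},
]
resolved = index_by_submitted_url(items)
```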

2023-07-26

Fixes

  • Videos with sensitive content, which require a login, are now skipped gracefully (tiktok-video, tiktok-comments-scraper)

2023-07-25

BREAKING CHANGES

  • Proxies have been removed from input. Apify's datacenter proxies are always chosen, as they used to be by default (for all scrapers)
  • Scrape info about private/empty channels has also been removed from input, and set to true by default. If you applied this option and set it to false previously, you should experience no changes (tiktok-profile, tiktok-paid, tiktok-free)

Changes

  • The maximum memory is now limited to 4096 MBs for all pay-per-result actors (tiktok-hashtag, tiktok-profile, tiktok-sound, tiktok-video)

2023-07-11

Fixes

  • Now URLs in the format of https://www.tiktok.com/t/.../ are also recognized as post URLs.

2023-07-04

Features

  • Now URLs in the format of vt.tiktok.com are also supported as post URLs.
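
To pre-check a list of links before a run, something like the following could be used. It is a guess at the URL shapes this changelog mentions as post links (vt.tiktok.com here, plus the https://www.tiktok.com/t/.../ form, the vm. short links, and "/photo/" paths noted in other entries), not the actor's actual validation logic.

```python
import re
from urllib.parse import urlparse

# Short-link hosts mentioned in this changelog.
SHORT_HOSTS = {"vt.tiktok.com", "vm.tiktok.com"}

# Assumed long-form shapes: /t/<code>/ and /@<user>/video|photo/<digits>.
LONG_POST_PATH = re.compile(r"^/(t/[^/]+|@[^/]+/(video|photo)/\d+)/?$")


def looks_like_post_url(url):
    """Heuristically decide whether a URL points at a single TikTok post."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in SHORT_HOSTS:
        return parsed.path.strip("/") != ""
    if host in {"tiktok.com", "www.tiktok.com"}:
        return bool(LONG_POST_PATH.match(parsed.path))
    return False
```

With this heuristic a profile link such as https://www.tiktok.com/@someuser is rejected, while a short link with a non-empty path is accepted.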

2023-06-29

Fixes

  • Now correctly utilizes proxy settings when boosting the scraper with Cheerio and querying for author stats. Previously it would often fail. (tiktok-profile-scraper, tiktok-hashtag-scraper, tiktok-sound-scraper)

2023-06-26

Fixes

  • Fixed a bug when in some cases it would not return any comments. (tiktok-comments-scraper)
  • Fixed a bug during reply counting. Previously it would sometimes stop too early, especially if the requested number is low. (tiktok-comments-scraper)
  • If the author stats are missing, the scraper will now make an additional quick request to the author page to get them. These stats get cached, so the query is made only one time. (tiktok-profile-scraper, tiktok-hashtag-scraper, tiktok-sound-scraper)
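
The caching behaviour in the last item amounts to memoizing one lookup per author. A minimal sketch, with an invented class name and a fake fetcher standing in for the real request to the author page:

```python
class AuthorStatsCache:
    """Fetch stats for each author at most once and reuse the result."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: username -> stats dict
        self._cache = {}

    def get(self, username):
        if username not in self._cache:
            self._cache[username] = self._fetch(username)
        return self._cache[username]


# Fake fetcher that records how often it is called (hypothetical data).
calls = []


def fake_fetch(username):
    calls.append(username)
    return {"followers": 42}


cache = AuthorStatsCache(fake_fetch)
cache.get("alice")
cache.get("alice")  # served from the cache, no second request
cache.get("bob")
```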

2023-06-23

Features

  • You can now scrape replies. Note that currently it's not guaranteed that all of them are going to get scraped. (tiktok-comments-scraper)
  • Scraping is up to 4x faster if you limit maxResultsPerPage to 30 for posts and 15 for hashtags. This is because the scraper can utilize non-browser requests without the need to scroll. (tiktok-profile-scraper, tiktok-hashtag-scraper, will be added to tiktok-scraper once it is converted to a price-per-result system).
  • Added oldestPostDate and scrapeLastNDays to only scrape posts from now up to a certain date. (tiktok-profile-scraper)
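
To make the date options concrete, here is one way the cutoff could be derived. How the actor combines oldestPostDate and scrapeLastNDays when both are given is not stated; this sketch assumes the more recent (stricter) cutoff wins, and the function names are invented.

```python
from datetime import date, timedelta


def effective_cutoff(today, oldest_post_date=None, scrape_last_n_days=None):
    """Return the earliest post date to keep, or None if no filter is set."""
    candidates = []
    if oldest_post_date is not None:
        candidates.append(oldest_post_date)
    if scrape_last_n_days is not None:
        candidates.append(today - timedelta(days=scrape_last_n_days))
    # Assumption: when both options are set, the later date applies.
    return max(candidates) if candidates else None


def within_window(post_date, cutoff):
    """Keep a post if it is no older than the cutoff."""
    return cutoff is None or post_date >= cutoff


cutoff = effective_cutoff(date(2023, 6, 23), date(2023, 6, 1), 7)
```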

2023-05-25

Features

2023-05-19

Features

  • You can now enable the flag in input (scrapeEmptyChannelInfo) that will allow you to save info about private or empty channels even if they don't contain any posts (tiktok-profile-scraper)

2023-04-19

Fixes

  • Won't wait for the first XHR response with hashtags if the data in the initial script is enough.
  • Doesn't try to retry on 404 pages or private accounts anymore

2023-04-12

Features

  • Progress now tracks the last video reached while scrolling at a certain page (both for hashtags and profiles) and the comments scraped for a post. If scrolling fails (e.g. due to a captcha) the crawler will try to restore the scroll at the last video/comment, so as not to scroll all the way down again. If restoration fails though, it should fall back to the old behavior and print a warning.
  • Change default shouldRetryStuckComments to true, since now it's possible to restore scroll in such cases.
  • When comment or post list crawlers manage to scrape new videos by scrolling, retryCount is reset to 0.
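
The interplay of scroll restoration and the retryCount reset can be sketched as a loop like the one below. It is a simplified model under stated assumptions: load_batch is a hypothetical callback that resumes after a given video ID and raises RuntimeError when the page is blocked.

```python
def scrape_with_restore(load_batch, max_retries=3):
    """Collect IDs batch by batch, resuming after the last one on failure."""
    collected = []
    retry_count = 0
    while True:
        resume_from = collected[-1] if collected else None
        try:
            batch = load_batch(resume_from)
        except RuntimeError:
            retry_count += 1
            if retry_count > max_retries:
                break  # give up and return the partial results
            continue
        if not batch:
            break  # nothing more to load
        collected.extend(batch)
        retry_count = 0  # progress was made, so reset the counter
    return collected


# Scripted fake loader: two videos, a captcha, one more video, then the end.
resume_points = []
script = iter([["v1", "v2"], "blocked", ["v3"], []])


def fake_loader(resume_from):
    resume_points.append(resume_from)
    step = next(script)
    if step == "blocked":
        raise RuntimeError("captcha")
    return step


videos = scrape_with_restore(fake_loader)
```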

Fixes

  • The hashtag route now catches the initial XHR response, which usually adds up to 15 new videos that previously weren't scraped
  • Added failed request handler to post detail route, so that no matter what error happens during comment scrape, it's going to push partial results if retries are exhausted.

2023-04-03

Features

  • New option to download cover images. Can also specify an optional KVStore name (shared with video download). Analogous to video download from 2023-03-27. If opted in, will replace the link in videoMeta.coverUrl with link to kvStore. Also added originalCoverUrl pointing to tiktok CDN cover url.

2023-03-27

Features

  • New option to download videos. Can also specify optional KVStore name. If opted in, will replace the link in mediaUrls with link to kvStore. Will also place downloadAddr in videoMeta pointing to kvstore and originalDownloadAddr pointing to tiktok CDN
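
The field changes described here can be pictured as the transformation below. It is only a sketch: the helper name and URLs are made up, mediaUrls is assumed to be a list, and the CDN link is passed in explicitly because the entry does not say where it originally lives.

```python
def rewrite_after_download(item, cdn_url, kv_store_url):
    """Return a copy of the item pointing at the stored video."""
    updated = dict(item)
    updated["mediaUrls"] = [kv_store_url]  # assumption: a list of links
    video_meta = dict(item.get("videoMeta", {}))
    video_meta["downloadAddr"] = kv_store_url
    video_meta["originalDownloadAddr"] = cdn_url
    updated["videoMeta"] = video_meta
    return updated


# Hypothetical item and URLs.
original = {"id": "123", "videoMeta": {"format": "mp4"}}
stored = rewrite_after_download(
    original,
    cdn_url="https://cdn.example/video-123.mp4",
    kv_store_url="https://kv.example/records/video-123.mp4",
)
```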

2023-03-07

Fixes

  • Hotfixed broken post URLs

2023-03-06

Fixes

  • Don't stop scrolling if there are still more videos to load (this happened when the initial video count was lower than expected; the scraper now checks dynamically whether there are more videos to load)

2023-02-22

Fixes

  • Don't get stuck if all videos or comments were already loaded before scrolling (this bug happened when there were fewer than 20 videos or comments)

2023-02-03

Fixes

  • Improved scrolling for videos that got stuck on 30 videos loaded. It is still a bit clunky and there is a lot of blocking but should work with few retries.

2023-01-27

Features

  • loginCookies are no longer required for scraping comments
  • The comment crawling is rewritten to scrape from underneath the post
  • no login sessions are created or managed anymore by the actor
  • loginCookies on input are deprecated and generate warning in the log

2022-08-14

Features

  • Output was updated to include more properties. New properties:
{
  "locationCreated": "CA",
  "isAd": false,
  "isMuted": false,
  "authorMeta": {
    "bioLink": "https://www.thewhiskyexplorer.ca",
    "commerceUserInfo": {
      "commerceUser": true,
      "category": "Food/Beverage",
      "categoryButton": false
    },
    "isUnderAge18": false,
    "privateAccount": false,
    "region": "CA",
    "roomId": "",
    "ttSeller": false
  },
  "musicMeta": {
    "coverMediumUrl": ...,
    "musicId": "7105825676251351814"
  },
  "videoMeta": {
    "coverUrl": ...,
    "definition": "720p",
    "format": "mp4"
  }
}
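
As a hedged example of consuming these properties, the helper below pulls a few of them into a flat summary. The function name and the summary keys are invented; the sample item reuses values from the listing above and omits the elided URL fields.

```python
def summarize_post(item):
    """Flatten a few of the newer output properties into one dict."""
    author = item.get("authorMeta", {})
    commerce = author.get("commerceUserInfo", {})
    video = item.get("videoMeta", {})
    return {
        "country": item.get("locationCreated"),
        "is_ad": item.get("isAd", False),
        # Only report a category for accounts flagged as commerce users.
        "business_category": (
            commerce.get("category") if commerce.get("commerceUser") else None
        ),
        "definition": video.get("definition"),
    }


sample_item = {
    "locationCreated": "CA",
    "isAd": False,
    "authorMeta": {
        "commerceUserInfo": {"commerceUser": True, "category": "Food/Beverage"},
        "region": "CA",
    },
    "videoMeta": {"definition": "720p", "format": "mp4"},
}
summary = summarize_post(sample_item)
```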

2022-02-01

Features

  • Add the possibility to scrape comments under login using loginCookies
  • Add login session management to avoid blocking of the account used for scraping comments
  • The searched hashtag and the number of views for this hashtag are now stored in the output

Fixes

  • Actor does not deduplicate videos for different hashtags - improves the accuracy of the number of outputted items
  • Improved logs

2022-01-14

Fixes

  • fixed scraping of authorMeta data from scripts - affected: post urls and first batch of videos for hashtag and profile

2022-01-10

Fixes

  • updated scraping of hashtags and profiles - TikTok randomly displays two types of scripts with data
  • fixed number of output items to be the same as resultsPerPage
  • fixed Timed out error when waiting for the xhr response with data - the scraper now scrolls until it receives the response or the waiting/scrolling times out
  • more readable error messages
  • fixed progress caching
  • empty strings are no longer accepted as hashtag, postUrl or profile

2022-01-04

Fixes

  • updated scraping of individual posts - TikTok randomly displays two types of scripts with data

2021-10-20

Fixes

  • when page.waitingForResponse times out, it retires the session and restarts the browser. This should prevent looping of timeouts on request retries

Features

  • TikTok sometimes sends a request for the same data twice. This behavior won't affect the total number of output items specified on input. (Also, duplicate videos for a hashtag/profile search will be scraped only once and won't count toward the number of output items for that specific search)
  • Sometimes there are more than 6 videos loaded on the search page. The scraper won't push them into the outputted results, so that the number of results remains consistent according to the specification on input.
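
The two rules above (skip duplicates without counting them, and never exceed the requested count) can be sketched as follows. The function name and the "id" key used for deduplication are assumptions for this example.

```python
def collect_unique(batches, limit):
    """Gather up to `limit` unique videos across possibly repeated batches."""
    if limit <= 0:
        return []
    seen = set()
    results = []
    for batch in batches:
        for video in batch:
            if video["id"] in seen:
                continue  # duplicate response data: skip, do not count
            seen.add(video["id"])
            results.append(video)
            if len(results) == limit:
                return results  # extra loaded videos are not pushed
    return results


# The second batch repeats the first, as when TikTok sends the same data twice.
batches = [
    [{"id": 1}, {"id": 2}],
    [{"id": 1}, {"id": 2}],
    [{"id": 3}, {"id": 4}],
]
unique_videos = collect_unique(batches, limit=3)
```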

2021-10-18

Fixes

  • computation of outputLength is no longer dependent on persisted progress, meaning scraping of more than one hashtag/profile is now working properly

2021-10-15

Fixes

  • handlePageFunction no longer times out when resultsPerPage is set low

2021-10-14

Features

  • New output structure
  • Added the possibility to scrape more than the first page of results (regulated by resultsPerPage input)
  • Scrapes user profiles defined on input by username in profiles
  • Added optional attributes maxConcurrency, maxRequestRetries and resultsPerPage to input
  • If resultsPerPage is not specified, it defaults to 10 and minimal value is 1
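
The default and minimum for resultsPerPage could be applied on the caller's side roughly like this. Whether the actor rejects or clamps values below 1 is not stated, so raising an error here is an assumption, as is the helper name.

```python
DEFAULT_RESULTS_PER_PAGE = 10
MIN_RESULTS_PER_PAGE = 1


def resolve_results_per_page(actor_input):
    """Return the effective resultsPerPage for a given input dict."""
    value = actor_input.get("resultsPerPage")
    if value is None:
        return DEFAULT_RESULTS_PER_PAGE
    if value < MIN_RESULTS_PER_PAGE:
        # Assumption: treat too-small values as an input error.
        raise ValueError("resultsPerPage must be at least 1")
    return value


default_value = resolve_results_per_page({})
explicit_value = resolve_results_per_page({"resultsPerPage": 25})
```
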
Developer
Maintained by Apify
Actor metrics
  • 22 monthly users
  • 6 stars
  • 100.0% runs succeeded
  • Created in May 2024
  • Modified 2 days ago