🏯 Tweet Scraper V2 (Pay Per Result) - X / Twitter Scraper
Pay $0.30 for 1,000 tweets

apidojo/tweet-scraper
⚡️ Lightning-fast search, URL, list, and profile scraping, with customizable filters. At $0.30 per 1000 tweets, and 30-80 tweets per second, it is ideal for researchers, entrepreneurs, and businesses! Get comprehensive insights from Twitter (X) now!

Tweets are missing/Inconsistent results.

Closed

durian.corp opened this issue
a month ago

While scraping the query in the attached run, we expected to obtain 70 tweets (excluding advertisements), but only 66 were returned. There doesn't appear to be any pattern to the four tweets that were skipped: some are simple tweets with photos, while one directly follows an advertisement in the Twitter advanced search.

I appreciate your advice on this matter.

-Tyler

durian.corp

a month ago

Might this be related to the very large time span which our query covers? I noted your response to another user about breaking the query dates into smaller segments.

durian.corp

a month ago

I went ahead and ran month-by-month and week-by-week searches, as per the screenshots below, but these produced even fewer results:

Whole time range: 66 of 70
Month-by-month: 50 of 70
Week-by-week: 44 of 70

Example of actor inputs:

{'customMapFunction': '(object) => { return {...object} }', 'includeSearchTerms': True, 'maxItems': 23810, 'maxTweetsPerQuery': 23810, 'onlyImage': False, 'onlyQuote': False, 'onlyTwitterBlue': False, 'onlyVerifiedUsers': False, 'onlyVideo': False, 'searchTerms': ['(ชายแดนใต้ OR ไฟใต้) (ไทยพุทธ) since:2022-01-01 until:2022-01-07\t', '(ชายแดนใต้ OR ไฟใต้) (ไทยพุทธ) since:2022-01-08 until:2022-01-14\t', '(ชายแดนใต้ OR ไฟใต้) (ไทยพุทธ) since:2022-01-15 until:2022-01-21\t', '(ชายแดนใต้ OR ไฟใต้) (ไทยพุทธ) since:2022-01-22 until:2022-01-28\t', ...], 'sort': 'Latest'}
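One possible contributor to the lower segmented counts, offered as an assumption rather than a confirmed diagnosis: the twitter-advanced-search guide linked later in this thread describes `until:` as excluding the named day. If that holds, windows such as since:2022-01-01 until:2022-01-07 followed by since:2022-01-08 until:2022-01-14 would skip 7 January and 14 January entirely. The search terms above also end with a tab character; whether that affects matching is unknown. The sketch below builds contiguous windows in which each `until` equals the next `since`, and strips stray whitespace. The function names are hypothetical and not part of the actor.

```python
from datetime import date, timedelta


def date_windows(start, end, days=7):
    """Yield (since, until) pairs covering [start, end) with no gaps.

    Assumes `until:` excludes the named day, so each window's `until`
    is reused as the next window's `since`.
    """
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=days), end)
        yield cursor, upper
        cursor = upper


def build_search_terms(base_query, start, end, days=7):
    """Return one search string per window, with whitespace trimmed."""
    base = base_query.strip()  # drops trailing tabs such as '\t'
    return [
        f"{base} since:{since.isoformat()} until:{until.isoformat()}"
        for since, until in date_windows(start, end, days)
    ]


terms = build_search_terms(
    "(ชายแดนใต้ OR ไฟใต้) (ไทยพุทธ)\t",
    date(2022, 1, 1),
    date(2022, 1, 29),
)
for term in terms:
    print(term)
```

The resulting list could be passed as `searchTerms` in place of the hand-written windows. Tweets near a window boundary might still need de-duplication by ID if the exclusivity assumption turns out to be wrong.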

apidojo

Hey hey,

Can you show us how you got 70 results from Twitter so we can investigate further?

Best

durian.corp

a month ago

Sure: I typed the original query into X and counted the results by hand, since I already knew this topic could be used for testing. :) (Literally (ชายแดนใต้ OR ไฟใต้) (ไทยพุทธ) since:2022-01-15 until:2022-01-21)

I know it isn't exactly the most programmatic solution, but it works for test sets.

durian.corp

a month ago

I can also indicate the missing four posts if this is of any assistance?
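A hand-counted test set like this can be compared against a run's output programmatically. The sketch below is one way to do that, assuming each scraped item carries its tweet link in a `url` field (an assumption about the output shape, not confirmed in this thread). The function names and sample URLs are hypothetical.

```python
import re

# Matches the numeric tweet ID in links such as https://x.com/user/status/123
STATUS_ID = re.compile(r"/status/(\d+)")


def tweet_id(url):
    """Return the numeric tweet ID from a status URL, or None if absent."""
    match = STATUS_ID.search(url or "")
    return match.group(1) if match else None


def missing_tweets(expected_urls, scraped_items, url_field="url"):
    """Return the expected URLs whose tweet ID is not in the scraped items."""
    scraped_ids = {tweet_id(item.get(url_field)) for item in scraped_items}
    scraped_ids.discard(None)  # ignore items without a usable URL
    return [url for url in expected_urls if tweet_id(url) not in scraped_ids]


# Hypothetical sample data for illustration only.
expected = [
    "https://x.com/example_a/status/111",
    "https://twitter.com/example_b/status/222",
]
scraped = [{"url": "https://x.com/example_a/status/111"}]
print(missing_tweets(expected, scraped))
```

Comparing by ID rather than by full URL avoids false mismatches between x.com and twitter.com links.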

apidojo

Hey hey,

Yes that would be great. Can you also send me your direct search URL? When I search these, I cannot get any results.

Cheers

durian.corp

a month ago

Also, we experimented with the order of the search terms. Both versions below return the same number of results (66), although the order of the results did vary.

Version 1: ไทยพุทธ (ชายแดนใต้ OR ไฟใต้) until:2024-07-24 since:2022-01-01
Version 2: (ชายแดนใต้ OR ไฟใต้) ไทยพุทธ until:2024-07-24 since:2022-01-01

apidojo

Hello,

We are checking the missing tweet issue. As for the other one, that is expected behavior, since Twitter searches by keyword and doesn't take the order into account. You can try putting these keywords between quotes if you want an exact match, like "ไทยพุทธ (ชายแดนใต้ OR ไฟใต้) ". I believe that is the suggested way of doing a search like this. You can refer to https://github.com/igorbrigadir/twitter-advanced-search for more information.

Cheers

durian.corp

a month ago

Thanks for prioritizing the missing tweets. I am not too concerned with the order of results, as that is of no consequence. I do question, though, whether wrapping the boolean operators in double quotes is recommended. That is used for exact phrase matches, isn't it? Wouldn't that effectively render the implicit 'AND' and the 'OR' as keywords?
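To illustrate the distinction raised here, the sketch below quotes only the individual terms and leaves OR and the parentheses outside the quotes. It assumes double quotes request an exact match and that an unquoted OR acts as an operator, as described in the linked twitter-advanced-search guide. The helper names are hypothetical.

```python
def exact(term):
    """Wrap a single term or phrase in double quotes for exact matching."""
    return f'"{term}"'


def any_of(*terms):
    """Join terms with OR inside parentheses, keeping OR unquoted."""
    return "(" + " OR ".join(terms) + ")"


# Operators stay outside the quotes, so they are not treated as keywords.
query = f'{exact("ไทยพุทธ")} {any_of(exact("ชายแดนใต้"), exact("ไฟใต้"))}'
print(query)

# For contrast: quoting the whole expression turns it into one literal phrase.
whole_phrase = exact("ไทยพุทธ (ชายแดนใต้ OR ไฟใต้)")
print(whole_phrase)
```

Under those assumptions, the first form keeps the boolean logic intact, while the second asks for the literal text including the word OR.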

apidojo

Hey there,

Just got a reply from our engineering team. They investigated the missing tweets but couldn't find the root cause. One possible reason is that these tweets are marked as sensitive. Our scrapers return whatever they can get from Twitter, so there can sometimes be inconsistencies or missing data. We cannot promise anything about the data we get from Twitter.

For the Twitter queries, I think that GitHub link has the most information out there. We don't have any information beyond that.

I hope this helps.

Cheers!

durian.corp

a month ago

That's completely understandable and I appreciate you looking into it. :)
