Linkedin Posts Reactions Scraper
3 days trial then $25.00/month - No credit card required now

saswave/linkedin-posts-interactions-parser
Scrape reactions or likes from a LinkedIn post. The actor extracts all interactions from a post (comments, likes, mentions). The input can be a /posts URL. You can also provide a /company or /in URL, and the actor will parse multiple posts from that source (organic posts and promoted ads).

Allow proxy setting

Closed

mwatch opened this issue
8 months ago

Would it be possible to add the code that enables the user to set the proxy settings (as you have for your Linkedin Informations Parser)?

SASWAVE (saswave)

8 months ago

It's planned. I moved it up on my to-do list, and it will be done tonight.

All my actors need a bit of cleanup so that they share an almost standard input.

Thanks for raising the issue. I will ping you on this thread when it's ready.

SASWAVE (saswave)

8 months ago

Done. Give it a try and tell us if you need anything else.

mwatch

8 months ago

I gave it a try using inputs that had worked before, and they all return zero results when using the proxy.

SASWAVE (saswave)

8 months ago

Sorry, my bad. I fixed a typo.

Try it now.

mwatch

8 months ago

Still getting an error:

2023-11-23T06:44:51.460Z Traceback (most recent call last):
2023-11-23T06:44:51.462Z File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 776, in urlopen
2023-11-23T06:44:51.464Z self._prepare_proxy(conn)
2023-11-23T06:44:51.466Z File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 1045, in _prepare_proxy
2023-11-23T06:44:51.468Z conn.connect()
2023-11-23T06:44:51.469Z File "/usr/local/lib/python3.11/site-packages/urllib3/connection.py", line 625, in connect
2023-11-23T06:44:51.471Z self._tunnel() # type: ignore[attr-defined]
2023-11-23T06:44:51.473Z ^^^^^^^^^^^^^^
2023-11-23T06:44:51.475Z File "/usr/local/lib/python3.11/http/client.py", line 926, in _tunnel
2023-11-23T06:44:51.477Z raise OSError(f"Tunnel connection failed: {code} {message.strip()}")
2023-11-23T06:44:51.479Z OSError: Tunnel connection failed: 590 UPSTREAM503
2023-11-23T06:44:51.480Z
2023-11-23T06:44:51.482Z The above exception was the direct cause of the following exception:
2023-11-23T06:44:51.484Z
2023-11-23T06:44:51.486Z urllib3.exceptions.ProxyError: ('Unable to connect to proxy', OSError('Tunnel connection failed: 590 UPSTREAM503'))
2023-11-23T06:44:51.488Z
2023-11-23T06:44:51.490Z The above exception was the direct cause of the following exception:
2023-11-23T06:44:51.491Z
2023-11-23T06:44:51.494Z Traceback (most recent call last):
2023-11-23T06:44:51.496Z File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
2023-11-23T06:44:51.498Z resp = conn.urlopen(
2023-11-23T06:44:51.500Z ^^^^^^^^^^^^^
2023-11-23T06:44:51.501Z File "/usr/local/lib/python3.11/site-packages/urllib3/connectionpool.py", line 844, in urlopen
2023-11-23T06:44:51.503Z retries = retries.increment(
2023-11-23T06:44:51.507Z ^^^^^^^^^^^^^^^^^^
2023-11-23T06:44:51.509Z File "/usr/local/lib/python3.11/site-packages/urllib3/util/retry.py", line 515, in increment
2023-11-23T06:44:51.511Z raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
2023-11-23T06:44:51.513Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-23T06:44:51.515Z urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.linkedin.com', port=443): Max retries exceeded with url: /in/ACoAAAFENPwB76qNWy0Qe79fc20rX-RFiI5QbIc (Caused by ProxyError('Unable to connect to proxy', OSError('Tunnel connection failed: 590 UPSTREAM503')))
2023-11-23T06:44:51.517Z
2023-11-23T06:44:51.518Z During handling of the above exception, another exception occurred:
2023-11-23T06:44:51.520Z
2023-11-23T06:44:51.522Z Traceback (most recent call last):
2023-11-23T06:44:51.524Z File "/usr/src/app/src/main.py", line 544, in main
2023-11-23T06:44:51.526Z linkedin.url_linkedin_unanonymous()
2023-11-23T06:44:51.527Z File "/usr/src/app/src/main.py", line 362, in url_linkedin_unanonymous
2023-11-23T06:44:51.529Z res = requests.get(contact['linkedin'], headers=headers, cookies=self.cookies, proxies=self.proxies)
2023-11-23T06:44:51.531Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-23T06:44:51.533Z File "/usr/local/lib/python3.11/site-packages/requests/api.py", line 73, in get
2023-11-23T06:44:51.535Z return request("get", url, params=params, **kwargs)
2023-11-23T06:44:51.537Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-23T06:44:51.539Z File "/usr/local/lib/python3.11/site-packages/requests/api.py", line 59, in request
2023-11-23T06:44:51.541Z return session.request(method=method, url=url, **kwargs)
2023-11-23T06:44:51.543Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-23T06:44:51.545Z File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
2023-11-23T06:44:51.547Z resp = self.send(prep, **send_kwargs)
2023-11-23T06:44:51.549Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-23T06:44:51.551Z File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
2023-11-23T06:44:51.553Z r = adapter.send(request, **kwargs)
2023-11-23T06:44:51.557Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-23T06:44:51.560Z File "/usr/local/lib/python3.11/site-packages/requests/adapters.py", line 513, in send
2023-11-23T06:44:51.562Z raise ProxyError(e, request=request)
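The traceback above ends in an unhandled ProxyError raised from a single requests.get call. One possible way to tolerate a transient upstream failure like "590 UPSTREAM503" is to retry the call a few times with a growing delay. The sketch below is only an illustration, not the actor's actual code: call_with_retries and its parameters are hypothetical names. It catches OSError by default, which should also cover requests.exceptions.ProxyError, since the requests exception hierarchy derives from OSError.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exception types with exponential backoff.

    Hypothetical helper: the names and defaults are assumptions for illustration.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            # On the last attempt, let the original exception propagate.
            if attempt == attempts:
                raise
            # Wait base_delay, then 2 * base_delay, then 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

In the actor, the wrapped callable might be something like lambda: requests.get(url, headers=headers, cookies=cookies, proxies=proxies), mirroring the call shown in the traceback. Retrying does not help if the proxy group itself is misconfigured.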

SASWAVE (saswave)

8 months ago

Did you provide your own proxies?

SASWAVE (saswave)

8 months ago

Send me a Google Meet invite at sousalopes.thomas@gmail.com. If you are not available, you can instead email me the input you used (only the proxy settings; I will use my own cookie session) so that I can reproduce your problem and find a solution.

mwatch

8 months ago

I'm using proxies provided by apify. I tried both the residential and datacenter. Here's the output using the default datacenter proxy:

2023-11-27T23:34:33.994Z proxyConfiguration {'useApifyProxy': True, 'apifyProxyGroups': ['BUYPROXIES94952']}
2023-11-27T23:34:34.103Z proxy url http://groups-BUYPROXIES94952:*********@10.0.33.153:8011
2023-11-27T23:34:34.109Z Traceback (most recent call last):
2023-11-27T23:34:34.111Z File "/usr/src/app/src/main.py", line 536, in main
2023-11-27T23:34:34.112Z linkedin.run()
2023-11-27T23:34:34.113Z File "/usr/src/app/src/main.py", line 357, in run
2023-11-27T23:34:34.114Z self.linkedin(url)
2023-11-27T23:34:34.115Z File "/usr/src/app/src/main.py", line 349, in linkedin
2023-11-27T23:34:34.116Z self.call_linkedin_comments(url, type_url, ugc)
2023-11-27T23:34:34.117Z File "/usr/src/app/src/main.py", line 253, in call_linkedin_comments
2023-11-27T23:34:34.118Z sufix_urn_li = url.split('-%s-'%prefix_urn_li)[1].split('-')[0]
2023-11-27T23:34:34.120Z ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
2023-11-27T23:34:34.120Z IndexError: list index out of range
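For readers following the proxy part of this log: the "proxy url" line has the usual scheme://user:password@host:port shape, and the earlier traceback shows the actor passing a proxies argument to requests.get. A minimal sketch of how such a URL could be turned into that argument, and masked before logging, is shown below. Both function names are hypothetical and the code is an assumption about the general approach, not the actor's source.

```python
from urllib.parse import urlsplit


def build_requests_proxies(proxy_url: str) -> dict:
    """Return a proxies mapping in the shape accepted by requests.get(..., proxies=...).

    Hypothetical helper; the same proxy is used for both http and https traffic.
    """
    parts = urlsplit(proxy_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a usable proxy URL: {proxy_url!r}")
    return {"http": proxy_url, "https": proxy_url}


def mask_proxy_password(proxy_url: str) -> str:
    """Replace the password in a proxy URL with '***' so it can be logged safely."""
    parts = urlsplit(proxy_url)
    if parts.password is None:
        return proxy_url
    netloc = f"{parts.username}:***@{parts.hostname}"
    if parts.port is not None:
        netloc += f":{parts.port}"
    return parts._replace(netloc=netloc).geturl()
```

The password in the example log is already masked, so any real value used with these helpers would come from the platform's proxy configuration at run time.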

SASWAVE (saswave)

8 months ago

Can you provide your input? I will try to reproduce the problem and understand why it fails to parse the post URL.

mwatch

8 months ago

I've tried it with and without the proxy. What I'm seeing now at the end of the log is:

023-11-27T20:40:45.645Z Transforming anonymous people urls to urls with universalnames
2023-11-27T20:40:45.657Z
2023-11-27T20:44:12.935Z skiping author
2023-11-27T20:44:12.938Z no results to be saved

Here is an example input that worked before the change but returns zero results now:
https://www.linkedin.com/posts/tomvarghesejr_heforshe-lifelonglearning-mentorship-activity-6475554808718860288-VJlB

This also worked before:
"days_since_post": 6, "url": "https://www.linkedin.com/in/navneet-singh-160012/"

And this is a new link format that I just started getting from linkedin:
https://www.linkedin.com/feed/update/urn:li:activity:7134940395557330944/

I've had to stop because I was testing without the proxy and got my account temporarily blocked.

SASWAVE (saswave)

8 months ago

I fixed the issue related to "no results to be saved" (a small update to the API call used for "Transforming anonymous people urls to urls with universalnames").

How did you get this format? https://www.linkedin.com/feed/update/urn:li:activity:7134940395557330944/ When I copy a post URL from my feed page (check screenshot), I only get this format: https://www.linkedin.com/posts/

I added handling for this format so that it no longer raises the "list index out of range" error (from your message 2 days ago).
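For illustration, the IndexError in the earlier log came from chained split(...)[1] calls, which fail as soon as the URL does not contain the expected "-activity-" segment. A regular expression that accepts both URL shapes quoted in this thread, and returns None instead of raising, is one way to handle this. The sketch below is an assumption about a possible approach, not the fix that was actually deployed. extract_post_urn is a hypothetical name, and the ugcPost and share URN types are guesses beyond the activity type seen in the thread.

```python
import re
from typing import Optional, Tuple

# Matches "...-activity-<digits>-..." (the /posts/ format) and
# "urn:li:activity:<digits>" (the /feed/update/ format).
# "ugcPost" and "share" are assumed variants, not confirmed by the thread.
_URN_RE = re.compile(r"(?:-|urn:li:)(activity|ugcPost|share)[-:](\d+)")


def extract_post_urn(url: str) -> Optional[Tuple[str, str]]:
    """Return (urn_type, urn_id) for a LinkedIn post URL, or None if absent."""
    match = _URN_RE.search(url)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    for example in (
        "https://www.linkedin.com/posts/tomvarghesejr_heforshe-lifelonglearning-mentorship-activity-6475554808718860288-VJlB",
        "https://www.linkedin.com/feed/update/urn:li:activity:7134940395557330944/",
        "https://www.linkedin.com/in/navneet-singh-160012/",
    ):
        print(extract_post_urn(example))
```

Returning None for a profile URL lets the caller decide whether to treat the input as a profile or company source rather than crashing partway through the run.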

Developer
Maintained by Community
Actor metrics
  • 14 monthly users
  • 4 stars
  • 33.9% runs succeeded
  • 0.14 hours response time
  • Created in Oct 2023
  • Modified 29 days ago