Linkedin data scraper ( EVERY-THING )

muhammad_usama/linkedin-data-scraper-every-thing

Try for free

2 hours trial then $125.00/month - No credit card required now
This Actor can scrape anything from LinkedIn, including Person Data, Company Data, Person Posts, Company Posts, Search Jobs, Search People, Search Companies, Search Posts, and much more.

LinkedIn Scrape Actor Fails with Error: "urns is not iterable" When Increasing Page Count

Closed

vaibhav_inzio opened this issue
a month ago

I encountered an issue while using the LinkedIn scrape Actor. When I try to increase the number of pages, the scrape process fails with the error "urns is not iterable". This happens consistently and prevents me from scraping data across multiple pages. The error log looks like this:

```
2024-10-15T05:40:04.299Z Arranging resources...
2024-10-15T05:40:09.204Z Starting scrape process...
2024-10-15T05:40:12.630Z Scraping failed!
2024-10-15T05:40:12.632Z Error : urns is not iterable
```

Could you please look into this issue and advise on a fix or workaround?

vaibhav_inzio

a month ago

We are working on an urgent client project; can you please provide a solution ASAP?

muhammad_usama

Can you share the input you are feeding to the Actor?

The endpoint and the body?

vaibhav_inzio

a month ago

We are encountering a limitation with the LinkedIn post scraper when attempting to scrape posts using the "RSV" keyword. Despite LinkedIn showing over 1000 posts for this query, the Actor only returns approximately 345 posts.

{ "body": { "query": "RSV", "page": 1, "sort_by": "date_posted" }, "endpoint": "search-posts" }

The output JSON contains this value: 'total_posts': 379.

I got only 379 records, which is around 5 days of data. Can you tell me how to get all the data?

muhammad_usama

Did you go page by page up to page 100?

I believe it returns 10 posts per page.

vaibhav_inzio

a month ago

"It is returning 20 records per page, and I ran up to page 18, where I retrieved around 379 posts. After that, on page 19, it returned an 'URN not found' error.

muhammad_usama

This is because there were only 379 posts found against your specified keyword.
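If total_posts is accurate, the last valid page number follows directly from the page size, so a client loop can stop before ever hitting the error. A minimal sketch; the figures below are the ones reported in this thread, and the page size of 20 is the user's observation rather than documented behavior:

```python
import math

# Figures reported in this thread (assumptions, not documented limits):
total_posts = 379     # from the Actor's output
posts_per_page = 20   # observed by the user (the developer suggested 10)

# Requesting pages beyond this last page is what appears to raise the
# "URN not found" / "urns is not iterable" errors.
last_page = math.ceil(total_posts / posts_per_page)
print(last_page)  # 19
```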

vaibhav_inzio

a month ago

No, in the UI there are a lot more posts, over 10,000.

The 379 records are only from the last 5 days.

vaibhav_inzio

a month ago

I have tested more than 5 keywords, and it only fetched the latest 300 to 400 records. I think it is restricted to fetching that much data.

Can you check this?

You can also try one query and check the count on both the UI and this Actor.

muhammad_usama

Can you share a screenshot of the total_posts attribute at the bottom?

vaibhav_inzio

a month ago

This is the script I am using:

```python
import requests
import json
import os

# API token and Actor URL
API_TOKEN = 'api-key'
ACTOR_URL = f"https://api.apify.com/v2/acts/muhammad_usama~linkedin-data-scraper-every-thing/run-sync-get-dataset-items?token={API_TOKEN}"


def scrape_linkedin(query, max_pages, postedAgo, endpoint):
    all_pages_data = []  # List to store all page data

    # Iterate over the number of pages
    for page in range(1, max_pages + 1):
        try:
            # Create payload matching the input structure used in the UI
            payload = {
                "body": {
                    "query": query,
                    "page": page,  # Incrementing pages
                    "sort_by": "date_posted",  # Sorting by most recent
                },
                "endpoint": endpoint  # Specifying the endpoint
            }

            # Make POST request to the Actor URL
            response = requests.post(ACTOR_URL, json=payload)
            data = response.json()  # Get the JSON data

            # Store the data in memory
            all_pages_data.append(data)

            # # Save each page as a separate JSON file
            # file_path = os.path.join(output_dir, f'page_{page}.json')
            # with open(file_path, 'w') as file:
            #     json.dump(data, file, indent=4)  # Save the JSON data to a file

            print(f"Scraping and storing page {page} completed.")
        # The tail of the script was trimmed in the thread; a minimal
        # completion of the try block and return follows:
        except Exception as e:
            print(f"Page {page} failed: {e}")

    return all_pages_data
```
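For completeness, a sketch of a tighter version of that loop: stop at the page count implied by total_posts, and bail out on a non-OK response instead of parsing an error body. This is written under assumptions; the exact location of total_posts in the Actor's response is not documented in this thread, so find_total_posts below is a hypothetical helper to adapt to the real shape of the dataset items.

```python
import math
import requests

API_TOKEN = "api-key"  # placeholder, as in the script above
ACTOR_URL = (
    "https://api.apify.com/v2/acts/muhammad_usama~linkedin-data-scraper-every-thing/"
    f"run-sync-get-dataset-items?token={API_TOKEN}"
)


def find_total_posts(data):
    """Best-effort search for a total_posts field (hypothetical helper:
    the field's real location in the response is not documented here)."""
    if isinstance(data, dict):
        if "total_posts" in data:
            return data["total_posts"]
        for value in data.values():
            found = find_total_posts(value)
            if found is not None:
                return found
    elif isinstance(data, list):
        for item in data:
            found = find_total_posts(item)
            if found is not None:
                return found
    return None


def scrape_all(query, endpoint="search-posts", posts_per_page=20):
    """Page through results, stopping at the page count implied by
    total_posts instead of a fixed max_pages."""
    all_items, page, last_page = [], 1, None
    max_fallback = 100  # ceiling the developer mentions, used if total_posts is never found
    while True:
        payload = {
            "body": {"query": query, "page": page, "sort_by": "date_posted"},
            "endpoint": endpoint,
        }
        response = requests.post(ACTOR_URL, json=payload, timeout=300)
        if not response.ok:  # stop instead of iterating over an error body
            print(f"Page {page} failed: {response.status_code}")
            break
        data = response.json()
        all_items.append(data)
        if last_page is None:
            total = find_total_posts(data)
            if total is not None:
                last_page = math.ceil(total / posts_per_page)
        page += 1
        if last_page is not None and page > last_page:
            break
        if page > max_fallback:  # safety cap if total_posts was never found
            break
    return all_items
```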
muhammad_usama

As you can see, total_posts: 379.

vaibhav_inzio

a month ago

I see that, but actually, there are more than 379 posts. Can you search for RSV on LinkedIn's UI once? You will understand what I am saying.

muhammad_usama

387, what's wrong?

vaibhav_inzio

a month ago

I got your point, but for our client project we need the last 3 months of data, and we got only the last week of data from the API. Can you tell me how to solve this problem? Attached is the last post, which I also see in the mobile application. I want you to help me fetch the last 3 months of posts. I hope you got my point?

muhammad_usama

We scrape as per LinkedIn; all the search filters are already there. You can see all possible filters in the Actor information.
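One possible workaround, for what it's worth: since LinkedIn's own search stops paging deep into a result set, a broad query can be split into narrower slices (by date window or keyword variant) and each slice scraped separately. The date_posted filter name and its values below are hypothetical, not confirmed inputs of this Actor; substitute the real filter names from its documented input schema.

```python
# Hypothetical sketch: split one broad query into date windows so each
# slice stays under LinkedIn's search result cap. "date_posted" and its
# values are ASSUMED names, not confirmed fields of this Actor's input
# schema; substitute the filters listed in the Actor information.
date_windows = ["past-24h", "past-week", "past-month"]

payloads = [
    {
        "body": {
            "query": "RSV",
            "page": 1,
            "sort_by": "date_posted",
            "date_posted": window,  # assumed filter name
        },
        "endpoint": "search-posts",
    }
    for window in date_windows
]
```

Each slice would then be paged through with the stop condition sketched earlier in the thread.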

vaibhav_inzio

a month ago

Okay, thank you for your support. :)

Developer
Maintained by Community
Actor metrics
  • 29 monthly users
  • 4 stars
  • 100.0% runs succeeded
  • 2.4 days response time
  • Created in Aug 2024
  • Modified 3 months ago