Indeed Scraper

Pay $5.00 for 1,000 results

misceres/indeed-scraper
Scrape jobs posted on Indeed. Get detailed information from this job portal about saved and sponsored jobs. Specify the search based on location with the output attributes position, location, and description.

Job types fields unusable

Closed

rayyala-owner opened this issue
3 months ago

The "Job Types" field in API output is unusable due to improper formatting.

Problem:

  • API combines all job types into one field
  • Can't differentiate job types between records

Example:

In the attached Excel export, job types are saved like this:

  • jobTypes/0 "Permanent"
  • jobTypes/0 "Full-time", jobTypes/1 "Permanent"
  • jobTypes/0 "Full-time", jobTypes/1 "Permanent"

But in API integrations like Zapier, job types are exposed like this, making it impossible to know which job types belong to which record:

  • Job Types "Permanent,Full-time,Permanent,Full-time,Permanent"
  • Job Types "Permanent,Full-time,Permanent,Full-time,Permanent"
  • Job Types "Permanent,Full-time,Permanent,Full-time,Permanent"

Possible solutions:

  • Send different columns for each job type in API output, OR
  • Use a different delimiter to separate records e.g. "Permanent,Full-time|Permanent,Full-time|Permanent"
lukas.prusa

Hi Rayyala, thanks for opening this issue!

I like the idea of a single field with a clear delimiter (e.g., line breaks) in the Excel output. It makes sense and would make the static Excel format at least somewhat usable.

I don't know how much it will help with your problem, though. It just moves the data processing one step further, when you could use the much more dynamic JSON output instead. Still, it makes sense and should be available on the platform :)

I will keep you updated here, thanks!

rayyala-owner

3 months ago

@Lukáš the main issue is the comma delimiter because values also contain commas. This confuses Zapier, making it hard to identify individual values. I've updated the description, hopefully the dev will see it soon!

lukas.prusa

I mean, currently the values in Excel are split across multiple columns, not by different delimiters, right? With this upcoming feature, the values will be kept in a single column and separated by some delimiter, so it will fix this problem, right?

There is an edge case where the delimiter also appears in the values themselves, but that could be easily escaped, I think.
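
To illustrate the escaping idea, here is a minimal Python sketch. It assumes a single-character delimiter and a backslash as the escape character. The function names and the sample value "Contract, fixed term" are hypothetical and are not taken from the platform or the scraper.

```python
def join_escaped(values, delimiter=",", escape="\\"):
    """Join values into one string, escaping the delimiter and the escape character."""
    escaped = []
    for value in values:
        # Escape the escape character first so the result stays unambiguous.
        value = value.replace(escape, escape + escape)
        value = value.replace(delimiter, escape + delimiter)
        escaped.append(value)
    return delimiter.join(escaped)


def split_escaped(text, delimiter=",", escape="\\"):
    """Reverse join_escaped: split on unescaped delimiters and drop the escapes."""
    values, current, i = [], [], 0
    while i < len(text):
        char = text[i]
        if char == escape and i + 1 < len(text):
            # Take the next character literally, whatever it is.
            current.append(text[i + 1])
            i += 2
        elif char == delimiter:
            values.append("".join(current))
            current = []
            i += 1
        else:
            current.append(char)
            i += 1
    values.append("".join(current))
    return values


# Hypothetical values, one of which contains the delimiter itself.
job_types = ["Full-time", "Contract, fixed term"]
encoded = join_escaped(job_types)
decoded = split_escaped(encoded)
```

Any consumer of the joined string has to apply the matching unescape step, which a no-code integration may not be able to do. An empty list also does not round-trip in this sketch, because it decodes to a list holding one empty string.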

rayyala-owner

2 months ago

Yes, multiple columns would solve the issue IF the split columns like jobTypes/0, jobTypes/1, etc. are available in the API as fields (currently they are not). @Lukáš are you the developer?

lukas.prusa

I don't think adding this type of output to the JSON makes sense, as it would have to be done on the scraper level, not the platform, so the outcome wouldn't even be reusable.

I've discussed it with the team, and it would be possible to merge the Excel columns into one with some delimiter between the values. Though the development may take time, as that has to be implemented on the platform level.

If you want to go the way of splitting up the JSON fields, you could build a solution yourself with the Merge, Dedup & Transform Datasets utility Actor. But as I said, I don't think it's worth downgrading the JSON output like that. It makes more sense to improve the less practical formats like Excel.
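
As a rough sketch of what such a transformation could look like, the Python below expands a list field into numbered keys in the style of the Excel export (jobTypes/0, jobTypes/1). It is plain Python and does not use the utility Actor's own interface. The function name and the sample record are hypothetical.

```python
def flatten_list_field(record, field):
    """Return a copy of record with a list field expanded into field/0, field/1, ... keys."""
    # Copy every key except the list field itself.
    flat = {key: value for key, value in record.items() if key != field}
    # A missing or empty field simply produces no numbered keys.
    for index, value in enumerate(record.get(field) or []):
        flat[f"{field}/{index}"] = value
    return flat


# Hypothetical record; only the jobTypes values come from the thread.
record = {"position": "Engineer", "jobTypes": ["Full-time", "Permanent"]}
flat_record = flatten_list_field(record, "jobTypes")
```

The number of resulting keys varies per record, so a consumer that expects a fixed schema would still need to decide how many jobTypes/N fields to read.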

Yes, I'm one of the developers for this scraper.

rayyala-owner

2 months ago

@Lukáš, thanks for your reply, I appreciate it! However, I think we have a big miscommunication.

The Excel output is already good and doesn't need changes. The problem is with the API output. Also, this is a bug, not a feature request.

Current API behaviour:

  • Combines job types from multiple records into one string
  • Uses commas as delimiters, even though job type values can contain commas
  • This makes it impossible to determine which job types belong to which record
  • Other fields work well; only this (important) "job types" field seems to have this issue

Example:

  • 3 records in the data: "Full-time,Permanent", "Part-time", "Contract,Temporary"
  • Current API output: "Full-time,Permanent,Part-time,Contract,Temporary"
  • This output is unusable because we can't tell where one record ends and another begins

Possible solutions:

  • Return an array of job types for each record
  • Use a different delimiter between values or records (like pipe "|", etc.)

Can we address this API output issue specifically?
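
The ambiguity described above can be reproduced with a small Python sketch using the three example records. It only models the reported behaviour and the proposed pipe delimiter. It is not the integration's actual code, and the function names are hypothetical.

```python
# The three example records from the description above.
records = [["Full-time", "Permanent"], ["Part-time"], ["Contract", "Temporary"]]


def flatten_all(records):
    """Join every value of every record with commas (the behaviour being reported)."""
    return ",".join(value for job_types in records for value in job_types)


def join_per_record(records, value_sep=",", record_sep="|"):
    """Join values with one delimiter and records with a different one."""
    return record_sep.join(value_sep.join(job_types) for job_types in records)


def split_per_record(text, value_sep=",", record_sep="|"):
    """Recover the per-record lists from a string built by join_per_record."""
    return [chunk.split(value_sep) for chunk in text.split(record_sep)]


flattened = flatten_all(records)     # record boundaries are lost here
piped = join_per_record(records)     # record boundaries survive as "|"
recovered = split_per_record(piped)
```

The pipe variant only works while no value contains "," or "|". Returning an array per record avoids that restriction entirely.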

lukas.prusa

Okay, yeah, there is some clear miscommunication here indeed.

What output do you mean by API? I understand API output as JSON, which works as expected and is the base output. It's outputted as ["Full-time", "Permanent"] which is correct.

lukas.prusa

Can you please provide me an example of such a failing run?

rayyala-owner

2 months ago

Sure, here's a run: https://console.apify.com/view/runs/39EoZKWBagN09biQE The run isn't failing. The problem is that the (very important!) job types field is not usable.

lukas.prusa

Hmm, what output format are you using? The default API output of JSON has the correct values. E.g.: "jobType": ["Permanent", "Full-time"]

https://api.apify.com/v2/datasets/xwKw9tO7JjsxzdTlH/items?clean=true&format=json
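
For anyone reading the JSON output directly, a minimal Python sketch of keeping the job types per record might look like this. It assumes the endpoint returns a JSON array of item objects and that the field is named "jobType" as quoted above (the Excel export shows "jobTypes", so the name is a parameter). The helper names and the sample payload are hypothetical, and the dataset may need an API token or may no longer exist.

```python
import json
from urllib.request import urlopen


def fetch_items_json(url):
    """Download the raw JSON text of a dataset items endpoint (performs a network call)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def job_types_per_record(items_json, field="jobType"):
    """Return one list of job types per dataset item, in item order."""
    items = json.loads(items_json)
    # Items without the field get an empty list instead of raising KeyError.
    return [item.get(field, []) for item in items]


# Hypothetical payload shaped like the value quoted above.
sample = '[{"jobType": ["Permanent", "Full-time"]}, {"jobType": ["Permanent"]}, {}]'
per_record = job_types_per_record(sample)
```

In real use, the text returned by `fetch_items_json` for the items URL would be passed to `job_types_per_record` in place of the sample string.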

lukas.prusa

Hi again, were you able to find the issue?

I will close this issue for now, as it's not a problem with this Actor but rather with the Excel output of the platform. The JSON output is correct.

You can reopen this issue if you are still experiencing this. Thanks and happy scraping!

Developer
Maintained by Apify

Actor Metrics

  • 816 monthly users

  • 118 stars

  • >99% runs succeeded

  • 2.1 days response time

  • Created in Mar 2023

  • Modified a day ago