Indeed Scraper
Pay $5.00 for 1,000 results
Scrape jobs posted on Indeed. Get detailed information from this job portal about saved and sponsored jobs. Specify the search based on location with the output attributes position, location, and description.
Hi team, I'm having an issue with scraping from Indeed: I keep getting "Blocked by HTML" in the log and no jobs are scraped.
Thank you in advance
Hi, thanks for opening this issue!
Unfortunately, it seems like company jobs search URLs are currently getting fully captcha blocked by Indeed. This has already happened in the past for some specific URLs with Indeed domain combinations, and there is basically nothing we can do about it.
We will investigate this and try to find a way around the blocking, but as said, we might not be able to overcome it. Unless we find a workaround, we will just have to wait until they remove the blocking. They are pretty much just experimenting with this. Last time it happened, they removed it in just a few days :)
I will keep you updated here, thanks!
Thanks for getting back Lucas and for the information :)
Hello Lukáš. I have the same issue. Would it be possible to let the whole run fail in this case, so that I could at least get the last run that worked with the API (https://api.apify.com/v2/actor-runs?token= ...)? Currently I would have to check the details for all runs.
Hi, thanks for the suggestion!
Just to make sure, you mean to fail the scraper straight after starting it, if any of the start URLs are the "company detail" ones? That would make sense for us, as currently this URL is just blocked 100% of the time. We can remove it when Indeed removes this crazy blocking (hopefully).
I will keep you updated here, thanks!
Yes, I mean exactly what you wrote. I use this to get the latest successful run:
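import axios from "axios";

// Function name is assumed; the opening line was missing from the original post.
const getLatestSucceededDatasetId = async () => {
  // List the account's Actor runs.
  const result = await axios.get(
    `https://api.apify.com/v2/actor-runs?token=${process.env.APIFY_API_KEY}`
  );
  const jsonData = result.data;

  // Keep only the runs that finished successfully.
  const succeededItems = jsonData.data.items.filter(
    (item) => item.status === "SUCCEEDED"
  );

  // Sort newest first by finish time.
  succeededItems.sort(
    (a, b) => new Date(b.finishedAt).getTime() - new Date(a.finishedAt).getTime()
  );

  return succeededItems.length > 0 ? succeededItems[0].defaultDatasetId : null;
};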
Currently it is useless because the run only fails internally, so it still passes the SUCCEEDED filter. If the run itself failed, I would at least get the last working one.
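For anyone following along, the selection step can be separated from the HTTP call so it can be checked without an API token. This is only a minimal sketch: the field names (`status`, `finishedAt`, `defaultDatasetId`) are taken from the snippet in this thread, while the `RunSummary` type, the `pickLatestSucceededDatasetId` helper, and the sample data are hypothetical and made up for illustration.

```typescript
// Only the run fields used in this thread; the type name is hypothetical.
interface RunSummary {
  status: string;
  finishedAt: string | null;
  defaultDatasetId: string;
}

// Hypothetical helper: returns the dataset ID of the most recently
// finished SUCCEEDED run, or null if there is none.
function pickLatestSucceededDatasetId(runs: RunSummary[]): string | null {
  let latest: RunSummary | null = null;
  let latestTime = -Infinity;
  for (const run of runs) {
    // Skip runs that did not succeed or have no finish time yet.
    if (run.status !== "SUCCEEDED" || run.finishedAt === null) continue;
    const time = new Date(run.finishedAt).getTime();
    if (Number.isNaN(time)) continue; // Skip unparseable timestamps.
    if (time > latestTime) {
      latest = run;
      latestTime = time;
    }
  }
  return latest ? latest.defaultDatasetId : null;
}

// Made-up sample data, not real run records.
const sampleRuns: RunSummary[] = [
  { status: "SUCCEEDED", finishedAt: "2024-01-01T10:00:00.000Z", defaultDatasetId: "dataset-a" },
  { status: "FAILED", finishedAt: "2024-01-03T10:00:00.000Z", defaultDatasetId: "dataset-b" },
  { status: "SUCCEEDED", finishedAt: "2024-01-02T10:00:00.000Z", defaultDatasetId: "dataset-c" },
];

console.log(pickLatestSucceededDatasetId(sampleRuns)); // "dataset-c"
```

A single pass avoids sorting the whole list, and the FAILED run is skipped even though it finished last, which is the behavior being asked for here.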
Hi, thanks for your patience. This fix accidentally got stuck in our revision process, and we forgot to update the scraper after finishing it...
It will now fail the Actor if there are any company detail URLs in the input :) Thanks and happy scraping!
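To illustrate what a fail-fast input check of this kind might look like, here is a rough sketch. It is not the Actor's actual code: the thread does not say how company detail URLs are recognized, so the `/cmp/` path test is purely an assumption, and the names `isCompanyDetailUrl` and `assertNoCompanyDetailUrls` are hypothetical.

```typescript
// ASSUMPTION: company detail pages are identified by a "/cmp/" path prefix.
// The thread does not specify the real detection rule.
function isCompanyDetailUrl(rawUrl: string): boolean {
  try {
    const { pathname } = new URL(rawUrl);
    return pathname.startsWith("/cmp/");
  } catch {
    // Not a parseable URL; leave that to other input validation.
    return false;
  }
}

// Hypothetical guard: throws before any scraping starts if a blocked URL is present.
function assertNoCompanyDetailUrls(startUrls: string[]): void {
  const blocked = startUrls.filter(isCompanyDetailUrl);
  if (blocked.length > 0) {
    throw new Error(
      `Company detail URLs are currently blocked: ${blocked.join(", ")}`
    );
  }
}
```

Throwing at the start means the run ends in a failed state instead of finishing with an empty dataset, so a filter on the run status only picks up runs that actually produced results.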