Indeed Scraper

Pay $5.00 for 1,000 results

misceres/indeed-scraper
Scrape jobs posted on Indeed. Get detailed information from this job portal about saved and sponsored jobs. Specify the search based on location with the output attributes position, location, and description.

Scrape unique posts across multiple runs

Closed
ferrari-f1 opened this issue 14 days ago

I noticed that the same job post is being scraped multiple times across different runs. I run the scraper daily and I only want it to scrape jobs that it hasn't already scraped in this or previous runs. Is this possible and I am just not configuring the actor properly?

lhotanok

Hi, incremental scraping is unfortunately not supported out of the box, so you would need a custom solution for that. Whenever you start a new run, the Actor scrapes all job posts currently available for your search query, regardless of what was scraped in previous runs.

You can work around this by integrating this Actor with Merge, Dedup & Transform Datasets. You can run it manually and specify the IDs of the datasets you want to merge and deduplicate, or you can set up an automatic integration of Indeed Scraper + Merge, Dedup & Transform Datasets. The integration will be triggered automatically on each successful run of Indeed Scraper. You just need to provide Indeed Scraper's ID hMvNSpz3JnHgl5jkh in the collapsible section Load datasets from actor or task (see the attached screenshot). Unfortunately, this solution won't save you the extra cost of scraping duplicate job posts: you will still pay $5 for 1,000 scraped listings, even if a large portion of those results are duplicates, because the cost of each run is evaluated regardless of previous runs.
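If you would rather deduplicate on your own side after downloading each run's results, the idea can be sketched in a few lines of Python. This is only a rough illustration, not how Merge, Dedup & Transform Datasets works internally. It assumes each scraped item is a dict with a stable unique field (here hypothetically called `id`; check your own output for the right field), and the helper names `load_seen`, `save_seen` and `filter_new` are made up for this example. Like the integration above, it does not reduce what you pay per run.

```python
import json
from pathlib import Path


def load_seen(path):
    """Return the set of job keys recorded by earlier runs (empty if no file yet)."""
    p = Path(path)
    if not p.exists():
        return set()
    return set(json.loads(p.read_text(encoding="utf-8")))


def save_seen(path, seen):
    """Persist the set of job keys so the next daily run can read it."""
    Path(path).write_text(json.dumps(sorted(seen)), encoding="utf-8")


def filter_new(items, seen, key_field="id"):
    """Return only items whose key is not in `seen`, and record the new keys.

    `key_field` is an assumption: use whichever field uniquely identifies
    a job post in your dataset.
    """
    fresh = []
    for item in items:
        key = item.get(key_field)
        if key is None:
            # No key to compare on, so keep the item rather than drop it silently.
            fresh.append(item)
            continue
        if key in seen:
            continue  # already scraped in this or a previous run
        seen.add(key)
        fresh.append(item)
    return fresh


if __name__ == "__main__":
    # Hypothetical file name; today's items would come from the run's dataset export.
    seen_file = "seen_jobs.json"
    todays_items = [{"id": "job-1"}, {"id": "job-2"}, {"id": "job-1"}]

    seen = load_seen(seen_file)
    new_items = filter_new(todays_items, seen)
    save_seen(seen_file, seen)
    print(f"{len(new_items)} new job posts")
```

Keeping only the keys in a small JSON file means each daily run needs just the previous key set, not the full datasets from earlier runs.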

We don’t have plans to support incremental scraping with Indeed Scraper in the near future, so I’ll be closing this issue for now. I hope the workaround I provided will at least partially meet your needs.

Developer
Maintained by Apify

Actor Metrics

  • 1.3k monthly users

  • 221 bookmarks

  • >99% runs succeeded

  • 5.1 days response time

  • Created in Mar 2023

  • Modified a month ago

Categories