Craigslist Scraper

2 hours trial then $25.00/month - No credit card required now

ivanvs/craigslist-scraper
Extract data from classified advertisements on Craigslist. Scrape contact details from jobs, housing, items wanted, items for sale, services, community service, gigs, events and resumes listed on Craigslist. Download listings data in JSON, XML, Excel, and other versatile

Time out error

Closed

paomontero opened this issue
a year ago

The crawler keeps on timing out.

ivanvs

Hi CrewBloom,

Thank you for contacting me. It seems that they have changed something in the structure of the HTML on some pages. I will take a look.

Thank you for informing me about this issue.

ivanvs

Hi CrewBloom,

I checked the log and did some debugging. In general, the parser is working, and I fixed one minor bug. The problem here is that after some time Craigslist detects that it is being scraped. Have you enabled Automatic proxy, as I did in the picture below?

That should help you with this issue. I have also added some additional configuration so that it is harder for Craigslist to detect that we are scraping it.
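For readers who start runs through the API rather than the web form, the "Automatic proxy" setting mentioned above might be expressed in the run input roughly as follows. This is only a sketch: the field names `urls` and `proxyConfiguration` follow a common Apify convention but are assumptions here, since this thread does not show the actor's input schema, and the search URL is just an example.

```python
import json


def build_run_input(urls, use_apify_proxy=True):
    """Build a run-input dict with automatic proxy turned on.

    Assumed field names: "urls" and "proxyConfiguration" are not confirmed
    by this thread; check the actor's input schema before relying on them.
    """
    return {
        "urls": list(urls),
        "proxyConfiguration": {"useApifyProxy": use_apify_proxy},
    }


if __name__ == "__main__":
    # Example search URL only; replace it with the listing pages you need.
    run_input = build_run_input(["https://newyork.craigslist.org/search/jjj"])
    print(json.dumps(run_input, indent=2))
```

If the field names match the actor's schema, a dictionary like this could be passed as `run_input` when calling the actor with the `apify-client` Python package.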

paomontero-owner

a year ago

Thanks! Will test it out now.

paomontero-owner

a year ago

Hi Ivan,

I am getting this error message when running your actor:

[image: image.png]

And I am not getting any results.

Kindly advise.

Thanks,

ivanvs

Let me check what is happening with this.

ivanvs

Hi Paulo,

I checked the issue. Craigslist has changed the structure of its HTML. I have fixed the issue and tested quite a few URLs, and everything should work now.

Also, I've added an additional field that could be interesting for you: on job posts, the company name is now parsed on pages where it exists.

Thank you for informing me about this issue. Sorry for the inconvenience.
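Because the company name only appears on some job posts, a parser for it has to treat the field as optional. The sketch below illustrates that idea using only Python's standard library. It is not the actor's actual code, and the CSS class `company-name` is a hypothetical placeholder, since the real Craigslist markup is not shown in this thread.

```python
from html.parser import HTMLParser
from typing import Optional

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of the first element carrying a given CSS class."""

    def __init__(self, css_class: str) -> None:
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # > 0 while inside the matched element
        self.done = False   # True once the matched element has closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def parse_company_name(html: str, css_class: str = "company-name") -> Optional[str]:
    """Return the company name, or None when the page does not have one."""
    extractor = ClassTextExtractor(css_class)
    extractor.feed(html)
    extractor.close()
    text = "".join(extractor.parts).strip()
    return text or None
```

Returning `None` instead of raising keeps a run going on posts without a company name, so one missing field does not fail the whole page.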

mr415

a year ago

The description of this actor says contact details are available. Does that mean a direct email address, or something different?

ivanvs

It is the email address that you get when you press the "Reply" button. This information takes a long time to load and is protected by a captcha, so it increases the time needed to scrape a page.

Developer
Maintained by Community

Actor Metrics

  • 14 monthly users

  • 5 stars

  • >99% runs succeeded

  • 6.7 days response time

  • Created in Sep 2022

  • Modified 20 days ago