
Leboncoin Scraper
Deprecated
Pricing
$30.00/month + usage

Extremely fast Scraper that Extracts ads from leboncoin.fr
1
Monthly users
1
Last modified
3 years ago
Retrieve specific ads
Closed
Hello Victor,
I tried to retrieve specific ads using your actor, as discussed in our first conversation. I added an array of startUrls like this:
[
  {"url": "https://www.leboncoin.fr/ventes_immobilieres/2198189945.htm"},
  {"url": "https://www.leboncoin.fr/ventes_immobilieres/2201767813.htm"},
  {"url": "https://www.leboncoin.fr/ventes_immobilieres/2205212373.htm"},
  {"url": "https://www.leboncoin.fr/ventes_immobilieres/2197876906.htm"},
  {"url": "https://www.leboncoin.fr/ventes_immobilieres/2197875868.htm"},
  ...
]
But the runs timed out after 1 hour. You can see them here if you have access:
- 2022-10-10 17:34 - https://console.apify.com/actors/1gq6JJBYQFbM7kbke/runs/itbXMsyZ1xxLazbIc#log
- 2022-10-10 17:34 - https://console.apify.com/actors/1gq6JJBYQFbM7kbke/runs/kILUxgEEGj30qu12o#log
I can lower the number of URLs if that is the issue, but perhaps you have better insight into the cause of the timeouts?
Best regards,
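For readers who want to try the "lower the number of URLs" idea from the message above, here is a minimal Python sketch that wraps ad URLs in the `{"url": ...}` objects shown in the post and splits them into smaller batches, one per run. The `startUrls` key and the URLs come from the post; the helper names `build_start_urls` and `batches`, the batch size of 2, and the assumption that the input needs no other fields are illustrative only.

```python
import json


def build_start_urls(urls):
    """Wrap plain URL strings in the {"url": ...} objects shown in the post."""
    return [{"url": u} for u in urls]


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


ad_urls = [
    "https://www.leboncoin.fr/ventes_immobilieres/2198189945.htm",
    "https://www.leboncoin.fr/ventes_immobilieres/2201767813.htm",
    "https://www.leboncoin.fr/ventes_immobilieres/2205212373.htm",
    "https://www.leboncoin.fr/ventes_immobilieres/2197876906.htm",
    "https://www.leboncoin.fr/ventes_immobilieres/2197875868.htm",
]

# Hypothetical batch size; each printed object would be the input of one run.
for batch in batches(build_start_urls(ad_urls), 2):
    print(json.dumps({"startUrls": batch}))
```

Smaller batches do not fix the underlying retries, but they make it easier to see which URLs a slow run was stuck on.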

Hi Basile, It appears that some of the ads have been disabled. I will issue an update that ignores ads that have been disabled instead of continually retrying.

Hello Basile, I issued the update that skips disabled ads. Please confirm that the scraper works as expected before I close the issue.
Best regards,
BasileDataimo
Hello Victor,
Sorry for the delay. This is better: my last run lasted 17 minutes instead of hitting the 1-hour timeout: https://console.apify.com/actors/1gq6JJBYQFbM7kbke/runs/Qc0rucuXXh3enWALm#log
But I only got 13 results out of the 50 URLs requested. Looking at the logs, I understand that some of the ads were detected as disabled, and some are in an unknown state (timeout). Am I correct?
# Ad disabled example
2022-10-24T17:43:29.839Z ERROR BasicCrawler: [This ad is disabled] https://www.leboncoin.fr/ventes_immobilieres/2226162047.htm [WebSocket is not open] Cette annonce est désactivée
# Ad timeout example
2022-10-24T17:56:00.552Z ERROR BasicCrawler: Request failed and reached maximum retries. requestHandler timed out after 60 seconds (tCx1fKWrrmlNWEg). {"id":"tCx1fKWrrmlNWEg","url":"https://www.leboncoin.fr/ventes_immobilieres/2226153374.htm","method":"GET","uniqueKey":"https://www.leboncoin.fr/ventes_immobilieres/2226153374.htm"}
If I am right, I need to know whether each missing ad is disabled or timed out. Would it be possible for you to add those ads to the results list with a specific status? For example:
[
  {
    "list_id": 2226162047,
    "status": "disabled",
    "url": "https://www.leboncoin.fr/ventes_immobilieres/2226162047.htm"
  },
  {
    "list_id": 2226153374,
    "status": "unknown",
    "url": "https://www.leboncoin.fr/ventes_immobilieres/2226153374.htm"
  }
]
Best regards,
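Until the actor reports statuses itself, the two log patterns quoted above can be told apart on the client side. The Python sketch below is a workaround under stated assumptions, not part of the actor: the matching strings are taken only from the two sample log lines in this thread, other failure messages may exist, and the function name `classify_log_line` is made up for illustration.

```python
import re

# Matches an ad URL such as the ones in the sample log lines.
URL_RE = re.compile(r"https://www\.leboncoin\.fr/\S+?\.htm")


def classify_log_line(line):
    """Return (url, status) for a failed-request line, or None otherwise.

    Assumption: only the two message shapes quoted in the thread are known.
    """
    if "ERROR BasicCrawler" not in line:
        return None
    match = URL_RE.search(line)
    if match is None:
        return None
    if "[This ad is disabled]" in line:
        return match.group(0), "disabled"
    if "timed out" in line:
        # A timeout says nothing about the ad itself, hence "unknown".
        return match.group(0), "unknown"
    return None


sample_lines = [
    "2022-10-24T17:43:29.839Z ERROR BasicCrawler: [This ad is disabled] "
    "https://www.leboncoin.fr/ventes_immobilieres/2226162047.htm "
    "[WebSocket is not open] Cette annonce est désactivée",
    "2022-10-24T17:56:00.552Z ERROR BasicCrawler: Request failed and reached "
    "maximum retries. requestHandler timed out after 60 seconds "
    '(tCx1fKWrrmlNWEg). {"id":"tCx1fKWrrmlNWEg","url":'
    '"https://www.leboncoin.fr/ventes_immobilieres/2226153374.htm"}',
]

for line in sample_lines:
    print(classify_log_line(line))
```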

Hi Basile, you are correct in your assessment. I will work on an update that returns the ads with their status ASAP.
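The requested output can be sketched in Python as follows. This only illustrates the record shape Basile proposed and is not the code of the actual update. It assumes that `list_id` is the number in the ad URL (as in both examples above) and that the caller already knows which URLs were scraped and which were reported as disabled. The function names are hypothetical.

```python
import re

# The numeric ID just before ".htm" at the end of an ad URL.
LIST_ID_RE = re.compile(r"/(\d+)\.htm$")


def list_id_from_url(url):
    """Extract the numeric ad ID from the URL, or None if it is absent."""
    match = LIST_ID_RE.search(url)
    return int(match.group(1)) if match else None


def status_records(requested_urls, scraped_urls, disabled_urls):
    """Build a placeholder record for every requested URL that was not scraped."""
    scraped = set(scraped_urls)
    disabled = set(disabled_urls)
    records = []
    for url in requested_urls:
        if url in scraped:
            continue  # a full result already exists for this ad
        records.append({
            "list_id": list_id_from_url(url),
            "status": "disabled" if url in disabled else "unknown",
            "url": url,
        })
    return records


requested = [
    "https://www.leboncoin.fr/ventes_immobilieres/2226162047.htm",
    "https://www.leboncoin.fr/ventes_immobilieres/2226153374.htm",
]
print(status_records(requested, scraped_urls=[], disabled_urls=requested[:1]))
```

Appending these records to the scraped results gives one entry per requested URL, so missing ads no longer have to be inferred from the log.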
BasileDataimo
You're the best, thanks Victor!

Thanks. Please confirm that the updated actor has fixed this issue. Best regards
BasileDataimo
Hello Victor,
Sorry, I confirm the fix. Thank you!
Pricing
Pricing model
Rental
To use this Actor, you have to pay a monthly rental fee to the developer. The rent is subtracted from your prepaid usage every month after the free trial period. You also pay for the Apify platform usage.
Free trial
3 days
Price
$30.00