Lightweight Facebook Pages Scraper
2 hours trial then $20.00/month - No credit card required now
Scrape detailed Facebook page information efficiently and cost-effectively with our best scraper. Extract valuable data like page names, URLs, contact details, addresses, likes, and followers for competitive analysis, market research, trend monitoring, and social media analysis.
Actor Failed
This also seems very expensive
https://console.apify.com/actors/runs/4xiixFf2P9fH9AEbv#log
We are using Datacentre for the proxy configuration and it cost us $3.304 for only 250 results ?
it also says
REQUESTS 0 of 0 handled
why is it showing zero ?
Hi there, thank you for reaching out. It seems to be an issue with the timeout and container RAM. We have just updated them. For proxies, you can use your own. I can send you a link for proxies that work well and should be cheaper than this. Here is the link https://oxylabs.go2cloud.org/SH3d . Also, regarding the requests, we will look into that and see how we can fix it, but it does not affect any results as you got the 250 results, so all should be good.
The whole point of me using Apify is that I don't have to start buying proxies :) and doing configuration. Is there any way you can fix the cost issue on your side ? Not just for me but for all users ?
I scraped 1 FB profile and it only cost $0.003
https://console.apify.com/actors/runs/ItTUtFLZyBdNrvnnt#output
This works out at around $0.75 for 250, which would seem better
I have 10,000 to do and will have to look elsewhere if it's costing $3.304 for only 250 results
Kind Regards
Scott
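The comparison above can be checked with a quick calculation. This is only a sketch: the function names are hypothetical, and it assumes cost scales linearly with the number of results, which the thread does not confirm.

```python
def cost_per_result(total_cost_usd: float, results: int) -> float:
    """Average cost of a single result for one run."""
    return total_cost_usd / results


def projected_cost(total_cost_usd: float, results: int, target_results: int) -> float:
    """Scale an observed run cost to a larger job, assuming linear pricing."""
    return cost_per_result(total_cost_usd, results) * target_results


# Figures quoted in the thread.
pages_run_10k = projected_cost(3.304, 250, 10_000)   # run that cost $3.304 for 250 results
profile_run_250 = projected_cost(0.003, 1, 250)      # run that cost $0.003 for 1 profile
profile_run_10k = projected_cost(0.003, 1, 10_000)

print(f"Pages scraper, 10,000 results:   ${pages_run_10k:.2f}")
print(f"Profile scraper, 250 results:    ${profile_run_250:.2f}")
print(f"Profile scraper, 10,000 results: ${profile_run_10k:.2f}")
```

Under that linear assumption, the first run projects to roughly $132 for 10,000 URLs against roughly $30 for the second.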
I will try this residential proxy but this does not seem cheap either ! Could you add the option to only select specific fields ? I think this would be a great feature for your users.
You can try the one I sent you and subscribe to the residential proxies. Yes, sure, I will add this as an input, so users will be able to select specific fields from the output.
thank you, I am assuming if I am only scraping certain fields this will reduce the cost right ? If I was you I would still be looking at reducing the cost on your side, this is probably the reason it says 32 monthly users
Not really, but if you would like to get information from certain pages like "About Details" or "Contact and Basic Info," this will reduce costs for sure. You can try other actors and see how much it will cost you to extract data from Facebook. You will see that we are cost-efficient, although you will need to use good proxies.
Just an FYI you are MUCH more expensive than other actors ! Yes, they charge a monthly fee but it is worth paying this fee if you do a lot of URLs. I will try using Oxy but I am onto their support, as it's not working
that did not work
https://console.apify.com/actors/runs/UItboon43m15juuYN#output
and it was not cheap !
Can you confirm the url I should be using ?
I used
http://:@ip.oxylabs.io/location:7777
with my actual username and password !
You should be trying the residential proxies, and the URL should be in this format: http://customer-username-cc-US:pwd@pr.oxylabs.io:7777
Also, please list other competitors that are cheaper so we can investigate further. Our prices should be three times lower than our competitors.
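To avoid typos when filling in that format, the proxy URL can be assembled programmatically. A minimal sketch: `build_proxy_url` is a hypothetical helper, the host, port, and `customer-<username>-cc-<country>` layout are copied from the format quoted above, and percent-encoding the credentials is an assumption about what the provider accepts.

```python
from urllib.parse import quote


def build_proxy_url(username: str, password: str, country: str = "US",
                    host: str = "pr.oxylabs.io", port: int = 7777) -> str:
    """Build a proxy URL in the format quoted in the thread.

    Credentials are percent-encoded so that characters such as '@' or ':'
    in a password do not break the URL structure.
    """
    user = f"customer-{username}-cc-{country}"
    return f"http://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


# Placeholder credentials, not real ones.
print(build_proxy_url("username", "pwd"))
```

The resulting string would go into the actor's custom proxy URL input in place of the URL tried earlier in the thread.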
No this did not work either
I must say this is a painful process
The actor we used is https://console.apify.com/actors/4Hv5RhChiaDk6iwad/input
We ran it again and it was just slightly cheaper than yours for 250 results. Yours is certainly not 3 times cheaper.
Hi again, would you please confirm if our actor is cheaper than others after updating your actor settings and reducing your RAM to 512MB?
Hi, yes, using less RAM it's much, much cheaper, although it does take a long time to run if you have many URLs to scrape. This is not an issue at all, I am happy to wait given the excellent cost of your actor. Thank you so much for your help, fantastic customer support
There is a problem with this. The file I uploaded has fewer than 10,000 Facebook URLs, but it kept running and I had to abort it after it went to nearly 30,000. Can you confirm why this happened and refund me for the difference ?
Hi there,
We are sorry to hear that. Could you please send us the logs so we can investigate further? Also, we want to inform you that we cannot refund you for the difference as this is a public Actor. All profits go to Apify, and we do not charge them for anything.
Thank you.
sure see attached
Could you please send me the Apify logs URL? I can see these last lines in the logs:
2024-07-18T17:47:08.110Z Chunk 1036 processed successfully.
2024-07-18T17:47:08.199Z Processing chunk 1037 of tasks.
So the last one processed is 1037, and for each chunk, we process only 5 requests, so we did not exceed the limits you mentioned. However, I would like to mention that if there are issues with proxies, then we retry up to 3 times.
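The figures in this reply can be worked through as follows. This is a rough sketch: the names are hypothetical, and it assumes "retry up to 3 times" means up to 3 extra attempts per request, which the thread does not spell out.

```python
CHUNK_SIZE = 5      # requests per chunk, as stated above
MAX_RETRIES = 3     # assumed to mean extra attempts after the first one


def requests_in_chunks(chunks: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of distinct requests covered by a given number of chunks."""
    return chunks * chunk_size


def max_attempts(requests: int, retries: int = MAX_RETRIES) -> int:
    """Upper bound on attempts if every request used all of its retries."""
    return requests * (1 + retries)


distinct = requests_in_chunks(1037)   # last chunk number seen in the log
print(f"Distinct requests up to chunk 1037: {distinct}")
print(f"Worst-case attempts with retries:   {max_attempts(distinct)}")
```

This only bounds the number of requests sent. Whether a retried request can also push a second item to the dataset is not something the logs quoted here settle.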
If you look at the run, it seems some urls were run multiple times, this does not seem correct right ?
https://console.apify.com/actors/IPYzJr9bFeFUufskW/runs/h2f1Ja48dfPobCGi8#output
For example
No 1 is
https://www.facebook.com/CalvinKlein
Then No 6 is
https://www.facebook.com/CalvinKlein
No 2 is
https://www.facebook.com/Weleda
Then No 7 is
https://www.facebook.com/Weleda
I have tested this out, and it seems to be working fine. I am wondering about the resurrected parameter, which is set to "yes" and "once" in your run, but in my case it was set to "no". So, I assume that this one has been aborted and then you run it again, so it does query the list of URLs from the beginning again.
Sorry ? if you resurrect it, it should never go from the beginning ! it should always start from where you left off. I am very confused when you mention
"So, I assume that this one has been aborted and then you run it again, so it does query the list of URLs from the beginning again."
If I am not mistaken, the Docker container will be restarted, so the file will be loaded again, and the storage will remain the same. The new results will be pushed to the old storage, which explains why you were getting those 30k results. You can double-check this with the Apify team.
Hey, Zuzka from Apify here, I was asked to help by our support. First let me say, really great customer support, Oussema, thank you! Spr123, great testing abilities! And I do agree with you: with a list of 10k URLs, you should only get 10k results in this case, no matter if you resurrect the run. The whole point of resurrecting a run is for it to, so to speak, start where it left off. I will ask our support to refund you 2/3 of the run cost, if that is OK with you. From the technical point of view, Oussema, does your Actor account for migrations? Check this doc, please: https://docs.apify.com/academy/expert-scraping-with-apify/migrations-maintaining-state. If you have any questions about that, get in touch. I will send you an email.
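The idea of starting where the run left off can be illustrated with a small checkpointing sketch. It is not the actor's actual code: a local JSON file stands in for whatever persistent storage survives a restart, the scraping step is a placeholder, and all names and URLs are hypothetical.

```python
import json
import os
import tempfile


def load_state(path: str) -> dict:
    """Return the saved progress, or a fresh state on the first start."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_index": 0}


def save_state(path: str, state: dict) -> None:
    """Write progress atomically so an abort cannot leave a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(urls: list, state_path: str, results: list, stop_after=None) -> bool:
    """Process URLs from the saved position; return True when the list is done."""
    state = load_state(state_path)
    processed = 0
    for i in range(state["next_index"], len(urls)):
        if stop_after is not None and processed >= stop_after:
            return False                      # simulated abort or migration
        results.append({"url": urls[i]})      # placeholder for the real scrape
        state["next_index"] = i + 1
        save_state(state_path, state)         # checkpoint after every item
        processed += 1
    return True


urls = [f"https://example.com/page{i}" for i in range(10)]
results = []
state_path = os.path.join(tempfile.mkdtemp(), "state.json")

run(urls, state_path, results, stop_after=4)  # first run, aborted after 4 URLs
run(urls, state_path, results)                # resurrected run resumes at URL 5
print(len(results))
```

Because the position is reloaded on restart, the second call continues after the fourth URL instead of reading the input file from the top, so the shared results list ends up with one entry per URL.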
Hi
Many thanks for coming back to me. Yes, I would like a refund of 2/3, but ONLY after the complete run is finished. When the actor is fixed, I will resurrect it again, and when it finishes I will let you know. Can you please refund the 2/3 then ?
Many thanks
Hi again,
We apologize for the inconvenience caused and are glad to hear that you were refunded for the 2/3. We also want to inform you that we have fixed the problem, so you can now resume from where you left off. Everything should be working fine now.
Let me know if you'd like any further adjustments!
Resurrected it again, 6545 completed without duplicates, 9490 in total to be completed. roughly 2945 left to do.
29,710 completed on run, so should hit around 32655 and be completed.
I am seeing another issue with this Actor: lots and lots of pages are not being scraped, for example
So, all good regarding resurrecting? Please send the Apify run URL so we can investigate this further.
Yeah, I see that could be due to a proxy error. We will have to add a queue functionality so you can detect the failed ones and rerun them again. We will let you know when this is implemented, which should be by the middle of next week. Thank you for understanding.
Ok thank you, I assume you will not charge for failed ones going forward ?
I'm not sure about this. You can ask the Apify team, but as far as I understand, you will be charged for those, since we still send the requests.
Is there anything I can do or you can do to reduce these failed requests ? Is it because I am using data centre not residential ?
This is really bad, out of around 6900 completed, 5080 are with this login error. This really needs to be fixed or this actor is completely useless.
We will investigate this further next week, but this should be an issue with proxies.
Hi again,
After further investigation, we noticed that the issue is related to the proxies, which is why you were getting the login error. What we can do is either provide support for signing in by adding cookies to the Apify actor (which would be an input so you wouldn’t need to use proxies), or you would need to use different proxies.
Are you okay with providing a Facebook cookie as input, or would you prefer us to explore other solutions?
"Are you okay with providing a Facebook cookie as input"
Is this cookie related/connected to my FB account ? If so could I be banned from FB for using it ? If so please explore other solutions. Many thanks
Hi again,
We have improved our parser, and after further investigation, we found that the issue is related to the proxies you are using. Please switch to residential proxies, and you should see improvements. Additionally, we have worked on the request handling, so you will now see how many requests were processed. We also added a new column named "Status" that will indicate whether a page is public (which we can fetch), private, or if there was a proxy error.
Please note that for private pages, you need to log in using your own cookies. In the next steps, we plan to integrate cookies to allow parsing private pages, which should help avoid login pages. To avoid having your account banned, you can use another account if needed. We hope these changes help. Please test it out and let us know if everything is working fine.
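With the new "Status" column, the failed pages can be pulled out of a finished run and fed into a second run. A minimal sketch under stated assumptions: the `url` field name and the status value `proxy_error` are hypothetical, since the thread names only the column and the three cases (public, private, proxy error), not the exact strings.

```python
def urls_to_retry(items: list, retry_statuses=("proxy_error",)) -> list:
    """Collect unique URLs whose Status marks them as worth rerunning."""
    seen = set()
    retry = []
    for item in items:
        url = item.get("url")
        if item.get("Status") in retry_statuses and url and url not in seen:
            seen.add(url)
            retry.append(url)
    return retry


# Hypothetical dataset rows; the URLs are ones mentioned in this thread.
items = [
    {"url": "https://www.facebook.com/CalvinKlein", "Status": "public"},
    {"url": "https://www.facebook.com/Weleda", "Status": "proxy_error"},
    {"url": "https://www.facebook.com/ViscoSoft", "Status": "private"},
    {"url": "https://www.facebook.com/Weleda", "Status": "proxy_error"},
]

print(urls_to_retry(items))
```

Private pages are left out of the default retry set here because, per the reply above, they would need cookies rather than another proxy attempt.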
ok thanks, I will test now. and the resurrection issue is resolved as well ?
No sorry, still seeing issues
https://console.apify.com/actors/IPYzJr9bFeFUufskW/runs/pcfYmLEKjYblBVsns#output
For example
https://www.facebook.com/ViscoSoft
Did not pull any details for the above FB account, eg email, phone etc.
We have pushed a new version. Please check again and let us know if the issue persists. Otherwise, we may need to explore other solutions. Please send us the actor run link so we can review the logs.
This is costing me money every time I test this :) can you test it on your side with the file attached to the run ?
https://console.apify.com/actors/IPYzJr9bFeFUufskW/runs/pcfYmLEKjYblBVsns#output
testing it out
From what we have been testing, everything looks good. Please let us know if you encounter any issues or have any feedback that can enhance our parser. Here is an example of the output from the first 40 links: https://api.apify.com/v2/datasets/y3Am0WsaiZjhkiC4k/items?attachment=true&clean=true&format=csv.
Also please send us the actor run link so we can look at the logs thank you
I already sent you the link for the actor !
See below again
https://console.apify.com/actors/IPYzJr9bFeFUufskW/runs/pcfYmLEKjYblBVsns#output
Please rerun it from the beginning and do not resume from a previous run, as you should be using the latest build version.
No, it's still not working properly. Not picking up email
https://console.apify.com/actors/IPYzJr9bFeFUufskW/runs/Ja1yDAoDBHZgYq0WC#output
Hi there,
After testing, we have fixed the issues and everything should be working well now. Please make sure you use the residential proxies, as they are better with the new flow. Please double-check everything and let us know if all is good. We appreciate your feedback.
This is still not working
https://console.apify.com/actors/runs/FLE3KcaCrbNksYb0l#output
https://www.facebook.com/noemieworld
Not pulling phone number.
Hi again,
Thank you for bringing this to our attention. We have made a few enhancements, and it should be working correctly now. We're sorry about the earlier issue.
Please let us know if you identify any other problems so we can quickly jump in and resolve them. We always welcome your feedback and any suggestions for improvements or new features you'd like to see implemented.
- 14 monthly users
- 8 stars
- 98.7% runs succeeded
- 1.5 hours response time
- Created in May 2024
- Modified 13 days ago