Lightweight Facebook Pages Scraper

2 hours trial then $20.00/month - No credit card required now

oussemafr/lightweight-facebook-pages-scraper
Scrape detailed Facebook page information efficiently and cost-effectively with our best scraper. Extract valuable data like page names, URLs, contact details, addresses, likes, and followers for competitive analysis, market research, trend monitoring, and social media analysis.

Actor Failed

Closed

spr123 opened this issue
4 months ago

spr123

4 months ago

This also seems very expensive

https://console.apify.com/actors/runs/4xiixFf2P9fH9AEbv#log

We are using Datacentre for the proxy configuration and it cost us $3.304 for only 250 results?

it also says

REQUESTS 0 of 0 handled

why is it showing zero ?

oussemaFr

Hi there, thank you for reaching out. It seems to be an issue with the timeout and container RAM. We have just updated them. For proxies, you can use your own. I can send you a link for proxies that work well and should be cheaper than this. Here is the link https://oxylabs.go2cloud.org/SH3d . Also, regarding the requests, we will look into that and see how we can fix it, but it does not affect any results as you got the 250 results, so all should be good.

spr123

4 months ago

The whole point of me using Apify is that I don't have to start buying proxies :) and doing configuration. Is there any way you can fix the cost issue on your side ? Not just for me but for all users ?

I scraped 1 FB profile and it only cost $0.003

https://console.apify.com/actors/runs/ItTUtFLZyBdNrvnnt#output

This works out at around $0.75 for 250, which would seem better

I have 10,000 to do and will have to look elsewhere if it's costing $3.304 for only 250 results

Kind Regards

Scott
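
For reference, the per-result arithmetic behind the figures in this post can be sketched in Python. The two run costs and result counts are the ones quoted above; scaling them linearly to 10,000 URLs is an assumption, since real cost also depends on proxy traffic and run time.

```python
def cost_per_result(total_cost_usd: float, results: int) -> float:
    """Average cost of one result for a finished run."""
    return total_cost_usd / results


def projected_cost(total_cost_usd: float, results: int, target_results: int) -> float:
    """Scale a run's cost linearly to a larger job (an assumption, not a guarantee)."""
    return cost_per_result(total_cost_usd, results) * target_results


# Figures quoted in the thread.
pages_run = projected_cost(3.304, 250, 10_000)  # pages scraper run
profile_run = projected_cost(0.003, 1, 10_000)  # single-profile run

print(f"Pages scraper: ${pages_run:.2f} per 10,000 results")
print(f"Profile run:   ${profile_run:.2f} per 10,000 results")
```

Under that linear assumption, the gap between the two runs grows from a couple of dollars at 250 results to roughly a hundred dollars at 10,000.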

spr123

4 months ago

I will try this residential proxy but this does not seem cheap either ! Could you add the option to only select specific fields ? I think this would be a great feature for your users.

oussemaFr

You can try the one I sent you and subscribe to the residential proxies. Yes, sure, I will add this as an input, so users will be able to select specific fields from the output.

spr123

4 months ago

Thank you. I am assuming that if I am only scraping certain fields this will reduce the cost, right? If I were you I would still be looking at reducing the cost on your side; this is probably the reason it says 32 monthly users.

oussemaFr

Not really, but if you would like to get information from certain pages like "About Details" or "Contact and Basic Info," this will reduce costs for sure. You can try other actors and see how much it will cost you to extract data from Facebook. You will see that we are cost-efficient, although you will need to use good proxies.

spr123

4 months ago

Just an FYI, you are MUCH more expensive than other actors! Yes, they charge a monthly fee, but it is worth paying this fee if you do a lot of URLs. I will try using Oxy, but I am onto their support as it's not working.

spr123

4 months ago

that did not work

https://console.apify.com/actors/runs/UItboon43m15juuYN#output

and it was not cheap !

Can you confirm the url I should be using ?

I used

http://:@ip.oxylabs.io/location:7777

with my actual username and password !

oussemaFr

You should be trying the residential proxies, and the URL should be in this format: http://customer-username-cc-US:pwd@pr.oxylabs.io:7777
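
A small helper can assemble a URL in the format quoted above and avoid the kind of malformed address that failed earlier in the thread. This is only a sketch: the `customer-<username>-cc-<country>` pattern, host and port are copied from the reply, not checked against the provider's documentation, and `build_proxy_url` is a hypothetical name.

```python
from urllib.parse import quote


def build_proxy_url(username: str, password: str, country: str = "US",
                    host: str = "pr.oxylabs.io", port: int = 7777) -> str:
    """Assemble a proxy URL in the format quoted in the reply above."""
    # Percent-encode the credentials so characters such as '@' or ':'
    # cannot break the URL structure.
    user = quote(f"customer-{username}-cc-{country}", safe="")
    pwd = quote(password, safe="")
    return f"http://{user}:{pwd}@{host}:{port}"


# Placeholder credentials, not real ones.
print(build_proxy_url("USERNAME", "p@ss:word"))
```

Encoding the password matters when it contains reserved characters, because an unescaped `@` or `:` would change where the host part of the URL begins.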

oussemaFr

Also, please list other competitors that are cheaper so we can investigate further. Our prices should be three times lower than our competitors.

spr123

4 months ago

No this did not work either

I must say this is a painful process

The actor we used is https://console.apify.com/actors/4Hv5RhChiaDk6iwad/input

We ran it again and it was just slightly cheaper than yours for 250 results. Yours is certainly not 3 times cheaper.

oussemaFr

Hi again, would you please confirm if our actor is cheaper than others after updating your actor settings and reducing your RAM to 512MB?

spr123

4 months ago

Hi, yes, using less RAM it's much, much cheaper, although it does take a long time to run if you have many URLs to scrape. But this is not an issue at all; I am happy to wait given the excellent cost of your actor. Thank you so much for your help, fantastic customer support.
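
One reason lowering memory helps: Apify meters platform usage in compute units, where one compute unit corresponds to 1 GB of memory used for one hour. The sketch below illustrates the trade-off between memory and run time. The memory sizes, run times and price per compute unit are made-up illustrative values, not figures from the runs in this thread, and proxy traffic is billed separately.

```python
def compute_units(memory_mb: int, runtime_hours: float) -> float:
    """Compute units = memory in GB multiplied by run time in hours."""
    return (memory_mb / 1024) * runtime_hours


def run_cost(memory_mb: int, runtime_hours: float, usd_per_cu: float) -> float:
    """Compute-only cost of a run; proxy and storage costs are not included."""
    return compute_units(memory_mb, runtime_hours) * usd_per_cu


USD_PER_CU = 0.40  # hypothetical price, the real one depends on the plan

large = run_cost(4096, 1.0, USD_PER_CU)  # more memory, shorter run
small = run_cost(512, 3.0, USD_PER_CU)   # less memory, longer run

print(f"4096 MB for 1 h: ${large:.2f}")
print(f" 512 MB for 3 h: ${small:.2f}")
```

Even if the smaller container takes several times longer, it can still come out cheaper, which matches the experience described in the post above.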

spr123

4 months ago

There is a problem with this: the file I uploaded has fewer than 10,000 Facebook URLs, but the run kept going and I had to abort it. It went to nearly 30,000. Can you confirm why this happened and refund me for the difference?

oussemaFr

Hi there,

We are sorry to hear that. Could you please send us the logs so we can investigate further? Also, we want to inform you that we cannot refund you for the difference as this is a public Actor. All profits go to Apify, and we do not charge them for anything.

Thank you.

spr123

4 months ago

sure see attached

oussemaFr

Could you please send me the Apify logs URL? I can see these last lines in the logs:

2024-07-18T17:47:08.110Z Chunk 1036 processed successfully.
2024-07-18T17:47:08.199Z Processing chunk 1037 of tasks.

So the last one processed is 1037, and for each chunk, we process only 5 requests, so we did not exceed the limits you mentioned. However, I would like to mention that if there are issues with proxies, then we retry up to 3 times.
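
The chunk arithmetic in this reply can be checked with a few lines of Python. The chunk count, chunk size and retry limit are taken from the post; treating "retry up to 3 times" as three attempts in addition to the first one is an assumption.

```python
def requests_processed(chunks: int, chunk_size: int) -> int:
    """Number of input requests covered by the processed chunks."""
    return chunks * chunk_size


def max_attempts(requests: int, retries: int) -> int:
    """Upper bound on attempts if every request fails and uses all its retries."""
    return requests * (1 + retries)


processed = requests_processed(chunks=1037, chunk_size=5)
worst_case = max_attempts(processed, retries=3)

print(f"Requests covered by 1037 chunks: {processed}")
print(f"Worst-case attempts with retries: {worst_case}")
```

So 1037 chunks cover 5,185 input URLs, which is below the size of the uploaded list, while retries alone could raise the number of attempts well above it.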

spr123

4 months ago

If you look at the run, it seems some urls were run multiple times, this does not seem correct right ?

https://console.apify.com/actors/IPYzJr9bFeFUufskW/runs/h2f1Ja48dfPobCGi8#output

For example

No 1 is

https://www.facebook.com/CalvinKlein

Then No 6 is

https://www.facebook.com/CalvinKlein

No 2 is

https://www.facebook.com/Weleda

Then No 7 is

https://www.facebook.com/Weleda
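
A quick way to confirm this kind of repetition in an exported dataset is to count how often each URL occurs. The sketch below uses a short hand-written list in place of the real export; lowercasing and stripping a trailing slash before comparing is an assumption about which URLs should count as the same page.

```python
from collections import Counter


def find_duplicates(urls):
    """Return {url: count} for URLs that occur more than once."""
    # Normalize lightly so trivially different spellings compare equal.
    normalized = [u.strip().rstrip("/").lower() for u in urls]
    counts = Counter(normalized)
    return {url: n for url, n in counts.items() if n > 1}


# Stand-in for the "url" column of an exported dataset.
sample = [
    "https://www.facebook.com/CalvinKlein",
    "https://www.facebook.com/Weleda",
    "https://www.facebook.com/ViscoSoft",
    "https://www.facebook.com/CalvinKlein",
    "https://www.facebook.com/Weleda/",
]

for url, n in find_duplicates(sample).items():
    print(f"{url} appears {n} times")
```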

oussemaFr

I have tested this out, and it seems to be working fine. I am wondering about the resurrected parameter, which on your run is set to ‘yes’ and ‘once’, whereas in my case it was set to ‘no’. So, I assume that this one has been aborted and then you run it again, so it does query the list of URLs from the beginning again.

spr123

4 months ago

Sorry ? if you resurrect it, it should never go from the beginning ! it should always start from where you left off. I am very confused when you mention

"So, I assume that this one has been aborted and then you run it again, so it does query the list of URLs from the beginning again."

oussemaFr

If I am not mistaken, the Docker container will be restarted, so the file will be loaded again, and the storage will remain the same. The new results will be pushed to the old storage, which explains why you were getting those 30k results. You can double-check this with the Apify team.

zuzka

Hey, Zuzka from Apify here. I was asked to help by our support. First, let me say: really great customer support, Oussema, thank you! Spr123, great testing abilities! And I do agree with you: with a list of 10k URLs, you should only get 10k results in this case, no matter if you resurrect the run. The whole point of resurrecting a run is for it to, so to speak, start where it left off. I will ask our support to refund you 2/3 of the run cost, if that is OK with you. From the technical point of view, Oussema, does your Actor account for migrations? Check this doc, please: https://docs.apify.com/academy/expert-scraping-with-apify/migrations-maintaining-state. If you have any questions about that, get in touch. I will send you an email.
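
The idea of keeping state so that a restarted run continues where it left off can be sketched as follows. This is not the Actor's code: a local JSON file stands in for the platform's persistent key-value store, and `load_done`, `process_urls`, `state_path` and `stop_after` are hypothetical names used only for illustration.

```python
import json
import os
import tempfile


def load_done(state_path):
    """Return the set of URLs already handled, or an empty set on first start."""
    if os.path.exists(state_path):
        with open(state_path, encoding="utf-8") as fh:
            return set(json.load(fh))
    return set()


def process_urls(urls, state_path, handle, stop_after=None):
    """Handle each URL at most once across restarts.

    stop_after simulates an abort after that many newly handled URLs.
    """
    done = load_done(state_path)
    handled_now = 0
    for url in urls:
        if url in done:
            continue  # finished in an earlier run, skip it
        if stop_after is not None and handled_now >= stop_after:
            break
        handle(url)
        done.add(url)
        handled_now += 1
        # Persist after every URL so a restart repeats at most the item in flight.
        with open(state_path, "w", encoding="utf-8") as fh:
            json.dump(sorted(done), fh)


urls = [f"https://www.facebook.com/page{i}" for i in range(5)]
results = []
with tempfile.TemporaryDirectory() as tmp:
    state = os.path.join(tmp, "state.json")
    process_urls(urls, state, results.append, stop_after=2)  # first run, aborted
    process_urls(urls, state, results.append)                # resurrected run

print(len(results), len(set(results)))
```

Because the second call reads the saved state before touching the input list, the five URLs end up handled exactly once in total, rather than the list being replayed from the start into the same dataset.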

spr123

4 months ago

Hi

Many thanks for coming back to me, yes I would like a refund of 2/3 but ONLY after the complete run is finished, when the actor is fixed, I will resurrect it again, when it finishes I will let you know and then can you please refund the 2/3

Many thanks

oussemaFr

Hi again,

We apologize for the inconvenience caused and are glad to hear that you were refunded for the 2/3. We also want to inform you that we have fixed the problem, so you can now resume from where you left off. Everything should be working fine now.

Let me know if you'd like any further adjustments!

spr123

4 months ago

Resurrected it again, 6545 completed without duplicates, 9490 in total to be completed. roughly 2945 left to do.

29,710 completed on run, so should hit around 32655 and be completed.

spr123

4 months ago

I am seeing another issue with this Actor: lots and lots of pages are not being scraped, for example

https://www.facebook.com/ShapesSecrets

oussemaFr

So, all good regarding resurrecting? Please send the Apify run URL so we can investigate this further.

oussemaFr

Yeah, I see that could be due to a proxy error. We will have to add a queue functionality so you can detect the failed ones and rerun them again. We will let you know when this is implemented, which should be by the middle of next week. Thank you for understanding.

spr123

4 months ago

Ok thank you, I assume you will not charge for failed ones going forward ?

oussemaFr

I'm not sure about this. You can ask the Apify team, but in principle you will be charged for those, as we still attempt to send the requests.

spr123

4 months ago

Is there anything I can do or you can do to reduce these failed requests? Is it because I am using data centre, not residential?

spr123

4 months ago

This is really bad, out of around 6900 completed, 5080 are with this login error. This really needs to be fixed or this actor is completely useless.

oussemaFr

We will investigate this further next week, but this should be an issue with proxies.

oussemaFr

Hi again,

After further investigation, we noticed that the issue is related to the proxies, which is why you were getting the login error. What we can do is either provide support for signing in by adding cookies to the Apify actor (which would be an input so you wouldn’t need to use proxies), or you would need to use different proxies.

Are you okay with providing a Facebook cookie as input, or would you prefer us to explore other solutions?

spr123

4 months ago

"Are you okay with providing a Facebook cookie as input"

Is this cookie related/connected to my FB account ? If so could I be banned from FB for using it ? If so please explore other solutions. Many thanks

oussemaFr

Hi again,

We have improved our parser, and after further investigation, we found that the issue is related to the proxies you are using. Please switch to residential proxies, and you should see improvements. Additionally, we have worked on the request handling, so you will now see how many requests were processed. We also added a new column named "Status" that will indicate whether a page is public (which we can fetch), private, or if there was a proxy error.

Please note that for private pages, you need to log in using your own cookies. In the next steps, we plan to integrate cookies to allow parsing private pages, which should help avoid login pages. To avoid having your account banned, you can use another account if needed. We hope these changes help. Please test it out and let us know if everything is working fine.
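
With a "Status" column in the output, the rows that need another attempt can be separated from the ones that succeeded. The sketch below shows one way to do that. The exact status strings ("public", "private", "proxy error"), the "url" key and the sample rows are assumptions made for illustration; check them against a real export before relying on them.

```python
from collections import defaultdict


def group_by_status(rows):
    """Group result URLs by the value of their "Status" column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.get("Status", "unknown")].append(row["url"])
    return dict(groups)


# Invented sample rows, not actual results from the Actor.
rows = [
    {"url": "https://www.facebook.com/example-page-1", "Status": "public"},
    {"url": "https://www.facebook.com/example-page-2", "Status": "proxy error"},
    {"url": "https://www.facebook.com/example-page-3", "Status": "private"},
    {"url": "https://www.facebook.com/example-page-4", "Status": "proxy error"},
]

groups = group_by_status(rows)
retry_urls = groups.get("proxy error", [])  # candidates for a rerun

print(f"{len(retry_urls)} of {len(rows)} rows need a retry")
```

Feeding only `retry_urls` into a follow-up run avoids paying again for pages that were already fetched successfully.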

spr123

4 months ago

ok thanks, I will test now. and the resurrection issue is resolved as well ?

spr123

4 months ago

No sorry, still seeing issues

https://console.apify.com/actors/IPYzJr9bFeFUufskW/runs/pcfYmLEKjYblBVsns#output

For example

https://www.facebook.com/ViscoSoft

Did not pull any details for the above FB account, eg email, phone etc.

oussemaFr

We have pushed a new version. Please check again and let us know if the issue persists. Otherwise, we may need to explore other solutions. Please send us the actor run link so we can review the logs.

spr123

4 months ago

This is costing me money every time I test this :) can you test it on your side with the file attached to the run ?

https://console.apify.com/actors/IPYzJr9bFeFUufskW/runs/pcfYmLEKjYblBVsns#output

oussemaFr

testing it out

oussemaFr

From what we have been testing, everything looks good. Please let us know if you encounter any issues or have any feedback that can enhance our parser. Here is an example of the output from the first 40 links: https://api.apify.com/v2/datasets/y3Am0WsaiZjhkiC4k/items?attachment=true&clean=true&format=csv.

Also, please send us the actor run link so we can look at the logs. Thank you.

spr123

4 months ago

I already sent you the link for the actor !

See below again

https://console.apify.com/actors/IPYzJr9bFeFUufskW/runs/pcfYmLEKjYblBVsns#output

oussemaFr

Please rerun it from the beginning and do not resume from a previous run, as you should be using the latest build version.

oussemaFr

Hi there,

After testing, we have fixed the issues and everything should be working well now. Please make sure you use the residential proxies, as they are better with the new flow. Please double-check everything and let us know if all is good. We appreciate your feedback.

spr123

3 months ago

oussemaFr

Hi again,

Thank you for bringing this to our attention. We have made a few enhancements, and it should be working correctly now. We're sorry about the earlier issue.

Please let us know if you identify any other problems so we can quickly jump in and resolve them. We always welcome your feedback and any suggestions for improvements or new features you'd like to see implemented.

Developer
Maintained by Community
Actor metrics
  • 14 monthly users
  • 8 stars
  • 98.7% runs succeeded
  • 1.5 hours response time
  • Created in May 2024
  • Modified 13 days ago