
Facebook Marketplace Scraper
Extract data from Facebook Marketplace listings and export to CSV, JSON, or use through a powerful API.
4.0 (5)
Pricing
$25.00/month + usage
Total users: 229
Monthly users: 58
Runs succeeded: 96%
Issues response: 23 hours
Last modified: 6 days ago
Duplicates Question
Closed
When using "Deduplicate across runs" with a low maxItems (e.g., 10) for frequent checking, could the Actor stop the run early if it detects that all 10 results are duplicates, instead of skipping them? Currently it seems to skip all the duplicates and return the listings that come after, but the Facebook URL I'm using is sorted by newest, so I only need to detect whether any of the first few listings at the top are new. If the first 10 come back as duplicates, I know that no new listings have arrived.
This would help optimize it greatly: it wouldn't have to keep scraping the same info over and over, and it wouldn't skip ahead to the next batch of items even though they aren't the newest listings. Thanks!
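
For reference, a minimal sketch of the early-stop check being requested, assuming the scraper keeps a set of listing IDs seen in previous runs (the names here are illustrative, not the actor's actual internals):

```typescript
// Hypothetical shape for illustration only; not the actor's real code.
interface Listing {
  id: string;
  url: string;
}

// With results sorted newest-first, a first page made up entirely of
// already-seen IDs means no new listings have arrived, so the run can
// stop instead of paginating on to older items.
function allDuplicates(page: Listing[], seenIds: Set<string>): boolean {
  return page.length > 0 && page.every((listing) => seenIds.has(listing.id));
}
```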

Hello, we will take this into consideration in the next release.

Hello, I'm back from a break. What about a parameter stop_on_first_page_all_duplicates? Would that resolve your issue?
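
For illustration, a run input using such a flag might look like the sketch below; apart from the proposed flag itself, the field names are assumptions rather than the actor's documented input schema:

```typescript
// Hypothetical actor input. Only stop_on_first_page_all_duplicates is
// the name proposed above; the other field names are assumed.
const input = {
  startUrls: [
    {
      url: "https://www.facebook.com/marketplace/prague/vehicles/?sortBy=creation_time_descend&exact=true",
    },
  ],
  maxItems: 10,
  deduplicateAcrossRuns: true,
  stop_on_first_page_all_duplicates: true,
};
```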
futurafree
Hello, that should work. So if all the items I'm searching for (let's say the first 10 listings) are duplicates, the task will stop and return nothing, correct? If so, that's great.
Now, if there are, let's say, 6 duplicate listings and the other 4 are new ones, are all 10 listings still returned? Or will the user only be charged for the 4 new listings, not the 6 duplicates, before the task stops?

You're absolutely right.
Just to clarify, when I mention the cost per item and per page, I'm referring to proxy usage cost.
So, if all 10 listings (6 duplicates + 4 new) are on the same page, you'll only incur proxy costs for: 1 page crawl + 4 new item detail requests.
Since the 6 duplicates are already known, we skip their detail requests, saving proxy usage and reducing the overall cost.
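
As a back-of-the-envelope illustration of that accounting (the unit costs below are placeholders, not real proxy prices):

```typescript
// Placeholder unit costs; actual proxy pricing depends on your plan.
const PAGE_CRAWL_COST = 1; // one listing-page request
const ITEM_DETAIL_COST = 1; // one item-detail request

// Duplicates are filtered out before their detail pages are fetched,
// so only new items incur the detail-request cost.
function proxyCost(pagesCrawled: number, newItems: number): number {
  return pagesCrawled * PAGE_CRAWL_COST + newItems * ITEM_DETAIL_COST;
}

// The example above: one page holding 10 listings, 6 already known.
console.log(proxyCost(1, 4)); // 5 units: 1 page crawl + 4 detail requests
```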
futurafree
Great, okay. That will also save reads and writes, as storage has been one of my biggest costs. Thanks a lot, that sounds perfect!

Do you have any statistics about your costs (especially for storage)? Could you share them with me, please? datavoyant -> gmail

The feature is now available! When you get a moment, could you please leave a star rating for the actor? Your feedback really helps—thanks!
2025-06-05T00:15:14.179Z [apify] INFO ▶️ Start scraping: https://www.facebook.com/marketplace/prague/vehicles/?sortBy=creation_time_descend&exact=true
2025-06-05T00:15:19.598Z [apify] INFO ♻️ skipped 7 duplicate items
2025-06-05T00:15:19.723Z [apify] INFO ♻️ skipped 8 duplicate items
2025-06-05T00:15:19.725Z [apify] INFO ♻️ skipped 8 duplicate items
2025-06-05T00:15:21.518Z [apify] INFO ♻️ skipped 8 duplicate items
2025-06-05T00:15:21.520Z [apify] INFO ♻️ skipped 8 duplicate items
2025-06-05T00:15:21.522Z [apify] INFO ♻️ skipped 8 duplicate items
2025-06-05T00:15:21.657Z [apify] INFO 📄 page 2 → new: 0, total: 1
2025-06-05T00:15:21.760Z [apify] INFO 🛑 All items on this page are duplicates. Stopping early.
2025-06-05T00:15:21.762Z [apify] INFO ✅ Finished https://www.facebook.com/marketplace/prague/vehicles/?sortBy=creation_time_descend&exact=true (total items: 1)
2025-06-05T00:15:21.764Z [apify] INFO ✅ All URLs processed; exiting.