Bbb.org Business Search Scraper

3 days trial then $20.00/month - No credit card required now

ecomscrape/bbb-business-search-scraper

The BBB.org Business Search Scraper extracts business listings from BBB.org using search query URLs. The output includes business name, address, categories, rating, and more, making it ideal for market research, competitor analysis, and lead generation. Perfect for data-driven business insights.

Let run for 21min, No results

Open
Calldrive_Team opened this issue
7 days ago

I let the cost get to $5 and let it run for 21 minutes on the initial test, and received zero results even though the log says it's scraping items. The URL matched the example shown on the scraper page.

ecomscrape

Hi, thanks for your feedback. I reviewed your run and noticed some mistakes.

  1. You don’t need to include every page URL of a search query in the input; it’s redundant. Instead, provide a single URL and set max_items_per_url to 100. The actor will then retrieve up to 100 items starting from that URL, and stops earlier only if it reaches the end of the search query results.
    {
      "max_items_per_url": 100,
      "max_retries_per_url": 2,
      "proxy": {
        "useApifyProxy": false
      },
      "urls": [
        "https://www.bbb.org/search?find_country=USA&find_entity=60858-200&find_id=5386_20800-500-200&find_text=Tax+Negotiators&find_type=Category&page=1"
      ]
    }

You can check my example at this link: https://console.apify.com/view/runs/QtHNDvwEIU5DLUcx8. In this run, the actor started from page 1 and scraped up to 100 items before exiting.

If you don’t set a value for max_items_per_url, the actor will retrieve all items from the given URL, starting from its position (the current page) and continuing to the end of the search query results. For example, if you set the url value to:

"https://www.bbb.org/search?find_country=USA&find_entity=60858-200&find_id=5386_20800-500-200&find_text=Tax+Negotiators&find_type=Category&page=2"

and leave max_items_per_url blank, the actor will extract all data from page 2 onward.
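As a rough illustration of the point above, here is a minimal Python sketch that collapses a list of page URLs for the same search query into a single start URL and builds the run input. The helper name build_run_input is hypothetical and not part of the actor; only the input keys (urls, max_items_per_url, max_retries_per_url, proxy) come from the example input in this thread.

```python
from urllib.parse import parse_qsl, urlsplit


def build_run_input(urls, max_items_per_url=100):
    """Keep only the lowest-numbered page URL for each distinct search query.

    Hypothetical helper: it only prepares the JSON input, it does not
    start an actor run.
    """
    first_page = {}  # query without "page" -> (page number, original URL)
    for url in urls:
        parts = urlsplit(url)
        params = parse_qsl(parts.query, keep_blank_values=True)
        page = int(dict(params).get("page", 1))
        # Two URLs belong to the same query if everything except "page" matches.
        key = (
            parts.scheme,
            parts.netloc,
            parts.path,
            tuple(sorted((k, v) for k, v in params if k != "page")),
        )
        if key not in first_page or page < first_page[key][0]:
            first_page[key] = (page, url)
    return {
        "max_items_per_url": max_items_per_url,
        "max_retries_per_url": 2,
        "proxy": {"useApifyProxy": False},
        "urls": [url for _, url in first_page.values()],
    }
```

Passing pages 1, 2 and 3 of one query yields an input with only the page 1 URL, which matches the single-URL input shown earlier.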

ecomscrape
  2. The Apify network sometimes has bandwidth issues, which can cause very long delays before a request receives a response. Currently, this actor waits for each response before proceeding with its logic. I can set the timeout to 10 seconds per request, but this won’t guarantee that you’ll receive results for every run.

You can see three examples—three cases of the same input but with different proxy options:

My advice:

  • Try using a proxy (datacenter proxy, residential proxy, or your own proxy). You can rent an external service and provide the proxy URL as input to reduce waiting time.
  • Divide large requests into smaller sub-requests. For example, if you need to retrieve 400 items from page 2 to the end, you can create an actor run for page 2 with 200 items and another run for page n with the remaining 200 items.
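The second suggestion can be sketched in Python as below. This is only an illustration under assumptions: the helper names with_page and split_run_inputs are hypothetical, and items_per_page is a value you would have to check on the site yourself, since it is not stated in this thread. The sketch uses it only to work out which page each sub-run should start from.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_page(url, page):
    """Return the same search URL with its "page" parameter set to `page`."""
    parts = urlsplit(url)
    params = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != "page"
    ]
    params.append(("page", str(page)))
    return urlunsplit(parts._replace(query=urlencode(params)))


def split_run_inputs(url, start_page, total_items, items_per_run, items_per_page):
    """Build one run input per chunk of `items_per_run` items.

    `items_per_page` is an assumption supplied by the caller. Each chunk must
    cover whole pages, otherwise consecutive runs would overlap.
    """
    if items_per_run % items_per_page != 0:
        raise ValueError("items_per_run must be a multiple of items_per_page")
    inputs = []
    done = 0
    while done < total_items:
        page = start_page + done // items_per_page
        count = min(items_per_run, total_items - done)
        inputs.append({"urls": [with_page(url, page)], "max_items_per_url": count})
        done += count
    return inputs
```

For the 400-item case above, calling split_run_inputs(url, 2, 400, 200, items_per_page) returns two inputs: one starting at page 2 and one starting at the later page where the first run is expected to stop.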
ecomscrape

Hi, I've just done some optimization and significantly reduced the cost for this actor.

ecomscrape

If you haven't solved your problem yet, please let me know.

Calldrive_Team

5 days ago

Hey. So I guess it just had a behavior I wasn't familiar with. Instead of continually showing the results as they are collected, it wouldn't show the results until the end of the scrape. So I ran it again and just let it finish, and then all of the results populated at the end.

ecomscrape

This is to minimize costs. Apify calculates the final cost based on many factors, and I want to keep data transfer, dataset read and write, and timed storage costs as low as possible. All data collected during a run is stored in a variable and returned at the end, no matter what happens (including raised errors). But if you want data to be populated during the actor's run, I'll consider adding a new flag to the input.
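The behavior described here (buffer everything, write once at the end, even on errors) can be illustrated with a small Python sketch. It is not the actor's actual code: run_scrape, pages, and push_data are hypothetical names, with push_data standing in for whatever call writes items to the dataset.

```python
def run_scrape(pages, push_data):
    """Collect items in memory and write them once, even if a page fails.

    `pages` is an iterable of zero-argument callables that each return a list
    of items; `push_data` is a callable that receives the full list.
    Both are placeholders for illustration.
    """
    buffer = []
    try:
        for fetch_page in pages:
            buffer.extend(fetch_page())
    finally:
        # Runs on normal exit and when an exception is propagating, so the
        # items gathered before a failure are still written out.
        push_data(buffer)
    return len(buffer)
```

With this pattern a single write happens per run, which is why no results appear in the dataset until the run finishes or fails.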

Calldrive_Team

4 days ago

Ok. Makes sense. It might be a cool option to have.

Developer
Maintained by Community

Actor Metrics

  • 7 monthly users

  • 2 bookmarks

  • 98% runs succeeded

  • 1.2 days response time

  • Created in Jan 2025

  • Modified 3 days ago