Pricing

$20.00 / 1,000 results

Findify Best

Developed by

selçuk güney

🔍 AI-powered e-commerce scraper that extracts detailed product data from any online store. Uses LLMs (Mistral/Gemini) for intelligent extraction, handles pagination, variants & CAPTCHAs. Perfect for price monitoring, market research & competitive analysis. #webscraping #ecommerce

Issues response

40 days

Last modified

4 months ago

E-commerce

Developer tools

Findify.best - AI-Powered E-Commerce Data Solution

Version: 1.1

Findify.best is an AI-powered Apify Actor that automatically extracts product data from e-commerce sites. It uses language models such as Mistral AI and Google Gemini to collect product name, price, description, SKU, brand, and more from any e-commerce site, including popular sites like Amazon, Trendyol, and Hepsiburada.

Why Choose Findify.best?

✅ Data Extraction from Any E-Commerce Site: Collect data from ANYWHERE you want, from a single product page to entire category pages.

✅ AI-Powered Solution: No matter the site structure, our AI technology finds and extracts the right data.

✅ Automatic Pagination: Automatically detects and follows "Next Page" links on category pages.

✅ Variant Detection: Automatically extracts product variants like color, size, and model in a structured format.

✅ Bot Protection Bypass: Works even on sites with strong bot protection like Amazon, thanks to Playwright integration.

✅ CAPTCHA Detection and Bypass: Automatically detects CAPTCHA barriers and tries to bypass them with proxy rotation.

✅ Proxy Support: Overcomes geographical restrictions and blocks with Apify Proxy (Datacenter, Residential).

✅ Customizable Output: You decide which data fields you want to extract.

✅ Secure API Key Management: API keys are included, no extra configuration needed.

✅ Robust Error Handling: Works continuously with automatic retry and backup mechanisms.

Who Is It Ideal For?

🔹 E-Commerce Businesses: For competitor analysis and price tracking

🔹 Market Researchers: For collecting market trends and product data

🔹 Price Tracking Services: For automatic price monitoring solutions

🔹 Data Analysts: For creating e-commerce datasets

🔹 Marketing Specialists: For product information and lead generation

How to Use?

Input Settings

startUrls: List of URLs to scan. Can be product pages or category pages.
targetDataFields: Data fields to extract. Options:
- productName
- price
- currency
- description
- brand
- imageUrls
- availability
- variants
- ratingValue
- reviewCount
- sku
- categoryPath
- specifications
enablePagination: When enabled, follows pagination links on category pages.
usePlaywright: When enabled, loads pages in a Playwright-driven browser. Recommended for sites with strong bot protection like Amazon.
llmProvider: AI model to use:
- Mistral: Uses Mistral AI API.
- Gemini: Uses Google Gemini API.
- Auto (Default): Tries Mistral first, switches to Gemini if unsuccessful.
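maxConcurrency: Maximum number of pages to process in parallel.
useApifyProxy: When enabled, routes requests through Apify Proxy.
proxyGroups: Apify Proxy groups to use, for example ["RESIDENTIAL"].
captchaMaxAttempts: Number of attempts made when a CAPTCHA is detected. Raising it can help with CAPTCHA issues.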

Note: Mistral and Gemini API keys are included, no extra configuration needed.
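
The settings above can be assembled and checked before a run. The following is a minimal sketch, not part of the actor: `build_input` is a hypothetical helper, and the exact string values accepted for llmProvider ("Mistral", "Gemini", "Auto") are an assumption taken from the option labels listed above.

```python
# Field names come from the targetDataFields list above.
ALLOWED_FIELDS = {
    "productName", "price", "currency", "description", "brand",
    "imageUrls", "availability", "variants", "ratingValue",
    "reviewCount", "sku", "categoryPath", "specifications",
}
# Assumption: the input accepts these exact strings for llmProvider.
LLM_PROVIDERS = {"Mistral", "Gemini", "Auto"}


def build_input(urls, fields, *, enable_pagination=False,
                use_playwright=False, llm_provider="Auto",
                max_concurrency=None):
    """Build an input object and reject unknown field or provider names."""
    unknown = [name for name in fields if name not in ALLOWED_FIELDS]
    if unknown:
        raise ValueError(f"Unknown targetDataFields: {unknown}")
    if llm_provider not in LLM_PROVIDERS:
        raise ValueError(f"Unknown llmProvider: {llm_provider!r}")
    actor_input = {
        "startUrls": [{"url": url} for url in urls],
        "targetDataFields": list(fields),
        "enablePagination": enable_pagination,
        "usePlaywright": use_playwright,
        "llmProvider": llm_provider,
    }
    # Only send maxConcurrency when the caller sets it explicitly.
    if max_concurrency is not None:
        actor_input["maxConcurrency"] = max_concurrency
    return actor_input
```

Catching a misspelled field name locally is cheaper than discovering it after a paid run has finished.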

Output Data

The actor saves the extracted data to the Apify Dataset. Each item represents data extracted from a URL.

{
  "scrapedUrl": "https://...", // Processed URL
  "llmUsed": "Mistral", // AI model used: "Mistral" or "Gemini"
  "extractionTimestamp": "YYYY-MM-DDTHH:mm:ss.sssZ", // Timestamp of extraction attempt
  // --- Extracted Data Fields (based on targetDataFields input) ---
  "productName": "Example Product",
  "price": 29.99,
  "currency": "USD",
  "description": "This is a great product...",
  "sku": "EXAMPLE-123",
  "brand": "ExampleBrand",
  "imageUrls": ["https://.../img1.jpg", "https://.../img2.jpg"],
  "availability": "In Stock",
  "variants": [
    {
      "name": "Small Red",
      "size": "S",
      "color": "Red",
      "price": 19.99,
      "currency": "USD",
      "availability": "In Stock",
      "sku": "PROD-S-RED"
    }
  ],
  "ratingValue": 4.5,
  "reviewCount": 105,
  // --- Status & Error ---
  "status": "Success", // "Success" or "Failed - ..."
  "error": null // null on success, otherwise an error message string
}
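
Because every item carries a status and an error field, a downloaded dataset can be split into usable and failed records before analysis. This is a minimal sketch under the assumption that successful items have a status beginning with "Success" and a null error, as in the sample above. `split_results` is a hypothetical helper name.

```python
def split_results(items):
    """Separate successfully extracted items from failed ones."""
    succeeded, failed = [], []
    for item in items:
        status = str(item.get("status", ""))
        # Treat an item as usable only if it reports success and no error.
        if status.startswith("Success") and item.get("error") is None:
            succeeded.append(item)
        else:
            failed.append(item)
    return succeeded, failed


# Illustrative items shaped like the sample output above.
sample_items = [
    {"scrapedUrl": "https://example.com/p/1", "productName": "Example Product",
     "price": 29.99, "status": "Success", "error": None},
    {"scrapedUrl": "https://example.com/p/2",
     "status": "Failed - ...", "error": "Error message..."},
]
ok_items, failed_items = split_results(sample_items)
```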

Usage Tips

Accuracy: Data extraction accuracy depends on HTML quality and the selected model. Results may vary from site to site.
CAPTCHA Handling: The actor can detect common CAPTCHA challenges and tries to bypass them using proxy rotation or Playwright. Success rate varies depending on the target website.
Playwright Integration: When usePlaywright is enabled, the actor helps bypass complex bot protection mechanisms by simulating real user behavior. This increases the success rate for sites with strong anti-bot measures like Amazon.
Pagination: When enablePagination is enabled, the actor tries to detect and follow common pagination patterns (Next links, numbered pagination). This feature works best on standard e-commerce sites.
Compliance: It is your responsibility to ensure that your use of this actor complies with the terms of service of the websites you scan and the LLM providers. Avoid collecting personal data.
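
To illustrate what "common pagination patterns" can mean in practice, the sketch below looks for an anchor marked rel="next" or labelled "Next". It shows one common approach using only the Python standard library. It is not the actor's own implementation, which this page does not describe, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Link labels treated as "next page" (an illustrative, incomplete set).
NEXT_LABELS = {"next", "next page", "›", "»"}


class NextLinkFinder(HTMLParser):
    """Remember the first <a> that is rel="next" or labelled like "Next"."""

    def __init__(self):
        super().__init__()
        self.next_href = None
        self._open_href = None  # href of the <a> currently being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        rel = (attributes.get("rel") or "").lower().split()
        if href and "next" in rel and self.next_href is None:
            self.next_href = href
        self._open_href = href
        self._text = []

    def handle_data(self, data):
        if self._open_href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._open_href is not None:
            label = "".join(self._text).strip().lower()
            if self.next_href is None and label in NEXT_LABELS:
                self.next_href = self._open_href
            self._open_href = None


def find_next_url(html, base_url):
    """Return the absolute URL of the next page, or None if none is found."""
    finder = NextLinkFinder()
    finder.feed(html)
    return urljoin(base_url, finder.next_href) if finder.next_href else None
```

Sites that load more products through JavaScript or infinite scroll expose no such link. In that case a detector of this kind returns None.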

Example Usage Scenarios

Scenario 1: Basic Product Data Extraction

To extract basic product information from specific product URLs:

{
  "startUrls": [
    { "url": "https://www.amazon.com/Apple-iPhone-13-128GB-Blue/dp/B09G9HD6PD" },
    { "url": "https://www.bestbuy.com/site/samsung-galaxy-s21-5g-128gb-phantom-gray-unlocked/6448113.p" }
  ],
  "targetDataFields": ["productName", "price", "currency", "brand", "imageUrls"],
  "usePlaywright": true
}

Scenario 2: Category Page Scanning with Pagination

To extract products from a category page, including all paginated pages:

{
  "startUrls": [
    { "url": "https://www.amazon.com/s?k=laptops" }
  ],
  "targetDataFields": ["productName", "price", "currency", "availability"],
  "enablePagination": true,
  "usePlaywright": true
}

Scenario 3: Detailed Product Analysis with Variants

For a comprehensive analysis of products including their variants:

{
  "startUrls": [
    { "url": "https://www.amazon.com/Apple-iPhone-13-128GB-Blue/dp/B09G9HD6PD" }
  ],
  "targetDataFields": ["productName", "price", "currency", "description", "brand", "variants", "ratingValue", "reviewCount"],
  "usePlaywright": true
}
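
When variants are requested, each item holds a nested variants array, which is awkward to analyze in a spreadsheet. A minimal sketch of flattening it into one row per variant follows. It assumes the variant keys shown in the Output Data sample, and `variant_rows` is a hypothetical helper.

```python
def variant_rows(item):
    """Return one flat row per variant, carrying the parent product name."""
    rows = []
    for variant in item.get("variants") or []:
        rows.append({
            "productName": item.get("productName"),
            "variantName": variant.get("name"),
            "size": variant.get("size"),
            "color": variant.get("color"),
            # Fall back to the product-level value when the variant omits it.
            "price": variant.get("price", item.get("price")),
            "currency": variant.get("currency", item.get("currency")),
            "availability": variant.get("availability", item.get("availability")),
            "sku": variant.get("sku"),
        })
    return rows


# Illustrative item shaped like the Output Data sample.
product = {
    "productName": "Example Product",
    "price": 29.99,
    "currency": "USD",
    "variants": [
        {"name": "Small Red", "size": "S", "color": "Red", "price": 19.99,
         "currency": "USD", "availability": "In Stock", "sku": "PROD-S-RED"},
    ],
}
rows = variant_rows(product)
```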

Scenario 4: Scraping Amazon with Bot Protection Bypass

To extract product data from Amazon, which has sophisticated bot protection:

{
  "startUrls": [
    { "url": "https://www.amazon.com/Apple-iPad-10-9-inch-Wi-Fi-64GB/dp/B09G9FPHY6" }
  ],
  "targetDataFields": ["productName", "price", "currency", "description", "brand", "variants"],
  "usePlaywright": true,
  "useApifyProxy": true,
  "proxyGroups": ["RESIDENTIAL"]
}
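
Any of the scenario inputs above can also be submitted programmatically. The sketch below only builds the HTTP request for the Apify API endpoint that runs an Actor synchronously and returns its dataset items. The Actor ID is not stated on this page, so the value used here is a placeholder, and `build_run_request` is a hypothetical helper.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id, token, actor_input):
    """Build a POST request that runs an Actor and returns its dataset items.

    actor_id uses the API form "username~actor-name".
    """
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items"
    return urllib.request.Request(
        url,
        data=json.dumps(actor_input).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


scenario_input = {
    "startUrls": [{"url": "https://www.amazon.com/s?k=laptops"}],
    "targetDataFields": ["productName", "price", "currency", "availability"],
    "enablePagination": True,
    "usePlaywright": True,
}
# Placeholder Actor ID and token: replace both before sending.
request = build_run_request("<username>~<actor-name>", "<APIFY_TOKEN>", scenario_input)

# To send it (network call, consumes paid results):
# with urllib.request.urlopen(request) as response:
#     items = json.load(response)
```

The synchronous endpoint keeps the connection open until the run finishes, so for long category crawls the asynchronous runs endpoint, followed by a separate dataset download, is likely the safer choice.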

Quick Start

Configure the actor
- Add the product or category pages you want to scan to the startUrls field.
- Select the data fields you want to extract from the targetDataFields field.
- Enable the usePlaywright option for sites with strong anti-bot measures.
- Adjust other settings like maxConcurrency if desired.
Run the actor
- Click the "Start" button to begin the scanning process.
- Monitor the run logs to see progress and potential issues.
- When completed, download your data in your preferred format (JSON, CSV, Excel).
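
If you download the JSON export and want a CSV with only selected columns, the conversion can be done locally. This is a minimal sketch with a hypothetical `items_to_csv` helper. Nested values such as imageUrls or variants are written as JSON strings, which is one possible convention rather than the platform's own CSV layout.

```python
import csv
import io
import json


def items_to_csv(items, fields):
    """Render dataset items as CSV text with the given columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for item in items:
        row = {}
        for name in fields:
            value = item.get(name)
            # Keep lists and objects readable by serializing them as JSON.
            if isinstance(value, (list, dict)):
                value = json.dumps(value, ensure_ascii=False)
            row[name] = value
        writer.writerow(row)
    return buffer.getvalue()


csv_text = items_to_csv(
    [{"productName": "Example Product", "price": 29.99, "currency": "USD"}],
    ["productName", "price", "currency"],
)
```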

Troubleshooting

If you encounter issues with the actor, try these solutions:

Browser Automation Issues:
- Enable the usePlaywright option - this significantly improves scanning for complex websites.
- Try using a proxy - enable the useApifyProxy option and select "RESIDENTIAL" for proxyGroups.
Data Extraction Issues:
- Try a different LLM provider - change the llmProvider setting.
- Request fewer data fields - shorten the targetDataFields list.
Pagination Issues:
- Some sites use non-standard pagination - in this case, manually add each page to the startUrls list.
CAPTCHA Issues:
- Use a residential proxy - select "RESIDENTIAL" for proxyGroups.
- Increase the captchaMaxAttempts value.
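
Several of the suggestions above can be combined into a second run that targets only the URLs that failed. This is a minimal sketch: `build_retry_input` is a hypothetical helper, and it assumes failed items keep their scrapedUrl and carry a status that does not start with "Success".

```python
def build_retry_input(items, fields, captcha_max_attempts=None):
    """Build an input that retries failed URLs with stronger settings.

    Returns None when no item failed.
    """
    failed_urls = []
    for item in items:
        url = item.get("scrapedUrl")
        succeeded = str(item.get("status", "")).startswith("Success")
        if url and not succeeded and url not in failed_urls:
            failed_urls.append(url)
    if not failed_urls:
        return None
    retry_input = {
        "startUrls": [{"url": url} for url in failed_urls],
        "targetDataFields": list(fields),
        # Troubleshooting suggestions: browser rendering plus residential proxy.
        "usePlaywright": True,
        "useApifyProxy": True,
        "proxyGroups": ["RESIDENTIAL"],
    }
    if captcha_max_attempts is not None:
        retry_input["captchaMaxAttempts"] = captcha_max_attempts
    return retry_input
```

Passing a shorter fields list to the retry also applies the "request fewer data fields" suggestion.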

What Can You Do with Findify.best?

🛍️ Competitor Analysis: Automatically track your competitors' product prices, stock status, and features.

📊 Market Research: Conduct market analyses by collecting all products and prices in a specific product category.

💰 Price Monitoring: Regularly track prices of specific products to catch price changes.

📱 Product Comparison: Compare prices and conditions offered by different sellers for the same product.

🔍 Data Mining: Create structured datasets from e-commerce sites.

🤖 Automatic Catalog Creation: Create digital catalogs by extracting bulk product information.
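
As an illustration of the price monitoring use case, the sketch below compares the items from two runs and reports products whose price changed. It assumes items from both runs can be matched by scrapedUrl and that price is numeric, as in the Output Data sample. `price_changes` is a hypothetical helper.

```python
def price_changes(previous_items, current_items):
    """List price differences between two runs, matched by scrapedUrl."""
    old_prices = {
        item.get("scrapedUrl"): item.get("price")
        for item in previous_items
        if item.get("price") is not None
    }
    changes = []
    for item in current_items:
        url = item.get("scrapedUrl")
        new_price = item.get("price")
        old_price = old_prices.get(url)
        # Skip products that are new, missing a price, or unchanged.
        if old_price is None or new_price is None or new_price == old_price:
            continue
        changes.append({
            "scrapedUrl": url,
            "old": old_price,
            "new": new_price,
            "delta": round(new_price - old_price, 2),
        })
    return changes
```

A fuller comparison would also check that the currency field matches before subtracting.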

Findify.best is a reliable, fast, and easy-to-use solution for your e-commerce data needs. Try it now and collect your data effortlessly!

Recent Updates

Version 1.1

Added Playwright integration to extract data from sites with strong bot protection like Amazon
Developed automatic pagination support for category pages
Added advanced detection mechanism for product variants (color, size, model)
Improved CAPTCHA detection and bypass mechanisms
Strengthened error handling and retry mechanisms
Updated Gemini API to use the latest model
Improved CAPTCHA detection and handling
Enhanced variant detection and extraction
Added support for running with local IP (without proxy) for testing purposes
Fixed various bugs and improved error handling

Version 1.0

Initial release with basic LLM-powered extraction
Support for Mistral and Gemini APIs
HTML cleaning and preprocessing
Pagination support
CAPTCHA detection with proxy rotation
