
Findify Best
Pricing
$20.00 / 1,000 Results

🔍 AI-powered e-commerce scraper that extracts detailed product data from any online store. Uses LLMs (Mistral/Gemini) for intelligent extraction, handles pagination, variants & CAPTCHAs. Perfect for price monitoring, market research & competitive analysis. #webscraping #ecommerce
Last modified
12 hours ago
Findify.best - AI-Powered E-Commerce Data Solution
Version: 1.1
Findify.best is a powerful Apify Actor that uses artificial intelligence to extract product data from e-commerce sites automatically. It relies on language models such as Mistral AI and Google Gemini to collect product name, price, description, SKU, brand, and more from ANY e-commerce site. It even works on popular sites like Amazon, Trendyol, and Hepsiburada!
Why Choose Findify.best?
✅ Data Extraction from Any E-Commerce Site: Collect data from ANYWHERE you want, from a single product page to entire category pages.
✅ AI-Powered Solution: No matter the site structure, our AI technology finds and extracts the right data.
✅ Automatic Pagination: Automatically detects and follows "Next Page" links on category pages.
✅ Variant Detection: Automatically extracts product variants like color, size, and model in a structured format.
✅ Bot Protection Bypass: Works even on sites with strong bot protection like Amazon, thanks to Playwright integration.
✅ CAPTCHA Detection and Bypass: Automatically detects CAPTCHA barriers and tries to bypass them with proxy rotation.
✅ Proxy Support: Overcomes geographical restrictions and blocks with Apify Proxy (Datacenter, Residential).
✅ Customizable Output: You decide which data fields you want to extract.
✅ Secure API Key Management: API keys are included; no extra configuration is needed.
✅ Robust Error Handling: Works continuously with automatic retry and backup mechanisms.
Who Is It Ideal For?
🔹 E-Commerce Businesses: For competitor analysis and price tracking
🔹 Market Researchers: For collecting market trends and product data
🔹 Price Tracking Services: For automatic price monitoring solutions
🔹 Data Analysts: For creating e-commerce datasets
🔹 Marketing Specialists: For product information and lead generation
How to Use?
Input Settings
- startUrls: List of URLs to scan. Can be product pages or category pages.
- targetDataFields: Data fields to extract. Options:
  - productName
  - price
  - currency
  - description
  - brand
  - imageUrls
  - availability
  - variants
  - ratingValue
  - reviewCount
  - sku
  - categoryPath
  - specifications
- enablePagination: When enabled, follows pagination links on category pages.
- usePlaywright: Recommended for sites with strong bot protection like Amazon.
- llmProvider: AI model to use:
  - Mistral: Uses Mistral AI API.
  - Gemini: Uses Google Gemini API.
  - Auto (Default): Tries Mistral first, switches to Gemini if unsuccessful.
- maxConcurrency: Maximum number of pages to process in parallel.
Note: Mistral and Gemini API keys are included; no extra configuration is needed.
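The settings above can also be assembled programmatically. The following is a minimal sketch, not the Actor's official client: `build_run_input` and `run_actor` are hypothetical helper names, the exact spelling accepted for `llmProvider` values is assumed from the list above, and the Actor ID passed to `run_actor` is a placeholder you would need to look up yourself. The second helper assumes the `apify-client` Python package is installed.

```python
# Field names taken from the targetDataFields options listed above.
ALLOWED_FIELDS = {
    "productName", "price", "currency", "description", "brand",
    "imageUrls", "availability", "variants", "ratingValue",
    "reviewCount", "sku", "categoryPath", "specifications",
}
# Assumption: the provider values are spelled exactly as in the listing.
LLM_PROVIDERS = {"Mistral", "Gemini", "Auto"}


def build_run_input(urls, fields, *, enable_pagination=False,
                    use_playwright=False, llm_provider="Auto",
                    max_concurrency=None):
    """Build an input dict and reject unknown field or provider names early."""
    unknown = sorted(set(fields) - ALLOWED_FIELDS)
    if unknown:
        raise ValueError(f"Unsupported targetDataFields: {unknown}")
    if llm_provider not in LLM_PROVIDERS:
        raise ValueError(f"Unsupported llmProvider: {llm_provider!r}")
    run_input = {
        "startUrls": [{"url": url} for url in urls],
        "targetDataFields": list(fields),
        "enablePagination": enable_pagination,
        "usePlaywright": use_playwright,
        "llmProvider": llm_provider,
    }
    if max_concurrency is not None:
        run_input["maxConcurrency"] = max_concurrency
    return run_input


def run_actor(token, actor_id, run_input):
    """Start the Actor, wait for it to finish, and return the dataset items."""
    # Imported lazily so build_run_input works without the dependency.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("The Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Validating the field list locally catches typos such as `productname` before a paid run starts.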
Output Data
The actor saves the extracted data to the Apify Dataset. Each item represents data extracted from a URL.
    {
      "scrapedUrl": "https://...",                          // Processed URL
      "llmUsed": "Mistral",                                 // AI model used: "Mistral" or "Gemini"
      "extractionTimestamp": "YYYY-MM-DDTHH:mm:ss.sssZ",    // Timestamp of extraction attempt
      // --- Extracted Data Fields (based on targetDataFields input) ---
      "productName": "Example Product",
      "price": 29.99,
      "currency": "USD",
      "description": "This is a great product...",
      "sku": "EXAMPLE-123",
      "brand": "ExampleBrand",
      "imageUrls": ["https://.../img1.jpg", "https://.../img2.jpg"],
      "availability": "In Stock",
      "variants": [
        {
          "name": "Small Red",
          "size": "S",
          "color": "Red",
          "price": 19.99,
          "currency": "USD",
          "availability": "In Stock",
          "sku": "PROD-S-RED"
        }
      ],
      "ratingValue": 4.5,
      "reviewCount": 105,
      // --- Status & Error ---
      "status": "Success",                                  // or "Failed - ..."
      "error": null                                         // or "Error message..."
    }
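Once the items are downloaded, a common first step is to separate failed extractions and flatten variants into rows. The sketch below assumes items shaped like the example above, with `status` equal to `"Success"` on success; `split_results` and `flatten_variants` are illustrative names, not part of the Actor.

```python
def split_results(items):
    """Separate successful items from failed ones using the status and error fields."""
    succeeded, failed = [], []
    for item in items:
        if item.get("status") == "Success" and not item.get("error"):
            succeeded.append(item)
        else:
            failed.append(item)
    return succeeded, failed


def _pick(variant, item, key):
    """Prefer the variant's value and fall back to the product-level value."""
    value = variant.get(key)
    return item.get(key) if value is None else value


def flatten_variants(item):
    """Return one row per variant, or a single row if the item has no variants."""
    rows = []
    for variant in item.get("variants") or [{}]:
        rows.append({
            "scrapedUrl": item.get("scrapedUrl"),
            "productName": item.get("productName"),
            "variantName": variant.get("name"),
            "sku": _pick(variant, item, "sku"),
            "price": _pick(variant, item, "price"),
            "currency": _pick(variant, item, "currency"),
            "availability": _pick(variant, item, "availability"),
        })
    return rows
```

The flat rows map directly onto a CSV or spreadsheet export, where nested variant arrays are awkward to work with.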
Usage Tips
- Accuracy: Data extraction accuracy depends on HTML quality and the selected model. Results may vary from site to site.
- CAPTCHA Handling: The actor can detect common CAPTCHA challenges and tries to bypass them using proxy rotation or Playwright. Success rate varies depending on the target website.
- Playwright Integration: When usePlaywright is enabled, the actor helps bypass complex bot protection mechanisms by simulating real user behavior. This increases the success rate for sites with strong anti-bot measures like Amazon.
- Pagination: When enablePagination is enabled, the actor tries to detect and follow common pagination patterns (Next links, numbered pagination). This feature works best on standard e-commerce sites.
- Compliance: It is your responsibility to ensure that your use of this actor complies with the terms of service of the websites you scan and the LLM providers. Avoid collecting personal data.
Example Usage Scenarios
Scenario 1: Basic Product Data Extraction
To extract basic product information from specific product URLs:
    {
      "startUrls": [
        { "url": "https://www.amazon.com/Apple-iPhone-13-128GB-Blue/dp/B09G9HD6PD" },
        { "url": "https://www.bestbuy.com/site/samsung-galaxy-s21-5g-128gb-phantom-gray-unlocked/6448113.p" }
      ],
      "targetDataFields": ["productName", "price", "currency", "brand", "imageUrls"],
      "usePlaywright": true
    }
Scenario 2: Category Page Scanning with Pagination
To extract products from a category page, including all pagination pages:
    {
      "startUrls": [
        { "url": "https://www.amazon.com/s?k=laptops" }
      ],
      "targetDataFields": ["productName", "price", "currency", "availability"],
      "enablePagination": true,
      "usePlaywright": true
    }
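If a site's pagination is not detected automatically, the pages can be listed explicitly in startUrls instead. The sketch below is one way to generate that list, assuming the site exposes the page number as a query parameter. The parameter name `page` and the helper name `paged_start_urls` are assumptions, so check the target site's own URLs before relying on them.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def paged_start_urls(base_url, pages, page_param="page"):
    """Build startUrls entries by setting a page-number query parameter on base_url."""
    parts = urlparse(base_url)
    # Keep the existing query parameters (for example the search term).
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    start_urls = []
    for number in pages:
        query[page_param] = str(number)
        url = urlunparse(parts._replace(query=urlencode(query)))
        start_urls.append({"url": url})
    return start_urls


# Example: the first three pages of a hypothetical search listing.
start_urls = paged_start_urls("https://www.example.com/s?k=laptops", range(1, 4))
```

When pages are listed this way, leaving enablePagination off is probably the safer choice, since following "Next" links as well could process the same page twice.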
Scenario 3: Detailed Product Analysis with Variants
For a comprehensive analysis of products including their variants:
    {
      "startUrls": [
        { "url": "https://www.amazon.com/Apple-iPhone-13-128GB-Blue/dp/B09G9HD6PD" }
      ],
      "targetDataFields": ["productName", "price", "currency", "description", "brand", "variants", "ratingValue", "reviewCount"],
      "usePlaywright": true
    }
Scenario 4: Scraping Amazon with Bot Protection Bypass
To extract product data from Amazon, which has sophisticated bot protection:
    {
      "startUrls": [
        { "url": "https://www.amazon.com/Apple-iPad-10-9-inch-Wi-Fi-64GB/dp/B09G9FPHY6" }
      ],
      "targetDataFields": ["productName", "price", "currency", "description", "brand", "variants"],
      "usePlaywright": true,
      "useApifyProxy": true,
      "proxyGroups": ["RESIDENTIAL"]
    }
Quick Start
- Configure the actor
  - Add the product or category pages you want to scan to the startUrls field.
  - Select the data fields you want to extract in the targetDataFields field.
  - Enable the usePlaywright option for sites with strong anti-bot measures.
  - Adjust other settings like maxConcurrency if desired.
- Run the actor
  - Click the "Start" button to begin the scanning process.
  - Monitor the run logs to see progress and potential issues.
  - When completed, download your data in your preferred format (JSON, CSV, Excel).
Troubleshooting
If you encounter issues with the actor, try these solutions:
- Browser Automation Issues:
  - Enable the usePlaywright option - this significantly improves scanning for complex websites.
  - Try using a proxy - enable the useApifyProxy option and select "RESIDENTIAL" for proxyGroups.
- Data Extraction Issues:
  - Try a different LLM provider - change the llmProvider setting.
  - Request fewer data fields - shorten the targetDataFields list.
- Pagination Issues:
  - Some sites use non-standard pagination - in this case, manually add each page to the startUrls list.
- CAPTCHA Issues:
  - Use a residential proxy - select "RESIDENTIAL" for proxyGroups.
  - Increase the captchaMaxAttempts value.
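The browser and CAPTCHA suggestions above can be applied as an escalation ladder: start with the cheapest configuration and add Playwright, then a residential proxy, only if the previous attempt fails. This is a sketch of that idea using the setting names from this page; `escalation_steps` is a hypothetical helper, and because the default value of captchaMaxAttempts is not stated here, the caller has to supply one.

```python
import copy


def escalation_steps(base_input, captcha_max_attempts=None):
    """Return progressively more robust input configs, starting with base_input."""
    current = copy.deepcopy(base_input)
    candidates = [current]
    changes = [
        {"usePlaywright": True},
        {"useApifyProxy": True, "proxyGroups": ["RESIDENTIAL"]},
    ]
    if captcha_max_attempts is not None:
        changes.append({"captchaMaxAttempts": captcha_max_attempts})
    for change in changes:
        # Each step keeps the settings of the previous one and adds one change.
        current = {**copy.deepcopy(current), **change}
        candidates.append(current)
    # Drop steps that do not differ from the one before (setting already enabled).
    steps = []
    for candidate in candidates:
        if not steps or candidate != steps[-1]:
            steps.append(candidate)
    return steps
```

Each returned dict can be submitted in turn, stopping at the first run whose items report a successful status. Residential proxies usually cost more than datacenter ones, which is why they come late in the ladder.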
What Can You Do with Findify.best?
🛍️ Competitor Analysis: Automatically track your competitors' product prices, stock status, and features.
📊 Market Research: Conduct market analyses by collecting all products and prices in a specific product category.
💰 Price Monitoring: Regularly track prices of specific products to catch price changes.
📱 Product Comparison: Compare prices and conditions offered by different sellers for the same product.
🔍 Data Mining: Create structured datasets from e-commerce sites.
🤖 Automatic Catalog Creation: Create digital catalogs by extracting bulk product information.
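For price monitoring, two runs over the same URLs can be compared item by item. The following sketch assumes each item carries the scrapedUrl, price, currency, and status fields shown in the output example; `price_changes` is an illustrative helper, not a feature of the Actor.

```python
def price_changes(previous_items, current_items):
    """List products whose price differs between two runs, matched by scrapedUrl."""
    def by_url(items):
        # Only compare items that were extracted successfully and have a price.
        return {
            item["scrapedUrl"]: item
            for item in items
            if item.get("status") == "Success" and item.get("price") is not None
        }

    old, new = by_url(previous_items), by_url(current_items)
    changes = []
    for url, item in new.items():
        before = old.get(url)
        if before is None:
            continue  # new product, nothing to compare against
        if before.get("currency") != item.get("currency"):
            continue  # prices in different currencies are not comparable
        if before["price"] != item["price"]:
            changes.append({
                "scrapedUrl": url,
                "oldPrice": before["price"],
                "newPrice": item["price"],
                "delta": round(item["price"] - before["price"], 2),
            })
    return changes
```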
Findify.best is a reliable, fast, and easy-to-use solution for your e-commerce data needs. Try it now and collect your data effortlessly!
Recent Updates
Version 1.1
- Added Playwright integration to extract data from sites with strong bot protection like Amazon
- Developed automatic pagination support for category pages
- Added advanced detection mechanism for product variants (color, size, model)
- Improved CAPTCHA detection and bypass mechanisms
- Strengthened error handling and retry mechanisms
- Updated Gemini API to use the latest model
- Improved CAPTCHA detection and handling
- Enhanced variant detection and extraction
- Added support for running with local IP (without proxy) for testing purposes
- Fixed various bugs and improved error handling
Version 1.0
- Initial release with basic LLM-powered extraction
- Support for Mistral and Gemini APIs
- HTML cleaning and preprocessing
- Pagination support
- CAPTCHA detection with proxy rotation
Pricing
Pricing model
Pay per result
This Actor is paid per result. You are not charged for Apify platform usage; you pay only a fixed price for every 1,000 items in the Actor's output dataset.
Price per 1,000 items
$20.00
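As a rough budgeting aid, the cost of a run can be estimated from the expected number of items. The sketch below assumes billing is pro rata, that is $0.02 per item, rather than rounded up to whole blocks of 1,000. The listing does not say which applies, so treat the result as an estimate.

```python
from decimal import Decimal

PRICE_PER_1000_ITEMS = Decimal("20.00")  # USD, from the pricing above


def estimate_cost(item_count):
    """Estimate the cost in USD for item_count results, assuming pro-rata billing."""
    if item_count < 0:
        raise ValueError("item_count must not be negative")
    cost = PRICE_PER_1000_ITEMS * item_count / 1000
    return cost.quantize(Decimal("0.01"))  # round to whole cents
```

For example, under this assumption a category crawl that yields 2,500 items would cost 2,500 × $20.00 / 1,000 = $50.00.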