Amazon AI Product Intelligence

Under maintenance

Pricing

Pay per event

Developed by

bySeitz AI & Automation

Maintained by Community

Amazon AI Product Intelligence Stream is an advanced, AI-driven Actor designed to provide deep, structured intelligence from the global Amazon marketplace. It is built for targeted competitive and market analysis on e-commerce products.

Last modified

a day ago

Agents

E-commerce

🧠 Amazon AI Product Intelligence Stream

This Actor performs advanced, structured data extraction and synthesis on Amazon product pages. It uses Playwright for targeted, stealthy scraping and leverages large language models (LLMs) via LangChain's structured output feature to convert raw HTML product details into actionable, clean JSON data and a final business report.

The Actor is designed for maximum reliability and flexibility, using a robust, two-tier processing system (Crawl Only Mode and Local Structured AI Mode).
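The two-tier flow can be pictured with a minimal sketch like the one below. It is not the Actor's actual code: `process_product`, the `extract` callback, the `url`, `content`, and `error` field names, and the `CRAWL_ONLY` label used when AI is disabled are all hypothetical. Only the `AI_SYNTHESIS` and `CRAWL_ONLY_FALLBACK` labels come from the Output Structure section of this page.

```python
from typing import Any, Callable, Dict


def process_product(
    url: str,
    page_content: str,
    extract: Callable[[str], Dict[str, Any]],
    enable_ai: bool = True,
) -> Dict[str, Any]:
    """Return one dataset item for a scraped product page.

    `extract` stands in for the LLM-based structured extraction step.
    """
    if not enable_ai:
        # Assumption: the label for a pure crawl-only run is not documented.
        return {"_tier": "CRAWL_ONLY", "url": url, "content": page_content}
    try:
        structured = extract(page_content)
        return {"_tier": "AI_SYNTHESIS", "url": url, **structured}
    except Exception as exc:  # e.g. an API error or a schema validation error
        # Keep the scraped content so the page can be reviewed manually.
        return {
            "_tier": "CRAWL_ONLY_FALLBACK",
            "url": url,
            "content": page_content,
            "error": str(exc),
        }
```

The point of the design is that a failed extraction still yields a dataset item, so no scraped page is silently lost.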

🚀 Key Features and Improvements

Local Structured AI Mode (Tier 2): Replaced the unstable external ChatKit API workflow with reliable local structured extraction using LangChain and OpenAI. This eliminates HTTP 404 errors and ensures predictable JSON output.
Dynamic Schema Selection: Automatically switches the LLM's output schema based on the user's Analysis Objective (Prompt Selection). This provides precise, dedicated structured output for technical specifications (AmazonTechnicalSpecs) and general data (AmazonProductData).
Complete Data Output: The final dataset now includes the single Aggregate Synthesis Report plus individual Structured Item Reports for every successfully processed product, offering both macro and micro data views.
Price & ASIN Robustness: Includes advanced Playwright selectors and injection logic to maximize the capture rate of dynamic data like Price and ASIN before passing content to the LLM for structuring.
Improved User Experience: The input interface uses emojis and user-friendly editors, including a string-list editor for search queries (stringList) and dropdowns for the LLM model (including GPT-5) and the Amazon domain.
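As a rough illustration of Dynamic Schema Selection, the sketch below maps an analysis objective to an output schema. It uses standard-library dataclasses as stand-ins for the Actor's Pydantic models, so it runs without extra dependencies. The `specs` field, the `SCHEMA_BY_OBJECTIVE` mapping, and the `select_schema` helper are hypothetical, and treating every objective other than `technical_specs` as general product data is an assumption.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass
class AmazonProductData:
    """Stand-in for the general product schema."""
    product_title: str
    asin: Optional[str] = None
    price_with_currency: Optional[str] = None


@dataclass
class AmazonTechnicalSpecs:
    """Stand-in for the technical-specification schema."""
    product_title: str
    asin: Optional[str] = None
    specs: Dict[str, str] = field(default_factory=dict)  # hypothetical field


# Assumption: only technical_specs gets a dedicated schema.
SCHEMA_BY_OBJECTIVE: Dict[str, type] = {
    "technical_specs": AmazonTechnicalSpecs,
}


def select_schema(prompt_selection: str) -> type:
    """Pick the output schema for the chosen analysis objective."""
    return SCHEMA_BY_OBJECTIVE.get(prompt_selection, AmazonProductData)


schema = select_schema("technical_specs")
item = schema(product_title="Example product")
print(asdict(item))  # plain dict, analogous to Pydantic's .model_dump()
```

With real Pydantic models, the selected class would be handed to the LLM's structured-output call instead of being instantiated by hand.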

⚙️ Configuration and Input

The Actor's input is defined via input_schema.json, providing a user-friendly interface divided into three sections, two of which are documented below:

1. 🔍 Search Configuration

Field	Type	Description
`amazonSearchQueries`	`array` (`stringList`)	The keywords to search for (one query per line).
`amazonDomain`	`string` (`select`)	The Amazon marketplace to target (e.g., `com`, `co.uk`, `jp`).
`maxTotalProducts`	`integer`	Max total unique product pages to process in the run.
`maxProductsPerPage`	`integer`	Max product links to pull from each search result page.

2. 🧠 Analysis & AI Control

Field	Type	Description
`enableAISynthesis`	`boolean`	If `true` (the default), runs the full LLM-based structured extraction and synthesis (Tier 2).
`promptSelection`	`string` (`select`)	Defines the analysis objective (e.g., `core_summary`, `technical_specs`, `customer_sentiment`, or `custom_input`).
`customPrompt`	`string` (`textarea`)	Used by the LLM when `custom_input` is selected (e.g., "Extract the screen size and processor model.").
`llmModel`	`string` (`select`)	Selects the GPT model (e.g., `gpt-4o-mini`, `gpt-4o`, `gpt-5`) for all extraction and synthesis tasks.
`verboseLog`	`boolean`	Enables detailed debug logging for troubleshooting.
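To show how the fields above fit together, here is a small sketch that assembles a run input as a Python dictionary. The `build_run_input` helper and all default values are assumptions made for illustration; only the field names, the example option values, and the `true` default of `enableAISynthesis` come from the tables.

```python
from typing import Any, Dict, List, Optional


def build_run_input(
    queries: List[str],
    domain: str = "com",
    max_total: int = 10,            # assumed default
    max_per_page: int = 5,          # assumed default
    prompt_selection: str = "core_summary",
    custom_prompt: Optional[str] = None,
    llm_model: str = "gpt-4o-mini",
    enable_ai: bool = True,
    verbose: bool = False,
) -> Dict[str, Any]:
    """Assemble an input object using the documented field names."""
    if not queries:
        raise ValueError("at least one search query is required")
    if prompt_selection == "custom_input" and not custom_prompt:
        raise ValueError("customPrompt is required when custom_input is selected")

    run_input: Dict[str, Any] = {
        "amazonSearchQueries": list(queries),
        "amazonDomain": domain,
        "maxTotalProducts": max_total,
        "maxProductsPerPage": max_per_page,
        "enableAISynthesis": enable_ai,
        "promptSelection": prompt_selection,
        "llmModel": llm_model,
        "verboseLog": verbose,
    }
    if custom_prompt:
        run_input["customPrompt"] = custom_prompt
    return run_input


example = build_run_input(
    ["usb-c hub"],
    prompt_selection="custom_input",
    custom_prompt="Extract the screen size and processor model.",
)
```

The resulting dictionary can be pasted into the Actor's JSON input editor or passed as the run input by an API client.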

📊 Output Structure

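The Actor pushes multiple JSON objects to the default Dataset: one aggregate synthesis report, one report per successfully processed product, and a fallback item for each product whose LLM extraction failed.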

Item 1: Final Synthesis Report (`_tier: AI_SYNTHESIS_REPORT`)

This is the single aggregate summary of all products processed for the original query.

Field	Description
`report`	The comprehensive, synthesized final business summary generated by the LLM.
`sources`	Array of all product URLs used in the report.
`extra_specs_json`	A single JSON string summarizing the most common miscellaneous specifications found across all products.

Subsequent Items: Individual Product Reports (`_tier: AI_SYNTHESIS`)

These contain the structured data extracted from each successfully processed product page.

Field	Description
`product_title`	The title of the product.
`asin`	The product's ASIN.
`report`	A short, human-readable summary of the structured data extracted for this specific product.
`core_data_point` / `price_with_currency` / etc.	The specific structured data fields defined by the chosen analysis objective.

Fallback Items (`_tier: CRAWL_ONLY_FALLBACK`)

These items are pushed if the LLM extraction fails (e.g., an API error or a Pydantic validation error), providing the raw HTML/Markdown content for manual review.
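Because one dataset mixes three kinds of items, a consumer will usually split them by `_tier` first. The sketch below is one way to do that after downloading the dataset items as a list of dictionaries. The helper names and the `UNKNOWN` bucket are hypothetical; the `_tier` values and the `extra_specs_json` field are the ones described above.

```python
import json
from collections import defaultdict
from typing import Any, Dict, List


def split_by_tier(items: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group dataset items by their `_tier` label."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for item in items:
        grouped[item.get("_tier", "UNKNOWN")].append(item)
    return dict(grouped)


def decode_extra_specs(report_item: Dict[str, Any]) -> Dict[str, Any]:
    """Parse the `extra_specs_json` string, returning {} if it is missing or invalid."""
    raw = report_item.get("extra_specs_json")
    if not raw:
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    return parsed if isinstance(parsed, dict) else {}


def summarize(items: List[Dict[str, Any]]) -> Dict[str, int]:
    """Count how many items each tier produced."""
    return {tier: len(group) for tier, group in split_by_tier(items).items()}
```

A high count of `CRAWL_ONLY_FALLBACK` items relative to `AI_SYNTHESIS` items is a quick signal that the extraction step is failing and the run is worth inspecting with `verboseLog` enabled.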

🛠️ Developer Notes

Model IDs: The _initialize_llm function automatically strips the redundant "openai/" prefix from the model name selected in the input UI to prevent Invalid Model ID errors when calling the OpenAI API.
Schema Handling: scraper_logic.py dynamically selects the appropriate Pydantic model (AmazonProductData, AmazonTechnicalSpecs, or FinalReportSchema) and converts instances to Python dictionaries with .model_dump(), which keeps the data flow clean and prevents Pydantic validation errors during aggregation.
Dependencies: The requirements.txt includes necessary asynchronous libraries (playwright, httpx) and the LangChain/OpenAI stack (langchain-openai) for robust execution.
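The prefix handling described under Model IDs amounts to a small string normalization. The sketch below shows the idea in isolation. `normalize_model_id` is a hypothetical helper, not the Actor's `_initialize_llm`, whose real body is not shown on this page.

```python
def normalize_model_id(model_name: str, prefix: str = "openai/") -> str:
    """Drop a leading provider prefix so the bare model ID reaches the API."""
    name = model_name.strip()
    if name.startswith(prefix):
        return name[len(prefix):]
    return name


print(normalize_model_id("openai/gpt-4o-mini"))
print(normalize_model_id("gpt-4o"))
```

Only a leading prefix is removed, so a model name that merely contains the string elsewhere passes through unchanged.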
