AI Product Matcher
Match products across multiple e-commerce websites. Use this AI product matching Actor whenever you need to find matching pairs of products from different online shops for dynamic pricing, competitor analysis or market research.
Hi, I tried the Actor with two e-shop datasets like this: [{ "id": "GZ2244", "name": "Sample 60", "price": "5.90", "short_description": "Découvrez le blahblah.", "long_description": "", "specification": [ { "key": "brand", "value": "" }, { "key": "Degré", "value": "54" }, { "key": "Provenance", "value": "Mexique" }, { "key": "Volume", "value": "6" } ] }, {... }]
I configured the input attribute mapping with: { "eshop1": { "id": "id", "name": "name", "price": "price", "short_description": "short_description", "long_description": "long_description", "specification": "specification", "code": "code" }, "eshop2": { "id": "id", "name": "name", "price": "price", "short_description": "short_description", "long_description": "long_description", "specification": "specification", "code": "code" } }
and I got this error:
2023-06-02T05:25:31.016Z Actor failed with an exception
2023-06-02T05:25:31.018Z multiprocessing.pool.RemoteTraceback:
2023-06-02T05:25:31.018Z """
2023-06-02T05:25:31.019Z Traceback (most recent call last):
2023-06-02T05:25:31.020Z File "/usr/local/lib/python3.9/multiprocessing/pool.py", line 125, in worker
2023-06-02T05:25:31.021Z result = (True, func(*args, **kwds))
2023-06-02T05:25:31.022Z File "/usr/local/lib/python3.9/multiproces... [trimmed]
Hi Alex, I will check the issue you are having and fix it if possible tomorrow (Monday, 5.6.2023). Sorry for the long wait, the weekend got in the way :)
Hi Matěj, thanks! I have investigated the issue. It seems that the datasets I use contain log fields, because they come from scrapes made with the standard Cheerio Scraper. I mean that the #error and #debug fields may be causing the TypeError.
"#error": false, "#debug": { "requestId": "xxx", "url": "https://xxx", "loadedUrl": "https:xxx", "method": "GET", "retryCount": 0, "errorMessages": [], "statusCode": 200 }
To come to this conclusion, I modified the Cheerio Scraper Actor page function so that it returns a JSON containing exactly the same contents as the sample datasets of your Actor (meaning datasets GYVCj4hEeqnX3dJyu and OmzHV4VEByO4KohMF). The only things that differ between the dataset output by the scraper and the sample datasets are these log key-value pairs.
Maybe adding a "cleaner" to the Actor to remove hidden fields would be an easy fix?
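A "cleaner" like the one suggested above could look roughly like the sketch below. It is only an illustration, not the Actor's actual fix: the function name strip_hidden_fields is hypothetical, and it assumes that hidden fields are exactly the top-level keys starting with "#", as in the #error and #debug example.

```python
from typing import Any


def strip_hidden_fields(item: dict[str, Any], prefix: str = "#") -> dict[str, Any]:
    """Return a copy of item without top-level keys that start with prefix."""
    return {key: value for key, value in item.items() if not key.startswith(prefix)}


# Example item shaped like the scraper output quoted in this thread.
scraped_item = {
    "id": "GZ2244",
    "name": "Sample 60",
    "price": "5.90",
    "#error": False,
    "#debug": {"requestId": "xxx", "retryCount": 0, "statusCode": 200},
}

# The original item is left untouched; only the copy loses the "#" keys.
cleaned_item = strip_hidden_fields(scraped_item)
print(cleaned_item)
```

Running this on each dataset item before passing it on would leave only the product attributes named in the input mapping.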
Hope this helps :) I am so eager to try your Actor, you know.
Hey Alex, the problem should be fixed now. Check on your side please and let me know if it still persists. It was indeed connected to the debug output, thanks for the help with debugging :)
Hi, thanks. I tried it with the sample dataset on 2023-06-06 at 09:19 and it worked. But when I tried it again at 2023-06-06 13:29:15 with my own dataset, I got an error. I then tried again using exactly the same input and dataset as I used at 9:10, and it failed with "TypeError: '<' not supported between instances of 'str' and 'int'".
2023-06-06T11:29:41.272Z File "/usr/src/app/main.py", line 16, in main
2023-06-06T11:29:41.273Z if max_items_to_process < 1:
2023-06-06T11:29:41.274Z TypeError: '<' not supported between instances of 'str' and 'int'
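The traceback above is what Python raises when a string such as "10" is compared with an integer. A minimal sketch of one way to guard against it is shown below. The helper parse_max_items is hypothetical, and what it does with invalid or missing values is an assumption: the Actor's real input handling is not shown in this thread.

```python
def parse_max_items(raw_value, default=None):
    """Convert a raw input value to a positive int, or return default if unset.

    Hypothetical helper: it only illustrates coercing the value before
    running a comparison like `max_items_to_process < 1`.
    """
    if raw_value is None or raw_value == "":
        return default
    try:
        value = int(raw_value)
    except (TypeError, ValueError):
        raise ValueError(
            f"max_items_to_process must be an integer, got {raw_value!r}"
        ) from None
    if value < 1:
        raise ValueError("max_items_to_process must be at least 1")
    return value


# A string from the input form and a real integer are both accepted.
print(parse_max_items("10"))
print(parse_max_items(5))
```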
Here is the input I used: { "dataset1_ids": [ "DNOqtHmubZf5KiHSc" ], "dataset2_ids": [ "6PGDkddVGhybuJpgv" ], "input_mapping": { "eshop1": { "id": "url", "name": "name", "price": "price", "short_description": "shortDescription", "long_description": "longDescription", "specification": "specification", "code": [ "sku" ] }, "eshop2": { "id": "url", "name": "name", "price": "price", "short_description": "shortDescription", "long_description": "longDescription", "specification": "specification", "code": [ "sku" ] } }, "output_mapping": { "eshop1": { "id_source": "url", "name_source": "name" }, "eshop2... [trimmed]
The first run succeeded: https://console.apify.com/view/runs/gB5PHNWeTkfwSDnWk
The second run, with the same input, failed: https://console.apify.com/view/runs/7W6ZGqn0247q9D0d0
Hey Alex, this issue should now be fixed as well.
By the way, I noticed that you only put "sku" into the "code" input. In general, I would advise against that. Since SKUs are very often different for the same products in different online shops, it is better to use codes/ids that are more universal, such as the "productModel" in the sample datasets (and very often, these codes are indeed called "Product model" or "Product number" on real online shops as well). Since the current model takes the codes very seriously, putting codes that are always gonna be different such as SKUs there is counterproductive, as can be seen by the results you get from your input. I am currently developing a model that doesn't use the codes for cases when no good identifiers are available, it should be out in a few days. If you wish to be alerted, let me know and I will write to you here when it's finished.
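As a rough illustration of the advice above, the sketch below swaps the "code" entry of an input mapping from "sku" to "productModel". The helper with_code_fields is hypothetical, the mapping is shortened to a few keys, and it assumes both datasets actually contain a "productModel" field, as the sample datasets do.

```python
import copy


def with_code_fields(input_mapping, code_fields):
    """Return a copy of input_mapping with "code" replaced for every shop."""
    updated = copy.deepcopy(input_mapping)
    for shop_mapping in updated.values():
        shop_mapping["code"] = list(code_fields)
    return updated


# Shortened version of the mapping from the input quoted earlier in the thread.
input_mapping = {
    "eshop1": {"id": "url", "name": "name", "price": "price", "code": ["sku"]},
    "eshop2": {"id": "url", "name": "name", "price": "price", "code": ["sku"]},
}

# Assumption: "productModel" exists in both datasets.
better_mapping = with_code_fields(input_mapping, ["productModel"])
print(better_mapping["eshop1"]["code"])
```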
Best regards, Matěj
Hi, thanks Matěj. The main problem with the datasets I work with is that they don't have an EAN, a GTIN, or any other matching code. I wanted to try your Actor and see how precise it can be in this context. But it seems that I can't make it work because of the format of my dataset. I tried the Actor with the sample datasets and it worked well. But when I change the keys to match my datasets, I get errors again. Would you mind taking a look at the task run: https://console.apify.com/view/runs/CyQSKirHPgU4UKN4L ?
Hey Alex, the issue with your input was that sometimes there were items in specification that only contained key, but no value. Since there is no real reason for us not to accept such an input, I patched it so you should be able to run your input with no problem. Don't use the results to determine accuracy though, using "name" as a code will heavily impact it. I am finishing up on the no-codes model, so I will let you know as soon as possible when it's deployed.
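For anyone who prefers to tidy such data before running the Actor, a specification list with key-only entries could be normalized along the lines of the sketch below. This is not the patch described above, whose code is not shown in the thread. The function normalize_specification is hypothetical, and filling a missing value with an empty string is an assumption, chosen to match the "brand" entry with an empty value in the first example.

```python
def normalize_specification(specification):
    """Return spec entries as {"key", "value"} dicts, using "" for a missing value."""
    normalized = []
    for entry in specification or []:
        key = entry.get("key")
        if not key:
            continue  # an entry without a key carries no usable information
        value = entry.get("value")
        normalized.append(
            {"key": str(key), "value": "" if value is None else str(value)}
        )
    return normalized


# The first entry has a key but no value, like the items that caused the error.
raw_spec = [{"key": "brand"}, {"key": "Volume", "value": "6"}]
print(normalize_specification(raw_spec))
```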
Best, Matěj
Thanks Matěj. The thing is that, for the products I need to track, the EAN code is consistent when it is available. So if I have the EAN code, I can use it as is to match products and I don't need AI :) As you may see, the brand and specification of a product can be enough to narrow down the list of candidates, so that the description and name can be used to finally close the gap. I will wait for your new version :)
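The narrowing idea described above can be sketched without any model: group candidates by a specification value such as the brand, then compare names only within the same group. The code below is a plain string-similarity illustration using Python's difflib, not how the Actor works. The function names, the "brand" blocking key and the 0.6 threshold are all assumptions.

```python
from difflib import SequenceMatcher


def spec_value(product, key):
    """Return the lower-cased value of a specification key, or "" if absent."""
    for entry in product.get("specification", []):
        if (entry.get("key") or "").lower() == key.lower():
            return str(entry.get("value") or "").strip().lower()
    return ""


def match_products(shop1, shop2, block_keys=("brand",), threshold=0.6):
    """Pair each shop1 product with its most similar shop2 product in the same block.

    Products whose blocking values are all empty end up in one shared block.
    """
    matches = []
    for left in shop1:
        left_block = tuple(spec_value(left, key) for key in block_keys)
        best, best_score = None, threshold
        for right in shop2:
            if tuple(spec_value(right, key) for key in block_keys) != left_block:
                continue  # different brand/spec block: not a candidate
            score = SequenceMatcher(
                None, left["name"].lower(), right["name"].lower()
            ).ratio()
            if score >= best_score:
                best, best_score = right, score
        if best is not None:
            matches.append((left["id"], best["id"], round(best_score, 2)))
    return matches
```

A fuller version would also compare descriptions and further specification keys, but the two-stage shape (block first, then score) stays the same.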
Hi Alex, sorry for the delay, vacations got in the way. The version requiring no codes has been uploaded now, so feel free to try it and see if it suits your needs.
Best, Matěj
Hi! Just got the newsletter from Apify! Thanks, I will look into it ASAP.
Actor Metrics
20 monthly users
9 stars
86% runs succeeded
49 days response time
Created in Apr 2023
Modified 6 months ago