Pinecone Integration

Pricing

Pay per usage

apify/pinecone-integration

Developed by Apify

Maintained by Apify

This integration transfers data from Apify Actors to a Pinecone vector database and is a good starting point for question-answering, search, or RAG use cases.

4.6 (5)

29

Monthly users: 121
Runs succeeded: 95%
Response time: 4 days
Last modified: 14 days ago

How does the Delta updates setting work?

Closed
responsible_box opened this issue 9 months ago

I want it to delete all the current vectors in Pinecone whenever there is a new upload. So if I run it every two weeks, it should delete my current vectors and add the new ones instead. How is this possible?

Thanks for the help. :)

jiri.spilka

Thank you for your interest in the Pinecone integration!

Currently, the integration does not support deleting all vectors in the Pinecone database during each new upload. This feature is not available due to the risk of accidental misconfiguration, which could result in the deletion of the entire database.

Instead, the integration offers the deltaUpdates functionality, which ensures efficient and safe updates to your Pinecone database. Here’s how it works:

  • Unchanged Content: Updates the last_seen_at metadata field.
  • Changed Content: Deletes the old data, computes new vectors, and adds them to the database.
  • New Content: Computes vectors and adds them to the database.
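
The three cases above can be sketched in Python. This is only an illustration, not the integration's actual code: the `apply_delta_update` function, the in-memory `stored` dictionary keyed by URL, and the use of a SHA-256 checksum to detect changed content are all assumptions made for the example.

```python
import hashlib
from datetime import datetime, timezone


def checksum(text):
    """Fingerprint the content; hashing is an assumed way to detect changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def apply_delta_update(stored, crawled, now):
    """Sort crawled items into the three delta-update cases.

    stored:  hypothetical dict, url -> {"checksum": str, "last_seen_at": datetime}
    crawled: list of dataset items with "url" and "content" keys
    Returns a dict listing the URLs in each case; `stored` is updated in place.
    """
    result = {"unchanged": [], "changed": [], "new": []}
    for item in crawled:
        url = item["url"]
        digest = checksum(item["content"])
        record = stored.get(url)
        if record is None:
            # New content: a real pipeline would compute vectors and add them here.
            result["new"].append(url)
            stored[url] = {"checksum": digest, "last_seen_at": now}
        elif record["checksum"] == digest:
            # Unchanged content: only the last_seen_at field is refreshed.
            result["unchanged"].append(url)
            record["last_seen_at"] = now
        else:
            # Changed content: a real pipeline would delete the old vectors and re-embed.
            result["changed"].append(url)
            stored[url] = {"checksum": digest, "last_seen_at": now}
    return result


# Example run with made-up data: one unchanged page, one changed page, one new page.
previous_run = datetime(2024, 1, 1, tzinfo=timezone.utc)
current_run = datetime(2024, 1, 15, tzinfo=timezone.utc)
example_store = {
    "https://apify.com": {"checksum": checksum("same text"), "last_seen_at": previous_run},
    "https://apify.com/store": {"checksum": checksum("old text"), "last_seen_at": previous_run},
}
example_crawl = [
    {"url": "https://apify.com", "content": "same text"},
    {"url": "https://apify.com/store", "content": "new text"},
    {"url": "https://apify.com/pricing", "content": "pricing page"},
]
example_result = apply_delta_update(example_store, example_crawl, current_run)
```

In this sketch a URL that is missing from the crawl is simply left alone, so its `last_seen_at` value stays at the date of the last run that saw it.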

Handling Removed Content: If a URL is removed from the website and is not present in the current crawl, you can delete objects in the Pinecone database that have not been seen in the past X days. This is managed using the expiredObjectDeletionPeriodDays setting.
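
The expiry rule can be illustrated with a short Python sketch. The `find_expired` helper and the dictionary of object IDs are hypothetical; only the `last_seen_at` field and the `expiredObjectDeletionPeriodDays` setting come from the description above, and how the integration evaluates the cutoff internally is an assumption.

```python
from datetime import datetime, timedelta, timezone


def find_expired(last_seen, now, expired_object_deletion_period_days):
    """Return the IDs whose last_seen_at is older than the deletion period.

    last_seen: hypothetical dict, object ID -> last_seen_at datetime
    """
    cutoff = now - timedelta(days=expired_object_deletion_period_days)
    return sorted(obj_id for obj_id, seen_at in last_seen.items() if seen_at < cutoff)


# Example with made-up data: a crawl every two weeks and a 7-day deletion period.
run_time = datetime(2024, 1, 15, tzinfo=timezone.utc)
example_last_seen = {
    "page-a": run_time,                       # seen in the current crawl
    "page-b": run_time - timedelta(days=14),  # last seen in the previous crawl
}
expired_ids = find_expired(example_last_seen, run_time, 7)
```

Under this sketch, a deletion period shorter than the crawl interval removes anything the latest crawl did not see, which approximates the "replace everything on each upload" behavior asked about without wiping the whole database.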

Example Configuration: Here is an example of how you can set this up for your use case.

Input Data: When scraping a website such as apify.com with the Web Scraper, the output might look like this:

{
  "url": "https://apify.com",
  "title": "Apify",
  "content": "Apify is the platform where developers build, deploy, and publish web scraping, data extraction, and web automation tools."
}

Integration Settings: Configure the integration as follows:

{
  "datasetFields": ["content"],
  "enableDeltaUpdates": true,
  "deltaUpdatesPrimaryDatasetFields... [trimmed]

jiri.spilka

I'm going to close this issue now. Please let me know if you face any problems.

Pricing

Pricing model

Pay per usage

This Actor is paid per platform usage. The Actor is free to use, and you only pay for the Apify platform usage.