Pinecone integration
This integration pushes selected fields from your Apify Actor directly into any Pinecone index. If the index doesn't exist, the integration creates it.
Integrate Apify Actors with Pinecone to seamlessly transfer and store data as vectors.
⚠️ Important: This Actor is intended for use alongside other Actors. For instance, when using the Website Content Crawler, enable this integration to store data as vectors in Pinecone.
Explore how to utilize vector stores on the Apify platform by reading our blog post: Understanding Pinecone and Its Importance for Your LLMs.
Description
This integration is designed to process and store data vectors from various Apify Actors. It interfaces with OpenAI and Pinecone through langchain to perform the following steps:
- Retrieve the Actor's dataset using dataset_id (automatically passed in the integration).
- Fetch the dataset using the Apify SDK.
- [Optional] Segment text data into chunks with langchain's RecursiveCharacterTextSplitter (parameters like chunk_size and chunk_overlap are customizable).
- Compute embeddings via OpenAI.
- Store the resulting vectors in Pinecone.
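To illustrate how chunk_size and chunk_overlap interact in the optional chunking step, here is a minimal sketch in plain Python. It is a simplified sliding-window splitter, not langchain's RecursiveCharacterTextSplitter, which additionally tries to break on separators such as paragraph and line breaks. The function name split_text is hypothetical and is not part of this integration.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character windows of at most chunk_size,
    where consecutive windows share chunk_overlap characters.

    Simplified stand-in for illustration only; the integration itself
    uses langchain's RecursiveCharacterTextSplitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

A larger chunk_overlap repeats more context between neighbouring chunks, which produces more chunks (and therefore more embeddings) for the same text.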
Before You Start
Ensure you have the following prerequisites for this integration:
- An OpenAI account and API token. Sign up for a free account at OpenAI.
- A Pinecone database with a valid API key (pinecone_token).
Inputs
Refer to the input schema for detailed information:
- index_name: Name of the Pinecone index.
- pinecone_token: Your Pinecone access token (API key).
- openai_token: Your OpenAI API token.
- fields: Array of fields you want to save. For example, if you want to push name and user.description fields, you should set this field to ["name", "user.description"].
- metadata_values: Object of metadata values you want to save. For example, if you want to push url and createdAt values to Pinecone, you should set this field to {"url": "https://www.apify.com", "createdAt": "2021-09-01"}.
- metadata_fields: Object of metadata fields you want to save. For example, if you want to push url and createdAt fields, you should set this field to {"url": "url", "createdAt": "createdAt"}. If it has the same key as metadata_values, it's replaced.
- chunk_size: Maximum character length for each text chunk.
- chunk_overlap: Overlap in characters between consecutive text chunks.
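As a rough illustration of how these inputs fit together, the sketch below builds an input object in Python and prints it as JSON. The key names come from the list above; the index name, the placeholder tokens, and the chunk_size and chunk_overlap values are hypothetical, so check the input schema for actual defaults and limits.

```python
import json

# Example input for the integration. All values are illustrative.
actor_input = {
    "index_name": "my-index",                      # hypothetical index name
    "pinecone_token": "<YOUR_PINECONE_API_KEY>",   # placeholder, not a real key
    "openai_token": "<YOUR_OPENAI_API_TOKEN>",     # placeholder, not a real token
    # Dataset fields whose text should be stored; dot notation reaches nested data.
    "fields": ["name", "user.description"],
    # Constant metadata attached as given.
    "metadata_values": {"createdAt": "2021-09-01"},
    # Metadata read from each dataset item: metadata key -> field in the item.
    "metadata_fields": {"url": "url"},
    "chunk_size": 1000,     # assumed value: maximum characters per chunk
    "chunk_overlap": 100,   # assumed value: characters shared by consecutive chunks
}

print(json.dumps(actor_input, indent=2))
```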
The fields, metadata_values, and metadata_fields inputs support dot notation for nested data.
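Dot notation here means that a path such as user.description refers to the description key inside the nested user object. The sketch below shows one way such a lookup could work; the helper get_nested and the sample item are hypothetical and are not taken from the integration's source code.

```python
from typing import Any


def get_nested(item: dict, path: str, default: Any = None) -> Any:
    """Resolve a dot-separated path such as "user.description" in nested dicts.

    Returns default when any part of the path is missing.
    """
    current: Any = item
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical dataset item, for illustration only.
item = {
    "name": "Apify",
    "url": "https://www.apify.com",
    "user": {"description": "Web scraping platform"},
}

# Values selected by the fields input.
selected = {path: get_nested(item, path) for path in ["name", "user.description"]}

# Metadata built from the metadata_fields input (metadata key -> path in the item).
metadata = {key: get_nested(item, path) for key, path in {"url": "url"}.items()}

print(selected)
print(metadata)
```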
Outputs
This integration saves selected fields from your Actor's output into your Pinecone database.
Community and Support
- Join our developer community on Discord to connect with other developers and discuss integrations.
- Visit Apify for the data needs of your LLMs: tools to ingest comprehensive datasets from various sources and enrich your large language models.
- Created in May 2023