Pinecone Integration

apify/pinecone-integration

This integration transfers data from Apify Actors to Pinecone and is a good starting point for a question-answering, search, or RAG use case.

Setting up the Pinecone Namespace

Closed

sprouto_net opened this issue
4 months ago

How do I set the Pinecone Namespace?

jiri.spilka

Thank you for your interest in the Pinecone integration. The namespace parameter is not exposed at the moment, but I will release it tomorrow (Friday, the 6th). Please let me know if there's anything else you'd like to see added.

jiri.spilka

Hi, in the latest build, I’ve added the option to select a Pinecone namespace using the pineconeIndexNamespace parameter. This parameter is optional:

{
    "pineconeApiKey": "....",
    "pineconeIndexName": "....",
    "pineconeIndexNamespace": "ns1"
}
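
Because the parameter is optional, the input can be built with or without it. The following is a minimal Python sketch of that idea, not the integration's own code: `build_pinecone_input` is a hypothetical helper, and only the three field names shown above come from this thread. Any other inputs the integration may need are left out here, just as they are in the example above. When the namespace is omitted, the data presumably ends up in the index's default namespace.

```python
from typing import Optional


def build_pinecone_input(
    api_key: str,
    index_name: str,
    namespace: Optional[str] = None,
) -> dict:
    """Build the Pinecone-related part of the integration input.

    Hypothetical helper for illustration; only the three field names
    are taken from the thread.
    """
    run_input = {
        "pineconeApiKey": api_key,
        "pineconeIndexName": index_name,
    }
    # The namespace is optional, so include the key only when a value is given.
    if namespace:
        run_input["pineconeIndexNamespace"] = namespace
    return run_input
```

Calling `build_pinecone_input("....", "....", "ns1")` yields the same three fields as the JSON above.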

I’m closing this issue now. Please let me know if there’s anything else I can help with.

sprouto_net

4 months ago

Thanks for the quick reply. However, we need the pineconeIndexNamespace value to be parameterized. When we send a URL to the website content crawler, we would also send the "AccountID" of the user for whom we want to crawl that website. This "AccountID" would be passed on to Pinecone, so all data crawled for this user would be saved in the same namespace. This way we can keep one user's crawl data separate from another's.

In short, we want to send a parameter to the website content crawler in any available field and have it passed on to our Pinecone integration, the same way we already pass the URL as a parameter.

jiri.spilka

Ok, I understand your requirement. I'm not sure how you're executing the website content crawler, but I assume that if you want it to be parameterized, you’re calling the website content crawler via API or SDK (JavaScript, Python).

Here’s a simple example of how you can achieve this in Python:

  1. Call the website content crawler:
from apify_client import ApifyClient
# APIFY_API_TOKEN is assumed to hold your Apify API token
client = ApifyClient(APIFY_API_TOKEN)
actor_call = client.actor("apify/website-content-crawler").call(
    run_input={"startUrls": [{"url": "https://docs.pinecone.io/home"}]}
)
  2. After the website content crawl finishes, call the Pinecone integration:
pinecone_integration_inputs = {
    "pineconeApiKey": PINECONE_API_KEY,
    "pineconeIndexName": PINECONE_INDEX_NAME,
    # Replace "AccountID" with the actual account ID of the user
    "pineconeIndexNamespace": "AccountID",
    "datasetId": actor_call["defaultDatasetId"],
}

actor_call = client.actor("apify/pinecone-integration").call(run_input=pinecone_integration_inputs)
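
The two steps above could be wrapped in one function that takes the account ID as an argument, so each user's crawl lands in its own namespace. This is only a sketch under a few assumptions: `crawl_into_namespace` and its parameter names are hypothetical, `client` is expected to be an `ApifyClient` instance from the `apify_client` package, and the input fields are limited to the ones shown in this thread.

```python
from typing import Any


def crawl_into_namespace(
    client: Any,
    url: str,
    account_id: str,
    pinecone_api_key: str,
    pinecone_index_name: str,
) -> Any:
    """Crawl `url`, then store the results in the namespace named `account_id`.

    Hypothetical wrapper; `client` is assumed to behave like ApifyClient.
    """
    if not account_id:
        raise ValueError("account_id must be a non-empty string")

    # Step 1: run the crawler; call() returns once the run has finished.
    crawl_run = client.actor("apify/website-content-crawler").call(
        run_input={"startUrls": [{"url": url}]}
    )

    # Step 2: pass the crawl's dataset to the Pinecone integration,
    # using the account ID as the namespace.
    integration_input = {
        "pineconeApiKey": pinecone_api_key,
        "pineconeIndexName": pinecone_index_name,
        "pineconeIndexNamespace": account_id,
        "datasetId": crawl_run["defaultDatasetId"],
    }
    return client.actor("apify/pinecone-integration").call(
        run_input=integration_input
    )


# Example usage (assumes the apify_client package and valid credentials):
#   from apify_client import ApifyClient
#   client = ApifyClient(APIFY_API_TOKEN)
#   crawl_into_namespace(client, "https://docs.pinecone.io/home", "account-123",
#                        PINECONE_API_KEY, PINECONE_INDEX_NAME)
```

Taking the client as a parameter keeps the function easy to exercise with a stub, and the caller decides how the account ID is obtained.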

Do you plan to execute it in any other way, for example from console.apify.com?

Developer
Maintained by Apify

Actor Metrics

  • 28 monthly users

  • 19 stars

  • 95% runs succeeded

  • 38 days response time

  • Created in Jun 2024

  • Modified 17 days ago