llamaindex-pinecone
v0.1.0
Published
Pinecone integration for llamaindex
Downloads
2
Readme
LlamaIndex integration with Pinecone
This repository contains a LlamaIndex-compatible vector store backed by Pinecone indices.
Installation
npm install llamaindex-pinecone
API config
To automatically have a working client initialized, site these environment variables:
PINECONE_API_KEY
: Your API KeyPINECONE_ENVIRONMENT
: The environment of the pinecone instance you're using
Usage
The heart and soul of this package is PineconeVectorStore
. Let's see how that works.
VectorStore
interface compliance
PineconeVectorStore
adheres to the VectorStore
interface from llamaindex
:
export interface VectorStore {
storesText: boolean;
isEmbeddingQuery?: boolean;
client(): any;
add(embeddingResults: NodeWithEmbedding[]): Promise<string[]>;
delete(refDocId: string, deleteKwargs?: any): Promise<void>;
query(query: VectorStoreQuery, kwargs?: any): Promise<VectorStoreQueryResult>;
persist(persistPath: string, fs?: GenericFileSystem): Promise<void>;
}
A basic integration
import { storageContextFromDefaults, VectorStoreIndex } from "llamaindex";
import { FullContentMetadataBuilder, FullContentNodeHydrator, PineconeVectorStore } from "llamaindex-pinecone";
// Basic settings that have higher storage usage in Pinecone,
// but allow for quick plug-and-play
const easyModeSettings = {
// Store the entire node as JSON in the vector's metadata
pineconeMetadataBuilder: FullContentMetadataBuilder,
// When reading a vector, re-build the node from that JSON
nodeHydrator: FullContentNodeHydrator
}
// Initialize with the name of an index in Pinecone
const vectorStore = new PineconeVectorStore("speeches", easyModeSettings);
// define a storage context that's backed by our Pinecone vector store
const storageContext = await storageContextFromDefaults({ vectorStore })
// use that storage while we're loading documents
const vectorStoreIndex = await VectorStoreIndex.fromDocuments(presidentialInauguralAddresses, { storageContext });
// Make a query engine
const queryEngine = vectorStoreIndex.asQueryEngine();
// and ask it some questions!
const queryResponse = await queryEngine.query("What is the role of the Founding Fathers across inaugural addresses?");
And that's it!
Below is deeper documentation on interacting with the store itself. There is ample opportunity for customization.
VectorQueryStore
Adding a node to the store
Adding a node to the store requires us to pass a node and its embedding. The below example has already been tokenized:
import { Document } from "llamaindex";
const vectorStore = new PineconeVectorStore("tongue-twisters");
const tongueTwister = new Document({text: "Peter Piper picked a peck of pickled peppers. A peck of pickled peppers Peter Piper picked. If Peter Piper picked a peck of pickled peppers, Where's the peck of pickled peppers Peter Piper picked?", id_: "peter-piper"})
const embedding = [
1, 5310, 7362, 546, 18691, 263, 1236, 384, 310, 5839, 839, 1236,
22437, 29889, 319, 1236, 384, 310, 5839, 839, 1236, 22437, 5310, 7362,
546, 18691, 29889, 960, 5310, 7362, 546, 18691, 263, 1236, 384, 310,
5839, 839, 1236, 22437, 29892, 6804, 29915, 29879, 278, 1236, 384, 310,
5839, 839, 1236, 22437, 5310, 7362, 546, 18691, 29973
];
await vectorStore.add({ node: tongueTwister, emedding })
// => ["peter-piper]
We see that it has returned an array of node ids that were successfully upserted to Pinecone. Woohoo!
Batching
The call to add
takes a set of options as its second argument. Calls to the Pinecone API are batched, in groups of 100 by default. Passing a batchSize
changes this value:
await vectorStore.add(nodesWithEmbeddings, { batchSize: 500 })
Changing the batch size possibly affects how many requests are made to Pinecone's API. You may want to fiddle with this if you are hitting rate limits. Note that Pinecone recommends batch sizes less than 1000.
Sparse values
Upsertion supports sparse values, if includeSparseValues: true
in passed in. By default, the sparse values will be built for you with a rather naive count-the-frequencies method.
With the same node and embedding above:
await vectorStore.upsert(nodesWithEmbeddings, { includeSparseValues: true });
This will result in a sparse values dictionary included in vector upsert that looks like this:
{
indices: [
1, 263, 278, 310, 319, 384, 546,
839, 960, 1236, 5310, 5839, 6804, 7362,
18691, 22437, 29879, 29889, 29892, 29915, 29973
],
values: [
1, 2, 1, 4, 1, 4, 4,
4, 1, 8, 4, 4, 1, 4,
4, 4, 1, 2, 1, 1, 1
]
}
Querying
Now that we've put some data into Pinecone, let's take it back out again.
The old standby: query
Let's imagine a Pinecone index that contains vectors of the last 80 years of US President innaugural addresses. We'll now craft a query of "I love America"—a sentiment we'd expect to be pretty common!—and see the top five most relevant results:
import { VectorStoreQueryMode } from "llamaindex";
const speechQuery = "I love America";
const queryEmbedding = Tokenizer.tokenize(speechQuery) // Tokenizer not included
const queryResponse = await vectorStore.query({ queryEmbedding, similarityTopK: 5, mode: VectorStoreQueryMode.DEFAULT });
// => {
// nodes: [],
// similarities: [ 0.134187013, 0.107230663, 0.095018886, 0.0797220916, 0.07660871 ],
// ids: [
// './speeches/inaugural-addresses/Dwight Eisenhower/1953.txt',
// './speeches/inaugural-addresses/Ronald Reagan/1981.txt',
// './speeches/inaugural-addresses/Franklin Delano Roosevelt/1945.txt',
// './speeches/inaugural-addresses/Richard Nixon/1973.txt',
// './speeches/inaugural-addresses/Harry Truman/1949.txt'
// ]
// }
A top score of 0.13 is not inspiring, but, at least for README purposes, you can see the shape of the response. By default, the node ids are read from the nodeId
field of the Pinecone vector's metadata. The similarities scores and ids correspond to each other by array index: the first similarity is for the first id, the second similarity for the second id, etc.
If you would like query
to rebuild the nodes themselves, see "Hydrating nodes from the vector metadata" down below!
Looking for more: queryAll
This method is mostly analagous with VectorStore.query
except that it returns Pinecone's matches rather than nodes.
import { VectorStoreQuery, VectorStoreQueryMode } from 'llamaindex';
// A class that shows the subset of `VectorStoreQuery` fields supported
// by the PineconeVectorStore (and their API.)
class PineconeVectorStoreQuery implements VectorStoreQuery {
queryEmbedding?: number[];
similarityTopK: number;
mode: VectorStoreQueryMode;
alpha?: number;
filters?: MetadataFilters;
constructor(args: any) {
Object.assign(this, args);
}
}
const query = new PineconeVectorStoreQuery({
similarityTopK: 3,
queryEmbedding: [1,2,3,4,5],
mode: VectorStoreQueryMode.SPARSE
});
await vectorStore.queryAll(query);
/** =>
* {
* matches: [
* { id: "vector-1", score: 0.123 },
* { id: "vector-2", score: 0.32 },
* { id: "vector-3", score: 0.119 }
* ]
* }
Options
The second argument to queryAll
is a dictionary of options. Those may include:
namespace
: the namespace to query. If not provided, will query the default namespace.includeMetadata
: whether or not to include metadata in the response. Defaults to true.includeValues
: whether or not to include matching vector values in the response. Defaults to true.vectorId
: the id of a vector already in Pinecone that will be used as the query. Exclusive withquery.queryEmbedding
.
Return value
As we've seen, the response is an object with a key matches
and an array of scored vectors. Each object contains:
id
: id of the vectorscore
: the similarity to the query vectorvalues
: the vector, ifincludeValues
wastrue
sparseValues
: the sparse values of the vectormetadata
: the metadata of the vector, ifincludeMetadata
wastrue
Fetching vectors
Simple stuff.
Note: this fetches vectors, not vectors for a node.
For nodes with an embedding <= the dimension of the index, that's the same as the node.nodeId
.
vectorStore.client.fetch(["peter-piper"], "Namespace (Optional: defaults to default namespace)")
// => {
// namespace: "default",
// vectors: {
// "peter-piper": {
// id: "peter-piper",
// values: [1,2,3,3,4,5], // vector values
// sparseValues: { indicies: [98, 412, 5, 12, 4], values: [3, 2, 1, 4, 2]},
// metadata: {
// nodeId: "peter-piper",
// age: "old",
// diffculty: "medium"
// }
// }
// }
// }
The response contains a vectors
object where the key is the vector id and the value is the vector.
Deleting vectors
By node id
Deletes all vectors associated with the given node ids.
🚨 NOTE 🚨
This does not work on Free/Starter plans, which don't support filters on delete operations. Use deleteVectors
instead.
await client.delete("peter-piper");
For multiple nodes
const nodeIds = ["wordNode", "peter-piper"];
await client.deleteAll(nodeIds);
By vector id
If you have the vectors' ids handy, you're welcome to delete those specifically:
const vectorIds = [
"wordNode-0",
"wordNode-1"
"wordNode-2",
"wordNode-3"
];
await client.deleteVectors(vectorIds, "Namespace (Optional: defaults to default namespace)");
Customization
PineconeVectorStore
works well out of the box. You might want some customization, though.
Customizing the client
By default, these Pinecone variables are pulled from the environment:
PINECONE_API_KEY
: Your API KeyPINECONE_ENVIRONMENT
: The environment of the pinecone instance you're using
If you'd like to bring your own client, the client
option will gladly accept yours.
import { PineconeClient } from "@pinecone-database/pinecone";
const myPineconeClient = new PineconeClient();
await myPineconeClient.init({ apiKey: "something secure", environment: "something environmental" });
const vectorStore = new PineconeVectorStore({ indexName: "UFO-files", client: myPineconeClient })
Customizing Sparse value generation
The naive sparse value generation is, as its name implies, naive. Other methods, like BM25 or SPLADE, may be more effective. The vector store supports passing a class that knows how to generate sparse values.
import { SparseValues, SparseValueBuilder } from "llamaindex-pinecone";
class FancySparseValueBuilder implements SparseValueBuilder {
embeddings: Array<number>;
constructor(embeddings: Array<number>) {
this.embeddings = embeddings;
}
build(): SparseValues {
// Do fancy math to generate indices and values
// Return a dictionary with those (aka `SparseValues`):
return { indicies, values }
}
}
// Now we can use it!
const vectorStore = PineconeVectorStore("fancy-documents", { sparseValueBuilder: FancySparseValueBuilder });
When calling add
or upsert
with includeSparseValues: true
, that builder will be used to generate sparse values being sent to Pinecone.
Custom Vector Metadata Format
By default, PineconeVectorStore will upsert metadata in the form of:
{
nodeId: node.nodeId,
...node.metadata
}
To customize this, pass a class that implements PineconeMetadataBuilder
. That means one methof of the signature buildMetadata(node: BaseNode) => PineconeMetadata
as the pineconeMetadataBuilder
option for the PineconeVectorStore
constructor.
For example, if you want to include the full node's content serialized into the metadata, you might have:
class FullContentMetadataBuilder implements PineconeMetadataBuilder {
buildMetadata(node: BaseNode): PineconeMetadata {
const nodeContent = JSON.stringify(node);
const metadata: PineconeMetadata = {
nodeId: node.id_,
nodeType: node.getType(),
nodeContent,
};
return metadata;
}
}
const pineconeVectorStore = new PineconeVectorStore("speeches", { pineconeMetadataBuilder: FullContentMetadataBuilder });
await vectorStore.add({ node: tongueTwister, emedding });
In fact, this exact implementation is included out of the box. import { FullContentMetadataBuilder } from "pinecone-llamaindex"
today!
Custom Vector Metadata Options
Out of the box, PineconeVectorStore.add
and .upsert
use SimpleMetadataBuilder
. It essentially adds the nodeId
and then splats the node.metadata
into an object. For Pinecone metadata, the only acceptable keys are string and values must be strings, numbers, booleans, or arrays of those types.
pineconeMetadataBuilderOptions
is an optional parameter on the PineconeVectorStore
constructor which will filter down to that builder. Fort the simple builder, the only presently supported option is excludedMetadataKeys
, which takes an array of keys to skip.
tongueTwister.metadata = {
difficulty: "medium",
age: "old",
pineconeAPIKey: "xxxxxxxxxxxxxxxx"
}
const pineconeVectorStore = new PineconeVectorStore("speeches", { pineconeMetadataBuilderOptions });
const pineconeMetadataBuilderOptions = { excludedMetadataKeys: ["pineconeAPIKey"] }
await vectorStore.add({node: tongueTwister, embedding })
In this case, only { diffculty: "medium", age: "old" }
will be upserted to Pinecone, but the tongueTwister
node itself will remain untouched.
This can be even more useful when implementing a custom PineconeMetadataBuilder
as seen above.
Hydrating nodes from the vector metadata
Returning the ids of nodes is fun, but maybe you'd like to re-build the nodes themselves. There are two options that can be passed into PineconeVectorStore
's constructor that enable this:
nodeHydrator
: a class conforming to NodeHydratorClass. Instances of this class must include a method of the signaturehydrate(vectorMetadata: PineconeMetadata) => BaseNode
;nodeHydratorOptions
: Something you want passed to the constructor of thenodeHydrator
Included in this package is FullContentNodeHydrator
which will:
- Look for a
nodeContent
property on the vector's metadata - Parse it as JSON
- Look for a
nodeType
- Match it against
ObjectType
in the llamaindex package - If it is "DOCUMENT", "TEXT", or "INDEX", initialized that node type passing in the node content to its constructor.
To see how this works, less look at an example:
import { Document } from "llamaindex";
import { FullContentNodeHydrator } from "llamaindex-pinecone";
// Example node object when it was originally upserted
const originalNodeData = { id_: "document-11", text: "secret data", metadata: { author: "CIA" } };
// What its vector metadata was upserted as
const pineconeVectorMetadata = {
nodeId: "document-11",
nodeType: "DOCUMENT",
nodeContent: '{"id_":"document-11","text":"secret data","metadata":{"author":"CIA"}}'
};
const pineconeVectorStore = new PineconeVectorStore("speeches", { nodeHydrator: FullContentNodeHydrator })
const queryResults = await pineconeVectorStore.query(vectorStoreQuery);
const expectedDocument = new Document(originalNodeData);
expectedDocument == queryResults.nodes[0];
// => true
This pairs nicely with the example in "Custom Vector Metadata Format" above, which will upsert nodes in that format automatically.
Advanced uses
Upsert
Upsert is the underlying method that backs add
. Let's look back at the example from add
's documentation above to see what else upsert
tells us.
import { Document } from "llamaindex";
const tongueTwister = new Document({text: "Peter Piper picked a peck of pickled peppers. A peck of pickled peppers Peter Piper picked. If Peter Piper picked a peck of pickled peppers, Where's the peck of pickled peppers Peter Piper picked?", id_: "peter-piper"})
const embedding = [
1, 5310, 7362, 546, 18691, 263, 1236, 384, 310, 5839, 839, 1236,
22437, 29889, 319, 1236, 384, 310, 5839, 839, 1236, 22437, 5310, 7362,
546, 18691, 29889, 960, 5310, 7362, 546, 18691, 263, 1236, 384, 310,
5839, 839, 1236, 22437, 29892, 6804, 29915, 29879, 278, 1236, 384, 310,
5839, 839, 1236, 22437, 5310, 7362, 546, 18691, 29973
];
const nodesWithEmbeddings = [{node: tongueTwister, embedding}];
await vectorStore.upsert(nodesWithEmbeddings);
/**
* => {
* upsertedNodeCount: 1,
* upsertedNodeIds: ["peter-piper"],
* upsertedVectorCount: 1,
* upsertedVectorByNode: { "peter-piper": ["peter-piper"] }
* failedNodeCount: 0,
* failedNodeIds: [],
* errors: []
* }
*/
It's a lot! Let's break down what's in there:
upsertedNodeCount
: the total number of nodes upserted,upsertedNodeIds
: the node ids that were successfully upsertedupsertedVectorCount
: the total number of vectors upsertedupsertedVectorByNode
: a mapping of the node ids to the vectors that were upserted for that nodefailedNodeCount
: the number of nodes that failed to fully upsertfailedNodeIds
: the ids of the nodes that failed to fully upserterrors
: an array of errors that occurred during upsert
For most cases, upsertedNodeCount
and upsertedVectorCount
will be the same. See the "Automatic Embedding Splitting" section below for the option to split nodes across multiple vectors.
Passing duplicate vectors
Note that the passing duplicate nodes—those with identical node ids—and embeddings will only create one vector in Pinecone, but the response will count both. The returned array will return the nodeId twice.
Automatic Embedding Splitting
By default, the embedding for a node is expected to be exactly as long as the dimension of the index. If not, PineconeVectorStore
will throw an error.
If you have a large embedding that you would like automatically, though naively, chunked into multiple vectors the size of the index's dimension, set splitEmbeddingsByDimension
to true
. This is most useful for testing.
Here's a contrived example with a pinecone index whose dimension is 1
:
const indexInfo = await myPineconeClient.Index("letters").describeIndexStats();
console.log(indexInfo.dimension)
// => 1
const node = TextNode({text: "word", id_:"wordNode"})
vectorStore.upsert([{ node , embedding: [23, 15, 18, 4]}], { splitEmbeddingsByDimension: true })
/**
* => {
* upsertedNodeCount: 1,
* upsertedNodeIds: ["wordNode"],
* upsertedVectorCount: 4,
* upsertedVectorByNode: {
* "wordNode": [
* "wordNode-0",
* "wordNode-1",
* "wordNode-2",
* "wordNode-3"
* ]
* },
* failedNodeCount: 0,
* failedNodeIds: [],
* errors: []
* }
*/
The API request to Pinecone would look something like this:
{
"vectors": [
{ "id": "wordNode-0", "values":[23], "metadata": { "nodeId": "wordNode" } },
{ "id": "wordNode-1", "values":[15], "metadata": { "nodeId": "wordNode" } },
{ "id": "wordNode-2", "values":[18], "metadata": { "nodeId": "wordNode" } },
{ "id": "wordNode-3", "values":[4], "metadata": { "nodeId": "wordNode" } }
]
}
Notice that the vector has been split to fit the dimension of the index. The ids in Pinecone have been adapted to be indexed in order, prefixed by the node's id, and the metadata is preserved.
The node's id is always included in the metadata, so deleting the document handles cleaning up all related vectors automatically. Query's can still filter by that node id in the metadata.
With batching
Note that for batched requests, the vectors for a given node can be spread across multiple requests. These requests can succeed or fail independently, so it is possible for a given node id to be in both the upserted and failed lists. Since these are upserts, it is safe to retry the failed nodes.