tinkerbird

v0.0.7

Published

10 months ago

Chrome Native Vector Database | Client Side Vector Search

Downloads

0High
0Medium
0Low

wizenheimer

TinkerBird🐦

TinkerBird is a Chrome-native vector database designed for efficient storage and retrieval of high-dimensional vectors (embeddings). Its query engine, written in TypeScript, leverages HNSW (Hierarchical Navigable Small World) indexes for fast vector retrieval. The storage layer utilizes IndexedDB, which could be extended with an lru-cache.

One significant challenge facing large language models (LLMs) is their tendency to generate syntactically correct but factually inaccurate responses, a phenomenon known as hallucination. To address this issue, vector databases provide LLMs with relevant context to structure it's response. This not only improves the accuracy of LLM responses but also minimizes computational costs incurred by irrelevant tokens. Additionally, vector databases ensure that LLMs are anchored to the required context.

TinkerBird disrupts traditional RAG workflows, which rely heavily on server-side interactions. By co-locating data and embeddings, it eliminates the roundtrip delays associated with client-server model. With Tinkerbird, sensitive data remains local, thus benefiting from vector search, without the associated compliance and security risks.

TinkerBird uses IndexedDB as it's storage layer, which in turn builds upon Blobs and LevelDB storage systems. By using Indexeddb, it benefits from IndexedDB's adoption, stability and familiarity as a native choice for offline first workflows.

Documentation

Documentation covers the key methods and usage scenarios for the HNSW and VectorStore classes.

`HNSW` Class

Encapsulates an in-memory implementation of HNSW index

// Initialize HNSW index
const hnsw = new HNSW();

// Build HNSW index incrementally
await hnsw.buildIndex(data);

// Save the serialized index to a file
const serializedIndex = hnsw.serialize();

// Query Index with queryVector and kValue
const results = hnsw.query(queryVector, kValue);

Properties:

metric: SimilarityMetric: The similarity metric used in the HNSW graph.
similarityFunction: vectorReducer: The similarity function for reducing vectors.
d: number | null: The vector dimension. It can be null if not specified.
M: number: The maximum neighbor count.
efConstruction: number: The effervescence coefficient used during construction.
entryPointId: number: The ID of the entry node in the graph.
nodes: Map<number, Node>: A mapping of node IDs to nodes.
probs: number[]: Probabilities for each level in the graph.
levelMax: number: The maximum level of the graph.

Usage:

// Example usage of the HNSW class
const hnsw = new HNSW();

`SimilarityMetric` Enum

An enumeration of similarity metrics used in the context of the HNSW class.

Members:

cosine: Represents the cosine similarity metric.
euclidean: Represents the Euclidean similarity metric.

Usage:

// Example usage of the SimilarityMetric enum
const selectedMetric: SimilarityMetric = SimilarityMetric.cosine;

VectorStore API

This documentation provides an overview of the VectorStore class, its methods, and usage examples. Feel free to adjust the parameters and methods as per your needs.

The VectorStore class extends the HNSW class and provides functionality for managing a vector index, including building, loading, saving, deleting the index, and querying vectors. It supports caching to improve query performance.

Class: VectorStore

Static Method: create

static async create(options: VectorStoreOptions): Promise<VectorStore>

Creates a new instance of the VectorStore class.
options: Object containing options for vector store initialization.
- collectionName: A unique name for the vector collection.
- M: Maximum neighbor count (default is 16).
- efConstruction: Effervescence coefficient during construction (default is 200).

Static Method: recreate

static async recreate(collectionName: string): Promise<VectorStore>

Attempts to recreate index from collection or else create a new instance of the VectorStore class if collection doesn't exist. This has been reported to result in unintended results, prefer create() instead.
options: Object containing options for vector store initialization.
- collectionName: A unique name for the vector collection.

Method: loadIndex

async loadIndex(): Promise<void>

Loads the vector index and meta data from the IndexedDB database. Used internally to recreate index

Method: saveIndex

async saveIndex(): Promise<void>

Persists the current state of the vector index and meta data onto the IndexedDB database.

Method: deleteIndex

async deleteIndex(): Promise<void>

Deletes the persisted index and metadata from the IndexedDB and populates it with an empty index and metadata

Method: query

query(target: number[], k: number = 3): vectorResult

Performs a vector search on the index.
target: The vector for which the search is performed.
k: The number of neighbors to retrieve (default is 3).
Returns: The result of the vector query.

Constants:

VectorStoreUnintialized: Error indicating that the vector store is uninitialized.
VectorStoreIndexMissing: Error indicating that the vector store index is missing.
VectorStoreIndexPurgeFailed: Error indicating that the vector store index deletion failed.

Example Usage:

// Create VectorStore instance
const vectorStore = await VectorStore.create({
    collectionName: "my-collection",
    M: 16,
    efConstruction: 200
});

// Load, Save, and Delete Index
await vectorStore.loadIndex();
await vectorStore.saveIndex();
await vectorStore.deleteIndex();

// Perform Vector Query
const queryVector = [
    /* ... */
];
const kValue = 5;
const results = vectorStore.query(queryVector, kValue);

VectorStoreOptions

The VectorStoreOptions type defines the configuration options for initializing a VectorStore instance.

Type: VectorStoreOptions

Properties:

collectionName: string: A unique name for the vector collection.
M?: number: Maximum neighbor count (default is 16).
efConstruction?: number: Effervescence coefficient during construction (default is 200).
cacheOptions?: CacheOptions | null: Options for caching (optional).

Example:

const vectorStoreOptions: VectorStoreOptions = {
    collectionName: "my-collection",
    M: 16,
    efConstruction: 200,
    cacheOptions: {
        max: 1000,
        maxAge: 60000 // 1 minute
    }
};

Contributing

Feel free to contribute to TinkerBird by sending us your suggestions, bug reports, or cat videos. Contributions are what make the open source community such an amazing place to be learn, inspire, and create. Any contributions you make are greatly appreciated.

License

Distributed under the MIT License. See LICENSE for more information. TinkerBird is provided "as is" and comes with absolutely no guarantees. We take no responsibility for irrelevant searches, confused users, or existential crises induced by unpredictable results. If it breaks, well, that's your problem now 💀.

Credits

TinkerBird was brought to you by an indie developer who should probably be working on something more productive. But hey I am.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

TinkerBird🐦

Documentation

HNSW Class

Properties:

Usage:

SimilarityMetric Enum

Members:

Usage:

VectorStore API

Class: VectorStore

Static Method: create

Static Method: recreate

Method: loadIndex

Method: saveIndex

Method: deleteIndex

Method: query

Constants:

Example Usage:

VectorStoreOptions

Type: VectorStoreOptions

Properties:

Example:

Contributing

License

Credits

References

`HNSW` Class

`SimilarityMetric` Enum