genkitx-astra-db

v0.1.2

An Astra DB indexer and retriever for Genkit

Downloads

181

Astra DB Plugin for Genkit

This plugin provides an Astra DB retriever and indexer for Genkit.

Installation

npm i genkitx-astra-db

Prerequisites

You will need a DataStax account in which to run an Astra DB database. Free DataStax accounts are available.

Once you have an account, create a Serverless Vector database. When the database has been provisioned, create a collection in it. Ensure that the collection's number of dimensions matches the size of the vectors produced by the embedding provider you are going to use.
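A dimension mismatch is easiest to catch before any documents are indexed. The following is a minimal sketch of such a check. `assertDimensionMatch` is a hypothetical helper, not part of this plugin or of Genkit, and the value 768 is only an illustrative dimension.

```typescript
// Hypothetical helper: not part of genkitx-astra-db or Genkit.
// Throws when an embedding's length differs from the dimension
// configured on the collection.
function assertDimensionMatch(
  embedding: number[],
  collectionDimension: number
): void {
  if (embedding.length !== collectionDimension) {
    throw new Error(
      `Embedding has ${embedding.length} dimensions, ` +
        `but the collection expects ${collectionDimension}.`
    );
  }
}

// 768 is used here purely as an illustrative value.
const sampleEmbedding: number[] = new Array(768).fill(0);
assertDimensionMatch(sampleEmbedding, 768); // passes silently
```

In practice you would run a check like this once against a real vector from your embedder, using the dimension you chose when creating the collection.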

You will then need the database's API Endpoint, an Application Token and the name of the collection in order to configure the plugin.

Configuration

To use the Astra DB plugin, specify it when you call configureGenkit().

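// Imports assume Genkit 0.5.x; the embedder is taken from the Google AI plugin here.
import { configureGenkit } from "@genkit-ai/core";
import { textEmbeddingGecko001 } from "@genkit-ai/googleai";
import { astraDB } from "genkitx-astra-db";

configureGenkit({
  plugins: [
    // The plugin that provides your embedder must also be registered in this list.
    astraDB([
      {
        clientParams: {
          applicationToken: "your_application_token",
          apiEndpoint: "your_astra_db_endpoint",
          namespace: "default_keyspace",
        },
        collectionName: "your_collection_name",
        embedder: textEmbeddingGecko001,
      },
    ]),
  ],
});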

Client Parameters

You will need an Application Token and API Endpoint from Astra DB. You can either provide them through the clientParams object or by setting the environment variables ASTRA_DB_APPLICATION_TOKEN and ASTRA_DB_API_ENDPOINT.

If you are using the default namespace (default_keyspace, as in the example above), you do not need to pass namespace in clientParams.
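The fallback from clientParams to environment variables can be pictured roughly as below. This is a sketch of the idea rather than the plugin's actual code: `resolveClientParams`, `ClientParams` and `Env` are hypothetical names, and the precedence shown (explicit values win over environment variables) is an assumption that this README does not state.

```typescript
// Hypothetical types and helper, for illustration only.
type ClientParams = {
  applicationToken?: string;
  apiEndpoint?: string;
  namespace?: string;
};

type Env = Record<string, string | undefined>;

function resolveClientParams(params: ClientParams, env: Env) {
  // Assumption: explicitly passed values take precedence over the environment.
  const applicationToken =
    params.applicationToken ?? env["ASTRA_DB_APPLICATION_TOKEN"];
  const apiEndpoint = params.apiEndpoint ?? env["ASTRA_DB_API_ENDPOINT"];

  if (!applicationToken || !apiEndpoint) {
    throw new Error(
      "Provide an application token and API endpoint via clientParams " +
        "or ASTRA_DB_APPLICATION_TOKEN / ASTRA_DB_API_ENDPOINT."
    );
  }

  // namespace stays undefined when omitted, so the default namespace applies.
  return { applicationToken, apiEndpoint, namespace: params.namespace };
}
```

In a Node.js application you would call this as `resolveClientParams({}, process.env)`. Keeping the token in the environment avoids committing it to source control.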

Options

  • collectionName: The name of a collection that exists in the database accessed at the API endpoint.
  • embedder: The embedder to use, like Google's textEmbeddingGecko001. Ensure that you have set up your collection with the correct number of dimensions for the embedder that you are using.
  • embedderOptions: If the embedder takes extra options, you can provide them here.

Astra DB Vectorize

Providing an embedder is optional: you can use Astra DB Vectorize to generate your vectors instead. Ensure that you have set up your collection with an embedding provider. You can then skip the embedder option:

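// Import path assumes Genkit 0.5.x.
import { configureGenkit } from "@genkit-ai/core";
import { astraDB } from "genkitx-astra-db";

configureGenkit({
  plugins: [
    astraDB([
      {
        clientParams: {
          applicationToken: "your_application_token",
          apiEndpoint: "your_astra_db_endpoint",
          namespace: "default_keyspace",
        },
        // No embedder: the collection's Vectorize provider generates the vectors.
        collectionName: "your_collection_name",
      },
    ]),
  ],
});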

Usage

Import the indexer and retriever references like so:

import { astraDBIndexerRef, astraDBRetrieverRef } from "genkitx-astra-db";

Then get a reference using the collectionName and an optional displayName. Pass the indexer reference to the Genkit function index() and the retriever reference to retrieve().

Indexer

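// Import path assumes Genkit 0.5.x.
import { index } from "@genkit-ai/ai/retriever";

export const astraDBIndexer = astraDBIndexerRef({
  collectionName: "your_collection_name",
});

// `documents` is an array of Genkit Document objects prepared elsewhere.
await index({
  indexer: astraDBIndexer,
  documents,
});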

Retriever

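// Import path assumes Genkit 0.5.x.
import { retrieve } from "@genkit-ai/ai/retriever";

export const astraDBRetriever = astraDBRetrieverRef({
  collectionName: "your_collection_name",
});

// `query` is the text to search for; the call resolves to the matching documents.
await retrieve({
  retriever: astraDBRetriever,
  query,
});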

Options

You can pass options to retrieve() that will affect the retriever. The available options are:

  • k: The number of documents to return from the retriever. The default is 5.
  • filter: A Filter as defined by the Astra DB library. See "Advanced usage" below for how to use a filter.
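To make the meaning of k and filter concrete, here is a small in-memory sketch of a filtered top-k search. It is conceptual only: Astra DB performs the search server-side, and `StoredDoc`, `cosineSimilarity` and `topK` are hypothetical names. Cosine similarity is used as an example metric, and the filter is a plain predicate function rather than an Astra DB Filter.

```typescript
// Conceptual sketch only; not how the plugin or Astra DB is implemented.
type StoredDoc = { text: string; vector: number[]; year: number };

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep documents that pass the filter, rank them by similarity to the
// query vector, and return at most k of them (k defaults to 5, as above).
function topK(
  docs: StoredDoc[],
  queryVector: number[],
  k: number = 5,
  filter: (doc: StoredDoc) => boolean = () => true
): StoredDoc[] {
  return docs
    .filter(filter)
    .map((doc) => ({
      doc,
      similarity: cosineSimilarity(doc.vector, queryVector),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k)
    .map((entry) => entry.doc);
}
```

A larger k returns more candidates at the cost of including less similar documents. A filter narrows the candidate set before ranking, so fewer than k documents may come back.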

Advanced usage

If you want to perform a vector search with additional filtering (hybrid search) you can pass a schema type to astraDBRetrieverRef. For example:

type Schema = {
  _id: string;
  text: string;
  score: number;
};

export const astraDBRetriever = astraDBRetrieverRef<Schema>({
  collectionName: "your_collection_name",
});

await retrieve({
  retriever: astraDBRetriever,
  query,
  options: {
    filter: {
      score: { $gt: 75 },
    },
  },
});

You can find the operators that you can use in filters in the Astra DB documentation.

If you don't provide a schema type, you can still filter but you won't get type-checking on the filtering options.
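The benefit of the schema type can be seen in a simplified model. The sketch below is a hypothetical, heavily reduced stand-in for the Astra DB library's Filter type (`SimpleFilter`, `NumberComparison` and `matches` are not real exports, and only $gt and $lt are modelled). It shows how tying the filter to a schema lets the compiler reject misspelled fields and wrongly typed values.

```typescript
// Hypothetical, simplified filter type; the real Filter type is richer.
type NumberComparison = { $gt?: number; $lt?: number };

type SimpleFilter<T> = {
  [K in keyof T]?: T[K] extends number ? number | NumberComparison : T[K];
};

type Schema = {
  _id: string;
  text: string;
  score: number;
};

// Compiles: `score` exists on Schema and is a number.
const highScore: SimpleFilter<Schema> = { score: { $gt: 75 } };

// These would be rejected at compile time:
// const typo: SimpleFilter<Schema> = { scroe: { $gt: 75 } };
// const wrongType: SimpleFilter<Schema> = { score: { $gt: "75" } };

// Minimal runtime evaluation of the simplified filter, for illustration.
function matches(
  doc: Record<string, unknown>,
  filter: Record<string, unknown>
): boolean {
  for (const key of Object.keys(filter)) {
    const condition = filter[key];
    const value = doc[key];
    if (typeof condition === "object" && condition !== null) {
      const { $gt, $lt } = condition as NumberComparison;
      if (typeof value !== "number") return false;
      if ($gt !== undefined && !(value > $gt)) return false;
      if ($lt !== undefined && !(value < $lt)) return false;
    } else if (value !== condition) {
      return false;
    }
  }
  return true;
}
```

Without the schema type, both commented-out filters would be accepted by the compiler and the mistake would only surface at query time.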

Further information

For more on using indexers and retrievers with Genkit, check out the Genkit documentation on Retrieval-Augmented Generation.