axgen

v0.0.23

A framework for connecting your data to large language models

Axgen

Axgen is a framework for connecting your data to large language models.

Ingest, structure, and query your data with ease using the latest vector databases and LLMs.

npm i axgen

We built an open source demo UI for axgen, with a short video that shows the features.

Goals

Axgen's goal is to break down the various concepts of working with LLMs into components with well-defined interfaces. These interfaces are important as we cannot provide out-of-the-box components for every use case. Some components are provided (with more coming soon) but the interfaces enable you to extend the framework with your own implementations to satisfy arbitrary use cases.

Axgen aims to be a foundational framework from which you can construct higher-level, declarative workflows.
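To make the idea of extending the framework through interfaces concrete, here is a minimal sketch. The `Chunk` and `Splitter` shapes and the `ParagraphSplitter` class are hypothetical and written only for illustration; axgen's real interfaces may use different names and signatures.

```typescript
// Hypothetical interface shapes for illustration only; axgen's real
// interfaces may use different names and signatures.
interface Chunk {
  id: string;
  text: string;
}

interface Splitter {
  split(text: string): Chunk[];
}

// A custom implementation: split on blank lines (paragraphs).
class ParagraphSplitter implements Splitter {
  split(text: string): Chunk[] {
    return text
      .split(/\n\s*\n/)
      .map((part) => part.trim())
      .filter((part) => part.length > 0)
      .map((part, i) => ({ id: `chunk-${i}`, text: part }));
  }
}

// Code that depends only on the interface accepts any implementation.
function countChunks(splitter: Splitter, text: string): number {
  return splitter.split(text).length;
}

const paragraphCount = countChunks(
  new ParagraphSplitter(),
  'First.\n\nSecond.\n\n\nThird.'
);
```

Because `countChunks` only knows about `Splitter`, swapping in a different splitting strategy would not require changing it.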

Example

This example showcases some core concepts of Axgen by 1) ingesting markdown files into the Pinecone vector database and 2) using retrieval augmented generation to answer a question about the contents of the data.

import {
  Ingestion,
  Pinecone,
  FileSystem,
  MarkdownSplitter,
  OpenAIEmbedder,
  RAG,
  Retriever,
  PromptWithContext,
  OpenAICompletion,
} from 'axgen';

const { OPENAI_API_KEY, PINECONE_API_KEY } = process.env;

// OpenAI's embedding model (defaults to text-embedding-ada-002)
const embedder = new OpenAIEmbedder({ apiKey: OPENAI_API_KEY });

//////////////////////////////////
// Connect to your vector store //
//////////////////////////////////
const pinecone = new Pinecone({
  index: 'mdindex',
  namespace: 'default',
  environment: 'us-west1-gcp-free',
  apiKey: PINECONE_API_KEY,
});

/////////////////////////////////
// Ingest local markdown files //
/////////////////////////////////
await new Ingestion({
  store: pinecone,
  source: new FileSystem({ path: '../path/to/sales/data', glob: '**/*.md' }),
  splitter: new MarkdownSplitter({ chunkSize: 1000 }),
  embedder: embedder,
}).run();

///////////////////////////////////////////////////////////
// Use retrieval augmented generation to query your data //
///////////////////////////////////////////////////////////
const template = `Context information is below.
---------------------
{context}
---------------------
Given the context information and not prior knowledge, answer the question: {query}
`;

const rag = new RAG({
  embedder: embedder,
  model: new OpenAICompletion({
    model: 'text-davinci-003',
    max_tokens: 256,
    apiKey: OPENAI_API_KEY,
  }),
  prompt: new PromptWithContext({ template }),
  retriever: new Retriever({ store: pinecone, topK: 3 }),
});

// stream the response
const { result, info } = rag.stream(
  'What were our biggest sales in Q4 of this year and who were the customers?'
);

for await (const chunk of result) {
  process.stdout.write(chunk);
}

process.stdout.write('\n');

// Information about what results were used from the vector database.
console.log(info);
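The template in the example contains `{context}` and `{query}` placeholders. How axgen fills them is not shown here, so the following is only a sketch of that kind of substitution. `fillTemplate` is a hypothetical helper, not part of axgen.

```typescript
// Hypothetical helper, not part of axgen: replaces {name} placeholders
// with values from `vars`, leaving unknown placeholders untouched.
function fillTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(
    /\{(\w+)\}/g,
    (match: string, name: string): string =>
      Object.prototype.hasOwnProperty.call(vars, name) ? String(vars[name]) : match
  );
}

const filled = fillTemplate('Context: {context}\nQuestion: {query} {unknown}', {
  context: 'Acme bought 40 licenses in Q4.',
  query: 'Who bought licenses?',
});
```

Leaving unknown placeholders in place, rather than replacing them with an empty string, makes a misspelled variable name easy to spot in the final prompt.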

Overview

The main components of the API are as follows:

  • Vector stores persist your data embeddings which can later be queried.
  • Data sources are documents pulled from arbitrary locations, e.g., a PDF from your local file system, documents from Notion, a Wikipedia page, etc.
  • Data splitters split documents from a data source into smaller chunks. The embeddings of those chunks can be persisted in a vector store and later queried by similarity.
  • Data embedders create embeddings from chunks of text.
  • Data retrievers query vector stores for chunks of text similar to an input.
  • Prompts and prompt templates are used to construct the instructions sent to the LLM.
  • Models (LLMs) perform calls to generate, e.g., completions or chat completions.
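As a rough illustration of how embedders, vector stores, and retrievers relate, the sketch below uses toy stand-ins. The word-count `embed` function, the array used as a store, and the `retrieve` function are all assumptions made for this example; a real setup would call an embedding model and a vector database instead.

```typescript
// Toy stand-ins for illustration; none of these names come from axgen.
type Vector = number[];

interface StoredChunk {
  text: string;
  embedding: Vector;
}

// Toy embedder: counts occurrences of each vocabulary word.
// A real embedder would call an embedding model instead.
const VOCAB = ['sales', 'customer', 'invoice', 'holiday'];

function embed(text: string): Vector {
  const words = text.toLowerCase().split(/\W+/);
  return VOCAB.map((term) => words.filter((w) => w === term).length);
}

function dot(a: Vector, b: Vector): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Cosine similarity; returns 0 when either vector is all zeros.
function cosine(a: Vector, b: Vector): number {
  const denom = Math.sqrt(dot(a, a) * dot(b, b));
  return denom === 0 ? 0 : dot(a, b) / denom;
}

// Toy retriever: rank stored chunks by similarity to the query.
function retrieve(store: StoredChunk[], query: string, topK: number): string[] {
  const q = embed(query);
  return store
    .map((chunk) => ({ text: chunk.text, score: cosine(q, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((hit) => hit.text);
}

const chunks = [
  'Q4 sales rose after the holiday campaign.',
  'Each invoice lists the customer name.',
  'The office is closed on every public holiday.',
];
const store: StoredChunk[] = chunks.map((text) => ({ text, embedding: embed(text) }));
const topHits = retrieve(store, 'Which customer is on the invoice?', 1);
```

Here `topHits` contains the invoice sentence, since it is the only chunk sharing vocabulary words with the query.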

Additionally, there are two higher-level component types that create workflows out of the above components:

  1. Ingestion constructs a data ingestion pipeline from a data source, splitter, embedder, and vector store.
  2. Generation constructs data generation pipelines (e.g., chat completion over your custom data) from some input, an embedder, prompts, and a model.
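The ingestion workflow described above can be sketched as a function that wires a source, splitter, embedder, and store together. Everything below (`Components`, `runIngestion`, `fixedSize`) is hypothetical and kept synchronous for brevity; axgen's `Ingestion` class is asynchronous and its internals may differ.

```typescript
// Hypothetical component shapes; axgen's actual classes differ.
interface Doc {
  url: string;
  text: string;
}

interface Row {
  id: string;
  text: string;
  embedding: number[];
}

interface Components {
  source: () => Doc[];
  splitter: (text: string) => string[];
  embedder: (text: string) => number[];
  store: Row[];
}

// Runs source -> splitter -> embedder -> store and returns the rows written.
function runIngestion({ source, splitter, embedder, store }: Components): number {
  let written = 0;
  for (const doc of source()) {
    splitter(doc.text).forEach((text, i) => {
      store.push({ id: `${doc.url}#${i}`, text, embedding: embedder(text) });
      written++;
    });
  }
  return written;
}

// Fixed-size character splitter, loosely analogous to a chunkSize option.
const fixedSize = (size: number) => (text: string): string[] => {
  const parts: string[] = [];
  for (let i = 0; i < text.length; i += size) parts.push(text.slice(i, i + size));
  return parts;
};

const rows: Row[] = [];
const written = runIngestion({
  source: () => [{ url: 'notes.md', text: 'abcdefghij' }],
  splitter: fixedSize(4),
  embedder: (text) => [text.length], // placeholder embedding
  store: rows,
});
```

A generation workflow would follow the same pattern in the other direction: embed the input, retrieve similar rows, build a prompt, and pass it to a model.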

Supported Models

We currently support:

  • OpenAI models
  • Anthropic models
  • Google models (e.g., text-bison, chat-bison, textembedding-gecko) through Vertex AI
  • Cohere generate models

Hugging Face inference and more coming soon!

Documentation

We're working on a new documentation website. In the meantime, please check out the following (runnable) examples:

Vector Stores

These implement CRUD operations for the supported vector stores.

  • prepare. Run with npm run vector_store:prepare -- <options>
  • teardown. Run with npm run vector_store:teardown -- <options>
  • upload. Run with npm run vector_store:upload -- <options>
  • delete. Run with npm run vector_store:delete -- <options>
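For readers who want a mental model of the four operations, here is an in-memory sketch. The method names mirror the script names above, but the behavior (prepare creates an index, teardown drops it, upload inserts or overwrites by id, delete removes by id) is an assumption, and `InMemoryVectorStore` is not an axgen class.

```typescript
// Hypothetical in-memory store; the method names mirror the script names
// above, but the behavior is an assumption, not axgen's implementation.
class InMemoryVectorStore {
  private index: Map<string, number[]> | null = null;

  // Create the (empty) index.
  prepare(): void {
    this.index = new Map();
  }

  // Drop the index and everything in it.
  teardown(): void {
    this.index = null;
  }

  // Insert or overwrite vectors by id.
  upload(rows: { id: string; embedding: number[] }[]): void {
    const index = this.requireIndex();
    for (const row of rows) index.set(row.id, row.embedding);
  }

  // Remove vectors by id; returns how many were actually removed.
  delete(ids: string[]): number {
    const index = this.requireIndex();
    return ids.filter((id) => index.delete(id)).length;
  }

  // Number of stored vectors (0 when no index exists).
  size(): number {
    return this.index ? this.index.size : 0;
  }

  private requireIndex(): Map<string, number[]> {
    if (!this.index) throw new Error('Call prepare() before using the store');
    return this.index;
  }
}

const demoStore = new InMemoryVectorStore();
demoStore.prepare();
demoStore.upload([
  { id: 'a', embedding: [1, 0] },
  { id: 'b', embedding: [0, 1] },
]);
const removed = demoStore.delete(['a', 'missing']);
const sizeAfterDelete = demoStore.size();
demoStore.teardown();
```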

Models / RAG pipelines

These implement basic LLM queries as well as RAG queries with all supported models.

Development

See the development docs.

License

MIT