llm-api

v1.6.0

Fully typed chat APIs for OpenAI and Azure's chat models - with token checking and retries

Downloads

8,504

Readme

✨ LLM API

Fully typed chat APIs for OpenAI, Anthropic, and Azure's chat models for browser, edge, and node environments.

👋 Introduction

  • Clean interface for text and chat completion for OpenAI, Anthropic, and Azure models
  • Catch token overflow errors automatically on the client side
  • Handle rate limit and any other API errors as gracefully as possible (e.g. exponential backoff for rate-limit)
  • Support for browser, edge, and node environments
  • Works great with zod-gpt for outputting structured data

import { OpenAIChatApi } from 'llm-api';

const openai = new OpenAIChatApi({ apiKey: 'YOUR_OPENAI_KEY' });

const resText = await openai.textCompletion('Hello');

const resChat = await openai.chatCompletion({
  role: 'user',
  content: 'Hello world',
});

🔨 Usage

Install

This package is hosted on npm:

npm i llm-api
yarn add llm-api

Model Config

To configure a new model endpoint:

const openai = new OpenAIChatApi(params: OpenAIConfig, config: ModelConfig);

These model config options map directly to OpenAI's request parameters; see the docs: https://platform.openai.com/docs/api-reference/chat/create

interface ModelConfig {
  model?: string;
  contextSize?: number;
  maxTokens?: number;
  temperature?: number;
  topP?: number;
  stop?: string | string[];
  presencePenalty?: number;
  frequencyPenalty?: number;
  logitBias?: Record<string, number>;
  user?: string;

  // use stream mode for API response, the streamed tokens will be sent to `events` in `ModelRequestOptions`
  stream?: boolean;
}
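
To make the mapping onto OpenAI's parameters concrete, here is a minimal sketch. It is not llm-api's own code: `toOpenAIParams` is a hypothetical helper, and the `ModelConfig` interface is copied locally only so the snippet compiles on its own. The snake_case names are the ones used by OpenAI's chat completion endpoint; `contextSize` has no counterpart there, so the sketch assumes it stays on the client.

```typescript
// Local copy of the ModelConfig fields documented above, so the snippet is self-contained.
interface ModelConfig {
  model?: string;
  contextSize?: number;
  maxTokens?: number;
  temperature?: number;
  topP?: number;
  stop?: string | string[];
  presencePenalty?: number;
  frequencyPenalty?: number;
  logitBias?: Record<string, number>;
  user?: string;
  stream?: boolean;
}

// Hypothetical helper (not part of llm-api): translate camelCase options
// into the snake_case fields of an OpenAI chat completion request body.
function toOpenAIParams(c: ModelConfig): Record<string, unknown> {
  const params: Record<string, unknown> = {
    model: c.model,
    max_tokens: c.maxTokens,
    temperature: c.temperature,
    top_p: c.topP,
    stop: c.stop,
    presence_penalty: c.presencePenalty,
    frequency_penalty: c.frequencyPenalty,
    logit_bias: c.logitBias,
    user: c.user,
    stream: c.stream,
  };
  // Drop unset fields so the API falls back to its own defaults.
  for (const key of Object.keys(params)) {
    if (params[key] === undefined) delete params[key];
  }
  return params;
}

// Illustrative values only; contextSize is deliberately not forwarded.
const params = toOpenAIParams({
  model: 'gpt-4-0613',
  contextSize: 8192,
  maxTokens: 256,
  topP: 0.9,
});
console.log(params);
```

Only the options that were actually set end up in the request body, which is one plausible way to keep the provider's defaults in effect for everything else.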

Request

To send a completion request to a model:

const text: ModelResponse = await openai.textCompletion(api: CompletionApi, prompt: string, options: ModelRequestOptions);

const completion: ModelResponse = await openai.chatCompletion(api: CompletionApi, messages: ChatCompletionRequestMessage, options: ModelRequestOptions);

// respond to existing chat session, preserving the past messages
const response: ModelResponse = await completion.respond(message: ChatCompletionRequestMessage, options: ModelRequestOptions);

options: You can override the default request options via this parameter. A request is automatically retried if a rate-limit or server error occurs.

type ModelRequestOptions = {
  // set to automatically add system message (only relevant when using textCompletion)
  systemMessage?: string | (() => string);

  // send a prefix to the model response so the model can continue generating from there, useful for steering the model towards certain output structures.
  // the response prefix WILL be included at the start of the returned model response.
  // for Anthropic's models ONLY
  responsePrefix?: string;

  // function related parameters are for OpenAI's models ONLY
  functions?: ModelFunction[];
  // force the model to call the following function
  callFunction?: string;

  // default: 3
  retries?: number;
  // default: 30s
  retryInterval?: number;
  // default: 60s
  timeout?: number;

  // the minimum number of tokens to reserve for the response. if the request is predicted to leave fewer tokens than this, a 'TokenError' is thrown automatically without sending the request
  // default: 200
  minimumResponseTokens?: number;

  // the maximum amount of tokens to use for response
  // NOTE: in OpenAI models, setting this option also requires contextSize in ModelConfig to be set
  maximumResponseTokens?: number;
};
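
The retry options above can be pictured with the following minimal sketch of retrying with exponential backoff. This is not llm-api's implementation: `backoffDelay` and `withRetries` are hypothetical helpers, the doubling schedule and the 30000 ms cap are assumptions, and the sketch retries on any thrown error, whereas a real client would presumably retry only on rate-limit or server errors.

```typescript
// Hypothetical helper: delay before retry number `attempt` (0-based),
// doubling each time and capped at `maxMs`.
function backoffDelay(attempt: number, baseMs: number, maxMs: number): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Hypothetical helper: run `fn`, retrying up to `retries` times when it throws.
// `sleep` is injectable so the waiting can be skipped in tests.
async function withRetries<T>(
  fn: () => Promise<T>,
  retries: number = 3,
  baseMs: number = 1000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (e) {
      lastError = e;
      // Wait before the next attempt, but not after the final failure.
      if (attempt < retries) {
        await sleep(backoffDelay(attempt, baseMs, 30000));
      }
    }
  }
  throw lastError;
}
```

With `retries = 3`, the function makes at most four attempts in total and rethrows the last error if all of them fail.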

Response

Completion responses are in the following format:

interface ModelResponse {
  content?: string;

  // used to parse function responses
  name?: string;
  arguments?: JsonValue;

  usage?: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };

  // function to send another message in the same 'chat', this will automatically append a new message to the messages array
  respond: (
    message: ChatCompletionRequestMessage,
    opt?: ModelRequestOptions,
  ) => Promise<ModelResponse>;
}
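
As a rough illustration of how a `respond` function can carry the conversation forward, here is a minimal sketch. It does not use llm-api: `fakeModel` and `chat` are hypothetical stand-ins, the sketch is synchronous for brevity (the real `respond` returns a Promise), and it assumes the assistant's reply is stored in the history alongside the user messages.

```typescript
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

interface ChatResponse {
  content: string;
  respond: (message: Message) => ChatResponse;
}

// Hypothetical stand-in for a model call: reports how many messages it received.
function fakeModel(messages: Message[]): string {
  return `seen ${messages.length} message(s)`;
}

// Hypothetical chat function: each response closes over the history so far.
function chat(messages: Message[]): ChatResponse {
  const content = fakeModel(messages);
  // History for the next turn: everything sent so far plus the model's reply.
  const history: Message[] = [...messages, { role: 'assistant', content }];
  return {
    content,
    respond: (message) => chat([...history, message]),
  };
}

const first = chat([{ role: 'user', content: 'Hello' }]);
const second = first.respond({ role: 'user', content: 'And again' });
console.log(first.content); // seen 1 message(s)
console.log(second.content); // seen 3 message(s)
```

Because each response holds its own copy of the history, two different follow-ups to the same response would branch into independent conversations in this sketch.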

📃 Token Errors

A common source of errors with LLM APIs is token usage: only a limited amount of data fits in the model's context window.

If you set a contextSize key, llm-api will automatically determine whether the request will exceed the token limit BEFORE sending the actual request to the model provider (e.g. OpenAI). This saves one network round trip and lets you handle this type of error in a responsive manner.

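// assumes TokenError is exported from the package root alongside OpenAIChatApi
import { OpenAIChatApi, TokenError } from 'llm-api';

const openai = new OpenAIChatApi(
  { apiKey: 'YOUR_OPENAI_KEY' },
  // gpt-4-0613 has an 8,192-token context window
  { model: 'gpt-4-0613', contextSize: 8192 },
);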

try {
  const res = await openai.textCompletion(...);
} catch (e) {
  if (e instanceof TokenError) {
    // handle token errors...
  }
}
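
The client-side check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not llm-api's code: the `TokenError` class here is a local stand-in, `estimateTokens` uses a crude 4-characters-per-token heuristic instead of a real tokenizer, and `checkTokenBudget` is a hypothetical helper.

```typescript
// Local stand-in for the TokenError referenced above, so the snippet is self-contained.
class TokenError extends Error {
  overflowTokens: number;
  constructor(message: string, overflowTokens: number) {
    super(message);
    this.name = 'TokenError';
    this.overflowTokens = overflowTokens;
    // Keep `instanceof` working even when compiled to older JS targets.
    Object.setPrototypeOf(this, TokenError.prototype);
  }
}

// Crude estimate (assumption): roughly 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical helper: throws before any request would be sent if the prompt
// leaves fewer than `minimumResponseTokens` tokens for the response.
function checkTokenBudget(
  prompt: string,
  contextSize: number,
  minimumResponseTokens: number = 200,
): number {
  const available = contextSize - estimateTokens(prompt);
  if (available < minimumResponseTokens) {
    throw new TokenError(
      'Prompt leaves too few tokens for the response',
      minimumResponseTokens - available,
    );
  }
  return available;
}

console.log(checkTokenBudget('Hello world', 8192)); // 8189
```

Since the check is pure arithmetic on the prompt, it can fail fast without a network call, which is the benefit the section describes.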

🔷 Azure

llm-api also comes with support for Azure's OpenAI models. The Azure version is usually much faster and more reliable than OpenAI's own API endpoints. To use the Azure endpoints, you must include two Azure-specific options when initializing the OpenAI model: azureDeployment and azureEndpoint. The apiKey field is then used for the Azure API key.

You can find the Azure API key and endpoint in the Azure Portal. The Azure Deployment must be created under the Azure AI Portal.

Note that the model parameter in ModelConfig is ignored when using Azure. This is because in the Azure system, the model is selected when the deployment is created, not at run time.

const openai = new OpenAIChatApi({
  apiKey: 'AZURE_OPENAI_KEY',
  azureDeployment: 'AZURE_DEPLOYMENT_NAME',
  azureEndpoint: 'AZURE_ENDPOINT',

  // optional, defaults to 2023-06-01-preview
  azureApiVersion: 'YYYY-MM-DD',
});
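
To show why the deployment name takes the place of the model, here is a small sketch of how an Azure OpenAI chat completion URL is put together from these options. The URL shape follows Azure's REST API; `azureChatUrl` is a hypothetical helper, and whether llm-api builds the URL exactly this way internally is an assumption. The default API version is the one named in the comment above.

```typescript
interface AzureParams {
  azureEndpoint: string;
  azureDeployment: string;
  azureApiVersion?: string;
}

// Hypothetical helper: the deployment is part of the path, so no model name is needed.
function azureChatUrl(p: AzureParams): string {
  const base = p.azureEndpoint.replace(/\/+$/, ''); // strip trailing slashes
  const version = p.azureApiVersion ?? '2023-06-01-preview';
  return (
    `${base}/openai/deployments/${encodeURIComponent(p.azureDeployment)}` +
    `/chat/completions?api-version=${encodeURIComponent(version)}`
  );
}

// Placeholder resource and deployment names, for illustration only.
const url = azureChatUrl({
  azureEndpoint: 'https://my-resource.openai.azure.com/',
  azureDeployment: 'my-gpt4',
});
console.log(url);
// https://my-resource.openai.azure.com/openai/deployments/my-gpt4/chat/completions?api-version=2023-06-01-preview
```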

🔶 Anthropic

Anthropic's models have the unique advantage of a large 100k context window and extremely fast performance. If no explicit model is specified, llm-api will default to the Claude Sonnet model.

const anthropic = new AnthropicChatApi(params: AnthropicConfig, config: ModelConfig);

🔶 Groq

Groq is an LLM inference provider that focuses on very fast inference speeds. They currently support Meta's Llama 2 and Mistral's Mixtral models.

const groq = new GroqChatApi(params: GroqConfig, config: ModelConfig);

❖ Amazon Bedrock

const conf = {
  accessKeyId: 'AWS_ACCESS_KEY',
  secretAccessKey: 'AWS_SECRET_KEY',
};

const bedrock = new AnthropicBedrockChatApi(params: BedrockConfig, config: ModelConfig);

🤓 Debugging

llm-api uses the debug module for logging and error messages. To run in debug mode, set the DEBUG env variable:

DEBUG=llm-api:* yarn playground

You can also specify different logging types via:

DEBUG=llm-api:error yarn playground
DEBUG=llm-api:log yarn playground