Protege Engine SDK
Protege Engine SDK is a TypeScript library for interacting with the Protege Engine GraphQL API. It's designed for seamless integration in both frontend applications and CLI tooling, providing a robust interface for managing and interacting with LLM training through RLHF (Reinforcement Learning from Human Feedback). The API is built with Prisma and typegraphql-prisma, and the SDK leverages graphql-codegen for typed query inputs and returns.
Features
- GraphQL client for easy queries and mutations
- Comprehensive set of importable functions for managing the core Protege Engine primitives
- CLI tool for convenient administration and interaction with the Protege Engine
TLDR: Protege Engine
Protege Engine is an AI-driven system developed by Intuitive Systems, designed as a versatile drop-in replacement for Large Language Model Inference APIs like those provided by OpenAI.
It empowers users to create and integrate Large Language Model Inference and training functionality into their products with minimal engineering effort, thanks to its comprehensive RLHF interface and easy-to-use SDK.
By facilitating seamless integration into existing user interfaces, Protege Engine significantly reduces the Total Cost of Ownership (TCO) for AI pipelines. It enhances user outcomes in domain-specific contexts, making it an ideal solution for anyone looking to leverage advanced AI capabilities without the need for expensive and hard-to-find technical resources.
Things you need to know:
- Predictions: The process involving call and response with prompts and completions from the inference backend, where completions are parsed into labels for further applications, aiding in generating structured outputs.
- Label Parsers: Tools or mechanisms that take the output (completions) from the inference process and convert them into structured labels, facilitating the interpretation and use of AI-generated data within applications.
- Feedback Mechanism: A pivotal component where human interaction through the UI leads to the approval or correction of prediction labels, directly influencing the dataset preparation for further training and enhancing model accuracy over time.
- Inference Backends: The computational backend that performs the AI model's inference tasks. It serves as an instance equipped with an API endpoint for request proxying and execution tracing, crucial for generating predictions.
- Datasets: Collections of data compiled from various prompts essential for training models. These datasets can be synthetic or standard and are vital for replicating the behavior demonstrated in the prompts, ensuring the model's continuous learning and adaptation.
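To make the label-parser concept concrete, here is a minimal, hypothetical parser sketch. This is not the SDK's actual implementation (parsers are selected via the `labelParser` option); it simply shows the idea of turning a raw completion string into a structured label, with unparseable completions surfacing as the "problematic" labels mentioned above.

```typescript
// Hypothetical label parser: converts a raw LLM completion such as
// "Sentiment: positive" into a structured label. Illustrative only.
interface ParsedLabel {
  field: string;
  value: string;
}

function parseKeyValueLabel(completion: string): ParsedLabel | null {
  // Expect a completion of the form "Field: value".
  const match = completion.trim().match(/^([A-Za-z_ ]+):\s*(.+)$/);
  if (!match) return null; // unparseable completions become "problematic" labels
  return { field: match[1].trim(), value: match[2].trim() };
}

const label = parseKeyValueLabel("Sentiment: positive");
console.log(label); // -> field "Sentiment", value "positive"
```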
Prerequisites
Environment Requirements
Node.js 18.10.x
Installing nvm and Node.js
Install nvm: Follow the installation instructions on the nvm GitHub page. This will involve running a curl or wget command in your terminal.
Install Node.js: Once nvm is installed, you can install Node.js. For compatibility with the Protege Engine SDK, we recommend using the latest LTS version of Node.js. Install it by running:
nvm install 18
Then, use it by running:
nvm use 18
Verify Installation: Ensure that Node.js and npm (Node Package Manager) are correctly installed by checking their versions:
```
node -v
npm -v
```
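Since the SDK targets Node.js 18.10.x, a small helper like the following (our own illustrative code, not part of the SDK) can guard against running on an unsupported runtime:

```typescript
// Illustrative helper (not part of the SDK): check that a Node.js version
// string like "18.10.0" satisfies the major-version requirement above.
function isSupportedNode(version: string, requiredMajor = 18): boolean {
  const major = Number(version.split(".")[0]);
  return Number.isInteger(major) && major >= requiredMajor;
}

// e.g. pass in the output of `node -v` with the leading "v" stripped:
console.log(isSupportedNode("18.10.0")); // true
console.log(isSupportedNode("16.20.2")); // false
```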
Get your account information from Intuitive Systems
- The current Alpha API is not public. Ask your account representative for onboarding documentation to facilitate access.
- NOTE: As Protege Engine is in an Alpha Release State, data integrity and associated infrastructure are offered on a best-effort basis. APIs and SDK interfaces may change without notice.
With your environment set up, you're ready to install and use the Protege Engine SDK.
Installation
To use the Protege Engine SDK in your project, install it via npm:
npm install @intuitive-systems/protege-engine
NOTE: If you would like to use the included CLI globally, install the package with the `-g` flag. If you go this route, you can call the `protege` command directly, without leveraging `npx` for local execution.
npm install -g @intuitive-systems/protege-engine
You can see the list of supported commands by running:
```
$ npx protege
Warning: No tokens found in config file! You will need to authenticate.
Usage: protege [options] [command]

Protege Engine SDK CLI

Options:
  -V, --version                                output the version number
  -h, --help                                   display help for command

Commands:
  config|c [options]                           Configure the CLI
  backend:create|bc [options] <name> <urls...> Create a new inference backend
  backend:list|bl                              List all inference backends
  backend:get|bg <id>                          Get a inference backend by id
  backend:update|bu [options] <id>             Update a inference backend
  backend:delete|bd <id>                       Delete a inference backend
  prediction:list|pl [options]                 List all predictions
  prediction:count|pc [options]                Count all predictions
  dataset:list|dl                              List all datasets
  dataset:listPredictions|dlp <datasetId>      List all predictions for a given dataset
  dataset:download|dd [options] <id>           Download dataset CSV
  dataset:create|dc <name>                     Create a dataset
  dataset:addPredictions|dap <datasetId> <predictionIdStart> <predictionIdEnd>
                                               Add a range of prediction ids to a dataset
  label:listProblematic|lpr                    List problematic labels
  help [command]
```
If you would like the CLI environment to manage your authentication tokens, you need to initialize the Protege Engine CLI configuration.
NOTE: This is not required when using the `ApiKeyAuth` strategy.
```
npx protege config -e <YOUR_PROTEGE_ENDPOINT>
```
Quick Example
```typescript
import { ProtegeEngineSDK, ApiKeyAuth } from "@intuitive-systems/protege-engine";

// Configure the Protege Engine SDK with your endpoint and API key
const protege = new ProtegeEngineSDK({
  endpoint: 'YOUR_PROTEGE_ENDPOINT',
  auth: new ApiKeyAuth('YOUR_API_KEY')
});

// Example function to create a prediction and print the result
async function createAndPrintPrediction(prompt: string, modelBackendName: string) {
  // Example parameters - customize these according to your needs
  const openaiKwargs = {};       // Protege Engine supports OpenAI-compatible request kwargs
  const useCache = false;        // Protege Engine can cache results for you
  const labelParser = 'default'; // Example label parser (if applicable)
  const metadata = [];           // Example metadata array (if applicable)
  try {
    // Create the prediction
    const res = await protege.prediction.create(openaiKwargs, prompt, modelBackendName, useCache, labelParser, metadata);
    // Print the prediction result
    console.log("Prediction Result:", res.prediction);
  } catch (error) {
    console.error("Error creating prediction:", error);
  }
}

// Example usage
const prompt = "Example prompt for prediction";
const modelBackendName = "exampleModelBackendName"; // Replace with your actual model backend name
createAndPrintPrediction(prompt, modelBackendName);
```
Usage
Here's a quick overview of how to use the Protege Engine SDK in your project:
Initializing the SDK
```typescript
import { ProtegeEngineSDK, ApiKeyAuth } from '@intuitive-systems/protege-engine';

const sdk = new ProtegeEngineSDK({
  endpoint: 'https://your-protege-engine-endpoint.com',
  auth: new ApiKeyAuth('your-protege-engine-api-key')
});
```
Example: Creating a Prediction
Predictions are the core primitive of Protege Engine: each prediction pairs a prompt with a completion from an inference backend, parsed into a label. See the Quick Example above for a complete prediction workflow.
Example: Creating a New Inference Backend
```typescript
const result = await sdk.inferenceBackend.createInferenceBackend({
  name: 'MyInferenceBackend',
  urls: { set: ['http://example.com/inference'] },
  apiKey: 'backend_api_key'
});
```
Using the CLI
The SDK comes with a CLI tool for managing the Protege Engine. To use it, configure it first:
protege config --endpoint <your_endpoint> --apiKey <your_api_key>
Create a new inference backend:
protege backend:create MyInferenceBackend http://example.com/inference
Advanced Docs
Authentication Strategies
The Protege Engine SDK supports two primary authentication strategies to secure your interactions with the Protege Engine GraphQL API: API Key Authentication and Device Flow Authentication. Below, you'll find guidance on how to utilize these strategies within your application.
API Key Authentication
API Key Authentication is a straightforward method suitable for scenarios where you can securely store and manage an API key. This method is ideal for server-side applications or environments where the API key can be securely stored.
Implementing API Key Authentication
```typescript
import { ProtegeEngineSDK, ApiKeyAuth } from '@intuitive-systems/protege-engine';

// Initialize SDK with API Key Authentication
const sdk = new ProtegeEngineSDK({
  endpoint: 'https://your-protege-engine-endpoint.com',
  auth: new ApiKeyAuth('your-protege-engine-api-key')
});
```
Device Flow Authentication
Device Flow Authentication is designed for applications that cannot securely store credentials or where direct user interaction is preferred. This method is particularly useful for CLI tools or client-side applications requiring user consent through a web browser.
Implementing Device Flow Authentication
Before starting with Device Flow Authentication, ensure you have configured the necessary parameters in your application's configuration, including the Auth0 domain, client ID, scope, and audience.
```typescript
import { ProtegeEngineSDK, DeviceAuth } from '@intuitive-systems/protege-engine';

// Initialize SDK with Device Flow Authentication
const sdk = new ProtegeEngineSDK({
  endpoint: 'https://your-protege-engine-endpoint.com',
  auth: new DeviceAuth()
});

// The DeviceAuth class handles the authentication flow, including user consent
// through a web browser, token retrieval, and token refresh if necessary.
```
Workflow Overview
- Initialization: When you make your first API call, the `DeviceAuth` class checks for existing tokens.
- User Consent: If no tokens are found or they are expired, the class initiates the device flow, prompting the user to visit a URL for authentication.
- Token Retrieval: Upon successful authentication, the tokens are stored locally for subsequent API calls.
- Token Refresh: The class automatically handles token expiration and refresh.
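The polling step of this workflow can be sketched as follows. This is illustrative code, not the SDK's actual `DeviceAuth` internals: the token endpoint is injected as a function (in practice it would POST the device code to the Auth0 `/oauth/token` endpoint), so the loop itself stays self-contained.

```typescript
// Sketch of a device-flow polling loop like the one DeviceAuth performs
// internally (illustrative; the SDK's real implementation may differ).
type TokenResponse = { access_token?: string; error?: string };

async function pollForToken(
  requestToken: () => Promise<TokenResponse>, // stand-in for the Auth0 token endpoint
  intervalMs = 5000,
  maxAttempts = 10
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await requestToken();
    if (res.access_token) return res.access_token;         // user approved in the browser
    if (res.error !== "authorization_pending") {
      throw new Error(`device flow failed: ${res.error}`); // denied or expired
    }
    await new Promise<void>((r) => setTimeout(r, intervalMs)); // wait and retry
  }
  throw new Error("device flow timed out");
}
```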
Additional Tips
- Securely Store API Keys: For API Key Authentication, ensure your API keys are stored securely and not hardcoded into your application's source code.
- Handle Authentication Flow Gracefully: For Device Flow Authentication, provide clear instructions to your users on how to authenticate through the provided URL and handle potential errors or denials of access.
- Environment Variables: Consider using environment variables to manage sensitive information, such as API keys and configuration details for Device Flow Authentication.
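Following the environment-variable tip above, a small helper like this (our own illustrative code, not an SDK export) keeps API keys out of source code; the variable names match the options documented under Configuration Options:

```typescript
// Read credentials from the environment instead of hardcoding them.
// Illustrative helper, not an SDK export.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env
): string {
  const value = env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

// Usage sketch:
// const sdk = new ProtegeEngineSDK({
//   endpoint: requireEnv("PROTEGE_ENGINE_ENDPOINT"),
//   auth: new ApiKeyAuth(requireEnv("PROTEGE_ENGINE_API_KEY")),
// });
```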
Configuration Options
The Protege Engine SDK is designed to be flexible and easily configurable to suit different environments and use cases. Configuration settings are managed through a `.protege` config file located in the user's home directory. This file leverages environment variables to configure various aspects of the SDK, including API endpoints, authentication details, and more.
Setting Up Your Configuration File
- Location: The config file should be named `.protege` and placed in your home directory (`~/.protege`).
- Structure: The config file uses the dotenv format, which consists of key-value pairs. Each key-value pair defines a specific configuration setting.
Configuration Options Overview
Here's an overview of the available configuration options within the `.protege` file:
Environment Settings:
- `NODE_ENV`: Specifies the application's environment, such as `development`, `test`, or `production`. Default is `development`.
- `TENANT_NAME`: Defines the tenant name for multi-tenancy setups. Default is `dev`.

Protege Engine Settings:
- `PROTEGE_ENGINE_ENDPOINT`: The endpoint URL for the Protege Engine GraphQL API. Default is `http://localhost:8081/v3/graphql`.
- `PROTEGE_ENGINE_API_KEY`: The API key for accessing the Protege Engine. This is required if using API Key Authentication.

Auth0 Settings (for Device Flow Authentication):
- `AUTH0_DOMAIN`: The Auth0 domain used for Device Flow Authentication. Default is a placeholder domain.
- `AUTH0_CLIENT_ID`: The client ID for the Auth0 application. Default is a placeholder client ID.
- `AUTH0_AUDIENCE`: The audience for the Auth0 application, which identifies the API that the application is requesting access to. Default is a placeholder audience.
- `AUTH0_TOKENS`: Stores the tokens received from Auth0 authentication. Default is an empty JSON object `{}`.
Example Configuration
Here is an example of what your `.protege` config file might look like:
```
NODE_ENV=development
TENANT_NAME=dev
PROTEGE_ENGINE_ENDPOINT=https://engine.yourdomain.com/graphql
PROTEGE_ENGINE_API_KEY=your_protege_engine_api_key
AUTH0_DOMAIN=your_auth0_domain.us.auth0.com
AUTH0_CLIENT_ID=your_auth0_client_id
AUTH0_AUDIENCE=https://engine.yourdomain.com
AUTH0_TOKENS={}
```
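To illustrate the dotenv format the config file uses, here is a deliberately simplified parser sketch. It is not what the SDK ships; real dotenv loaders also handle quoting, escapes, and multi-line values.

```typescript
// Minimal sketch of dotenv-style parsing for files like .protege above.
// Illustrative only; not the SDK's actual loader.
function parseDotenv(contents: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const line of contents.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue; // skip blanks and comments
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue;                           // ignore malformed lines
    result[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return result;
}

const config = parseDotenv("NODE_ENV=development\nTENANT_NAME=dev");
console.log(config.NODE_ENV); // prints: development
```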
Using Configuration in Your Application
The SDK automatically loads these configuration options when initialized. Ensure your application has access to the `.protege` file, and you're all set. The SDK will use these settings to manage its interactions with the Protege Engine and handle authentication flows.
Best Practices
- Security: Keep your `.protege` file secure, especially if it contains sensitive information like API keys. Ensure it is not included in source control or exposed in shared environments.
- Environment-Specific Configurations: Use different `.protege` files or environment variables for different deployment environments (development, staging, production) to manage configuration settings effectively and securely across environments.
Full Documentation
For full documentation, including all available functions and CLI commands, refer to Protege Engine SDK Documentation.
Contributing
We welcome contributions to the Protege Engine SDK! If you'd like to contribute, please follow these steps:
- Fork the repository.
- Create a new branch for your feature or fix.
- Write and test your code.
- Submit a pull request with a clear description of your changes.
Acknowledgements
Developed and maintained by Intuitive Systems AI Inc.
Example CLI workflow using the short aliases: create a dataset, count predictions, add prediction IDs 25 through 1177 to dataset 2, then download dataset 2 as CSV:
```
npx protege dc "Training Data"
npx protege pc
npx protege dap 2 25 1177
npx protege dd 2
```