Protege Engine SDK
Protege Engine SDK is a TypeScript library for interacting with the Protege Engine GraphQL API. It's designed for seamless integration in both frontend applications and CLI tooling, providing a robust interface for managing and interacting with LLM training through RLHF (Reinforcement Learning from Human Feedback). The API is built with Prisma and typegraphql-prisma, and the SDK leverages graphql-codegen for typed query inputs and returns.
Features
- GraphQL client for easy queries and mutations
- Comprehensive set of importable functions for managing the core Protege Engine primitives
- CLI tool for convenient administration and interaction with the Protege Engine
TLDR: Protege Engine
Protege Engine is an AI-driven system developed by Intuitive Systems, designed as a versatile drop-in replacement for Large Language Model Inference APIs like those provided by OpenAI.
It empowers users to create and integrate Large Language Model Inference and training functionality into their products with minimal engineering effort, thanks to its comprehensive RLHF interface and easy-to-use SDK.
By facilitating seamless integration into existing user interfaces, Protege Engine significantly reduces the Total Cost of Ownership (TCO) for AI pipelines. It enhances user outcomes in domain-specific contexts, making it an ideal solution for anyone looking to leverage advanced AI capabilities without the need for expensive and hard-to-find technical resources.
Things you need to know:
- Predictions: The process involving call and response with prompts and completions from the inference backend, where completions are parsed into labels for further applications, aiding in generating structured outputs.
- Label Parsers: Tools or mechanisms that take the output (completions) from the inference process and convert them into structured labels, facilitating the interpretation and use of AI-generated data within applications.
- Feedback Mechanism: A pivotal component where human interaction through the UI leads to the approval or correction of prediction labels, directly influencing the dataset preparation for further training and enhancing model accuracy over time.
- Inference Backends: The computational backend that performs the AI model's inference tasks. It serves as an instance equipped with an API endpoint for request proxying and execution tracing, crucial for generating predictions.
- Datasets: Collections of data compiled from various prompts essential for training models. These datasets can be synthetic or standard and are vital for replicating the behavior demonstrated in the prompts, ensuring the model's continuous learning and adaptation.
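To make the label-parser concept concrete, here is a minimal, hypothetical parser sketch. This is not the SDK's actual implementation (parsers are selected via the `labelParser` option); it simply shows the idea of turning a raw completion string into a structured label, with unparseable completions surfacing as the "problematic" labels mentioned above.

```typescript
// Hypothetical label parser: converts a raw LLM completion such as
// "Sentiment: positive" into a structured label. Illustrative only.
interface ParsedLabel {
  field: string;
  value: string;
}

function parseKeyValueLabel(completion: string): ParsedLabel | null {
  // Expect a completion of the form "Field: value".
  const match = completion.trim().match(/^([A-Za-z_ ]+):\s*(.+)$/);
  if (!match) return null; // unparseable completions become "problematic" labels
  return { field: match[1].trim(), value: match[2].trim() };
}

const label = parseKeyValueLabel("Sentiment: positive");
console.log(label); // -> field "Sentiment", value "positive"
```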
Prerequisites
Environment Requirements
Node.js 18.10.x
Installing nvm and Node.js
Install nvm: Follow the installation instructions on the nvm GitHub page. This will involve running a curl or wget command in your terminal.
Install Node.js: Once nvm is installed, you can install Node.js. For compatibility with the Protege Engine SDK, we recommend using the latest LTS version of Node.js. Install it by running:
nvm install 18
Then, use it by running:
nvm use 18
Verify Installation: Ensure that Node.js and npm (Node Package Manager) are correctly installed by checking their versions:
```
node -v
npm -v
```
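Since the SDK targets Node.js 18.10.x, a small helper like the following (our own illustrative code, not part of the SDK) can guard against running on an unsupported runtime:

```typescript
// Illustrative helper (not part of the SDK): check that a Node.js version
// string like "18.10.0" satisfies the major-version requirement above.
function isSupportedNode(version: string, requiredMajor = 18): boolean {
  const major = Number(version.split(".")[0]);
  return Number.isInteger(major) && major >= requiredMajor;
}

// e.g. pass in the output of `node -v` with the leading "v" stripped:
console.log(isSupportedNode("18.10.0")); // true
console.log(isSupportedNode("16.20.2")); // false
```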
Get your account information from Intuitive Systems
- The current Alpha API is not public. Ask your account representative for onboarding documentation to facilitate access.
- NOTE: As Protege Engine is in an Alpha Release State, data integrity and associated infrastructure are offered on a best-effort basis. APIs and SDK interfaces may change without notice.
With your environment set up, you're ready to install and use the Protege Engine SDK.
Installation
To use the Protege Engine SDK in your project, install it via npm:
npm install @intuitive-systems/protege-engine
NOTE: If you would like to use the included CLI globally, install the package with the `-g` flag. If you go this route, you can call the `protege` command directly, without leveraging `npx` for local execution.
npm install -g @intuitive-systems/protege-engine
You can see the list of supported commands by running:
```
$ npx protege
Warning: No tokens found in config file! You will need to authenticate.
Usage: protege [options] [command]

Protege Engine SDK CLI

Options:
  -V, --version                                output the version number
  -h, --help                                   display help for command

Commands:
  config|c [options]                           Configure the CLI
  backend:create|bc [options] <name> <urls...> Create a new inference backend
  backend:list|bl                              List all inference backends
  backend:get|bg <id>                          Get a inference backend by id
  backend:update|bu [options] <id>             Update a inference backend
  backend:delete|bd <id>                       Delete a inference backend
  prediction:list|pl [options]                 List all predictions
  prediction:count|pc [options]                Count all predictions
  dataset:list|dl                              List all datasets
  dataset:listPredictions|dlp <datasetId>      List all predictions for a given dataset
  dataset:download|dd [options] <id>           Download dataset CSV
  dataset:create|dc <name>                     Create a dataset
  dataset:addPredictions|dap <datasetId> <predictionIdStart> <predictionIdEnd>
                                               Add a range of prediction ids to a dataset
  label:listProblematic|lpr                    List problematic labels
  help [command]
```
If you would like the CLI environment to manage your authentication tokens, you need to initialize the Protege Engine CLI configuration.
NOTE: This is not required when using the `ApiKeyAuth` strategy.
```
npx protege config -e <YOUR_PROTEGE_ENDPOINT>
```
Quick Example
```typescript
import { ProtegeEngineSDK, ApiKeyAuth } from "@intuitive-systems/protege-engine";

// Configure the Protege Engine SDK with your endpoint and API key
const protege = new ProtegeEngineSDK({
  endpoint: 'YOUR_PROTEGE_ENDPOINT',
  auth: new ApiKeyAuth('YOUR_API_KEY')
});

// Example function to create a prediction and print the result
async function createAndPrintPrediction(prompt: string, modelBackendName: string) {
  // Example parameters - customize these according to your needs
  const openaiKwargs = {};       // Protege Engine supports OpenAI-compatible request kwargs
  const useCache = false;        // Protege Engine can cache results for you
  const labelParser = 'default'; // Example label parser (if applicable)
  const metadata = [];           // Example metadata array (if applicable)
  try {
    // Create the prediction
    const res = await protege.prediction.create(openaiKwargs, prompt, modelBackendName, useCache, labelParser, metadata);
    // Print the prediction result
    console.log("Prediction Result:", res.prediction);
  } catch (error) {
    console.error("Error creating prediction:", error);
  }
}

// Example usage
const prompt = "Example prompt for prediction";
const modelBackendName = "exampleModelBackendName"; // Replace with your actual model backend name
createAndPrintPrediction(prompt, modelBackendName);
```
Usage
Here's a quick overview of how to use the Protege Engine SDK in your project:
Initializing the SDK
```typescript
import { ProtegeEngineSDK, ApiKeyAuth } from '@intuitive-systems/protege-engine';

const sdk = new ProtegeEngineSDK({
  endpoint: 'https://your-protege-engine-endpoint.com',
  auth: new ApiKeyAuth('your-protege-engine-api-key')
});
```
Example: Creating a Prediction
Predictions are the core primitive of Protege Engine: each prediction pairs a prompt with a completion from an inference backend, parsed into a label. See the Quick Example above for a complete prediction workflow.
Example: Creating a New Inference Backend
```typescript
const result = await sdk.inferenceBackend.createInferenceBackend({
  name: 'MyInferenceBackend',
  urls: { set: ['http://example.com/inference'] },
  apiKey: 'backend_api_key'
});
```
Using the CLI
The SDK comes with a CLI tool for managing the Protege Engine. To use it, configure it first:
protege config --endpoint <your_endpoint> --apiKey <your_api_key>
Create a new inference backend:
protege backend:create MyInferenceBackend http://example.com/inference
Advanced Docs
Authentication Strategies
The Protege Engine SDK supports two primary authentication strategies to secure your interactions with the Protege Engine GraphQL API: API Key Authentication and Device Flow Authentication. Below, you'll find guidance on how to utilize these strategies within your application.
API Key Authentication
API Key Authentication is a straightforward method suitable for scenarios where you can securely store and manage an API key. This method is ideal for server-side applications or environments where the API key can be securely stored.
Implementing API Key Authentication
```typescript
import { ProtegeEngineSDK, ApiKeyAuth } from '@intuitive-systems/protege-engine';

// Initialize SDK with API Key Authentication
const sdk = new ProtegeEngineSDK({
  endpoint: 'https://your-protege-engine-endpoint.com',
  auth: new ApiKeyAuth('your-protege-engine-api-key')
});
```
Device Flow Authentication
Device Flow Authentication is designed for applications that cannot securely store credentials or where direct user interaction is preferred. This method is particularly useful for CLI tools or client-side applications requiring user consent through a web browser.
Implementing Device Flow Authentication
Before starting with Device Flow Authentication, ensure you have configured the necessary parameters in your application's configuration, including the Auth0 domain, client ID, scope, and audience.
```typescript
import { ProtegeEngineSDK, DeviceAuth } from '@intuitive-systems/protege-engine';

// Initialize SDK with Device Flow Authentication
const sdk = new ProtegeEngineSDK({
  endpoint: 'https://your-protege-engine-endpoint.com',
  auth: new DeviceAuth()
});

// The DeviceAuth class handles the authentication flow, including user consent
// through a web browser, token retrieval, and token refresh if necessary.
```
Workflow Overview
- Initialization: When you make your first API call, the `DeviceAuth` class checks for existing tokens.
- User Consent: If no tokens are found or they are expired, the class initiates the device flow, prompting the user to visit a URL for authentication.
- Token Retrieval: Upon successful authentication, the tokens are stored locally for subsequent API calls.
- Token Refresh: The class automatically handles token expiration and refresh.
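The polling step of this workflow can be sketched as follows. This is illustrative code, not the SDK's actual `DeviceAuth` internals: the token endpoint is injected as a function (in practice it would POST the device code to the Auth0 `/oauth/token` endpoint), so the loop itself stays self-contained.

```typescript
// Sketch of a device-flow polling loop like the one DeviceAuth performs
// internally (illustrative; the SDK's real implementation may differ).
type TokenResponse = { access_token?: string; error?: string };

async function pollForToken(
  requestToken: () => Promise<TokenResponse>, // stand-in for the Auth0 token endpoint
  intervalMs = 5000,
  maxAttempts = 10
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await requestToken();
    if (res.access_token) return res.access_token;         // user approved in the browser
    if (res.error !== "authorization_pending") {
      throw new Error(`device flow failed: ${res.error}`); // denied or expired
    }
    await new Promise<void>((r) => setTimeout(r, intervalMs)); // wait and retry
  }
  throw new Error("device flow timed out");
}
```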
Additional Tips
- Securely Store API Keys: For API Key Authentication, ensure your API keys are stored securely and not hardcoded into your application's source code.
- Handle Authentication Flow Gracefully: For Device Flow Authentication, provide clear instructions to your users on how to authenticate through the provided URL and handle potential errors or denials of access.
- Environment Variables: Consider using environment variables to manage sensitive information, such as API keys and configuration details for Device Flow Authentication.
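Following the environment-variable tip above, a small helper like this (our own illustrative code, not an SDK export) keeps API keys out of source code; the variable names match the options documented under Configuration Options:

```typescript
// Read credentials from the environment instead of hardcoding them.
// Illustrative helper, not an SDK export.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env
): string {
  const value = env[name];
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

// Usage sketch:
// const sdk = new ProtegeEngineSDK({
//   endpoint: requireEnv("PROTEGE_ENGINE_ENDPOINT"),
//   auth: new ApiKeyAuth(requireEnv("PROTEGE_ENGINE_API_KEY")),
// });
```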
Configuration Options
The Protege Engine SDK is designed to be flexible and easily configurable to suit different environments and use cases. Configuration settings are managed through a `.protege` config file located in the user's home directory. This file leverages environment variables to configure various aspects of the SDK, including API endpoints, authentication details, and more.
Setting Up Your Configuration File
- Location: The config file should be named `.protege` and placed in your home directory (`~/.protege`).
- Structure: The config file uses the dotenv format, which consists of key-value pairs. Each key-value pair defines a specific configuration setting.
Configuration Options Overview
Here's an overview of the available configuration options within the `.protege` file:
Environment Settings:
- `NODE_ENV`: Specifies the application's environment, such as `development`, `test`, or `production`. Default is `development`.
- `TENANT_NAME`: Defines the tenant name for multi-tenancy setups. Default is `dev`.

Protege Engine Settings:
- `PROTEGE_ENGINE_ENDPOINT`: The endpoint URL for the Protege Engine GraphQL API. Default is `http://localhost:8081/v3/graphql`.
- `PROTEGE_ENGINE_API_KEY`: The API key for accessing the Protege Engine. This is required if using API Key Authentication.

Auth0 Settings (for Device Flow Authentication):
- `AUTH0_DOMAIN`: The Auth0 domain used for Device Flow Authentication. Default is a placeholder domain.
- `AUTH0_CLIENT_ID`: The client ID for the Auth0 application. Default is a placeholder client ID.
- `AUTH0_AUDIENCE`: The audience for the Auth0 application, which identifies the API that the application is requesting access to. Default is a placeholder audience.
- `AUTH0_TOKENS`: Stores the tokens received from Auth0 authentication. Default is an empty JSON object `{}`.
Example Configuration
Here is an example of what your `.protege` config file might look like:
```
NODE_ENV=development
TENANT_NAME=dev
PROTEGE_ENGINE_ENDPOINT=https://engine.yourdomain.com/graphql
PROTEGE_ENGINE_API_KEY=your_protege_engine_api_key
AUTH0_DOMAIN=your_auth0_domain.us.auth0.com
AUTH0_CLIENT_ID=your_auth0_client_id
AUTH0_AUDIENCE=https://engine.yourdomain.com
AUTH0_TOKENS={}
```
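To illustrate the dotenv format the config file uses, here is a deliberately simplified parser sketch. It is not what the SDK ships; real dotenv loaders also handle quoting, escapes, and multi-line values.

```typescript
// Minimal sketch of dotenv-style parsing for files like .protege above.
// Illustrative only; not the SDK's actual loader.
function parseDotenv(contents: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const line of contents.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue; // skip blanks and comments
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue;                           // ignore malformed lines
    result[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return result;
}

const config = parseDotenv("NODE_ENV=development\nTENANT_NAME=dev");
console.log(config.NODE_ENV); // prints: development
```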
Using Configuration in Your Application
The SDK automatically loads these configuration options when initialized. Ensure your application has access to the `.protege` file, and you're all set. The SDK will use these settings to manage its interactions with the Protege Engine and handle authentication flows.
Best Practices
- Security: Keep your `.protege` file secure, especially if it contains sensitive information like API keys. Ensure it is not included in source control or exposed in shared environments.
- Environment-Specific Configurations: Use different `.protege` files or environment variables for different deployment environments (development, staging, production) to manage configuration settings effectively and securely across environments.
Full Documentation
For full documentation, including all available functions and CLI commands, refer to Protege Engine SDK Documentation.
Contributing
We welcome contributions to the Protege Engine SDK! If you'd like to contribute, please follow these steps:
- Fork the repository.
- Create a new branch for your feature or fix.
- Write and test your code.
- Submit a pull request with a clear description of your changes.
Acknowledgements
Developed and maintained by Intuitive Systems AI Inc.
Example CLI workflow using the short aliases: create a dataset, count predictions, add prediction IDs 25 through 1177 to dataset 2, then download dataset 2 as CSV:
```
npx protege dc "Training Data"
npx protege pc
npx protege dap 2 25 1177
npx protege dd 2
```