@empiricalrun/llm
Package to connect and trace LLM calls.
Usage
import { LLM } from "@empiricalrun/llm";
const llm = new LLM({
provider: "openai",
defaultModel: "gpt-4o",
});
const llmResponse = await llm.createChatCompletion({ ... });
Vision utilities
This package also contains utilities for vision.
Query
Ask a question about an image (e.g. to extract information or make a decision) and get the answer.
import { query } from "@empiricalrun/llm/vision";
// With Appium
const data = await driver.saveScreenshot("dummy.png");
const instruction =
"Extract number of ATOM tokens from the image. Return only the number.";
const text = await query(data.toString("base64"), instruction);
// Example response: "0.01"
Get bounding boxes
import { getBoundingBox } from "@empiricalrun/llm/vision/bbox";
// With Appium
const data = await driver.saveScreenshot("dummy.png");
// Give a line describing the screen and then the element that you want to find
const instruction =
"This screenshot shows a screen to send crypto tokens. What is the bounding box for the dropdown to select the token?";
const bbox = await getBoundingBox(data.toString("base64"), instruction);
const centerToTap = bbox.center; // { x: 342, y: 450 }
// **Note**: These coordinates are relative to the image dimensions, and actions like
// tap require scaling the coordinates to Appium coordinates
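The scaling mentioned in the note above can be sketched as a small helper. The function name and the size parameters below are illustrative assumptions, not part of this package's API: it maps a point expressed in the screenshot's pixel space into the device coordinate space that Appium actions expect.

```typescript
interface Point {
  x: number;
  y: number;
}

interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: convert a bbox center from screenshot pixel
// coordinates into device coordinates for an Appium tap action.
function scaleToDeviceCoords(point: Point, image: Size, device: Size): Point {
  return {
    x: Math.round((point.x / image.width) * device.width),
    y: Math.round((point.y / image.height) * device.height),
  };
}

// Example: a 1080x2340 screenshot on a 390x844-point device window.
const tapPoint = scaleToDeviceCoords(
  { x: 342, y: 450 },
  { width: 1080, height: 2340 },
  { width: 390, height: 844 },
);
```

The example screenshot and window sizes are placeholders; read the real values from your driver before scaling.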
Getting a good bounding box can require a few prompt iterations, and the debug flag helps with that. When set, the response includes a base64 image with the bounding box drawn on top of the original image.
const bbox = await getBoundingBox(data.toString("base64"), instruction, {
debug: true,
});
console.log(bbox.annotatedImage);
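To inspect the annotated image during prompt iteration, you can write the base64 string to disk and open it. A minimal Node.js sketch (the helper name and output path are illustrative, not part of this package):

```typescript
import { writeFileSync } from "fs";

// Decode a base64-encoded image (e.g. bbox.annotatedImage) and write it
// to disk so it can be opened in an image viewer.
function saveBase64Image(base64: string, path: string): void {
  writeFileSync(path, Buffer.from(base64, "base64"));
}

// saveBase64Image(bbox.annotatedImage, "annotated.png");
```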