empiricalrun
v0.14.0
[![npm](https://img.shields.io/npm/v/empiricalrun)](https://npmjs.com/package/empiricalrun) [![Discord](https://img.shields.io/badge/discord-empirical.run-blue?logo=discord&logoColor=white&color=5d68e8)](https://discord.gg/NeR6jj8dw9)
Empirical CLI
Empirical is the fastest way to test different LLMs, prompts and other model configurations, across all the scenarios that matter for your application.
With Empirical, you can:
- Run your test datasets locally against off-the-shelf models
- Test your own custom models and RAG applications (see how-to)
- View, compare, and analyze outputs in reports on a web UI
- Score your outputs with scoring functions
- Run tests on CI/CD
Watch demo video | See all docs
Usage
Empirical bundles together a CLI and a web app. The CLI handles running tests and the web app visualizes results.
Everything runs locally, with a JSON configuration file, empiricalrc.json.
Required: Node.js 20+ needs to be installed on your system.
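The version requirement above can be checked before running the CLI. The following is a minimal sketch: `meetsNodeRequirement` is a hypothetical helper written for illustration, not part of Empirical, and it assumes the version string looks like the output of `node --version` (for example `v20.11.0`).

```typescript
// Hypothetical helper (not part of the Empirical CLI): checks whether a
// Node.js version string satisfies a minimum major version (20 by default).
function meetsNodeRequirement(version: string, minMajor: number = 20): boolean {
  // Accept both "v20.11.0" (node --version) and "20.11.0" (process.versions.node).
  const [majorPart = ""] = version.replace(/^v/, "").split(".");
  const major = Number.parseInt(majorPart, 10);
  // Unparseable input yields NaN, which is treated as "requirement not met".
  return Number.isInteger(major) && major >= minMajor;
}
```

Inside a Node.js script, the running version is available as `process.versions.node` and can be passed to this function directly.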
Start with a basic example
In this example, we ask an LLM to parse user messages, extract entities, and
return structured JSON output. For example, "I'm Alice from Maryland" will
become {"name": "Alice", "location": "Maryland"}.
Our test will succeed if the model outputs valid JSON.
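To make the pass condition concrete, here is a minimal sketch of what a "valid JSON" check could look like. This is an illustration only, not Empirical's actual scoring code; `isValidJson` is a hypothetical name. It relies on the standard `JSON.parse`, which throws on input that is not strict JSON (for example unquoted keys or single-quoted strings).

```typescript
// Hypothetical helper (not Empirical's scorer): returns true when the
// model output parses as strict JSON, false otherwise.
function isValidJson(output: string): boolean {
  try {
    JSON.parse(output); // throws SyntaxError on malformed input
    return true;
  } catch {
    return false;
  }
}
```

Under this check, `{"name": "Alice", "location": "Maryland"}` passes, while an output with unquoted keys or single quotes fails. Note that bare values such as `123` also parse as JSON, so a stricter test might additionally require the parsed result to be an object.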
Use the CLI to create a sample configuration file called empiricalrc.json.

    npx empiricalrun init
    cat empiricalrc.json

Run the test samples against the models with the run command. This step requires the OPENAI_API_KEY environment variable to authenticate with OpenAI. This execution will cost $0.0026, based on the selected models.

    npx empiricalrun

Use the ui command to open the reporter web app and see side-by-side results.

    npx empiricalrun ui

Make it yours
Edit the empiricalrc.json file to make Empirical work for your use-case.
- Configure which models to use
- Configure your test dataset
- Configure scoring functions to grade output quality