@tscircuit/prompt-benchmarks
v0.0.20
[Docs](https://docs.tscircuit.com) · [Website](https://tscircuit.com) · [Twitter](https://x.com/tscircuit) · [Discord](https://tscircuit.com/community/join-redirect) · [Quickstart](https://docs.tscircuit.com/quickstart)
Prompt Benchmarks
This repo contains benchmarks for tscircuit system prompts used for automatically generating tscircuit code.
Running Benchmarks
You can use `bun run benchmark` to select and run a benchmark. A single prompt
takes about 10-15 seconds to run with `sonnet`. The benchmarks run against a set
of samples (see the `tests/samples` directory). When you change a prompt, you must
run the benchmark for that prompt to update the benchmark snapshot. This is how we
record degradation or improvement in response quality. Each sample is run 5 times,
and two tests are run on each output:
1. Does the output from the prompt compile?
2. Does the output produce the expected circuit?

The benchmark shows the percentage of samples that pass (1) and (2).
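To make the scoring concrete, here is a minimal sketch of how pass percentages could be computed from repeated runs. It is not the repo's actual implementation: the `RunResult` type, the `summarize` function, and the sample name `led-with-resistor` are all hypothetical. It also assumes that percentages are taken over individual runs, and that a run only counts toward check (2) if it also compiles.

```typescript
// Hypothetical shape of one benchmark run (not the repo's actual types).
interface RunResult {
  sample: string
  compiles: boolean // check (1): does the output compile?
  matchesExpected: boolean // check (2): does it produce the expected circuit?
}

interface BenchmarkSummary {
  totalRuns: number
  compilePassPct: number
  expectedPassPct: number
}

// Assumption: percentages are computed over all runs, and a run can only
// pass check (2) if it also passes check (1).
function summarize(runs: RunResult[]): BenchmarkSummary {
  const totalRuns = runs.length
  if (totalRuns === 0) {
    return { totalRuns: 0, compilePassPct: 0, expectedPassPct: 0 }
  }
  const pct = (count: number): number => (100 * count) / totalRuns
  return {
    totalRuns,
    compilePassPct: pct(runs.filter((r) => r.compiles).length),
    expectedPassPct: pct(
      runs.filter((r) => r.compiles && r.matchesExpected).length,
    ),
  }
}

// One hypothetical sample run 5 times: 4 runs compile, 3 match the expected circuit.
const runs: RunResult[] = Array.from({ length: 5 }, (_, i) => ({
  sample: "led-with-resistor",
  compiles: i < 4,
  matchesExpected: i < 3,
}))

const summary = summarize(runs)
console.log(summary)
```

Comparing a summary like this before and after a prompt change is one way to see whether response quality has degraded or improved.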