node-llama-cpp
v3.2.0
Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
✨ v3.0 is here! ✨
Features
- Run LLMs locally on your machine
- Metal, CUDA and Vulkan support
- Pre-built binaries are provided, with a fallback to building from source without `node-gyp` or Python
- Adapts to your hardware automatically, no need to configure anything
- A complete suite of everything you need to use LLMs in your projects
- Use the CLI to chat with a model without writing any code
- Up-to-date with the latest `llama.cpp`. Download and compile the latest release with a single CLI command
- Enforce a model to generate output in a parseable format, like JSON, or even force it to follow a specific JSON schema (see the sketch after this list)
- Provide a model with functions it can call on demand to retrieve information or perform actions (a sketch follows the usage example below)
- Embedding support (a sketch appears at the end of the Usage section)
- Great developer experience with full TypeScript support, and complete documentation
- Much more
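As a quick taste of the JSON-schema enforcement mentioned above, here is a minimal sketch based on the v3 `createGrammarForJsonSchema()` API (the basic setup is explained in the Usage section below); the model file name and the schema itself are illustrative:

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    // illustrative model file; use any GGUF model you have locally
    modelPath: path.join(__dirname, "models", "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf")
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

// Build a grammar from a JSON schema; generation is then constrained
// token-by-token, so the output always conforms to the schema
const grammar = await llama.createGrammarForJsonSchema({
    type: "object",
    properties: {
        positiveWordsInUserMessage: {
            type: "array",
            items: {type: "string"}
        },
        userMessagePositivityScoreFromOneToTen: {type: "number"}
    }
});

const response = await session.prompt("It's great that you're here!", {grammar});

// Since the output is schema-conforming JSON, it parses reliably
console.log(grammar.parse(response));
```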
Documentation
Try It Without Installing
Chat with a model in your terminal using a single command:
```bash
npx -y node-llama-cpp chat
```
Installation
```bash
npm install node-llama-cpp
```
This package comes with pre-built binaries for macOS, Linux and Windows.
If binaries are not available for your platform, it'll fall back to downloading a release of llama.cpp and building it from source with `cmake`.

To disable this behavior, set the environment variable `NODE_LLAMA_CPP_SKIP_DOWNLOAD` to `true`.
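For example, assuming a POSIX shell, the fallback can be disabled for a single install like this:

```bash
# Rely solely on pre-built binaries; skip the source-build fallback
NODE_LLAMA_CPP_SKIP_DOWNLOAD=true npm install node-llama-cpp
```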
Usage
```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf")
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});


const q1 = "Hi there, how are you?";
console.log("User: " + q1);

const a1 = await session.prompt(q1);
console.log("AI: " + a1);


const q2 = "Summarize what you said";
console.log("User: " + q2);

// The session keeps the chat history, so follow-up prompts have context
const a2 = await session.prompt(q2);
console.log("AI: " + a2);
```
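Embedding support from the features list follows the same pattern. A minimal sketch, assuming the same model file as above and the v3 embedding APIs (`createEmbeddingContext()` / `getEmbeddingFor()`):

```typescript
import {fileURLToPath} from "url";
import path from "path";
import {getLlama} from "node-llama-cpp";

const __dirname = path.dirname(fileURLToPath(import.meta.url));

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf")
});

// A dedicated context for computing embeddings
const embeddingContext = await model.createEmbeddingContext();

const embedding = await embeddingContext.getEmbeddingFor("Hello world");
console.log(embedding.vector.length + " dimensions");
```

For more examples, see the getting started guide.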
Contributing
To contribute to `node-llama-cpp`, read the contribution guide.
Acknowledgements
- llama.cpp: ggerganov/llama.cpp