# ollamazure
⭐ If you like this tool, star it on GitHub — it helps a lot!
Overview • Usage • Azure OpenAI compatibility • Sample code
## Overview
ollamazure is a local server that emulates Azure OpenAI API on your local machine using Ollama and open-source models.
This allows you to test your code locally without incurring costs or being rate limited, which is especially useful for development and testing, or when you need to work offline.
By default, `phi3` is used as the model for completions, and `all-minilm:l6-v2` for embeddings. You can change these models using the configuration options.
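For example, to run with different models (`llama3` and `nomic-embed-text` are just examples of models you might have pulled with Ollama):

```sh
ollamazure --model llama3 --embeddings nomic-embed-text
```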
> [!NOTE]
> This tool uses different models than Azure OpenAI, so you should expect differences in accuracy and performance. However, the API is compatible, so you can use the same code to interact with it.
## Usage
You need Node.js v20+ and Ollama installed on your machine to use this tool.
You can start the emulator directly using `npx`, without installing it:

```sh
npx ollamazure
```
Once the server is started, leave it running in its terminal window; you can then interact with it using the Azure OpenAI API. You can find sample code for different languages and frameworks in the sample code section.
For example, if you have an existing project that uses the Azure OpenAI SDK, you can point it to your local server by setting the `AZURE_OPENAI_ENDPOINT` environment variable to `http://localhost:4041`, without changing the rest of your code.
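A minimal sketch, assuming a POSIX shell and a Node.js app (`app.js` is a placeholder for your own entry point):

```sh
export AZURE_OPENAI_ENDPOINT=http://localhost:4041
node app.js  # your existing code now talks to the local emulator
```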
### Installation
```sh
npm install -g ollamazure
```
Once the installation completes, start the emulator by running the following command in a terminal:
```sh
ollamazure
# or use the shorter alias `oaz`
```
### Configuration options
```sh
ollamazure --help
```

```
Usage: ollamazure [options]

Emulates Azure OpenAI API on your local machine using Ollama and open-source models.

Options:
  --verbose                show detailed logs
  -y, --yes                do not ask for confirmation (default: false)
  -m, --model <name>       model to use for chat and text completions (default: "phi3")
  -e, --embeddings <name>  model to use for embeddings (default: "all-minilm:l6-v2")
  -d, --use-deployment     use deployment name as model name (default: false)
  -h, --host <ip>          host to bind to (default: "localhost")
  -p, --port <number>      port to use (default: 4041)
  -o, --ollama-url <url>   ollama base url (default: "http://localhost:11434")
  -v, --version            show the current version
  --help                   display help for command
```
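For example, `--use-deployment` makes the emulator use the deployment name sent by your client as the Ollama model name. A sketch, assuming you have pulled the `mistral` model locally:

```sh
ollamazure --use-deployment
# a client configured with deployment "mistral" now runs the Ollama model "mistral"
```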
## Azure OpenAI compatibility
| Feature              | Supported / with streaming |
| -------------------- | -------------------------- |
| Completions          | ✅ / ✅ |
| Chat completions     | ✅ / ✅ |
| Embeddings           | ✅ / - |
| JSON mode            | ✅ / ✅ |
| Function calling     | ⛔ / ⛔ |
| Reproducible outputs | ✅ / ✅ |
| Vision               | ⛔ / ⛔ |
| Assistants           | ⛔ / ⛔ |
Unimplemented features are currently not supported by Ollama, but are being worked on and may be added in the future.
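To illustrate the streaming and embeddings rows above, here is a minimal sketch using the Azure OpenAI SDK for JavaScript (the key, API version, and deployment names are placeholders, as in the samples below):

```js
import { AzureOpenAI } from 'openai';

const openai = new AzureOpenAI({
  endpoint: 'http://localhost:4041',
  apiKey: '123456',
  apiVersion: '2024-02-01',
});

// Streamed chat completion: chunks arrive incrementally
const stream = await openai.chat.completions.create({
  model: 'gpt-4', // deployment name, required by the SDK but not used by the local server
  messages: [{ role: 'user', content: 'Say hello!' }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}

// Embeddings (no streaming, per the table above)
const embeddings = await openai.embeddings.create({
  model: 'text-embedding-ada-002', // placeholder deployment name
  input: 'Hello world',
});
console.log(embeddings.data[0].embedding.length);
```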
## Sample code
See all code examples in the samples folder.
### JavaScript
Using the OpenAI SDK:

```js
import { AzureOpenAI } from 'openai';

const openai = new AzureOpenAI({
  // This is where you point to your local server
  endpoint: 'http://localhost:4041',
  // Parameters below must be provided but are not used by the local server
  apiKey: '123456',
  apiVersion: '2024-02-01',
  deployment: 'gpt-4',
});

const chatCompletion = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Say hello!' }],
});
console.log('Chat completion: ' + chatCompletion.choices[0].message?.content);
```
Alternatively, you can set the `AZURE_OPENAI_ENDPOINT` environment variable to `http://localhost:4041` instead of passing it to the constructor. Everything else will work the same.

If you're using managed identity, this will work as well, unless you're running in a local container. In that case, you can use a dummy function `() => '1'` for the `azureADTokenProvider` parameter in the constructor.
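A sketch of that workaround, using the same placeholder values:

```js
import { AzureOpenAI } from 'openai';

const openai = new AzureOpenAI({
  endpoint: 'http://localhost:4041',
  // Dummy token provider: the local server never validates the token
  azureADTokenProvider: () => '1',
  apiVersion: '2024-02-01',
  deployment: 'gpt-4',
});
```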
Using LangChain.js:

```js
import { AzureChatOpenAI } from '@langchain/openai';

// Chat completion
const model = new AzureChatOpenAI({
  // This is where you point to your local server
  azureOpenAIBasePath: 'http://localhost:4041/openai/deployments',
  // Parameters below must be provided but are not used by the local server
  azureOpenAIApiKey: '123456',
  azureOpenAIApiVersion: '2024-02-01',
  azureOpenAIApiDeploymentName: 'gpt-4',
});

const completion = await model.invoke([{ type: 'human', content: 'Say hello!' }]);
console.log(completion.content);
```
Alternatively, you can set the `AZURE_OPENAI_BASE_PATH` environment variable to `http://localhost:4041/openai/deployments` instead of passing it to the constructor. Everything else will work the same.

If you're using managed identity, this will work as well, unless you're running in a local container. In that case, you can use a dummy function `() => '1'` for the `azureADTokenProvider` parameter in the constructor.
Using LlamaIndex.TS:

```js
import { OpenAI } from 'llamaindex';

// Chat completion
const llm = new OpenAI({
  azure: {
    // This is where you point to your local server
    endpoint: 'http://localhost:4041',
    // Parameters below must be provided but are not used by the local server
    apiKey: '123456',
    apiVersion: '2024-02-01',
    deployment: 'gpt-4',
  },
});

const chatCompletion = await llm.chat({
  messages: [{ role: 'user', content: 'Say hello!' }],
});
console.log(chatCompletion.message.content);
```
Alternatively, you can set the `AZURE_OPENAI_ENDPOINT` environment variable to `http://localhost:4041` instead of passing it to the constructor. Everything else will work the same.

If you're using managed identity, this will work as well, unless you're running in a local container. In that case, you can use a dummy function `() => '1'` for the `azureADTokenProvider` parameter in the constructor.
### Python
Using the OpenAI SDK:

```python
from openai import AzureOpenAI

openai = AzureOpenAI(
    # This is where you point to your local server
    azure_endpoint="http://localhost:4041",
    # Parameters below must be provided but are not used by the local server
    api_key="123456",
    api_version="2024-02-01",
)

# Chat completion
chat_completion = openai.chat.completions.create(
    # Model must be provided but is not used by the local server
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello!"},
    ],
)
print(chat_completion.choices[0].message.content)
```
Alternatively, you can set the `AZURE_OPENAI_ENDPOINT` environment variable to `http://localhost:4041` instead of passing it to the constructor. Everything else will work the same.

If you're using managed identity, this will work as well, unless you're running in a local container. In that case, you can use a dummy function `lambda: "1"` for the `azure_ad_token_provider` parameter in the constructor.
Using LangChain:

```python
from langchain_openai import AzureChatOpenAI

# Chat completion
model = AzureChatOpenAI(
    # This is where you point to your local server
    azure_endpoint="http://localhost:4041",
    # Parameters below must be provided but are not used by the local server
    api_key="123456",
    api_version="2024-02-01",
    azure_deployment="gpt-4",
)

chat_completion = model.invoke([{"type": "human", "content": "Say hello!"}])
print(chat_completion.content)
```
Alternatively, you can set the `AZURE_OPENAI_ENDPOINT` environment variable to `http://localhost:4041` instead of passing it to the constructor. Everything else will work the same.

If you're using managed identity, this will work as well, unless you're running in a local container. In that case, you can use a dummy function `lambda: "1"` for the `azure_ad_token_provider` parameter in the constructor.
Using LlamaIndex:

```python
from llama_index.core.llms import ChatMessage
from llama_index.llms.azure_openai import AzureOpenAI

# Chat completion
llm = AzureOpenAI(
    # This is where you point to your local server
    azure_endpoint="http://localhost:4041",
    # Parameters below must be provided but are not used by the local server
    api_key="123456",
    api_version="2024-02-01",
    engine="gpt-4",
)

chat_completion = llm.chat([ChatMessage(role="user", content="Say hello!")])
print(chat_completion.message.content)
```
Alternatively, you can set the `AZURE_OPENAI_ENDPOINT` environment variable to `http://localhost:4041` instead of passing it to the constructor. Everything else will work the same.

If you're using managed identity, this will work as well, unless you're running in a local container. In that case, you can use a dummy function `lambda: "1"` for the `azure_ad_token_provider` parameter in the constructor.
### C#
Using the Azure OpenAI SDK:

```csharp
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;

// Chat completion
AzureOpenAIClient azureClient = new(
    // This is where you point to your local server
    new Uri("http://localhost:4041"),
    // Must be provided but is not used by the local server
    new AzureKeyCredential("123456"));
ChatClient chatClient = azureClient.GetChatClient("gpt-4");

ChatCompletion completion = chatClient.CompleteChat([new UserChatMessage("Say hello!")]);
Console.WriteLine(completion.Content[0].Text);
```
Using Semantic Kernel:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var builder = Kernel.CreateBuilder();

// Chat completion
builder.AddAzureOpenAIChatCompletion(
    "gpt-4",                  // Deployment name (not used by the local server)
    "http://localhost:4041",  // Azure OpenAI endpoint
    "123456");                // Azure OpenAI key (not used by the local server)
var kernel = builder.Build();

var chatFunction = kernel.CreateFunctionFromPrompt(@"{{$input}}");
var chatCompletion = await kernel.InvokeAsync(chatFunction, new() { ["input"] = "Say hello!" });
Console.WriteLine(chatCompletion);
```
### Java
```java
package com.sample.azure;

import java.util.ArrayList;
import java.util.List;

import com.azure.ai.openai.*;
import com.azure.ai.openai.models.*;

public class App {
    public static void main(String[] args) {
        // Chat completion
        OpenAIClient client = new OpenAIClientBuilder()
            // This is where you point to your local server
            .endpoint("http://localhost:4041")
            .buildClient();

        List<ChatRequestMessage> chatMessages = new ArrayList<>();
        chatMessages.add(new ChatRequestUserMessage("Say hello!"));
        ChatCompletions chatCompletions = client.getChatCompletions("gpt-4",
            new ChatCompletionsOptions(chatMessages));

        System.out.println(chatCompletions.getChoices().get(0).getMessage().getContent());
    }
}
```
If you get the error `Key credentials require HTTPS to prevent leaking the key`, it means you have to avoid setting the `.credential()` option on the `OpenAIClientBuilder`. This is currently a limitation of the Azure OpenAI SDK.