webpack-embedding-generator
v1.0.1
Published
A Webpack plugin designed to generate embedding code snippets by traversing the Abstract Syntax Tree (AST) of your project. Ideal for automating the embedding of specific modules or components into web pages.
Downloads
2
Readme
Webpack Embedding Plugin
The Webpack Embedding Plugin is a custom webpack plugin that generates embeddings of type Number[]
for specified file types in your codebase. These embeddings can be used to create a Retrieval Augmented Generation (RAG) system with private Language Models (LLMs) for enhanced code understanding and generation tasks.
Features
- Generates embeddings of type
Number[]
for specified file types (default:.ts
and.tsx
) - Caches embeddings to avoid redundant generation
- Writes the generated embeddings to a configurable JSON file (default:
embeddings.json
) in the output directory - Supports custom embedding functions for generating embeddings
- Enables building a codebase RAG system with private LLMs
Installation
To install the Webpack Embedding Plugin, you can include it in your project's dependencies using npm or yarn:
npm install webpack-embedding-plugin
or
yarn add webpack-embedding-plugin
Usage
To use the Webpack Embedding Plugin in your webpack configuration, you need to require the plugin and add it to the plugins
array:
const WebpackEmbeddingPlugin = require('webpack-embedding-plugin');
module.exports = {
// ...
plugins: [
new WebpackEmbeddingPlugin({
embedFn: yourEmbeddingFunction,
fileExtensions: ['.ts', '.tsx'],
outputPath: 'path/to/embeddings.json',
}),
],
};
Options
The Webpack Embedding Plugin accepts the following options:
embedFn
(required): A function that generates embeddings for a given file content and name. It should return a promise that resolves to the generated embedding of typeNumber[]
.fileExtensions
(optional): An array of file extensions to generate embeddings for. Default:['.ts', '.tsx']
.outputPath
(optional): The path to the output JSON file relative to the webpack output directory. Default:'embeddings.json'
.
Codebase RAG System
The generated embeddings can be used to build a codebase Retrieval Augmented Generation (RAG) system with private Language Models (LLMs). RAG is a technique that combines information retrieval with language model generation to enhance the understanding and generation capabilities of LLMs for code-related tasks.
By generating embeddings for your codebase files, you can create a vector database that represents the semantic meaning of your code. When querying the private LLM, you can retrieve relevant code snippets based on their embedding similarity to the query. The retrieved code snippets can then be used as additional context for the LLM to generate more accurate and contextually relevant code completions, suggestions, or responses.
To build a codebase RAG system with the generated embeddings:
- Use the Webpack Embedding Plugin to generate embeddings for your codebase files.
- Store the generated embeddings in a vector database or search engine that supports similarity search (e.g., Faiss, Elasticsearch, Milvus).
- When querying the private LLM for code-related tasks, retrieve relevant code snippets from the vector database based on the similarity of the query embedding to the stored embeddings.
- Provide the retrieved code snippets as additional context to the private LLM to generate more accurate and contextually relevant responses.
By leveraging the power of embeddings and RAG, you can enhance the performance and capabilities of your private LLMs for code understanding and generation tasks.
License
The Webpack Embedding Plugin is open-source software licensed under the MIT License.