# Committee (`cmte`) v1.0.4

> Design by Committee™ except it's just you and LLMs
## Overview
This framework lets you assemble surgical context and build iterative, templated prompts into chained LLM workflows.
## Core Example: Service Analysis
Let's illustrate the core workflow with an example that analyzes several microservices using their specific documentation and source code.
### 1. `workflow.yaml`

Defines file collections, global context, and the structured `services` object intended for iteration.

```yaml
name: "service-analysis-workflow"
description: "Analyze multiple services using their specific docs and code"
outputPath: "_output/service-analysis"

# Define file collections
files:
  # General Docs
  architectureDoc: "docs/ARCHITECTURE.md"
  # Auth Service Files
  authConfigDoc: "docs/AUTH-CONFIG.md"
  authCode: ["src/auth/**/*.js", "!src/auth/legacy/**"]
  # Data Service Files
  dataModelsDoc: "docs/DATA-MODELS.md"
  dataCode: "src/data/**/*.js"

# Define universally accessible global variables
global_variables:
  # General context available to all tasks
  overallArchitecture: "{{ files.architectureDoc }}"

# Define data structures for set iteration
iterable_objects:
  # Structured object containing service-specific context
  services: # Target for 'for_each: services' in a set
    auth: # Key becomes 'item.key' during iteration; value becomes 'item.value'
      description: "Authentication and Authorization Service"
      contact: "auth-team@example.com"
      # Embed CONTENT of auth-specific files
      configDocContent: "{{ files.authConfigDoc }}"
      codeContent: "{{ files.authCode }}"
    data: # Key becomes 'item.key'; value becomes 'item.value'
      description: "Data Processing and Storage Service"
      contact: "data-team@example.com"
      # Embed CONTENT of data-specific files
      modelsDocContent: "{{ files.dataModelsDoc }}"
      codeContent: "{{ files.dataCode }}"

# Define the sequence of sets
sets:
  - useSet: analyze-service # Iterate over 'services' defined in iterable_objects
    for_each: services
```

(Note: The `{{ files.collectionName }}` syntax within `global_variables` or `iterable_objects` embeds the formatted content of the files.)
### 2. `sets/analyze-service.set.yaml`

Defines a set that iterates over the `services` object defined in the workflow's `iterable_objects`.

```yaml
name: "analyze-service"
description: "Run analysis tasks for each service defined in the context"

# Iterates over the 'services' object from workflow.yaml's iterable_objects
# Each item will be { key: serviceName, value: serviceObject }
for_each: services

tasks:
  # These tasks run in parallel for each service
  # Task context automatically includes 'item', 'item.key', 'item.value'
  # and variables from 'global_variables' like 'overallArchitecture'
  - useTask: identify-service-patterns
  - useTask: suggest-service-improvements
```
### 3. `tasks/analyze-service.md`

A task template showing how to access the context provided by the iteration and global variables.

````markdown
Analyze the service: **{{ item.key }}**

**Service Description:** {{ item.value.description }}
**Contact:** {{ item.value.contact }}

**Overall Architecture Context:**
```
{{ overallArchitecture }} # Accessing a global_variable
```

**Service-Specific Configuration Documentation:**
```
{{ item.value.configDocContent }} # Accessing data from item.value
```

**Service-Specific Code:**
```
{{ item.value.codeContent }} # Accessing data from item.value
```

**Analysis Request:**
Based on the overall architecture and the specific documentation and code for the `{{ item.key }}` service, please perform the analysis requested by the calling task (e.g., identify patterns, suggest improvements).
````
This example demonstrates how to:

- Define multiple file sources.
- Define `global_variables` accessible everywhere.
- Structure data for iteration under `iterable_objects`.
- Iterate over this structured data using `for_each`.
- Access the iteration key (`item.key`), iteration value (`item.value.*`), and global variables within a task template.
## Key Concepts
Now let's dive deeper into the core components illustrated above.
### Workflows
A workflow is the top-level container defined in `workflow.yaml`, as seen in the Core Example. It specifies:

- `global_variables` accessible throughout the workflow. These form the base context.
- `iterable_objects` defining data structures (arrays/objects) intended for set iteration via `for_each`.
- Named file collections (`files:`) to gather context using glob patterns. File content is typically embedded into `global_variables` or `iterable_objects`.
- An ordered sequence of `sets` to be executed.
### Sets
Sets group related tasks. Sets defined in the workflow's `sets:` list are executed sequentially, in the order they appear. Within a single set, the listed tasks are executed in parallel. Sets can optionally iterate over arrays or objects defined in the workflow's `iterable_objects` using `for_each`.
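The execution model above can be sketched with a minimal pair of files (hypothetical set and task names):

```yaml
# workflow.yaml (excerpt): sets run sequentially, top to bottom
sets:
  - useSet: gather-context # runs first
  - useSet: write-report   # starts only after gather-context completes

# sets/gather-context.set.yaml: tasks within one set run in parallel
name: "gather-context"
tasks:
  - useTask: summarize-docs # these two tasks
  - useTask: summarize-code # execute concurrently
```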
### Tasks
Tasks are templated prompts (stored as `.md` files) that perform a specific action using an LLM, like the `analyze-service.md` template in the Core Example. Each task runs with a context including:

- `global_variables` (from `workflow.yaml`).
- Iteration variables (`item`, `item.key`, `item.value`) if the set uses `for_each`.
- Outputs from tasks in previous sets, accessed via `prior_outputs` defined in the set file.
**Important:** Due to parallel execution within a set, a task cannot access the output of another task running in the same set. Input/output dependencies must be managed by sequencing tasks across different sets.
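In practice, this means a dependency between two tasks is expressed by placing them in consecutive sets and wiring the output through `prior_outputs`. A sketch with hypothetical set and task names:

```yaml
# sets/first-pass.set.yaml
name: "first-pass"
tasks:
  - useTask: extract-facts

# sets/second-pass.set.yaml -- listed after first-pass in workflow.yaml,
# so it can safely consume first-pass output
name: "second-pass"
tasks:
  - useTask: draft-summary
    prior_outputs:
      facts: "{{ first-pass.extract-facts.output }}"
```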
````markdown
# Example tasks/extract-document-keywords.md

Analyze the document `{{ item.path }}` and extract the key keywords.

**Document Content:**
```
{{ item.content }} # Assuming the set iterates over a file collection
```

**Analysis from previous set:**
{{ preliminary_scan_output }} # Accessing an output reference defined in the set's prior_outputs

Please list the top 5 keywords.
````

(Note: The above task assumes the set file mapped the output reference like: `prior_outputs: { preliminary_scan_output: "{{ initial-processing.preliminary-scan.output }}" }`)
### Output References
The framework uses the following syntax within the set file's `prior_outputs` block to reference outputs from tasks in previous sets:

- `setName.taskName.output`: Output from a task in a previous set. The `taskName` used here must match the `useTask` value from the task definition in the previous set's YAML file.

**Important Convention:** Task outputs are always stored and referenced using the exact name specified in the `useTask` field. There is no option to rename outputs.
**Example Set Configuration** (`*.set.yml`):

If `set1` (iterating `iterable_objects.sourceFiles`) contains a task `useTask: taskA`, and `set2` (also iterating `sourceFiles`) needs its output:

```yaml
name: set2
for_each: sourceFiles # Assumes sourceFiles defined in workflow.yaml iterable_objects
tasks:
  - useTask: process-output
    prior_outputs:
      # Map the reference to a local variable name for use in the task template
      taskA_result: "{{ set1.taskA.output }}" # Reference uses the original task name 'taskA'
```
**Example Task Template** (`tasks/process-output.md`):

```markdown
Processing output for file {{ item.path }}.

Result from Task A in Set 1:
{{ taskA_result }} # Access the output via the name defined in prior_outputs
```
Note: Referencing outputs from tasks within the same parallel set execution is unreliable and should be avoided. Structure your workflow with sequential sets for dependencies.
## Where Data Comes From: Defining Your Context
Understanding where each type of data is defined and accessed is key to using Committee effectively. The framework uses the following structure:
- **Global Variables:** Defined in the top-level `global_variables:` block of your `workflow.yaml`. These are accessible to all sets and tasks throughout the workflow execution.
- **File Collections & Content:** File sources are defined in the `files:` block of `workflow.yaml`. To make file content available for LLM analysis, embed it into variables within the `workflow.yaml` `global_variables:` or `iterable_objects:` blocks using `{{ files.collectionName }}`. Task templates (`.md`) can reference `{{ files.collectionName }}` to get a list of paths.
- **Iteration Data (`item`):** Data structures (arrays or objects) intended for iteration with `for_each` are defined in the `iterable_objects:` block of `workflow.yaml`. The `for_each: objectName` directive within a `*.set.yml` file targets one of these objects. Tasks within that set then access the current iteration's data via the `item` object (or `item.key`/`item.value` for object iteration).
- **Task Outputs (via Prior Outputs):** Outputs from previous tasks are made available to a subsequent task via the `prior_outputs:` block defined under that task in its `*.set.yml` file. This block maps a local name (used in the task template) to the structured output reference string (e.g., `setName.taskName[iterationKey].output`).
Essentially, `workflow.yaml` is the primary location for defining the initial context (`global_variables`), data sources (`files`), and data for iteration (`iterable_objects`), while `*.set.yml` files orchestrate the execution flow and manage dependencies on previously generated task outputs via `prior_outputs`.
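As a compact reference, a skeletal `workflow.yaml` (hypothetical names) showing where each of these blocks lives:

```yaml
name: "my-workflow"               # hypothetical workflow
outputPath: "_output/my-workflow"
files:                            # data sources (paths / glob patterns)
  notes: "docs/**/*.md"
global_variables:                 # base context, available everywhere
  notesContext: "{{ files.notes }}"
iterable_objects:                 # data targeted by for_each in sets
  topics: ["caching", "auth"]
sets:                             # execution order
  - useSet: analyze-topic
    for_each: topics
```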
## File Collection Handling
You define named file collections in `workflow.yaml` using file paths or glob patterns (`include`/`exclude`):
```yaml
# workflow.yaml
name: "code-review-workflow"
files:
  sourceCode:
    include: ["src/**/*.js"]
    exclude: ["src/vendor/**"]
  testFiles: "test/**/*.test.js"
  docs: ["README.md", "CONTRIBUTING.md"]
# ... global_variables, iterable_objects, and sets follow ...
```
These collections are primarily used to inject context into your workflow. A `{{ files.collectionName }}` reference has two behaviors depending on where it is used:
1. **In `workflow.yaml` (`global_variables:` or `iterable_objects:`):**
   - **Behavior:** Embeds the full content of each file in the collection directly into the variable's string value. Each file's content is automatically prefixed with a Markdown header indicating its path (e.g., `# path/to/file.js`).
   - **Purpose:** This is the primary mechanism for injecting substantial file content (like source code or documentation) into the context, making it available to subsequent sets and tasks for direct LLM analysis.
   - **Example (`workflow.yaml`):**

     ```yaml
     global_variables:
       # Embeds the content of all files matching src/**/*.js,
       # each block prefixed with '# filepath'
       sourceContext: "{{ files.sourceCode }}"
       # Embeds content of README.md and CONTRIBUTING.md
       docsContext: "{{ files.docs }}"
     ```
2. **In Task Templates (`*.md` files):**
   - **Behavior:** Renders a newline-separated list of the file paths belonging to that collection. It does *not* embed the file content here.
   - **Purpose:** Useful for providing informational context within a task prompt, such as listing related files for the LLM's reference, without including their potentially large content directly in that specific prompt.
   - **Example (`tasks/review-code.md`):**

     ````markdown
     Review the following source code file `{{ item.path }}`:
     ```javascript
     {{ item.content }} # Assuming iteration over a file collection
     ```
     Consider related test files (paths listed below):
     {{ files.testFiles }} # Lists paths from the 'testFiles' collection
     ````
**Key Distinction:** Use `{{ files.collectionName }}` in `workflow.yaml` (`global_variables` or `iterable_objects`) to provide the *content* needed for LLM analysis. Use it in task templates (`.md`) when you only need to reference the *paths* of the files.
(Note: Advanced pattern filtering within the template tag, like `{{ files.collectionName:*.js }}`, is not currently implemented.)
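For intuition, embedding a collection (e.g., `{{ files.sourceCode }}`) into a `global_variables` entry yields a single string that looks roughly like this, following the `# filepath` header behavior described above (hypothetical files and contents):

```markdown
# src/auth/login.js

function login(user) { /* ... */ }

# src/auth/session.js

function createSession(user) { /* ... */ }
```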
## Two-Phase Thinking
Tasks can optionally perform a preliminary "thinking" step before generating the final response. This is useful for complex analysis or reasoning tasks. Configure it using YAML frontmatter at the top of your task's `.md` file:
```markdown
---
name: "complex-analysis-task" # Optional: Task name for clarity
thinking: true # REQUIRED: Enables the thinking phase
thinking_prompt: "path/to/thinking-prompt.md" # Optional: Use a separate prompt file for the thinking phase
thinking_instruction: "Analyze the input step-by-step..." # Optional: Specific instruction for the thinking phase
thinking_params:
  temperature: 0.2 # Optional: LLM parameters specifically for the thinking phase
---

# Main Task Prompt

Based on the preceding analysis, provide the final answer.

Context:
{{ context }}
```
- If `thinking: true`, the framework first runs the thinking phase (using the main prompt, or `thinking_prompt` if provided, potentially guided by `thinking_instruction`).
- The output of the thinking phase is then automatically prepended to the context provided to the main task prompt for generating the final response.
- You can control LLM parameters specifically for the thinking step using `thinking_params`.
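If you point `thinking_prompt` at a separate file, that file is an ordinary prompt template. A sketch of what it might contain (hypothetical content, assuming the same template variables are available to it):

```markdown
Before answering, reason step by step:

1. List the key entities in the context below.
2. Note any contradictions or gaps.
3. Outline the structure of the final answer.

Context:
{{ context }}
```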
## Using the Framework
### Installation
```bash
# Navigate to the project root directory

# Install globally (recommended for CLI use)
npm install -g .

# Or install locally
npm install .
```
### Basic Usage
1. Create a workflow directory (e.g., `my-workflow/`) containing:
   - `workflow.yaml` (workflow definition)
   - `sets/` directory (with `.set.yaml` or `.set.yml` set definitions)
   - `tasks/` directory (with `.md` task prompt files)

2. Configure your environment variables (e.g., in a `.env` file in your project or system):

   ```bash
   # Required for using Anthropic API (if not using --local)
   ANTHROPIC_API_KEY=your_api_key_here

   # Optional: Specify default model (defaults exist, e.g., Claude 3 Haiku for --lite, Sonnet otherwise)
   # DEFAULT_MODEL=claude-3-sonnet-20240229

   # Optional: For using a local LLM (requires --local flag)
   # Needs a running server compatible with the OpenAI API spec (e.g., Ollama, LM Studio)
   LOCAL_LLM_URL=http://localhost:11434 # Default Ollama URL example

   # Optional: Specify model served by local URL (required if server hosts multiple)
   # LOCAL_LLM_MODEL=llama3
   ```

3. Run the workflow from your terminal:

   ```bash
   # Run workflow 'my-workflow' using Anthropic API (default if key is set)
   cmte my-workflow

   # Run workflow using Local LLM specified in .env (requires LOCAL_LLM_URL)
   cmte --local my-workflow # or: cmte -x my-workflow
   ```
### Options
```bash
# Use local LLM instead of cloud API (--local or -x)
cmte path/to/workflow --local

# Use a lightweight/faster model (--lite or -l)
# (Currently defaults to Claude Haiku if ANTHROPIC_API_KEY is set,
# otherwise uses the configured local model if --local is specified)
cmte path/to/workflow --lite

# Save rendered prompts and LLM responses to the output directory (--prompts or -p)
cmte path/to/workflow --prompts

# Combine options
cmte path/to/workflow --local --lite --prompts # or -x -l -p
```
### CLI Testing Options
Committee provides several CLI options to help you test workflows efficiently:
#### Lite Mode (`--lite` or `-l`)

The `--lite` flag uses a smaller, less expensive model (like Claude Haiku) for all API calls:

```bash
cmte --lite my-workflow # or: cmte -l my-workflow
```
This mode is useful for:
- Faster testing and development
- Reducing token costs
- Quick iterations during development
#### Dry Run Mode (`--dryrun`)

The `--dryrun` flag simulates workflow execution without making LLM calls or writing output files:

```bash
cmte --dryrun my-workflow
```
This mode is useful for:
- Testing workflow structure and file paths
- Validating prompt templates and variable substitution
- Previewing what files would be generated
- Debugging workflow configuration
When using `--dryrun`:

- No LLM API calls are made (placeholder responses are used)
- Prompts are still saved if `--prompts` is also specified
- Output files are not written, but their intended paths are logged
- Structured output would be written to a `<workflow_dir>/dryrun/` directory (but isn't actually written)
#### Saving Prompts (`--prompts` or `-p`)

The `--prompts` flag saves rendered prompts and LLM responses to the output directory:

```bash
cmte path/to/workflow --prompts
```
This mode is useful for:

- Saving prompts and responses for later analysis or debugging
- Combining with `--lite` for faster testing, or with `--dryrun` to inspect prompts without making LLM calls or writing output files