@bot41/bot41-api
bot41 api
| pk | sk | data1 | data2 |
| --- | --- | --- | --- |
| USER#ID | COLLEAGUE#slack#source_id#ID | SOURCE#ID | |
| USER#ID | COLLEAGUE#organic#ID | SOURCE#ORGANIC | ["COLLEAGUE#ID", "COLLEAGUE#ID"] |
| USER#ID | SOURCE#ID | | |
| USER#ID | MESSAGE#ID | | |
| PROMPT#ID | PROMPT | USER#ID | PROMPT#userId#name |
| PROMPT#ID | PROMPT | USER#ID | PROMPT#public#name |
| USER#ID | MODEL_INSTANCE#ID | MODEL_INSTANCE#name | |
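A minimal sketch of how these composite keys could be built in TypeScript; the helper names are illustrative and not part of the API:

// Hypothetical key builders for the single-table layout above
const userPk = (userId: string) => `USER#${userId}`;
const colleagueSk = (source: string, sourceId: string, id: string) => `COLLEAGUE#${source}#${sourceId}#${id}`;
const promptPk = (promptId: string) => `PROMPT#${promptId}`;
const promptData2 = (scope: string, name: string) => `PROMPT#${scope}#${name}`;

// Example: the slack colleague row from the table above
const colleagueItem = {
  pk: userPk("42"),
  sk: colleagueSk("slack", "S1", "C1"),
  data1: "SOURCE#S1",
};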
Infrastructure
DynamoDB
Production
sam build -t table-production.yaml && sam deploy \
--stack-name bot41-production-table \
--capabilities CAPABILITY_IAM \
--s3-bucket bot41-eu-central-1 \
--s3-prefix production-table \
--region eu-central-1 \
--profile feedme \
--parameter-overrides ParameterKey=TableName,ParameterValue=bot41-production-table
Development
sam build -t table-development.yaml && sam deploy \
--stack-name bot41-development-table \
--capabilities CAPABILITY_IAM \
--s3-bucket bot41-eu-central-1 \
--s3-prefix development-table \
--region eu-central-1 \
--profile feedme \
--parameter-overrides ParameterKey=TableName,ParameterValue=bot41-development-table
Production - messages
sam build -t table-messages.yaml && sam deploy \
--stack-name bot41-messages-production-table \
--capabilities CAPABILITY_IAM \
--s3-bucket bot41-us-east-1 \
--s3-prefix messages-production-table \
--region us-east-1 \
--profile feedme \
--parameter-overrides ParameterKey=TableName,ParameterValue=bot41-messages-production-table
Development - messages
sam build -t table-messages.yaml && sam deploy \
--stack-name bot41-messages-development-table \
--capabilities CAPABILITY_IAM \
--s3-bucket bot41-eu-central-1 \
--s3-prefix messages-development-table \
--region eu-central-1 \
--profile feedme \
--parameter-overrides ParameterKey=TableName,ParameterValue=bot41-messages-development-table
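Once a table is deployed, the API can talk to it through the AWS SDK v3 document client. A minimal sketch, assuming the region and table name from the development deployment above:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand } from "@aws-sdk/lib-dynamodb";

// Region and table name match the development stack parameters above
const client = new DynamoDBClient({ region: "eu-central-1" });
const ddb = DynamoDBDocumentClient.from(client);

// Fetch a single item by its composite key, e.g. a source entry of a user
const source = await ddb.send(new GetCommand({
  TableName: "bot41-development-table",
  Key: { pk: "USER#ID", sk: "SOURCE#ID" },
}));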
Prompts
Get prompts by id
- pk = PROMPT#ID AND sk = PROMPT
- /prompts/:id
Get all private prompts ordered by name (gsi2)
- sk = PROMPT AND begins_with(data2, PROMPT#userId)
- /prompts?created_by=USERID&sort=name&q=QUERY
Get all public prompts ordered by name
- sk = PROMPT AND begins_with(data2, PROMPT#public)
- /prompts?sort=name&q=QUERY
Get all private prompts ordered by created date (gsi1)
- data1 = USER#ID AND begins_with(pk, PROMPT)
- /prompts?created_by=USERID&sort=created_at
Get all public prompts ordered by usage
- sk = PROMPT (gsi3 and filter on public/private)
- /prompts?sort=usage
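As an illustration, the gsi2 access pattern above could be expressed like this with the document client from the earlier sketch; the index and attribute names follow the patterns listed above:

import { QueryCommand } from "@aws-sdk/lib-dynamodb";

// Private prompts of one user, ordered by name (gsi2: sk as partition key, data2 as sort key)
const privatePromptsByName = new QueryCommand({
  TableName: "bot41-development-table",
  IndexName: "gsi2",
  KeyConditionExpression: "sk = :sk AND begins_with(data2, :prefix)",
  ExpressionAttributeValues: {
    ":sk": "PROMPT",
    ":prefix": "PROMPT#USERID",
  },
});
// const { Items } = await ddb.send(privatePromptsByName);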
type QueryPrivateName = { created_by: string; sort: "name"; q?: string; }
type QueryPrivateCreatedAt = { created_by: string; sort?: "created_at"; }
type QueryPublicName = { sort?: "name"; q?: string; }
type QueryPublicUsage = { sort: "usage"; }
type Query = QueryPrivateName | QueryPrivateCreatedAt | QueryPublicName | QueryPublicUsage;
const q: Query = { sort: "usage" }
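A sketch of how this union could be narrowed to pick an index; the mapping follows the access patterns listed above, everything else is an assumption:

// Decide which index and key prefix a /prompts query maps to
function resolvePromptQuery(q: Query) {
  if ("created_by" in q && q.sort === "name") {
    return { index: "gsi2", prefix: `PROMPT#${q.created_by}` }; // private, ordered by name
  }
  if ("created_by" in q) {
    return { index: "gsi1", prefix: "PROMPT" }; // private, ordered by created date
  }
  if (q.sort === "usage") {
    return { index: "gsi3", prefix: "PROMPT" }; // public, ordered by usage
  }
  return { index: "gsi2", prefix: "PROMPT#public" }; // public, ordered by name
}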
- People can create new prompts by copying existing prompts
- When they do that we just create a new entry as a private prompt
Models
- /models behaves exactly like /prompts
Model profiles
Get model profiles by id
- pk = USER#ID AND sk = MODEL_INSTANCE#ID
- /model-profiles/:id
Get all model profiles ordered by created date
- pk = USER#ID AND begins_with(sk, MODEL_INSTANCE)
- /model-profiles?created_by=USERID&sort=created_at
Get all model profiles ordered by name (lsi1)
- pk = USER#ID AND begins_with(data1, MODEL_INSTANCE)
- /model-profiles?created_by=USERID&sort=name&q=QUERY
Get all model profiles ordered by model name and filtered by model type (lsi2)
- pk = USER#ID AND begins_with(data2, MODEL_INSTANCE#MODEL_TYPE)
- /model-profiles?created_by=USERID&model_type=MODEL_TYPE&q=QUERY
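A rough sketch of the lsi2 pattern; the index name comes from the list above, the placeholder values are assumptions:

import { QueryCommand } from "@aws-sdk/lib-dynamodb";

// Model profiles of one user, filtered by model type and ordered by model name (lsi2)
const profilesByModelType = new QueryCommand({
  TableName: "bot41-development-table",
  IndexName: "lsi2",
  KeyConditionExpression: "pk = :pk AND begins_with(data2, :prefix)",
  ExpressionAttributeValues: {
    ":pk": "USER#ID",
    ":prefix": "MODEL_INSTANCE#MODEL_TYPE",
  },
});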
Clients
Work just like prompts and models
Datasources
Datasources are just like model instances
Datasources are created from clients
Get datasource by id
- pk = USER#ID AND sk = DATASOURCE#ID
- /datasources/:id
Get all datasources ordered by created date
- pk = USER#ID AND begins_with(sk, DATASOURCE)
- /datasources?created_by=USERID&sort=created_at
Get datasources by name (lsi1)
- pk = USER#ID AND begins_with(data1, DATASOURCE)
- /datasources?created_by=USERID&sort=name&q=QUERY
Get datasources by client name (lsi2)
- pk = USER#ID AND begins_with(data2, DATASOURCE#CLIENT_NAME)
- /datasources?created_by=USERID&client_name=CLIENT_NAME&q=QUERY
Contexts
Get context by id
- pk = USER#ID AND sk = CONTEXT#ID
- /contexts/:id
Get contexts by name (lsi1)
- pk = USER#ID AND begins_with(data1, CONTEXT#datasourceId)
- /contexts?datasource_id=DATASOURCE#ID&q=QUERY&created_by=USERID
Users
Users are just like datasources
Messages
Prefixing every hash key with the user ID fixes the authentication problem
Message item: pk = USER#ID, sk = MESSAGE#ID, plus USERID#DATASOURCE#ID, CONTEXT#ID and USER#ID as index key attributes (gsi4-gsi6)
pk: DATASOURCE#ID, sk: MESSAGE#ID
Get message by id
- pk = USER#ID AND sk = MESSAGE#ID
- /messages/:id
Get messages by datasource id, ordered by created_at (gsi4)
Get messages by context, ordered by created_at (gsi5)
Get messages by user, ordered by created_at (gsi6)
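A sketch of the gsi4 pattern; the index key attribute name (gsi4pk) and the descending sort are assumptions, the partition value is the USERID#DATASOURCE#ID attribute shown above:

import { QueryCommand } from "@aws-sdk/lib-dynamodb";

// Messages of one datasource, newest first (gsi4); gsi4pk is an assumed attribute name
const messagesByDatasource = new QueryCommand({
  TableName: "bot41-messages-development-table",
  IndexName: "gsi4",
  KeyConditionExpression: "gsi4pk = :pk",
  ExpressionAttributeValues: { ":pk": "USERID#DATASOURCE#ID" },
  ScanIndexForward: false, // descending created_at
});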
Get all colleagues (ordered by name, filtered by source type and source id)
- pk = USER#ID AND begins_with(sk, COLLEAGUE)
- /colleagues
- pk = USER#ID AND begins_with(sk, COLLEAGUE#slack)
- /colleagues?source=slack
- pk = USER#ID AND begins_with(sk, COLLEAGUE#slack#source_id)
- /colleagues?source=slack&source_id=SOURCE#ID
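The sort-key prefix for /colleagues can be derived from the optional query parameters; a small sketch, the function name is illustrative:

// Build the begins_with prefix for the colleague sort key from /colleagues query params
function colleagueSkPrefix(params: { source?: string; source_id?: string }): string {
  if (params.source && params.source_id) return `COLLEAGUE#${params.source}#${params.source_id}`;
  if (params.source) return `COLLEAGUE#${params.source}`;
  return "COLLEAGUE";
}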
Get all sources
- pk = USER#ID AND begins_with(sk, SOURCE)
- /sources
Different states of data export:
stateDiagram-v2
[*] --> Idle
Idle --> Waiting: healthy
Idle --> Failed: error
Waiting --> Processing: init
Waiting --> Failed: error
Processing --> Processing: update
Processing --> Success: commit
Processing --> Failed: error
Success --> [*]
Failed --> [*]
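The same transitions as a TypeScript lookup table, a sketch for validating state changes on a data export entry:

type ExportState = "Idle" | "Waiting" | "Processing" | "Success" | "Failed";
type ExportEvent = "healthy" | "error" | "init" | "update" | "commit";

// Allowed transitions, mirroring the state diagram above
const transitions: Record<ExportState, Partial<Record<ExportEvent, ExportState>>> = {
  Idle: { healthy: "Waiting", error: "Failed" },
  Waiting: { init: "Processing", error: "Failed" },
  Processing: { update: "Processing", commit: "Success", error: "Failed" },
  Success: {},
  Failed: {},
};

// Returns the next state, or undefined if the event is not allowed in the current state
const next = (state: ExportState, event: ExportEvent) => transitions[state][event];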
Flow of sync:
- User creates a client
- The client includes a schema and a healthcheck
- User creates a datasource from a client
- The datasource includes the current schema, the data, and the version of the client
- User starts a sync of a datasource
- The API does a healthcheck and writes a data export entry in DynamoDB
- dataExportId
- state: waiting (healthcheck was healthy)
- A stream consumer puts a message on the webhook SQS queue when the data export entry is created
- The consumer of the webhook SQS queue tries to call the webhook 3 times with a 10-second delay
- if the call succeeds, update the data export entry in DynamoDB
- state: processing
- if it fails, update the data export entry in DynamoDB
- state: failed
- A stream consumer puts a message on the data-export-failed SQS queue with a 15-minute delay when the state changes from waiting to processing
- clients have 15 minutes to complete export
- The consumer of the data-export-failed SQS queue sets the state to failed if the export is still processing
- The client can update the data export as often as it likes during the 15 minutes with a PATCH { state: processing, data: { ... } }
- Only when the client updates the data export with a commit is the export considered complete; the messages are then written to the datasource
- state: success
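A sketch of the PATCH payloads a client might send while the export is running; only { state, data } appears in the flow above, the commit shape is an assumption:

type DataExportPatch =
  | { state: "processing"; data: Record<string, unknown> } // partial update during the 15-minute window
  | { state: "commit" }; // assumed shape; marks the export as complete

// Example: push another batch of exported data
const update: DataExportPatch = { state: "processing", data: { messages: [] } };

// Example: finish the export; the API then writes the messages to the datasource
const commit: DataExportPatch = { state: "commit" };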
Prompts:
- Get prompts of user by name
- Get public prompts by name
- Get public prompts by usage count
- Get prompts of user by favorite
- Every prompt has an optional field with recommended model
- A prompt needs at least one subject
- A subject is either a user or a context
- In the prompt different subjects can be used using "{{subjectA:name}}" or "{{subjectB:name}}"
- The prompt is parsed, and for every subjectX there will be an input field for the type of subject (user or context) plus additional information for the user to fill out
- Context and user might have different attributes. If you use an attribute that does not exist for the subject the UI should throw an error
- A prompt might also want to specify the date range of messages to consider (e.g. last 30 days)
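A sketch of how the {{subjectX:attribute}} placeholders could be extracted before rendering the input fields; the regex and return shape are illustrative:

// Extract subject placeholders such as {{subjectA:name}} from a prompt template
function parseSubjects(template: string): { subject: string; attribute: string }[] {
  const matches = template.matchAll(/\{\{(subject[A-Za-z0-9]+):([^}]+)\}\}/g);
  return [...matches].map((m) => ({ subject: m[1], attribute: m[2] }));
}

// => [{ subject: "subjectA", attribute: "name" }, { subject: "subjectB", attribute: "name" }]
parseSubjects("Compare {{subjectA:name}} with {{subjectB:name}}");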
Models:
- Models are handled like clients
- You can browse models, model instances and create your own model
- A model requires a webhook url to be called when a report is created
- The webhook call is a POST with the prompt in the body
- A model requires a maximum token length that restricts the maximum length of the prompt
- A model might require additional data (e.g. an API key for ChatGPT)
- When a user creates an instance from a model by providing additional data we call it a model-instance
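What the webhook call could look like; the payload shape beyond the prompt field is an assumption:

// POST the (chunked) prompt to the model's webhook; payload fields beyond prompt are assumptions
async function callModelWebhook(webhookUrl: string, prompt: string): Promise<void> {
  const response = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  if (!response.ok) throw new Error(`Webhook failed with status ${response.status}`);
}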
Reports:
- The POST /reports endpoint should accept a list of prompts and a list of required subject combinations. It then creates the product of prompts and subject combinations (e.g. 3 prompts and 3 subject combinations create 9 reports)
- A report gets all of the data needed for the prompt. If it exceeds the maximum token length it will split the data into chunks.
- Before creating a report we should calculate the number of chunks. A prompt over the last 2 years with a lot of data creates a lot of chunks and therefore a lot of cost.
- When a report is created we call the webhook of the model with the chunked prompt.
- Every chunked prompt request has its own state.
- The state of the report is the state of all of the chunked prompts.
- Users can look at the result of all of the chunks individually.
- Users can run additional prompts on top of all of the results (e.g. from all of the results which one is the most convincing). This prompt can be added as a "Consolidation prompt" to the original prompt
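A sketch of the prompt x subject product and the up-front chunk estimate; token counting is reduced to a character heuristic here:

type SubjectCombination = Record<string, string>; // e.g. { subjectA: "USER#1", subjectB: "CONTEXT#7" }

// Cartesian product: 3 prompts x 3 subject combinations -> 9 reports
function planReports(prompts: string[], combinations: SubjectCombination[]) {
  return prompts.flatMap((prompt) => combinations.map((subjects) => ({ prompt, subjects })));
}

// Rough chunk estimate before creating the report; ~4 characters per token is only a heuristic
function estimateChunks(dataLength: number, maxTokens: number): number {
  return Math.ceil(dataLength / (maxTokens * 4));
}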