carbon-typescript-sdk

v0.2.53

Client for Carbon

Downloads

8,723

Carbon

Connect external data to LLMs, no matter the source.

Table of Contents

Installation

npm i carbon-typescript-sdk
pnpm i carbon-typescript-sdk
yarn add carbon-typescript-sdk

Getting Started

import { Carbon } from "carbon-typescript-sdk";

// Generally this is done in the backend to avoid exposing API key to the client

const carbonWithApiKey = new Carbon({
  apiKey: "API_KEY",
  customerId: "CUSTOMER_ID",
});

const accessToken = await carbonWithApiKey.auth.getAccessToken();

// Once an access token is obtained, it can be passed to the frontend
// and used to instantiate the SDK client without an API key

const carbon = new Carbon({
  accessToken: accessToken.data.access_token,
});

// use SDK as usual
const whiteLabeling = await carbon.auth.getWhiteLabeling();
// etc.

Reference

carbon.auth.getAccessToken

Get Access Token

🛠️ Usage

const getAccessTokenResponse = await carbon.auth.getAccessToken();

🔄 Return

TokenResponse

🌐 Endpoint

/auth/v1/access_token GET

🔙 Back to Table of Contents


carbon.auth.getWhiteLabeling

Returns whether or not the organization is white labeled and which integrations are white labeled

🛠️ Usage

const getWhiteLabelingResponse = await carbon.auth.getWhiteLabeling();

🔄 Return

WhiteLabelingResponse

🌐 Endpoint

/auth/v1/white_labeling GET

🔙 Back to Table of Contents


carbon.cRM.getAccount

Get Account

🛠️ Usage

const getAccountResponse = await carbon.cRM.getAccount({
  id: "id_example",
  dataSourceId: 1,
  includeRemoteData: false,
});

⚙️ Parameters

id: string
dataSourceId: number
includeRemoteData: boolean
includes: BaseIncludes[]

🔄 Return

Account

🌐 Endpoint

/integrations/data/crm/accounts/{id} GET

🔙 Back to Table of Contents


carbon.cRM.getAccounts

Get Accounts

🛠️ Usage

const getAccountsResponse = await carbon.cRM.getAccounts({
  data_source_id: 1,
  include_remote_data: false,
  order_dir: "asc",
  includes: [],
  order_by: "created_at",
});

⚙️ Parameters

data_source_id: number
include_remote_data: boolean
next_cursor: string
page_size: number
order_dir: OrderDirV2Nullable
includes: BaseIncludes[]
filters: AccountFilters
order_by: AccountsOrderByNullable

🔄 Return

AccountResponse

🌐 Endpoint

/integrations/data/crm/accounts POST

🔙 Back to Table of Contents


carbon.cRM.getContact

Get Contact

🛠️ Usage

const getContactResponse = await carbon.cRM.getContact({
  id: "id_example",
  dataSourceId: 1,
  includeRemoteData: false,
});

⚙️ Parameters

id: string
dataSourceId: number
includeRemoteData: boolean
includes: BaseIncludes[]

🔄 Return

Contact

🌐 Endpoint

/integrations/data/crm/contacts/{id} GET

🔙 Back to Table of Contents


carbon.cRM.getContacts

Get Contacts

🛠️ Usage

const getContactsResponse = await carbon.cRM.getContacts({
  data_source_id: 1,
  include_remote_data: false,
  order_dir: "asc",
  includes: [],
  order_by: "created_at",
});

⚙️ Parameters

data_source_id: number
include_remote_data: boolean
next_cursor: string
page_size: number
order_dir: OrderDirV2Nullable
includes: BaseIncludes[]
filters: ContactFilters
order_by: ContactsOrderByNullable

🔄 Return

ContactsResponse

🌐 Endpoint

/integrations/data/crm/contacts POST

🔙 Back to Table of Contents


carbon.cRM.getLead

Get Lead

🛠️ Usage

const getLeadResponse = await carbon.cRM.getLead({
  id: "id_example",
  dataSourceId: 1,
  includeRemoteData: false,
});

⚙️ Parameters

id: string
dataSourceId: number
includeRemoteData: boolean
includes: BaseIncludes[]

🔄 Return

Lead

🌐 Endpoint

/integrations/data/crm/leads/{id} GET

🔙 Back to Table of Contents


carbon.cRM.getLeads

Get Leads

🛠️ Usage

const getLeadsResponse = await carbon.cRM.getLeads({
  data_source_id: 1,
  include_remote_data: false,
  order_dir: "asc",
  includes: [],
  order_by: "created_at",
});

⚙️ Parameters

data_source_id: number
include_remote_data: boolean
next_cursor: string
page_size: number
order_dir: OrderDirV2Nullable
includes: BaseIncludes[]
filters: LeadFilters
order_by: LeadsOrderByNullable

🔄 Return

LeadsResponse

🌐 Endpoint

/integrations/data/crm/leads POST

🔙 Back to Table of Contents


carbon.cRM.getOpportunities

Get Opportunities

🛠️ Usage

const getOpportunitiesResponse = await carbon.cRM.getOpportunities({
  data_source_id: 1,
  include_remote_data: false,
  order_dir: "asc",
  includes: [],
  order_by: "created_at",
});

⚙️ Parameters

data_source_id: number
include_remote_data: boolean
next_cursor: string
page_size: number
order_dir: OrderDirV2Nullable
includes: BaseIncludes[]
filters: OpportunityFilters
order_by: OpportunitiesOrderByNullable

🔄 Return

OpportunitiesResponse

🌐 Endpoint

/integrations/data/crm/opportunities POST

🔙 Back to Table of Contents


carbon.cRM.getOpportunity

Get Opportunity

🛠️ Usage

const getOpportunityResponse = await carbon.cRM.getOpportunity({
  id: "id_example",
  dataSourceId: 1,
  includeRemoteData: false,
});

⚙️ Parameters

id: string
dataSourceId: number
includeRemoteData: boolean
includes: BaseIncludes[]

🔄 Return

Opportunity

🌐 Endpoint

/integrations/data/crm/opportunities/{id} GET

🔙 Back to Table of Contents


carbon.dataSources.addTags

Add Data Source Tags

🛠️ Usage

const addTagsResponse = await carbon.dataSources.addTags({
  tags: {},
  data_source_id: 1,
});

⚙️ Parameters

tags: object
data_source_id: number

🔄 Return

OrganizationUserDataSourceAPI

🌐 Endpoint

/data_sources/tags/add POST

🔙 Back to Table of Contents


carbon.dataSources.query

Data Sources

🛠️ Usage

const queryResponse = await carbon.dataSources.query({
  order_by: "created_at",
  order_dir: "desc",
});

⚙️ Parameters

pagination: Pagination
order_by: OrganizationUserDataSourceOrderByColumns
order_dir: OrderDir
filters: OrganizationUserDataSourceFilters

🔄 Return

OrganizationUserDataSourceResponse

🌐 Endpoint

/data_sources POST

🔙 Back to Table of Contents


carbon.dataSources.queryUserDataSources

User Data Sources

🛠️ Usage

const queryUserDataSourcesResponse =
  await carbon.dataSources.queryUserDataSources({
    order_by: "created_at",
    order_dir: "desc",
  });

⚙️ Parameters

pagination: Pagination
order_by: OrganizationUserDataSourceOrderByColumns
order_dir: OrderDir
filters: OrganizationUserDataSourceFilters

🔄 Return

OrganizationUserDataSourceResponse

🌐 Endpoint

/user_data_sources POST

🔙 Back to Table of Contents


carbon.dataSources.removeTags

Remove Data Source Tags

🛠️ Usage

const removeTagsResponse = await carbon.dataSources.removeTags({
  data_source_id: 1,
  tags_to_remove: [],
  remove_all_tags: false,
});

⚙️ Parameters

data_source_id: number
tags_to_remove: string[]
remove_all_tags: boolean

🔄 Return

OrganizationUserDataSourceAPI

🌐 Endpoint

/data_sources/tags/remove POST

🔙 Back to Table of Contents


carbon.dataSources.revokeAccessToken

Revoke Access Token

🛠️ Usage

const revokeAccessTokenResponse = await carbon.dataSources.revokeAccessToken({
  data_source_id: 1,
});

⚙️ Parameters

data_source_id: number

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/revoke_access_token POST

🔙 Back to Table of Contents


carbon.embeddings.getDocuments

For pre-filtering documents, using tags_v2 is preferred to using tags (which is now deprecated). If both tags_v2 and tags are specified, tags is ignored. tags_v2 enables building complex filters through the use of "AND", "OR", and negation logic. Take the below input as an example:

{
    "OR": [
        {
            "key": "subject",
            "value": "holy-bible",
            "negate": false
        },
        {
            "key": "person-of-interest",
            "value": "jesus christ",
            "negate": false
        },
        {
            "key": "genre",
            "value": "religion",
            "negate": true
        },
        {
            "AND": [
                {
                    "key": "subject",
                    "value": "tao-te-ching",
                    "negate": false
                },
                {
                    "key": "author",
                    "value": "lao-tzu",
                    "negate": false
                }
            ]
        }
    ]
}

In this case, files will be filtered such that:

  1. "subject" = "holy-bible" OR
  2. "person-of-interest" = "jesus christ" OR
  3. "genre" != "religion" OR
  4. "subject" = "tao-te-ching" AND "author" = "lao-tzu"

Note that the top level of the query must be either an "OR" or "AND" array. Currently, nesting is limited to 3. For tag blocks (those with "key", "value", and "negate" keys), the following typing rules apply:

  1. "key" isn't optional and must be a string
  2. "value" isn't optional and can be any or list[any]
  3. "negate" is optional and must be true or false. If present and true, then the filter block is negated in the resulting query. It is false by default.
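To make the filter semantics above concrete, here is a minimal local sketch that evaluates a tags_v2 style filter against an in-memory tag map. It is not Carbon's implementation: the names `TagBlock`, `TagFilter`, `FileTags`, and `matchesFilter` are hypothetical, and the matching rule for list-valued tags (membership) is an assumption made for illustration.

```typescript
// Local illustration only; these types and helpers are not part of the SDK.
type TagBlock = { key: string; value: unknown; negate?: boolean };
type TagFilter = TagBlock | { AND: TagFilter[] } | { OR: TagFilter[] };
type FileTags = Record<string, string | string[]>;

function matchesFilter(filter: TagFilter, tags: FileTags): boolean {
  if ("AND" in filter) {
    return filter.AND.every((child) => matchesFilter(child, tags));
  }
  if ("OR" in filter) {
    return filter.OR.some((child) => matchesFilter(child, tags));
  }
  // Assumption: a tag block matches when the file's tag equals the value,
  // or contains it when the file's tag is a list.
  const actual = tags[filter.key];
  const hit = Array.isArray(actual)
    ? actual.includes(String(filter.value))
    : actual === filter.value;
  // "negate" is optional and defaults to false.
  return filter.negate === true ? !hit : hit;
}

// The example filter from the text above.
const exampleFilter: TagFilter = {
  OR: [
    { key: "subject", value: "holy-bible", negate: false },
    { key: "person-of-interest", value: "jesus christ", negate: false },
    { key: "genre", value: "religion", negate: true },
    {
      AND: [
        { key: "subject", value: "tao-te-ching", negate: false },
        { key: "author", value: "lao-tzu", negate: false },
      ],
    },
  ],
};

// Matches through rule 4 even though "genre" is "religion".
const taoMatch = matchesFilter(exampleFilter, {
  subject: "tao-te-ching",
  author: "lao-tzu",
  genre: "religion",
});
```

Note that, under this reading, rule 3 alone admits any file whose "genre" is not "religion", including files with no "genre" tag at all.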

When querying embeddings, you can optionally specify the media_type parameter in your request. By default (if not set), it is equal to "TEXT". This means that the query will be performed over files that have been parsed as text (for now, this covers all files except image files). If it is equal to "IMAGE", the query will be performed over image files (for now, .jpg and .png files). You can think of this field as an additional filter on top of any filters set in file_ids and

When hybrid_search is set to true, a combination of keyword search and semantic search is used to rank and select candidate embeddings during information retrieval. By default, these search methods are weighted equally during the ranking process. To adjust the weight (or "importance") of each search method, use the hybrid_search_tuning_parameters property. The tuning parameters are:

  • weight_a: weight to assign to semantic search
  • weight_b: weight to assign to keyword search

You must ensure that sum(weight_a, weight_b,..., weight_n) for all n weights is equal to 1. The equality has an error tolerance of 0.001 to account for possible floating point issues.
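The weight constraint can be checked on the client before sending a request. The sketch below is a hypothetical helper (`weightsSumToOne` is not an SDK function) that applies the stated tolerance of 0.001; whether the server treats the boundary as inclusive is an assumption.

```typescript
// Tolerance taken from the description above.
const WEIGHT_SUM_TOLERANCE = 0.001;

// Hypothetical client-side check; not part of the SDK.
function weightsSumToOne(weights: Record<string, number>): boolean {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  return Math.abs(total - 1) <= WEIGHT_SUM_TOLERANCE;
}

// weight_a: semantic search, weight_b: keyword search.
const tuningParameters = { weight_a: 0.7, weight_b: 0.3 };
const tuningIsValid = weightsSumToOne(tuningParameters);
```

The tolerance is what lets sums such as 0.1 + 0.2 + 0.7, which are not exactly 1 in floating point, still pass.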

In order to use hybrid search for a customer across a set of documents, two flags need to be enabled:

  1. Use the /modify_user_configuration endpoint to enable sparse_vectors for the customer. The payload body for this request is below:
{
  "configuration_key_name": "sparse_vectors",
  "value": {
    "enabled": true
  }
}
  2. Make sure hybrid search is enabled for the documents you want to search across. For the /uploadfile endpoint, set the query parameter generate_sparse_vectors=true.
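As a rough sketch of the two steps, the snippet below only builds the request body and the upload query string; it makes no network call. The variable names are hypothetical, and how the requests are actually sent (base URL, headers, HTTP client) is left out because it is not described here.

```typescript
// Step 1: body for /modify_user_configuration, as given above.
const sparseVectorsConfig = {
  configuration_key_name: "sparse_vectors",
  value: { enabled: true },
};
const sparseVectorsBody = JSON.stringify(sparseVectorsConfig);

// Step 2: query string for /uploadfile; URLSearchParams handles the encoding.
const uploadQuery = new URLSearchParams({ generate_sparse_vectors: "true" });
const uploadPath = `/uploadfile?${uploadQuery.toString()}`;
```

When uploading through this SDK, the same flag appears as generateSparseVectors on carbon.files.upload.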

Carbon supports multiple models for generating file embeddings. For images, we support Vertex AI's multimodal model; for text, we support OpenAI's text-embedding-ada-002 and Cohere's embed-multilingual-v3.0. The model can be specified via the embedding_model parameter (in the POST body for /embeddings, and as a query parameter in /uploadfile). If no model is supplied, text-embedding-ada-002 is used by default.

When performing embedding queries, only embeddings from files that used the specified model are considered. For example, if files A and B have embeddings generated with OPENAI, and files C and D have embeddings generated with COHERE_MULTILINGUAL_V3, then by default, queries will only consider files A and B. If COHERE_MULTILINGUAL_V3 is specified as the embedding_model in /embeddings, then only files C and D will be considered. Make sure that all files you want considered for a query have embeddings generated with the same model.

For now, do not set VERTEX_MULTIMODAL as an embedding_model. This model is used automatically by Carbon when it detects an image file.
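The A/B/C/D example can be modelled locally as a simple filter. This is only an illustration of the documented behavior, not how Carbon selects files internally; `EmbeddedFile` and `filesConsidered` are hypothetical names.

```typescript
type TextEmbeddingModel = "OPENAI" | "COHERE_MULTILINGUAL_V3";

// Hypothetical local record of which model embedded each file.
interface EmbeddedFile {
  name: string;
  embedding_model: TextEmbeddingModel;
}

// Only files embedded with the queried model are considered;
// OPENAI is the default when no model is given.
function filesConsidered(
  files: EmbeddedFile[],
  embeddingModel: TextEmbeddingModel = "OPENAI",
): string[] {
  return files
    .filter((file) => file.embedding_model === embeddingModel)
    .map((file) => file.name);
}

const uploadedFiles: EmbeddedFile[] = [
  { name: "A", embedding_model: "OPENAI" },
  { name: "B", embedding_model: "OPENAI" },
  { name: "C", embedding_model: "COHERE_MULTILINGUAL_V3" },
  { name: "D", embedding_model: "COHERE_MULTILINGUAL_V3" },
];

const defaultQueryFiles = filesConsidered(uploadedFiles);
const cohereQueryFiles = filesConsidered(uploadedFiles, "COHERE_MULTILINGUAL_V3");
```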

🛠️ Usage

const getDocumentsResponse = await carbon.embeddings.getDocuments({
  query: "query_example",
  k: 1,
  include_all_children: false,
  media_type: "TEXT",
  embedding_model: "OPENAI",
  include_file_level_metadata: false,
  high_accuracy: false,
  exclude_cold_storage_files: false,
});

⚙️ Parameters

query: string

Query for which to get related chunks and embeddings.

k: number

Number of related chunks to return.

tags: Record<string, Tags1>

A set of tags to limit the search to. Deprecated and may be removed in the future.

query_vector: number[]

Optional query vector for which to get related chunks and embeddings. It must have been generated by the same model used to generate the embeddings across which the search is being conducted. Cannot provide both query and query_vector.

file_ids: number[]

Optional list of file IDs to limit the search to

parent_file_ids: number[]

Optional list of parent file IDs to limit the search to. A parent file describes a file to which another file belongs (e.g. a folder)

include_all_children: boolean

Flag to control whether or not to include all children of filtered files in the embedding search.

tags_v2: object

A set of tags to limit the search to. Use this instead of tags, which is deprecated.

include_tags: boolean

Flag to control whether or not to include tags for each chunk in the response.

include_vectors: boolean

Flag to control whether or not to include embedding vectors in the response.

include_raw_file: boolean

Flag to control whether or not to include a signed URL to the raw file containing each chunk in the response.

hybrid_search: boolean

Flag to control whether or not to perform hybrid search.

hybrid_search_tuning_parameters: HybridSearchTuningParamsNullable
media_type: FileContentTypesNullable
embedding_model: EmbeddingGeneratorsNullable
include_file_level_metadata: boolean

Flag to control whether or not to include file-level metadata in the response. This metadata will be included in the content_metadata field of each document along with chunk/embedding level metadata.

high_accuracy: boolean

Flag to control whether or not to perform a high accuracy embedding search. By default, this is set to false. If true, the search may return more accurate results, but may take longer to complete.

rerank: RerankParamsNullable
file_types_at_source: AutoSyncedSourceTypesPropertyInner[]

Filter files based on their type at the source (for example help center tickets and articles)

exclude_cold_storage_files: boolean

Flag to control whether or not to exclude files that are not in hot storage. If set to false, an error will be returned if any filtered files are in cold storage.

🔄 Return

DocumentResponseList

🌐 Endpoint

/embeddings POST

🔙 Back to Table of Contents


carbon.embeddings.getEmbeddingsAndChunks

Retrieve Embeddings And Content

🛠️ Usage

const getEmbeddingsAndChunksResponse =
  await carbon.embeddings.getEmbeddingsAndChunks({
    order_by: "created_at",
    order_dir: "desc",
    filters: {
      user_file_id: 1,
      embedding_model: "OPENAI",
    },
    include_vectors: false,
  });

⚙️ Parameters

filters: EmbeddingsAndChunksFilters
pagination: Pagination
order_by: EmbeddingsAndChunksOrderByColumns
order_dir: OrderDir
include_vectors: boolean

🔄 Return

EmbeddingsAndChunksResponse

🌐 Endpoint

/text_chunks POST

🔙 Back to Table of Contents


carbon.embeddings.list

Retrieve Embeddings And Content V2

🛠️ Usage

const listResponse = await carbon.embeddings.list({
  order_by: "created_at",
  order_dir: "desc",
  filters: {
    include_all_children: false,
    non_synced_only: false,
  },
  include_vectors: false,
});

⚙️ Parameters

filters: OrganizationUserFilesToSyncFilters
pagination: Pagination
order_by: OrganizationUserFilesToSyncOrderByTypes
order_dir: OrderDir
include_vectors: boolean

🔄 Return

EmbeddingsAndChunksResponse

🌐 Endpoint

/list_chunks_and_embeddings POST

🔙 Back to Table of Contents


carbon.embeddings.uploadChunksAndEmbeddings

Upload Chunks And Embeddings

🛠️ Usage

const uploadChunksAndEmbeddingsResponse =
  await carbon.embeddings.uploadChunksAndEmbeddings({
    embedding_model: "OPENAI",
    chunks_and_embeddings: [
      {
        file_id: 1,
        chunks_and_embeddings: [
          {
            chunk_number: 1,
            chunk: "chunk_example",
          },
        ],
      },
    ],
    overwrite_existing: false,
    chunks_only: false,
  });

⚙️ Parameters

embedding_model: EmbeddingGenerators
chunks_and_embeddings: SingleChunksAndEmbeddingsUploadInput[]
overwrite_existing: boolean
chunks_only: boolean
custom_credentials: { [key: string]: object; }

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/upload_chunks_and_embeddings POST

🔙 Back to Table of Contents


carbon.files.createUserFileTags

A tag is a key-value pair that can be added to a file. This pair can then be used for searches (e.g. embedding searches) in order to narrow down the scope of the search. A file can have any number of tags. The following are reserved keys that cannot be used:

  • db_embedding_id
  • organization_id
  • user_id
  • organization_user_file_id

Carbon currently supports two data types for tag values - string and list<string>. Keys can only be string. If values other than string and list<string> are used, they're automatically converted to strings (e.g. 4 will become "4").
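The conversion rule can be mirrored on the client so the tags you send already have the shape you will get back. The sketch below is a hypothetical helper, not an SDK function; it rejects the reserved keys locally and limits inputs to strings, string lists, numbers, and booleans, since the handling of other value shapes is not described here.

```typescript
// Reserved keys listed above.
const RESERVED_TAG_KEYS = new Set([
  "db_embedding_id",
  "organization_id",
  "user_id",
  "organization_user_file_id",
]);

type RawTagValue = string | string[] | number | boolean;

// Hypothetical pre-flight normalization; not part of the SDK.
function normalizeTags(
  raw: Record<string, RawTagValue>,
): Record<string, string | string[]> {
  const normalized: Record<string, string | string[]> = {};
  for (const [key, value] of Object.entries(raw)) {
    if (RESERVED_TAG_KEYS.has(key)) {
      throw new Error(`"${key}" is a reserved tag key`);
    }
    // Strings and string lists pass through; other values become strings.
    normalized[key] =
      typeof value === "string" || Array.isArray(value) ? value : String(value);
  }
  return normalized;
}

const normalizedTags = normalizeTags({ chapter: 4, topics: ["a", "b"] });
```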

🛠️ Usage

const createUserFileTagsResponse = await carbon.files.createUserFileTags({
  tags: {
    key: "string_example",
  },
  organization_user_file_id: 1,
});

⚙️ Parameters

tags: Record<string, Tags1>
organization_user_file_id: number

🔄 Return

UserFile

🌐 Endpoint

/create_user_file_tags POST

🔙 Back to Table of Contents


carbon.files.deleteFileTags

Delete File Tags

🛠️ Usage

const deleteFileTagsResponse = await carbon.files.deleteFileTags({
  tags: ["tags_example"],
  organization_user_file_id: 1,
});

⚙️ Parameters

tags: string[]
organization_user_file_id: number

🔄 Return

UserFile

🌐 Endpoint

/delete_user_file_tags POST

🔙 Back to Table of Contents


carbon.files.deleteMany

Deprecated

Delete Files Endpoint

🛠️ Usage

const deleteManyResponse = await carbon.files.deleteMany({
  delete_non_synced_only: false,
  send_webhook: false,
  delete_child_files: false,
});

⚙️ Parameters

file_ids: number[]
sync_statuses: ExternalFileSyncStatuses[]
delete_non_synced_only: boolean
send_webhook: boolean
delete_child_files: boolean

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/delete_files POST

🔙 Back to Table of Contents


carbon.files.deleteV2

Delete Files V2 Endpoint

🛠️ Usage

const deleteV2Response = await carbon.files.deleteV2({
  send_webhook: false,
  preserve_file_record: false,
});

⚙️ Parameters

filters: OrganizationUserFilesToSyncFilters
send_webhook: boolean
preserve_file_record: boolean

Whether or not to delete all data related to the file from the database, BUT to preserve the file metadata, allowing for resyncs. By default preserve_file_record is false, which means that all data related to the file as well as its metadata will be deleted. Note that even if preserve_file_record is true, raw files uploaded via the uploadfile endpoint still cannot be resynced.

🔄 Return

GenericSuccessResponse

🌐 Endpoint

/delete_files_v2 POST

🔙 Back to Table of Contents


carbon.files.getParsedFile

Deprecated

This route is deprecated. Use /user_files_v2 instead.

🛠️ Usage

const getParsedFileResponse = await carbon.files.getParsedFile({
  fileId: 1,
});

⚙️ Parameters

fileId: number

🔄 Return

PresignedURLResponse

🌐 Endpoint

/parsed_file/{file_id} GET

🔙 Back to Table of Contents


carbon.files.getRawFile

Deprecated

This route is deprecated. Use /user_files_v2 instead.

🛠️ Usage

const getRawFileResponse = await carbon.files.getRawFile({
  fileId: 1,
});

⚙️ Parameters

fileId: number

🔄 Return

PresignedURLResponse

🌐 Endpoint

/raw_file/{file_id} GET

🔙 Back to Table of Contents


carbon.files.modifyColdStorageParameters

Modify Cold Storage Parameters

🛠️ Usage

const modifyColdStorageParametersResponse =
  await carbon.files.modifyColdStorageParameters({});

⚙️ Parameters

filters: OrganizationUserFilesToSyncFilters
enable_cold_storage: boolean
hot_storage_time_to_live: number

🌐 Endpoint

/modify_cold_storage_parameters POST

🔙 Back to Table of Contents


carbon.files.moveToHotStorage

Move To Hot Storage

🛠️ Usage

const moveToHotStorageResponse = await carbon.files.moveToHotStorage({});

⚙️ Parameters

filters: OrganizationUserFilesToSyncFilters

🌐 Endpoint

/move_to_hot_storage POST

🔙 Back to Table of Contents


carbon.files.queryUserFiles

For pre-filtering documents, using tags_v2 is preferred to using tags (which is now deprecated). If both tags_v2 and tags are specified, tags is ignored. tags_v2 enables building complex filters through the use of "AND", "OR", and negation logic. Take the below input as an example:

{
    "OR": [
        {
            "key": "subject",
            "value": "holy-bible",
            "negate": false
        },
        {
            "key": "person-of-interest",
            "value": "jesus christ",
            "negate": false
        },
        {
            "key": "genre",
            "value": "religion",
            "negate": true
        },
        {
            "AND": [
                {
                    "key": "subject",
                    "value": "tao-te-ching",
                    "negate": false
                },
                {
                    "key": "author",
                    "value": "lao-tzu",
                    "negate": false
                }
            ]
        }
    ]
}

In this case, files will be filtered such that:

  1. "subject" = "holy-bible" OR
  2. "person-of-interest" = "jesus christ" OR
  3. "genre" != "religion" OR
  4. "subject" = "tao-te-ching" AND "author" = "lao-tzu"

Note that the top level of the query must be either an "OR" or "AND" array. Currently, nesting is limited to 3. For tag blocks (those with "key", "value", and "negate" keys), the following typing rules apply:

  1. "key" isn't optional and must be a string
  2. "value" isn't optional and can be any or list[any]
  3. "negate" is optional and must be true or false. If present and true, then the filter block is negated in the resulting query. It is false by default.
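The typing rules above lend themselves to a small client-side check before a filter is sent. The following is a sketch under stated assumptions: `validateTagsV2` is a hypothetical helper, not an SDK function, and it does not enforce the nesting limit because the way levels are counted is not specified here.

```typescript
// Returns a list of problems; an empty list means the filter passed.
function validateTagsV2(filter: unknown, isTopLevel = true): string[] {
  if (typeof filter !== "object" || filter === null || Array.isArray(filter)) {
    return ["filter must be an object"];
  }
  const node = filter as Record<string, unknown>;

  // Group node: an "AND" or "OR" array of nested filters.
  const groupKey = "AND" in node ? "AND" : "OR" in node ? "OR" : null;
  if (groupKey !== null) {
    const children = node[groupKey];
    if (!Array.isArray(children)) {
      return [`"${groupKey}" must be an array`];
    }
    const errors: string[] = [];
    for (const child of children) {
      errors.push(...validateTagsV2(child, false));
    }
    return errors;
  }

  // Tag block: "key" and "value" required, "negate" optional boolean.
  const errors: string[] = [];
  if (isTopLevel) {
    errors.push('top level must be an "AND" or "OR" array');
  }
  if (typeof node["key"] !== "string") {
    errors.push('"key" is required and must be a string');
  }
  if (!("value" in node)) {
    errors.push('"value" is required');
  }
  if ("negate" in node && typeof node["negate"] !== "boolean") {
    errors.push('"negate" must be true or false');
  }
  return errors;
}

const sampleErrors = validateTagsV2({
  OR: [
    { key: "subject", value: "holy-bible" },
    { AND: [{ key: "author", value: "lao-tzu", negate: false }] },
  ],
});
```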

🛠️ Usage

const queryUserFilesResponse = await carbon.files.queryUserFiles({
  order_by: "created_at",
  order_dir: "desc",
  presigned_url_expiry_time_seconds: 3600,
});

⚙️ Parameters

pagination: Pagination
order_by: OrganizationUserFilesToSyncOrderByTypes
order_dir: OrderDir
filters: OrganizationUserFilesToSyncFilters
include_raw_file: boolean

If true, the query will return presigned URLs for the raw file. Only relevant for the /user_files_v2 endpoint.

include_parsed_text_file: boolean

If true, the query will return presigned URLs for the parsed text file. Only relevant for the /user_files_v2 endpoint.

include_additional_files: boolean

If true, the query will return presigned URLs for additional files. Only relevant for the /user_files_v2 endpoint.

presigned_url_expiry_time_seconds: number

The expiry time for the presigned URLs. Only relevant for the /user_files_v2 endpoint.

🔄 Return

UserFilesV2

🌐 Endpoint

/user_files_v2 POST

🔙 Back to Table of Contents


carbon.files.queryUserFilesDeprecated

Deprecated

This route is deprecated. Use /user_files_v2 instead.

🛠️ Usage

const queryUserFilesDeprecatedResponse =
  await carbon.files.queryUserFilesDeprecated({
    order_by: "created_at",
    order_dir: "desc",
    presigned_url_expiry_time_seconds: 3600,
  });

⚙️ Parameters

pagination: Pagination
order_by: OrganizationUserFilesToSyncOrderByTypes
order_dir: OrderDir
filters: OrganizationUserFilesToSyncFilters
include_raw_file: boolean

If true, the query will return presigned URLs for the raw file. Only relevant for the /user_files_v2 endpoint.

include_parsed_text_file: boolean

If true, the query will return presigned URLs for the parsed text file. Only relevant for the /user_files_v2 endpoint.

include_additional_files: boolean

If true, the query will return presigned URLs for additional files. Only relevant for the /user_files_v2 endpoint.

presigned_url_expiry_time_seconds: number

The expiry time for the presigned URLs. Only relevant for the /user_files_v2 endpoint.

🔄 Return

UserFile

🌐 Endpoint

/user_files POST

🔙 Back to Table of Contents


carbon.files.resync

Resync File

🛠️ Usage

const resyncResponse = await carbon.files.resync({
  file_id: 1,
  force_embedding_generation: false,
  skip_file_processing: false,
});

⚙️ Parameters

file_id: number
chunk_size: number
chunk_overlap: number
force_embedding_generation: boolean
skip_file_processing: boolean

🔄 Return

UserFile

🌐 Endpoint

/resync_file POST

🔙 Back to Table of Contents


carbon.files.upload

This endpoint is used to directly upload local files to Carbon. The POST request should be a multipart form request. Note that the set_page_as_boundary query parameter is applicable only to PDFs for now. When this value is set, PDF chunks are at most one page long. In return, additional information can be retrieved for each chunk, namely the coordinates of the bounding box around the chunk (this can be used for things like text highlighting). The following query parameters are available:

  • chunk_size: the chunk size (in tokens) applied when splitting the document
  • chunk_overlap: the chunk overlap (in tokens) applied when splitting the document
  • skip_embedding_generation: whether or not to skip the generation of chunks and embeddings
  • set_page_as_boundary: described above
  • embedding_model: the model used to generate embeddings for the document chunks
  • use_ocr: whether or not to use OCR as a preprocessing step prior to generating chunks. Valid for PDFs, JPEGs, and PNGs
  • generate_sparse_vectors: whether or not to generate sparse vectors for the file. Required for hybrid search.
  • prepend_filename_to_chunks: whether or not to prepend the filename to the chunk text
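To show how chunk_size and chunk_overlap interact, here is a generic sliding-window sketch. It is not Carbon's splitter, whose details are not described here: `chunkTokens` is a hypothetical helper, and plain words stand in for real tokens.

```typescript
// Splits a token list into windows of chunkSize that share chunkOverlap
// tokens with the previous window.
function chunkTokens(
  tokens: string[],
  chunkSize: number,
  chunkOverlap: number,
): string[][] {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("need chunkSize > 0 and 0 <= chunkOverlap < chunkSize");
  }
  const step = chunkSize - chunkOverlap;
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize));
    // Stop once the current window already reaches the end of the input.
    if (start + chunkSize >= tokens.length) {
      break;
    }
  }
  return chunks;
}

// Words used as stand-in tokens (an assumption for illustration).
const sampleTokens = "a b c d e f g".split(" ");
const sampleChunks = chunkTokens(sampleTokens, 4, 2);
```

With a size of 4 and an overlap of 2, each chunk starts two tokens after the previous one, so neighbouring chunks share two tokens.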

Carbon supports multiple models for generating file embeddings. For images, we support Vertex AI's multimodal model; for text, we support OpenAI's text-embedding-ada-002 and Cohere's embed-multilingual-v3.0. The model can be specified via the embedding_model parameter (in the POST body for /embeddings, and as a query parameter in /uploadfile). If no model is supplied, text-embedding-ada-002 is used by default.

When performing embedding queries, only embeddings from files that used the specified model are considered. For example, if files A and B have embeddings generated with OPENAI, and files C and D have embeddings generated with COHERE_MULTILINGUAL_V3, then by default, queries will only consider files A and B. If COHERE_MULTILINGUAL_V3 is specified as the embedding_model in /embeddings, then only files C and D will be considered. Make sure that all files you want considered for a query have embeddings generated with the same model.

For now, do not set VERTEX_MULTIMODAL as an embedding_model. This model is used automatically by Carbon when it detects an image file.

🛠️ Usage

import * as fs from "fs";

const uploadResponse = await carbon.files.upload({
  skipEmbeddingGeneration: false,
  setPageAsBoundary: false,
  useOcr: false,
  generateSparseVectors: false,
  prependFilenameToChunks: false,
  parsePdfTablesWithOcr: false,
  detectAudioLanguage: false,
  transcriptionService: "assemblyai",
  includeSpeakerLabels: false,
  mediaType: "TEXT",
  splitRows: false,
  enableColdStorage: false,
  generateChunksOnly: false,
  storeFileOnly: false,
  file: fs.readFileSync("/path/to/file"),
});

⚙️ Parameters

file: Uint8Array | File | buffer.File
chunkSize: number

Chunk size in tiktoken tokens to be used when processing file.

chunkOverlap: number

Chunk overlap in tiktoken tokens to be used when processing file.

skipEmbeddingGeneration: boolean

Flag to control whether or not embeddings should be generated and stored when processing file.

setPageAsBoundary: boolean

Flag to control whether or not to set a page's worth of content as the maximum amount of content that can appear in a chunk. Only valid for PDFs. See the route description for more information.

embeddingModel: EmbeddingModel

Embedding model that will be used to embed file chunks.

useOcr: boolean

Whether or not to use OCR when processing files. Valid for PDFs, JPEGs, and PNGs. Useful for documents with tables, images, and/or scanned text.

generateSparseVectors: boolean

Whether or not to generate sparse vectors for the file. This is required for the file to be a candidate for hybrid search.

prependFilenameToChunks: boolean

Whether or not to prepend the file's name to chunks.

maxItemsPerChunk: number

Number of objects per chunk. For csv, tsv, xlsx, and json files only.

parsePdfTablesWithOcr: boolean

Whether to use rich table parsing when use_ocr is enabled.

detectAudioLanguage: boolean

Whether to automatically detect the language of the uploaded audio file.

transcriptionService: TranscriptionServiceNullable

The transcription service to use for audio files. If no service is specified, 'deepgram' will be used.

includeSpeakerLabels: boolean

Detect multiple speakers and label segments of speech by speaker for audio files.

mediaType: FileContentTypesNullable

The media type of the file. If not provided, it will be inferred from the file extension.

splitRows: boolean

Whether to split tabular rows into chunks. Currently only valid for CSV, TSV, and XLSX files.

enableColdStorage: boolean

Enable cold storage for the file. If set to true, the file will be moved to cold storage after a certain period of inactivity. Default is false.

hotStorageTimeToLive: number

Time in days after which the file will be moved to cold storage. Must be one of [1, 3, 7, 14, 30].
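Because only a fixed set of values is accepted, it can help to check hotStorageTimeToLive before uploading. The helper below is a hypothetical client-side guard, not an SDK function; the allowed values are the ones listed above.

```typescript
// Allowed values, in days, as listed above.
const ALLOWED_HOT_STORAGE_TTL_DAYS = [1, 3, 7, 14, 30] as const;
type HotStorageTtlDays = (typeof ALLOWED_HOT_STORAGE_TTL_DAYS)[number];

// Type guard so a validated value can be passed on with a narrowed type.
function isValidHotStorageTtl(days: number): days is HotStorageTtlDays {
  return (ALLOWED_HOT_STORAGE_TTL_DAYS as readonly number[]).includes(days);
}

const requestedTtl = 7;
const ttlIsValid = isValidHotStorageTtl(requestedTtl);
```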

generateChunksOnly: boolean

If this flag is enabled, the file will be chunked and stored with Carbon, but no embeddings will be generated. This overrides the skip_embedding_generation flag.

storeFileOnly: boolean

If this flag is enabled, the file will be stored with Carbon, but no processing will be done.

🔄 Return

UserFile

🌐 Endpoint

/uploadfile POST

🔙 Back to Table of Contents


carbon.files.uploadFromUrl

Create Upload File From Url

🛠️ Usage

const uploadFromUrlResponse = await carbon.files.uploadFromUrl({
  url: "url_example",
  skip_embedding_generation: false,
  set_page_as_boundary: false,
  embedding_model: "OPENAI",
  generate_sparse_vectors: false,
  use_textract: false,
  prepend_filename_to_chunks: false,
  parse_pdf_tables_with_ocr: false,
  detect_audio_language: false,
  transcription_service: "assemblyai",
  include_speaker_labels: false,
  media_type: "TEXT",
  split_rows: false,
  generate_chunks_only: false,
  store_file_only: false,
});

⚙️ Parameters

url: string
file_name: string
chunk_size: number
chunk_overlap: number
skip_embedding_generation: boolean
set_page_as_boundary: boolean
embedding_model: EmbeddingGenerators
generate_sparse_vectors: boolean
use_textract: boolean
prepend_filename_to_chunks: boolean
max_items_per_chunk: number

Number of objects per chunk. For csv, tsv, xlsx, and json files only.

parse_pdf_tables_with_ocr: boolean
detect_audio_language: boolean
transcription_service: TranscriptionServiceNullable
include_speaker_labels: boolean
media_type: FileContentTypesNullable
split_rows: boolean
cold_storage_params: ColdStorageProps
generate_chunks_only: boolean

If this flag is enabled, the file will be chunked and stored with Carbon, but no embeddings will be generated. This overrides the skip_embedding_generation flag.

store_file_only: boolean

If this flag is enabled, the file will be stored with Carbon, but no processing will be done.
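
The interaction between the processing flags can be summarized in a small helper. This is a sketch only: ProcessingFlags, ProcessingMode, and describeProcessing are hypothetical names, and the precedence of store_file_only over generate_chunks_only is an assumption, since the descriptions above only state that generate_chunks_only overrides skip_embedding_generation.

```typescript
// Hypothetical labels for what happens to an uploaded file.
type ProcessingMode = "store_only" | "chunks_only" | "skip_embeddings" | "full";

// Only the three flags discussed above; the real request accepts many more fields.
interface ProcessingFlags {
  skip_embedding_generation?: boolean;
  generate_chunks_only?: boolean;
  store_file_only?: boolean;
}

function describeProcessing(flags: ProcessingFlags): ProcessingMode {
  // Assumption: store_file_only wins, because it disables all processing.
  if (flags.store_file_only) return "store_only";
  // Documented: generate_chunks_only overrides skip_embedding_generation.
  if (flags.generate_chunks_only) return "chunks_only";
  if (flags.skip_embedding_generation) return "skip_embeddings";
  return "full";
}
```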

🔄 Return

UserFile

🌐 Endpoint

/upload_file_from_url POST

🔙 Back to Table of Contents


carbon.files.uploadText

Carbon supports multiple models for use in generating embeddings for files. For images, we support Vertex AI's multimodal model; for text, we support OpenAI's text-embedding-ada-002 and Cohere's embed-multilingual-v3.0.

The model can be specified via the embedding_model parameter (in the POST body for /embeddings, and as a query parameter in /uploadfile). If no model is supplied, text-embedding-ada-002 is used by default.

When performing embedding queries, only embeddings from files that used the specified model will be considered in the query. For example, if files A and B have embeddings generated with OPENAI, and files C and D have embeddings generated with COHERE_MULTILINGUAL_V3, then by default, queries will only consider files A and B. If COHERE_MULTILINGUAL_V3 is specified as the embedding_model in /embeddings, then only files C and D will be considered. Make sure that all files you want considered for a query have embeddings generated via the same model.

For now, do not set VERTEX_MULTIMODAL as an embedding_model. This model is used automatically by Carbon when it detects an image file.
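
The model-matching rule described above can be illustrated with a short, self-contained sketch. It does not call Carbon; EmbeddedFile and filesConsideredForQuery are hypothetical names that only model the rule that a query considers files whose embeddings were generated with the same model, with OPENAI as the default.

```typescript
// Text model identifiers named in the description above.
type EmbeddingModel = "OPENAI" | "COHERE_MULTILINGUAL_V3";

// Hypothetical record of which model produced a file's embeddings.
interface EmbeddedFile {
  name: string;
  embeddingModel: EmbeddingModel;
}

// Returns the files a query would consider; defaults to OPENAI when no model is given.
function filesConsideredForQuery(
  files: EmbeddedFile[],
  queryModel: EmbeddingModel = "OPENAI"
): EmbeddedFile[] {
  return files.filter((file) => file.embeddingModel === queryModel);
}

// The A/B/C/D example from the text.
const exampleFiles: EmbeddedFile[] = [
  { name: "A", embeddingModel: "OPENAI" },
  { name: "B", embeddingModel: "OPENAI" },
  { name: "C", embeddingModel: "COHERE_MULTILINGUAL_V3" },
  { name: "D", embeddingModel: "COHERE_MULTILINGUAL_V3" },
];

const consideredByDefault = filesConsideredForQuery(exampleFiles);
const consideredForCohere = filesConsideredForQuery(exampleFiles, "COHERE_MULTILINGUAL_V3");
```

Here consideredByDefault holds files A and B, and consideredForCohere holds files C and D, matching the example in the description.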

🛠️ Usage

const uploadTextResponse = await carbon.files.uploadText({
  contents: "contents_example",
  skip_embedding_generation: false,
  embedding_model: "OPENAI",
  generate_sparse_vectors: false,
  generate_chunks_only: false,
  store_file_only: false,
});

⚙️ Parameters

contents: string
name: string
chunk_size: number
chunk_overlap: number
skip_embedding_generation: boolean
overwrite_file_id: number
embedding_model: EmbeddingGeneratorsNullable
generate_sparse_vectors: boolean
cold_storage_params: ColdStorageProps
generate_chunks_only: boolean

If this flag is enabled, the file will be chunked and stored with Carbon, but no embeddings will be generated. This overrides the skip_embedding_generation flag.

store_file_only: boolean

If this flag is enabled, the file will be stored with Carbon, but no processing will be done.

🔄 Return

UserFile

🌐 Endpoint

/upload_text POST

🔙 Back to Table of Contents


carbon.github.getIssue

Issue

🛠️ Usage

const getIssueResponse = await carbon.github.getIssue({
  issueNumber: 1,
  includeRemoteData: false,
});

⚙️ Parameters

issueNumber: number
includeRemoteData: boolean
dataSourceId: number
repository: string

🔄 Return

Issue

🌐 Endpoint

/integrations/data/github/issues/{issue_number} GET

🔙 Back to Table of Contents


carbon.github.getIssues

Issues

🛠️ Usage

const getIssuesResponse = await carbon.github.getIssues({
  data_source_id: 1,
  include_remote_data: false,
  repository: "repository_example",
  page: 1,
  page_size: 30,
  order_by: "created",
  order_dir: "asc",
});

⚙️ Parameters

data_source_id: number
repository: string

Full name of the repository, denoted as {owner}/{repo}

include_remote_data: boolean
page: number
page_size: number
next_cursor: string
filters: IssuesFilter
order_by: IssuesOrderBy
order_dir: OrderDirV2Nullable
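
When listing issues across several pages, it can help to validate the repository name and derive each follow-up request from the previous one. The sketch below is not part of the SDK: IssuesQuery, IssuesPageLike, isFullRepositoryName, and nextIssuesQuery are hypothetical names, and it assumes the response exposes a next_cursor value that can be passed back as the next_cursor parameter.

```typescript
// Hypothetical, simplified request shape; the SDK's generated type has more fields.
interface IssuesQuery {
  repository: string;
  page_size?: number;
  next_cursor?: string;
}

// Assumed response field; check IssuesResponse for the actual shape.
interface IssuesPageLike {
  next_cursor?: string | null;
}

// Checks the {owner}/{repo} form: exactly one slash with non-empty parts on both sides.
function isFullRepositoryName(repository: string): boolean {
  return /^[^\/\s]+\/[^\/\s]+$/.test(repository);
}

// Builds the request for the following page, or returns undefined when no cursor is left.
function nextIssuesQuery(current: IssuesQuery, page: IssuesPageLike): IssuesQuery | undefined {
  if (!page.next_cursor) return undefined;
  return { ...current, next_cursor: page.next_cursor };
}
```

A caller could pass each query to carbon.github.getIssues and stop once nextIssuesQuery returns undefined.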

🔄 Return

IssuesResponse

🌐 Endpoint

/integrations/data/github/issues POST

🔙 Back to Table of Contents


carbon.github.getPr

Get Pr

🛠️ Usage

const getPrResponse = await carbon.github.getPr({
  pullNumber: 1,
  includeRemoteData: false,
});

⚙️ Parameters

pullNumber: number
includeRemoteData: boolean
dataSourceId: number
repository: string

🔄 Return

PullRequestExtended

🌐 Endpoint

/integrations/data/github/pull_requests/{pull_number} GET

[🔙 **Back to Table o