bakana

v3.0.1

Backend for kana's single-cell analyses. This supports single or multiple samples, execution in Node.js or the browser, in-memory caching of results for iterative analyses, and serialization to/from file for redistribution.

Backend for kana

Overview

bakana is the compute backend for the kana application. It implements a pipeline for routine single-cell RNA-seq analysis, starting from the count matrix and finishing with the usual results (markers, clusters, t-SNE and so on). Datasets involving multiple samples in one or more matrices can also be analyzed with blocking and batch correction. The pipeline can be executed both in the browser and on Node.js. It supports in-memory caching of the analysis state for fast iterative re-analysis, as well as serialization of the state for storage and distribution to other machines.

Getting started

Install the package from npm in the usual way:

npm install bakana

See the reference documentation for details on available functions.

Running analyses

We perform an analysis with the following commands:

import * as bakana from "bakana";
await bakana.initialize({ numberOfThreads: 8 }); 

let state = await bakana.createAnalysis();
let params = bakana.analysisDefaults();

await bakana.runAnalysis(state, 
    // Specify files using paths (Node.js) or File objects (browser).
    { my_data: new bakana.TenxHdf5Dataset("/some/file/path.h5") },
    params
);

Each step is represented by an instance of a *State class, stored as a property of state and holding that step's results. We can extract results from each step's state for further inspection.

state.rna_normalization.fetchNormalizedMatrix(); 
// ScranMatrix {}

state.rna_quality_control.fetchMetrics(); 
// PerCellRnaQcMetricsResults {}

state.rna_pca.fetchPCs(); 
// RunPCAResults {}

We can also supply a callback that processes summaries from each step as soon as it finishes. This can be useful for, e.g., posting diagnostics to another system once they become available.

function finishCallback(step) {
    if (state[step].changed) {
        console.log("Running " + step);
        // Do other stuff with the state's results...
    }
}

await bakana.runAnalysis(state, 
    { my_data: new bakana.TenxHdf5Dataset("/some/file/path.h5") },
    params,
    { finishFun: finishCallback }
);
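As a minimal sketch of the callback pattern above, the helper below builds a finishFun-style callback that records which steps reported a change. The name makeStepRecorder and the mock state are hypothetical and exist only for illustration; the only details taken from the example above are that the callback receives a step name and that each step's state has a changed property.

```javascript
// Hypothetical helper (not part of bakana): returns a callback that could be
// passed as finishFun, plus the list of step names that reported a change.
function makeStepRecorder(state) {
    const changedSteps = [];
    const callback = step => {
        // Each step's state exposes a 'changed' flag, as in the example above.
        if (state[step] && state[step].changed) {
            changedSteps.push(step);
        }
    };
    return { callback, changedSteps };
}

// Mock state standing in for the object returned by createAnalysis().
const mockState = {
    rna_quality_control: { changed: false },
    rna_pca: { changed: true }
};

// Simulate the pipeline reporting each step as it finishes.
const recorder = makeStepRecorder(mockState);
for (const step of Object.keys(mockState)) {
    recorder.callback(step);
}
console.log(recorder.changedSteps);
```

With a real analysis, recorder.callback would be supplied as the finishFun option and changedSteps inspected after runAnalysis() resolves.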

If the analysis is re-run with different parameters, bakana will only re-run the affected steps. This includes all steps downstream of any step with changed parameters.

params.rna_pca.num_pcs = 15;
await bakana.runAnalysis(state, 
    { my_data: new bakana.TenxHdf5Dataset("/some/file/path.h5") },
    params,
    { finishFun: finishCallback }
);
// Running rna_pca
// Running combine_embeddings
// Running batch_correction
// Running neighbor_index
// Running kmeans_cluster
// Running snn_graph_cluster
// ...
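The idea of re-running only a changed step and everything downstream of it can be sketched as below. This is an illustration of the general technique, not bakana's implementation: the dependency map is an assumption loosely based on the step names printed above, and affectedSteps is a hypothetical function.

```javascript
// Hypothetical dependency map: each step lists the steps it reads from.
// Illustration only; it does not reproduce bakana's real dependency graph.
const dependsOn = {
    rna_pca: [],
    combine_embeddings: ["rna_pca"],
    batch_correction: ["combine_embeddings"],
    neighbor_index: ["batch_correction"],
    kmeans_cluster: ["batch_correction"],
    snn_graph_cluster: ["neighbor_index"]
};

// Return every step that must be re-run when the given steps change.
// Steps are visited in declaration order, which is assumed to be a valid
// topological order (parents are declared before their children).
function affectedSteps(changed, graph) {
    const dirty = new Set(changed);
    for (const [step, parents] of Object.entries(graph)) {
        if (parents.some(p => dirty.has(p))) {
            dirty.add(step);
        }
    }
    return Object.keys(graph).filter(step => dirty.has(step));
}

const afterPcaChange = affectedSteps(["rna_pca"], dependsOn);
const afterIndexChange = affectedSteps(["neighbor_index"], dependsOn);
console.log(afterPcaChange);
console.log(afterIndexChange);
```

Under this assumed graph, changing rna_pca marks every listed step for re-execution, whereas changing only neighbor_index leaves the upstream steps cached.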

Saving results

Given an analysis state, we can dump its contents into a SingleCellExperiment for further examination:

await bakana.saveSingleCellExperiment(state, "sce", { directory: "output" });

This stores the data and results into various fields of the SingleCellExperiment:

  • The assays contain the sparse (QC-filtered) count matrix and its corresponding log-transformed normalized matrix as a DelayedArray.
  • The row data contains gene identifiers along with variance modelling and marker detection results in nested DataFrames.
  • The column data contains quality control metrics in nested DataFrames, along with other pieces like the clustering and blocking.
  • The reduced dimensions contain PCA, t-SNE and UMAP results.
  • The alternative experiments contain further nested SingleCellExperiments for other modalities.

We use the alabaster representation of a SingleCellExperiment to provide multi-language access to the results. For example, we can load the SingleCellExperiment back into an R session via the loadObject() function:

library(alabaster)
info <- acquireMetadata("output", "sce")
sce <- loadObject(info, "output")

Saving configurations

Given an analysis state, we can save its configuration via the serializeConfiguration() function. This returns an object that contains the analysis parameters, which can then be converted to JSON and saved to file for later use.

let saved = [];
let saveFileHandler = (k, f, file) => {
    saved.push(file.buffer());
    return String(saved.length);
};

let config = await bakana.serializeConfiguration(state, saveFileHandler);

Applications are responsible for deciding how to handle the input data files. In the example above, we just store the file contents in a saved array of Uint8Arrays, e.g., for inclusion in a tarball with the configuration JSON. More complex applications may create a staging directory on the file system in which to store the files (e.g., for Node.js), or may register the file contents in a database for later extraction.

These configurations can be used to create a new analysis state via the unserializeConfiguration() function. This will extract the parameters/data files and rerun the entire analysis via runAnalysis(), allowing us to recover the same analysis state that went into serializeConfiguration(). The example below uses a loading handler that just undoes the effect of saveFileHandler.

let loadFileHandler = id => saved[Number(id) - 1];
let reloaded = await bakana.unserializeConfiguration(config, loadFileHandler);
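To check that a pair of handlers really are inverses of each other, they can be exercised without running an analysis. The sketch below does this with a mock file object; mockFile is hypothetical and only mimics the buffer() method used above, and the first two handler arguments are passed as null placeholders because their meaning is not described in this section.

```javascript
// In-memory store, mirroring the 'saved' array from the example above.
const store = [];

// Same shape as saveFileHandler above: receives a file-like object exposing
// buffer(), stores its bytes and returns a string identifier.
const saveHandler = (k, f, file) => {
    store.push(file.buffer());
    return String(store.length);
};

// Inverse of saveHandler: maps the identifier back to the stored bytes.
const loadHandler = id => store[Number(id) - 1];

// Mock file object (assumption); real file objects are supplied by bakana.
const mockFile = { buffer: () => new Uint8Array([1, 2, 3]) };

// Round trip: save the mock file, then load it back by its identifier.
const fileId = saveHandler(null, null, mockFile);
const restored = loadHandler(fileId);
console.log(fileId, Array.from(restored));
```

The identifiers are 1-based strings, so the loading handler subtracts one before indexing into the array.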

Terminating analyses

Once a particular analysis is finished, we should free the resources of its state.

bakana.freeAnalysis(state);
bakana.freeAnalysis(reloaded);

If all analyses are complete, we can terminate the bakana session to release memory/workers.

bakana.terminate();
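One way to make sure a state is always freed, even when a step throws, is to wrap the work in try/finally. The sketch below shows the pattern with mock functions; withAnalysis, mockCreate and mockFree are hypothetical names, with the mocks standing in for bakana.createAnalysis() and bakana.freeAnalysis().

```javascript
// Hypothetical wrapper (not part of bakana): creates a state, runs the
// supplied work, and always frees the state afterwards, even on error.
async function withAnalysis(create, free, work) {
    const state = await create();
    try {
        return await work(state);
    } finally {
        free(state);
    }
}

// Mock create/free functions; they only track the order of events.
const events = [];
const mockCreate = async () => ({ freed: false });
const mockFree = s => {
    s.freed = true;
    events.push("freed");
};

// Run some placeholder work inside the wrapper.
const outcome = withAnalysis(mockCreate, mockFree, async s => {
    events.push("working");
    return "done";
});
outcome.then(result => console.log(result, events));
```

With the real library, the same wrapper could be called with bakana.createAnalysis and bakana.freeAnalysis as the first two arguments, leaving bakana.terminate() for when all analyses are complete.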

Developer notes

See here for instructions on adding custom dataset readers. This allows us to use bakana's analysis pipeline and serialization capabilities on datasets from other sources such as in-house databases.

Tests can be run with npm run test on Node.js 16 or later. Older versions of Node.js require some combination of the options below:

node --experimental-vm-modules \
    --experimental-wasm-threads \
    --experimental-wasm-bulk-memory \
    --experimental-wasm-bigint \
    node_modules/jest/bin/jest.js \
    --testTimeout=100000000 \
    --runInBand