apr144-bam

v2.0.2-rs-053124-020

Parser for BAM and BAM index (bai) files

Install

$ npm install --save @gmod/bam

Usage

const { BamFile } = require('@gmod/bam')
// or import { BamFile } from '@gmod/bam'

// note: the await calls below must run inside an async function or an ES module

const bam = new BamFile({
  bamPath: 'test.bam',
})

// getHeader must be called before any call to getRecordsForRange
const header = await bam.getHeader()

// this returns the same records as samtools view ctgA:1-50000
const records = await bam.getRecordsForRange('ctgA', 0, 50000)

The bamPath argument only works in Node.js. In the browser, pass bamFilehandle instead, using a filehandle from generic-filehandle such as RemoteFile:

const { RemoteFile } = require('generic-filehandle')
const bam = new BamFile({
  bamFilehandle: new RemoteFile('yourfile.bam'), // or a full http url
  baiFilehandle: new RemoteFile('yourfile.bam.bai'), // or a full http url
})

Inputs are 0-based half-open coordinates (note: this is not the same as the 1-based closed coordinates that samtools view takes as input!)
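To make the coordinate difference concrete, here is a minimal sketch of converting a samtools-style region string into the 0-based half-open values used here. The helper name parseSamtoolsRegion is hypothetical and not part of this package; it also assumes plain digits with no thousands separators.

```javascript
// Hypothetical helper (not part of this package): convert a samtools-style
// region string such as "ctgA:1-50000" (1-based, closed) into
// 0-based half-open coordinates.
function parseSamtoolsRegion(region) {
  const match = /^(.+):(\d+)-(\d+)$/.exec(region)
  if (!match) {
    throw new Error(`Unrecognized region: ${region}`)
  }
  const [, refName, start1, end1] = match
  return {
    refName,
    start: Number(start1) - 1, // 1-based start becomes 0-based start
    end: Number(end1), // closed 1-based end equals half-open 0-based end
  }
}
```

The result could then be passed along as getRecordsForRange(refName, start, end).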

Usage with htsget

Since version 1.0.41, the htsget protocol is supported.

Here is a small code snippet showing its use:

const { HtsgetFile } = require('@gmod/bam')

const ti = new HtsgetFile({
  baseUrl: 'http://htsnexus.rnd.dnanex.us/v1/reads',
  trackId: 'BroadHiSeqX_b37/NA12878',
})
await ti.getHeader()
const records = await ti.getRecordsForRange('1', 2000000, 2000001)

Our implementation makes some assumptions about how the protocol is implemented, so let us know if it doesn't work for your use case.

Documentation

BAM constructor

The BamFile class constructor accepts the following arguments:

  • bamPath/bamUrl/bamFilehandle - a string path to a local file, a URL, or a filehandle object with a read method, respectively
  • csiPath/csiUrl/csiFilehandle - a CSI index for the BAM file, required for long chromosomes greater than 2^29 in length
  • baiPath/baiUrl/baiFilehandle - a BAI index for the BAM file
  • cacheSize - limit on the number of chunks to cache. default: 50
  • yieldThreadTime - the interval at which the code yields to the main thread when it is parsing a lot of data. default: 100ms. Set to 0 to disable yielding

Note: filehandles implement the Filehandle interface from https://www.npmjs.com/package/generic-filehandle. This module offers the path and url arguments as a convenience for supplying a LocalFile or RemoteFile, respectively.
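As a rough way to decide whether a CSI index is needed, reference lengths can be checked against the 2^29 limit mentioned above. This is a minimal sketch: refsNeedingCsi is a hypothetical helper, and the lengths are assumed to come from somewhere like the LN fields of the SAM header.

```javascript
// The BAI limit named in the argument list above: 2^29 bases.
const BAI_MAX_REF_LENGTH = 2 ** 29

// Hypothetical helper (not part of this package): given reference lengths
// keyed by name, return the names that are too long for a BAI index.
function refsNeedingCsi(refLengths) {
  return Object.entries(refLengths)
    .filter(([, length]) => length > BAI_MAX_REF_LENGTH)
    .map(([name]) => name)
}
```

If the returned list is non-empty, supplying one of the csi arguments instead of a bai argument would be the safer choice.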

async getRecordsForRange(refName, start, end, opts)

Note: you must run getHeader before running getRecordsForRange

  • refName - a string for the chrom to fetch from
  • start - a 0-based half-open start coordinate
  • end - a 0-based half-open end coordinate
  • opts.signal - an AbortSignal that can be used to stop processing
  • opts.viewAsPairs - re-dispatches requests to find mate pairs. default: false
  • opts.pairAcrossChr - control the viewAsPairs option behavior to pair across chromosomes. default: false
  • opts.maxInsertSize - control the viewAsPairs option behavior to limit distance within a chromosome to fetch. default: 200kb
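The opts.signal option follows the standard AbortController pattern. The sketch below shows only that pattern; slowFetch is a hypothetical stand-in for a long-running call and is not this package's implementation.

```javascript
// Hypothetical stand-in for a long-running call that accepts opts.signal,
// in the same position where getRecordsForRange takes its opts argument.
function slowFetch(opts = {}) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve(['record']), 1000)
    if (opts.signal) {
      opts.signal.addEventListener('abort', () => {
        clearTimeout(timer)
        reject(new Error('aborted'))
      })
    }
  })
}

// Create a controller, hand its signal to the call, then cancel the call.
function fetchThenCancel() {
  const controller = new AbortController()
  const pending = slowFetch({ signal: controller.signal })
  controller.abort()
  return pending
}
```

With the real API, the equivalent would be to pass { signal: controller.signal } as the opts argument and call controller.abort() when the result is no longer needed.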

async *streamRecordsForRange(refName, start, end, opts)

This is an async generator function that takes the same arguments as getRecordsForRange, but yields results in chunks that can be processed as they arrive:

for await (const chunk of file.streamRecordsForRange(
  refName,
  start,
  end,
  opts,
)) {
}

getRecordsForRange simply wraps this process, concatenating the chunks into a single array.
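The concatenation described above can be sketched as follows. This is an illustration under assumptions, not the package's source: fakeStream is a hypothetical stand-in for file.streamRecordsForRange(...), and collectChunks is a hypothetical helper name.

```javascript
// Hypothetical stand-in for file.streamRecordsForRange(...):
// each yielded chunk is an array of records.
async function* fakeStream() {
  yield [{ id: 'r1' }, { id: 'r2' }]
  yield [{ id: 'r3' }]
}

// Concatenate every chunk from an async iterable into one flat array.
async function collectChunks(stream) {
  const records = []
  for await (const chunk of stream) {
    for (const record of chunk) {
      records.push(record)
    }
  }
  return records
}
```

Streaming is the better fit when records can be handled chunk by chunk, since nothing forces the whole result to be held in memory at once.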

async getHeader(opts: {....anything to pass to generic-filehandle opts})

This obtains the header from an HtsgetFile or BamFile. For a BamFile, it reads the BAM header and the BAI/CSI index header where applicable; for an HtsgetFile, it makes an API request for the reference names.

async indexCov(refName, start, end)

  • refName - a string for the chrom to fetch from
  • start - a 0-based half-open start coordinate (optional)
  • end - a 0-based half-open end coordinate (optional)

Returns features of the form {start, end, score} containing estimated feature density across 16kb windows in the genome
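To picture the windows that the scores are reported over, the following sketch lists the windows overlapping a range. It assumes "16kb" means 16384 bp (2^14), the linear-index interval size in the SAM specification; windowsForRange is a hypothetical helper and not part of this package.

```javascript
// Assumption: "16kb" means 16384 bp (2^14), the linear-index window size
// described in the SAM specification.
const WINDOW_SIZE = 16384

// Hypothetical helper: list the 0-based half-open windows that overlap
// the 0-based half-open range [start, end).
function windowsForRange(start, end) {
  const windows = []
  const first = Math.floor(start / WINDOW_SIZE)
  const last = Math.floor((end - 1) / WINDOW_SIZE)
  for (let i = first; i <= last; i++) {
    windows.push({ start: i * WINDOW_SIZE, end: (i + 1) * WINDOW_SIZE })
  }
  return windows
}
```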

async lineCount(refName: string)

  • refName - a string for the chrom to fetch from

Returns the number of features on refName, read from the special pseudo-bin in the BAI/CSI index (e.g. bin 37450 in a BAI file, which holds n_mapped as described in the SAM spec pdf), or -1 if refName does not exist in the sample
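The pseudo-bin number 37450 follows from the binning scheme: it is one more than the highest regular bin id. The sketch below computes it from the index depth, assuming the bin-limit formula from the CSI specification; pseudoBin is a hypothetical helper name.

```javascript
// Hypothetical helper: compute the pseudo-bin number for a binning index.
// `depth` is the number of levels below the root bin (5 for BAI; CSI
// stores its own depth in the index header).
function pseudoBin(depth) {
  // Highest regular bin id: (8^(depth + 1) - 1) / 7
  const maxBin = ((1 << ((depth + 1) * 3)) - 1) / 7
  return maxBin + 1
}
```

For a BAI index, pseudoBin(5) gives 37450, matching the bin number quoted above.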

async hasRefSeq(refName: string)

  • refName - a string for the chrom to check

Returns whether we have this refName in the sample

Returned features

The features returned from BAM are lazy features, meaning that processing of the feature tags is delayed until it is necessary.

You can call feature.get('field') to get the value of a feature attribute

Example

feature.get('seq_id') // numerical sequence id corresponding to position in the sam header
feature.get('start') // 0-based half open start coordinate
feature.get('end') // 0-based half open end coordinate

Fields

feature.get('name') // QNAME
feature.get('seq') // feature sequence
feature.get('qual') // qualities
feature.get('cigar') // cigar string
feature.get('MD') // MD string
feature.get('SA') // supplementary alignments
feature.get('template_length') // TLEN
feature.get('length_on_ref') // derived from CIGAR using standard algorithm
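As an illustration of how a reference length is derived from a CIGAR string, the sketch below sums the operations that consume reference bases (M, D, N, = and X in the SAM specification). It is a minimal sketch, not this package's code, and lengthOnRef is a hypothetical helper name.

```javascript
// Hypothetical helper: sum the lengths of the CIGAR operations that
// consume reference bases (M, D, N, =, X per the SAM specification).
function lengthOnRef(cigar) {
  let length = 0
  for (const [, count, op] of cigar.matchAll(/(\d+)([MIDNSHP=X])/g)) {
    if ('MDN=X'.includes(op)) {
      length += Number(count)
    }
  }
  return length
}
```

Insertions (I), clipping (S, H) and padding (P) are skipped because they do not advance the position on the reference.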

Flags

feature.get('flags') // see https://broadinstitute.github.io/picard/explain-flags.html
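The flags value is a bit field, so individual properties are read with bitwise tests. A minimal sketch follows, using the bit values from the SAM specification; the short labels and the decodeFlags helper are hypothetical, chosen here for illustration.

```javascript
// Bit values from the SAM specification; the labels are illustrative.
const FLAG_BITS = [
  [0x1, 'paired'],
  [0x2, 'proper_pair'],
  [0x4, 'unmapped'],
  [0x8, 'mate_unmapped'],
  [0x10, 'reverse'],
  [0x20, 'mate_reverse'],
  [0x40, 'read1'],
  [0x80, 'read2'],
  [0x100, 'secondary'],
  [0x200, 'qc_fail'],
  [0x400, 'duplicate'],
  [0x800, 'supplementary'],
]

// Hypothetical helper: list the labels of every bit set in `flags`.
function decodeFlags(flags) {
  return FLAG_BITS.filter(([bit]) => (flags & bit) !== 0).map(
    ([, name]) => name,
  )
}
```

For example, a flags value of 99 (1 + 2 + 32 + 64) decodes to paired, proper_pair, mate_reverse and read1.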

Tags

BAM tags such as MD can be obtained via

feature.get('MD')

A full list of the available tags can be obtained via

feature._tags()

The feature format may change in a future version to expose rawer data records, but any such change would come with a major version bump

Note

The reason that we hide the data behind this ".get" function is that we lazily decode records on demand, which can reduce memory consumption.
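The general idea of decoding on demand behind a get function can be sketched as below. This is an illustration of the pattern only: the LazyRecord class, its decoders argument, and the caching of decoded values are all assumptions made for the example, not a description of how this package stores or decodes records.

```javascript
// Hypothetical illustration of lazy decoding: each field has a decoder
// function that runs only when the field is first requested.
class LazyRecord {
  constructor(decoders) {
    this.decoders = decoders // map of field name -> function returning the value
    this.cache = new Map() // assumption: decoded values are kept for reuse
  }

  get(field) {
    if (!this.cache.has(field)) {
      const decode = this.decoders[field]
      this.cache.set(field, decode ? decode() : undefined)
    }
    return this.cache.get(field)
  }
}
```

Fields that are never requested are never decoded, which is where the memory and time savings come from.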

License

MIT © Colin Diesh