@gmod/bed

v2.1.3

Published

9 months ago

A BED file format parser with autoSql support

bed autosql genomics bionode biojs

bed-js

Performs parsing of BED files including autoSql

Usage

Example

You can pass each line of your file through the parser's parseLine function

import fs from 'fs'
import BED from '@gmod/bed'

// With CommonJS (require) in Node.js you may need to use the default export explicitly, e.g.
// const BED = require('@gmod/bed').default

const parser = new BED()
const text = fs.readFileSync('file.txt', 'utf8')
// Skip empty lines (e.g. after a trailing newline), then parse each line
const results = text
  .split('\n')
  .filter(line => line.length > 0)
  .map(line => parser.parseLine(line))

API

Constructor

The BED constructor accepts an opts object with options

opts.autoSql - an optional autoSql schema for parsing lines
opts.type - a string representing one of a list of predefined types

The predefined types can include

bigInteract
bigMaf
bigPsl
bigNarrowPeak
bigGenePred
bigLink
bigChain
mafFrames
mafSummary

If neither autoSql nor type is specified, the default BED schema is used (see here)
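
As a rough illustration of the three cases above (autoSql, a predefined type, or neither), the sketch below builds the opts object to hand to the constructor. The helper name buildBedOptions is hypothetical and not part of @gmod/bed, and preferring autoSql when both are given is an assumption of this sketch; the README does not say how the library resolves that case.

```javascript
// Predefined type names as listed in this README.
const PREDEFINED_TYPES = [
  'bigInteract', 'bigMaf', 'bigPsl', 'bigNarrowPeak', 'bigGenePred',
  'bigLink', 'bigChain', 'mafFrames', 'mafSummary',
]

// Hypothetical helper (not part of @gmod/bed): build opts for `new BED(opts)`.
// Assumption: if both autoSql and type are supplied, this sketch prefers autoSql.
function buildBedOptions({ autoSql, type } = {}) {
  if (autoSql) {
    return { autoSql }
  }
  if (type) {
    if (!PREDEFINED_TYPES.includes(type)) {
      throw new Error(`Unknown predefined BED type: ${type}`)
    }
    return { type }
  }
  // Neither given: no autoSql and no type, so the default BED schema applies.
  return {}
}
```

It could then be used as new BED(buildBedOptions({ type: 'bigGenePred' })), which fails early on a misspelled type name.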

parseLine(line, opts)

Parses a BED line according to the currently loaded schema

line: string|Array<string> - a tab-delimited line with fields from the schema, or an array of those same fields that has already been split on tabs
opts: Options - an options object

An Options object can contain

opts.uniqueId - a unique ID to associate with the line, since the BED line itself does not encode one

The default instantiation of the parser with new BED() simply parses lines assuming the fields come from the standard BED schema. Your line can contain just a subset of the fields, e.g. chrom, chromStart, chromEnd, name, score
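
Since a BED line carries no ID of its own, one way to use opts.uniqueId is to derive an ID from the line's position in the file. The sketch below assumes that approach; the helper name parseLinesWithIds and the prefix-index ID format are hypothetical, and parser stands for any object with a parseLine(line, opts) method, such as an instance created with new BED().

```javascript
// Hypothetical helper (not part of @gmod/bed): parse lines and pass a
// per-line uniqueId through the documented opts.uniqueId option.
// The `${prefix}-${index}` format is an assumption made for illustration.
function parseLinesWithIds(parser, lines, prefix) {
  return lines.map((line, index) =>
    parser.parseLine(line, { uniqueId: `${prefix}-${index}` }),
  )
}
```

For example, parseLinesWithIds(new BED(), lines, 'file.txt') would pass the IDs file.txt-0, file.txt-1, and so on, one per line.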

Examples

Parsing BED with default schema

const p = new BED()

p.parseLine('chr1\t0\t100')
// outputs { chrom: 'chr1', chromStart: 0, chromEnd: 100, strand: 0 }

Parsing BED with a built-in schema, e.g. bigGenePred

If your BED format corresponds to a different schema, you can select one of the built-in schemas listed above by passing its name as opts.type to the BED constructor

const p = new BED({ type: 'bigGenePred' })
const line = 'chr1\t11868\t14409\tENST00000456328.2\t1000\t+\t11868\t11868\t255,128,0\t3\t359,109,1189,\t0,744,1352,\tDDX11L1\tnone\tnone\t-1,-1,-1,\tnone\tENST00000456328.2\tDDX11L1\tnone'
p.parseLine(line)
// above line outputs
// { chrom: 'chr1',
//   chromStart: 11868,
//   chromEnd: 14409,
//   name: 'ENST00000456328.2',
//   score: 1000,
//   strand: 1,
//   thickStart: 11868,
//   thickEnd: 11868,
//   reserved: '255,128,0',
//   blockCount: 3,
//   blockSizes: [ 359, 109, 1189 ],
//   chromStarts: [ 0, 744, 1352 ],
//   name2: 'DDX11L1',
//   cdsStartStat: 'none',
//   cdsEndStat: 'none',
//   exonFrames: [ -1, -1, -1 ],
//   type: 'none',
//   geneName: 'ENST00000456328.2',
//   geneName2: 'DDX11L1',
//   geneType: 'none' }
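
The blockSizes and chromStarts arrays in the output above follow the usual BED convention: each chromStarts entry is an offset relative to chromStart. A minimal sketch of turning them into absolute, half-open block intervals follows; blockCoordinates is a hypothetical helper and not part of @gmod/bed.

```javascript
// Hypothetical helper: convert a parsed feature's blocks into absolute
// [start, end) intervals, using the BED convention that chromStarts holds
// offsets relative to chromStart.
function blockCoordinates(feature) {
  const { chromStart, blockSizes, chromStarts } = feature
  return chromStarts.map((offset, i) => {
    const start = chromStart + offset
    return { start, end: start + blockSizes[i] }
  })
}

// Values taken from the bigGenePred output shown above.
const exons = blockCoordinates({
  chromStart: 11868,
  blockSizes: [359, 109, 1189],
  chromStarts: [0, 744, 1352],
})
// exons is
// [ { start: 11868, end: 12227 },
//   { start: 12612, end: 12721 },
//   { start: 13220, end: 14409 } ]
```

The last block ends at 14409, which matches chromEnd in the parsed output.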

Parsing BED with a supplied autoSql

If you have a BED format with a custom autoSql schema, or if you are using a BigBed file that contains autoSql (e.g. with @gmod/bbi, where you can get it from header.autoSql), pass the schema to the constructor and then use parseLine as normal

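const { BigBed } = require('@gmod/bbi')
const BED = require('@gmod/bed').default

// await needs an async context when using require (CommonJS)
async function parseWithFileSchema(line) {
  const bigbed = new BigBed({ path: 'yourfile' })
  const { autoSql } = await bigbed.getHeader()
  const p = new BED({ autoSql })
  return p.parseLine(line)
}
// etc.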
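
For the custom-schema case, it may help to see what an autoSql declaration looks like as text: a table name, a quoted description, and a parenthesized list of "type name; comment" fields. The sketch below renders one from a list of field specs. The helper buildAutoSql, the table name bedWithGene, and the geneSymbol field are all hypothetical, and it is an assumption that a string in this shape is what opts.autoSql expects.

```javascript
// Hypothetical helper (not part of @gmod/bed): render an autoSql table
// declaration from simple field specs of the form { type, name, comment }.
function buildAutoSql(tableName, description, fields) {
  const body = fields
    .map(f => `    ${f.type} ${f.name}; "${f.comment}"`)
    .join('\n')
  return `table ${tableName}\n"${description}"\n(\n${body}\n)\n`
}

// An illustrative schema: BED4 plus one made-up extra column.
const schema = buildAutoSql('bedWithGene', 'BED4 plus a gene symbol', [
  { type: 'string', name: 'chrom', comment: 'Reference sequence chromosome' },
  { type: 'uint', name: 'chromStart', comment: 'Start position' },
  { type: 'uint', name: 'chromEnd', comment: 'End position' },
  { type: 'string', name: 'name', comment: 'Feature name' },
  { type: 'string', name: 'geneSymbol', comment: 'Gene symbol (hypothetical field)' },
])
```

The resulting string would then be passed as new BED({ autoSql: schema }). This sketch does not cover array fields such as int[blockCount] blockSizes.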

Important notes

Does not parse "browser" or "track" lines, and will throw an error if parseLine receives one of these
By default, parseLine parses only tab-delimited text. If you want to use spaces, as UCSC allows, split the line yourself and pass the resulting array as line to parseLine
Converts strand from {+,-,.} to {1,-1,0}, and also sets strand to 0 even if no strand is in the autoSql
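
The first two notes can be handled with a small preprocessing step before calling parseLine. The sketch below drops "browser" and "track" lines and splits the remaining lines on whitespace so that space-delimited input can be passed as an array. The helper names are hypothetical, skipping blank lines and '#' comment lines is an extra assumption of this sketch, and parser stands for any object with a parseLine method, such as new BED().

```javascript
// True for lines that should reach parseLine. "browser" and "track" lines are
// documented to throw; skipping blank and '#' lines is an assumption here.
function isDataLine(line) {
  const trimmed = line.trim()
  return (
    trimmed !== '' &&
    !trimmed.startsWith('#') &&
    !/^(browser|track)(\s|$)/.test(trimmed)
  )
}

// Split on any run of whitespace (tabs or spaces) into an array of fields.
function toFields(line) {
  return line.trim().split(/\s+/)
}

// Hypothetical wrapper: filter the text, then pass pre-split arrays to parseLine.
function parseBedText(parser, text) {
  return text
    .split(/\r?\n/)
    .filter(isDataLine)
    .map(line => parser.parseLine(toFields(line)))
}
```

Splitting on whitespace is only safe when no field contains a space; for strictly tab-delimited files with spaces inside fields, pass the raw line to parseLine instead.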

Academic Use

This package was written with funding from the NHGRI as part of the JBrowse project. If you use it in an academic project that you publish, please cite the most recent JBrowse paper, which will be linked from jbrowse.org.

License

Based on https://genome-source.gi.ucsc.edu/gitlist/kent.git/blob/master/src/hg/autoSql/autoSql.doc

Also see http://genomewiki.ucsc.edu/index.php/AutoSql and https://www.linuxjournal.com/article/5949