proteins

v1.1.0

Published

3 years ago

standardized system of protein nomenclature and identification

Vulnerabilities

0 High, 0 Medium, 0 Low

Maintainer: archimedespi

Keywords: biology, crypto, proteins

Proteins

A project to aggressively refactor protein identification and data storage

Protein identification codes are currently spread across many disparate systems. The best known, as of this writing, is the GenBank Accession Code. However, a GenBank code corresponds to a particular observation of a protein, so a single protein in a single organism usually has more than 10,000 accession codes that all share the same sequence. This is a poor fit for programs that operate on an idealized protein, because a "standard" GenBank accession has to be cherry-picked for each protein they need.

This project sets out to refactor both the identification codes and the data storage formats. We propose a JSON-based format for storing protein metadata, together with a dual identifier system made up of:

a global, long hexadecimal identifier
and a short, human-readable identifier

We've implemented this system in this repository. Install it with npm install -g proteins, then generate an ID from a JSON protein description file:

mkproteinid < ./test.json
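
To make the dual identifier idea more concrete, here is a minimal sketch of how both identifiers might be derived from a JSON description. It is not the package's actual algorithm, which this README does not document. It assumes the long identifier is a SHA-256 digest of the description serialized with sorted keys, and that the short identifier is simply a prefix of the long one. The names canonicalize and makeProteinIds and the fields of the example description are hypothetical.

```javascript
const crypto = require("crypto");

// Serialize with sorted object keys so that two descriptions with the
// same content always produce the same string, whatever the key order.
function canonicalize(value) {
  if (Array.isArray(value)) {
    return "[" + value.map(canonicalize).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const body = Object.keys(value)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + canonicalize(value[key]));
    return "{" + body.join(",") + "}";
  }
  return JSON.stringify(value);
}

// Hypothetical helper: returns both identifiers for a protein description.
function makeProteinIds(description) {
  // Assumption: the long ID is a SHA-256 hex digest of the canonical JSON.
  const longId = crypto
    .createHash("sha256")
    .update(canonicalize(description))
    .digest("hex");
  // Assumption: the short ID is the first 8 hex characters of the long ID.
  const shortId = longId.slice(0, 8);
  return { longId, shortId };
}

// Toy description; the field names and sequence are illustrative only.
const example = {
  organism: "Homo sapiens",
  name: "example-protein",
  sequence: "MKTAYIAK",
};

const ids = makeProteinIds(example);
console.log(ids.longId);
console.log(ids.shortId);
```

Hashing a canonical form rather than the raw file means that whitespace and key order do not change the identifier. A short prefix like this can collide, so a real scheme would need some way to detect or resolve collisions.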
