fastaspeciesfilter
v0.2.0
This tool takes a taxonomy ID and extracts from a FASTA file all sequences that belong to the selected taxon or to a clade subordinate to it.
Purpose
The aim of this tool is to filter a large FASTA-formatted file that contains taxonomy information in its definition lines, keeping only the entries at or below a selected taxonomy level. For example, it can extract all proteins belonging to green plants from UniRef.
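The README does not say how the tool reads taxonomy information from a definition line. As a minimal sketch of the idea, and assuming UniRef-style headers that carry a `TaxID=` field, the filtering step could look like the following. The helper names `taxIdFromHeader` and `filterByTaxIds` and the `{ header, sequence }` record shape are hypothetical and are not part of fastaspeciesfilter.

```javascript
// Hypothetical helper (not the tool's API): pull the numeric taxonomy ID
// out of a UniRef-style definition line. Returns null if no TaxID field exists.
function taxIdFromHeader(header) {
  const match = /\bTaxID=(\d+)/.exec(header);
  return match ? Number(match[1]) : null;
}

// Keep only the records whose TaxID is in the allowed set.
// `records` is an array of { header, sequence } objects (an assumed shape).
function filterByTaxIds(records, allowedIds) {
  return records.filter((record) => {
    const id = taxIdFromHeader(record.header);
    return id !== null && allowedIds.has(id);
  });
}

// Made-up example records in UniRef header style.
const records = [
  { header: ">UniRef100_A Example protein n=1 Tax=Mus musculus TaxID=10090 RepID=A_MOUSE", sequence: "MKV" },
  { header: ">UniRef100_B Example protein n=1 Tax=Homo sapiens TaxID=9606 RepID=B_HUMAN", sequence: "MLA" },
];

const kept = filterByTaxIds(records, new Set([10090]));
console.log(kept.map((record) => record.header));
```

In practice the allowed set would contain the selected ID together with all IDs below it in the taxonomy, not just a single ID.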
Installation
npm install -g fastaspeciesfilter
The -g flag installs the package globally so that you can run the tool directly from the terminal.
fastaSpeciesFilter
Input: a FASTA-formatted file.
Output: a FASTA-formatted file (each sequence is written on a single line).
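To illustrate the single-line output format, here is a minimal sketch that joins wrapped sequence lines so that every record becomes one header line followed by one sequence line. The function name `toSingleLineFasta` is hypothetical; this is not the tool's own code, only an illustration of the format it is described as producing.

```javascript
// Hypothetical helper, not the tool's actual code: join wrapped sequence
// lines so that each record is a header line plus exactly one sequence line.
function toSingleLineFasta(text) {
  const out = [];
  let seqParts = [];
  let seenHeader = false;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // skip blank lines
    if (line.startsWith(">")) {
      // A new header closes the previous record, if there was one.
      if (seenHeader) out.push(seqParts.join(""));
      out.push(line);
      seqParts = [];
      seenHeader = true;
    } else if (seenHeader) {
      seqParts.push(line); // sequence text before any header is ignored
    }
  }
  if (seenHeader) out.push(seqParts.join(""));
  return out.length > 0 ? out.join("\n") + "\n" : "";
}

const wrapped = ">a\nAC\nGT\n>b\nTT\n";
const singleLine = toSingleLineFasta(wrapped);
console.log(singleLine);
```

For very large files such as nr.fasta, a streaming reader would be preferable to loading the whole text into memory; the sketch above trades that for brevity.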
Run fastaspeciesfilter without arguments to see all available parameters.
Example:
fastaSpeciesFilter -i nr.fasta -o filtered.fasta -no nodes.dmp -na names.dmp -t 10090
This extracts all sequences whose headers refer to mouse (taxonomy ID 10090) or to a taxon subordinate to it, such as a mouse subspecies.
nodes.dmp and names.dmp are NCBI taxonomy dump files. They must be downloaded from NCBI and provided as input. If they are not specified with -no and -na, they are assumed to exist in the current directory. Taxonomy IDs can also be looked up at NCBI.
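For readers curious how a taxonomy ID can be expanded to all of its subordinate clades, the sketch below shows one possible approach using nodes.dmp, whose fields are separated by a tab, a pipe, and a tab, with the taxonomy ID in the first field and the parent ID in the second. The function names `buildChildrenMap` and `collectClade` are hypothetical, the sample data is a toy excerpt with illustrative parent links, and the tool's actual implementation may differ.

```javascript
// Parse nodes.dmp text into a Map of parentId -> [childId, ...].
// Only the first two fields (tax_id, parent tax_id) are used.
function buildChildrenMap(nodesText) {
  const children = new Map();
  for (const line of nodesText.split(/\r?\n/)) {
    if (line.trim() === "") continue;
    const fields = line.split("|").map((field) => field.trim());
    const taxId = Number(fields[0]);
    const parentId = Number(fields[1]);
    if (taxId === parentId) continue; // the root node lists itself as parent
    if (!children.has(parentId)) children.set(parentId, []);
    children.get(parentId).push(taxId);
  }
  return children;
}

// Collect the selected ID plus every ID below it. Iterative on purpose,
// so a deep taxonomy cannot overflow the call stack.
function collectClade(children, rootId) {
  const clade = new Set([rootId]);
  const stack = [rootId];
  while (stack.length > 0) {
    const current = stack.pop();
    for (const child of children.get(current) || []) {
      if (!clade.has(child)) {
        clade.add(child);
        stack.push(child);
      }
    }
  }
  return clade;
}

// Toy data in nodes.dmp layout; the parent links are illustrative only.
const sampleNodes = [
  "1\t|\t1\t|\tno rank\t|",
  "10090\t|\t1\t|\tspecies\t|",
  "10091\t|\t10090\t|\tsubspecies\t|",
  "10092\t|\t10090\t|\tsubspecies\t|",
  "9606\t|\t1\t|\tspecies\t|",
].join("\n");

const clade = collectClade(buildChildrenMap(sampleNodes), 10090);
console.log([...clade]);
```

The resulting set can then be used to decide which FASTA entries to keep, whether headers are matched by taxonomy ID or by the names listed for those IDs in names.dmp.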
Support
You can submit errors or feature requests here: https://bitbucket.org/allmer/ios/src/master/