npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

vsm-dictionary-bioportal

v1.2.1

Published

Implementation of a VSM-dictionary that interfaces with the BioPortal REST API

Downloads

18

Readme

vsm-dictionary-bioportal

Node.js CI codecov npm version Downloads License

Summary

vsm-dictionary-bioportal is an implementation of the 'VsmDictionary' parent-class/interface (from the package vsm-dictionary), that communicates with BioPortal's REST API and translates the provided terms+IDs into a VSM-specific format.

Install

Run: npm install

Example use

You first need to have a BioPortal account to use this dictionary. Once you have an account, you will be given an API key to authorise your access to the provided REST API's resources.

Node.js

Create a directory test-dir and inside run npm install vsm-dictionary-bioportal. Then, create a test.js file and include this code for example:

const DictionaryBioPortal = require('vsm-dictionary-bioportal');
const apiKeyString = 'a-valid-API-key-string';
const dict = new DictionaryBioPortal({apiKey: apiKeyString});

dict.getEntryMatchesForString('melanoma',
  { filter: { dictID : [
        'http://data.bioontology.org/ontologies/RH-MESH',
        'http://data.bioontology.org/ontologies/MCCL',
        'http://data.bioontology.org/ontologies/MEDDRA'
      ]},
    sort: { dictID : ['http://data.bioontology.org/ontologies/RH-MESH'] },
    z: true,
    page: 1,
    perPage: 10
  }, (err, res) => {
    if (err) 
      console.log(JSON.stringify(err, null, 4));
    else
      console.log(JSON.stringify(res, null, 4));
  }
);

Then, run node test.js

Browsers

<script src="https://unpkg.com/vsm-dictionary-bioportal@^1.2.0/dist/vsm-dictionary-bioportal.min.js"></script>

after which it is accessible as the global variable VsmDictionaryBioPortal.

Tests

Run npm test, which runs the source code tests with Mocha.
If you want to live test the BioPortal API and the main functions provided by DictionaryBioPortal, go to the test directory and run:

node getDictInfos.test.js
node getEntries.test.js
node getEntryMatchesForString.test.js

Browser Demo

Run npm run demo to start the interactive demo. The demo currently supports only the getMatchesForString() function. This command automatically opens a browser page with 6 input-fields to search on BioPortal ontology data. The 6 inputs fields represent:

  1. The string to search results for
  2. The dictionary abbreviations (e.g. GO,MCCL), comma separated, that will be used as arguments for filtering the results
  3. The preferred dictionary abbreviations, comma separated, that will be used as arguments for sorting the results
  4. The z-object's properties to be kept in the result (if left empty, all properties will be shown)
  5. The page number - note that only for page=1 there is a possibility of sorting the results taking into account the preferred dictionaries
  6. The page size returned, which is the maximum number of returned terms/results in the demo.

The demo works by making a Webpack dev-server bundle all source code and serve it to the browser.

'Build' configuration & demo

To use a VsmDictionary in Node.js, one can simply run npm install and then use require(). But it is also convenient to have a version of the code that can just be loaded via a <script>-tag in the browser.

Therefore, we included webpack.config.js, which is a Webpack configuration file for generating such a browser-ready package.

By running npm build, the built file will appear in a 'dist' subfolder. A demo-use of this file can then be seen by opening demo-build.html (in the 'demo' folder). (It includes a <script src="../dist/vsm-dictionary-bioportal.min.js"></script> tag). So after the build step, demo-build.html does not need Webpack to run.

Specification

Like all VsmDictionary subclass implementations, this package follows the parent class specification. In the next sections we will explain the mapping between BioPortal's API terms (as specified in the API documentation) and the corresponding VSM objects. First, some info about the API itself:

Most of the queries we launch against BioPortal use the search and property_search endpoints: /search?q={search query} and /property_search?q={search query}, along with different parameters. One of the most important parameters is the filtering on the ontologies' abbreviation names: ontologies={ontologyAbbrev1,ontologyAbbrev2,ontologyAbbrev3}.

The returned terms have fields which are matched against the search query (q parameter). The fields are searched based on the following order (match rank priority):

  • id
  • prefLabelExact (match on the full pref label)
  • prefLabel (match on partial pref label)
  • synonymExact (match on the full synonym(s))
  • synonym (match on the partial synonym(s))
  • notation (last fragment of id)
  • cui (for UMLS ontologies)
  • semantic_types

Note that BioPortal's partial matching means that it can match by default single words in a multi-word expression, and these results are usually ranked lower than the full expression matches. It does not mean though that it performs starts with partial matches. Because curators wanted to have partial-word matches, we enabled that feature by default and every search query has the suggest=true URL parameter added. You can disable it in the constructor as follows:

const dict = new DictionaryBioPortal({apiKey: apiKeyString, suggest: false});

If the URL has the ontologies parameter, then when you get results that have the same field from the list above (e.g. same id), these are ordered according to an internal (BioPortal) ontology ranking, which is updated every week. At the time of writing these lines, the latest ranking was stored here for reference.

Note also that we implement strict error handling in the sense that whenever we launch multiple parallel queries to BioPortal's REST API (see the functions specifications below), if one of them returns an error (either a string or an error JSON object response), then the result will be an error object (no matter if all the rest of the calls returned proper results).

If the error response in not a JSON string that we can parse, we formulate the error as a JSON object ourselves in the following format:

{
  status: <number>,
  error: <response> 
}

where the response from the server is JSON stringified.

Map BioPortal to DictInfo VSM object

This specification relates to the function:
getDictInfos(options, cb)

If the options.filter.id is properly defined and none of the ids used for filtering are BioPortal-related (meaning that they do not have the data.bioontology.org/ontologies as a substring), then getDictInfos returns an empty object result.

Otherwise, an example of a URL string that is being built and send to BioPortal is:

http://data.bioontology.org/ontologies/GO?display_context=false

If options.filter.id is empty or not properly defined, then we query all of BioPortal's ontologies with:

http://data.bioontology.org/ontologies/?display_context=false

The options.page and options.perPage are used to trim the number of the results. If these options are not properly defined, then the default values from the BioPortal API are used (1 and 50 respectively).

After sending a query to ask for information about a specific ontology, the returned JSON result is mapped to a VSM dictInfo object. The mapping is fully detailed in the table below:

BioPortal ontology property | Type | Required | VSM dictInfo object property | Notes
:---:|:---:|:---:|:---:|--- @id | URL | YES | id | the unique ontology URI acronym | String | YES | abbrev | the unique ontology acronym name | String | YES | name | the full name of the ontology

Map BioPortal to Entry VSM object

This specification relates to the function:
getEntries(options, cb)

If the options.filter.dictID is properly defined and none of the dictIDs used for filtering are BioPortal-related (meaning that they do not have the data.bioontology.org/ontologies as a substring), then getEntries returns an empty object result.

Depending on the options.filter.id and options.filter.dictID properties and following the vsm-dictionary parent class specification, there can be only be 4 cases of queries that are send to BioPortal:

  • Non proper filter.id and filter.dictID or options.filter is an empty object

By default we search for all terms in all ontologies in BioPortal:

http://data.bioontology.org/search?ontologies=&ontology_types=ONTOLOGY&pagesize=1&display_context=false

Note that because the query above has neither a search string, nor multiple ontologies to rank against, the returned results have no deterministic order.

  • Non proper filter.id but proper filter.dictID property

We use the following query to get all terms within a set of ontologies:

http://data.bioontology.org/search?ontologies=NCIT,GO&ontology_types=ONTOLOGY&pagesize=1&display_context=false

Also here, no sorting is done.

  • Proper filter.id but non proper filter.dictID property

We use the following queries to find a term by id (without any given ontologies):

http://data.bioontology.org/search?q=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FDOID_1909&ontologies=&require_exact_match=true&also_search_obsolete=true&display_context=false
http://data.bioontology.org/property_search?q=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FDOID_1909&ontologies=&require_exact_match=true&display_context=false

Note that in case of querying for specific id(s), we ask also for obsolete terms in the search endpoint but not in the property_search one, since it does not support them.

Furthermore, because of multiple ontologies having terms with the same id, we never sort the returned results. Instead, we have implemented a workaround that infers the ontology abbreviation name from the id in some cases. In the case of multiple entries with the same id, the entry for which we can correctly infer their source ontology will be ranked first in the returned result.

  • Both proper filter.id and filter.dictID properties

We use the following query to find a term by id within any of the given ontologies:

http://data.bioontology.org/search?q=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FDOID_1909&ontologies=BAO,DOID&require_exact_match=true&also_search_obsolete=true&display_context=false
http://data.bioontology.org/property_search?q=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FDOID_1909&ontologies=BAO,DOID&require_exact_match=true&display_context=false

Same as the previous case, we never sort results (it's optional nonetheless) and the obsolete terms are also retrieved in the search query option.

So, after sending one query from the 4 categories above to BioPortal, the returned JSON result object includes a collection property which has as a value, an array of objects. Each object/element of that array is an entry which is mapped to a VSM entry object. The mapping is fully detailed in the tables below for the different endpoints:

  • /search endpoint:

BioPortal entry's property | Type | Required | VSM entry object property | Notes
:---:|:---:|:---:|:---:|:---: @id | URL | YES | id | the concept-ID links.ontology | URL | YES | dictID | the unique identifier of the ontology definition | Array | NO | descr | we map the first definition only synonym | Array | NO | terms[i].str | we map the whole array, first element of terms array is an object with property str and value the prefLabel links.ontology | URL | YES | z.dictAbbrev | the unique ontology acronym cui | Array | NO | z.cui | Concept Unique Identifier semanticType | Array | NO | z.tui | Type Unique Identifier obsolete | Boolean | NO | z.obsolete | This z option is returned only when requesting for specific entry id(s)

  • /property_search endpoint:

BioPortal entry's property | Type | Required | VSM entry object property | Notes
:---:|:---:|:---:|:---:|:---: @id | URL | YES | id | the property's ID links.ontology | URL | YES | dictID | the unique identifier of the ontology definition | Array | NO | descr | we map the first definition only label, labelGenerated | Arrays | NO, YES | terms[i].str | we map the whole label array if it exists, otherwise the labelGenerated one (which always exists) links.ontology | URL | YES | z.dictAbbrev | the unique ontology acronym

Map BioPortal to Match VSM object

This specification relates to the function:
getEntryMatchesForString(str, options, cb)

If the options.filter.dictID is properly defined and none of the dictIDs used for filtering are BioPortal-related (meaning that they do not have the data.bioontology.org/ontologies as a substring), then getEntryMatchesForString returns an empty object result.

An example of a URL string that is being built and send to BioPortal is:

http://data.bioontology.org/search?q=melanoma&ontologies=RH-MESH,MCCL&page=1&pagesize=40&display_context=false&suggest=true

Note that every query is being duplicated by using also the search_property endpoint, so we have at least 2 queries (URLs) per one string search.

The parameters are as follows:

  • str maps to q=str
  • page is the options.page and pagesize is options.perPage
  • The ontologies part of the URL corresponds to the sub-dictionaries from where we want to get terms. This parameter is being built according to the values of options.filter.dictID and options.sort.dictID as well as the specification of the vsm-dictionary parent class. Note that there can be cases where 4 URLs are fired (simultaneously) during a string search to get results for preferred dictionaries and all the rest for example (in each seperate case, both search and property_search endpoints are queried).
  • suggest=true is used to perform a search specifically geared towards type-ahead suggestions

Most of the above are optional URL parameters, meaning that if for example the options object is empty, then the default BioPortal API values will be used instead for page and pagesize (1 and 50 respectively), while the search will be done on all ontologies available at BioPortal's repository (the ontologies= part of the URL will be pruned):

http://data.bioontology.org/search?q=melanoma&display_context=false&suggest=true

The search string is obligatory though (if you want to get non-empty results :), the suggest=true is enabled by default and the display_context=false is always added since the additional context does not provide any useful data to be mapped to a VSM match object.

After sending such a query, the returned JSON result object includes a collection property which has as a value, an array of objects. Each object/element of that array is an entry which is mapped to a VSM match object. The mapping is fully detailed in the tables below for the different endpoints:

  • /search endpoint:

BioPortal entry's property | Type | Required | VSM match object property | Notes
:---:|:---:|:---:|:---:|:---: @id | URL | YES | id | the concept-ID links.ontology | URL | YES | dictID | the unique identifier of the ontology prefLabel | String | YES | str,terms[0].str | the string representation of the term definition | Array | NO | descr | we map the first definition only synonym | Array | NO | terms[i].str | we map the whole array, first element of terms array is an object with property str and value the prefLabel links.ontology | URL | YES | z.dictAbbrev | the unique ontology acronym cui | Array | NO | z.cui | Concept Unique Identifier semanticType | Array | NO | z.tui | Type Unique Identifier

  • /property_search endpoint:

BioPortal entry's property | Type | Required | VSM match object property | Notes
:---:|:---:|:---:|:---:|:---: @id | URL | YES | id | the concept-ID links.ontology | URL | YES | dictID | the unique identifier of the ontology label, labelGenerated | Arrays | NO, YES | str,terms[0].str | the string representation of the term: we map the first element of the label array if it exists, otherwise the first element of the labelGenerated one (which always exists) definition | Array | NO | descr | we map the first definition only label, labelGenerated | Arrays | NO, YES | terms[i].str | we map the whole label array if it exists, otherwise the labelGenerated one (which always exists) links.ontology | URL | YES | z.dictAbbrev | the unique ontology acronym

License

This project is licensed under the AGPL license - see LICENSE.md.