barnard59-cube

This package provides operations and commands for RDF cubes in Barnard59 Linked Data pipelines. The manifest.ttl file contains a full list of all operations included in this package.

Installation

npm install barnard59-cube

Operations

toObservation

A pipeline can use the toObservation operation to create cube observations according to the Cube Schema specification.

Each input chunk is expected to be a dataset (or an array of quads) with data for a single observation (usually obtained from a CSVW mapping). The operation cleans up the data, adding suitable defaults and identifiers.

buildCubeShape

The buildCubeShape operation derives a SHACL shape from a stream of observations.

Each input chunk is expected to be a dataset (or an array of quads) representing a single observation (usually obtained with toObservation). The operation collects information about the values of observation properties and propagates the chunk. Once the input stream ends, it creates and emits a shape describing the properties of the observations.

Commands

The package provides pipelines to retrieve and validate cube observations and metadata.

Once barnard59-cube is installed, run barnard59 cube to list the pipelines available as CLI commands. You can get help on each one, for example: barnard59 cube fetch-cube --help.
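
For instance, assuming the barnard59 CLI itself is available (it is provided by the separate barnard59 package, so both packages need to be installed), listing the pipelines and inspecting one of them looks like this:

npm install barnard59 barnard59-cube

barnard59 cube                    # list the cube pipelines available as commands
barnard59 cube fetch-cube --help  # show the options of a single pipeline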

fetch cube

Pipeline fetch-cube queries a given SPARQL endpoint to retrieve all the data and metadata of a given cube.

barnard59 cube fetch-cube \
    --cube https://agriculture.ld.admin.ch/agroscope/PRIFm8t15/2 \
    --endpoint https://int.lindas.admin.ch/query

For big cubes, it is advisable to use the other fetch commands to retrieve the cube parts separately (for example, fetch-metadata and fetch-observations).

fetch constraint

Pipeline fetch-constraint queries a given SPARQL endpoint to retrieve the constraint shape of a given cube.

barnard59 cube fetch-constraint \
  --cube https://agriculture.ld.admin.ch/agroscope/PRIFm8t15/2 \
  --endpoint https://int.lindas.admin.ch/query

fetch metadata

Pipeline fetch-metadata queries a given SPARQL endpoint to retrieve cube metadata and its constraint shape (excluding the observations).

barnard59 cube fetch-metadata \
  --cube https://agriculture.ld.admin.ch/agroscope/PRIFm8t15/2 \
  --endpoint https://int.lindas.admin.ch/query \
> metadata.ttl

fetch observations

Pipeline fetch-observations queries a given SPARQL endpoint to retrieve the observations of a given cube.

barnard59 cube fetch-observations \
    --cube https://agriculture.ld.admin.ch/agroscope/PRIFm8t15/2 \
    --endpoint https://int.lindas.admin.ch/query \
 > observations.ttl

The output of all the fetch pipelines is written to stdout and can be redirected to a local file, as in the last two examples.

check metadata

Pipeline check-metadata validates the input metadata against the shapes provided with the --profile option (the default profile is https://cube.link/latest/shape/standalone-constraint-constraint).

The pipeline reads the metadata from stdin, allowing input from a local file (as in the following example) as well as directly from the output of the fetch-metadata pipeline. In most cases it is useful to keep the metadata in a local file because it is also needed for the check-observations pipeline.

cat metadata.ttl \
| barnard59 cube check-metadata \
    --profile https://cube.link/v0.1.0/shape/standalone-constraint-constraint
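
As mentioned above, check-metadata can also consume the output of fetch-metadata directly, without an intermediate file (cube and endpoint URIs reused from the fetch examples):

barnard59 cube fetch-metadata \
  --cube https://agriculture.ld.admin.ch/agroscope/PRIFm8t15/2 \
  --endpoint https://int.lindas.admin.ch/query \
| barnard59 cube check-metadata \
    --profile https://cube.link/v0.1.0/shape/standalone-constraint-constraint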

SHACL reports with violations are written to stdout.

If --profile is a remote address whose response does not include a correct content-type header, the pipeline fails. In such cases, the --profileFormat option selects the correct RDF parser. Its value must be a media type, such as text/turtle.

The value of --profile can also be a local file.
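
For example, the first command below selects the Turtle parser for a remote profile served with a wrong content-type header, and the second reads the profile from disk (the profile URL and file name are illustrative, not resources shipped with the package):

cat metadata.ttl \
| barnard59 cube check-metadata \
    --profile https://example.org/my-profile.ttl \
    --profileFormat text/turtle

cat metadata.ttl \
| barnard59 cube check-metadata \
    --profile ./my-profile.ttl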

check observations

Pipeline check-observations validates the input observations against the shapes provided with the --constraint option.

The pipeline reads the observations from stdin, allowing input from a local file (as in the following example) as well as directly from the output of the fetch-observations pipeline.

cat observations.ttl \
| barnard59 cube check-observations \
    --constraint metadata.ttl
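
As noted above, the observations can also be piped in directly from fetch-observations, reusing the cube and endpoint URIs from the fetch examples:

barnard59 cube fetch-observations \
  --cube https://agriculture.ld.admin.ch/agroscope/PRIFm8t15/2 \
  --endpoint https://int.lindas.admin.ch/query \
| barnard59 cube check-observations \
    --constraint metadata.ttl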

To enable validation, the pipeline adds a sh:targetClass property with value cube:Observation to the constraint, which requires each observation to have an explicit rdf:type.

To leverage streaming, the input is split and validated in small batches of adjustable size (the default of 50 is likely appropriate in most cases). This allows the validation of very big cubes because the observations are never loaded into memory all at once. To ensure that triples for the same observation are adjacent (and hence processed in the same batch), the input is sorted by subject (for large inputs, the sorting step relies on temporary local files).

SHACL reports are written to stdout.

To limit the output size, there is also a maxViolations option that stops validation once the given number of violations is reached.
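
A sketch combining both tuning options; --batchSize is the flag referenced under Known issues below, while the --maxViolations spelling is an assumption based on the option name above, and both values are only illustrative:

cat observations.ttl \
| barnard59 cube check-observations \
    --constraint metadata.ttl \
    --batchSize 200 \
    --maxViolations 100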

Report Summary

The validation pipelines write a standard, machine-readable report to stdout. The barnard59-shacl package provides an additional report-summary pipeline to produce a human-readable summary of this report:

cat observations.ttl \
| barnard59 cube check-observations --constraint metadata.ttl \
| barnard59 shacl report-summary

Known issues

  • The check-metadata command may fail if there are sh:in constraints with too many values.
  • sh:class constraints require all cube data in memory at once (--batchSize 0).