json_schema_experiments

v1.7.0

Published

21 days ago

Some experiments to understand json schema features and how to better encode the data we need to deal with.

Downloads

155


octavio_duarte

json schema data validation consistency conventions farmos

Key Links

The following artifacts are generated automatically by GitLab's CI pipeline when you finish creating your Collection. Links for the staging (development) and main (currently deployed) branches are shown.

Input Collection
- (main) Collection Schemas - Final published collection of conventions and overlays
- (staging) Collection Schemas - Final published collection of conventions and overlays
Output Collection
- (main) Collection Schemas - Final published collection of conventions and overlays
- (main) Collection Documentation - Human Readable documentation for the Input Collection
- (staging) Collection Schemas - Final published collection of conventions and overlays
- (staging) Collection Documentation - Human Readable documentation for the Input Collection
Validation - Both Input and Output Collections can be validated in a single NPM package here
- (main) NPM Package
- (main) NPM Package - Browser Version

Description

This repo centralizes several technical experiments OurSci and OpenTeam are performing to use JSON schemas to advance the Ag Data Wallet concept. Our aim is to make it easy to:

create high quality schemas for an Ag Data Wallet
create new Collections of schemas which are compatible with existing schemas
publish Collections of schemas in human and machine readable formats

How to use it

@octavio when we've completed the change to npm deployment... we can update this list, but basically:

Start a new Git repo (git init).
Start an NPM module in that repo (npm init).
Install the two required libraries (npm install conventionbuilder conventionschemapublisher).
Initialize the repo by calling the conventionschemapublisher helper functions, in the same way projects are currently initialized in React, Vue or Docusaurus. This provides all the basic configuration, structure and files.
Set up the required CI variables (repo name, npm publication credentials, convention set name, etc). Of particular interest is the source of the basic FarmOS schemata, which can either be a farm or another repo.
(optional) If you want to inherit conventions from another repo, list them in a simple JSON or YAML config file. For each repo, the file lets you request all conventions, all conventions of certain types (for example, everything starting with "log--activity" and everything starting with "asset"), or cherry-pick individual conventions.
Writing definitions as explained in our tutorial and committing them to either staging or main will trigger the CI process, which tests that the examples are adequate, publishes the validation code, publishes the conventions and publishes the wiki containing all documentation. The links will also be clearly displayed for the user.
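
The inheritance config described in the optional step has no format documented here, so the following is only a minimal sketch of how such a selection could work. The config shape (`all`, `prefixes`, `conventions`), the project IDs, the convention names and the function `selectConventions` are all hypothetical and are not part of conventionschemapublisher.

```javascript
// Hypothetical config shape; the real file format may differ.
const exampleConfig = {
  repos: [
    { projectId: "11111", all: true },
    { projectId: "22222", prefixes: ["log--activity", "asset"] },
    { projectId: "33333", conventions: ["log--input--soil_amendment"] },
  ],
};

// Return the convention names that one repo entry selects
// from the list of names that repo makes available.
function selectConventions(entry, availableNames) {
  if (entry.all) return [...availableNames];
  const prefixes = entry.prefixes ?? [];
  const picked = new Set(entry.conventions ?? []);
  return availableNames.filter(
    (name) => picked.has(name) || prefixes.some((p) => name.startsWith(p))
  );
}

// Placeholder convention names, used only for illustration.
const available = [
  "log--activity--tillage",
  "log--input--soil_amendment",
  "asset--plant--planting",
  "quantity--standard--yield",
];

console.log(selectConventions(exampleConfig.repos[1], available));
```

With the placeholder data above, the second entry selects the two conventions whose names start with one of the listed prefixes.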

@octavio after change to npm deployment, move this all to Documentation

Configure CI environment variables

You need a source for the FarmOS schemas, which are the base for all the conventions in this repo. There are two different ways to get them.

If you have a FarmOS aggregator and credentials for it, you can retrieve the schemata directly from a farm.
If you cannot or do not want to log in to a FarmOS aggregator instance, or if you want to follow the entity selection from a preexisting source repo (another fork of this json_schema_disribution), you can point to that repo instead.

If you want to publish a validator package, you need to provide your npm token in the .env file as NPM_AUTH_TOKEN.

Configure the sources for FarmOS schemas and conventions

Both should be parametrized in the .env file.

Getting FarmOS schemata from a farm, via FarmOS.js and the farmOS aggregator

It only has two mandatory variables:

FARM_DOMAIN: the url of a valid FarmOS instance known to be up to date with the expected data standards. A model instance.
FARMOS_KEY: The key to your aggregator instance, which needs to have access to the target farm.

Getting FarmOS schemata from a source repo

SOURCE_REPO_PROJECT_ID: The GitLab project id of a git repo hosting a set of conventions in the format we expect here (typically, another fork of this same project). You can find this ID by navigating to the repo: it is shown right below the repo name in the main repo contents view as Project ID: XXXXX.

Configure Package Publication

A third, non-mandatory variable is required if you want to publish your validators as an npm package:

PUBLISHED_PACKAGE_NAME

Publishing will also require you to configure a private CI variable with your NPM credentials for a valid target repo.
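
The variables described in this section can be checked before any script runs. The sketch below is not taken from this repo: the function name `resolveSettings`, its return shape, and the choice to prefer the farm when both sources are configured are assumptions made for illustration. In a real script the argument would typically be `process.env` after loading the .env file.

```javascript
// Decide where FarmOS schemata come from and whether publishing is possible.
// ASSUMPTION: if both sources are configured, the farm takes precedence.
function resolveSettings(env) {
  const hasFarm = Boolean(env.FARM_DOMAIN && env.FARMOS_KEY);
  const hasRepo = Boolean(env.SOURCE_REPO_PROJECT_ID);
  if (!hasFarm && !hasRepo) {
    throw new Error(
      "Set FARM_DOMAIN and FARMOS_KEY, or SOURCE_REPO_PROJECT_ID"
    );
  }
  const source = hasFarm
    ? { kind: "farm", domain: env.FARM_DOMAIN }
    : { kind: "repo", projectId: env.SOURCE_REPO_PROJECT_ID };
  // Publication is optional: it needs both a package name and an npm token.
  const publish = Boolean(env.PUBLISHED_PACKAGE_NAME && env.NPM_AUTH_TOKEN);
  return { source, publish };
}

// "12345" is a placeholder project id.
console.log(resolveSettings({ SOURCE_REPO_PROJECT_ID: "12345" }));
```

Failing early with a single clear message is usually easier to debug than a CI job that stops halfway through schema retrieval.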

Structure of the `output/conventions` folder

For each convention, a folder with the convention name is expected.
Inside the folder, one subdirectory is examples, which should contain two further paths: correct and incorrect. These are used to test whether the schema accepts the entities we want it to accept and rejects the entities we want it to reject. A schema is working properly if it accepts the correct examples and rejects the incorrect examples.
In the root path, a file named schema.json should contain a schema describing our convention. Typically, the convention will have several fields, and each field should have a type/bundle from the parametrized farm, with a name unique to the schema. It is paired with an object.json file, which contains our richer JavaScript object describing the convention. This is ideal if, for example, the user intends to take a preexisting convention and modify it further.
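
The acceptance rule for the correct and incorrect examples can be sketched as a small function. This is an illustration, not the code in ./test/conventions_against_examples.test.js: `evaluateExamples` and `toyValidate` are hypothetical names, and `validate` stands for any function that returns true when its input satisfies the schema (a compiled AJV validator behaves this way).

```javascript
// Check that every correct example is accepted and every incorrect one rejected.
function evaluateExamples(validate, examples) {
  const failures = [];
  examples.correct.forEach((example, index) => {
    if (!validate(example)) failures.push(`correct[${index}] was rejected`);
  });
  examples.incorrect.forEach((example, index) => {
    if (validate(example)) failures.push(`incorrect[${index}] was accepted`);
  });
  return { passed: failures.length === 0, failures };
}

// Toy validator standing in for a real schema: requires a string `name`.
const toyValidate = (entity) => typeof entity.name === "string";

const report = evaluateExamples(toyValidate, {
  correct: [{ name: "tillage" }],
  incorrect: [{ name: 42 }, {}],
});
console.log(report);
```

Collecting all failures instead of stopping at the first one makes it easier to see whether the schema is too strict, too loose, or both.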

Run Initialization Scripts

scripts/getAllSchemas.js will provide you with the source FarmOS entity schemata.
Build all the validators and schema files for the basic entities with scripts/compileAllValidators.js.
Build all conventions using scripts/rebuildCollection.sh (note that this one is a Bash script, not Node). Conventions are obtained from the definitions folder.

If you want to start from scratch, erase the conventions you obtained from your source repo and the output/collection, output/documentation, output/validators folders, write your definitions and run the scripts again.

Where are conventions stored?

All your work should happen inside the definitions folder. You can see how we use our schema_builder library to define conventions in the definitions folder of this repo. More details are available in the documentation for schema_builder. Definitions describe how you intend your conventions to be. The conventions will be compiled based on your files and stored in the output/collection folder in a hierarchical folder structure.

What the repo will do for you

It will download and tidy (de drupalize) all schemas from the selected farm, using FarmOS.js (the script doing that is ./scripts/getAllSchemas.js).
It will test each provided convention against each correct and incorrect example and report whether the results match what was expected (see ./test/conventions_against_examples.test.js).
If you are happy with your changes, your tests are passing and you merge into main, it will publish the new version of the schemas as artifacts and publish validation code for all your schemas and conventions in a uniquely named package (./scripts/compileAllValidators.js, plus ./scripts/transpilation.js to get a browser compatible version of the code, which is also published in the library).
It offers an easy to use Gitpod interface that lets you work with the required libraries with zero configuration and get acquainted with the whole process.

How to define new Conventions

We created a library which helps build a convention in the proposed style using high level commands, and which facilitates testing against examples, documenting, etc. Its functionality is contained in the convention_builder module. See documentation here. A fairly detailed tutorial on how to use it is shown here.

Other Contents of the Repo

Experiments

Experiments are documented in a further document inside the examples folder.
We explored several technical pathways needed to achieve the aims stated above. We also documented our explorations as much as possible, because we believe JSON schema usage needs to be easy and immediate for developers in this space. The tool is simple enough; it just lacks documentation around the kind of usage we envision.

Centralized Publication of Schemas and Conventions

This repo incorporates a full CI/CD workflow that retrieves schemas from a model farm (chosen by the user with a parameter) and publishes each schema in a de drupalized form into folders structured by type and bundle. It also compiles standalone schema validator functions using AJV, so that checking whether a file adheres to a schema is as effortless as possible. The schemas and validators are published to NPM to allow easy retrieval in both backend and frontend applications (via a node connected CDN). We will also offer a model Express endpoint that shares the schemas from any server through a simple query using the type and bundle as parameters.

How is this done

A CI/CD pipeline publishes both the schemas as static files and the validators as an NPM library. Schemas are stored inside the output/schemata/reference_collection archive structure.
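
To make the "structured by type and bundle" layout concrete, here is a minimal sketch of a lookup helper. The exact layout is an assumption: it supposes one folder per type, one folder per bundle, and a schema.json file inside, which may not match the published archive. `parseEntityId` and `schemaPath` are hypothetical names.

```javascript
const REFERENCE_ROOT = "output/schemata/reference_collection";

// Split an identifier such as "log--activity" into its type and bundle.
function parseEntityId(entityId) {
  const [type, bundle, ...rest] = entityId.split("--");
  if (!type || !bundle || rest.length > 0) {
    throw new Error(`Expected "<type>--<bundle>", got "${entityId}"`);
  }
  return { type, bundle };
}

// ASSUMPTION: <root>/<type>/<bundle>/schema.json; check the real archive layout.
function schemaPath(type, bundle) {
  return `${REFERENCE_ROOT}/${type}/${bundle}/schema.json`;
}

const { type, bundle } = parseEntityId("log--activity");
console.log(schemaPath(type, bundle));
```

The same two parameters could drive the query of a model server endpoint, since type and bundle are enough to identify a schema.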

Improving existing schemas

It is important to offer a de drupalized version of the schema, which is the one effectively working in the validation of a payload sent to farmOS, works with JSON schema validation tools and is also easier to read. The de drupalization is performed by FarmOS.js.
We also want to make some currently implicit constraints explicit. An example is the measure field in quantity schemas, which is currently listed as string but is really constrained to a fixed list of possible values.

How we plan to do this

We will store overlays next to the de drupalized schemas, containing the known implicit conventions. These will be applied when building the validator code.
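
As an illustration of the overlay idea, the sketch below merges an overlay into a schema so that the measure field gains an explicit enum. It is an assumption about how overlays might be applied, not the repo's implementation; `applyOverlay` is a hypothetical name and the enum values are placeholders, since the real list of measures is defined by farmOS.

```javascript
// JSON schemas are plain JSON, so a JSON round trip is a safe deep copy.
const clone = (value) => JSON.parse(JSON.stringify(value));
const isPlainObject = (value) =>
  value !== null && typeof value === "object" && !Array.isArray(value);

// Merge an overlay into a schema without mutating either input.
// Plain objects merge recursively; arrays and scalars in the overlay replace.
function applyOverlay(schema, overlay) {
  if (!isPlainObject(schema) || !isPlainObject(overlay)) return clone(overlay);
  const result = clone(schema);
  for (const [key, value] of Object.entries(overlay)) {
    result[key] = applyOverlay(schema[key], value);
  }
  return result;
}

const quantitySchema = {
  type: "object",
  properties: { measure: { type: "string" }, value: { type: "number" } },
};

// Placeholder values only; they do not reflect the actual farmOS list.
const measureOverlay = {
  properties: { measure: { enum: ["placeholder_a", "placeholder_b"] } },
};

const explicitSchema = applyOverlay(quantitySchema, measureOverlay);
console.log(JSON.stringify(explicitSchema.properties.measure));
```

Keeping the overlay in a separate file means the de drupalized schema can be regenerated from the farm at any time without losing the added constraints.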

Development of a Conventions Schema Format

Currently, the FarmOS community and the open Ag Wallet community are working on the concept of conventions as constructs involving many entities and several conditions around how they are filled and interconnected. We believe being able to express these conventions around a schema would be valuable, as even filling one entity can be demanding when working outside the guardrails of the FarmOS web interface. Also, having a strong validation tool would allow conventions to be more precise, strict and specific without demanding more effort from users.

How is this done

We already have proposals for functional convention schemas among the examples, as well as validators for these schemas.

Documentation

Latest version of detailed code documentation.

Examples From SurveyStack

We are working with data submitted primarily via SurveyStack and adapting preexisting scripts in which conventions were mostly procedurally implied. A number of helper functions and structures are added to aid in this task.

Where and How to store Survey Stack Examples

A tutorial showing this process exists in the experiments/usingExamplesImporter.js file. It is heavily commented and intended to be read and run line by line while looking at the effects of the different steps in the code.

Raw json files as obtained directly from the survey API Compose tab in the scripts section should be stored in the definitions/examples/raw folder, using a meaningful name that clearly indicates the name of the convention. If several examples for a unique convention are used, a suffix index can be added. A valid path could look similar to definitions/examples/raw/log--activity--tillage_4.json.
The json files as obtained from SurveyStack (raw) are not ready to be used. We need to shape them as the schema expects. To do this, we recommend loading the file using the function exampleImporter from the library stored in src/schemaUtilities. It will read and sanitize the data into an array of valid entities. The user will need to structure it according to the convention and typically test and fix the convention until it works with the example. The script should have the same name as the example (but with a .js extension, as it is JavaScript code) and simply be stored together with it.
The finished, processed example will be stored in a separate folder, definitions/examples/processed, with a meaningful name. This is the file that should be called in the definition for a convention.
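
The naming rules above can be summarized with a small helper that derives the related paths from a raw example file name. This is only a sketch: `examplePaths` is a hypothetical function, and reusing the raw file name for the processed example is an assumption, since the text only asks for a meaningful name.

```javascript
const RAW_DIR = "definitions/examples/raw";
const PROCESSED_DIR = "definitions/examples/processed";

// Derive the related paths for a raw example file name such as
// "log--activity--tillage_4.json" (convention name plus optional index).
function examplePaths(rawFileName) {
  const match = /^(.+?)(?:_(\d+))?\.json$/.exec(rawFileName);
  if (!match) throw new Error(`Not a JSON example file: ${rawFileName}`);
  const [, convention, index] = match;
  const base = rawFileName.slice(0, -".json".length);
  return {
    convention,
    index: index === undefined ? null : Number(index),
    raw: `${RAW_DIR}/${rawFileName}`,
    // The shaping script sits next to the raw example, with a .js extension.
    script: `${RAW_DIR}/${base}.js`,
    // ASSUMPTION: the processed example keeps the raw file name.
    processed: `${PROCESSED_DIR}/${rawFileName}`,
  };
}

console.log(examplePaths("log--activity--tillage_4.json"));
```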
