json_schema_experiments
v1.7.0
Some experiments to understand json schema features and how to better encode the data we need to deal with.
Key Links
The following artifacts are generated automatically by GitLab's CI pipeline when you finish creating your Collection. Links for the `staging` (development) and `main` (currently deployed) branches are shown.
- Input Collection
  - (`main`) Collection Schemas - Final published collection of conventions and overlays
  - (`staging`) Collection Schemas - Final published collection of conventions and overlays
- Output Collection
  - (`main`) Collection Schemas - Final published collection of conventions and overlays
  - (`main`) Collection Documentation - Human readable documentation for the Input Collection
  - (`staging`) Collection Schemas - Final published collection of conventions and overlays
  - (`staging`) Collection Documentation - Human readable documentation for the Input Collection
- Validation - Both Input and Output Collections can be validated in a single NPM package here
Description
This repo centralizes several technical experiments OurSci and OpenTeam are performing to use JSON schemas to advance the Ag Data Wallet concept. Our aim is to make it easy to:
- create high quality schemas for an Ag Data Wallet
- create new Collections of schemas which are compatible with existing schemas
- publish Collections of schemas in human and machine readable formats
How to use it
@octavio when we've completed the change to npm deployment... we can update this list, but basically:
- Start a new Git repo (`git init`).
- Start an NPM module in that repo (`npm init`).
- Install the two involved libraries (`npm install conventionbuilder conventionschemapublisher`).
- Initialize the repo by calling the conventionschemapublisher helper functions, in the same way projects are currently initialized in React, Vue or Docusaurus. This provides all the basic configuration, structure and files.
- Set up the required CI variables (repo name, npm publication credentials, convention set name, etc.). Of particular interest is the source of the basic FarmOS schemata, which can be either a farm or another repo.
- (optional) If you want to inherit conventions from another repo, list them in a simple JSON or YAML config file. For each repo, you can request all conventions, all conventions of several types (like "everything starting with 'log--activity' and everything starting with 'asset'"), or cherry-pick individual conventions.
- Write definitions as explained in our tutorial. Committing them into either `staging` or `main` will trigger the CI process, which tests that the examples are adequate, publishes the validation code, publishes the conventions and publishes the wiki containing all documentation. The links will also be clearly displayed for the user.
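The optional inheritance step could be sketched roughly as follows. The config format is not specified above, so the rule shape (`all`, `prefixes`, `names`), the function name `selectConventions` and most of the convention names are hypothetical, chosen only to illustrate the three selection modes (everything, by prefix, cherry-picked).

```javascript
// Hypothetical selection rule for one source repo (not the real config format):
//   { all: true }                      -> every convention
//   { prefixes: [...], names: [...] }  -> by name prefix and/or exact name
function selectConventions(available, rule) {
  if (rule.all) return [...available];
  const prefixes = rule.prefixes ?? [];
  const names = new Set(rule.names ?? []);
  // Keep a convention if it is cherry-picked by name or matches any prefix.
  return available.filter(
    (name) => names.has(name) || prefixes.some((prefix) => name.startsWith(prefix))
  );
}

// Illustrative convention names; only the naming pattern follows this README.
const available = [
  "log--activity--tillage",
  "log--activity--mowing",
  "log--input--irrigation",
  "asset--plant--planting",
];

// "Everything starting with 'log--activity' and everything starting with 'asset'".
const picked = selectConventions(available, { prefixes: ["log--activity", "asset"] });
console.log(picked);
```

Keeping the rule as plain data means the same selection could be written in either JSON or YAML, as the step above suggests.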
@octavio after change to npm deployment, move this all to Documentation
Configure CI environment variables
You need a source for the FarmOS schemas, which are the base for all the conventions in this repo. There are two different ways to get them.
- If you have a FarmOS aggregator and credentials for it, you can retrieve the schemata directly from a farm.
- If you can't or don't want to log in to a FarmOS aggregator instance, or if you want to follow the entity selection from a preexisting source repo (another fork of this json_schema_distribution), you can point to it.
- Separately, if you want to publish a validator package, you need to store your npm token in the `.env` file as `NPM_AUTH_TOKEN`.
Configure the sources for FarmOS schemas and conventions
Both should be parametrized in the `.env` file.
Getting FarmOS schemata from a farm, via FarmOS.js and the farmOS aggregator
This option has only two mandatory variables:
- `FARM_DOMAIN`: the URL of a valid FarmOS instance known to be up to date with the expected data standards (a model instance).
- `FARMOS_KEY`: the key to your aggregator instance, which needs to have access to the target farm.
Getting FarmOS schemata from a source repo
- `SOURCE_REPO_PROJECT_ID`: the GitLab project ID of a Git repo hosting a set of `conventions` in the format we expect here (typically, another fork of this same project). You can find this ID by navigating to the repo; it is shown right below the repo name in the main repo contents view as `Project ID: XXXXX`.
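To make the two options concrete, here is a minimal sketch of how a script might decide which source to use from the environment. The variable names come from this README; the function `resolveSchemaSource`, the shape of its return value and the rule that the farm takes precedence when both are set are assumptions made for illustration, not the repo's actual behavior.

```javascript
// Decide where FarmOS schemata come from, based on the variables described above.
// Assumption: if both sources are configured, the farm is preferred.
function resolveSchemaSource(env) {
  if (env.FARM_DOMAIN && env.FARMOS_KEY) {
    return { kind: "farm", domain: env.FARM_DOMAIN, key: env.FARMOS_KEY };
  }
  if (env.SOURCE_REPO_PROJECT_ID) {
    return { kind: "repo", projectId: env.SOURCE_REPO_PROJECT_ID };
  }
  throw new Error("Set FARM_DOMAIN and FARMOS_KEY, or SOURCE_REPO_PROJECT_ID");
}

// Example with a placeholder project ID; in a real script you would pass process.env.
console.log(resolveSchemaSource({ SOURCE_REPO_PROJECT_ID: "12345" }).kind);
```

Failing early with a clear message is a deliberate choice here, since a missing variable would otherwise only surface later as a failed download.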
Configure Package Publication
A third variable, `PUBLISHED_PACKAGE_NAME`, is only needed if you want to publish your validators as an npm package.
Publishing also requires you to configure a private CI variable with your NPM credentials for a valid target repo.
Structure of the `output/conventions` folder
- For each convention, a folder with the convention name is expected.
- Inside the folder, one subdirectory is `examples`, which should contain two further paths, `correct` and `incorrect`. These are used to test whether the schema accepts the entities we want it to accept and rejects the entities we want it to reject. A schema is working properly if it accepts the correct examples and rejects the incorrect examples.
- In the root path of the folder, a file named `schema.json` should contain a schema describing the convention. Typically, the convention will have several fields, and each field should have a type/bundle from the parametrized farm, with a name unique to the schema. It is paired with an `object.json` file, which contains our richer JavaScript object describing the convention. This is ideal if, for example, the user intends to take a preexisting convention and modify it further.
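The layout described above can be checked with a few lines of Node. This is only a sketch: `missingConventionParts` is a hypothetical helper, not part of the repo, and the convention folder name in the usage line is illustrative.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Return the expected files/folders that are missing from one convention folder.
function missingConventionParts(conventionDir) {
  const expected = [
    "schema.json",
    "object.json",
    path.join("examples", "correct"),
    path.join("examples", "incorrect"),
  ];
  return expected.filter((rel) => !fs.existsSync(path.join(conventionDir, rel)));
}

// An empty array means the folder has everything the list above asks for.
console.log(missingConventionParts("output/conventions/log--activity--tillage"));
```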
Run Initialization Scripts
- `scripts/getAllSchemas.js` will provide you with the source FarmOS entity schemata.
- Build all the validators and schema files for the basic entities with `scripts/compileAllValidators.js`.
- Build all conventions using `scripts/rebuildCollection.sh` (attention: this one is a Bash script, not Node). Conventions are obtained from the `definitions` folder.
- If you want to start from scratch, erase the conventions you obtained from your source repo and the `output/collection`, `output/documentation` and `output/validators` folders, write your definitions and run the scripts again.
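If you prefer to run the three scripts in one go, a small wrapper along these lines may help. It is a sketch, not something shipped with the repo: `INIT_STEPS` and `runInitialization` are hypothetical names, and it assumes the two `.js` scripts are started with `node` from the repo root and that `bash` is available on the PATH.

```javascript
const { spawnSync } = require("node:child_process");

// The scripts named above, in the order they are listed.
const INIT_STEPS = [
  ["node", ["scripts/getAllSchemas.js"]],
  ["node", ["scripts/compileAllValidators.js"]],
  ["bash", ["scripts/rebuildCollection.sh"]],
];

// Run each step and stop at the first failure. The runner is injectable so the
// sequence can be exercised without actually spawning processes.
function runInitialization(
  run = (cmd, args) => spawnSync(cmd, args, { stdio: "inherit" })
) {
  for (const [cmd, args] of INIT_STEPS) {
    const { status } = run(cmd, args);
    // status is null when the process could not start or was killed by a signal.
    if (status !== 0) {
      throw new Error(`${cmd} ${args.join(" ")} exited with status ${status}`);
    }
  }
}
```

Calling `runInitialization()` with no arguments from the repo root would execute the real scripts.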
Where are conventions stored?
All your work should happen inside the `definitions` folder. You can see how we use our `schema_builder` library to define conventions in the `definitions` folder of this repo. More details are in the documentation for `schema_builder`.
Definitions describe how you intend your conventions to be. The conventions will be compiled based on your files and stored in the output/collection folder in a hierarchical folder structure.
What the repo will do for you
- It will download and tidy (de-drupalize) all schemas from the selected farm, using FarmOS.js (the script doing that is `./scripts/getAllSchemas.js`).
- It will test each provided convention against each correct and incorrect example and report whether the results matched what was expected (see `./test/conventions_against_examples.test.js`).
- If you like all the changes you've made, your tests are passing and you want to merge into `main`, it will publish the new version of the schemas as artifacts, and it will publish validation code for all your schemas and conventions in a uniquely named package (`./scripts/compileAllValidators.js`, plus `./scripts/transpilation.js` to get a browser-compatible version of the code, also published in the library).
- It offers an easy to use Gitpod interface that lets you work with the needed libraries with zero configuration and get acquainted with the whole process.
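The idea behind the example tests can be sketched as below. This is not the code in `./test/conventions_against_examples.test.js`; `checkExamples` is a hypothetical helper, and the validator is a hand-written stand-in for a compiled schema validator (any function that returns `true` or `false` for an entity would fit).

```javascript
// Report every example whose validation result differs from what was expected.
function checkExamples(validate, examples) {
  const failures = [];
  for (const example of examples.correct) {
    if (!validate(example)) failures.push({ example, expected: "accept" });
  }
  for (const example of examples.incorrect) {
    if (validate(example)) failures.push({ example, expected: "reject" });
  }
  return failures;
}

// Stand-in validator: accepts entities with a non-empty string `name`.
const validate = (entity) => typeof entity.name === "string" && entity.name.length > 0;

const failures = checkExamples(validate, {
  correct: [{ name: "tillage" }],
  incorrect: [{ name: "" }, {}],
});
console.log(failures.length === 0 ? "schema behaves as expected" : failures);
```

Checking the incorrect examples matters as much as the correct ones: a schema that accepts everything would pass a test that only used valid entities.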
How to define new Conventions
We created a library that helps build a convention in the proposed style using high level commands, and that facilitates testing against examples, documenting, etc. Its functionality is contained in the `convention_builder` module. See the documentation here. A fairly detailed tutorial on how to use it is shown here.
Other Contents of the Repo
Experiments
Experiments are documented in a further document inside the examples folder.
We explored several technical pathways needed to achieve the aims stated above. We also documented our explorations as much as possible, because we believe there is a need to make JSON schema usage easy and immediate for developers in this space. The tool is simple enough; it just lacks documentation around the kind of usage we envision.
Centralized Publication of Schemas and Conventions
This repo incorporates a full CI/CD workflow that retrieves schemas from a model farm (chosen by the user with a parameter) and publishes each schema in a de-drupalized form into folders structured by type and bundle. It also compiles standalone schema validator functions using AJV, making it as effortless as possible to check whether a file adheres to a schema. The schemas and validators are published to NPM to allow easy retrieval in both backend and frontend applications (via a node connected CDN). We will also offer a model Express endpoint so the schemas can be shared via any server with a simple query using the type and bundle as parameters.
How is this done
A CI/CD pipeline publishes both the schemas as static files and the validators as an NPM library. Schemas are stored inside the `output/schemata/reference_collection` folder structure.
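Looking up a schema by type and bundle could then be as simple as building a path. The sketch below assumes a `<type>/<bundle>/schema.json` layout under the reference collection; the file name, the allowed-name pattern and the function `schemaPathFor` are assumptions for illustration, so check the published folder before relying on them.

```javascript
const path = require("node:path");

const SCHEMA_ROOT = path.join("output", "schemata", "reference_collection");
// Assumed naming rule: lowercase letters, digits and underscores only.
const SAFE_NAME = /^[a-z0-9_]+$/;

// Build the path of the schema for a given type and bundle (assumed layout).
function schemaPathFor(type, bundle, root = SCHEMA_ROOT) {
  for (const part of [type, bundle]) {
    // Reject anything that could escape the schema folder, such as "../".
    if (!SAFE_NAME.test(part)) throw new Error(`Invalid type or bundle: ${part}`);
  }
  return path.join(root, type, bundle, "schema.json");
}

console.log(schemaPathFor("log", "activity"));
```

Validating the two parameters before joining them is worth keeping if this ever backs a server endpoint, because the values would then come from user input.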
Improving existing schemas
- It is important to offer a de-drupalized version of the schema. This is the version that effectively validates a payload sent to farmOS, works with JSON schema validation tools and is also easier to read. The de-drupalization is performed by `FarmOS.js`.
- We also want to make some currently implicit constraints explicit. An example is the `measure` field in `quantity` schemas, which is currently listed as `string` but is really constrained to a fixed list of possible values.
How we plan to do this
We will store overlays next to the de drupalized schemas, containing the known implicit conventions. These will be applied when building the validator code.
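As a rough illustration of the overlay idea, the sketch below merges an overlay into a schema so that the `measure` field gains an explicit `enum`. Everything here is an assumption made for the example: `applyOverlay` is a hypothetical helper, the schema is heavily simplified, and the three enum values are placeholders rather than the authoritative farmOS list.

```javascript
const isPlainObject = (value) =>
  value !== null && typeof value === "object" && !Array.isArray(value);

// Recursively merge an overlay into a schema without mutating either input.
// Nested objects are merged; any other overlay value replaces the original.
function applyOverlay(schema, overlay) {
  const result = { ...schema };
  for (const [key, value] of Object.entries(overlay)) {
    result[key] =
      isPlainObject(value) && isPlainObject(schema[key])
        ? applyOverlay(schema[key], value)
        : value;
  }
  return result;
}

// Simplified stand-in for a de-drupalized quantity schema.
const quantitySchema = {
  type: "object",
  properties: {
    attributes: {
      type: "object",
      properties: { measure: { type: "string" }, label: { type: "string" } },
    },
  },
};

// Overlay making the implicit constraint explicit (placeholder values).
const measureOverlay = {
  properties: {
    attributes: {
      properties: { measure: { enum: ["count", "weight", "volume"] } },
    },
  },
};

const explicitSchema = applyOverlay(quantitySchema, measureOverlay);
console.log(JSON.stringify(explicitSchema.properties.attributes.properties.measure));
// prints {"type":"string","enum":["count","weight","volume"]}
```

Keeping the overlay as a separate document means the downloaded schema stays untouched and can be refreshed from the farm without losing the added constraints.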
Development of a Conventions Schema Format
Currently, the FarmOS community and the open Ag Wallet community are working on the concept of `conventions` as constructs involving many `entities` and several conditions around how they are filled and interconnected.
We believe being able to express these conventions around a `schema` would be fairly valuable, as even filling one entity can be demanding when working outside the guardrails of the FarmOS web interface. Also, having a strong validation tool would allow `conventions` to be more precise, strict and specific without demanding more effort from users.
How is this done
We already have proposals for functional convention schemas among the examples, as well as validators for these schemas.
Documentation
Examples From SurveyStack
We are working with data submitted primarily via SurveyStack and adapting preexisting scripts in which conventions were mostly procedurally implied. A number of helper functions and structures have been added to aid in this task.
Where and How to store Survey Stack Examples
- A tutorial showing this process exists in the `experiments/usingExamplesImporter.js` file. It is heavily commented and intended to be read and run line by line while looking at the effects of the different steps in the code.
- Raw JSON files, as obtained directly from the survey `API Compose` tab in the scripts section, should be stored in the `definitions/examples/raw` folder, using a meaningful name that clearly indicates the name of the convention. If several examples are used for a single convention, a suffix index can be added. A valid path could look similar to `definitions/examples/raw/log--activity--tillage_4.json`.
- The JSON files as obtained from SurveyStack (raw) are not ready to be used. We need to shape them as the schema expects. To do this, we recommend loading the file using the function `exampleImporter` from the library stored in `src/schemaUtilities`. It will read and sanitize the data into an array of valid entities. The user will need to structure it according to the convention and typically test and fix the convention until it works with the example. The script should have the same name as the example (but with a `.js` extension, as it is JavaScript code) and simply be stored together with it.
- The finished, processed example will be stored in a separate folder, `definitions/examples/processed`, with a meaningful name. This is the file that should be called in the definition for a convention.
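The naming rules above can be summarized in a short helper. This is a sketch: `describeRawExample` is a hypothetical function, and it assumes the processed file reuses the raw file's base name, which the text above does not require (it only asks for a meaningful name).

```javascript
const path = require("node:path");

// Derive the convention name, optional suffix index and related file paths
// from the path of a raw SurveyStack example.
function describeRawExample(rawPath) {
  const { dir, name } = path.posix.parse(rawPath);
  // Split an optional trailing "_<digits>" index off the file name.
  const match = /^(.*?)(?:_(\d+))?$/.exec(name);
  return {
    convention: match[1],
    index: match[2] === undefined ? null : Number(match[2]),
    // The shaping script sits next to the raw example, with a .js extension.
    scriptPath: path.posix.join(dir, `${name}.js`),
    // Assumption: the processed example keeps the same base name.
    processedPath: path.posix.join(path.posix.dirname(dir), "processed", `${name}.json`),
  };
}

console.log(describeRawExample("definitions/examples/raw/log--activity--tillage_4.json"));
```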