@entryscape/csvw-js
v0.1.1
The library csvw-js provides tools for RDF generation and validation of CSV files against schemas according to the W3C specifications.
csvw model
This library provides tools for RDF generation and validation of CSV files against schemas according to the W3C specifications.
Installing
npm install @entryscape/csvw-js
Usage
The following two commands, one for validation and one for RDF generation, can be executed from the command line:
validatecsv csv schema rows
Or if the library has not been installed globally:
node bin/validateCSV.js csv schema rows
This command validates the given CSV data against the given schema. The parameter "csv" accepts a file path to the CSV data, and the parameter "schema" accepts a file path to a schema file. The parameter "rows" is optional and accepts a number that specifies the maximum number of CSV rows to validate. It also accepts "undefined", meaning unlimited, which is the default value. The command returns a report table listing any invalids and/or warnings that were found.
An example report table is shown in the Examples section.
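If the validation needs to be triggered from a Node.js script rather than typed by hand, one option is to spawn the command as a child process. The sketch below is only an illustration: it assumes that validatecsv is available on the PATH and takes its arguments positionally in the order described above. The helper names buildValidateArgs and runValidate are hypothetical and are not part of the library.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: builds the positional arguments in the documented
// order (csv, schema, rows). "rows" is left out when undefined, which the
// README describes as meaning "unlimited".
function buildValidateArgs(csvPath, schemaPath, rows) {
  const args = [csvPath, schemaPath];
  if (rows !== undefined) {
    if (!Number.isInteger(rows) || rows < 1) {
      throw new RangeError('rows must be a positive integer or undefined');
    }
    args.push(String(rows));
  }
  return args;
}

// Hypothetical helper: runs validatecsv and resolves with its standard output.
// How the command signals invalid data (exit code, stderr) is not documented
// here, so any process error is simply passed on to the caller.
function runValidate(csvPath, schemaPath, rows) {
  return new Promise((resolve, reject) => {
    const args = buildValidateArgs(csvPath, schemaPath, rows);
    execFile('validatecsv', args, (err, stdout, stderr) => {
      if (err) {
        err.stderr = stderr;
        reject(err);
        return;
      }
      resolve(stdout);
    });
  });
}
```

A call such as runValidate('./data.csv', './schema.json', 100) would then limit validation to the first 100 rows, provided the command behaves as described above.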
generateRDF csv schema options
This command generates RDF from the given CSV data and the given schema. The parameter "csv" accepts a file path to the CSV data, and the parameter "schema" accepts a file path to a schema file. The parameter "options" is optional and accepts an object with "dontIgnoreInvalidRows" and "dontIgnoreWarningRows". The default value for both is false. generateRDF returns only a report table if any row is invalid and "dontIgnoreInvalidRows" is true, or if any row has warnings and "dontIgnoreWarningRows" is true.
The command returns an array with three elements: rdfxml, graph and report. rdfxml is the generated RDF in the form of RDF/XML, and graph is the generated RDF as a graph. The command also validates the data in the same way as the "validatecsv" command does, so the third element, report, is a report table identical to the one described earlier.
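The rule for when only a report table comes back can be written down as a small predicate. This is a sketch of the rule as described above, not the library's own code: the function name returnsReportOnly and the report shape (an object with invalids and warnings arrays) are assumptions made for illustration.

```javascript
// Hypothetical report shape: { invalids: [...], warnings: [...] }.
// Both options default to false, as stated in the description above.
function returnsReportOnly(report, options = {}) {
  const {
    dontIgnoreInvalidRows = false,
    dontIgnoreWarningRows = false,
  } = options;

  const hasInvalids = report.invalids.length > 0;
  const hasWarnings = report.warnings.length > 0;

  // Only a report is returned when a problem exists AND the matching
  // "dontIgnore..." option is switched on.
  return (dontIgnoreInvalidRows && hasInvalids) ||
         (dontIgnoreWarningRows && hasWarnings);
}
```

With the default options the predicate is always false, which matches the description: invalid rows and rows with warnings are ignored and RDF is still generated.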
In order to view all possible parameters for a command, type:
validatecsv -h
or
generateRDF -h
Examples
A report table may look like this:
=== Invalids ===
┌─────────┬────────────┬─────────────────────────────────┬───────────┬────────┐
│ (index) │   Source   │             Message             │    Row    │ Column │
├─────────┼────────────┼─────────────────────────────────┼───────────┼────────┤
│    0    │ 'Datatype' │ 'length not equal to 5, got 10' │     1     │ 'date' │
└─────────┴────────────┴─────────────────────────────────┴───────────┴────────┘
=== Warnings ===
┌─────────┐
│ (index) │
├─────────┤
└─────────┘
A validation command may look like this:
validatecsv ./test/skoldata-test.csv ./test/skoldata-test.json
An RDF-generation-command may look like this:
generateRDF ./example.csv ./example.json --dontIgnoreInvalidRows
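The contents of ./example.csv and ./example.json are not shown here. As a rough sketch, assuming the schema file is a W3C CSVW table metadata document in JSON, a small pair of input files could be created with the script below. The column names, the sample rows and the helper writeExampleFiles are all made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical CSV data: one header row and two data rows.
const csv = 'id,name,date\n1,Alice,2021-03-01\n2,Bob,2021-03-02\n';

// Hypothetical CSVW table metadata describing the three columns.
const schema = {
  '@context': 'http://www.w3.org/ns/csvw',
  url: 'example.csv',
  tableSchema: {
    columns: [
      { name: 'id', titles: 'id', datatype: 'integer', required: true },
      { name: 'name', titles: 'name', datatype: 'string' },
      { name: 'date', titles: 'date', datatype: 'date' },
    ],
  },
};

// Writes example.csv and example.json into the given directory and
// returns the two file paths.
function writeExampleFiles(dir) {
  const csvPath = path.join(dir, 'example.csv');
  const schemaPath = path.join(dir, 'example.json');
  fs.writeFileSync(csvPath, csv, 'utf8');
  fs.writeFileSync(schemaPath, JSON.stringify(schema, null, 2), 'utf8');
  return { csvPath, schemaPath };
}

// Write the files to a fresh temporary directory and print their paths.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'csvw-example-'));
const files = writeExampleFiles(dir);
console.log(files.csvPath);
console.log(files.schemaPath);
```

The two printed paths can then be passed as the "csv" and "schema" arguments of either command.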
Future improvements
Here are some potential improvements to the library:
- pass more of the W3C tests
- implement the ability to generate RDF from multiple different CSV files
- improve flexibility within RDF generation, for example an option to switch case sensitivity on and off for specific columns
- support RDF generation for relative URIs
- support character encodings other than UTF-8
- implement browser support
Testing the library
Tests are downloaded from the W3C CSVW repository on GitHub via the following command:
yarn synctests
After the tests are downloaded they can be run by executing the following command:
yarn test
Note that at the time of writing, 87 tests fail while 301 tests pass.