rdf-parse
v4.0.0
Parses RDF from any serialization
RDF Parse
This library parses RDF streams based on content type (or file name) and outputs RDF/JS-compliant quads as a stream.
This is useful in situations where you have RDF in some serialization, and you just need the parsed triples/quads, without having to concern yourself with picking the correct parser.
The following RDF serializations are supported:
| Name | Content type | Extensions |
| -------- | ---------------- | ------------- |
| TriG | application/trig | .trig |
| N-Quads | application/n-quads | .nq, .nquads |
| Turtle | text/turtle | .ttl, .turtle |
| N-Triples | application/n-triples | .nt, .ntriples |
| Notation3 | text/n3 | .n3 |
| JSON-LD | application/ld+json, application/json | .json, .jsonld |
| RDF/XML | application/rdf+xml | .rdf, .rdfxml, .owl |
| RDFa and script RDF data tags HTML/XHTML | text/html, application/xhtml+xml | .html, .htm, .xhtml, .xht |
| Microdata | text/html, application/xhtml+xml | .html, .htm, .xhtml, .xht |
| RDFa in SVG/XML | image/svg+xml, application/xml | .xml, .svg, .svgz |
| SHACL Compact Syntax | text/shaclc | .shaclc, .shc |
| Extended SHACL Compact Syntax | text/shaclc-ext | .shaclce, .shce |
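The table above can be read as a lookup from file extension to content type. The following is a minimal sketch of such a lookup, transcribed from the table; it is not how rdf-parse resolves extensions internally. The names `CONTENT_TYPE_BY_EXTENSION` and `guessContentType` are hypothetical, and where a row lists several content types the sketch simply takes the first one.

```javascript
// Extension-to-content-type lookup transcribed from the table above.
// Illustration only: NOT rdf-parse's internal implementation.
// Simplification: rows with several content types use the first one listed.
const CONTENT_TYPE_BY_EXTENSION = new Map(Object.entries({
  trig: 'application/trig',
  nq: 'application/n-quads',
  nquads: 'application/n-quads',
  ttl: 'text/turtle',
  turtle: 'text/turtle',
  nt: 'application/n-triples',
  ntriples: 'application/n-triples',
  n3: 'text/n3',
  json: 'application/ld+json',
  jsonld: 'application/ld+json',
  rdf: 'application/rdf+xml',
  rdfxml: 'application/rdf+xml',
  owl: 'application/rdf+xml',
  html: 'text/html',
  htm: 'text/html',
  xhtml: 'text/html',
  xht: 'text/html',
  xml: 'image/svg+xml',
  svg: 'image/svg+xml',
  svgz: 'image/svg+xml',
  shaclc: 'text/shaclc',
  shc: 'text/shaclc',
  shaclce: 'text/shaclc-ext',
  shce: 'text/shaclc-ext',
}));

// Hypothetical helper: returns a content type for a path or URL,
// or undefined when the extension is missing or not in the table.
function guessContentType(pathOrUrl) {
  // Drop any query string or fragment before looking at the file name.
  const clean = pathOrUrl.split(/[?#]/)[0];
  const lastSegment = clean.slice(clean.lastIndexOf('/') + 1);
  const dot = lastSegment.lastIndexOf('.');
  if (dot === -1) return undefined;
  return CONTENT_TYPE_BY_EXTENSION.get(lastSegment.slice(dot + 1).toLowerCase());
}

console.log(guessContentType('http://example.org/myfile.ttl'));
```

Note that HTML documents may carry either RDFa or Microdata, so an extension alone cannot tell those two apart; the table lists the same content types and extensions for both.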
Internally, this library uses RDF parsers from the Comunica framework, which enable streaming processing of RDF.
These wrap the following fully spec-compliant parsers:
- N3.js
- jsonld-streaming-parser.js
- microdata-rdf-streaming-parser.js
- rdfa-streaming-parser.js
- rdfxml-streaming-parser.js
- shaclcjs
Installation
$ npm install rdf-parse
or
$ yarn add rdf-parse
This package also works out-of-the-box in browsers via tools such as webpack and browserify.
Require
import { rdfParser } from "rdf-parse";
or
const { rdfParser } = require("rdf-parse");
Usage
Parsing by content type
The rdfParser.parse method takes in a text stream containing RDF in any serialization and an options object, and outputs an RDF/JS stream that emits RDF quads.
const textStream = require('streamify-string')(`
<http://ex.org/s> <http://ex.org/p> <http://ex.org/o1>, <http://ex.org/o2>.
`);
rdfParser.parse(textStream, { contentType: 'text/turtle', baseIRI: 'http://example.org' })
.on('data', (quad) => console.log(quad))
.on('error', (error) => console.error(error))
.on('end', () => console.log('All done!'));
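For small documents it can be convenient to gather all quads into an array instead of handling them one at a time. The sketch below shows one way to wrap the 'data', 'error' and 'end' events in a promise. The helper name `collect` is hypothetical and not part of rdf-parse, and a plain Node.js stream stands in for the parser output so the snippet runs without any dependency.

```javascript
const { Readable } = require('stream');

// Hypothetical helper (not part of rdf-parse): gathers every 'data' item
// of a stream into an array and resolves once the stream ends.
function collect(stream) {
  return new Promise((resolve, reject) => {
    const items = [];
    stream
      .on('data', (item) => items.push(item))
      .on('error', reject)
      .on('end', () => resolve(items));
  });
}

// Stand-in for the stream returned by rdfParser.parse(...).
const standInStream = Readable.from(['quad 1', 'quad 2']);
collect(standInStream).then((items) => console.log(items.length));
```

With the library itself, the call would look like `const quads = await collect(rdfParser.parse(textStream, { contentType: 'text/turtle' }));`. Buffering everything gives up the memory benefits of streaming, so this pattern is best kept for documents that comfortably fit in memory.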
Parsing by file name
Sometimes the content type of an RDF document is unknown. For those cases, this library lets you provide the path or URL of the RDF document, and the serialization is determined from its file extension.
For example, Turtle documents can be detected using the .ttl extension.
const textStream = require('streamify-string')(`
<http://ex.org/s> <http://ex.org/p> <http://ex.org/o1>, <http://ex.org/o2>.
`);
rdfParser.parse(textStream, { path: 'http://example.org/myfile.ttl', baseIRI: 'http://example.org' })
.on('data', (quad) => console.log(quad))
.on('error', (error) => console.error(error))
.on('end', () => console.log('All done!'));
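When fetching documents over HTTP, the two approaches can be combined: use the Content-Type header when the server sends one, and fall back to the URL otherwise. The following is a rough sketch of that decision. `parseOptionsFor` is a hypothetical helper, using the document URL as baseIRI is a choice made here, and stripping parameters such as `charset` from the header is a cautious assumption rather than a documented requirement of rdf-parse.

```javascript
// Hypothetical helper (not part of rdf-parse): builds an options object
// for rdfParser.parse from a document URL and an optional Content-Type header.
function parseOptionsFor(url, contentTypeHeader) {
  // Assumption: the document URL is a sensible base IRI for relative IRIs.
  const options = { baseIRI: url };
  // Keep only the media type, dropping parameters such as "; charset=utf-8".
  const mediaType = (contentTypeHeader || '').split(';')[0].trim().toLowerCase();
  if (mediaType) {
    options.contentType = mediaType;
  } else {
    // No usable header: let the file extension in the URL decide.
    options.path = url;
  }
  return options;
}

console.log(parseOptionsFor('http://example.org/myfile.ttl', 'text/turtle; charset=utf-8'));
console.log(parseOptionsFor('http://example.org/myfile.ttl'));
```

The returned object can then be passed as the second argument of `rdfParser.parse`.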
Getting all known content types
With rdfParser.getContentTypes(), you can retrieve a list of all content types for which a parser is available.
Note that this method returns a promise that can be await-ed.
rdfParser.getContentTypesPrioritized() returns an object instead, with content types as keys, and numerical priorities as values.
// An array of content types
console.log(await rdfParser.getContentTypes());
// An object of prioritized content types
console.log(await rdfParser.getContentTypesPrioritized());
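One possible use of the prioritized object is building an HTTP Accept header for content negotiation. The sketch below is an illustration under assumptions: `toAcceptHeader` is a hypothetical helper, the sample priorities are invented for the example, and the code assumes priorities fall between 0 and 1 (values outside that range are clamped), which the text above does not state.

```javascript
// Hypothetical helper (not part of rdf-parse): turns an object of
// { contentType: priority } into an HTTP Accept header value.
function toAcceptHeader(prioritized) {
  return Object.entries(prioritized)
    // Highest priority first.
    .sort(([, a], [, b]) => b - a)
    .map(([type, priority]) => {
      // Assumption: priorities are meant to lie in [0, 1]; clamp to be safe.
      const q = Math.min(1, Math.max(0, priority));
      // HTTP q-values allow at most three decimals; q=1 is the default.
      return q === 1 ? type : `${type};q=${Number(q.toFixed(3))}`;
    })
    .join(', ');
}

// Invented sample input; real values would come from
// `await rdfParser.getContentTypesPrioritized()`.
const samplePriorities = {
  'text/turtle': 0.6,
  'application/ld+json': 1,
  'text/html': 0.2,
};
console.log(toAcceptHeader(samplePriorities));
```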
Obtaining prefixes
Using the 'prefix' event, you can obtain the prefixes that were available when parsing from documents in formats such as Turtle and TriG.
rdfParser.parse(textStream, { contentType: 'text/turtle' })
.on('prefix', (prefix, iri) => console.log(prefix + ':' + iri));
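Collected prefixes are handy for printing IRIs in a shorter form. The sketch below records 'prefix' events in a Map and uses it to compact an IRI. `trackPrefixes` and `compact` are hypothetical helpers, and an EventEmitter stands in for the parser output so the snippet runs on its own. Whether the second event argument arrives as a string or as an RDF/JS term is not stated above, so the code accepts both as a precaution.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper (not part of rdf-parse): records every 'prefix'
// event in a Map that fills up as parsing proceeds.
function trackPrefixes(stream) {
  const prefixes = new Map();
  stream.on('prefix', (prefix, iri) => {
    // Accept either a plain string or a term object with a `value` property.
    prefixes.set(prefix, typeof iri === 'string' ? iri : iri.value);
  });
  return prefixes;
}

// Hypothetical helper: shortens an IRI using the longest matching namespace,
// or returns the IRI unchanged when no namespace matches.
function compact(iri, prefixes) {
  let best;
  for (const [prefix, namespace] of prefixes) {
    if (iri.startsWith(namespace) && (!best || namespace.length > best[1].length)) {
      best = [prefix, namespace];
    }
  }
  return best ? `${best[0]}:${iri.slice(best[1].length)}` : iri;
}

// Stand-in for the stream returned by rdfParser.parse(...).
const standInParser = new EventEmitter();
const knownPrefixes = trackPrefixes(standInParser);
standInParser.emit('prefix', 'ex', 'http://ex.org/');
console.log(compact('http://ex.org/s', knownPrefixes));
```

Because prefixes are emitted while the document is being parsed, the Map is only complete once the 'end' event has fired.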
Obtaining contexts
Using the 'context' event, you can obtain all contexts (@context) when parsing JSON-LD documents.
Multiple contexts can be found, and the context values that are emitted correspond exactly to the context value as included in the JSON-LD document.
rdfParser.parse(textStream, { contentType: 'application/ld+json' })
.on('context', (context) => console.log(context));
License
This software is written by Ruben Taelman.
This code is released under the MIT license.