urns
v0.6.0
An RFC 8141 compliant URN library with some interesting type-related functionality
Installation
You can install this library with:
$ yarn add urns
The library includes TypeScript types.
Functionality
Parse URNs
Ensure that a given URN is valid according to RFC 8141 and extract all the relevant bits:
const parsed = parseURN("urn:example:a:b");
This includes the ability to parse URNs with q, r and f components, e.g., urn:example:a123,0%7C00~&z456/789?+abc?=xyz#12/3.
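To make the pieces of that example concrete, here is a minimal sketch of how such a URN breaks down. It is not this library's implementation: the splitUrn function, the UrnParts shape and the field names are hypothetical, and the pattern only checks the overall layout rather than every character rule in RFC 8141.

```typescript
// Hypothetical result shape; the library's real return type may differ.
interface UrnParts {
  nid: string; // namespace identifier
  nss: string; // namespace-specific string
  rComponent: string | undefined; // follows "?+"
  qComponent: string | undefined; // follows "?="
  fComponent: string | undefined; // follows "#"
}

// Simplified layout check: urn:NID:NSS[?+r][?=q][#f].
// The r-component may contain "?" but stops before "?=" or "#".
const URN_LAYOUT =
  /^urn:([a-z0-9][a-z0-9-]{0,30}[a-z0-9]):([^?#]+)(?:\?\+((?:(?!\?=)[^#])+))?(?:\?=([^#]+))?(?:#(.*))?$/i;

function splitUrn(input: string): UrnParts {
  const m = URN_LAYOUT.exec(input);
  if (m === null) throw new Error(`Not a URN: ${input}`);
  const [, nid = "", nss = "", rComponent, qComponent, fComponent] = m;
  return { nid, nss, rComponent, qComponent, fComponent };
}

const parts = splitUrn("urn:example:a123,0%7C00~&z456/789?+abc?=xyz#12/3");
// parts.nid is "example" and parts.nss is "a123,0%7C00~&z456/789";
// the r, q and f components are "abc", "xyz" and "12/3".
```

Note that the NSS is returned still percent-encoded ("%7C" is left as is); decoding is a separate step.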
URN Types
In addition to the parsing functionality, it is easy to define subtypes of string for representing specific classes of URNs, e.g.,
export type MyURN = BaseURN<"mydomain">;
// You can then use this type for strings, but only those that really
// fit the expected URN syntax, e.g.,
const a: MyURN = "urn:mydomain:anything"; // Conforms
const b: MyURN = "urn:wrongdomain:anything"; // TypeScript will flag this as an error!
You can further specialize these URNs with a second type parameter to specify the type of the namespace-specific string (NSS), e.g.,
export type MySpecificURN = BaseURN<"mydomain", "foo" | "bar">;
// We now are restricted in what the NSS can be
const a: MySpecificURN = "urn:mydomain:foo"; // Conforms
const b: MySpecificURN = "urn:mydomain:buz"; // TypeScript will flag this as an error!
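If you are curious how a type like this can work, the sketch below builds an equivalent out of a template literal type. SketchURN and makeUrn are hypothetical names used only for illustration; I am assuming the general technique here, and the library's own definition of BaseURN may differ.

```typescript
// Hypothetical stand-in for BaseURN; the library's actual definition may differ.
type SketchURN<NID extends string, NSS extends string = string> = `urn:${NID}:${NSS}`;

type MyURN = SketchURN<"mydomain">;
type MySpecificURN = SketchURN<"mydomain", "foo" | "bar">;

// Build a URN string whose static type records both the NID and the NSS.
function makeUrn<NID extends string, NSS extends string>(nid: NID, nss: NSS): SketchURN<NID, NSS> {
  return `urn:${nid}:${nss}` as SketchURN<NID, NSS>;
}

const anyNss: MyURN = makeUrn("mydomain", "anything"); // accepted: any NSS is allowed
const fooOnly: MySpecificURN = makeUrn("mydomain", "foo"); // accepted: "foo" is in the union
// const rejected: MySpecificURN = makeUrn("mydomain", "buz"); // compile-time error
```

All of the checking happens at compile time; at run time these values are ordinary strings.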
But the main functionality of this library is related to the use of URNSpaces...
Create URN "spaces"
This library provides the notion of a URNSpace. This is basically a way of identifying URNs with a common NID (namespace identifier). Defining such a space not only gives a simple means of "constructing" URNs associated with that NID, it also gives you methods for parsing and for narrowing types via TypeScript's "is" type predicates.
Examples
Basics
const mongoIds = new URNSpace("mongoId");
const record1: BaseURN<"mongoId", string> = mongoIds.urn("1569-ab32-9f7a-15b3-9ccd"); // OK
const record2: string = mongoIds.urn("1569-ab32-9f7a-15b3-9ccd"); // Also fine, but loses type information
const record3: BaseURN<"mongoId", string> = "urn:mongoId:1569-ab32-9f7a-15b3-9ccd"; // works too
const record4: BaseURN<"mongoId", string> = "urn:postgres:1569-ab32-9f7a-15b3-9ccd"; // Nope
const record5: BaseURN<"mongoId", string> = "1569-ab32-9f7a-15b3-9ccd"; // Also nope
This also allows type narrowing, e.g.,
// This narrows the type of `record2` from string to a more specific URN syntax string
if (mongoIds.is(record2)) {
  const id = mongoIds.nss(record2); // Extract the embedded hex id
}
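The narrowing above relies on TypeScript type guards. The following is a minimal sketch of the idea, assuming a much simpler design than the real URNSpace: MiniSpace is a hypothetical class, it only compares prefixes (case-sensitively), and it does no RFC 8141 validation.

```typescript
// Hypothetical miniature of a URN space; not the library's implementation.
class MiniSpace<NID extends string> {
  private readonly nid: NID;

  constructor(nid: NID) {
    this.nid = nid;
  }

  // Construct a URN in this space.
  urn(nss: string): `urn:${NID}:${string}` {
    return `urn:${this.nid}:${nss}` as `urn:${NID}:${string}`;
  }

  // Type guard: true when the string has this space's prefix and a non-empty NSS.
  is(candidate: string): candidate is `urn:${NID}:${string}` {
    const prefix = `urn:${this.nid}:`;
    return candidate.startsWith(prefix) && candidate.length > prefix.length;
  }

  // Extract the NSS, rejecting strings that belong to other spaces.
  nss(candidate: string): string {
    if (!this.is(candidate)) throw new Error(`Not in space ${this.nid}: ${candidate}`);
    return candidate.slice(`urn:${this.nid}:`.length);
  }
}

const mongoSketch = new MiniSpace("mongoId");
const raw: string = mongoSketch.urn("1569-ab32-9f7a-15b3-9ccd");
const hexId = mongoSketch.is(raw) ? mongoSketch.nss(raw) : undefined;
```

Inside the true branch of the is check, the compiler treats raw as a string of the form urn:mongoId:..., not just any string.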
Predicates
When creating a URNSpace you can provide a predicate function to perform further semantic checks on the NSS and/or narrow the potential types for the namespace-specific string (NSS). TypeScript's compiler will infer the narrowed NSS type from the return type of the predicate. For example, we might define our URNSpace like this:
const space = new URNSpace("example", {
  pred: (s: string): s is "a" | "b" => s === "a" || s === "b",
});
...in which case, we get the following behavior:
space.is("urn:example:b"); // True (and narrows the type)
space.is("urn:example:c"); // False, TypeScript can "see" this isn't allowed!
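To show how the predicate's return type can flow into the type of the whole URN, here is a small sketch. guardFor and classify are hypothetical helpers written for this illustration, and I am assuming a plain prefix check in place of real URN parsing.

```typescript
// Hypothetical helper: lift a predicate on the NSS into a guard on the whole URN.
function guardFor<NID extends string, N extends string>(
  nid: NID,
  pred: (s: string) => s is N,
): (candidate: string) => candidate is `urn:${NID}:${N}` {
  const prefix = `urn:${nid}:`;
  return (candidate: string): candidate is `urn:${NID}:${N}` =>
    candidate.startsWith(prefix) && pred(candidate.slice(prefix.length));
}

// N is inferred as "a" | "b" from the predicate's return type.
const isExample = guardFor("example", (s: string): s is "a" | "b" => s === "a" || s === "b");

function classify(candidate: string): string {
  if (isExample(candidate)) {
    // Here candidate has type "urn:example:a" | "urn:example:b".
    const narrowed: "urn:example:a" | "urn:example:b" = candidate;
    return `accepted ${narrowed}`;
  }
  return "rejected";
}
```

The predicate does double duty: it is the run-time check, and its "s is ..." annotation is what tells the compiler which NSS values are possible.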
Decoding
It is also possible to provide a decode function when defining a URNSpace. This adds a decode step to URN parsing, which saves you from having to decode the NSS separately and also provides an additional semantic check (like the predicate) on whether the URN truly belongs to the URNSpace, e.g.,
const space = new URNSpace("customer", {
  decode: (nss) => {
    const v = parseInt(nss, 10);
    if (Number.isNaN(v)) throw new Error(`NSS (${nss}) is not a number!`);
    return v;
  },
});
space.decode("urn:customer:25"); // Evaluates to the number 25
space.decode("urn:customer:twenty-five"); // Throws an error
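One thing to keep in mind when writing a decoder like this: parseInt accepts leading digits and ignores the rest, so an NSS such as "25abc" would decode to 25. The sketch below shows a stricter decoder, plus one way a decoder could double as a membership test. Both decodeCustomerId and belongsTo are hypothetical helpers, and treating "decode did not throw" as membership is my assumption about the idea, not a description of the library's internals.

```typescript
// Strict integer decoder: unlike a bare parseInt, it rejects input such as "25abc".
function decodeCustomerId(nss: string): number {
  if (!/^\d+$/.test(nss)) throw new Error(`NSS (${nss}) is not a number!`);
  return parseInt(nss, 10);
}

// Hypothetical helper: a URN belongs to the space if it has the right
// prefix and its NSS decodes without throwing.
function belongsTo<T>(nid: string, decode: (nss: string) => T, candidate: string): boolean {
  const prefix = `urn:${nid}:`;
  if (!candidate.startsWith(prefix)) return false;
  try {
    decode(candidate.slice(prefix.length));
    return true;
  } catch (e) {
    return false;
  }
}

const accepted = belongsTo("customer", decodeCustomerId, "urn:customer:25");
const refused = belongsTo("customer", decodeCustomerId, "urn:customer:twenty-five");
```

With this decoder, "urn:customer:25abc" is refused as well, which a plain parseInt check would let through.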
Motivation
Why URNs?
I've been vaguely aware of URNs for some time, but I never quite understood the point. A URL seems so much more useful. After all, a URN only names something, while a URL tells you where to find it. Isn't the latter always better than the former? And then I had several realizations in quick succession.
Add some identity
The first was about the value of encoding in a URN. Yes, a URN is just a name/string. But it is a qualified name. How many times have I written code that looks like this:
function fetchRecord(server: string, id: string): string {
...
}
fetchRecord("example.com", "1569-ab32-9f7a-15b3-9ccd");
The first issue is how do I know what the heck "1569-ab32-9f7a-15b3-9ccd" even is?
So just in terms of attaching a bit more meaning to these things, what if, instead of "1569-ab32-9f7a-15b3-9ccd", I had used a string like "urn:mongoid:1569-ab32-9f7a-15b3-9ccd" or, even better, "urn:mongoid:user:1569-ab32-9f7a-15b3-9ccd"? Now if I'm debugging this code or looking at error messages, I have a better sense of what that cryptic identifier actually is (e.g., this is a mongo document id or, better yet, a mongo document id that resolves to a user record).
So that's already a good reason to use URNs, i.e., they give you some context with which to interpret otherwise nondescript identifiers.
Not all strings are equal
This was around the time that TypeScript's template literal types came out. If you aren't familiar with template literal types, they let you do things like this:
type EventType = "create" | "update" | "delete";
type EventName = `event-${EventType}`;
This means that you can define a type that is a narrow set of possible strings (without having to enumerate them all). But you can also create types like this:
type URN = `urn:${string}`;
or, more specifically,
type MongoID = `urn:mongoid:${string}`;
or even,
type MongoUserID = `urn:mongoid:user:${string}`;
So what's the big deal here? The big deal is that now you have type safety.
Recall my previous fetchRecord example, but rewritten slightly:
function fetchRecord(server: string, id: MongoID): string {
...
}
fetchRecord("example.com", "urn:mongoid:1569-ab32-9f7a-15b3-9ccd");
Yes, the id argument can't be passed directly to a Mongo call because it has that extra "urn:mongoid:" in front of it. But that is easily stripped away either by using slice or (better yet) by parsing the URN and extracting the ID (which is one of the things this library takes care of).
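For the slice approach, a minimal sketch might look like the following. The rawMongoId helper and the MONGO_PREFIX constant are hypothetical names for this illustration; MongoID is the template literal type defined earlier.

```typescript
type MongoID = `urn:mongoid:${string}`;

const MONGO_PREFIX = "urn:mongoid:";

// Drop the fixed prefix to recover the raw id expected by the database driver.
function rawMongoId(id: MongoID): string {
  return id.slice(MONGO_PREFIX.length);
}

const rawId = rawMongoId("urn:mongoid:1569-ab32-9f7a-15b3-9ccd");
// rawId is "1569-ab32-9f7a-15b3-9ccd"
```

This works because the type guarantees the prefix is present, but slice neither validates the rest of the string nor undoes any percent-encoding, which is why parsing is the better option.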
What's really great about this is that now you can't mix up your string arguments! If I accidentally called fetchRecord with:
fetchRecord("urn:mongoid:1569-ab32-9f7a-15b3-9ccd", "example.com");
TypeScript would flag the call as an error, because "example.com" does not match the MongoID type.
In this way, you can create a specially type-constrained string type for pretty much anything and keep them straight. This is especially useful if you find yourself defining functions with multiple (generic) string arguments and you want to avoid the situation where you mix things up. Once defined, each of these URN types partitions the potential space of string values nicely into disjoint sets.
Caveats
RFC 8141
I tried to stay as close as possible to RFC 8141. This includes processing r-components, q-components and f-components. If you find anything in this library that deviates from that, let me know.
Encoding
One note: you need to be careful about encoding. URNs require percent-encoding of any character outside the set RFC 8141 allows, which covers non-ASCII characters and also some ASCII ones such as the space. As a result, even though you may assume from the TypeScript types that the NSS portion of the URN is some subset of strings, e.g., " " | "a" | "b", the NSS portion of the actual URN may appear encoded, e.g., "%20" | "a" | "b". So the actual strings may not strictly satisfy the types implied by the type definitions. But the strings that go in (pre-encoding) and come out (post-decoding) should, so I don't think this is a big deal.
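Here is a small sketch of that round trip. I am assuming the standard encodeURIComponent and decodeURIComponent functions as stand-ins for whatever encoding routine the library actually uses, and the Nss type is a hypothetical example.

```typescript
// Assumption: the built-in URI component functions stand in for the library's encoding.
type Nss = " " | "a" | "b";

const original: Nss = " ";
const encoded = encodeURIComponent(original); // "%20", which is not a member of Nss
const urnText = `urn:example:${encoded}`; // "urn:example:%20"

// Decoding the NSS recovers a string that satisfies the declared type again.
const decoded = decodeURIComponent(urnText.slice("urn:example:".length)); // " "
```

So the type describes the values before encoding and after decoding, not the text that actually appears inside the URN.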