nmatch
v0.1.0
Published
combine multiple pattern-matching indices
Downloads
1
Maintainers
Readme
NMatch lets you combine multiple pattern matchers.
Say you have a set of URL patterns, like "/foo"
and "/foo/{id}"
,
and a set of HTTP method patterns, like "GET"
, "POST"
, and "ANY"
,
and a set of user role patterns, like "{ user: "alice" }
and
{ group: "admins" }
. You can bundle these into NMatch and then ask it
"what's the handler for this user doing this method to this URL?"
You do this in three steps.
First you define what your rules will look like by choosing a set of matchers:
const NMatch = require('nmatch')
const router = new NMatch([
() => new UrlMatcher(),
() => new MethodMatcher(),
() => new UserRoleMatcher()
])
Then you set rules that map patterns to results. Each rule takes one pattern per matcher, plus the result that the rule should return when it matches:
router.set('/foo', 'GET', { group: 'public' }, getFoosHandler)
router.set('/foo/{id}', 'GET', { group: 'public' }, getAFooHandler)
router.set('/foo/{id}', 'PUT', { group: 'admins' }, editFooHandler)
router.set('{default+}', 'ANY', { group: 'public' }, defaultHandler)
Now you can match terms against the ruleset, one term per matcher. NMatch will return the best result that matches all the terms: the first term against the first matcher, the second term against the second matcher, and so on.
const alice = { 'username': 'alice', 'groups': ['public', 'admins'] }
const anon = { 'groups': ['public'] }
tap.is(router.match('/foo/bar', 'GET', anon), getAFooHandler)
tap.is(router.match('/foo/bar', 'PUT', anon), defaultHandler)
tap.is(router.match('/foo/bar', 'PUT', alice), editFooHandler)
API
The NMatch constructor takes for argument an array of functions, each of
which should return a new instance of a matcher. They will be called as
needed when more patterns are added. Their order is the same as that of the
set
and match
methods' arguments.
const Id = require('nmatch/matchers/id') // "matches" using `===`
const bookNotes = new NMatch([
() => new Id(), // Author
() => new Id() // Title
])
The set
method takes as many arguments as you've defined matchers, plus
one for the value being set. set
will associate that value to the given
patterns. What counts as a pattern depends on the kind of matcher you use.
bookNotes.set('Homer', 'Odyssey', 'Really old')
bookNotes.set('James Joyce', 'Ulysses', 'Just here for the puns')
The match
method takes as many arguments as you've defined matchers.
It will pass the first argument to the first matcher, the second argument to
the second matcher, and so on. It returns the best match, or null
if no
matches are found.
tap.is(bookNotes.match('Homer', 'Odyssey'), 'Really old')
tap.is(bookNotes.match('Homer', 'Ulysses'), null) // no match
The NMatch constructor does some basic validation on its arguments
tap.throws(() => new NMatch(), 'At least one matcher is required')
tap.throws(() => new NMatch([]), 'At least one matcher is required')
tap.throws(() => new NMatch(['herp']), 'Matchers must be functions')
tap.throws(() => new NMatch([() => 'derp']),
'Matchers must have a `pattern` and a `match` function')
tap.throws(() => new NMatch([() => ({ pattern: () => 1 })]),
'Matchers must have a `pattern` and a `match` function')
tap.throws(() => new NMatch([() => ({ match: () => 1 })]),
'Matchers must have a `pattern` and a `match` function')
set
and match
will check that you pass in the right number of arguments,
but they can't tell whether you've given them in the right order. Make sure
to be consistent.
tap.throws(() => bookNotes.set('Homer', 'Iliad'),
'Too few arguments')
tap.throws(() => bookNotes.set('Homer', 'Iliad', 'Trojan war', 'Prequel'),
'Too many arguments')
tap.throws(() => bookNotes.get('Homer'),
'Too few arguments')
tap.throws(() => bookNotes.get('Homer', 'Iliad', 'Odyssey'),
'Too many arguments')
bookNotes.set('Iliad', 'Homer', 'Trojan war') /* Oops, arguments in wrong order,
but NMatch has no way of knowing */
tap.is(bookNotes.match('Homer', 'Iliad'), null, 'No match: order matters')
Built-in Matchers
NMatch is powered by matchers, objects whose job it is to store patterns and perform matching against them. Let's look at the built-in matchers before discussing how you can build your own.
Id
The most basic matcher is Id
, whose "patterns" can be any object, and
whose "matching" is just applying the strict equality (===
) operator. It's
the equivalent of a Map
, and in fact uses a Map
internally to store
patterns and values.
const Id = require('nmatch/matchers/id')
const id = new NMatch([ () => new Id() ])
id.set(123, 123)
id.set('Abc', 'Abc')
tap.is(id.match(123), 123)
tap.is(id.match(456), null)
tap.is(id.match('Abc'), 'Abc')
tap.is(id.match('A'), null)
const arr = [1, 2, 3]
id.set(arr, 'arr')
const obj = { a: 1, b: 2 }
id.set(obj, 'obj')
tap.is(id.match(arr), 'arr')
tap.is(id.match(arr.slice()), null, 'In javascript, `[] !== []`')
tap.is(id.match(obj), 'obj')
tap.is(id.match({ a: 1, b: 2 }), null, 'In javascript, `{} !== {}`')
Calling set
a second time with the same pattern overrides the old value
id.set('Abc', 789)
tap.is(id.match('Abc'), 789, 'Used to be "Abc"')
You can use any object for patterns and/or values
const dub = a => 2 * a
id.set('double', dub)
id.set(dub, 2)
tap.is(id.match('double'), dub)
tap.is(id.match(dub), 2)
tap.is(id.match(a => 2 * a), null, 'In javascript, `() => {} !== () => {}`')
Matchers are essentially enhanced Maps: they bind keys to values. The
difference is that the key-matching operation for Matchers can be anything,
whereas for Maps it's always ===
. Still, Maps are so useful that we need
a matcher that replicates their behavior. Id
is that matcher. It's a simple
wrapper around a Map
.
class Id {
constructor () {
this._map = new Map()
}
Maps have a get(key)
method that returns the value bound to key
.
Matchers need two such methods: one for pattern matching, and one for
getting the value for a pattern (literally). These are the match
and
pattern
methods, respectively.
match
is a generator that yields results in order from best match to
worst. In our case here we will either yield one value or none: either
there's a value at the given key, or there isn't. But more generally, match
methods can yield 0 or more results.
* match (obj) {
const v = this._map.get(obj)
if (v !== undefined) {
yield v
}
}
pattern
is a getter that's used to bind a value to a pattern. Matchers
don't store these values directly; they store "container" objects on which
the caller can attach values. This layer of indirection is what allows
NMatch to chain matchers. pattern
returns the container object associated
to the given pattern, first creating it if it doesn't exist.
pattern (obj) {
let v = this._map.get(obj)
if (v === undefined) {
v = {}
this._map.set(obj, v)
}
return v
}
}
Paths
Paths
matches paths against patterns that can contain wildcards. It makes
use of the hierarchical nature of paths to store and match patterns
efficiently. If you run a benchmark test please share your results.
const NMatch = require('nmatch')
const Paths = require('nmatch/matchers/paths')
const p = new NMatch([ () => new Paths() ])
Patterns are made up of literals, wildcards, and super-wildcards. Literals match when strings are exactly equal, obv:
p.set('lit', 1)
p.set('lit/foo', 2)
p.set('lit/bar', 3)
tap.equals(p.match('lit/bar'), 3)
tap.equals(p.match('lit/quux'), null)
Wildcards (*
) will match anything in a single path segment
p.set('wild/*', 4)
p.set('wild/*/foo', 4.1)
tap.equals(p.match('wild/foo'), 4)
tap.equals(p.match('wild/bar/foo'), 4.1)
tap.equals(p.match('wild/foo/bar'), null)
Super-wildcards (**
) will match anything including further path segments
p.set('super/**', 5)
tap.equals(p.match('super/foo'), 5)
tap.equals(p.match('super/foo/bar'), 5)
It's an error if you try to add more patterns after a super-wildcard
tap.throws(() => p.set('super/**/useless suffix', 5))
Wildcards and super-wildcards apply to path segments as a whole; there are no partial string matches. The following example patterns are just literals:
p.set('lit/w*', 6)
p.set('lit/w**', 7)
tap.equals(p.match('lit/wat'), null)
tap.equals(p.match('lit/w*'), 6)
tap.equals(p.match('lit/w**'), 7)
Only patterns can have wildcards. When you call match
, the argument is just
a literal.
p.set('foo/bar', 8)
p.set('foo/baz', 9)
tap.equals(p.match('foo/*'), null)
When more than one pattern matches, they are sorted in order from tightest match to loosest match. A literal matches more tightly than a wildcard, which in turn matches more tightly than a super-wildcard. NMatch will return the first hit.
p.set('foo/bar', 'foobar')
p.set('foo/*', 'splat')
p.set('foo/*/**', 'splat super')
p.set('foo/**', 'super')
tap.equals(p.match('foo/bar'), 'foobar')
tap.equals(p.match('foo/foo'), 'splat')
tap.equals(p.match('foo/bar/baz'), 'splat super')
Customizing
The default path separator is /
, but you can change that
const ns = new NMatch([ () => new Paths('::') ])
ns.set('Foo::*', 1)
tap.equals(ns.match('Foo::Bar'), 1)
You can use a regex to specify the separator
const words = new NMatch([ () => new Paths(/\s+/) ])
words.set('Ho * Ho!', 1)
tap.equals(words.match('Ho Ho Ho!'), 1)
Or you can define your own separator function
const urls = new NMatch([ () => new Paths(path => {
const parts = path.replace(/\?.*/, '').split('/').slice(1)
return parts.length > 0 ? parts : ['']
})])
urls.set('/foo/bar', 1)
tap.equals(urls.match('/foo/bar?baz=quux'), 1)
You can use a falsy separator to turn off path separation altogether
const lit = new NMatch([ () => new Paths('') ])
lit.set('foo/*', 1)
tap.equals(lit.match('foo/bar'), null)
tap.equals(lit.match('foo/*'), 1)
The default wildcard symbol is *
, but you can change that
const madLibs = new NMatch([ () => new Paths(' ', '____') ])
madLibs.set('The ____ brown ____', 1)
tap.equals(madLibs.match('The quick brown fox'), 1)
You can use a regex to specify what counts as a wildcard
const ruby = new NMatch([ () => new Paths('/', /^:\w+/) ])
ruby.set('foos/:id', 1)
tap.equals(ruby.match('foos/123'), 1)
Or you can pass in your own function to determine if a string is a wildcard
const params = ['size', 'color']
const fw = new NMatch([ () => new Paths(':', str => params.includes(str)) ])
fw.set('towels:size:color', 1)
tap.equals(fw.match('towels:large:green'), 1)
You can use a falsy value to turn off wildcards altogether
const quiteLit = new NMatch([ () => new Paths('.', null) ])
quiteLit.set('*.*', 1)
tap.equals(quiteLit.match('cmd.exe'), null)
tap.equals(quiteLit.match('*.*'), 1)
The default super-wildcard symbol is **
, but you can change that
const trunc = new NMatch([ () => new Paths(' ', '____', '...') ])
trunc.set('The ____ brown ...', 1)
tap.equals(trunc.match('The quick brown fox jumped'), 1)
You can use a regex to specificy what counts as a super-wildcard
const apiGateway = new NMatch([ () => new Paths('/', /^\{\w+\}$/, /^\{\w+\+\}$/) ])
apiGateway.set('{proxy+}', 1)
tap.equals(apiGateway.match('foo/bar'), 1)
Or you can pass in your own function to determine if a string is a super-wildcard
const anything = new NMatch([ () => new Paths(0, 0, str => true) ])
anything.set('really anything', 1)
tap.equals(anything.match('welp'), 1)
You can use a falsy value to turn off super-wildcards altogether
const http = new NMatch([ () => new Paths('/', '*', false) ])
http.set('*', 1)
http.set('**', 2)
tap.equals(http.match('wild'), 1)
tap.equals(http.match('not/wild'), null)
tap.equals(http.match('**'), 2)
Consider the difference between
find / -path /usr/bin/node
and
ls /usr/bin/node
The first visits every file on the system and checks if their name is "/usr/bin/node". It takes 15s to run on my laptop.
The second takes 3ms to run. It doesn't visit every file. It visits /usr
,
/usr/bin
, and /usr/bin/node
. It can do this because directories are
organized in a tree structure, and file names represent paths through that
tree. It just needs to walk the path one /
-delimited node at a time.
Inspired by this, we can build a Paths matcher that stores its patterns in a tree, and does its matching by traversing that tree.
We'll support three kinds of path elements: literals, wildcards, and super-wildcards. Literals are strings that match only the same string, wildcards match any string, and super-wildcards match any string, even across path separators. Take for instance this four pattern set:
| Pattern | "/foo/bar" | "/foo/baz" | "/foo/bar/baz" |
| ---------- | ---------- | ---------- | -------------- |
| /foo/bar
| Match | No match | No match |
| /foo/baz
| No match | Match | No match |
| /foo/*
| Match | Match | No match |
| /foo/**
| Match | Match | Match |
We can represent these four patterns as a tree with six nodes:
/
└── foo/
├── bar
├── baz
├── *
└── **
Before we get around to writing our matcher, let's implement that Node object. It will handle tree traversal and matching individual path segments.
We'll need a way for a Node to reference a child Node:
const nextKey = Symbol('nmatch/path nextNode')
And we'll need a way to tell whether a Node is the end of a path.
This isn't as simple as checking whether it has any children. In the
pattern set ("/foo", "/foo/bar")
, the Node "/foo" has a child Node,
but it's also the end of a path. We need to attach a flag on every
node that's a valid endpoint:
const endKey = Symbol('nmatch/path isEnd')
A Node needs to keep a list of its children. We can simply use an Object for
this purpose: the key is the child's path part string (e.g. "foo" or
"bar"), and the value is the child Node. We could treat *
and **
as
special keys for wildcards and super-wildcards, but because we let users
define custom wildcard tokens (via test functions), it's possible for *
or
**
to be legitimate literal path segments. So we'll use separate variables
to hold those two.
class Node {
constructor (wildcardTest, superWildcardTest) {
this._isWildcard = wildcardTest
this._isSuperWildcard = superWildcardTest
this._literals = {}
this._wildcard = undefined
this._superWildcard = undefined
}
Matching is simple: we want to find an exact match, a wildcard match, and/or a super-wildcard match, in that order. So we'll ask three questions: "is there an exact match? a wildcard? a super-wildcard?". For any "yes", we yield the matching child Node. If we're at the end of the path, this means yielding the child directly; otherwise it means recursively matching the rest of the path against the child Node.
* match (pathParts) {
for (let m of [this._literals[pathParts[0]], this._wildcard, this._superWildcard]) {
if (m === undefined) continue
const isLeaf = (pathParts.length === 1 || m === this._superWildcard) &&
m[endKey] !== undefined
if (isLeaf) yield m
const isBranch = pathParts.length > 1 && m[nextKey] !== undefined
if (isBranch) yield * m[nextKey].match(pathParts.slice(1))
}
}
When adding paths to our Paths matcher via its pattern
method, we'll
need an analogous part
method on each Node: a function that returns a
handle to the part's container, and creates it as needed. There are two
modes: if the part is an endpoint, we just set the endpoint flag on the
container. Otherwise we put a new child Node in the container.
part (part, isEndpoint = false) {
let [p, partType] = this._getOrCreate(part)
if (isEndpoint) {
p[endKey] = true
} else {
if (partType === 'super-wildcard') {
throw new Error('Super wildcards must be the last path element')
}
if (p[nextKey] === undefined) {
p[nextKey] = new Node(this._isWildcard, this._isSuperWildcard)
}
}
return p
}
Our part containers are just fresh Objects, i.e. {}
.
_getOrCreate (part) {
let partType = this._parseType(part)
let p = this._getExisting(part, partType)
if (p === undefined) {
if (partType === 'super-wildcard') {
this._superWildcard = {}
p = this._superWildcard
} else if (partType === 'wildcard') {
this._wildcard = {}
p = this._wildcard
} else {
this._literals[part] = {}
p = this._literals[part]
}
}
return [p, partType]
}
_getExisting (part, type) {
switch (type) {
case 'super-wildcard':
return this._superWildcard
case 'wildcard':
return this._wildcard
case 'literal':
return this._literals[part]
}
}
_parseType (part) {
if (this._isSuperWildcard(part)) {
return 'super-wildcard'
}
if (this._isWildcard(part)) {
return 'wildcard'
}
return 'literal'
}
}
The Node class we've just defined does all the heavy lifting when matching: it's the one that walks the tree and finds matches. The Paths class' main job is to split paths into parts, but that's done with a user-defined function. So there's not much left for it to do.
class Paths {
constructor (separator = '/', wildcard = '*', superWildcard = '**') {
this._separate = typeof separator === 'function'
? separator : this._getDefaultSeparator(separator)
const isWildcard = typeof wildcard === 'function'
? wildcard : this._getDefaultPartTest(wildcard)
const isSuperwildcard = typeof superWildcard === 'function'
? superWildcard : this._getDefaultPartTest(superWildcard)
this._pathStart = new Node(isWildcard, isSuperwildcard)
}
When adding a new pattern, we walk along the tree and create missing Nodes
as needed with the Node.part
method.
pattern (pattern) {
const pathParts = this._separate(pattern)
var step = this._pathStart
while (pathParts.length > 1) {
step = step.part(pathParts.shift())[nextKey]
}
return step.part(pathParts.shift(), true)
}
* match (path) {
yield * this._pathStart.match(this._separate(path))
}
_getDefaultSeparator (pattern) {
if (!pattern) {
return path => [path]
}
return path => path.split(pattern)
}
_getDefaultPartTest (pattern) {
if (!pattern) {
return part => false
}
return pattern instanceof RegExp
? part => pattern.test(part)
: part => part === pattern
}
}
Any
The Any
matcher is a wrapper that lets you match more than one instance at
a time. You pass its constructor another matcher (e.g. Id
or Paths
), and
you pass its match
method an array of instances. It will loop through that
array and call match
on the underlying matcher, returning the first match
it finds. This is useful when you need to support fallbacks or aliases.
const NMatch = require('nmatch')
const Any = require('nmatch/matchers/any')
const Paths = require('nmatch/matchers/paths')
const any = new NMatch([ () => new Any(new Paths()) ])
any.set('/foo/*', 'foo')
any.set('/bar/*', 'bar')
tap.is(any.match(['/foo/123', '/bar/456']), 'foo')
tap.is(any.match(['/bar/456', '/foo/123']), 'bar',
"Matches are searched for in the order they're given")
tap.is(any.match(['/baz/789', '/quux/012']), null)
tap.is(any.match(['/baz/789', '/quux/012', '/foo/345']), 'foo')
Any.match
takes an iterable for argument, and will throw an Error
otherwise. It doesn't count strings as iterables.
tap.is(any.match([]), null,
'Attempting to match nothing at all is the same as finding no match')
tap.throws(() => any.match(123),
'Argument must be a non-string iterable')
tap.throws(() => any.match('/foo/123'),
'Argument must be a non-string iterable')
If you don't specify a matcher, Any
will default to using Id
.
const defaultAny = new NMatch([ () => new Any() ])
defaultAny.set(any, any)
defaultAny.set(tap, tap)
tap.is(defaultAny.match([any, 123, tap]), any)
tap.is(defaultAny.match([123, tap, any]), tap)
All Any
does is one type check and one for loop; the real work is delegated
to the matcher it wraps.
class Any {
constructor (matcher = new Id()) {
this._matcher = matcher
}
pattern (pattern) {
return this._matcher.pattern(pattern)
}
* match (values) {
if (values == null || typeof values[Symbol.iterator] !== 'function' || typeof values === 'string') {
throw new Error('Argument must be a non-string iterable')
}
for (let value of values) {
yield * this._matcher.match(value)
}
}
}
Creating your own Matcher
Any object that implements the following two methods is a matcher:
function pattern (pattern)
stores the given pattern in the matcher, and
returns a container object on which we can set its value. NMatch.set
calls
this method.
function * match (instance)
yields a list of matches for the given
instance, ordered from best match to worst. NMatch.match
calls this
method.
There's no restriction on the type of argument taken by these functions. The one limitation is that they always get called with a single argument. The rest is up to you, and the semantics are pretty flexible.
For example, here's the implementation of the three matchers used in the intro:
class UrlMatcher extends Paths {
constructor () {
super('/', /^\{\w+\}$/, /^\{\w+\+\}$/)
}
}
class MethodMatcher extends Paths {
constructor () {
super(null, 'ANY')
}
}
class UserRoleMatcher {
constructor () {
this._users = new Map()
}
pattern (user) {
let u = this._users.get(user)
if (u === undefined) {
u = {}
this._users.set(user, u)
}
return u
}
* match (term) {
for (let [user, u] of this._users.entries()) {
if (term.groups.includes(user.group)) {
yield u
}
}
}
}
NMatch stores patterns in a tree-like structure, with the leftmost matcher as the root of the tree, and each subsequent matcher one level down in the tree. If you're working with a lot of patterns, you'll typically want to put your most discriminating matchers first to speed up matching operations.
const valueKey = Symbol('nmatch-value')
class NMatch {
constructor (matcherMakers) {
this._checkMatcherMakers(matcherMakers)
this._matcherMakers = matcherMakers
this._root = this._matcherMakers[0]()
}
To set a value, we walk down the pattern tree, creating missing nodes as needed, and set the value on the end node.
set (...patterns) {
const value = patterns.pop()
this._checkParts(patterns)
let cur = this._root
const nexts = this._matcherMakers.slice(1)
let pattern, next
while ((pattern = patterns.shift(), next = nexts.shift())) {
cur = cur.pattern(pattern)
if (cur[valueKey] === undefined) {
cur[valueKey] = next()
}
cur = cur[valueKey]
}
cur.pattern(pattern)[valueKey] = value
}
To match an instance, we also walk down the pattern tree, but each step forward is a matching operation that returns 0 or more child nodes. If the first child node doesn't match, we try the second, and so on until we know for sure whether we have a match or not.
match (...instances) {
this._checkParts(instances)
return this._rMatch(this._root, instances)
}
_rMatch (matcher, instances) {
const rest = instances.slice(1)
for (let candidate of matcher.match(instances[0])) {
if (rest.length === 0) {
return candidate[valueKey]
}
let subMatch = this._rMatch(candidate[valueKey], rest)
if (subMatch !== null) {
return subMatch
}
}
return null
}
_checkMatcherMakers (matcherMakers) {
if (matcherMakers.length < 1) {
throw new Error('At least one matcher is required')
}
for (let matcher of matcherMakers) {
if (typeof matcher !== 'function') {
throw new Error('MatcherMakers must be functions')
}
let m = matcher()
if (typeof m.pattern !== 'function' || typeof m.match !== 'function') {
throw new Error('MatcherMakers must have a `pattern` and a `match` function')
}
}
}
_checkParts (parts) {
if (parts.length !== this._matcherMakers.length) {
throw new Error('Bad part count. ' +
`Expected ${this._matcherMakers.length}, have ${parts.length}`)
}
}
}
Contributing
This project is deliberately left imperfect to encourage you to participate in its development. If you make a Pull Request that
- explains and solves a problem,
- follows standard style, and
- maintains 100% test coverage
it will be merged: this project follows the C4 process.
To make sure your commits follow the style guide and pass all tests, you can add
./.pre-commit
to your git pre-commit hook.