range-group
v1.1.0
Published
Data structure to hold a collection of ranges. It has builtin support for numbers, dates, and strings, fast O(log(log(N))) bounds, and provides comprehensive set operations.
Downloads
5
Maintainers
Readme
range-group
A RangeGroup
is a data structure that holds a collection of one-dimensional ranges. It has builtin
support for integers, reals/floats, dates, unicode characters, and unicode strings. You can also
create your own Range
type to integrate your own data types. It supports both inclusive (closed)
and exclusive (open) range bounds. It has a comprehensive API for operations, such as searching,
set operations (like union or intersection), filtering, random sampling, and more. It supports
interpolation search for O(log(log(N)))
membership tests and O(N*log(log(M)))
set operations. It
can also fallback to binary search for similar log(N)
bounds.
Internally, it is storing a sorted, non-intersecting list of ranges. In many cases, it will
give you better memory usage and more efficient set operations over the builtin Set
type.
API documentation | npm package | GitHub source code
Installation
npm i range-group
This project uses ES 2015+ class features. A Babel transpiled and minified version is provided as
rangegroup.compat.min.js
, with exports under RangeGroup
; though I highly recommend building
a bundle yourself to customize the target and reduce code size. A plain minified version is provided
as rangegroup.min.js
.
The builtin range types, like IntType
or DateType
, are tree-shakeable. Feel free to submit a
pull request to include another general purpose types you've made.
Quickstart
Take a look at the API for more comprehensive class and method documentation.
import { RangeGroup, ComparisonModes, IntType, RealType, ... } from "range-group";
If not using a bundler, you'll need to import from the minified version, which is pre-bundled.
A simple example:
const group = new RangeGroup([0,5], {type:IntType});
group.has(5); // true
group.subtract([3,6]);
group.ranges; // [{start:0,end:3,endExcl:true}]
Most of the methods on RangeGroup
require the group to be normalized (exceptions being things
like copy
, filter
, isEmpty
, etc). This means the ranges are sorted and non-intersecting. The
sort
and selfUnion
methods can do this, or the normalize
method which calls these. You can
also pass a normalize
flag to the constructor:
const group = new RangeGroup(
// also demonstrating another syntax for specifying the initial ranges
[{start:-2,end:-5}, {start:4,end:8}, {start:3,end:10}, {start:11,end:16}],
{type: IntType, normalize: true}
);
group.ranges; // [{start: 3, end: 16}]
Note the first range was removed, since its end came before its start (IntType
uses ascending
order). Additionally, the last two ranges were merged as there are no integers between 10 and 11.
These behaviors result from the implementation of IntType.compare
.
The RangeGroup.diff
method is the most powerful, and most operations are implemented as calls to
it. It can give you the difference between two groups:
const a = new RangeGroup([0,7], {type:IntType});
const b = new RangeGroup([3,10], {type:IntType});
const diff = a.diff(b, {copy:true, track_sources:true, self_union:false});
console.log(diff.ranges);
/* [
{start:0, end:3, endExcl:true, a:0, b:null},
{start:3, end:7, a:0, b:0},
{start:7, end:10, startExcl:true, a:null, b:0},
]
*/
See how filtering these results can give you the various set operations. This is supported with the
filter
option. For this and other options, logic has been added to optimize the work performed.
Setting self_union=false
keeps the classified ranges separate rather than merging them
automatically. If later you want to perform additional diff operations, you'll need to put it into
normalized form by calling selfUnion
(or normalize
, though it will perform an extra unnecessary
sort).
A Sampler
class has been added which can randomly sample from a RangeGroup
. For this to be efficient, it caches a cumulative summation on creation for use with binary search. If you want to update
the RangeGroup
, you'll need to create another Sampler
:
const group = new RangeGroup([[0,5],[10,15]], {type:IntType});
const sampler = new Sampler(group);
// uniform sample
sampler.sample();
// non-uniform sample
sampler.sample(my_distribution());
// percentile function
sampler.sample(.2);
Builtin Types
A RangeGroup
needs to be passed a RangeType
object, which acts as interface between your Range
type and the methods inside RangeGroup
. You can set a default type by changing
RangeGroup.default_type
. The following builtin types are available:
| Type | Example
|-------------------|------------------
| IntType* | [0,5) [12,15] (20,Infinity]
| RealType | [0,5.23) [12.5,16]
| FloatNormType | Same as RealType
, but normalizes exclusive bounds to be inclusive
| DateType* | (new Date("04:30 1/12"), new Date("04:36 2/15)]
| DateFloorType* | Provides Second
, Minute
, Hour
, and Day
types, clamping to those time units
| UnicodeType* | [a, z] [f0, f9) [👦🏻, 👦🏿]
| StringType | Helpers for converting string ranges to UnicodeType
| CommonType | Helpers for implementing your own type
Ones marked with an asterisk/*
also support a Norm
type, e.g. IntNormType
or
DateFloorNormType
. These will automatically normalize the range bounds to be inclusive. By
default, ranges can be both open or closed, like the half-open interval [0,3)
that got output in
the example above. If we instead used IntNormType
, this would get automatically normalized to
[0,2]
instead. This can be simpler to work with, and can improve performance.
There is no normalization for RealType
, since there is no "next real number" you can modify to.
You can use the FloatNormType
however, albeit with negative performance impact, to normalize to
the next representable floating point number.
Caution: The Norm types should not be passed pre-constructed ranges when constructing a RangeGroup. For optimization, it is assumed preconstructed ranges are already normalized. Pass arguments instead if you wish to have them normalized on creation (see API for options).
Custom Types
A design choice for the library was to decouple the Range
and RangeType
interfaces. This allows
Range
to be a plain JSON object, with operations on those objects described by RangeType
. To
create a new type, simply create an object with methods matching the RangeType
interface. The
methods provided in CommonType
can provide a starting place, or you might wrap any of the other
builtin types like IntType
.
Additionally there are helpers like CommonType.compareEpsilon
. It can wrap a type and define a
threshold to merge two ranges:
const EpsilonType = CommonType.compareEpsilon(0.25, RealType)
const group = new RangeGroup(
[[0,.1],[.3,.6]],
{type:EpsilonType, normalize:true}
);
group.ranges; // {start:0, end:.6}
There is also CommonType.compareBinarySearch
, which forces the type to use binary search instead
of interpolation search. Binary search has better worst case time complexity, and a little less
overhead, so can be a better choice in some scenarios.
Method Reference
The following methods are available on a RangeGroup
:
General
copy
sort
normalize
: a combination ofsort
andselfUnion
Set iteration
iterate
,@@iterator
Set membership
contains
orhas
search
: provides extra context about the found location
Set modification
diff
clear
,toCleared
filter
,toFiltered
,hasFilter
union
oradd
,toUnioned
,hasUnion
selfUnion
,toSelfUnioned
,hasSelfUnion
difference
orsubtract
ordelete
,toDifferenced
,hasDifference
intersect
,toIntersected
,hasIntersection
symmetricDifference
,toSymmetricDifferenced
,hasSymmetricDifference
You'll notice there are three variants for many:
- The
modify
methods perform the operation in-place - The
toModified
methods perform the operation on a copy, following the naming syntax introduced in ECMAScript 2023 - The
hasModification
methods give a boolean whether the result would be non-empty, without actually calculating the full results.
Most the operations are implemented using the diff
method. It supports numerous options, like
customizing in-place vs copy, boolean results, filtering, and labeling the source RangeGroup
. You
can use the diff
calculate additional operations that have not been given aliases.
Set characteristics
isEqual
: deep equality checkisEmpty
,cardinality
orsize
,isProperSubset
orisStrictSubset
,isSubset
isProperSuperset
orisStrictSuperset
,isSuperset