@nodeshapes/core
v0.2.0
Published
Nodeshapes is a TypeScript / JavaScript tree data structure and manipulation library, optimized for large dataset and tagged nodes.
Downloads
1
Maintainers
Readme
Node Shapes
Nodeshapes is a TypeScript / JavaScript tree data structure and manipulation library, optimized for large dataset and tagged nodes.
It has following features:
- Separation of tree shape and node data
- Partitioned node and tag data
- Explict tags wrapping node
- Plain JavaScript object data structure and functional manipulation
It is developed with TypeScript, but can be used by JavaScript as well.
It is tiny, with less than 1000 lines of code, comments included.
Motivation
Tree is often the ideal structure for many domain. Take website as an example. An entire site can be viewed as a tree, not only each HTML page, which is plain text but rendered to DOM tree. Manipulation, and more importantly, persistence of tree aren't easy, especially when tree gets deep and nodes' data get heavier.
siteRoot |- page1
|- page2
|- html |- head |- meta
|- link
|- title
|- ...
|- body |- nav
|- aside
|- article
|- foot
|- div |- div
|- p |- em
|- a
|- img
|- ...
|- ...
With mordern UI frameworks such as React
and technology such as css-in-js
, style that is traditionally loaded in css files, now is recommened to be embedded in each tag. And node is not only stardard HTML tags any more, it could be React
or VUE
component, wrapped in formatting <div>
.
Now we can have a tree of nodes that each one is self contained, with CSS style and data. Suddenly, it is possible to really use just one single tree to model an entire site. And each node and its subnodes would be modulized that can be considered an independent component, that could be copied, modified and reused elsewhere, and rendered by different engines, at client side or at server side.
This is great, except tree will be come deep and heavy. And performance of tree data persistence would be a problem, when data is heavy and tree is deep.
This library aims to use a carefully designed tree data structure and a set of pure functions, to address the problems.
Node
Node is recursive tree with explicit tags. Each node and tags of it has an unique id, which should be unique globally, and a partition id, which is to indicate in which partition its data should be put in.
/**
* Type definition for tagged node tree
*
* @typeParam T - Supported tag's types, should be string or string literal union type.
* @typeParam N - Supported node's types, should be string or string literal union type.
*
* All the types should be serializable to json.
* Implementations can use TypeScripts
* @public
*/
export interface Node<T = unknown, N = unknown> {
/** The node's type, in literal string union */
type: N
/**The node's id, globally unique, among all types' ids. */
id: IdType
/**
* The node's tags, from outmost to inner most
*
* Example: for HTML image link, the tags array would be:
* [{type: 'a', id: '<uuid-1>'}, {type: 'image', id: '<uuid-2>'}]
* if the node has no tag, for example, plain text, tags is undefined or empty array
*
*/
tags?: Tag<T>[]
/*
* Node's children. If no chldren, it should be empty array
*/
children: Node<T, N>[]
/**
* The node's partition, used to locate data quickly.
* Once created it is never changed.
*/
partition: IdType
}
Tag
Tag is anything that decorates a node. A node can have multiple tags. Each tag can have its own data, which could be very heavy, depending on how you design and use tags.
/**
* Type definition for tag
*
* @typeParam T - The tag's type, should be string or string literal union type
* @public
*/
export interface Tag<T = unknown> {
/**
* The tag's type
*/
type: T
/**
* The tag's id, globally unique, cross tag and node
*/
id: IdType
/**
* Partition of the tag's data.
* Partition id is designed to support data persistency optimization for tree data structure with large dataset
* Partition id can be recalculated.
*/
partition: IdType
}
Shape
Shape is a lean forest that stores separately its trees' shape, and node and tag data in data partitions. This is the single most important design of NodeShapes.
Node and tag data is indexed by unique node and tag id, and corresponding data partition key in a separate data field. Partition id should be generated by a function of node's path, node, and tag, and node and tag data. Lib user needs to carefully plan the data partition strategy according to use cases and infrastructure. For example, for a website, it is naturally to use each page's id as partition id, so that each page's data can be loaded independent to other pages, and also can be loaded in one fetch when to render a page.
Node's data is also flattened no matter how deep the node is in the tree.
/**
* Thge separation of tree shape, node data with a partition key is to optimize tree data persistency management.
* Tree shape and node data can be stored and manipulated separately according to your use case, data set and backend architecture.
* For example, when use NodeShapes for a large websie's pages, each page can have a unique id that is used as partition key for the page's data. Read, write and query of the page's data can then be optimized.
* @public
*/
export interface Shape<T = unknown, N = unknown> {
/*
* Children of the shapes. If no children, it shoul be an empty array.
* Each node here is root of its own tree.
*/
children: Node<T, N>[]
/**
* Data partitions by partition id.
*/
data: Record<IdType, Record<IdType, Data>>
}
PartitionIdGenerator
PartitionIdGenerator is the signature for the function to generate partition id. User of the lib must define his / her own partition id generator.
/**
*
* The singature of the function to calculate partition id. It is entirely up to the user to decide how to caculate partition id.
*
* Data partition can be used to optimize data storage and manipulation.
*
* For example, for web applications, a page's id can be used as the partition id so a page's data required for initial renderring can be fetched in one request.
*
* @typeParam T - Tag's type, should be string or string literal union type
* @typeParam N - Node's type, should be string or string literal union type
* @param params - Parameters to be passed to the generator when shapes library uses it to generate parition id
* @return IdType - Partition id
*/
export type PartitionIdGenerator<T = unknown, N = unknown> = (params: {
/** path of the node */
path?: IdType[]
/** current node */
node: Node<T, N>
/** current node's data */
nodeData?: Data
/** current tag */
tag?: Tag<T>
/** current tag's data */
tagData?: Data
}) => IdType
IdGenerator
IdGenerator is the signature for the function to generate node or tag id. Mostly a uuid is sufficient.
/**
* Signature for the function to generate globally unique id.
*
* It is used for cloning node for copy, paste, insert etc. If not to provide custom id generator, a uuid is generated automatically.
*
* @typeParam T - Tag's type, should be string or string literal union type
* @typeParam N - Node's type, should be string or string literal union type
* @return IdType
*/
export type IdGenerator<T = unknown, N = unknown> = (
node: Node<T, N>,
tag?: Tag<T>
) => IdType
Manipulation
A group of manipulation functions are provided. All the functions are pure that don't have side effects.
emptyShape
/**
* Create a new empty Shape instance
* @typeParam T - tag type
* @typeParam N - node type
* @return Shape - empty shape with only an empty data property
*/
export const emptyShape = <T = unknown, N = unknown>(): Shape<T, N> => ({
data: {},
children: [],
})
newLeafShape
/**
* create a new Shape with only one leaf node instance
* @typeParam T - tag type
* @typeParam N - node type
* @return Shape
*/
export const newLeafShape = <T = unknown, N = unknown>(
node: Node<T, N>,
data: Record<IdType, Record<IdType, Data>>
): Shape<T, N>
setData
/**
* **Curried**
*
* Set node / tag data by id and partition.
* @typeParam T - tag type
* @typeParam N - node type
* @param params - Parameters
* @param shape - Origional shape
* @return Shape - new shape
*/
export const setData = <T = unknown, N = unknown>(params: {
/** node or tag id */
id: IdType
/** node or tag's partition */
partition: IdType
/** node or tag's data */
data?: Data
}) => (shape: Shape<T, N>): Shape<T, N>
shiftTag
/*
* Shift tag
* Shift tag before or after
* @param params - parameters
* {
* path: IdType[] // tag's path, including the node this tag belongs to
* id: IdType // tag's id
* before: boolean // move before or after. default after.
* }
* @return Shape | undefined
*/
export const shiftTag = <T = unknown, N = unknown>(params: {
path: IdType[]
id: IdType
before?: boolean
}) => (shape: Shape<T, N>): Shape<T, N> | undefined
shiftNode
/**
* Shift node
* @param params - parameters
* @return Shape | undefined
*/
export const shiftNode = <T = unknown, N = unknown>(params: {
/** node or tag's path. if node, should not include the node itself. if tag, should include the tag's node */
path: IdType[]
/** id of the node or tag */
id: IdType
/** shift before or after */
before?: boolean
}) => (shape: Shape<T, N>): Shape<T, N> =>
extracShape
This the heavy lifting function that extract a node,all subnodes, and data from a shape. Parameter can be set to clone from source, delete existing, generate new id or generate new partition id.
/**
* **Curried**
*
* Extract a node a tree, and all the nodes' and tags' data of a shape, form a new shape with one single tree
*
* @typeParam T - Tag's type, should be string or string literal union type
* @typeParam N - Node's type, should be string or string literal union type
* @param params - parameters
* @param shape - shape to extract from
* @return Shape
*/
export const extractShape = <T = unknown, N = unknown>(params: {
/** path of the root node to be extracted. Not including the node itself */
path: IdType[]
/** id of the root node to be extracted. */
id: IdType
/** if to clone extracted node */
clone?: boolean
/** if to delete from source */
deleteSource?: boolean
/** function to generate new id */
idGenerator?: IdGenerator<T, N>
}) => (shape: Shape<T, N>): { shape: Shape<T, N>; extracted: Shape<T, N> }
extratTag
Extract a tag and its data from a shape. Parameter can be set to deleteSource.
/**
* **Curried**
*
* Extract tag and data. Id remains unchanged.
* @typeParam T - Tag's type, should be string or string literal union type
* @typeParam N - Node's type, should be string or string literal union type
* @param params - parameters
* @param shape - Origional shape
* @return TagAndData | undefined
* @public
*/
export const extractTag = <T = unknown, N = unknown>(params: {
/** path to the tag, including the tag's node's id */
path: IdType[]
/** tag's id */
id: IdType
/** if to delete from source */
deleteSource?: boolean
}) => (shape: Shape<T, N>): ExtractTagResult<T, N> | undefined
mountShape
/**
* **Curried**
* Mount node from one shape to a path and location at another shape. New ids will be generated for the nodes if idGenerator is provided. New partitionId will be generated for data if partitionIdGenerator is provided. Other wise, existing partion id of the node is used.
* @param params - parameters
* @param shape - Origional shape
* @return Shape
*/
export const mountShape = <T = unknown, N = unknown>(params: {
/** source node's path, not including the node itself */
sourcePath: IdType[]
/** source shape */
sourceShape: Shape<T, N>
/** source node's id */
sourceNodeId: IdType
/** target mount parent node's path. If to mount to root, use empty array */
targetPath?: IdType[]
/** id of the node to be mounted next to. if not provided, will be appended to the parent's children list */
position?: IdType
/** mount before or after */
before?: boolean
/** function to generate new id */
idGenerator?: IdGenerator
/** function to generate new partition */
partitionIdGenerator?: PartitionIdGenerator<T, N>
/** if to clone the source node tree */
clone?: boolean
}) => (shape: Shape<T, N>): Shape<T, N>
mountTag
/**
* **Curried**
*
* Mount tag to shape
* @typeParam T - tag type
* @typeParam N - node type
* @param params - parameters
* @param shape - Origional shape
* @return Shape | undefined
*/
export const mountTag = <T = unknown, N = unknown>(params: {
/** path of tag's node, including the node itself */
path: IdType[]
/** tag is to be next to */
position?: IdType
/** insert before or after */
before?: boolean
/** tag to be mounted */
tag: Tag<T>
/** Tag's data if any */
data?: Data
/** function to generate new partition id */
partitionIdGenerator: PartitionIdGenerator<T, N>
/** function to generate node and tag id */
idGenerator?: IdGenerator
}) => (shape: Shape<T, N>): Shape<T, N> | undefined
repartition
/**
* Generate new partition for a shape
* @typeParam T - Tag's type, should be string or string literal union type
* @typeParam N - Node's type, should be string or string literal union type
* @param params - parameters
* @param shape - Origional shape
* @return Shape
*/
export const repartition = <T = unknown, N = unknown>(params: {
/** function to generate new partition id */
partitionIdGenerator: PartitionIdGenerator<T, N>
/** function to generate node and tag id */
idGenerator?: IdGenerator
/** the planned path this shape to be mounted to. If this is the root shape, set to empty or undefined */
plannedPath?: IdType[]
}) => (shape: Shape<T, N>): Shape
cloneShape
/**
* Generate new id for node and tag.
* @typeParam T - Tag's type, should be string or string literal union type
* @typeParam N - Node's type, should be string or string literal union type
* @param shape - Origional shape
& @param idGenerator - Function to generate new id
* @return Shape
*/
export const cloneShape = <T = unknown, N = unknown>(
shape: Shape<T, N>,
idGenerator?: IdGenerator<T, N>
): Shape<T, N>
Core features further explained
Seperation of tree shape and node data
In NodeShapes, each node or tag has a unique key. Tree shape is maintained through children list of node id only. Node data is maintained in a partitioned data map for high performance data storage and access.
In classic tree structure, each node contains its own data. In most usecases, once a tree's shape is determined and a node is located, node data is accessed and updated frequently but tree shape changes less frequently.
In mordern functional programming world, a new tree is generated even when just a minor update. In NodeShapes, as data is flattened, access and update will be more efficient and less error prone. Through partitioned structure, storage, access and manipulation in data store would be more efficient as well.
Partitioned node and tag data
NodeShapes requires node or tag data to be assigned to a partition identified by a unique partition id. The reason is that in many cases, a subtree has semantic meaning and the data should be grouped together for batch loading and writing.
And also different data might need to be stored in different repositories. For example, a node that represents a long article, the article might be a flat file, while a node that represents some statistics, its data might be stored in a datawarehouse. With partition key design, user can patition data storage and access according to the nature of data and infrastructure.
How parition id is generated is entirely up to you. A simple example is that if you use NodeShapes for web site data structure, you can partition data by web page id so that an entire page's data can be retrieved in one round fetch for effective page renderring.
Explict tags wrapping node
When working on UI tree, it is often needed to wrap multiple tags on a node, for example, <a href='https://a.nice.site.com'><em>Nice</em></a>
. And each tag also has its own data, such as style and attributes.
NodeShapes explicitly support tags wrapping around node and provdes functions to manipulate tags and data. It is ideal for UI data structure or any other use cases where flexible decoration on node is needed.
Once wrapping is decided, UI designer needs to tune tag styles constantly but not so often to change tag wrappings. NodeShapes requries each tag has an unique UUID as well and put its data inside the flattened data partition too. This makes it easy and efficient to update tag's styles and attributes.
Plain JavaScript object data structure and functional manipulation
NodeShapes uses plain JavaScript object and all manipulation functions built in are functional. So that it can be used with any TypeScript / JavaScript library, such as React Redux. And you can also use your own favorite data manipulation library to manipulate the data as well.
Internally Nodeshapes uses the excellent optics library optics-ts.