fast-k-bucket
v5.1.0
Published
Kademlia DHT K-bucket implementation as a binary tree
Downloads
8
Readme
fast-k-bucket
Kademlia DHT K-bucket implementation as a binary tree.
Contributors
@tristanls, @mikedeboer, @deoxxa, @feross, @nathanph, @allouis, @fanatid, @robertkowalski, @nazar-pc, @jimmywarting, @achingbrain
Installation
npm install k-bucket
Tests
npm test
Usage
const KBucket = require('k-bucket')
const kBucket1 = new KBucket({
localNodeId: Buffer.from('my node id')
})
// or without using Buffer (for example, in the browser)
const id = 'my node id'
const nodeId = new Uint8Array(id.length)
for (let i = 0, len = nodeId.length; i < len; ++i) {
nodeId[i] = id.charCodeAt(i)
}
const kBucket2 = new KBucket({
localNodeId: nodeId // default: random data
})
Overview
A Distributed Hash Table (DHT) is a decentralized distributed system that provides a lookup table similar to a hash table.
k-bucket is an implementation of a storage mechanism for keys within a DHT. It stores contact
objects which represent locations and addresses of nodes in the decentralized distributed system. contact
objects are typically identified by a SHA-1 hash, however this restriction is lifted in this implementation. Additionally, node ids of different lengths can be compared.
This Kademlia DHT k-bucket implementation is meant to be as minimal as possible. It assumes that contact
objects consist only of id
. It is useful, and necessary, to attach other properties to a contact
. For example, one may want to attach ip
and port
properties, which allow an application to send IP traffic to the contact
. However, this information is extraneous and irrelevant to the operation of a k-bucket.
arbiter function
This k-bucket implementation implements a conflict resolution mechanism using an arbiter
function. The purpose of the arbiter
is to choose between two contact
objects with the same id
but perhaps different properties and determine which one should be stored. As the arbiter
function returns the actual object to be stored, it does not need to make an either/or choice, but instead could perform some sort of operation and return the result as a new object that would then be stored. See kBucket._update(node, index, contact) for detailed semantics of which contact
(incumbent
or candidate
) is selected.
For example, an arbiter
function implementing a vectorClock
mechanism would look something like:
// contact example
var contact = {
id: Buffer.from('contactId'),
vectorClock: 0
};
function arbiter(incumbent, candidate) {
if (incumbent.vectorClock > candidate.vectorClock) {
return incumbent;
}
return candidate;
};
Alternatively, consider an arbiter that implements a Grow-Only-Set CRDT mechanism:
// contact example
const contact = {
id: Buffer.from('workerService'),
workerNodes: {
'17asdaf7effa2': { host: '127.0.0.1', port: 1337 },
'17djsyqeryasu': { host: '127.0.0.1', port: 1338 }
}
};
function arbiter(incumbent, candidate) {
// we create a new object so that our selection is guaranteed to replace
// the incumbent
const merged = {
id: incumbent.id, // incumbent.id === candidate.id within an arbiter
workerNodes: incumbent.workerNodes
}
Object.keys(candidate.workerNodes).forEach(workerNodeId => {
merged.workerNodes[workerNodeId] = candidate.workerNodes[workerNodeId];
})
return merged
}
Notice that in the above case, the Grow-Only-Set assumes that each worker node has a globally unique id.
Documentation
KBucket
Implementation of a Kademlia DHT k-bucket used for storing contact (peer node) information.
For a step by step example of k-bucket operation you may find the following slideshow useful: Distribute All The Things.
KBucket starts off as a single k-bucket with capacity of k. As contacts are added, once the k+1 contact is added, the k-bucket is split into two k-buckets. The split happens according to the first bit of the contact node id. The k-bucket that would contain the local node id is the "near" k-bucket, and the other one is the "far" k-bucket. The "far" k-bucket is marked as don't split in order to prevent further splitting. The contact nodes that existed are then redistributed along the two new k-buckets and the old k-bucket becomes an inner node within a tree data structure.
As even more contacts are added to the "near" k-bucket, the "near" k-bucket will split again as it becomes full. However, this time it is split along the second bit of the contact node id. Again, the two newly created k-buckets are marked "near" and "far" and the "far" k-bucket is marked as don't split. Again, the contact nodes that existed in the old bucket are redistributed. This continues as long as nodes are being added to the "near" k-bucket, until the number of splits reaches the length of the local node id.
As more contacts are added to the "far" k-bucket and it reaches its capacity, it does not split. Instead, the k-bucket emits a "ping" event (register a listener: kBucket.on('ping', function (oldContacts, newContact) {...});
and includes an array of old contact nodes that it hasn't heard from in a while and requires you to confirm that those contact nodes still respond (literally respond to a PING RPC). If an old contact node still responds, it should be re-added (kBucket.add(oldContact)
) back to the k-bucket. This puts the old contact on the "recently heard from" end of the list of nodes in the k-bucket. If the old contact does not respond, it should be removed (kBucket.remove(oldContact.id)
) and the new contact being added now has room to be stored (kBucket.add(newContact)
).
Public API
- KBucket.arbiter(incumbent, candidate)
- KBucket.distance(firstId, secondId)
- new KBucket(options)
- kBucket.add(contact)
- kBucket.closest(id [, n = Infinity])
- kBucket.count()
- kBucket.get(id)
- kBucket.metadata
- kBucket.remove(id)
- kBucket.toArray()
- kBucket.toIterable()
KBucket.arbiter(incumbent, candidate)
incumbent
: Object Contact currently stored in the k-bucket.candidate
: Object Contact being added to the k-bucket.- Return: Object Contact to updated the k-bucket with.
Default arbiter function for contacts with the same id
. Uses contact.vectorClock
to select which contact to update the k-bucket with. Contact with larger vectorClock
field will be selected. If vectorClock
is the same, candidat
will be selected.
KBucket.distance(firstId, secondId)
firstId
: Uint8Array Uint8Array containing first id.secondId
: Uint8Array Uint8Array containing second id.- Return: Integer The XOR distance between
firstId
andsecondId
.
Default distance function. Finds the XOR distance between firstId and secondId.
new KBucket(options)
options
:arbiter
: Function (Default: vectorClock arbiter)function (incumbent, candidate) { return contact; }
An optionalarbiter
function that given twocontact
objects with the sameid
returns the desired object to be used for updating the k-bucket. For more details, see arbiter function.distance
: Functionfunction (firstId, secondId) { return distance }
An optionaldistance
function that gets twoid
Uint8Arrays and return distance (as number) between them.localNodeId
: Uint8Array An optional Uint8Array representing the local node id.numberOfNodesPerKBucket
: Integer (Default: 20) The number of nodes that a k-bucket can contain before being full or split.
Creates a new KBucket.
kBucket.add(contact)
contact
: Object The contact object to add.id
: Uint8Array Contact node id.- Any satellite data that is part of the
contact
object will not be altered, onlyid
is used.
- Return: Object The k-bucket itself.
Adds a contact
to the k-bucket.
kBucket.closest(id [, n = Infinity])
id
: Uint8Array Contact node id.n
: Integer (Default: Infinity) The maximum number of closest contacts to return.- Return: Array Maximum of
n
closest contacts to the node id.
Get the n
closest contacts to the provided node id. "Closest" here means: closest according to the XOR metric of the contact
node id.
kBucket.count()
- Return: Number The number of contacts held in the tree
Counts the total number of contacts in the tree.
kBucket.get(id)
id
: Uint8Array The ID of thecontact
to fetch.- Return: Object The
contact
if available, otherwise null
Retrieves the contact
.
kBucket.remove(id)
id
: Uint8Array The ID of thecontact
to remove.- Return: Object The k-bucket itself.
Removes contact
with the provided id
.
kBucket.toArray()
- Return: Array All of the contacts in the tree, as an array
Traverses the tree, putting all the contacts into one array.
kBucket.toIterable()
- Return: Iterable All of the contacts in the tree, as an iterable
Traverses the tree, yielding contacts as they are encountered.
Releases
Policy
We follow the semantic versioning policy (semver.org) with a caveat:
Given a version number MAJOR.MINOR.PATCH, increment the:
MAJOR version when you make incompatible API changes, MINOR version when you add functionality in a backwards-compatible manner, and PATCH version when you make backwards-compatible bug fixes.
caveat: Major version zero is a special case indicating development version that may make incompatible API changes without incrementing MAJOR version.
Sources
The implementation has been sourced from: