dbscan

v0.0.1

Published

3 years ago

dbscan clustering algorithm

Downloads

572

0High
0Medium
0Low

fishrock123

dbscan cluster unsupervized

Clustering - DBScan algorithm

A node module, that uses DBScan unsupervised clustering algorithm, to return centroids and their cluster

This algorithm doesn't handle well the following:

Large datasets [computational complexity]
Number of dimensions ( > 16) - more computaitons, "curse of dimensionality"

about (2), given a fixed amount of points, the density of the points decreases exponentially. Meaning you won't be able to find cluster as you'll be wandering a lot. About "the curse", it means that Complexity: O(n^2) - space, O(n^2) - time

You'll find a pre-made 100 points 16-features vector sample file Uses stream, readline node modules

using jSHint, matchdep , stream, grunt.js

Use this with my permission only

Main app

points over map:

Initialization

we need to initialize the distance object, you can add any distance metric you wish to distance.js

var Distance 	 = require("./lib/distance"),
	distances    = new Distance(),
	// DBScan section
	DBScan       = require('./lib/dbscan.js'),
	dbscan       = new DBScan(distances)

after initialization, you need to create a multi-dimensional vector, an array of arrays: [[1,2],[1,4],[2,5],[5,9],...,[10,12]]

in code we grab it via stream from a line-by-line [newline] structured flat file [so we won't have limit on memory space]

	readline     = require('readline'), // using the UNSTABLE readline built-in node module
	// Stream section
	stream       = require('stream'),
	points       = [],
	rl, // read-line
	in_stream;
in_stream = fs.createReadStream('./points.txt'),
rl = readline.createInterface({
							input: in_stream,
							terminal: false
						  })

rl.on('line', function(line) {
	points.push(JSON.parse(line))
});

finally we run the clustering:

	var clustering_obj = dbscan.cluster(points,distanceFunction)
    console.log('FINISHED reading ' + points.length + ' and clustering them');

Pkg
Stats

Discover Tips

General search

Package details

User packages

Sponsor

About

Twitter

GitHub

Twitter

GitHub

Site

Open Software & Tools

Framework

Server

Data Store

Caching

CSS / Styling

Typeface

Avatars

Data Viz

Date formatting

Infinite scrolling

Markdown rendering

Repository url parsing

User data

Compiling

Types

Odds & Ends

dbscan

v0.0.1

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

Clustering - DBScan algorithm

ToC

Main app