wellness
v4.0.6
A module to handle useful health checks for clustered production applications.
This module provides a healthcheck framework for existing applications using express or applications having no web framework (it can create its own express server). It does the following:
- provides an optional timeout check for workers doing useful work
- for clustered servers, distributes the health info to all workers, so any worker can respond to the health check
- enables you to create your own custom health checks and use existing wellness health check modules
- shows valuable information in the healthcheck response
Designed for production systems to ensure they are available. Wellness has a
single built-in check: a timeout (adjustable with the workerTimeOut option
parameter) that triggers if a worker fails to call workerIsWorking before the
timeout expires.
With workers, more than half of the workers must have done no useful work in the
time period for the check to be considered failed. The time period is adjustable
by setting the option workerTimeOut.
NOTES:
- If the option workerTimeOut is 0, the default, there is no worker timeout.
- If there is no expressApp option parameter, wellness creates an express application and adds its healthcheck route to it.
- The health check only returns 500 when NODE_ENV === 'production'.
- To force the healthcheck to always fail, set the environment variable HEALTHCHECK_ALWAYS_FAILS to "true".
- If process.platform === "linux", the linux distribution information is shown on the output.
- All the healthchecks appear in an array called "healthChecks".
- If any of the healthChecks fail, the response is an HTTP status of 500, and 200 otherwise.
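The status rules in these notes can be sketched as a small function. This is only an illustration, not the module's actual implementation: pickStatusCode is a hypothetical helper, and it assumes that the production-only rule also applies when HEALTHCHECK_ALWAYS_FAILS forces a failure, which the notes do not state explicitly.

```javascript
// Hypothetical helper, NOT part of the wellness API.
// anyCheckFailed: true when at least one health check failed.
// env: an object shaped like process.env.
function pickStatusCode(anyCheckFailed, env) {
    // Setting HEALTHCHECK_ALWAYS_FAILS to "true" forces a failure.
    var failed = anyCheckFailed || env.HEALTHCHECK_ALWAYS_FAILS === 'true';
    // Assumption: a 500 is only ever returned in production.
    if (failed && env.NODE_ENV === 'production')
        return 500;
    return 200;
}
```

Calling pickStatusCode(false, process.env) in a development shell would therefore give 200 under these assumptions, even with HEALTHCHECK_ALWAYS_FAILS set.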
Example Usage, Non-Clustered Express Server
var wellness = require('wellness');
var express = require('express');
var app = express();
var opts = {
    healthCheckUriPath: '/healthcheck',
    expressApp: app,
    workerTimeOut: 50000
};
function doSomethingUseful() { wellness.workerIsWorking(); }
wellness.nonClusterInit(opts, function(err) {
    if (err)
        return console.error(err.message);
    app.listen(3000, function () {
        console.log('Example app listening on port 3000!');
        doSomethingUseful();
        setInterval(doSomethingUseful, 1000);
    });
});
Running the above code, you can test with:
$ curl http://localhost:3000/healthcheck
Which supplies the following output:
{
    "healthChecks": [
        "checkWorkers: 8 workers were okay out of 8"
    ],
    "nodeVersions": {
        "ares": "1.10.1-DEV",
        "http_parser": "2.7.0",
        "icu": "57.1",
        "modules": "48",
        "node": "6.2.2",
        "openssl": "1.0.2h",
        "uv": "1.9.1",
        "v8": "5.0.71.52",
        "zlib": "1.2.8"
    },
    "platform": {
        "code": "xenial",
        "name": "Ubuntu 16.04.1 LTS",
        "os": "Ubuntu",
        "release": "16.04"
    },
    "version": "4.0.0"
}
NOTE: The platform output for Linux has more information than the Windows or OS X targets.
Example Usage, Clustered Express Server
var wellness = require('wellness');
var express = require('express');
var app = express();
var numCPUs = require('os').cpus().length;
var opts = {
    healthCheckUriPath: '/healthcheck',
    expressApp: app,
    workerTimeOut: 2000,
    numWorkers: numCPUs
};
function doSomethingUseful() {
    wellness.workerIsWorking();
}
var cluster = require('cluster');
if (cluster.isMaster) {
    for (var i = 0; i < numCPUs; i++)
        cluster.fork();
    wellness.clusterPostForkInit();
} else {
    wellness.clusterPostForkInit(opts, function(err) {
        if (err) {
            console.error(err.message);
            return;
        }
        app.listen(3000, function () {
            console.log('Example app listening on port 3000!');
            doSomethingUseful();
            setInterval(doSomethingUseful, 1000);
        });
    });
}
Example Usage, Non-Clustered Express Server
var wellness = require('wellness');
var express = require('express');
var app = express();
var opts = {
    healthCheckUriPath: '/healthcheck',
    expressApp: app,
    workerTimeOut: 3000
};
function doSomethingUseful() {
    wellness.workerIsWorking();
}
wellness.nonClusterInit(opts, function(err) {
    if (err)
        return console.error(err.message);
    app.listen(3000, function () {
        doSomethingUseful();
        setInterval(doSomethingUseful, 1000);
    });
});
API
nonClusterInit(opts, callback)
The init method sets up wellness for use with a non-clustered server. The following properties can be set on the opts object to configure wellness:
- healthCheckUriPath - A string URI path, e.g. '/healthcheck'. OPTIONAL default is '/healthcheck'.
- workerPercentFailed - A decimal fraction (e.g. 0.50, which is 50%). When the fraction of workers that failed is greater than this value, the worker healthcheck fails. OPTIONAL default is 0.50 (50%).
- expressApp - The express application to add the health check to. OPTIONAL wellness creates its own express application when none is given.
- port - If there is no expressApp, wellness creates an express instance listening on this port. OPTIONAL default is 8889.
- workerTimeOut - A number of milliseconds. A worker that has not called workerIsWorking() within this time is considered "dead". OPTIONAL default is 0, meaning no checks for worker timeout.
- logger - A logger object with info, warn, and error methods. OPTIONAL uses console output by default.
- packageJsonPath - Path to the package.json file, so the health check can report the application version. OPTIONAL if not supplied, process.cwd() is assumed.
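To illustrate the options above, here is a minimal sketch that runs wellness without an existing express application, so that wellness creates its own server on the given port. buildStandaloneOpts and startStandalone are hypothetical helpers written for this example, the wellness module is passed in as an argument rather than required, and the logger methods are assumed to receive a single message string.

```javascript
// Hypothetical helper: builds an opts object from the documented options.
function buildStandaloneOpts(port) {
    return {
        healthCheckUriPath: '/healthcheck',
        port: port,               // used because no expressApp is given
        workerTimeOut: 5000,      // milliseconds
        workerPercentFailed: 0.5,
        // Assumption: each logger method takes one message string.
        logger: {
            info: function (msg) { console.log('[info] ' + msg); },
            warn: function (msg) { console.warn('[warn] ' + msg); },
            error: function (msg) { console.error('[error] ' + msg); }
        }
    };
}

// Hypothetical helper: `wellness` is the module object,
// e.g. startStandalone(require('wellness'), 8889, done).
function startStandalone(wellness, port, done) {
    wellness.nonClusterInit(buildStandaloneOpts(port), function (err) {
        if (err)
            return done(err);
        // Signal useful work once a second so the timeout check passes.
        var timer = setInterval(function () {
            wellness.workerIsWorking();
        }, 1000);
        done(null, timer);
    });
}
```

The returned timer can be passed to clearInterval when the application shuts down.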
clusterPostForkInit(opts, callback)
The init method sets up wellness for use with a clustered server. The properties that can be set on the opts object to configure wellness are the same as on nonClusterInit.
addCheck(func)
Adds a check to the list of health checks to be performed. The health check function receives a callback, which it must call with an error as the first argument and an optional successful status as the second argument. Here is a disk space check function as an example:
var diskspace = require('diskspace');
var wellness = require('wellness');
var is = require('is2');
var inspect = require('util').inspect;
/**
 * The check for free diskspace.
 * @param {function} cb - A standard call back for async.parallel
 */
function checkDiskSpace(cb) {
    if (!is.func(cb)) {
        // cb is not callable here, so throw instead of calling it
        throw new Error('Bad cb argument checkDiskSpace: '+inspect(cb));
    }
    diskspace.check('/', function (err, total, free) {
        if (err)
            return cb(err);
        // free space as a percentage with one decimal place
        var freePercent = Math.floor((free / total)*1000) / 10;
        if (freePercent < 20) {
            wellness.setError();
            return cb(null, 'Low diskspace: '+freePercent+'%', false);
        }
        wellness.clearError();
        return cb(null, 'Free diskspace: '+freePercent+'%', true);
    });
}
wellness.addCheck(checkDiskSpace);
Every health check function is called by async.parallel, so it must NOT pass an error as the first callback argument if you want all the health checks to run.
Also, in every success case, pass a non-null truthy value. If there is no value or the value is "falsey", the status code returned from the health check is 500, indicating server failure.
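One way to follow these two rules is to wrap each check so that an error becomes a reported result instead of a callback error. The sketch below assumes the same pattern as the disk space example (setError/clearError plus a message and a boolean); wrapCheck and the probe function are hypothetical and not part of the wellness API.

```javascript
// Hypothetical helper, NOT part of the wellness API.
// wellness: the module object (must offer setError and clearError).
// name: a label for the check in the output.
// probe(cb): your own async test; it calls cb(err, message).
function wrapCheck(wellness, name, probe) {
    return function (cb) {
        probe(function (err, message) {
            if (err) {
                // Report the failure as a result, not as an error,
                // so async.parallel keeps running the other checks.
                wellness.setError();
                return cb(null, name + ' failed: ' + err.message, false);
            }
            wellness.clearError();
            return cb(null, name + ': ' + message, true);
        });
    };
}
```

A wrapped check would then be registered with wellness.addCheck(wrapCheck(wellness, 'db', pingDb)), where pingDb stands for your own probe function.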
workerIsWorking()
A call that workers, or non-clustered master processes, make to signal that they
are doing useful work. Once over half the workers have not signalled using this
function within the time period workerTimeOut, the health check fails.
You must set workerTimeOut to a value greater than 0 for this function to be
useful.
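The rule described here can be sketched as follows. This is an illustration of the documented behavior, not the module's actual code: workersHealthy is a hypothetical function, and it assumes each worker's last workerIsWorking() call is available as a millisecond timestamp.

```javascript
// Hypothetical helper, NOT the module's implementation.
// lastSeen: timestamps (ms) of each worker's last workerIsWorking() call.
// now: the current time in ms.
// workerTimeOut: timeout in ms; 0 disables the check.
// workerPercentFailed: allowed fraction of failed workers, e.g. 0.5.
function workersHealthy(lastSeen, now, workerTimeOut, workerPercentFailed) {
    if (workerTimeOut === 0 || lastSeen.length === 0)
        return true;
    // A worker counts as failed when it has been silent longer than the timeout.
    var failed = lastSeen.filter(function (t) {
        return now - t > workerTimeOut;
    }).length;
    // The check fails only when MORE than the allowed fraction has failed.
    return failed / lastSeen.length <= workerPercentFailed;
}
```

With four workers and the default fraction of 0.5, two silent workers still pass under this sketch, while three silent workers fail the check.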