weighted-reservoir-sampler
v1.0.0
Samples random subsets from streams.
npm install weighted-reservoir-sampler
This package implements the A-ES algorithm described in "Weighted Random Sampling over Data Streams".
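For orientation, A-ES gives each item with weight w a key u^(1/w), where u is a uniform random number, and keeps the sampleSize items with the largest keys. The following is a minimal sketch of that idea under those assumptions, not this package's actual source; the function name aEsSample and its parameters are hypothetical.

```javascript
// Minimal sketch of A-ES (not this package's source).
// aEsSample and its parameters are hypothetical names used for illustration.
function aEsSample(items, sampleSize, weightFunction, random) {
    if (sampleSize < 1) { return []; }
    var reservoir = []; // entries of the form { key: number, item: any }
    for (var i = 0; i < items.length; i++) {
        var weight = weightFunction(items[i]);
        if (!(weight > 0)) { continue; } // skip non-positive (or NaN) weights
        // Key is u^(1/w); heavier items tend to receive keys closer to 1.
        var key = Math.pow(random(), 1 / weight);
        if (reservoir.length < sampleSize) {
            reservoir.push({ key: key, item: items[i] });
        } else {
            // Find the entry with the smallest key and replace it if beaten.
            var minIndex = 0;
            for (var j = 1; j < reservoir.length; j++) {
                if (reservoir[j].key < reservoir[minIndex].key) { minIndex = j; }
            }
            if (key > reservoir[minIndex].key) {
                reservoir[minIndex] = { key: key, item: items[i] };
            }
        }
    }
    return reservoir.map(function(entry) { return entry.item; });
}
```

The linear scan for the smallest key keeps the sketch short; an implementation aimed at large sample sizes would more likely keep the reservoir in a min-heap.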
Basic Usage
var Sampler = require('weighted-reservoir-sampler');
// Example of a sampler in which odd numbers carry twice the weight of even numbers
var sampler = new Sampler({
    sampleSize: 9,
    weightFunction: function(item) { return (item % 2) + 1; }
});
// Offer every candidate item to the sampler
for (var i = 0; i < 150; i++) { sampler.push(i); }
// Retrieve the sample of 9 items and reset the sampler
var sample = sampler.end();
Class: WeightedReservoirSampler
new WeightedReservoirSampler([options])
The WeightedReservoirSampler constructor takes an optional options object. The available options are described in the Configuration section.
weightedReservoirSampler.config([options])
If the options argument is present, merges the options object into the configuration of this weightedReservoirSampler instance and returns the instance.
If options is missing, returns the configuration object of this weightedReservoirSampler instance.
weightedReservoirSampler.setConfig(config)
Replaces this instance's configuration object with config (supplying defaults for missing options).
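The difference between merging with config() and replacing with setConfig() can be modeled with plain objects. This is only an illustration of the behavior described above and does not call the package; DEFAULTS, mergeConfig and replaceConfig are hypothetical names, and the default values are the ones listed in the Configuration section.

```javascript
// Illustrative model of the documented behavior using plain objects.
// DEFAULTS, mergeConfig and replaceConfig are hypothetical, not package APIs.
var DEFAULTS = {
    sampleSize: 1,
    weightFunction: function() { return 1; },
    random: Math.random
};

// Models config(options): keep existing settings, override only the given keys.
function mergeConfig(current, options) {
    return Object.assign({}, current, options);
}

// Models setConfig(config): discard existing settings, fall back to defaults.
function replaceConfig(config) {
    return Object.assign({}, DEFAULTS, config);
}

var current = replaceConfig({ sampleSize: 10, random: function() { return 0.5; } });
var merged = mergeConfig(current, { sampleSize: 20 }); // custom random is kept
var replaced = replaceConfig({ sampleSize: 20 });      // random reverts to the default
```

In this model, merging is the safer choice when only one option changes, while replacing is useful for returning an instance to a known state.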
weightedReservoirSampler.push(item)
Pushes an item to this instance's sample buffer. This function should be called for every item you wish to consider for inclusion in the sample.
weightedReservoirSampler.end()
Returns the sample, and resets this instance's sample buffer for reuse. This should be called when you have pushed all items you wish to be considered in the sample.
Configuration
The following sections document the options that can be passed to the constructor and to the config() and setConfig() functions.
sampleSize
The size of the random subset to be retained when pushing items to the weightedReservoirSampler.
var weightedReservoirSampler = new WeightedReservoirSampler({
    sampleSize: 10
});
Default: 1
weightFunction
A weight function that is applied to every item pushed to the weightedReservoirSampler. The weight it returns determines how likely the item is to be selected. With a sampleSize of 1, an item with a weight of 10 is ten times as likely to be selected as an item with a weight of 1; with larger sample sizes the same ratio applies to each successive selection rather than to the overall chance of appearing in the sample.
Note: The weight function should return a number greater than 0; otherwise the corresponding item is ignored.
var weightedReservoirSampler = new WeightedReservoirSampler({
    weightFunction: function(item) {
        return item.length * item.width;
    }
});
Default: function() { return 1; }
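Because items whose weight is not greater than 0 are ignored, a weight function can also act as a filter for malformed input. The sketch below assumes items are objects with numeric length and width fields, as in the example above; safeAreaWeight is a hypothetical name. How the package treats NaN weights is not documented here, so the sketch maps every invalid case to 0 explicitly.

```javascript
// Hypothetical weight function: weight by area, and return 0 (item ignored)
// when a dimension is missing, non-numeric, or not positive.
function safeAreaWeight(item) {
    if (!item) { return 0; }
    var length = Number(item.length);
    var width = Number(item.width);
    // NaN fails both comparisons, so missing or non-numeric fields fall through to 0.
    if (!(length > 0) || !(width > 0)) { return 0; }
    var area = length * width;
    return isFinite(area) ? area : 0;
}
```

A function like this could be supplied as the weightFunction option.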
random
The function to use for random number generation. The output of this function should be a number in the range [0, 1).
var weightedReservoirSampler = new WeightedReservoirSampler({
    random: function() {
        // Replace with any generator that returns a number in [0, 1)
        var randomNumber = Math.random();
        return randomNumber;
    }
});
Default: Math.random
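One reason to supply a custom random function is reproducibility, for example in tests. The sketch below uses mulberry32, a small seedable generator that is not part of this package; it is shown as one possible choice, and the seed value is arbitrary.

```javascript
// mulberry32: a small seedable generator (not part of this package).
// Returns a function that produces numbers in the range [0, 1).
function mulberry32(seed) {
    var state = seed >>> 0; // keep the state as an unsigned 32-bit integer
    return function() {
        state = (state + 0x6D2B79F5) >>> 0;
        var t = state;
        t = Math.imul(t ^ (t >>> 15), t | 1);
        t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
        return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
    };
}

// Two generators created with the same seed yield the same sequence.
var seededRandom = mulberry32(42);
// It could then be passed as the random option, e.g. { random: seededRandom }.
```

Note that mulberry32 is not cryptographically secure; it is only suitable where repeatable, statistically reasonable output is enough.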