mongoose-map-reduce-profit
v1.0.4
Published
A mongoose plugin to help ease mongo increment map-reduce jobs
Downloads
15
Maintainers
Readme
mongoose-map-reduce-profit
A mongoose plugin to help ease mongo/mongoose incremental map-reduce jobs
Creating incremental map-reduce jobs is a pain because testing is difficult. This plugin helps you by creating a way to define a job and then offering a way to test your job outside of mongo where you can set breakpoints and use console.log. This plugin also offers automatic job tracking so you don't have to keep track of the last run date. You may define as many jobs as necessary on a given collection.
Defining Map Reduce Jobs
You can define your job in your mongoose schema or in separate files. This plugin adds a defineIncrementalMapReduceJob()
function to mongoose schemas. Start by calling defineIncrementalMapReduceJob
with a unique job
name and then defining your job details with the returned Job
object. You should define all of your jobs at the same time you define all of your mongoose models.
var job1 = TaskSchema.defineIncrementalMapReduceJob( "job1" );
job1.out = "job1-results";
job1.verbose = true;
job1.buildQuery = function( runSpec ){
if( runSpec.incremental ){
// This is an incremental job, fetch only the new documents
return { createdDate: { $gt: runSpec.runDate } };
}else{
// This is a full job, fetch all the things
return {};
}
};
job1.map = function(){
var MS_IN_DAY = 24*60*60*1000;
var startDate = new Date( this.openedDate.getFullYear(), this.openedDate.getMonth(), this.openedDate.getDate(), 0, 0, 0, 0 );
var endDate = new Date( this.closedDate.getFullYear(), this.closedDate.getMonth(), this.closedDate.getDate(), 0, 0, 0, 0 );
var currentDate = startDate;
while( currentDate <= endDate ){
var dayString = currentDate.getFullYear() + "-" + currentDate.getMonth() + "-" + currentDate.getDate();
if( currentDate.getTime() === endDate.getTime() ){
emit( dayString, { open: 0, closed: 1 } );
}else{
emit( dayString, { open: 1, closed: 0 } );
}
currentDate = new Date( currentDate.getTime() + MS_IN_DAY );
}
};
job1.reduce = function( key, values ){
var totals = { open: 0, closed: 0 };
for( var n = 0; n < values.length; n++ ){
var record = values[n];
totals.open += record.open;
totals.closed += record.closed;
}
return totals;
};
The job object has the following customizable properties:
- buildQuery( runSpec ) - Required. The build query should return a mongoose query based on the passed in
runSpec
. TherunSpec
has anincremental
property which is a boolean for whether or not a full run is needed. Your query should use therunSpec.runDate
property to build a query to return the subset of records for your incremental job whenrunSpec.incremental
is true. - map - Required. The map function which will run inside mongo. The map function is responsible for calling
emit( key, value )
zero or more times for each record. The map function is called once per record and the current record is thethis
object. more info - reduce( key, values ) - Required. The reduce function which will run inside mongo. The reduce function is responsible for returning some aggregate value for the passed in values array. more info
- finalize( key, value ) - Optional. The finalize function which will run inside mongo. The finalize function should return a modified object from the passed in reducedValue. more info
- out - Optional. This can be the name of your output collection as a string or an object which specifies a mongo map-reduce action. more info
- limit - Optional. The maximum number of records to map-reduce.
- scope - Optional. An object whose properties will be available in the global scope of the map, reduce, & finalize functions when run inside mongo.
- scope - Optional. An object whose properties will be available in the global scope of the map, reduce, & finalize functions when run inside mongo. more info
- verbose - Optional. When true, mongo will output additional info to its log.
Running Map Reduce Jobs
This plugin adds a performIncrementalMapReduceJob
function to your models. Run a job by calling performIncrementalMapReduceJob
with the proper job name. Be sure you have already defined your jobs by the time your try to run one.
When you call performIncrementalMapReduceJob
, the plugin will query to see if the job has ever run before. If the job has been run, the buildQuery
function will be called with a runSpec that has runSpec.incremental
set to true and
runSpec.runDate
equal to the last date the job was run.
Task.performIncrementalMapReduceJob( "job1", function( error, results ){
// console.log( "GOT RESULTS", results, error ); // This won't work because it is run inside of mongo
});
Testing Map Reduce Jobs
This plugin adds a debugIncrementalMapReduceJob
function to your models. Test a job by calling debugIncrementalMapReduceJob
with the proper job name. Be sure you have already defined your jobs by the time your try to test one.
Task.debugIncrementalMapReduceJob( "job1", function( error, results ){
console.log( "GOT RESULTS", results, error ); // THIS WORKS!
});
Since the job tests are not run inside of mongo you can set breakpoints and use console.log
.
##Example
You can find an example of testing a job using mocha in the /test/main.test.js file. To run the test download this project and run grunt test
. Note: You'll need to have mongodb running when you run the tests.