mongodb-schema-simulator
v1.0.0
MongoDB schema simulator: lets you simulate schema behavior against MongoDB
MongoDB Schema Simulator Tool
The MongoDB Schema Simulator Tool was built to allow simulating the schemas outlined in The Little MongoDB Schema Design Book.
| Links |
|:-----------|
| Simulation examples |
Installing the tool
Installing the tool is as simple as
npm install -g mongodb-schema-simulator
It installs two executables:
- schema-monitor
- schema-agent

schema-agent is a load-generating agent that applies traffic to your MongoDB topology.
schema-monitor is the monitor that orchestrates all the agents and generates the final reports.
To see the options available for schema-agent and schema-monitor, run either command with the -h command line option.
The goal of this tool is to allow you to simulate the interaction of multiple different scenarios when applying load. We are going to use an ecommerce website as the example for the tool.
Ecommerce website
We are going to simulate an ecommerce website schema. We've decided that we wish to look at two particular aspects.
- Users browsing the product catalog
- Users successfully adding 5 products to their cart and checking it out.
The tool comes with several built-in scenarios (more will be added in the future). You can list all the available scenarios by entering the following command.
schema-monitor --scenarios
This will output a table containing the name and description of each scenario supported by the tool. Let's view the details of a particular scenario.
schema-monitor --scenarios cart_reservation_successful
This will output the JSON description of the scenario. In this case it might look something like the following.
{
  "name": "cart_reservation_successful",
  "title": "fixed number of cart items with reservation",
  "description": "simulates successful carts with a fixed number of items in the cart with reservation",
  "params": {
    "numberOfItems": {
      "name": "number of items in the cart",
      "type": "number",
      "default": 5
    },
    "numberOfProducts": {
      "name": "number of products available",
      "type": "number",
      "default": 100
    },
    "sizeOfProductsInBytes": {
      "name": "size of products in bytes",
      "type": "number",
      "default": 1024
    }
  }
}
The scenario contains a name, title, description and a set of parameters that can be adjusted to tune its behavior. In this case we can tune the numberOfItems in a cart, the numberOfProducts available and the sizeOfProductsInBytes for each product in the catalog.
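As a rough illustration of how the parameter descriptions relate to the values you supply, the sketch below starts from each parameter's default and applies overrides on top. The resolveParams helper is hypothetical and not part of the tool; whether the tool itself fills in defaults for omitted parameters is not stated here.

```javascript
// Trimmed copy of the description printed by
// `schema-monitor --scenarios cart_reservation_successful`.
const description = {
  name: 'cart_reservation_successful',
  params: {
    numberOfItems: { name: 'number of items in the cart', type: 'number', default: 5 },
    numberOfProducts: { name: 'number of products available', type: 'number', default: 100 },
    sizeOfProductsInBytes: { name: 'size of products in bytes', type: 'number', default: 1024 }
  }
};

// Hypothetical helper (an assumption, not the tool's API): begin with the
// declared defaults, then apply overrides, rejecting unknown parameter
// names and values whose typeof does not match the declared type.
function resolveParams(description, overrides) {
  const resolved = {};
  for (const [key, spec] of Object.entries(description.params)) {
    resolved[key] = spec.default;
  }
  for (const [key, value] of Object.entries(overrides || {})) {
    const spec = description.params[key];
    if (!spec) throw new Error('unknown parameter: ' + key);
    if (typeof value !== spec.type) {
      throw new Error(key + ' must be of type ' + spec.type);
    }
    resolved[key] = value;
  }
  return resolved;
}

// Override only the catalog size; the other two keep their defaults.
const params = resolveParams(description, { numberOfProducts: 1000 });
console.log(params);
```

A check like this can catch a misspelled parameter name before a long simulation run starts.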
We also want to use a catalog browsing scenario, so let's look at the one that retrieves all the products for a specific category.
schema-monitor --scenarios retrieve_products_by_category
The output is the following.
{
  "name": "retrieve_products_by_category",
  "title": "retrieve all the products for a specific category",
  "description": "retrieve all the products for a specific category",
  "params": {
    "numberOfProducts": {
      "name": "the number of preloaded products",
      "type": "number",
      "default": 1000
    },
    "treeStructure": {
      "name": "the tree structure layout",
      "type": "object",
      "default": [
        {
          "level": 0,
          "width": 5
        },
        {
          "level": 1,
          "width": 5
        },
        {
          "level": 2,
          "width": 5
        },
        {
          "level": 3,
          "width": 5
        },
        {
          "level": 4,
          "width": 5
        }
      ]
    }
  }
}
The parameters here are numberOfProducts, the number of products in our catalog, and treeStructure, which defines the number of levels in our category tree and how many nodes are in each level. Note that this scenario is independent and does not reuse the products from the cart_reservation_successful scenario.
Let's put these two scenarios together into a complete simulation that we wish to run against a single MongoDB instance. First create a new file, which we will call ecommerce_simulation.js. Open the file and enter the following.
var carts = {
  // Built-in scenario to run
  name: 'cart_reservation_successful',
  // Collections the scenario operates on
  collections: {
    carts: 'carts', products: 'products',
    inventories: 'inventories', order: 'orders'
  },
  // Scenario parameters (override the defaults)
  params: {
    numberOfItems: 5, numberOfProducts: 1000,
    sizeOfProductsInBytes: 1024
  },
  db: 'shop',
  writeConcern: {
    metadata: { w: 1, wtimeout: 10000 }
  },
  // Runs once before the scenario; any dropDatabase error is ignored here
  setup: function(db, callback) {
    db.dropDatabase(function(err) {
      return callback();
    });
  },
  execution: {
    iterations: 100, numberOfUsers: 50
  }
};

var browse = {
  name: 'retrieve_products_by_category',
  collections: {
    categories: 'categories', products: 'category_products'
  },
  params: {
    numberOfProducts: 10000,
    treeStructure: [
      { level: 0, width: 5 }, { level: 1, width: 5 },
      { level: 2, width: 5 }, { level: 3, width: 5 },
      { level: 4, width: 5 }
    ]
  },
  db: 'shop',
  // readPreference settings
  readPreferences: {
    categories: { mode: 'secondaryPreferred', tags: {} },
    products: { mode: 'secondaryPreferred', tags: {} }
  },
  // Runs once before the scenario; any dropDatabase error is ignored here
  setup: function(db, callback) {
    db.dropDatabase(function(err) {
      return callback();
    });
  },
  execution: {
    iterations: 100, numberOfUsers: 150
  }
};

module.exports = [carts, browse];
Each simulation is composed of one or more of the tool's built-in scenarios. Each scenario defines the following fields.
- name: The name of the scenario we wish to execute.
- collections: The collections we wish to run the scenario operations against.
- params: Parameters to execute the scenario against.
- db: The database to run the scenario against.
- setup: A setup function that is run once for the scenario, allowing us to do setup operations like dropping the database, creating shard keys, etc.
- execution: The execution parameters for the simulation tool relative to this scenario.
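The field list above can be turned into a quick sanity check to run over a simulation file before handing it to schema-monitor. This is only a sketch: validateScenario is a hypothetical helper, and it assumes all six fields are mandatory, which the text above does not state explicitly.

```javascript
// Field names taken from the list above. Treating every one of them as
// required is an assumption made for this sketch.
const REQUIRED_FIELDS = ['name', 'collections', 'params', 'db', 'setup', 'execution'];

// Hypothetical helper: returns a list of problems found in one scenario
// object (an empty list means nothing obvious is wrong).
function validateScenario(scenario) {
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    if (scenario[field] === undefined) {
      problems.push('missing field: ' + field);
    }
  }
  if (scenario.setup !== undefined && typeof scenario.setup !== 'function') {
    problems.push('setup must be a function');
  }
  return problems;
}

// A minimal scenario shaped like the ones in ecommerce_simulation.js.
const exampleScenario = {
  name: 'cart_reservation_successful',
  collections: { carts: 'carts', products: 'products' },
  params: { numberOfItems: 5 },
  db: 'shop',
  setup: function(db, callback) { callback(); },
  execution: { iterations: 100, numberOfUsers: 50 }
};

console.log(validateScenario(exampleScenario));
console.log(validateScenario({ name: 'incomplete', setup: 'not a function' }));
```

To check a whole simulation, the same function could be mapped over the array exported by the simulation file.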
Running a simulation
Let's put the ecommerce_simulation.js file through its paces against a single MongoDB instance. First start up a MongoDB instance.
mongod
Next let's execute the simulation.
schema-monitor -s ./ecommerce_simulation.js
The simulation will now start up. Once it has finished, you can find the resulting report in ./out/index.html, the tool's default output location.
In this case we ran the tool using locally spawned agent processes, and schema-monitor managed the lifecycle of the load generation. You might find that you need to run agents on different machines to create enough load for your particular tests.
Running remote agents
We are going to run the same simulation as before but this time we are going to boot up two separate agents and have the monitor control them.
First let's boot the monitor in remote agent mode.
schema-monitor -s ./ecommerce_simulation.js -r
The process will start and wait for the number of agents needed to execute the scenario (the default is 2 processes; this can be controlled using the -n flag).
Open up two new terminals and in the first enter the following.
schema-agent -p 5024 -s localhost -m 5100
And in the second terminal enter the following.
schema-agent -p 5025 -s localhost -m 5100
Notice that the scenario now kicks off just as it did when we ran with the local agents.
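The two agent commands above differ only in the -p value, so launching them can be scripted instead of typed into separate terminals. The sketch below is an assumption-laden convenience, not something the tool provides: agentArgs and launchAgents are hypothetical helpers, and the -s and -m values are copied verbatim from the commands above without assuming what they mean.

```javascript
const { spawn } = require('child_process');

// Build the argument list for one agent, mirroring the commands above and
// varying only the -p value.
function agentArgs(port) {
  return ['-p', String(port), '-s', 'localhost', '-m', '5100'];
}

// Hypothetical helper: start one schema-agent process per port. It is
// defined but not called here; it needs schema-agent on the PATH and a
// monitor already started with `schema-monitor -s ./ecommerce_simulation.js -r`.
function launchAgents(ports) {
  return ports.map(function(port) {
    return spawn('schema-agent', agentArgs(port), { stdio: 'inherit' });
  });
}

// Print the command lines that launchAgents([5024, 5025]) would run.
for (const port of [5024, 5025]) {
  console.log('schema-agent ' + agentArgs(port).join(' '));
}
```

Calling launchAgents([5024, 5025]) from a script would then stand in for the two terminals.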
Optimize for latency
The schema simulation tool lets you optimize against latency. For example, you might want to know how many simultaneous users you can handle while keeping the scenario completion time close to a specific latency. In other words, how many simultaneous users can we support while keeping the time it takes to complete a cart checkout around 100 milliseconds at the 99th percentile?
Let's run the scenario above and optimize it.
schema-monitor -s ./ecommerce_simulation.js --optimize --optimize-mode latency --optimize-percentile 99 --optimize-latency-target 100 --optimize-for-scenario cart_reservation_successful --optimize-margin 25
The options used above have the following meanings.
| Parameter | Description |
|:-----------|:------------|
| --optimize | Run an optimization against the provided scenario |
| --optimize-mode | Mode of optimization (total run time or latency) |
| --optimize-percentile | Optimize against the X percentile of the results |
| --optimize-latency-target | Latency target in milliseconds |
| --optimize-for-scenario | If multiple scenarios in a simulation pick the one to optimize for, otherwise it will pick the first available |
| --optimize-margin | The percentage margin of error +- that is acceptable for the optimization against the latency target, hitting the latency 100% is impossible so you need to ensure that you have a margin that allows the optimization to find a stable state and finish |
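To make the interplay of the percentile, the latency target and the margin concrete, here is a small sketch of the acceptance check they imply. It is not the tool's optimization algorithm: percentile and withinMargin are hypothetical helpers, the nearest-rank percentile method is an arbitrary choice, and the margin is assumed to be a percentage of the target.

```javascript
// Nearest-rank percentile of a list of latencies in milliseconds.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = samples.slice().sort(function(a, b) { return a - b; });
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// True when the measured latency lies within target +- marginPercent
// (assumed to be a percentage of the target).
function withinMargin(measured, target, marginPercent) {
  const tolerance = target * (marginPercent / 100);
  return Math.abs(measured - target) <= tolerance;
}

// Made-up checkout latencies, for illustration only.
const latencies = [80, 85, 90, 95, 110];
const p99 = percentile(latencies, 99);

// With --optimize-latency-target 100 and --optimize-margin 25, anything
// between 75 ms and 125 ms counts as on target under this assumption.
console.log(p99, withinMargin(p99, 100, 25));
```

A wider margin gives the optimization more room to settle; a very narrow one may never be satisfied, which is the concern the table raises for --optimize-margin.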
Once the optimization is done it will write a JSON file with the optimized parameters to the --out directory.