Easy RL (Reinforcement Learning)

A library to implement reinforcement learning in the easiest way possible.
Quick Start
Install

```
npm install easy-rl
```

Then require the library:

```js
const { RN } = require("easy-rl");
// or
const RN = require("easy-rl").RN;
```
Initializing your model

- First, create a new instance of the RN class:

```js
var rn = new RN(options);
```
Options:
- numInputs (Required): how many inputs your model has
- gamma (Default: 0.1): the discount factor; how much your model is affected by the rewards you assign
- learningRate (Default: 0.8): the learning rate of the model (bigger is not always better)
- epsilon (Default: 0.5): how much randomness your model's actions have. Epsilon is decayed during training, so the model gradually becomes less random; when you save the model, there is zero randomness.
- maxMem (Default: 256): how many states, rewards, and actions the library should store
For example, if I want to create a model to play blackjack, I would initialize it as follows:

```js
var rn = new RN({ numInputs: 2 }); // your hand value, and the dealer's face-up card
```

This will work without modifying any of the other options, but you are encouraged to tweak them to see what works best for your model.
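As a reference, here is a sketch with every option written out; the values shown are simply the documented defaults, not tuned recommendations:

```js
var rn = new RN({
  numInputs: 2,      // hand value and the dealer's face-up card
  gamma: 0.1,        // the default; raise it to give rewards more weight
  learningRate: 0.8, // the default learning rate
  epsilon: 0.5,      // start with 50% random actions; decayed as training progresses
  maxMem: 256        // remember the last 256 states, rewards, and actions
});
```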
Building your net
To properly train a model, we need to build a neural network for it. easy-rl makes this very simple: to add a layer, use RN.addLayer(numNodes, activation).

Make sure not to make your model too complex, as this could lead to overfitting. If your model is too simple, it could lead to underfitting.
Following our blackjack example, we will add 3 layers (two hidden layers and one last layer for output):

```js
rn.addLayer(10, 'sigmoid');
rn.addLayer(10, 'sigmoid');
rn.addLayer(2, 'softmax'); // for an in-depth explanation of activation functions, use this guide: https://deepai.org/machine-learning-glossary-and-terms/activation-function
// this model has 4 layers in all: an input layer with 2 nodes, 2 hidden layers with 10 nodes and sigmoid activation, and an output layer with 2 nodes and softmax activation
// some other common activations are 'relu' and 'tanh'
```
After we build our model, we want to compile it (actually build the model). We can do this by using RN.compile:

```js
rn.compile(loss);
```

- Compiles the model so that it can be used (do this even if you are loading a model)
- loss (Default: "meanSquaredError"): the loss function. A loss is not needed when loading a model.
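Continuing the blackjack example, compiling with the default loss written out explicitly:

```js
rn.compile('meanSquaredError'); // same as rn.compile() with no argument
```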
Training your model
- Step 1: Get the model's action.

```js
var action = await rn.getAction(input);
```

- The input should be an array with the same length as the numInputs you passed when initializing the class.
- The action will be encoded as a number: if you have 4 outputs, the action will be 0, 1, 2, or 3; if you have two outputs, it will be 0 or 1. This number is chosen based on which output the model thinks is most likely.
- If you want the actual output values, you can use await rn.model.predict(xTensor), but you have to convert your input into a tensor with the correct shape yourself; a sketch of that conversion follows below.
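The sketch below assumes easy-rl is built on TensorFlow.js (the model.fit and tensor references here suggest it, but that is an assumption); tf.tensor2d and the [1, numInputs] shape are likewise assumptions, not documented easy-rl API:

```js
const tf = require('@tensorflow/tfjs'); // assumption: easy-rl uses TensorFlow.js under the hood

// wrap one input sample in a rank-2 tensor of shape [1, numInputs]
const xTensor = tf.tensor2d([[17, 10]], [1, 2]); // hand value 17, dealer shows a 10
const output = rn.model.predict(xTensor);        // raw model output as a tensor
const values = await output.data();              // e.g. the softmax value for each action
```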
- Step 2: Reward the model based on its behavior.

```js
rn.reward(reward);
```

- The reward can be any number you want. In our blackjack example, we feed it a -1 reward if it loses the hand, 0 if it pushes, and 1 if it wins (sketched below). You can also give more importance to certain outcomes: for example, if you are training a bot to play Snake, you could assign +1 every time it gets closer to the apple and +2 when it eats the apple.
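The blackjack reward scheme described above might look like this sketch; outcome is a hypothetical value produced by your own game code, not part of easy-rl:

```js
// Hypothetical: outcome comes from your own blackjack logic ('loss' | 'push' | 'win')
if (outcome === 'loss')      rn.reward(-1);
else if (outcome === 'push') rn.reward(0);
else                         rn.reward(1);
```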
- Step 3: Train the model (an example call follows the option list).

```js
await rn.next(options);
```

- Options:
  - batchSize (Default: 64): the batch size to train the model in
  - decayRate (Default: 0.08): the rate at which to exponentially decay the epsilon parameter
  - trainingOpts (Default: {}): extra options to pass to model.fit when training the model
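For example, a call with every option written out might look like this sketch; the epochs value inside trainingOpts is an assumption, based on standard TensorFlow.js model.fit options:

```js
await rn.next({
  batchSize: 64,              // pull up to 64 samples from memory
  decayRate: 0.08,            // decay epsilon by 8% this step
  trainingOpts: { epochs: 1 } // assumption: forwarded as-is to model.fit
});
```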
- Step 4: Save the model.

```js
await rn.save(modelName);
```

- modelName (Default: "model"): this will determine where the model is stored. Relative paths are supported.

To load a model back in, use rn.load(modelName). Make sure to use the same name that you used during rn.save(). Remember to compile the model afterwards as well.
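Putting the four steps together, a training loop for the blackjack example might look like the sketch below, simplified to one decision per hand. handValue, dealerCard, and playRound are hypothetical stand-ins for your own game code, not part of easy-rl:

```js
for (let hand = 0; hand < 1000; hand++) {
  const state = [handValue(), dealerCard()]; // hypothetical helpers from your game code
  const action = await rn.getAction(state);  // e.g. 0 = stand, 1 = hit
  const result = playRound(action);          // hypothetical: returns -1 (loss), 0 (push), or 1 (win)
  rn.reward(result);
  await rn.next({ batchSize: 64 });          // train one step and decay epsilon
}
await rn.save('blackjack-model');
```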
Class RN(options)
Create and initialize the model
- numInputs (Required, Integer): the number of inputs your model should accept
- gamma (Default: 0.1, Float): how much impact rewards have on the model training
- learningRate (Default: 0.8, Float): the learning rate of the model
- epsilon (Default: 0.5, Float): how much your model relies on random actions during training, exponentially decayed
- maxMem (Default: 256, Integer): the number of samples that should be stored in memory
addLayer(numNodes, activation='relu')
Adds a layer to your neural network
- numNodes (Required, Integer): the number of nodes in this layer
- activation (Default: 'relu', String): the activation function for this layer
compile(loss='meanSquaredError')
Compiles the model
- loss (Default: 'meanSquaredError', String): the loss function for training and initializing the weights and biases
async save(path='model')
Saves the model
- path (Default: 'model', String): where the model will be saved; relative paths are supported
async load(path='model')
Loads a previously saved model
- path (Default: 'model', String): the path to load the model from (use the same path you used in save)
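For example, restoring the blackjack model saved earlier. This is a minimal sketch; it assumes load restores the saved architecture, so no addLayer calls are made beforehand:

```js
var rn = new RN({ numInputs: 2 });
await rn.load('blackjack-model');
rn.compile(); // compile after loading; no loss function is needed here
```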
async getRandomAction()
Returns a random action from the list of possible actions, encoded as a non-negative integer
async getAction(input)
Returns the action that the model will take based on the input, encoded as a non-negative integer.
- input (Required, Array): an array representing the inputs, one element per input, with the same length as numInputs
async reward(reward)
Assigns a reward to the previous action
- reward (Required, Number): the reward assigned to the previous action
async next(options)
Trains the model for one step
- Options:
  - batchSize (Default: 64, Integer): the number of samples to pull from memory when training
  - decayRate (Default: 0.08, Float): the percentage rate to decay the epsilon factor; 0.08 = 8%
  - trainingOpts (Object): passed directly to model.fit as additional options