@rl-js/redux-mdp
v0.9.5
An interface for building MDPs using Redux, compatible with the rl-js framework.
MdpFactory ⇐ EnvironmentFactory
Class for constructing an Environment implemented as a ReduxMDP
Kind: global class
Extends: EnvironmentFactory
new MdpFactory(params)
Create a factory for a particular MDP
| Param | Type | Default | Description |
| --- | --- | --- | --- |
| params | object |  | Parameters for constructing the MDP |
| params.reducer | Reducer |  | Redux reducer representing the state of the MDP |
| params.getObservation | getObservation |  | Compute the current observation |
| params.computeReward | computeReward |  | Compute the current reward |
| params.isTerminated | isTerminated |  | Compute whether the environment is terminated |
| [params.resolveAction] | resolveAction |  | Resolve the MdpAction into a ReduxAction |
| [params.gamma] | number | 1 | Reward discounting factor for the MDP |
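As an illustration of the shape of `params`, here is a minimal sketch for a made-up 1-D corridor environment. The corridor, the `MOVE` action type, the reward values, and the 20-step horizon are all assumptions chosen for this example; only the parameter names come from the table above.

```javascript
// Hypothetical corridor: the agent starts at position 0 and must reach `goal`.
const initialState = { position: 0, goal: 3 };

// Redux reducer: returns a new state object and never mutates the old one.
const reducer = (state = initialState, action) => {
  switch (action.type) {
    case 'MOVE':
      // Clamp at the left wall so the position never goes below 0.
      return { ...state, position: Math.max(0, state.position + action.payload) };
    default:
      return state;
  }
};

const params = {
  reducer,
  // The agent only sees its position.
  getObservation: state => state.position,
  // -1 per step, +10 on reaching the goal (illustrative values).
  computeReward: (state, action, nextState) =>
    (nextState.position === nextState.goal ? 10 : -1),
  // Stop at the goal or after an assumed horizon of 20 timesteps.
  isTerminated: (state, action, nextState, time) =>
    nextState.position === nextState.goal || time >= 20,
  // Map the agent's 'left' / 'right' choice onto a Redux action.
  resolveAction: (state, action) =>
    ({ type: 'MOVE', payload: action === 'right' ? 1 : -1 }),
  gamma: 0.99,
};

// With the package installed, the factory would then be built roughly as:
//   const mdpFactory = new MdpFactory(params);
//   const environment = mdpFactory.createEnvironment();
// (how MdpFactory is imported is not shown in this reference)
```

Keeping every callback a plain function of its arguments makes each one easy to unit test before it is handed to the factory.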
mdpFactory.createEnvironment() ⇒ ReduxMDP
Create an instance of the environment.
Kind: instance method of MdpFactory
mdpFactory.setMdpMiddleware(middleware)
Configure any MdpMiddleware that should be part of the next invocation of createEnvironment()
Kind: instance method of MdpFactory
| Param | Type |
| --- | --- |
| middleware | function |
mdpFactory.setReduxMiddleware(middleware)
Configure any ReduxMiddleware that should be part of the next invocation of createEnvironment()
Kind: instance method of MdpFactory
| Param | Type |
| --- | --- |
| middleware | function |
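For reference, a standard Redux middleware has the curried shape `store => next => action => result`. The sketch below records every action that passes through. It assumes that `setReduxMiddleware` accepts a middleware of this standard shape, which this reference does not spell out; the `fakeStore` is a stand-in used only to exercise the function without Redux installed.

```javascript
// Standard Redux middleware shape: store => next => action => result.
const actionLog = [];
const recordActions = store => next => action => {
  const result = next(action); // let the rest of the chain handle the action first
  actionLog.push({ type: action.type, stateAfter: store.getState() });
  return result;
};

// Minimal stand-in for a store and the next dispatch function (not real Redux).
let currentState = { position: 0 };
const fakeStore = { getState: () => currentState };
const fakeNext = action => {
  currentState = { position: currentState.position + action.payload };
  return action;
};

recordActions(fakeStore)(fakeNext)({ type: 'MOVE', payload: 1 });

// With the package, registration would look roughly like (assumed usage):
//   mdpFactory.setReduxMiddleware(recordActions);
```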
ReduxMDP ⇐ Environment
Class representing an Environment as an MDP using Redux.
Kind: global class
Extends: Environment
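To make the roles of the callbacks concrete, the sketch below shows one plausible way they could compose during a single timestep. This is an assumption for illustration, not the actual ReduxMDP implementation; the `step` helper, the corridor environment, and the `'@@INIT'` action type are all hypothetical.

```javascript
// Hypothetical corridor callbacks (same shape as the MdpFactory params).
const callbacks = {
  reducer: (state = { position: 0, goal: 3 }, action) =>
    (action.type === 'MOVE'
      ? { ...state, position: Math.max(0, state.position + action.payload) }
      : state),
  getObservation: state => state.position,
  computeReward: (state, action, nextState) =>
    (nextState.position === nextState.goal ? 10 : -1),
  isTerminated: (state, action, nextState, time) =>
    nextState.position === nextState.goal || time >= 20,
  resolveAction: (state, action) =>
    ({ type: 'MOVE', payload: action === 'right' ? 1 : -1 }),
};

// One assumed ordering: resolve the action, reduce, then score the transition.
const step = (cb, state, mdpAction, time) => {
  const reduxAction = cb.resolveAction(state, mdpAction);
  const nextState = cb.reducer(state, reduxAction);
  return {
    nextState,
    observation: cb.getObservation(nextState),
    reward: cb.computeReward(state, reduxAction, nextState),
    terminated: cb.isTerminated(state, reduxAction, nextState, time),
  };
};

// Run one episode with an agent that always moves right.
let state = callbacks.reducer(undefined, { type: '@@INIT' }); // unknown type yields the initial state
let time = 0;
let totalReward = 0;
let terminated = false;
const observations = [];
while (!terminated) {
  const result = step(callbacks, state, 'right', time);
  state = result.nextState;
  totalReward += result.reward;
  terminated = result.terminated;
  observations.push(result.observation);
  time += 1;
}
```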
State : *
The underlying state representation of the environment. It should be a serializable object: round-tripping it through JSON, i.e. state => JSON.parse(JSON.stringify(state)), should return an equivalent value.
Kind: global typedef
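The JSON round-trip condition can be checked directly. The sketch below uses two hypothetical helpers, `deepEqual` and `isSerializableState`, that are not part of the library; they only illustrate which kinds of values survive the round trip.

```javascript
// Round-trip a value through JSON.
const roundTrip = value => JSON.parse(JSON.stringify(value));

// Minimal structural equality for plain data (hypothetical helper).
const deepEqual = (a, b) => {
  if (a === b) return true;
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) return false;
  if (Object.getPrototypeOf(a) !== Object.getPrototypeOf(b)) return false;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  return keysA.length === keysB.length &&
    keysA.every(key => Object.prototype.hasOwnProperty.call(b, key) && deepEqual(a[key], b[key]));
};

// A state is acceptable when the round trip gives back an equivalent value.
const isSerializableState = state => deepEqual(roundTrip(state), state);

const plainState = { position: 2, visited: [0, 1, 2] };      // plain data survives
const stateWithDate = { startedAt: new Date(0) };             // the Date becomes a string
const stateWithUndefined = { position: 2, note: undefined };  // the `note` key is dropped
```

Here `isSerializableState(plainState)` holds, while the other two states fail the check because JSON cannot represent them faithfully.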
MdpAction : *
An object representing an action in an MDP. The type is specific to the MDP.
Kind: global typedef
Observation : *
An object representing the observation of an agent in the current state. The type is specific to the MDP.
Kind: global typedef
ReduxAction : Object
A Redux action, e.g. a Flux Standard Action: https://github.com/redux-utilities/flux-standard-action. Your MdpAction will be converted into a ReduxAction by resolveAction.
Kind: global typedef
Properties
| Name | Type | Description |
| --- | --- | --- |
| type | string | Each action must have a type associated with it. |
| [payload] | * | Any data associated with the action goes here |
| [error] | boolean | Should be true if and only if the action represents an error |
| [meta] | * | Any data that is not explicitly part of the payload |
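A few example objects may help show what these properties look like in practice. The `MOVE` action type and the `looksLikeReduxAction` checker are hypothetical and exist only for this sketch; the checker simply encodes the property table above.

```javascript
// Properties permitted by the table above.
const allowedKeys = ['type', 'payload', 'error', 'meta'];

// Hypothetical checker: a string `type`, an optional boolean `error`, no extra keys.
const looksLikeReduxAction = action =>
  typeof action === 'object' && action !== null &&
  typeof action.type === 'string' &&
  Object.keys(action).every(key => allowedKeys.includes(key)) &&
  (action.error === undefined || typeof action.error === 'boolean');

// An ordinary action carrying its data in `payload`.
const move = { type: 'MOVE', payload: 1, meta: { requestedBy: 'agent' } };

// An action representing an error: `error` is true and the payload describes it.
const failedMove = { type: 'MOVE', payload: new Error('blocked by wall'), error: true };

// Not valid: the data sits in an ad hoc `direction` key instead of `payload`.
const malformed = { type: 'MOVE', direction: 1 };
```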
reducer ⇒ State
A Redux reducer. Computes the next state without mutating the previous state object
Kind: global typedef
Returns: State - The new state object after the action is applied
| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state of the MDP |
| action | ReduxAction | The resolved action for the MDP |
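A minimal sketch of a non-mutating reducer, using a hypothetical corridor state and a made-up `MOVE` action type. The default parameter follows the usual Redux convention of returning the initial state when the reducer is called with `undefined`.

```javascript
const initialState = { position: 0, goal: 3 };

// Build the next state with object spread instead of assigning to `state`.
const reducer = (state = initialState, action) => {
  switch (action.type) {
    case 'MOVE':
      return { ...state, position: Math.max(0, state.position + action.payload) };
    default:
      // Unknown actions leave the state untouched (same object reference).
      return state;
  }
};

// Freezing the input makes accidental mutation visible during development.
const before = Object.freeze({ position: 1, goal: 3 });
const after = reducer(before, { type: 'MOVE', payload: 1 });
const unchanged = reducer(before, { type: 'SOMETHING_ELSE' });
```

After this runs, `after` is a new object with `position` 2, `before` still has `position` 1, and `unchanged` is the very same object as `before`.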
getObservation ⇒ Observation
A function to get the observation of the agent given the current state.
Kind: global typedef
Returns: Observation - The observation for the current state
| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state of the MDP |
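The observation does not have to expose the whole state. As a sketch, the hypothetical state below contains a `trapAt` field that the agent is not allowed to see; the field names are assumptions made for this example.

```javascript
// Expose only what the agent should perceive; `trapAt` stays hidden.
const getObservation = state => ({
  position: state.position,
  distanceToGoal: state.goal - state.position,
});

const state = { position: 1, goal: 3, trapAt: 2 };
const observation = getObservation(state);
```

Here `observation` contains `position` 1 and `distanceToGoal` 2, with no `trapAt` key, which is one way to model partial observability.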
computeReward ⇒ number
A function to compute the reward given a state transition, i.e. (s, a, s'). This function should be completely deterministic; any non-determinism should be handled by resolveAction.
Kind: global typedef
Returns: number - The reward for the given state transition.
| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state for the MDP |
| action | ReduxAction | The next action |
| nextState | State | The next state for the MDP |
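A small sketch of a deterministic reward function over the transition. The reward values and the corridor state shape are illustrative assumptions.

```javascript
const STEP_COST = -1;    // assumed penalty for every ordinary step
const GOAL_REWARD = 10;  // assumed bonus for arriving at the goal

// Depends only on its three arguments, so the same transition always scores the same.
const computeReward = (state, action, nextState) => {
  const arrivedAtGoal =
    state.position !== state.goal && nextState.position === nextState.goal;
  return arrivedAtGoal ? GOAL_REWARD : STEP_COST;
};

const moveRight = { type: 'MOVE', payload: 1 };
const ordinaryStep = computeReward({ position: 0, goal: 3 }, moveRight, { position: 1, goal: 3 });
const finalStep = computeReward({ position: 2, goal: 3 }, moveRight, { position: 3, goal: 3 });
```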
isTerminated ⇒ boolean
A function to compute whether the environment is terminated, i.e. the current episode is over.
Kind: global typedef
Returns: boolean - True if the environment is terminated, false otherwise.
| Param | Type | Description |
| --- | --- | --- |
| state | State | the current state for the MDP |
| action | ReduxAction | The next action |
| nextState | State | the next state for the MDP. |
| time | number | The current timestep of the MDP, useful for finite horizon MDPs. |
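The `time` argument makes finite-horizon environments straightforward. The sketch below ends the episode at the goal or once an assumed horizon is reached; the `HORIZON` value is arbitrary, and whether `time` counts from 0 or 1 is not stated in this reference, so the comparison may need adjusting.

```javascript
const HORIZON = 20; // assumed maximum number of timesteps

// Terminate on reaching the goal, or when the horizon is hit.
const isTerminated = (state, action, nextState, time) =>
  nextState.position === nextState.goal || time >= HORIZON;

const moveRight = { type: 'MOVE', payload: 1 };
const midEpisode = isTerminated({ position: 0, goal: 3 }, moveRight, { position: 1, goal: 3 }, 5);
const atGoal = isTerminated({ position: 2, goal: 3 }, moveRight, { position: 3, goal: 3 }, 5);
const outOfTime = isTerminated({ position: 0, goal: 3 }, moveRight, { position: 1, goal: 3 }, 20);
```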
resolveAction ⇒ ReduxAction
A function to resolve an MdpAction into a ReduxAction. Any non-determinism in your environment should go here, as your Redux reducer should be completely deterministic.
Kind: global typedef
Returns: ReduxAction - The Redux action to be passed to the reducer
| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state for the MDP |
| action | MdpAction | The action chosen by the agent, to be resolved |
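As a sketch of keeping randomness out of the reducer, the hypothetical factory below builds a `resolveAction` for a "slippery" corridor in which the agent occasionally moves the opposite way. The `makeResolveAction` helper, the slip probability, and the `MOVE` action type are assumptions for this example; injecting the random source keeps the function testable.

```javascript
// Build a resolveAction whose only source of randomness is the injected `random`.
const makeResolveAction = (random = Math.random, slipProbability = 0.1) =>
  (state, action) => {
    const intended = action === 'right' ? 1 : -1;
    const slipped = random() < slipProbability; // the stochastic part lives here
    return { type: 'MOVE', payload: slipped ? -intended : intended };
  };

// Default version for real use, driven by Math.random.
const resolveAction = makeResolveAction();

// Deterministic versions for tests: one always slips, one never does.
const alwaysSlip = makeResolveAction(() => 0);
const neverSlip = makeResolveAction(() => 0.99);

const state = { position: 1, goal: 3 };
const slippedAction = alwaysSlip(state, 'right');
const cleanAction = neverSlip(state, 'right');
```

Because the outcome of the slip is encoded in the returned action's `payload`, the reducer that consumes it can remain a pure function.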