@rl-js/redux-mdp

v0.9.5

Published

3 years ago

An interface for building MDPs using Redux, compatible with the rl-js framework.

cpnota

MdpFactory ⇐ EnvironmentFactory

Class for constructing an Environment implemented as a ReduxMDP

Kind: global class
Extends: EnvironmentFactory

new MdpFactory(params)

Create a factory for a particular MDP

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| params | object | | Parameters for constructing the MDP |
| params.reducer | Reducer | | Redux reducer representing the state of the MDP |
| params.getObservation | getObservation | | Compute the current observation |
| params.computeReward | computeReward | | Compute the current reward |
| params.isTerminated | isTerminated | | Compute whether the environment is terminated |
| [params.resolveAction] | resolveAction | | Resolve the MdpAction into a ReduxAction |
| [params.gamma] | number | 1 | Reward discounting factor for the MDP |
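As a rough illustration of how these parameters fit together, the sketch below assembles a `params` object for a hypothetical one-dimensional corridor. The corridor, the `GOAL` constant, the `'MOVE'` action type, and the `'left'`/`'right'` MdpActions are all invented for this example and are not part of the package.

```javascript
// Hypothetical 1-D corridor: the agent starts at position 0 and the episode
// ends when it reaches position 3. None of these names come from the package.
const GOAL = 3;

const params = {
  // Redux reducer: returns a new state object and never mutates the old one.
  reducer: (state = { position: 0 }, action) =>
    action.type === 'MOVE'
      ? { ...state, position: state.position + action.payload }
      : state,
  // The agent observes its position directly.
  getObservation: (state) => state.position,
  // Reward of 1 for the transition that reaches the goal, 0 otherwise.
  computeReward: (state, action, nextState) =>
    nextState.position === GOAL ? 1 : 0,
  // The episode ends once the goal is reached.
  isTerminated: (state, action, nextState, time) =>
    nextState.position === GOAL,
  // Optional: map the agent's MdpAction onto a ReduxAction.
  resolveAction: (state, action) => ({
    type: 'MOVE',
    payload: action === 'right' ? 1 : -1,
  }),
  // Optional: discount factor (the documented default is 1).
  gamma: 0.99,
};

// With MdpFactory imported from the package (the export shape is not shown
// in this document, so the import is left out here):
// const environment = new MdpFactory(params).createEnvironment();
```

Keeping each callback a small pure function makes it possible to check the MDP's dynamics by hand before wiring it into a factory.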

mdpFactory.createEnvironment() ⇒ ReduxMDP

Create an instance of the environment.

Kind: instance method of MdpFactory

mdpFactory.setMdpMiddleware(middleware)

Configure any MdpMiddleware that should be part of the next invocation of createEnvironment()

Kind: instance method of MdpFactory

| Param | Type |
| --- | --- |
| middleware | function |

mdpFactory.setReduxMiddleware(middleware)

Configure any ReduxMiddleware that should be part of the next invocation of createEnvironment()

Kind: instance method of MdpFactory

| Param | Type |
| --- | --- |
| middleware | function |
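For reference, a Redux middleware conventionally has the curried shape `store => next => action`. The sketch below shows a hypothetical recording middleware in that shape; `createRecorder` and its `log` argument are invented for illustration. This document does not say whether `setReduxMiddleware` expects a single middleware or a list, so only the middleware itself is shown.

```javascript
// Hypothetical middleware factory: records every action that passes through,
// together with the store state after the action has been applied.
const createRecorder = (log) => (store) => (next) => (action) => {
  // Pass the action down the chain first so the reducer has run.
  const result = next(action);
  log.push({ type: action.type, state: store.getState() });
  return result;
};
```

A recorder like this could be handy for inspecting the transitions of an environment while debugging, since it leaves the action and the state untouched.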

ReduxMDP ⇐ Environment

Class representing an Environment as an MDP using Redux.

Kind: global class
Extends: Environment

State : *

The underlying state representation of the environment. Should be a serializable object, i.e. state => JSON.parse(JSON.stringify(state)) should be the identity function.
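One way to check this property is to compare a state with its JSON round trip. The helper below is a minimal sketch for Node.js; `survivesJsonRoundTrip` is a hypothetical name, and `util.isDeepStrictEqual` is used only as a convenient deep comparison.

```javascript
const { isDeepStrictEqual } = require('util');

// Returns true when the state is unchanged by a JSON round trip,
// i.e. when it is serializable in the sense described above.
const survivesJsonRoundTrip = (state) =>
  isDeepStrictEqual(JSON.parse(JSON.stringify(state)), state);

// Plain objects, arrays, strings, finite numbers, booleans and null pass.
const ok = survivesJsonRoundTrip({ position: 0, history: [1, 2] });

// Sets, Maps, Dates and similar objects do not survive the round trip.
const notOk = survivesJsonRoundTrip({ visited: new Set([1]) });
```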

Kind: global typedef

MdpAction : *

An object representing an action in an MDP. The type is specific to the MDP.

Kind: global typedef

Observation : *

An object representing the observation of an agent in the current state. The type is specific to the MDP.

Kind: global typedef

ReduxAction : Object

A Redux action, e.g. a Flux Standard Action: https://github.com/redux-utilities/flux-standard-action. Your MdpAction will be converted into a ReduxAction by resolveAction.

Kind: global typedef
Properties

| Name | Type | Description |
| --- | --- | --- |
| type | string | Each action must have a type associated with it. |
| [payload] | * | Any data associated with the action goes here |
| [error] | boolean | Should be true if and only if the action represents an error |
| [meta] | * | Any data that is not explicitly part of the payload |
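To make the property list concrete, here are two hypothetical actions in this shape, along with a small helper that checks an object against the properties listed above. The `'MOVE'` type and the helper `looksLikeReduxAction` are illustrative only.

```javascript
// A plain action: a type plus the data it carries.
const move = { type: 'MOVE', payload: 1 };

// An error action: error is true and the payload describes the failure.
const blocked = {
  type: 'MOVE',
  payload: new Error('blocked by wall'),
  error: true,
  meta: { attempt: 2 },
};

const ALLOWED_KEYS = ['type', 'payload', 'error', 'meta'];

// Checks for a string type and no properties beyond the four listed above.
const looksLikeReduxAction = (action) =>
  action !== null &&
  typeof action === 'object' &&
  typeof action.type === 'string' &&
  Object.keys(action).every((key) => ALLOWED_KEYS.includes(key));
```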

reducer ⇒ State

A Redux reducer. Computes the next state without mutating the previous state object

Kind: global typedef
Returns: State - The new state object after the action is applied

| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state of the MDP |
| action | ReduxAction | The resolved action for the MDP |
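A minimal sketch of such a reducer, assuming a hypothetical state with a `position` and a `steps` counter and a hypothetical `'MOVE'` action type:

```javascript
const initialState = { position: 0, steps: 0 };

// Builds the next state as a fresh object instead of editing the old one.
const reducer = (state = initialState, action) => {
  switch (action.type) {
    case 'MOVE':
      return {
        ...state,
        position: state.position + action.payload,
        steps: state.steps + 1,
      };
    default:
      // Unknown actions leave the state untouched.
      return state;
  }
};

// Freezing the input shows that the reducer does not rely on mutation.
const before = Object.freeze({ position: 1, steps: 4 });
const after = reducer(before, { type: 'MOVE', payload: -1 });
```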

getObservation ⇒ Observation

A function to get the observation of the agent given the current state.

Kind: global typedef
Returns: Observation - The observation for the current state

| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state of the MDP |
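Because the observation is computed from the state, it does not have to expose the whole state. The sketch below assumes a hypothetical state that also stores a hidden `goal`, which the agent is not allowed to see.

```javascript
// Hypothetical partially observable setup: the state knows where the goal
// is, but the observation only reveals the agent's own position.
const getObservation = (state) => ({ position: state.position });

const observation = getObservation({ position: 2, goal: 3 });
```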

computeReward ⇒ number

A function to compute the reward given a state transition, i.e. (s, a, s'). This function should be completely deterministic; any non-determinism should be handled by resolveAction.

Kind: global typedef
Returns: number - The reward for the given state transition.

| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state for the MDP |
| action | ReduxAction | The next action |
| nextState | State | The next state for the MDP |
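As an illustration, a deterministic reward for a hypothetical corridor task might pay 1 on reaching a goal position and charge a small cost for every other step. `GOAL` and `STEP_COST` are invented constants.

```javascript
const GOAL = 3;
const STEP_COST = -0.1;

// Depends only on the transition (state, action, nextState), so calling it
// twice with the same arguments always gives the same reward.
const computeReward = (state, action, nextState) =>
  nextState.position === GOAL ? 1 : STEP_COST;
```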

isTerminated ⇒ boolean

A function to compute whether the environment is terminated, i.e. the current episode is over.

Kind: global typedef
Returns: boolean - True if the environment is terminated, false otherwise.

| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state for the MDP |
| action | ReduxAction | The next action |
| nextState | State | The next state for the MDP |
| time | number | The current timestep of the MDP, useful for finite horizon MDPs. |
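The `time` parameter makes a finite-horizon cutoff straightforward. The sketch below ends the episode either at a hypothetical goal position or once a hypothetical `HORIZON` is reached; whether timesteps start at 0 or 1 is not stated here, so the exact cutoff should be treated as an assumption.

```javascript
const GOAL = 3;
const HORIZON = 20;

// Terminates on reaching the goal or when the time limit has been hit.
const isTerminated = (state, action, nextState, time) =>
  nextState.position === GOAL || time >= HORIZON;
```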

resolveAction ⇒ ReduxAction

A function to resolve an MdpAction into a ReduxAction. Any non-determinism in your environment should go here, as your Redux reducer should be completely deterministic.

Kind: global typedef
Returns: ReduxAction - The Redux action resolved from the MdpAction

| Param | Type | Description |
| --- | --- | --- |
| state | State | The current state for the MDP |
| action | MdpAction | The MdpAction to resolve |
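A sketch of how randomness might be confined to this step: the agent's intended move occasionally "slips" into the opposite direction, and the outcome is baked into the ReduxAction so that the reducer stays deterministic. The slip mechanic, `SLIP_PROBABILITY`, and `makeResolveAction` are hypothetical; the random source is injected only to make the behaviour easy to test.

```javascript
const SLIP_PROBABILITY = 0.2;

// Builds a resolveAction that draws its randomness from the given source.
const makeResolveAction = (random = Math.random) => (state, action) => {
  const intended = action === 'right' ? 1 : -1;
  // The only random draw in the environment happens here.
  const slipped = random() < SLIP_PROBABILITY;
  return {
    type: 'MOVE',
    payload: slipped ? -intended : intended,
    meta: { slipped },
  };
};

// Default version, suitable for passing as params.resolveAction.
const resolveAction = makeResolveAction();
```

With the outcome recorded in the action's payload, replaying the same sequence of ReduxActions through the reducer reproduces the same trajectory.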
