MDP model, returned as a GenericMDP object with the following
properties.
CurrentState — Name of the current state (string)
Name of the current state, specified as a string.
States — State names (string vector)
State names, specified as a string vector with length equal to the number of
states.
Actions — Action names (string vector)
Action names, specified as a string vector with length equal to the number of
actions.
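As a sketch of how these properties are populated, a generic MDP model can be created with the createMDP function; the state count and action names below are illustrative, not part of this reference page:

```matlab
% Create a generic MDP with 8 states and 2 named actions
% (state and action names here are illustrative).
MDP = createMDP(8, ["up"; "down"]);
MDP.States   % 8-by-1 string vector: "s1" through "s8"
MDP.Actions  % 2-by-1 string vector: "up", "down"
```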
T — State transition matrix (3-D array)
State transition matrix, specified as a 3-D array, which determines the
possible movements of the agent in an environment. State transition matrix
T is a probability matrix that indicates the probability that the agent
moves from the current state s to any possible next state
s' by performing action a.
T is an
S-by-S-by-A array,
where S is the number of states and A is the
number of actions. It is given by:

T(s,s',a) = probability(s' | s,a)
The transition probabilities out of a nonterminal state
s for a given action must sum to one. Therefore, all
stochastic transitions out of a given state must be specified at the same
time.
For example, to indicate that in state 1 following action
4 there is an equal probability of moving to states
2 or 3, use the
following:
MDP.T(1,[2 3],4) = [0.5 0.5];
You can also specify that, following an action, there is some probability of
remaining in the same state. For example:
MDP.T(1,[1 2 3 4],1) = [0.25 0.25 0.25 0.25];
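After assigning transition probabilities, it can be useful to confirm that each state-action pair defines a valid probability distribution; a minimal sanity check, assuming an MDP object named MDP already exists:

```matlab
% Verify that the transition probabilities out of state 1
% under action 1 sum to 1 (within floating-point tolerance).
assert(abs(sum(MDP.T(1,:,1)) - 1) < 1e-12)
```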
R — Reward transition matrix (3-D array)
Reward transition matrix, specified as a 3-D array, which determines how much
reward the agent receives after performing an action in the environment.
R has the same shape and size as state transition matrix
T. The reward for moving from state s to
state s' by performing action a is given by:

r = R(s,s',a)
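Rewards are assigned with the same indexing as T; the reward values below are illustrative:

```matlab
% Illustrative rewards: moving from state 1 to state 2 or state 3
% under action 4 yields rewards of 10 and -1, respectively.
MDP.R(1,[2 3],4) = [10 -1];
```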
TerminalStates — Terminal state names in the grid world (string vector)
Terminal state names in the grid world, specified as a string vector of state
names.
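For example, terminal states are set by assigning a string vector of existing state names; the names below are illustrative:

```matlab
% Mark states "s7" and "s8" as terminal (names are illustrative).
MDP.TerminalStates = ["s7"; "s8"];
```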