rlValueFunction
Description
This object implements a value function approximator that you can use as a critic for a reinforcement learning agent. A value function maps an environment observation to the value of a policy. Specifically, its output is a scalar that represents the expected discounted cumulative long-term reward when the agent starts from the state corresponding to the given observation and thereafter selects actions according to the given policy. After you create an rlValueFunction critic, use it to create an agent such as rlACAgent, rlPGAgent, or rlPPOAgent. For an example of this workflow, see Create Actor and Critic Representations. For more information on creating value functions, see Create Policies and Value Functions.
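In standard reinforcement-learning notation (the symbols here are an illustrative sketch), the critic approximates

V^\pi(o) = \mathbb{E}\!\left[ \left. \sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\right|\, o_t = o \right],

where \gamma is the discount factor, r_{t+k+1} are the rewards collected while following the policy \pi, and o_t is the observation at time t.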
Creation
Syntax
Description
critic = rlValueFunction(net,observationInfo) creates the value function object critic using the deep neural network net as approximation model, and sets the ObservationInfo property of critic to the observationInfo input argument. The network input layers are automatically associated with the environment observation channels according to the dimension specifications in observationInfo.
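For example, a minimal network for a vector observation might look like the following sketch (the observation dimension and layer sizes are illustrative assumptions, not taken from a specific environment):

obsInfo = rlNumericSpec([4 1]);            % example 4-dimensional observation channel
net = [
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(1)                 % scalar value output
    ];
critic = rlValueFunction(net,obsInfo);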
critic = rlValueFunction(net,observationInfo,ObservationInputNames=netObsNames) specifies the network input layer names to be associated with the environment observation channels. The function assigns, in sequential order, each environment observation channel specified in observationInfo to the layer specified by the corresponding name in the string array netObsNames. Therefore, the network input layers, ordered as the names in netObsNames, must have the same data type and dimensions as the observation channels, as ordered in observationInfo.
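For example, this sketch (the layer name "netObsIn" and the layer sizes are illustrative) explicitly associates a named input layer with the single observation channel:

obsInfo = rlNumericSpec([4 1]);
net = [
    featureInputLayer(4,Name="netObsIn")   % input layer name matched below
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(1)
    ];
critic = rlValueFunction(net,obsInfo,ObservationInputNames="netObsIn");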
critic = rlValueFunction(tab,observationInfo) creates the value function object critic with a discrete observation space, from the table tab, which is an rlTable object containing a column array with as many elements as the number of possible observations. The function sets the ObservationInfo property of critic to the observationInfo input argument, which in this case must be a scalar rlFiniteSetSpec object.
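For example, assuming a discrete observation space with four possible observations (the observation values below are illustrative):

obsInfo = rlFiniteSetSpec([1 2 3 4]);
vTable = rlTable(obsInfo);                 % column array with one entry per observation
critic = rlValueFunction(vTable,obsInfo);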
critic = rlValueFunction({basisFcn,W0},observationInfo) creates the value function object critic using a custom basis function as underlying approximator. The first input argument is a two-element cell array whose first element is the handle basisFcn to a custom basis function and whose second element is the initial weight vector W0. The function sets the ObservationInfo property of critic to the observationInfo input argument.
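For example, a quadratic basis over a three-dimensional observation could be set up as in this sketch (the basis function and initial weights are illustrative):

obsInfo = rlNumericSpec([3 1]);
basisFcn = @(obs) [obs; obs.^2];           % custom basis: 6 features
W0 = zeros(6,1);                           % initial weight vector, one weight per feature
critic = rlValueFunction({basisFcn,W0},obsInfo);

The critic output is the scalar W'*basisFcn(obs), where W is the learnable weight vector.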
critic = rlValueFunction(___,UseDevice=useDevice) specifies the device used to perform computations for the critic object, and sets the UseDevice property of critic to the useDevice input argument. You can use this syntax with any of the previous input-argument combinations.
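For example, assuming a supported GPU is available (net and obsInfo are as in the previous sketches):

critic = rlValueFunction(net,obsInfo,UseDevice="gpu");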
Input Arguments
Properties
Object Functions
rlACAgent | Actor-critic (AC) reinforcement learning agent
rlPGAgent | Policy gradient (PG) reinforcement learning agent
rlPPOAgent | Proximal policy optimization (PPO) reinforcement learning agent
getValue | Obtain estimated value from a critic given environment observations and actions
evaluate | Evaluate function approximator object given observation (or observation-action) input data
gradient | Evaluate gradient of function approximator object given observation and action input data
accelerate | Option to accelerate computation of gradient for approximator object based on neural network
getLearnableParameters | Obtain learnable parameter values from agent, function approximator, or policy object
setLearnableParameters | Set learnable parameter values of agent, function approximator, or policy object
setModel | Set function approximation model for actor or critic
getModel | Get function approximator model from actor or critic
Examples
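As an end-to-end sketch (the observation specification and network sizes are illustrative), you can create a critic from a layer array and then evaluate it for a random observation using getValue:

obsInfo = rlNumericSpec([4 1]);
net = [
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(10)
    reluLayer
    fullyConnectedLayer(1)
    ];
critic = rlValueFunction(net,obsInfo);

% Evaluate the critic for a random observation (observations are passed as a cell array).
v = getValue(critic,{rand(obsInfo.Dimension)})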
Version History
Introduced in R2022a