Hi.
I want to design a DQN agent for an environment whose observation consists of five continuous double variables, one discrete variable with values [0 1], and two discrete variables with values [-1 0 1]. I define the observation info as:
obsInfo = [ ...
    rlNumericSpec([5 1], 'Name', 'X15'), ...
    rlFiniteSetSpec([0 1], 'Name', 'X6'), ...
    rlFiniteSetSpec([-1 0 1], 'Name', 'X7'), ...
    rlFiniteSetSpec([-1 0 1], 'Name', 'X8')];
ActionInfo = rlFiniteSetSpec([-2, -1, 0, 1, 2]);
My reset and step functions therefore return the observation as a cell array with one element per channel:
Obs = {[X1; X2; X3; X4; X5], X6, X7, X8}
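For context, here is a simplified sketch of my reset function (the initial values below are placeholders, not my real dynamics):

function [initialObs, loggedSignals] = myResetFcn()
    % Placeholder initial values; my actual reset logic is more involved.
    x15 = zeros(5, 1);   % five continuous variables, as a column vector
    x6  = 0;             % discrete variable from {0, 1}
    x7  = 0;             % discrete variable from {-1, 0, 1}
    x8  = 0;             % discrete variable from {-1, 0, 1}
    initialObs = {x15, x6, x7, x8};   % one cell element per observation channel
    loggedSignals = [];
end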
Then I define a deep neural network as follows:
dnn = [ ...
    featureInputLayer(8, 'Normalization', 'none', 'Name', 'state')
    fullyConnectedLayer(100, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(100, 'Name', 'fc2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(5, 'Name', 'fc3')];
critic = rlVectorQValueFunction(dnn, obsInfo, ActionInfo);
However, this code produces the following error:
The number of network input layers must be equal to the number of observation channels in the environment specification object.
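Reading the message, I suspect the network needs one input layer per observation channel, joined with a concatenationLayer, roughly like the sketch below (the layer names are mine, and I am not sure this is the intended fix):

% One featureInputLayer per observation channel, merged before the hidden layers
net = layerGraph();
net = addLayers(net, [ ...
    featureInputLayer(5, 'Name', 'state15')      % continuous channel [5 1]
    concatenationLayer(1, 4, 'Name', 'concat')   % state15 auto-connects to concat/in1
    fullyConnectedLayer(100, 'Name', 'fc1')
    reluLayer('Name', 'relu1')
    fullyConnectedLayer(100, 'Name', 'fc2')
    reluLayer('Name', 'relu2')
    fullyConnectedLayer(5, 'Name', 'fc3')]);      % one Q-value per action
net = addLayers(net, featureInputLayer(1, 'Name', 'state6'));
net = addLayers(net, featureInputLayer(1, 'Name', 'state7'));
net = addLayers(net, featureInputLayer(1, 'Name', 'state8'));
net = connectLayers(net, 'state6', 'concat/in2');
net = connectLayers(net, 'state7', 'concat/in3');
net = connectLayers(net, 'state8', 'concat/in4');
critic = rlVectorQValueFunction(net, obsInfo, ActionInfo, ...
    'ObservationInputNames', {'state15', 'state6', 'state7', 'state8'});

Is this the right direction?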
Could you please help me fix this issue? Is my definition of obsInfo correct for this type of problem, and is the network architecture reasonable?
Thank you.