beta distribution in PPO

Sourabh

2 Feb 2024

0 Answers

Updated 15 Feb 2024

I want to constrain the actions of my PPO agent, and I am wondering whether I can use a Beta distribution as the policy distribution to keep the actions within a bounded range.

Here is the script for the networks I am using:

----------

commonPath = [

featureInputLayer(prod(obsInfo.Dimension),Name="comPathIn")

fullyConnectedLayer(120)

tanhLayer

fullyConnectedLayer(1,Name="comPathOut")

];

% Define mean value path

meanPath = [

fullyConnectedLayer(64,Name="meanPathIn")

tanhLayer

fullyConnectedLayer(64,Name="fc_2")

tanhLayer

fullyConnectedLayer(prod(actInfo.Dimension))

leakyReluLayer(0.1,Name="meanPathOut")

];

% Define standard deviation path

sdevPath = [

fullyConnectedLayer(64,"Name","stdPathIn")

tanhLayer

fullyConnectedLayer(64)

tanhLayer

fullyConnectedLayer(prod(actInfo.Dimension))

softplusLayer(Name="stdPathOut")

];

% Add layers to layerGraph object

actorNet = layerGraph(commonPath);

actorNet = addLayers(actorNet,meanPath);

actorNet = addLayers(actorNet,sdevPath);

% Connect paths

actorNet = connectLayers(actorNet,"comPathOut","meanPathIn/in");

actorNet = connectLayers(actorNet,"comPathOut","stdPathIn/in");

actorNetwork = dlnetwork(actorNet);

1 Comment

Kautuk Raj on 15 Feb 2024

To use a Beta distribution for the action outputs in PPO, I think you would need to modify the network architecture so that it outputs the two shape parameters (alpha and beta) of the Beta distribution instead of a mean and a standard deviation. Both parameters must be positive, so each output path would typically end in an activation that guarantees positivity, such as softplus. Because a Beta sample always lies in [0, 1], it can then be rescaled to the action limits.
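
As a rough illustration of the comment above, here is a minimal sketch in base MATLAB of how two raw head outputs could be turned into Beta parameters and a bounded action. The values `rawAlpha`, `rawBeta`, `lowerLim`, and `upperLim` are made up for the example (in practice the limits would come from `actInfo.LowerLimit` and `actInfo.UpperLimit`), and adding 1 to the softplus output is an optional assumption that keeps the density unimodal. It only shows the math; it is not a drop-in actor for the toolbox's PPO agent.

```matlab
% Hypothetical raw outputs of the two network heads, one row per action dimension.
rawAlpha = [-0.3; 1.2];
rawBeta  = [ 0.8; -2.0];

% Hypothetical action limits (normally actInfo.LowerLimit / actInfo.UpperLimit).
lowerLim = [-2; 0];
upperLim = [ 2; 5];

% Numerically stable softplus: log(1 + exp(z)) > 0 for every z.
softplus = @(z) log1p(exp(-abs(z))) + max(z, 0);

% Shape parameters; the +1 offset (an assumption) gives alphaP, betaP > 1.
alphaP = softplus(rawAlpha) + 1;
betaP  = softplus(rawBeta)  + 1;

% A point in (0,1). The Beta mean is used here so the sketch needs no toolbox;
% during training you would draw a sample instead, e.g. with betarnd(alphaP, betaP)
% from Statistics and Machine Learning Toolbox.
x = alphaP ./ (alphaP + betaP);

% Rescale from (0,1) to the action limits, so the action can never leave them.
action = lowerLim + (upperLim - lowerLim) .* x;

% Log-density of x under Beta(alphaP, betaP), needed for the PPO probability ratio.
logp = (alphaP - 1) .* log(x) + (betaP - 1) .* log(1 - x) - betaln(alphaP, betaP);
```

The rescaling only shifts the log-density by the constant `-log(upperLim - lowerLim)`, which cancels in the PPO ratio between the new and old policies, so `logp` can be computed on `x` directly.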

Release: R2023a
