
How to extract the weights of the actor network (inside the step function in the environment) while training the agent in DDPG RL

Hello Everyone,
I am building an LQR-type controller. I need to extract the weights of the actor network (which is essentially the feedback gain K) inside the step function of the environment during training. The reason I want to do this is that, during training, I want to inspect K (the actor weights) and impose a stability condition on the closed-loop system. My step function is as follows:
function [nextobs,rwd,isdone,loggedSignals] = step(this,action)
%% I want to extract K (the actor network weights)
loggedSignals = [];
x = this.State;
tspan = 0:0.01:this.Ts;
[t2,xk1] = ode15s(@NDAE_func_ode_RL,tspan,x,this.SYS.options1,action,this.SYS.d1,this);
this.State = xk1(end,:)';
nextobs = this.Cd*xk1(end,:)';
rwd = -x'*this.Qd*x - action'*this.Rd*action - 2*x'*this.Nd*action;
isdone = length(xk1(:,1))<length(tspan) || norm(x) < this.GoalThreshold;
end
Any guidance/suggestions would be highly appreciated.
Thanks,
Nadeem

Answers (1)

Harsha Vardhan on 17 Nov 2023
Edited: Harsha Vardhan on 17 Nov 2023
Hi,
I understand that you want to extract the weights of the actor network inside the step function of the environment during training in a DDPG reinforcement learning setup.
To extract the weights, you can follow these steps:
  • Pass the agent as an argument to the 'step' function.
  • Inside the 'step' function, use the 'getActor' function to obtain the actor function approximator from the agent.
  • Use the 'getLearnableParameters' function to extract the actor's learnable parameters (weights).
Please check the modified code below:
function [nextobs,rwd,isdone,loggedSignals] = step(this,action, agent)
%% I want to extract K (the actor network weights)
%Obtain actor function approximator from the agent
actor = getActor(agent);
%Obtain learnable parameters from the actor
params = getLearnableParameters(actor);
loggedSignals = [];
x = this.State;
tspan = 0:0.01:this.Ts;
[t2,xk1] = ode15s(@NDAE_func_ode_RL,tspan,x,this.SYS.options1,action,this.SYS.d1,this);
this.State = xk1(end,:)';
nextobs = this.Cd*xk1(end,:)';
rwd = -x'*this.Qd*x - action'*this.Rd*action - 2*x'*this.Nd*action;
isdone = length(xk1(:,1))<length(tspan) || norm(x) < this.GoalThreshold;
end
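Since you mentioned wanting to impose a stability condition on the closed-loop system, here is a minimal sketch of how the extracted parameters could be interpreted as K and checked, assuming the actor is a single fully connected linear layer with no bias (so its weight matrix is exactly the gain K) and that the plant matrices are available on the environment object. The property names 'this.SYS.A' and 'this.SYS.B', the sign convention u = -K*x, and the penalty value are all assumptions for illustration, not part of your original code:

```matlab
% Sketch only: assumes a linear, bias-free actor whose first learnable
% parameter is the feedback gain K, and that A and B are stored on the
% environment (property names are hypothetical).
actor = getActor(agent);
params = getLearnableParameters(actor);
K = params{1};                        % actor weight matrix, assumed u = -K*x

Acl = this.SYS.A - this.SYS.B*K;      % closed-loop state matrix
isStable = all(real(eig(Acl)) < 0);   % continuous-time stability test

if ~isStable
    % Example way to use the condition: penalize an unstable K in the
    % reward (the penalty magnitude here is purely illustrative).
    rwd = rwd - 100;
end
```

If your actor network has more than one layer, 'params' will contain one entry per learnable parameter, and K would have to be reconstructed accordingly.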
For more details, please refer to the following documentation:
  1. Extract actor from reinforcement learning agent: https://www.mathworks.com/help/reinforcement-learning/ref/rl.agent.rlqagent.getactor.html
  2. Create Custom Environment Using Step and Reset: https://www.mathworks.com/help/reinforcement-learning/ug/create-matlab-environments-using-custom-functions.html
  3. Deep Deterministic Policy Gradient (DDPG) Agents – Creation and Training: https://www.mathworks.com/help/reinforcement-learning/ug/ddpg-agents.html
Hope this helps in resolving your query!

Release

R2023b
