How to ensure an RL Agent in Simulink receives input only after a condition based on the output is met

I'm working on a reinforcement learning agent in Simulink to optimize the shape of a truck's path. I am using a DDPG agent that controls the shape of the curve by adjusting certain parameters of a Bezier curve. This path is then fed into a model in which a controller follows the path as closely as it can. When the goal is reached, a controller output named goalReached becomes true, and the final runtime is passed to the RL agent, completing the cycle.
My problem is that the RL agent in Simulink expects input at every time step, not only when the model provides the total runtime. I've tried time delays, switch systems, and trigger blocks, but they still pass some form of signal to the RL agent, even when goalReached is false.
Are there any ways to make the RL agent activate only when goalReached is true, or is reinforcement learning only possible with direct, continuous input/output?
  1 Comment
Sam Chak on 15 Oct 2024
This sounds like an online motion planning problem, where the real-time path is determined by the Bezier curve-informed DDPG agent. I'm not sure, but you could try using a combination of logic blocks (such as the Switch or If blocks) to evaluate the conditions under which the RL agent should be allowed to take actions.


Accepted Answer

Aravind on 16 Oct 2024
From your question, it seems you want the "RL Agent" block in Simulink to run only when certain conditions are met. I assume you have set up an "rlDDPGAgent" object in the MATLAB workspace and assigned its name to the "Agent object" parameter of the "RL Agent" block. Let's call this variable "ddpgAgent".
Use a triggered subsystem for conditional execution:
Place the “RL Agent” block inside a triggered subsystem and set up the trigger condition so that the subsystem activates when the "goalReached" signal becomes true.
One way to do this is by setting the "Trigger type" of the "Trigger" block within the subsystem to "rising." This ensures that whenever the "goalReached" signal changes from "false" to "true," the subsystem is triggered.
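To illustrate what a "rising" trigger does, here is a minimal plain-MATLAB sketch (no Simulink needed) that marks the steps at which a rising-triggered subsystem would execute. The goalReached samples and the variable names are made up for illustration, and the first sample is assumed not to fire the trigger.

```matlab
% Hypothetical goalReached samples, one per simulation step.
goalReached = logical([0 0 1 1 0 0 1]);

% A rising edge occurs where the current sample is true and the
% previous sample was false. The first sample has no predecessor,
% so it is assumed here not to fire.
risingEdge = [false, goalReached(2:end) & ~goalReached(1:end-1)];

% Steps at which the triggered subsystem (and the RL Agent block
% inside it) would execute.
triggerSteps = find(risingEdge);
disp(triggerSteps)
```

In this sequence the subsystem would run at steps 3 and 7 only. It does not run again at step 4, where goalReached is still true but has not changed.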
Adjust the sample time of the “RL Agent” block:
When you place the "RL Agent" block inside a triggered subsystem, it is important to set the sample time of the "RL Agent" block correctly. According to the official documentation at https://www.mathworks.com/help/reinforcement-learning/ref/rl.agent.rlddpgagent.html#mw_7e49ed9c-eea8-4994-8532-c7f66d5359ef_sep_mw_6ccbf7d0-1b4b-4dfc-8e02-8caca7909d2f, the default sample time for a DDPG Agent is 1.
However, when the "RL Agent" block is inside a triggered subsystem, the sample time should be set to -1. With this setting the block inherits its execution timing from the parent subsystem, so the "RL Agent" block processes inputs and generates outputs only when the subsystem is triggered.
To set the sample time to -1, include the following line of code when initializing the "ddpgAgent" variable in MATLAB:
ddpgAgent.SampleTime = -1;
Although inputs are provided to the "RL Agent" block throughout the simulation, it processes only those present when the block is executed. This occurs only when the subsystem containing the "RL Agent" block is triggered, which happens when the "goalReached" signal turns "true." Thus, the RL Agent processes inputs only when the specified condition is met.
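For context, here is a minimal sketch of where that line could sit when the agent is created. The observation and action dimensions and the limits are assumptions for illustration (one observation for the final runtime, four Bezier parameters as actions), not values from the original model. It requires Reinforcement Learning Toolbox and Deep Learning Toolbox.

```matlab
% Hypothetical specs: one observation (the final runtime) and four
% Bezier-curve parameters as actions. Adjust these to your model.
obsInfo = rlNumericSpec([1 1]);
actInfo = rlNumericSpec([4 1], LowerLimit=-1, UpperLimit=1);

% Create a DDPG agent with default actor and critic networks.
ddpgAgent = rlDDPGAgent(obsInfo, actInfo);

% Set the sample time to -1 so the RL Agent block runs only when
% its parent triggered subsystem executes.
ddpgAgent.SampleTime = -1;
```

If you build the agent from your own actor and critic instead, the same SampleTime assignment applies after the agent is constructed.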
I hope this clarifies your query.

Release: R2024b
