Run SIL and PIL Verification for Reinforcement Learning
This example shows how to perform software-in-the-loop (SIL) and processor-in-the-loop (PIL) verification workflows for reinforcement learning agents in Simulink®.
This example requires the following hardware.
Raspberry Pi® hardware.
Wi-Fi® dongle or an Ethernet cable.
Power source connected to a micro USB cable.
You must also download and install the following support packages using the Add-Ons Explorer.
MATLAB® Support Package for Raspberry Pi Hardware. See Install Support for Raspberry Pi Hardware for more information about how to install the support package.
Simulink Support Package for Raspberry Pi Hardware. Follow the Get Started with Simulink Support Package for Raspberry Pi Hardware (Simulink) example to set up Raspberry Pi hardware.
MATLAB® Coder™ Interface for Deep Learning. This will install the Intel® MKL-DNN and ARM® Compute libraries.
Simulink Environment
The environment for this example is a Quanser QUBE™-Servo 2 pendulum swing-up model. The swing-up and balancing actions are performed by a combination of proportional-derivative (PD) controllers and reinforcement learning (RL) agents. In this example, you will simulate the controllers in software-in-the-loop (SIL) and processor-in-the-loop (PIL) verification modes and compare the results with normal simulation. For more information, see SIL and PIL Simulations (Embedded Coder).
Load the parameters for the environment.
loadQubeParameters
The top-level model consists of the controller model reference and the pendulum environment. Open the model.
mdl = "rlQubeServo_SIL_PIL";
open_system(mdl)
Open the controller model reference.
open_system("Controller")
The overall control system consists of two RL agents in the outer loop that compute high-level reference angles. The reference angles are sent to a low-level controller that stabilizes the pendulum system by computing the motor voltage. For more information on the controller design and training, see Train Reinforcement Learning Agents to Control Quanser QUBE Pendulum.
Policy Evaluation
To generate code and deploy reinforcement learning policies to hardware, you can use one of the following methods.
Evaluate the policy using the Deep Learning Toolbox™ Predict block.
Evaluate the policy by generating MATLAB® code using generatePolicyFunction. This option is used in this example.
Load the RL agents from the rlQubeServoSILPILAgents.mat file.
load("rlQubeServoSILPILAgents.mat","swingAgent","modeAgent");
Generate the following policy evaluation functions.
evaluateAgentSelectPolicy: For selecting the outer-loop pendulum reference angle.
evaluateAgentSwingPolicy: For oscillating and swinging up the pendulum.
generatePolicyFunction(swingAgent,...
    "MATFileName","policy_swing.mat",...
    "FunctionName","evaluateAgentSwingPolicy");
generatePolicyFunction(modeAgent,...
    "MATFileName","policy_select.mat",...
    "FunctionName","evaluateAgentSelectPolicy");
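Before wiring the generated functions into the model, it can help to call them once from the command line. The following is a minimal sketch, assuming that swingAgent and modeAgent are in the workspace, that the generated files are on the MATLAB path, and that each agent has a single observation channel. The all-zeros observation is only a placeholder and does not represent a meaningful pendulum state.

```matlab
% Query the observation specifications of each agent.
swingObsInfo = getObservationInfo(swingAgent);
modeObsInfo = getObservationInfo(modeAgent);

% Build placeholder observations with the expected dimensions
% (assumption: one observation channel per agent).
swingObs = zeros(swingObsInfo(1).Dimension);
modeObs = zeros(modeObsInfo(1).Dimension);

% Call the generated policy evaluation functions.
swingAction = evaluateAgentSwingPolicy(swingObs);
modeAction = evaluateAgentSelectPolicy(modeObs);

disp(swingAction)
disp(modeAction)
```

If both calls return without error, the generated functions and their MAT-files are consistent with the agent specifications.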
Open the Outer Loop Control subsystem to choose the model evaluation method and select RL from the dropdown menu. Doing so activates policy execution using generated MATLAB code.
Alternatively, you can set this parameter using the following command.
set_param("Controller/Outer Loop Control","VChoice","RL");
The generated functions evaluateAgentSelectPolicy and evaluateAgentSwingPolicy are executed inside the Outer Loop Control subsystem using MATLAB Function blocks.
Software-in-the-Loop Simulation
You can analyze code generation performance using software-in-the-loop (SIL) simulation. A SIL simulation generates and builds code on your development computer and then simulates the system using the generated code. You can then compare the results with the ones obtained from a Normal mode simulation.
Configure Controller for SIL Simulation
The following steps show how to configure code generation settings in Simulink for SIL simulation. You can skip these steps and use preconfigured settings with the following command, which sets the appropriate configuration reference for SIL simulation.
setActiveConfigSet("Controller","configSILReferenceRL");
Otherwise, follow these steps.
Open the Controller model.
In the Simulink model window, on the Modeling tab, click Model Settings.
In the Configuration Parameters window, in the Solver section, set the Type parameter to Fixed-step and the Solver parameter to discrete. Set the Fixed-step size parameter to ts_PID, which is 0.005 s.
In the Simulation Target section, set the Language parameter to C++ and the Target Library parameter to MKL-DNN. If using the Deep Learning Toolbox Predict block for evaluation, set Language to C.
In the Code Generation section, set the System target file parameter to ert.tlc and the Language parameter to C. Also, select an appropriate Toolchain parameter.
Optionally, you can set the Language to C++ and set the Target library in the Code Generation > Interface section to MKL-DNN. Doing so generates C++ code for the controller.
Save the model.
View Generated Code
Optionally, you can view the generated code for the controller from the C-code perspective.
In the Simulink model window, on the Apps tab, in the gallery, click Embedded Coder.
To generate code and display it in the Code panel, on the C Code tab, click Build.
Ensure that there are no errors in this process. You can reconfigure the code generation settings to optimize the generated code.
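As a command-line alternative to the Build button, you can start the build with slbuild. This is a minimal sketch, assuming that the Controller model is already configured for the ert.tlc target and that an Embedded Coder license and a supported compiler are available.

```matlab
% Load the controller model without opening the editor window.
load_system("Controller")

% Generate and build code using the model's current code generation settings.
slbuild("Controller")
```

Build errors are reported in the Command Window, which makes this form convenient for scripted checks.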
Configure Top-Level Model for SIL Simulation
Open the top-level model rlQubeServo_SIL_PIL.slx and specify the simulation mode for the Controller model reference.
To configure the simulation mode of the Controller model reference, right-click the Controller subsystem and select Block Parameters. Then, in the Block Parameters dialog box, set the Simulation mode parameter to Software-in-the-loop (SIL).
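The simulation mode can also be set from the command line through the SimulationMode parameter of the Model block. The block path below is an assumption based on the model and block names used in this example; adjust it if the block is named differently in your copy of the model.

```matlab
mdl = "rlQubeServo_SIL_PIL";
load_system(mdl)

% Assumed path of the Model block that references the Controller model.
ctrlBlock = mdl + "/Controller";

% Switch the model reference to SIL mode. The same parameter accepts
% "Processor-in-the-loop (PIL)" and "Normal".
set_param(ctrlBlock,"SimulationMode","Software-in-the-loop (SIL)");

% Confirm the setting.
disp(get_param(ctrlBlock,"SimulationMode"))
```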
Run SIL Simulation
To run the SIL simulation, on the Apps tab, click SIL/PIL Manager.
On the SIL/PIL tab, in the System Under Test drop-down menu, select Model blocks in SIL/PIL mode. Then, for Top Model Mode, select Normal.
To simulate the model, generate and run the code in a SIL simulation, and compare the results, click Run Verification. The results are shown in the Simulink Data Inspector.
The Simulink Data Inspector shows a comparison of the controller output values between the Normal and SIL simulations. The differences are within acceptable tolerances for this example.
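The comparison that Run Verification performs can be approximated in a script by simulating twice and comparing the two most recent runs in the Simulink Data Inspector. This is a sketch under several assumptions: the controller outputs are logged to the Data Inspector, each call to sim creates one new run, and the Model block path is rlQubeServo_SIL_PIL/Controller.

```matlab
mdl = "rlQubeServo_SIL_PIL";
load_system(mdl)
ctrlBlock = mdl + "/Controller";   % assumed Model block path

% Baseline run with the controller in Normal mode.
set_param(ctrlBlock,"SimulationMode","Normal");
sim(mdl);

% Second run with the controller executing as generated code (SIL).
set_param(ctrlBlock,"SimulationMode","Software-in-the-loop (SIL)");
sim(mdl);

% Compare the two most recent runs in the Simulink Data Inspector.
runIDs = Simulink.sdi.getAllRunIDs;
diffResult = Simulink.sdi.compareRuns(runIDs(end-1),runIDs(end));

% Summary of how many signals are within or out of tolerance.
disp(diffResult.Summary)
```

To inspect the individual signal differences, open the Data Inspector with Simulink.sdi.view.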
Processor-in-the-Loop Simulation
In a processor-in-the-loop (PIL) simulation, you generate code for the target hardware (in this case the Raspberry Pi), and then deploy and run the code on the hardware. The results of the PIL simulation are transferred to Simulink to verify the numerical equivalence of the simulation and the code generation results. The PIL verification process is an important part of the design cycle to ensure that the behavior of the deployment code matches the design.
Configure Controller for PIL Simulation
To set up PIL simulation:
Connect the Raspberry Pi to the host computer and power it on.
Execute the raspi command in MATLAB to ensure that the hardware is connected. This command also displays the device address.
Follow the configuration steps from the SIL simulation workflow. You can alternatively use preconfigured settings with the following command.
setActiveConfigSet("Controller","configPILReference");
In addition to these settings, in the Configuration Parameters dialog box, in the Hardware Implementation section, set the Hardware board parameter to Raspberry Pi and enter the board parameters. Set the Device Address, Username, and Password parameters to appropriate values.
To configure the top-level model for PIL simulation, first open the top-level model rlQubeServo_SIL_PIL.slx.
Then, right-click the Controller subsystem and select Block Parameters. In the Block Parameters dialog box, set the Simulation mode parameter to Processor-in-the-loop (PIL).
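The hardware check from the setup steps above can be sketched as follows. The device address, user name, and password are hypothetical placeholders; replace them with the values for your board. The snippet also reads back the hardware board currently selected for the Controller model so that you can confirm it before running the PIL simulation.

```matlab
% Hypothetical connection details; replace with your board's values.
deviceAddress = '192.168.1.10';
userName = 'pi';
password = 'raspberry';

% Connect to the board. An error here usually indicates a network
% or credentials problem.
r = raspi(deviceAddress,userName,password);
disp(r.DeviceAddress)
disp(r.BoardName)

% Confirm which hardware board the Controller model targets.
load_system("Controller")
disp(get_param("Controller","HardwareBoard"))
```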
Run PIL Simulation
To run the PIL simulation, on the Apps tab, click SIL/PIL Manager.
On the SIL/PIL tab, in the System Under Test drop-down menu, select Model blocks in SIL/PIL mode. Then, for Top Model Mode, select Normal.
To simulate the model, generate and run the code on the hardware, and compare the results, click Run Verification. The results are shown in the Simulink Data Inspector.
The Simulink Data Inspector shows a comparison of the controller output values between the Normal and PIL simulations. The differences are within acceptable tolerances for this example.
See Also
Related Examples
- Train DDPG Agent to Swing Up and Balance Pendulum
- Train Reinforcement Learning Agents to Control Quanser QUBE Pendulum