Optimize RL agent for DC motor speed control
Hello everyone,
I am trying to replace a PI controller with an RL agent to achieve simple speed control of a motor (for now without current control). So far I have managed to get the RL agent to behave like a P controller: it holds its set speed well and also corrects quickly after a step. However, a steady-state error of 10-200 rpm remains (depending on the specified target speed). As observations I am using the current speed, the error, and the integrated error.
I penalize the agent linearly with the error and reward it once it gets within 50 rpm of the desired speed.
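Simplified, the per-step reward I use looks roughly like this (the scaling of the penalty is only illustrative):
% e = speed error in rpm (target speed minus measured speed)
if abs(e) < 50
    reward = 1;             % reward once within 50 rpm of the target speed
else
    reward = -abs(e)/1000;  % linear penalty proportional to the error
end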
In the plot I simulated a braking load from the 2nd second onwards. The goal is to reach the desired speed despite the brake, but an error remains, together with an oscillation.
I am running out of ideas for what I could change to teach the RL agent a robust PI-controller behavior, and I would appreciate any suggestions. As a template for the actor and critic I am using the water tank example.
Another problem is that so far the agent has only been able to learn the behavior for positive speeds. Teaching it to behave the same way in the negative range, simply by applying a negative voltage, has not worked yet.
Thanks in advance for any answers.
2 Comments
madhav
on 7 Nov 2023
Hi Franz,
Were you able to control the speed now? If so, please share the code for my reference.
Answers (1)
Hornett
on 12 Sep 2024
Hi Franz,
I understand that you want to replace the PI controller with an RL (reinforcement learning) agent and would like to increase the accuracy of the system. To achieve this, you can consider the following:
- Adjust the reward function: instead of penalizing the agent linearly for the error, use a reward function that penalizes larger errors more heavily, such as a quadratic or exponential penalty. This can help the agent prioritize driving the error to zero; a sketch of such a reward is shown after this list.
- Experiment with different network architectures: try increasing the depth or width of the neural networks used for the actor and critic. This gives the agent more capacity to learn complex control strategies; see the actor-network sketch after this list.
- Tune exploration: try adjusting the exploration rate or using different exploration strategies, such as epsilon-greedy or noise-based exploration. This allows the agent to explore a wider range of actions and potentially discover better control strategies.
- Explore different reward structures: in addition to the error, consider incorporating other factors into the reward function. For example, include a term that rewards the agent for maintaining a stable and smooth response, such as penalizing large changes in the control output. This encourages the agent to learn a more robust and stable control strategy; the reward sketch below includes such a term.
- Adjust hyperparameters: hyperparameters such as the learning rate, discount factor, and exploration noise decay can significantly impact the learning process. Experiment with different values to find the ones that work best for your problem; the agent options sketch after this list shows where these are set.
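As an illustration of the reward-shaping points, here is a minimal sketch of a reward that could be computed in a MATLAB Function block inside your Simulink model. The signal names and weights are placeholders and would need to be tuned for your motor; they are not taken from your model.
function r = computeReward(e, du)
% e  : speed error in rpm (reference speed minus measured speed)
% du : change in the control voltage since the previous step
% All weights below are placeholders and need tuning for the specific motor.
r = -0.001*(e/100)^2 ...   % quadratic penalty on the speed error
    - 0.1*du^2;            % penalty on large action changes (smoothness)
if abs(e) < 50
    r = r + 1;             % small bonus for staying within 50 rpm of the target
end
end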
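For the network architecture point, assuming you keep a continuous-action agent such as the DDPG agent from the water tank example, a deeper actor could be built along these lines (layer sizes and the voltage limit are example values, not taken from your setup):
numObs = 3;   % observations: measured speed, error, integrated error
numAct = 1;   % action: motor voltage
actorNet = [
    featureInputLayer(numObs,'Normalization','none','Name','obs')
    fullyConnectedLayer(64,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(64,'Name','fc2')
    reluLayer('Name','relu2')
    fullyConnectedLayer(numAct,'Name','fc3')
    tanhLayer('Name','tanh')                   % bounded output in [-1, 1]
    scalingLayer('Name','actOut','Scale',24)   % scale to, e.g., a +/-24 V range
    ];
A similarly widened critic (with the action concatenated into a hidden layer) would usually accompany this.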
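For the exploration and hyperparameter points, again assuming a DDPG agent as in the water tank example, the relevant settings live in the agent options; the values below are examples only:
Ts = 0.01;                                % controller sample time (placeholder)
agentOpts = rlDDPGAgentOptions( ...
    'SampleTime',Ts, ...
    'DiscountFactor',0.99, ...
    'MiniBatchSize',64, ...
    'ExperienceBufferLength',1e6);
% Exploration noise: a larger variance explores more aggressively, and the
% decay rate controls how quickly exploration is reduced during training.
% Depending on your release, these properties may be named
% StandardDeviation / StandardDeviationDecayRate instead.
agentOpts.NoiseOptions.Variance = 0.3;
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;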
Please find below links to documentation which I believe will be helpful for further reference:
- Reinforcement Learning Agents: https://in.mathworks.com/help/reinforcement-learning/ug/create-agents-for-reinforcement-learning.html
- rlQAgentOptions: https://in.mathworks.com/help/reinforcement-learning/ref/rl.option.rlqagentoptions.html
- Define Reward and Observation Signals: https://in.mathworks.com/help/reinforcement-learning/ug/define-reward-and-observation-signals.html
Hope this helps!
0 Comments