How to let the reinforcement learning agent know exactly what action it takes?
Dear Matlab Experts,
I am currently running a reinforcement learning simulation integrated with a discrete-event system (SimEvents) in Simulink. The main discrete-event simulation uses a bus element containing multiple entities; some of them serve as observations for the RL agent (via an entity-to-signal conversion), and the action the RL agent chooses is imposed back on the model (via a signal-to-entity conversion). I implemented a policy in the DES where, given certain requirements, an entity attribute is assigned a value that switches an entity gate to determine which course of action to take. However, my reinforcement learning agent does not seem to understand this rule: it assigns the entity value randomly from the available values. Is there a way to make this rule, which already exists in the DES, understandable by the RL agent?
Thank you so much in advance! I am attaching my model for reference.
Best regards,
Aaron.
Answers (1)
Maneet Kaur Bagga
on 21 Nov 2024 at 6:52
Hi,
As per my understanding, the issue occurs because your DES contains specific policies, such as switching gates based on entity attributes. These rules are hard-coded in the model and are not part of the RL environment's observation or reward structure, so the RL agent explores actions based only on the observations it receives and the policy it has learned.
Please consider the following workarounds:
Incorporate the Rule into the Observations: Add flags or variables that indicate the rule's state (e.g., "gate should switch" = 1/0) to the observation vector, and ensure these values are updated dynamically during simulation, as in the sketch below.
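A minimal sketch of how the observation and action specifications of the Simulink environment could be extended to include such a flag; the model name, block path, observation size, and action set below are assumptions, not values from your model:
% Expose the rule flag to the agent as an extra observation element.
% Assumes a 3-element base observation plus a 4th "gate should switch" flag
% computed inside the SimEvents model and routed to the RL Agent block.
obsInfo = rlNumericSpec([4 1]);
obsInfo.Name = "observations";
% Hypothetical discrete set of entity/gate values the agent may assign
actInfo = rlFiniteSetSpec([1 2 3]);
actInfo.Name = "gateSelection";
% "myDESModel" and the RL Agent block path are placeholders
env = rlSimulinkEnv("myDESModel", "myDESModel/RL Agent", obsInfo, actInfo);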
Augment Reward Structure: Add a penalty or reward for actions that align with or violate the DES rules. This encourages the RL agent to learn behaviors aligned with the rules.
% Reward shaping: bonus when the agent's action matches the DES rule (rewardFactor is a tuning constant)
reward = reward + (agentAction == expectedAction) * rewardFactor;
Pretrain the Agent: Use supervised learning to pretrain the RL agent so that it follows the DES rules as a baseline policy, then fine-tune it with reinforcement learning (see the sketch below).
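One possible pretraining sketch, assuming a hypothetical ruleBasedAction helper that implements the DES rule, a 4-element observation, and three discrete actions (none of these come from your model):
% Build a labeled dataset of (observation, rule-based action) pairs
numSamples = 5000;
obsData = rand(4, numSamples);                    % placeholder observations
actData = zeros(numSamples, 1);
for k = 1:numSamples
    actData(k) = ruleBasedAction(obsData(:,k));   % label = action the DES rule would choose
end
% Train a small classification network on the rule; the trained network could
% later be wrapped in an actor (e.g., rlDiscreteCategoricalActor) and fine-tuned with RL
layers = [
    featureInputLayer(4)
    fullyConnectedLayer(32)
    reluLayer
    fullyConnectedLayer(3)
    softmaxLayer];
options = trainingOptions("adam", MaxEpochs=20, Verbose=false);
pretrainedNet = trainnet(obsData', categorical(actData), layers, "crossentropy", options);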
Custom Environment Dynamics: Modify the environment (the DES model) so that the DES rules are enforced during interaction, for instance by overriding the agent's selected action whenever it violates a rule:
% violatesRule and enforceRule are placeholder functions encapsulating the DES policy
if violatesRule(action, currentState)
    action = enforceRule(currentState);   % replace the agent's action with the rule-compliant one
end
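In a Simulink/SimEvents setup, this override could be implemented in a MATLAB Function block placed between the RL Agent block's action output and the entity gate; here is a minimal sketch in which the rule flag and the forced action value are assumptions:
function safeAction = overrideAction(agentAction, ruleFlag)
% Pass the agent's action through unless the DES rule flag demands a specific
% gate selection, in which case force the rule-compliant action (the value 2
% used here is a placeholder).
if ruleFlag == 1 && agentAction ~= 2
    safeAction = 2;
else
    safeAction = agentAction;
end
end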
Regularization: Include constraints in the training process that mimic the DES rules, for example by penalizing rule violations in the training loss so that the policy network learns to output actions adhering to the rules.
% countViolations is a placeholder helper that counts DES rule violations in a batch of actions
loss = loss + ruleViolationPenalty * countViolations(actions, state);
Rule-Based Hybrid Approach: Use "getAction" to test the agent's action in specific scenarios and compare it against the DES policy to identify mismatches, for example:
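A minimal check might look like this (ruleBasedAction is the same hypothetical helper as above, and the observation size is assumed):
% Spot-check the trained agent against the DES rule on a sample observation
sampleObs   = {rand(4,1)};                      % observation in the cell format getAction expects
agentAction = getAction(agent, sampleObs);      % action the agent's policy selects
expected    = ruleBasedAction(sampleObs{1});    % action the DES rule prescribes
if ~isequal(agentAction{1}, expected)
    disp("Mismatch between agent policy and DES rule for this observation")
end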
Please refer to the MathWorks documentation of the "getAction" function for better understanding.
Hope this helps!