What's the state space of the critic network in multi-agent reinforcement learning with centralized training?

I have tried centralized training, and I extracted the actor and critic neural networks from every agent. I found that all the actor networks share the same parameters, and so do the critic networks. Does each actor or critic use all agents' mini-batches to update itself?
I mean, for example, if there are 3 agents and the mini-batch size of each of them is 128, are 128*3 samples used for actor or critic training?
Another question: what is the input of the critic network? The state of each individual agent, or some kind of joint state space?

Answers (1)

Anshuman on 21 Oct 2024
Hi Yiwen,
In some MARL algorithms, actor and critic networks share parameters across agents to promote coordination and reduce the complexity of the learning process. This is particularly common in environments where agents have similar roles or tasks.
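As a quick illustration of why you saw identical parameters: when a single network object is shared, "each agent's critic" is the same set of weights, so extracting it from any agent compares equal. A trivial sketch (hypothetical names and sizes, assuming the Deep Learning Toolbox):

sharedNet = dlnetwork([
    featureInputLayer(8)        % hypothetical input size
    fullyConnectedLayer(1)]);

criticAgent1 = sharedNet;   % "agent 1's critic"
criticAgent2 = sharedNet;   % "agent 2's critic"
isequal(criticAgent1.Learnables, criticAgent2.Learnables)   % returns true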
When parameters are shared, it's common for the networks to use experiences from all agents to update themselves. This means that if each agent has a mini-batch size of 128, the combined mini-batch size used for training could be 128 * 3 = 384. This helps the network learn from a more diverse set of experiences.
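To make the arithmetic concrete, here is a minimal plain-MATLAB sketch (the variable names and obsDim value are hypothetical) of pooling per-agent mini-batches into one combined batch for a shared update:

numAgents     = 3;
miniBatchSize = 128;
obsDim        = 8;   % hypothetical per-agent observation size

% Each agent's sampled observations stored as columns (obsDim-by-128).
perAgentBatch = cell(1, numAgents);
for k = 1:numAgents
    perAgentBatch{k} = rand(obsDim, miniBatchSize);   % placeholder samples
end

% Concatenate along the batch dimension: 128*3 = 384 samples per update.
combinedBatch = cat(2, perAgentBatch{:});
size(combinedBatch, 2)   % returns 384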
In centralized training, the critic network typically takes the joint state as input, i.e., the concatenation of all agents' states (or observations). In actor-critic methods such as MADDPG, the centralized critic also receives the joint action, so it evaluates the value of the team's combined behavior rather than that of any single agent.
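As a sketch of what "joint state space" means in practice: assuming the Deep Learning Toolbox, a centralized critic can simply take the concatenation of all agents' observations (and, for a Q(s,a)-style critic, all agents' actions) as one input vector. The layer sizes and dimensions below are hypothetical:

numAgents = 3;
obsDim    = 8;    % hypothetical per-agent observation size
actDim    = 2;    % hypothetical per-agent action size

jointObsDim = numAgents * obsDim;   % joint state = all agents' states
jointActDim = numAgents * actDim;   % joint action, for a Q(s,a) critic

layers = [
    featureInputLayer(jointObsDim + jointActDim, "Name", "jointInput")
    fullyConnectedLayer(256)
    reluLayer
    fullyConnectedLayer(256)
    reluLayer
    fullyConnectedLayer(1, "Name", "qValue")];   % scalar Q-value output

criticNet = dlnetwork(layers);
% With the Reinforcement Learning Toolbox, such a network could then be
% wrapped into a critic (e.g., with rlQValueFunction), given observation
% and action specs whose dimensions match the joint input.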
Hope it helps!
