DDPG algorithm/Experience Buffer/ rl.util.ExperienceBuffer

I want to code my own DDPG algorithm. In the initial steps, the batch size is larger than the number of experiences in the experience buffer. How can I still get enough sampled data for my mini-batch?
I use rl.util.ExperienceBuffer to create my experience buffer and the createSampledExperienceMiniBatch(buffer,BatchSize) function to get data for the mini-batch. However, when the number of experiences in the buffer is smaller than BatchSize, the function returns a 0x0 cell.

Answers (1)

Aravind
Aravind on 12 Feb 2025
To manage the initial phase of a DDPG algorithm, when your experience buffer holds fewer experiences than the desired mini-batch size, you can consider one of these two options:
  1. Adjust the Mini-Batch Size Dynamically: When calling the “createSampledExperienceMiniBatch” function, use the current buffer size as the mini-batch size whenever it is smaller than the desired size. This allows the agent to learn at every time step. The downside is that the agent might not explore sufficiently early on, potentially leading to a sub-optimal policy by the end of training.
  2. Warm-Up Phase: Implement an initial phase where the agent collects experiences without updating the policy, so that the buffer is adequately filled. This is a common approach when training a DDPG agent. The agent takes random actions until the buffer holds enough entries to match the batch size. Once the buffer length meets the batch size, you can begin training, thus avoiding errors with the “createSampledExperienceMiniBatch” function.
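As an illustration of the warm-up approach, here is a minimal sketch of a training loop. It assumes the createSampledExperienceMiniBatch(buffer,BatchSize) signature from your question; the loop bounds, the experience counter, and the commented-out steps are hypothetical placeholders you would replace with your own environment and network-update code.

```matlab
% Warm-up sketch (option 2): do not sample until the buffer holds enough
% entries. batchSize, maxSteps, and numExperiences are illustrative names.
batchSize = 64;          % desired mini-batch size
numExperiences = 0;      % experiences appended to the buffer so far

for step = 1:maxSteps
    % Warm-up: take a random action while the buffer is still too small,
    % otherwise act with the current policy plus exploration noise.
    % ... act, observe, and append the experience to buffer here ...
    numExperiences = numExperiences + 1;

    if numExperiences >= batchSize
        % Safe to sample: the buffer now holds at least batchSize entries,
        % so the function no longer returns an empty 0x0 cell.
        miniBatch = createSampledExperienceMiniBatch(buffer, batchSize);
        % ... update the actor and critic networks with miniBatch ...
    end
end
```

For option 1 instead, you would replace the condition with sampling min(numExperiences, batchSize) experiences on every step.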
I hope this addresses your query.
