How to train a sequence to classification network on GPU

I have a sequence to classification network that I can successfully train on a CPU using the trainNetwork function. However, when I set the ExecutionEnvironment to GPU it takes the same amount of time as on the CPU, even though it says that it's running on the GPU. I'm assuming that's because the input/output data arrays are not on the GPU. When I try to move the arrays to the GPU (using gpuArray) the input array moves but I get an error on the output array because it's a categorical array and gpuArray only supports numeric and logical arrays.
Is there any way to move a categorical array to the GPU so that I can get trainNetwork to actually run at full speed on the GPU? Or is there another way to get a sequence to classification network to train on a GPU?

Accepted Answer

Venu on 4 Dec 2023
Edited: Venu on 4 Dec 2023
I understand that training on the GPU is not giving you the speedup you expected. Here are two suggestions:
1. You can move a categorical array to the GPU, but you need to convert it to a numeric array first.
Convert the categorical array to a numeric array using the "grp2idx" function.
Once you have the numeric array, you can use "gpuArray" to move it to the GPU.
Example code:
numericArray = grp2idx(categoricalArray);
numericArrayGPU = gpuArray(numericArray);
You can refer to the MATLAB documentation for "grp2idx":
https://www.mathworks.com/help/stats/grp2idx.html
2. Alternatively, set the execution environment to 'auto' in the training options. With this setting, training uses a GPU if one is available and otherwise runs on the CPU. The Deep Learning Toolbox then handles the data movement and computations on the GPU for you, including moving the input data to the GPU and converting categorical data as needed. This lets you train your sequence-to-classification network with categorical arrays on a GPU without explicitly moving the input data to the GPU or converting the categorical data yourself.
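As a rough sketch of the second suggestion, the snippet below trains a tiny sequence-to-classification network with 'auto' and passes ordinary CPU arrays, including categorical labels, straight to trainNetwork. The synthetic data, layer sizes, and epoch count are illustrative assumptions and are not taken from the original question.

```matlab
% Synthetic stand-in data (hypothetical): 20 sequences, each with
% 3 features and 10 time steps, split evenly between two classes.
numObs = 20;
XTrain = cell(numObs, 1);
for i = 1:numObs
    XTrain{i} = randn(3, 10);          % numFeatures-by-numTimeSteps
end
YTrain = categorical(repmat([1; 2], numObs/2, 1));

% Small illustrative network: one LSTM layer that outputs its last step.
layers = [
    sequenceInputLayer(3)
    lstmLayer(8, "OutputMode", "last")
    fullyConnectedLayer(2)
    softmaxLayer
    classificationLayer];

% 'auto' uses a GPU if one is available and the CPU otherwise.
options = trainingOptions("adam", ...
    "ExecutionEnvironment", "auto", ...
    "MaxEpochs", 2, ...
    "Verbose", false);

% No gpuArray call on XTrain or YTrain; the labels stay categorical.
net = trainNetwork(XTrain, YTrain, layers, options);
```

Setting "ExecutionEnvironment" to "gpu" or "cpu" in the same call forces one device, which is useful when comparing timings.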
Hope this helps!
  1 Comment
Tony Marino on 4 Dec 2023
Thank you Venu, this does work. In the process of checking, I did discover a little mystery. If I set the ExecutionEnvironment to "auto", the training runs on the GPU and takes about 2 minutes 49 seconds. If I set it to "GPU", I get the same results. If I set it to "CPU", the training runs on the CPU and takes 1 minute 52 seconds. This is on a 12th gen i9 laptop with an RTX 3060 GPU.


More Answers (1)

Joss Knight on 3 Jan 2024
This performance discrepancy is normal. Small sequence networks often cannot benefit from GPU parallelism, especially if they use recurrent layers, and especially on a weak laptop GPU.
You can try increasing the MiniBatchSize as high as you can to see if that improves things.
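One way to check this on your own machine is to time the same training run for each device and a couple of mini-batch sizes. The sketch below is only an illustration: the synthetic data, the small LSTM network, and the two batch sizes (32 and 256) are assumptions, and the first GPU run may include one-time device start-up overhead.

```matlab
% Synthetic stand-in data (hypothetical): 512 sequences, 3 features,
% 10 time steps, two balanced classes.
numObs = 512;
XTrain = cell(numObs, 1);
for i = 1:numObs
    XTrain{i} = randn(3, 10);
end
YTrain = categorical(repmat([1; 2], numObs/2, 1));

layers = [
    sequenceInputLayer(3)
    lstmLayer(8, "OutputMode", "last")
    fullyConnectedLayer(2)
    softmaxLayer
    classificationLayer];

% Always time the CPU; add the GPU only if one is usable.
envs = "cpu";
if canUseGPU()
    envs(end+1) = "gpu";
end
batchSizes = [32 256];
times = zeros(numel(envs), numel(batchSizes));

for e = 1:numel(envs)
    for b = 1:numel(batchSizes)
        options = trainingOptions("adam", ...
            "ExecutionEnvironment", envs(e), ...
            "MiniBatchSize", batchSizes(b), ...
            "MaxEpochs", 2, ...
            "Verbose", false);
        tic;
        trainNetwork(XTrain, YTrain, layers, options);
        times(e, b) = toc;   % wall-clock seconds for this configuration
        fprintf("%s, MiniBatchSize %d: %.2f s\n", ...
            envs(e), batchSizes(b), times(e, b));
    end
end
```

If the GPU row only catches up with the CPU row at the larger batch size, or never does, that is consistent with the network being too small to benefit from GPU parallelism.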
