GPU device is not recognised in the official MATLAB Deep Learning Docker image
Hello,
I am trying to use the MATLAB Deep Learning Docker image (mathworks/matlab-deep-learning, tag r2023b, from Docker Hub) on our Slurm-based HPC server.
I am using the srun utility to run the Docker image:
srun \
--time=0-02:00:00 --gpus-per-node=1 --container-image=mathworks/matlab-deep-learning:r2023b \
--container-name=matlabDeepLearningGPU --pty bash
When launching the image, nvidia-smi returns the following, showing the CUDA version to be N/A.
When I run MATLAB and execute gpuDevice(), I get the following error:
I am wondering whether this is an issue with the Docker image provided by MathWorks, with the drivers installed on the host, or with something else.
I get the same error whether I use an NVIDIA GeForce RTX 3090, an NVIDIA H100 NVL, or an RTX 2080 Ti.
Thank you!
Answers (1)
Michael
on 9 Sep 2024
Thanks for reaching out about this.
This error looks like the one seen when the container has not been started correctly with the NVIDIA container runtime.
To do this with Docker, you need to install the nvidia-container-toolkit on the host, and then ensure both that the NVIDIA driver is installed and that the GPUs are passed into the container runtime. For Docker this is done by passing --gpus all when running the container, and for Singularity it is done by passing --nv.
You should be able to test these flags if you have interactive access to any machine where these GPUs are available. In your case, though, you may need to ask the system administrators of your HPC cluster whether the correct flags are being passed when the containers are run.
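Before contacting the administrators, it may help to collect some evidence from inside the running container. The following is only a minimal sketch, not an official MathWorks or NVIDIA tool: the function name gpu_check and the choice of these three checks are assumptions made for illustration. It looks for the GPU device nodes, the nvidia-smi binary, and the driver library libcuda.so.1, all of which are normally mapped in from the host when the NVIDIA runtime is used. A missing libcuda.so.1 is one possible reason for nvidia-smi reporting the CUDA version as N/A, but treat that as a hint rather than a diagnosis.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helper (illustrative only): run it inside the
# container to see which parts of the NVIDIA stack are visible.

gpu_check() {
    local status=0

    # Device nodes such as /dev/nvidia0 come from the host driver and are
    # mapped into the container by the NVIDIA runtime.
    if ls /dev/nvidia[0-9]* >/dev/null 2>&1; then
        echo "device nodes: found"
    else
        echo "device nodes: missing"
        status=1
    fi

    # nvidia-smi is usually provided by the host driver installation.
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi: found"
    else
        echo "nvidia-smi: missing"
        status=1
    fi

    # libcuda.so.1 is the driver library that CUDA applications load.
    if ldconfig -p 2>/dev/null | grep -q 'libcuda\.so\.1'; then
        echo "libcuda.so.1: found"
    else
        echo "libcuda.so.1: missing"
        status=1
    fi

    return "$status"
}

# Print a hint on stderr if any of the checks failed.
gpu_check || echo "GPU stack incomplete: check how the container was started" >&2
```

If any line reports "missing", the output is a concrete detail to pass on when asking how the containers are launched on the cluster.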
Hope that is helpful,
Michael