Overview
Some applications require a specific kernel version to be installed on their Linux machines to function properly. When the Linux kernel on a VM running NVIDIA drivers is upgraded, the driver may fail because its kernel modules were built against the previous kernel. This article describes how to identify the kernel version supported by the installed CUDA release and upgrade or downgrade the VM's kernel to match it.
Prerequisites
- SSH access to the VM
- Root privileges
Steps
Step 1: Identify the CUDA version installed on the VM
- Traditional VM: The CUDA version can be found by running the following command on the VM:
# nvidia-smi
- Kubernetes node: The CUDA version can be found by running the following command:
# kubectl -n gpu-operator exec ds/nvidia-driver-daemonset -- nvidia-smi
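The CUDA version is shown in the banner at the top of the nvidia-smi output (for example, "CUDA Version: 12.2"; the actual value depends on the installed driver). As a minimal sketch, the following command filters out just that field on a traditional VM:
# nvidia-smi | grep -o "CUDA Version: [0-9.]*"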
Step 2: Identify the supported kernel version
- NVIDIA maintains an archive of all CUDA releases, along with links to their documentation, at https://developer.nvidia.com/cuda-toolkit-archive
- Click the "Versioned Online Documentation" link next to the CUDA version identified in Step 1
- Under the installation guides, open the guide for Linux
- Refer to the system requirements to identify the supported kernel version for your Linux distribution and version.
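Before changing anything, it helps to record the VM's current kernel and distribution so they can be compared against the system requirements table in the installation guide. On most modern distributions:
# uname -r
# cat /etc/os-release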
Step 3: Upgrade/Downgrade the kernel version to match the recommended kernel version
On Ubuntu/Debian-based distributions, the matching kernel image and headers can be installed as follows (replace x.x.x-x with the recommended version identified in Step 2):
# sudo apt-get install linux-image-x.x.x-x-generic linux-headers-x.x.x-x-generic
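As a sketch of the surrounding workflow on an Ubuntu/Debian-based system, the kernel versions available from the configured repositories can be listed before choosing one, and the VM should be rebooted afterwards so that it boots into the newly installed kernel:
# apt-cache search linux-image | grep generic
# sudo reboot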