Install the Nvidia drivers as well as the container toolkit.
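The `AddDevice=nvidia.com/gpu=all` setting used below only works if a CDI specification exposes a device with that name. As a minimal sketch, the helper below checks the output of `nvidia-ctk cdi list` for that entry. The function name `has_cdi_gpu_all` is hypothetical, and the sketch assumes the command prints one device name per line.

```shell
# Succeeds if the text on stdin contains the exact line "nvidia.com/gpu=all".
# Assumption: `nvidia-ctk cdi list` prints one CDI device name per line.
has_cdi_gpu_all() {
    grep -qxF 'nvidia.com/gpu=all'
}
```

For example: `nvidia-ctk cdi list | has_cdi_gpu_all && echo "CDI device found"`. If the entry is missing, the CDI documentation linked in the Quadlet file describes generating the spec, typically with `sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml`.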
The following settings make Nvidia GPUs available inside Podman Quadlets:
# jellyfin.container
...
[Service]
# Create /dev/nvidia-uvm after rebooting
ExecStartPre=/usr/bin/nvidia-smi
...
[Container]
# Enable access to Nvidia GPUs
# See https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
# See https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html
AddDevice=nvidia.com/gpu=all
...
This can happen after a kernel update, when the Nvidia driver module has not been compiled for the new Linux kernel version.
[Optional: Check available drivers]
To check, get your Linux kernel version:
uname -r
List all compiled Nvidia drivers on your machine (takes a few seconds):
sudo dkms status
Check if one of them matches.
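The manual comparison above can be sketched as a small helper. The function name `dkms_has_kernel` is hypothetical, and the sketch assumes each `dkms status` line contains the kernel version between commas and ends in `installed`; the exact layout differs between DKMS versions.

```shell
# Succeeds if the `dkms status` text on stdin lists an installed module
# for the kernel version given as $1.
# Assumption: lines look like "<module>/<version>, <kernel>, <arch>: installed".
dkms_has_kernel() {
    awk -v k="$1" 'index($0, ", " k ",") && /installed/ { found = 1 }
                   END { exit !found }'
}
```

For example: `sudo dkms status | dkms_has_kernel "$(uname -r)" || echo "no module built for $(uname -r)"`.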
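Run the following commands to build the driver module for the current kernel (this requires the matching kernel headers), regenerate the initramfs, and reboot:
sudo dkms autoinstall
sudo update-initramfs -k all -u
sudo reboot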
If it still does not work, try to nuke and reinstall the driver:
sudo apt purge '*nvidia*'
sudo apt install nvidia-kernel-dkms nvidia-driver firmware-misc-nonfree
sudo reboot
The following error is displayed while starting up the container:
Error: setting up CDI devices: failed to inject devices: failed to stat CDI host device "/dev/nvidia-uvm": no such file or directory
Apparently, some versions of the Nvidia driver don't automatically create the /dev/nvidia-uvm device node after booting.
To fix this, create it by running the nvidia-smi command once after each boot, before starting any containers:
nvidia-smi
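To avoid running `nvidia-smi` when the device node already exists, the check can be wrapped in a small function. This is a sketch only: the function name `ensure_nvidia_uvm` is hypothetical, and it assumes `nvidia-smi` is on the `PATH` and creates the node as described above.

```shell
# Succeeds if the device node exists, running nvidia-smi once if it is missing.
# $1 is the device path (defaults to /dev/nvidia-uvm).
ensure_nvidia_uvm() {
    dev="${1:-/dev/nvidia-uvm}"
    # -c tests whether the path is a character device
    [ -c "$dev" ] && return 0
    nvidia-smi >/dev/null 2>&1 || return 1
    [ -c "$dev" ]
}
```

For example: `ensure_nvidia_uvm || echo "/dev/nvidia-uvm is still missing"`.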
[Alternative solution without nvidia-smi]
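The /dev/nvidia-uvm device node can also be created without nvidia-smi using the following commands:
# Load the module so its major number appears in /proc/devices
sudo modprobe nvidia-uvm
DEVICE_NUMBER="$(awk '$2 == "nvidia-uvm" {print $1}' /proc/devices)"
sudo mknod -m 666 /dev/nvidia-uvm c "$DEVICE_NUMBER" 0
# nvidia-uvm-tools shares the major number but uses minor number 1
sudo mknod -m 666 /dev/nvidia-uvm-tools c "$DEVICE_NUMBER" 1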
To test, run the following command:
podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi -L
If the GPU driver as well as the container toolkit are correctly installed, the output looks like this:
GPU 0: Quadro P600 (UUID: GPU-AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE)
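For scripted checks, the output can be reduced to a GPU count. This is a minimal sketch: the function name `count_gpus` is hypothetical, and it assumes each GPU appears on its own line starting with `GPU <index>:`, as in the sample output above.

```shell
# Prints the number of GPUs listed in `nvidia-smi -L` output read from stdin.
# Assumption: each GPU is printed on its own line as "GPU <index>: ...".
count_gpus() {
    # grep -c exits non-zero when the count is 0, so mask its exit status
    grep -c '^GPU [0-9][0-9]*:' || true
}
```

For example: `podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi -L | count_gpus`.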