Anonymous 15:12

Rootless podman-compose Ollama container can't access NVIDIA GPU on Ubuntu 24.04, but native Ollama works

I'm setting up a local AI stack with Ollama and Open WebUI using rootless podman-compose on Ubuntu 24.04. Both containers share a network, but when I started testing, I noticed the Ollama container was falling back to CPU inference — even though native Ollama on the host accesses the RTX 4060 Laptop GPU just fine.

My setup:

  • Ubuntu 24.04
  • NVIDIA GeForce RTX 4060 Laptop GPU, driver 595.71.05
  • Podman 4.9.3
  • podman-compose 1.5.0 (pip)
  • nvidia-container-toolkit installed, CDI spec at /etc/cdi/nvidia.yaml

What I have checked so far:

# Checked the CDI devices
nvidia-ctk cdi list
: '
INFO[0000] Found 3 CDI devices                          
nvidia.com/gpu=0
nvidia.com/gpu=GPU-3cb61464-c22f-067b-dcb8-ea32dda1197d
nvidia.com/gpu=all
'

# Native Ollama on the host responds normally
ollama run ministral-3:8b "Hi"
# Hello! How can I assist you today? 😊

# Checks if CDI spec exists
head -20 /etc/cdi/nvidia.yaml

: '

cdiVersion: 0.7.0
kind: nvidia.com/gpu
devices:
    - name: "0"
      containerEdits:
        deviceNodes:
            - path: /dev/nvidia0
              major: 195
              fileMode: 438
              permissions: rwm
            - path: /dev/dri/card0
              major: 226
              fileMode: 432
              permissions: rwm
              gid: 44
            - path: /dev/dri/renderD129
              major: 226
              minor: 129
              fileMode: 432
'

Bare podman run

podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi -L
# Error: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all

podman-compose run

podman-compose -f compose.yaml --profile gpu-podman up -d ollama-gpu-podman
# Error: unable to start container "5f5fd11ffa125aef013245690ba61e088460aee7a22c562af7326e12a43d1fec": setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all

My compose.yaml for the Ollama service:

services:
  ollama-gpu-podman:
    image: ollama/ollama:latest
    container_name: ollama-container
    profiles:
      - gpu-podman
    restart: unless-stopped
    networks:
      - ollama-network
    devices:
      - nvidia.com/gpu=all
    security_opt:
      - label=disable

How do I fix it?



Top Answer/Comment:

The issue appears to be two-fold.

1. The CDI spec in /etc/cdi was generated by root, and my rootless user may not have access to it, so I generated a per-user copy:

# generate the nvidia.yaml locally
nvidia-ctk cdi generate --output=$HOME/.config/cdi/nvidia.yaml
: '
WARN[0000] Ignoring error in locating libnvidia-sandboxutils.so.1: libnvidia-sandboxutils.so.1: not found
libnvidia-sandboxutils.so.1: not found 
INFO[0002] Selecting /usr/share/vulkan/icd.d/nvidia_icd.json as /etc/vulkan/icd.d/nvidia_icd.json 

...

WARN[0002] Could not locate vulkan/icd.d/nvidia_layers.json: vulkan/icd.d/nvidia_layers.json: not found
vulkan/icd.d/nvidia_layers.json: not found 
INFO[0002] Selecting /usr/share/vulkan/implicit_layer.d/nvidia_layers.json as /etc/vulkan/implicit_layer.d/nvidia_layers.json 
WARN[0002] Could not locate vulkan/icd.d/nvidia_icd.x86_64.json: vulkan/icd.d/nvidia_icd.x86_64.json: not found
vulkan/icd.d/nvidia_icd.x86_64.json: not found 
INFO[0002] Selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.595.71.05 as /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.595.71.05 
INFO[0002] Selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.595.71.05 as /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.595.71.05 
INFO[0002] Selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.595.71.05 as /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.595.71.05 
INFO[0002] Selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.595.71.05 as /usr/
INFO[0002] Generated CDI spec with version 0.7.0 
'
2. Podman only searches the default CDI directories, so it could not find the per-user spec. I added my CDI directory to containers.conf:

    # Add the per-user CDI directory to containers.conf.
    # This appends a new [engine] table; if the file already has one,
    # add the cdi_spec_dirs key to the existing table instead.
    mkdir -p ~/.config/containers
    echo -e "\n[engine]\ncdi_spec_dirs = [\"$HOME/.config/cdi\", \"/etc/cdi\"]" \
      >> ~/.config/containers/containers.conf

3. Finally, restart the computer:

    # After the restart, the NVIDIA GPU is accessible through Podman
    podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi -L
    # GPU 0: NVIDIA GeForce RTX 4060 Laptop GPU (UUID: GPU-3cb61464-c22f-067b-dcb8-ea32dda3297d)
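
The containers.conf step above appends unconditionally, so running it twice would leave two [engine] tables in the file. Here is a minimal sketch of a repeatable version. The function name ensure_cdi_spec_dirs and its three status words are hypothetical (they are not part of Podman), and the sketch assumes cdi_spec_dirs belongs in the [engine] table, as in the answer:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Podman): add cdi_spec_dirs to a
# containers.conf file only when the key is missing.
# Prints "added", "unchanged", or "manual".
ensure_cdi_spec_dirs() {
  local conf="$1" user_dir="$2"
  mkdir -p "$(dirname "$conf")"
  touch "$conf"

  # Key already configured (commented-out lines do not count): leave the file alone.
  if grep -Eq '^[[:space:]]*cdi_spec_dirs[[:space:]]*=' "$conf"; then
    echo "unchanged"
    return 0
  fi

  # An [engine] table exists without the key: appending a second [engine]
  # header would be invalid TOML, so ask for a manual edit instead.
  if grep -Eq '^[[:space:]]*\[engine\]' "$conf"; then
    echo "manual"
    return 1
  fi

  printf '\n[engine]\ncdi_spec_dirs = ["%s", "/etc/cdi"]\n' "$user_dir" >> "$conf"
  echo "added"
}

# Example call (commented out so that sourcing this file changes nothing):
# ensure_cdi_spec_dirs "$HOME/.config/containers/containers.conf" "$HOME/.config/cdi"
```

If it prints "manual", the file already has an [engine] table, and the cdi_spec_dirs line should be added under that table by hand.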
    

You will most likely need to rerun nvidia-ctk cdi generate for the per-user spec every time you update the NVIDIA drivers.
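
One way to notice a stale spec after a driver update is to check whether the per-user file still mentions the running driver version. The sketch below assumes the generated spec contains versioned library filenames such as libEGL_nvidia.so.595.71.05 (the generate log above suggests it does, but I have not verified this for every driver). The function names are hypothetical, and the match is a plain substring test rather than a strict version comparison:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the spec file mentions a library
# whose filename ends in ".so.<driver version>".
cdi_spec_matches_driver() {
  local spec="$1" version="$2"
  [ -n "$version" ] && [ -r "$spec" ] && grep -qF -- ".so.${version}" "$spec"
}

# Hypothetical wrapper: regenerate the per-user spec only when it looks stale.
# Needs nvidia-smi and nvidia-ctk on the host, so it is not called here.
refresh_user_cdi_spec() {
  local spec="$HOME/.config/cdi/nvidia.yaml" version
  version="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
  if cdi_spec_matches_driver "$spec" "$version"; then
    echo "CDI spec already matches driver $version"
  else
    mkdir -p "$(dirname "$spec")"
    nvidia-ctk cdi generate --output="$spec"
  fi
}
```

Running refresh_user_cdi_spec after each driver upgrade (for example from a login script) would keep the per-user spec current without regenerating it on every call.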
