I'm setting up a local AI stack with Ollama and Open WebUI using rootless podman-compose on Ubuntu 24.04. Both containers share a network, but when I started testing, I noticed the Ollama container was falling back to CPU inference — even though native Ollama on the host accesses the RTX 4060 Laptop GPU just fine.
My setup:
- Ubuntu 24.04
- NVIDIA GeForce RTX 4060 Laptop GPU, driver 595.71.05
- Podman 4.9.3
- podman-compose 1.5.0 (pip)
- nvidia-container-toolkit installed, CDI spec at /etc/cdi/nvidia.yaml
What I've checked so far:
# List the CDI devices
nvidia-ctk cdi list
: '
INFO[0000] Found 3 CDI devices
nvidia.com/gpu=0
nvidia.com/gpu=GPU-3cb61464-c22f-067b-dcb8-ea32dda1197d
nvidia.com/gpu=all
'
# Native Ollama on the host responds
ollama run ministral-3:8b "Hi"
# Hello! How can I assist you today? 😊
# Check that the CDI spec exists
head -20 /etc/cdi/nvidia.yaml
: '
cdiVersion: 0.7.0
kind: nvidia.com/gpu
devices:
  - name: "0"
    containerEdits:
      deviceNodes:
        - path: /dev/nvidia0
          major: 195
          fileMode: 438
          permissions: rwm
        - path: /dev/dri/card0
          major: 226
          fileMode: 432
          permissions: rwm
          gid: 44
        - path: /dev/dri/renderD129
          major: 226
          minor: 129
          fileMode: 432
'
# Bare podman run
podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi -L
# Error: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all
# podman-compose run
podman-compose -f compose.yaml --profile gpu-podman up -d ollama-gpu-podman
# Error: unable to start container "5f5fd11ffa125aef013245690ba61e088460aee7a22c562af7326e12a43d1fec": setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all
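Bare podman and podman-compose fail with the same message, so the compose file is probably not the cause. Two more things seem worth checking on the spec itself: whether my rootless user can read it, and which cdiVersion it declares. Below is a minimal sketch of those checks, assuming the spec stays at /etc/cdi/nvidia.yaml. The helper names cdi_spec_version and cdi_spec_readable are hypothetical, written for this post and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print the cdiVersion declared in a CDI spec file.
cdi_spec_version() {
    awk -F': *' '$1 == "cdiVersion" { gsub(/"/, "", $2); print $2; exit }' "$1"
}

# Hypothetical helper: report whether the current (rootless) user can read the spec.
cdi_spec_readable() {
    if [ -r "$1" ]; then
        echo "readable"
    else
        echo "not readable"
    fi
}

spec=/etc/cdi/nvidia.yaml   # path from the setup above
if [ -e "$spec" ]; then
    ls -l "$spec"
    echo "spec is $(cdi_spec_readable "$spec") by $(id -un)"
    echo "cdiVersion: $(cdi_spec_version "$spec")"
fi
```

Whether Podman 4.9.3 accepts a spec declaring cdiVersion 0.7.0 is an assumption I have not verified; the output above only shows what the spec declares, not what Podman supports.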
My compose.yaml for the Ollama service:
services:
  ollama-gpu-podman:
    image: ollama/ollama:latest
    container_name: ollama-container
    profiles:
      - gpu-podman
    restart: unless-stopped
    networks:
      - ollama-network
    devices:
      - nvidia.com/gpu=all
    security_opt:
      - label=disable
How do I get rootless Podman to resolve nvidia.com/gpu=all so the Ollama container can use the GPU?