@joaocc
Created February 16, 2026 14:02
AMD ROCm audit on Linux
# Replace <ctr> with container name
docker inspect <ctr> | jq '.[0].HostConfig.Devices, .[0].HostConfig.GroupAdd, .[0].HostConfig.Runtime, .[0].Config.Env'
docker logs --tail=200 <ctr>
# 1) Hardware + kernel baseline
lscpu
uname -r
lsb_release -a
# 2) Boot/kernel args actually in effect
cat /proc/cmdline
grep -E '^GRUB_CMDLINE_LINUX(_DEFAULT)?=' /etc/default/grub
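# Optional sketch: print only the GPU-related arguments from the running kernel
# command line, one per line, so they are easy to compare with /etc/default/grub.
# list_gpu_args is a hypothetical helper name, and the prefixes it filters on
# (ttm., amdgpu., amd_iommu=, iommu=) are an assumption about which arguments matter here.

```shell
# list_gpu_args CMDLINE: print ttm.*, amdgpu.* and IOMMU arguments, one per line.
# Returns non-zero (grep's status) when nothing matches.
list_gpu_args() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^(ttm\.|amdgpu\.|amd_iommu=|iommu=)'
}

if [ -r /proc/cmdline ]; then
    list_gpu_args "$(cat /proc/cmdline)" || echo "no ttm/amdgpu/iommu args on the kernel command line"
fi
```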
# 3) TTM/AMDGPU runtime knobs currently active
for p in /sys/module/ttm/parameters/pages_limit \
         /sys/module/ttm/parameters/page_pool_size \
         /sys/module/amdgpu/parameters/gttsize; do
    [ -e "$p" ] && echo "$p=$(cat "$p")" || echo "$p=not-present"
done
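# Optional sketch: the TTM values above are raw numbers, so converting them helps
# when reading them. This assumes pages_limit is a count of memory pages and uses
# the system page size from getconf. pages_to_mib is a hypothetical helper name.

```shell
# pages_to_mib PAGES [PAGE_SIZE_BYTES]: print the equivalent size in MiB.
# PAGE_SIZE_BYTES defaults to the system page size.
pages_to_mib() {
    pages=$1
    page_size=${2:-$(getconf PAGESIZE)}
    echo $(( pages * page_size / 1024 / 1024 ))
}

limit_file=/sys/module/ttm/parameters/pages_limit
if [ -r "$limit_file" ]; then
    echo "ttm pages_limit = $(pages_to_mib "$(cat "$limit_file")") MiB"
fi
```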
# 4) ROCm + GPU detection
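# rg is ripgrep; grep -iE accepts the same patterns if rg is not installed
if command -v rocminfo >/dev/null 2>&1; then rocminfo | rg -i "Name:|gfx|Marketing Name"; else echo "rocminfo not installed"; fi
if command -v rocm-smi >/dev/null 2>&1; then rocm-smi; else echo "rocm-smi not installed"; fi
if command -v clinfo >/dev/null 2>&1; then clinfo | rg -i "platform|device|gfx|amd" || true; else echo "clinfo not installed"; fi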
# 5) Driver/modules/log hints
lsmod | rg -i "amdgpu|kfd|ttm"
# dmesg may need sudo on systems where kernel.dmesg_restrict=1
dmesg | rg -i "amdgpu|kfd|ttm|gttsize|pages_limit|iommu" | tail -n 200
# 6) Permissions/runtime prerequisites
id
ls -l /dev/kfd /dev/dri/render* 2>/dev/null
groups "$USER"
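# Optional sketch: check the group list for the groups that usually own /dev/kfd
# and /dev/dri/render*. The names render and video are an assumption (they are the
# usual ones on common distributions; compare with the ls -l output above).
# in_group is a hypothetical helper name.

```shell
# in_group GROUP [GROUP_LIST]: succeed if GROUP appears as a whole word in the
# space-separated GROUP_LIST (defaults to the current user's groups from id -nG).
in_group() {
    group=$1
    list=${2-$(id -nG)}
    case " $list " in
        *" $group "*) return 0 ;;
        *) return 1 ;;
    esac
}

for g in render video; do
    if in_group "$g"; then echo "$g: ok"; else echo "$g: MISSING"; fi
done
```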
# 7) Container runtime (if vLLM runs in Docker/Podman)
docker ps --format "table {{.Names}}\t{{.Image}}\t{{.Status}}" 2>/dev/null || true
podman ps --format "table {{.Names}}\t{{.Image}}\t{{.Status}}" 2>/dev/null || true