Fix: AMD XDNA NPU (Strix Halo) on CachyOS — aie2_check_protocol: Incompatible firmware protocol major 7 minor 2
Hardware: HP ZBook Ultra G1a — AMD Ryzen AI MAX+ PRO 395 (Strix Halo)
NPU PCI ID: 1022:17f0 rev 11 (amdnpu/17f0_11/)
OS: CachyOS (Arch-based)
Kernel: 6.19.6-2-cachyos
Symptom: NPU fails to probe on every boot
amdxdna 0000:c4:00.1: [drm] *ERROR* aie2_check_protocol: Incompatible firmware protocol major 7 minor 2
amdxdna 0000:c4:00.1: [drm] *ERROR* aie2_hw_start: firmware is not alive
amdxdna 0000:c4:00.1: probe with driver amdxdna failed with error -22
Two layered problems:
1. In-kernel amdxdna module speaks protocol 6; current firmware speaks protocol 7.
The linux-firmware-other package (≥ 20260221) ships amdnpu/17f0_11/npu.sbin.1.1.2.65, which is a protocol 7 firmware blob. The in-kernel amdxdna driver in Linux 6.19.x still has protocol_major = 6 hardcoded for npu5 (Strix Halo). Mismatch → immediate probe failure.
2. The out-of-tree DKMS driver expects npu_7.sbin but linux-firmware-other doesn't ship it.
amdxdna-dkms 7.0 (AMD's upstream out-of-tree driver) supports protocol 7 and tries to load npu_7.sbin first, falling back to npu.sbin. Since npu_7.sbin doesn't exist in the firmware package, it falls back to npu.sbin — the same protocol 7 blob the in-kernel driver already rejected — and then fails at SMU init (aie2_smu_init: Access power failed).
Additionally, if the in-kernel module loads at boot and fails, it corrupts the NPU SMU state. Any subsequent modprobe in the same session will also fail.
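The failure modes described above can be told apart by the messages they leave in the kernel log. The following is a minimal sketch, not part of the original fix: the function name `classify_amdxdna_log` and its output labels are hypothetical, and the patterns are simply the message fragments quoted in this document.

```shell
# Classify amdxdna kernel log text read from stdin.
# Hypothetical helper; the matched strings are the messages quoted in this guide.
# Error patterns are checked first, so any failure in the log wins over a success line.
classify_amdxdna_log() {
    log="$(cat)"
    case "$log" in
        *'aie2_check_protocol: Incompatible firmware protocol'*)
            echo "protocol-mismatch" ;;   # problem 1: driver and firmware disagree
        *'aie2_smu_init: Access power failed'*)
            echo "smu-failure" ;;         # problem 2, or SMU state left bad by an earlier probe
        *'probe with driver amdxdna failed'*)
            echo "probe-failed" ;;        # some other probe error
        *'Initialized amdxdna_accel_driver'*)
            echo "ok" ;;
        *)
            echo "no-amdxdna-lines" ;;
    esac
}
```

On a live system it would be fed from the kernel log, for example `sudo dmesg | classify_amdxdna_log`.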
paru -S amdxdna-dkms
The DKMS driver selects firmware by trying npu_7.sbin before npu.sbin. The current firmware blob (npu.sbin.1.1.2.65) is the protocol 7 blob; it just isn't named so that the driver finds it first.
sudo ln -s npu.sbin.1.1.2.65.zst /usr/lib/firmware/amdnpu/17f0_11/npu_7.sbin.zst
# Verify
ls -la /usr/lib/firmware/amdnpu/17f0_11/
Expected output:
npu.sbin.1.1.2.65.zst
npu.sbin.zst -> npu.sbin.1.1.2.65.zst
npu_7.sbin.zst -> npu.sbin.1.1.2.65.zst
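Reading the `ls` output by eye works, but a dangling link (for example after a firmware update renames the versioned blob) is easy to miss. The sketch below checks that case explicitly. The function name `check_npu7_link` and its status words are hypothetical; the default directory is the one used in this guide.

```shell
# Report the state of the npu_7.sbin.zst entry in a firmware directory.
# Hypothetical helper; pass a directory, or rely on the Strix Halo default.
check_npu7_link() {
    fw_dir="${1:-/usr/lib/firmware/amdnpu/17f0_11}"
    link="$fw_dir/npu_7.sbin.zst"
    if [ -L "$link" ] && [ -e "$link" ]; then
        # Symlink whose target exists
        echo "ok -> $(readlink "$link")"
    elif [ -L "$link" ]; then
        # Symlink whose target is gone
        echo "dangling"
    elif [ -e "$link" ]; then
        # A real file, e.g. shipped by a package
        echo "regular-file"
    else
        echo "missing"
    fi
}
```

Running `check_npu7_link` with no argument inspects the 17f0_11 directory; anything other than `ok -> ...` or `regular-file` means the DKMS driver will not find a protocol 7 blob under that name.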
The in-kernel module must not probe before the DKMS module. If it does, it corrupts the NPU SMU state and all subsequent probes fail even with the correct driver.
echo 'install amdxdna /sbin/modprobe --ignore-install amdxdna $CMDLINE_OPTS' | sudo tee /etc/modprobe.d/amdxdna-dkms.conf
CachyOS uses Limine, not standard mkinitcpio presets:
sudo limine-mkinitcpio
sudo reboot
sudo dmesg | grep -i amdxdna | head -10
ls /dev/accel/
dkms status | grep amdxdna
Expected dmesg (clean, no errors):
amdxdna 0000:c4:00.1: [drm] Load firmware amdnpu/17f0_11/npu_7.sbin
amdxdna 0000:c4:00.1: enabling device (0000 -> 0002)
[drm] Initialized amdxdna_accel_driver 0.6.0 for 0000:c4:00.1 on minor 0
Expected /dev/accel/:
crw-rw-rw- 261,0 root accel0
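For use in a login script or a health check, the `/dev/accel/` test can be turned into something with an exit status. This is a small sketch under the assumption that a successful probe always creates at least one `accel*` node, as in the listing above; the function name `accel_nodes` is hypothetical.

```shell
# Print the accel device nodes found in a directory (default /dev/accel).
# Returns 0 if at least one node exists, 1 otherwise.
accel_nodes() {
    dev_dir="${1:-/dev/accel}"
    found=1
    for node in "$dev_dir"/accel*; do
        # An unmatched glob stays literal, so skip entries that do not exist
        [ -e "$node" ] || continue
        echo "${node##*/}"
        found=0
    done
    return "$found"
}
```

For example, `accel_nodes || echo "NPU did not probe"` prints the warning only when no node is present.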
On linux-firmware-other updates:
Check if npu_7.sbin.zst now ships in the package:
pacman -Ql linux-firmware-other | grep 17f0_11
If it appears, remove your manual symlink:
sudo rm /usr/lib/firmware/amdnpu/17f0_11/npu_7.sbin.zst
On kernel updates:
DKMS rebuilds automatically. Verify with:
dkms status | grep amdxdna
If a build failed:
sudo dkms autoinstall
Note: The symlink at /usr/lib/firmware/amdnpu/17f0_11/npu_7.sbin.zst is unmanaged by pacman. It survives firmware package updates but must be manually removed once the package ships the file natively.
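The "does the package ship it yet" check can be scripted so it is easy to rerun after each firmware update. A minimal sketch follows; the function name `package_ships_npu7` is hypothetical, and it assumes the usual `pacman -Ql` output format of one `<package> <path>` pair per line.

```shell
# Read `pacman -Ql <package>` output on stdin and succeed if it lists
# npu_7.sbin.zst for the 17f0_11 device. Hypothetical helper.
package_ships_npu7() {
    grep -q '/amdnpu/17f0_11/npu_7\.sbin\.zst$'
}
```

A possible use: `pacman -Ql linux-firmware-other | package_ships_npu7 && echo "package now ships npu_7.sbin.zst; remove the manual symlink"`.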
Any AMD Strix Halo NPU (1022:17f0 rev 11) on Arch-based distros with:
- linux-firmware-other >= 1:20260221-1
- Linux kernel <= 6.19.x (in-kernel amdxdna module)
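The kernel half of that condition can be checked from `uname -r`. This is only a sketch: the function name `kernel_in_affected_range` is hypothetical, it tests nothing but the 6.19 upper bound stated above, and it does not check whether the running kernel actually includes an in-kernel amdxdna module.

```shell
# Succeed if a kernel release string is at or below the 6.19.x series.
# Defaults to the running kernel. Hypothetical helper; upper bound only.
kernel_in_affected_range() {
    release="${1:-$(uname -r)}"
    major="${release%%.*}"        # "6.19.6-2-cachyos" -> "6"
    rest="${release#*.}"          # -> "19.6-2-cachyos"
    minor="${rest%%[!0-9]*}"      # -> "19"
    [ "$major" -lt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -le 19 ]; }
}
```

For example, `kernel_in_affected_range && echo "in-kernel amdxdna is likely still on protocol 6"`.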
Other Strix Point variants (rev 10 → 17f0_10/) likely need the same symlink fix:
sudo ln -s npu.sbin.<version>.zst /usr/lib/firmware/amdnpu/17f0_10/npu_7.sbin.zst
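To avoid typing the version by hand, the target can be read from the existing `npu.sbin.zst` link. The sketch below only prints the `ln` command rather than running it. It assumes the other revision directories follow the same layout as 17f0_11 (an `npu.sbin.zst` symlink pointing at the versioned blob), which has not been verified here; the function name `suggest_npu7_link` is hypothetical.

```shell
# Print the ln command that would add npu_7.sbin.zst to a firmware directory.
# Hypothetical helper; it never modifies the filesystem itself.
suggest_npu7_link() {
    fw_dir="$1"
    link="$fw_dir/npu_7.sbin.zst"
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "# $link already exists, nothing to do"
        return 0
    fi
    # Assumption: npu.sbin.zst is a symlink to the versioned blob
    target="$(readlink "$fw_dir/npu.sbin.zst")" || {
        echo "# $fw_dir/npu.sbin.zst is not a symlink; pick the blob by hand" >&2
        return 1
    }
    echo "sudo ln -s $target $link"
}
```

Usage would look like `suggest_npu7_link /usr/lib/firmware/amdnpu/17f0_10`; review the printed command before running it.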