@kinncj
Created March 7, 2026 02:24
Fix: AMD XDNA NPU (Strix Halo) on CachyOS — `aie2_check_protocol: Incompatible firmware protocol major 7 minor 2`


Hardware: HP ZBook Ultra G1a — AMD Ryzen AI MAX+ PRO 395 (Strix Halo)
NPU PCI ID: 1022:17f0 rev 11 (amdnpu/17f0_11/)
OS: CachyOS (Arch-based)
Kernel: 6.19.6-2-cachyos
Symptom: NPU fails to probe on every boot


Symptom

amdxdna 0000:c4:00.1: [drm] *ERROR* aie2_check_protocol: Incompatible firmware protocol major 7 minor 2
amdxdna 0000:c4:00.1: [drm] *ERROR* aie2_hw_start: firmware is not alive
amdxdna 0000:c4:00.1: probe with driver amdxdna failed with error -22
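
To tell this failure apart from other amdxdna problems, the kernel log can be matched against the messages quoted in this guide. The following is a minimal sketch, not part of the driver or any package: the function name `classify_amdxdna_log` and its output labels are made up for illustration, and the patterns are only the strings shown in this document.

```shell
# Classify kernel log text read from stdin.
# Prints one of: protocol-mismatch, smu-failure, probe-failed, ok, not-loaded
# (hypothetical labels; the patterns are the messages quoted in this guide).
classify_amdxdna_log() {
    _log="$(cat)"
    if printf '%s\n' "$_log" | grep -q 'aie2_check_protocol: Incompatible firmware protocol'; then
        echo "protocol-mismatch"
    elif printf '%s\n' "$_log" | grep -q 'aie2_smu_init: Access power failed'; then
        echo "smu-failure"
    elif printf '%s\n' "$_log" | grep -q 'probe with driver amdxdna failed'; then
        echo "probe-failed"
    elif printf '%s\n' "$_log" | grep -qi 'amdxdna'; then
        # amdxdna lines are present and none of the known errors matched.
        echo "ok"
    else
        echo "not-loaded"
    fi
}
```

On a live system this would be fed from the kernel log, for example `sudo dmesg | classify_amdxdna_log`.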

Root Cause

Two layered problems:

1. In-kernel amdxdna module speaks protocol 6; current firmware speaks protocol 7.

The linux-firmware-other package (≥ 20260221) ships amdnpu/17f0_11/npu.sbin.1.1.2.65, which is a protocol 7 firmware blob. The in-kernel amdxdna driver in Linux 6.19.x still has protocol_major = 6 hardcoded for npu5 (Strix Halo). Mismatch → immediate probe failure.

2. The out-of-tree DKMS driver expects npu_7.sbin but linux-firmware-other doesn't ship it.

amdxdna-dkms 7.0 (AMD's upstream out-of-tree driver) supports protocol 7 and tries to load npu_7.sbin first, falling back to npu.sbin. Since npu_7.sbin doesn't exist in the firmware package, it falls back to npu.sbin — the same protocol 7 blob the in-kernel driver already rejected — and then fails at SMU init (aie2_smu_init: Access power failed).

Additionally, if the in-kernel module loads at boot and fails, it leaves the NPU's SMU state corrupted, and any subsequent modprobe fails until the next reboot.


Fix

Step 1 — Install the out-of-tree DKMS driver

paru -S amdxdna-dkms

Step 2 — Create the missing npu_7.sbin symlink

The DKMS driver selects firmware by trying npu_7.sbin before npu.sbin. The current firmware blob (npu.sbin.1.1.2.65) is the protocol 7 blob — it just isn't named correctly for the driver to find it first.

sudo ln -s npu.sbin.1.1.2.65.zst /usr/lib/firmware/amdnpu/17f0_11/npu_7.sbin.zst

# Verify
ls -la /usr/lib/firmware/amdnpu/17f0_11/

Expected output:

npu.sbin.1.1.2.65.zst
npu.sbin.zst -> npu.sbin.1.1.2.65.zst
npu_7.sbin.zst -> npu.sbin.1.1.2.65.zst
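
If the symlink step is scripted (for example across several machines), it helps to make it safe to rerun. The sketch below assumes the directory layout shown above; `link_npu7` is a hypothetical helper name, and both the firmware directory and the versioned blob name are passed in rather than guessed.

```shell
# Create npu_7.sbin.zst next to the versioned blob, but only if it is missing.
# $1: firmware directory, e.g. /usr/lib/firmware/amdnpu/17f0_11
# $2: versioned blob file name, e.g. npu.sbin.1.1.2.65.zst
link_npu7() {
    fw_dir="$1"
    blob="$2"
    link="$fw_dir/npu_7.sbin.zst"

    if [ ! -f "$fw_dir/$blob" ]; then
        echo "missing blob: $fw_dir/$blob" >&2
        return 1
    fi
    # -e follows symlinks and -L catches dangling ones,
    # so any existing entry is left untouched.
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "exists: $link"
        return 0
    fi
    # Relative target, matching the packaged npu.sbin.zst link.
    ln -s "$blob" "$link" && echo "created: $link -> $blob"
}
```

Run as root, the call for the hardware in this guide would be `link_npu7 /usr/lib/firmware/amdnpu/17f0_11 npu.sbin.1.1.2.65.zst`. Because the directory is a parameter, the same helper would cover another revision directory.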

Step 3 — Prevent the in-kernel module from loading at boot

The in-kernel module must not probe before the DKMS module. If it does, it corrupts the NPU SMU state and all subsequent probes fail even with the correct driver.

echo 'install amdxdna /sbin/modprobe --ignore-install amdxdna $CMDLINE_OPTS' | sudo tee /etc/modprobe.d/amdxdna-dkms.conf
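
To confirm which build of the module modprobe will actually pick up, the path reported by `modinfo -n amdxdna` can be inspected. This is a rough sketch under two assumptions: that DKMS installs its modules under `updates/dkms/` (the usual location on Arch-based systems) and that the in-kernel driver lives under `kernel/drivers/accel/amdxdna/`. The helper name `module_origin` is hypothetical.

```shell
# Classify a kernel module file path as "dkms", "in-tree", or "unknown".
# Assumed layout: DKMS builds land in .../updates/dkms/, the in-kernel
# driver in .../kernel/drivers/accel/amdxdna/.
module_origin() {
    case "$1" in
        */updates/dkms/*) echo "dkms" ;;
        */kernel/drivers/accel/amdxdna/*) echo "in-tree" ;;
        *) echo "unknown" ;;
    esac
}
```

Usage: `module_origin "$(modinfo -n amdxdna)"`. A result of `in-tree` would suggest the DKMS build is missing for the running kernel.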

Step 4 — Rebuild initramfs

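With CachyOS's Limine boot setup, the initramfs is rebuilt with limine-mkinitcpio instead of the standard mkinitcpio presets (if your install uses a different bootloader, run sudo mkinitcpio -P instead):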

sudo limine-mkinitcpio

Step 5 — Reboot

sudo reboot

Step 6 — Verify

sudo dmesg | grep -i amdxdna | head -10
ls /dev/accel/
dkms status | grep amdxdna

Expected dmesg (clean, no errors):

amdxdna 0000:c4:00.1: [drm] Load firmware amdnpu/17f0_11/npu_7.sbin
amdxdna 0000:c4:00.1: enabling device (0000 -> 0002)
[drm] Initialized amdxdna_accel_driver 0.6.0 for 0000:c4:00.1 on minor 0

Expected /dev/accel/:

crw-rw-rw- 261,0 root accel0
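
For a scripted post-boot check, the presence of an accel node can be tested directly instead of reading the listing by eye. A minimal sketch follows; `has_accel_node` is a hypothetical helper, and it only checks that an `accel*` entry exists, not that the device is usable.

```shell
# Succeeds if the given directory contains at least one accel* entry.
# $1: device directory (defaults to /dev/accel)
has_accel_node() {
    dir="${1:-/dev/accel}"
    for node in "$dir"/accel*; do
        # An unmatched glob stays literal, so confirm the entry really exists.
        [ -e "$node" ] && return 0
    done
    return 1
}
```

Usage: `has_accel_node && echo "NPU device node present" || echo "no accel node"`.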

Ongoing Maintenance

On linux-firmware-other updates:
Check if npu_7.sbin.zst now ships in the package:

pacman -Ql linux-firmware-other | grep 17f0_11

If it appears, remove your manual symlink:

sudo rm /usr/lib/firmware/amdnpu/17f0_11/npu_7.sbin.zst
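
The package check above can be turned into a yes/no test by filtering the `pacman -Ql` listing for the exact file name. This is a sketch only: `package_ships_npu7` is a hypothetical helper, and it assumes the listing has one `<package> <path>` entry per line.

```shell
# Reads a `pacman -Ql <package>` listing on stdin.
# Succeeds if the listing contains npu_7.sbin.zst for the given revision.
# $1: revision directory name, e.g. 17f0_11
package_ships_npu7() {
    grep -q "/amdnpu/$1/npu_7\.sbin\.zst\$"
}
```

Usage: `pacman -Ql linux-firmware-other | package_ships_npu7 17f0_11 && echo "package ships npu_7.sbin.zst"`.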

On kernel updates:
DKMS rebuilds automatically. Verify with:

dkms status | grep amdxdna

If a build failed:

sudo dkms autoinstall
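
After a kernel update, the same status check can be made specific to one kernel release. The sketch below assumes `dkms status` prints lines of the form `<module>/<version>, <kernel>, <arch>: <state>`; the exact format varies between DKMS versions, and `dkms_amdxdna_installed` is a hypothetical helper name.

```shell
# Reads `dkms status` output on stdin.
# Succeeds if an amdxdna line reports "installed" for the given kernel release.
# $1: kernel release, e.g. "$(uname -r)"
dkms_amdxdna_installed() {
    # -F treats the kernel release literally (it contains dots);
    # the surrounding ", " and "," keep 6.19.6-2 from matching 6.19.6-20.
    grep 'amdxdna' | grep -F -- ", $1," | grep -q 'installed'
}
```

Usage: `dkms status | dkms_amdxdna_installed "$(uname -r)" || sudo dkms autoinstall`.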

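Note: The symlink at /usr/lib/firmware/amdnpu/17f0_11/npu_7.sbin.zst is not tracked by pacman, so firmware package updates leave it in place. Once the package ships the file natively, remove the symlink by hand. If it is still present at that point, pacman is likely to stop the upgrade with an "exists in filesystem" file conflict until the symlink is gone.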


Affected Hardware

Any AMD Strix Halo NPU (1022:17f0 rev 11) on Arch-based distros with:

  • linux-firmware-other >= 1:20260221-1
  • Linux kernel <= 6.19.x (in-kernel amdxdna module)

Other Strix Point variants (rev 10, 17f0_10/) likely need the same symlink fix:

sudo ln -s npu.sbin.<version>.zst /usr/lib/firmware/amdnpu/17f0_10/npu_7.sbin.zst