@AntonIXO
Created March 4, 2025 09:31
amdgpu reload dmesg log
[ 570.786475] amdgpu 0000:04:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
[ 570.786483] amdgpu 0000:04:00.0: amdgpu: failed to remove hardware queue from MES, doorbell=0x1004
[ 570.786485] amdgpu 0000:04:00.0: amdgpu: MES might be in unrecoverable state, issue a GPU reset
[ 570.786490] amdgpu 0000:04:00.0: amdgpu: Failed to evict queue 3
[ 570.786509] amdgpu 0000:04:00.0: amdgpu: Failed to evict process queues
[ 570.786513] amdgpu: Failed to quiesce KFD
[ 570.786648] amdgpu 0000:04:00.0: amdgpu: GPU reset begin!
[ 570.786665] amdgpu 0000:04:00.0: amdgpu: remove_all_kfd_queues_mes: Failed to remove queue 2 for dev 33395
[ 570.786719] amdgpu 0000:04:00.0: amdgpu: Dumping IP State
[ 570.787485] amdgpu 0000:04:00.0: amdgpu: Dumping IP State Completed
[ 573.567966] amdgpu 0000:04:00.0: amdgpu: MES failed to respond to msg=SUSPEND
[ 573.567974] [drm:amdgpu_mes_suspend [amdgpu]] *ERROR* failed to suspend all gangs
[ 573.568277] amdgpu 0000:04:00.0: amdgpu: suspend of IP block <mes_v11_0> failed -110
[ 573.568374] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1868974267 wd_nsec: 1868966097
[ 576.278698] amdgpu 0000:04:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
[ 576.278705] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 577.928821] amdgpu 0000:04:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:169 vmid:0 pasid:0)
[ 577.928828] amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[ 577.928832] amdgpu 0000:04:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040B53
[ 577.928834] amdgpu 0000:04:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5)
[ 577.928836] amdgpu 0000:04:00.0: amdgpu: MORE_FAULTS: 0x1
[ 577.928837] amdgpu 0000:04:00.0: amdgpu: WALKER_ERROR: 0x1
[ 577.928839] amdgpu 0000:04:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[ 577.928840] amdgpu 0000:04:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 577.928842] amdgpu 0000:04:00.0: amdgpu: RW: 0x1
[ 577.928855] amdgpu 0000:04:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:153 vmid:0 pasid:0)
[ 577.928858] amdgpu 0000:04:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10
[ 578.981043] amdgpu 0000:04:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
[ 578.981049] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 581.936473] amdgpu 0000:04:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
[ 581.936482] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 583.940887] amdgpu 0000:04:00.0: amdgpu: MES failed to respond to msg=REMOVE_QUEUE
[ 583.940899] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[ 583.941753] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 5865277207 wd_nsec: 5865251588
[ 583.943472] amdgpu 0000:04:00.0: amdgpu: MODE2 reset
[ 583.983092] amdgpu 0000:04:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 583.983765] [drm] PCIE GART of 512M enabled (table at 0x0000008001900000).
[ 583.983912] amdgpu 0000:04:00.0: amdgpu: SMU is resuming...
[ 583.985609] amdgpu 0000:04:00.0: amdgpu: SMU is resumed successfully!
[ 583.992353] [drm] DMUB hardware initialized: version=0x08004B01
[ 584.322671] amdgpu 0000:04:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 584.322677] amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 584.322678] amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 584.322680] amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
[ 584.322681] amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
[ 584.322682] amdgpu 0000:04:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
[ 584.322683] amdgpu 0000:04:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
[ 584.322685] amdgpu 0000:04:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
[ 584.322686] amdgpu 0000:04:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
[ 584.322689] amdgpu 0000:04:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 584.322691] amdgpu 0000:04:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
[ 584.322693] amdgpu 0000:04:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
[ 584.322695] amdgpu 0000:04:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
[ 584.325060] amdgpu 0000:04:00.0: amdgpu: GPU reset(1) succeeded