Many people see these crashes with a particular graphics card series.
The error GCVM_L2_PROTECTION_FAULT_STATUS:0x00501031 is characteristic of this problem.
The error can be observed with dmesg -w or journalctl --dmesg --follow.
[ 7288.620678] amdgpu 0000:2d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:5 pasid:32770, for process Xorg pid 6128 thread Xorg:cs0 pid 6130)
[ 7288.620690] amdgpu 0000:2d:00.0: amdgpu: in page starting at address 0x000080050220c000 from client 0x1b (UTCL2)
[ 7288.620697] amdgpu 0000:2d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00501031
[ 7288.620701] amdgpu 0000:2d:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[ 7288.620705] amdgpu 0000:2d:00.0: amdgpu: MORE_FAULTS: 0x1
[ 7288.620709] amdgpu 0000:2d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 7288.620713] amdgpu 0000:2d:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 7288.620716] amdgpu 0000:2d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 7288.620720] amdgpu 0000:2d:00.0: amdgpu: RW: 0x0
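To check whether a system has already hit this fault, the kernel log can be searched for the status line shown above. This is a minimal sketch; the helper name count_faults is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# count_faults (hypothetical helper): reads kernel log text on stdin and
# prints how many lines contain the fault status register name.
count_faults() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'GCVM_L2_PROTECTION_FAULT_STATUS' || true
}

# Usage on a live system (reading the kernel log may need root):
#   dmesg | count_faults
#   journalctl --dmesg --no-pager | count_faults
```

A result greater than 0 means the page fault has been logged at least once since boot (or, with journalctl, within the journal's retention).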
Change the value of /sys/class/drm/card1/device/power_dpm_force_performance_level.
Depending on the system, card1 could also be card0.
# Show the current performance level
cat /sys/class/drm/card1/device/power_dpm_force_performance_level
# Force the highest performance level (run as root)
echo "high" > /sys/class/drm/card1/device/power_dpm_force_performance_level
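Since the card number differs between systems, it can help to list which card directories actually have the file before writing to it. The sketch below assumes the usual /sys/class/drm layout; the function name find_dpm_files and its optional root argument are made up for illustration.

```shell
#!/bin/sh
# find_dpm_files (hypothetical helper): prints every
# power_dpm_force_performance_level file under the given DRM root,
# together with its current value.
find_dpm_files() {
    root="${1:-/sys/class/drm}"
    for f in "$root"/card*/device/power_dpm_force_performance_level; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        [ -e "$f" ] && printf '%s: %s\n' "$f" "$(cat "$f")"
    done
    return 0
}

find_dpm_files
```

The path printed here is the one to use in the echo command above. The setting is not persistent and has to be applied again after a reboot.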
Change the boot parameters for amdgpu v6.5 with grub and add amdgpu.mcbp=0.
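With grub, the kernel parameter is usually added in /etc/default/grub and the configuration is then regenerated. The following is a sketch under assumptions: the existing option "quiet", the output path /boot/grub/grub.cfg and the helper name has_mcbp_off are examples only, and some distributions use a different command or path.

```shell
#!/bin/sh
# 1. In /etc/default/grub, append the parameter to the existing line, e.g.:
#      GRUB_CMDLINE_LINUX_DEFAULT="quiet amdgpu.mcbp=0"
#    ("quiet" is only a placeholder for whatever is already there.)
# 2. Regenerate the grub configuration as root (path is an assumption):
#      grub-mkconfig -o /boot/grub/grub.cfg
# 3. Reboot, then check that the running kernel received the parameter.

# has_mcbp_off (hypothetical helper): succeeds if the given kernel
# command line contains amdgpu.mcbp=0 as a separate word.
has_mcbp_off() {
    case " $1 " in
        *" amdgpu.mcbp=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_mcbp_off "$(cat /proc/cmdline)"; then
    echo "amdgpu.mcbp=0 is active"
else
    echo "amdgpu.mcbp=0 is not set"
fi
```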
This may make the system more stable and prevent the crashes.