amdgpu is borked for me
Currently not having a good time with my AMD system. :(
It started out with infrequent, random kernel panics out of nowhere, while doing nothing specific, even after just sitting at the login screen for a while.
I updated, no change. Still happened. Unfortunately, the QR codes shown on the kernel panic screens contained no log entries, no info at all aside from the kernel version.
I reinstalled kernels, but afterwards, I had full system freezes instead. Not even TTY or SysRq worked. Had to shut down hard, physically.
I read through all kinds of logs, outputs, and reports, and monitored system performance. Couldn’t find anything; it seemed like nothing relevant was being written to the logs at that point.
Decided to install the LTS kernel. Lasted for almost two hours until the laptop screen completely froze while the external screen worked just fine. Was finally able to find stuff in the logs.
kernel: amdgpu 0000:07:00.0: [drm] *ERROR* flip_done timed out
0010:amdgpu_dm_atomic_commit_tail+0x3934/0x3a10 [amdgpu]
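For anyone who wants to check their own logs for the same message, here is a minimal sketch of a filter that picks out flip_done timeouts from kernel log text. The function name `find_flip_timeouts` and the regular expression are my own illustration, not part of any tool, and the pattern only assumes the line looks like the one I found above (it tolerates spaces around ERROR).

```python
import re
import sys

# Matches lines shaped like the one from my log, for example:
#   kernel: amdgpu 0000:07:00.0: [drm] *ERROR* flip_done timed out
# The PCI address format and message text are assumed from that single line.
FLIP_TIMEOUT = re.compile(
    r"amdgpu (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]):"
    r".*\*\s*ERROR\s*\*\s*flip_done timed out"
)


def find_flip_timeouts(lines):
    """Return (line_number, pci_address) for every flip_done timeout found."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = FLIP_TIMEOUT.search(line)
        if match:
            hits.append((number, match.group("pci")))
    return hits


if __name__ == "__main__":
    # Reads log text from standard input, one log entry per line.
    for number, pci in find_flip_timeouts(sys.stdin):
        print(f"line {number}: flip_done timed out on GPU {pci}")
```

On a system using the systemd journal, the kernel messages of the previous boot could be piped in with something like `journalctl -k -b -1 --no-pager | python3 flipcheck.py`, where `flipcheck.py` is just a hypothetical name for the saved script. It only finds the message, it does not fix anything.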
I found a lot about this online, with reports going back years. It seems to be a kernel bug, specifically in how amdgpu handles atomic commits, that pops up again every couple of years. It causes KWin pageflip issues, freezes the system, and can also cause a panic. The latest issue thread activity was from 3 weeks ago, some even more recent, so I’m not the only one.
I don’t feel confident in any of the temporary fixes offered online, as they just throw stuff at the wall to see what sticks, and none of them work for everyone. I would also like to understand what they do before I apply anything, and I don’t.
So I guess I’ll live with a freeze every couple of hours and just use my other laptop more until this is fixed again. I may try to see if switching between Wayland and Xorg changes anything.
I have always been lucky with updates on that machine, so I guess I was due for this tax.
Published 02 Oct, 2025