public inbox for gentoo-user@lists.gentoo.org
* [gentoo-user] Troubleshooting AMD Radeon Vega system freeze
@ 2025-01-25 19:13 Grant Edwards
  2025-01-25 19:39 ` [gentoo-user] " Grant Edwards
  2025-01-26 17:37 ` [gentoo-user] Troubleshooting AMD Radeon Vega system freeze Daniel Frey
  0 siblings, 2 replies; 11+ messages in thread
From: Grant Edwards @ 2025-01-25 19:13 UTC
  To: gentoo-user

Starting about a week ago, my AMD system (AMD Ryzen 5 3400G with
Radeon Vega Graphics) has been freezing up multiple times per day --
always when in active use with X11. Before that, it had been reliable
since assembled (about 5 years ago).

Ctrl-Alt-Backspace does nothing
Ctrl-Alt-F1 does nothing

The SysRq key does work (I'm able to sync/remount/reboot), but I'm
unable to get the console restored using SysRq keys once it freezes.

When it's not frozen, I can use SysRq to kill X11 and switch back to
the FB console, but I can't figure out how to get console output to
show up: I see the SysRq commands being echoed on the console, but all
output from them is going into the syslog.
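
One possible explanation, offered as an assumption rather than a
diagnosis: the kernel shows a message on the console only when the
message's level is numerically lower than the console loglevel, which
is the first of the four numbers in /proc/sys/kernel/printk. A minimal
Python sketch of that rule (the helper names are made up for
illustration):

```python
from pathlib import Path

# Location of the four printk levels on a running Linux system.
PRINTK_PATH = Path("/proc/sys/kernel/printk")

# Field order as documented for /proc/sys/kernel/printk.
FIELDS = ("console", "default_message", "minimum_console", "default_console")


def parse_printk(text):
    """Turn the four whitespace-separated numbers into a dict."""
    values = [int(v) for v in text.split()]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} values, got {len(values)}")
    return dict(zip(FIELDS, values))


def shown_on_console(levels, message_level):
    """A message reaches the console only if its level is below 'console'."""
    return message_level < levels["console"]
```

Calling parse_printk(PRINTK_PATH.read_text()) on the affected machine
and then shown_on_console(levels, 6) would say whether informational
(level 6) output can appear on the console. If it cannot, Alt-SysRq-9
raises the console loglevel.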

The CPU was not at all heavily loaded, and temps and fan speeds
are not noticeably different than when the machine is idle.

I've run MemTest86+ for 12 hours: no errors.

I've tried dropping back from kernel 6.1.121 to 6.1.118: no change.

Next, I suppose I should start rolling back X11 drivers?

--
Grant





* [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze
  2025-01-25 19:13 [gentoo-user] Troubleshooting AMD Radeon Vega system freeze Grant Edwards
@ 2025-01-25 19:39 ` Grant Edwards
  2025-01-25 21:16   ` Dale
  2025-01-26 17:37 ` [gentoo-user] Troubleshooting AMD Radeon Vega system freeze Daniel Frey
  1 sibling, 1 reply; 11+ messages in thread
From: Grant Edwards @ 2025-01-25 19:39 UTC
  To: gentoo-user

On 2025-01-25, Grant Edwards <grant.b.edwards@gmail.com> wrote:
> Starting about a week ago, my AMD system (AMD Ryzen 5 3400G with
> Radeon Vega Graphics) has been freezing up multiple times per day --
> always when in active use with X11. Before that, it had been reliable
> since assembled (about 5 years ago).
>
> Ctrl-Alt-Backspace does nothing
> Ctrl-Alt-F1 does nothing
>
> The SysRq key does work (I'm able to sync/remount/reboot), but I'm
> unable to get the console restored using SysRq keys once it freezes.

I should have also mentioned that when "frozen" I can ping it, but
can't ssh to it.
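
The ping/ssh split is a useful clue: ICMP echo replies are generated
inside the kernel, while an SSH session needs the user-space sshd to
run. The TCP handshake is also completed by the kernel, so a successful
connect alone proves little; the server's banner line is what shows
that user space is alive. A small Python sketch of that check, where
the function name and the default timeout are illustrative assumptions:

```python
import socket


def read_ssh_banner(host, port=22, timeout=5.0):
    """Return the server's first line (e.g. 'SSH-2.0-...') or None.

    None means the connection failed, timed out, or was closed before
    any banner arrived.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            data = sock.recv(256)
    except OSError:  # covers refused connections and timeouts
        return None
    line = data.split(b"\n", 1)[0].strip()
    return line.decode("ascii", "replace") or None
```

Run from a second machine during a freeze, a None result alongside
working ping would be consistent with the kernel still running while
user-space processes are stuck.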

--
Grant




* Re: [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze
  2025-01-25 19:39 ` [gentoo-user] " Grant Edwards
@ 2025-01-25 21:16   ` Dale
  2025-01-26  3:16     ` Grant Edwards
  0 siblings, 1 reply; 11+ messages in thread
From: Dale @ 2025-01-25 21:16 UTC
  To: gentoo-user

Grant Edwards wrote:
> On 2025-01-25, Grant Edwards <grant.b.edwards@gmail.com> wrote:
>> Starting about a week ago, my AMD system (AMD Ryzen 5 3400G with
>> Radeon Vega Graphics) has been freezing up multiple times per day --
>> always when in active use with X11. Before that, it had been reliable
>> since assembled (about 5 years ago).
>>
>> Ctrl-Alt-Backspace does nothing
>> Ctrl-Alt-F1 does nothing
>>
>> The SysRq key does work (I'm able to sync/remount/reboot), but I'm
>> unable to get the console restored using SysRq keys once it freezes.
> I should have also mentioned that when "frozen" I can ping it, but
> can't ssh to it.
>
> --
> Grant
>
>
>


I've accidentally seen this ages ago and this may not be there anymore. 
Isn't there some sort of output sent to ctrl alt F12?  If so, that may
shed some sort of light.  I think it is the same as messages or
something.  I saw it once when scrolling through the consoles looking
for something else.  I'm not real sure what I saw but looked like kernel
something. 

Dale

:-)  :-) 



* [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze
  2025-01-25 21:16   ` Dale
@ 2025-01-26  3:16     ` Grant Edwards
  2025-01-26  4:25       ` Dale
  2025-01-27 23:03       ` [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze [SOLVED] Grant Edwards
  0 siblings, 2 replies; 11+ messages in thread
From: Grant Edwards @ 2025-01-26  3:16 UTC
  To: gentoo-user

On 2025-01-25, Dale <rdalek1967@gmail.com> wrote:
> Grant Edwards wrote:
>> On 2025-01-25, Grant Edwards <grant.b.edwards@gmail.com> wrote:
>>> Starting about a week ago, my AMD system (AMD Ryzen 5 3400G with
>>> Radeon Vega Graphics) has been freezing up multiple times per day --
>>> always when in active use with X11. Before that, it had been reliable
>>> since assembled (about 5 years ago).
>>>
>>> Ctrl-Alt-Backspace does nothing
>>> Ctrl-Alt-F1 does nothing
>
> I've accidentally seen this ages ago and this may not be there anymore. 
> Isn't there some sort of output sent to ctrl alt F12?  If so, that may
> shed some sort of light.  I think it is the same as messages or
> something.  I saw it once when scrolling through the consoles looking
> for something else.  I'm not real sure what I saw but looked like kernel
> something.

None of the other ctrl-alt-Fx keys work. I don't specifically remember
trying ctrl-alt-F12, but I'll give that a go next time.

I tried rolling back Xorg server, that didn't seem to help.

Google has found reports of recent versions (24.3.x) of mesa being
blamed for problems like this.  I see in the logs that mesa got
upgraded from 24.2.8 to 24.3.3 on the 20th.  I've rolled that back, to
see if that helps.  I thought it started a little earlier than that,
but my memory may be playing tricks on me: it's been a long week.
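
A minimal sketch of that log check, assuming Portage's usual
/var/log/emerge.log line format ("<epoch>:  ::: completed emerge
(N of M) category/package-version to /"). The function name is
illustrative only, and the log path may differ on other setups:

```python
import re
from datetime import datetime, timezone

# Matches completed merges of media-libs/mesa only; the version must
# start with a digit so that packages such as mesa-progs are skipped.
MERGE_RE = re.compile(
    r"^(\d+):\s+::: completed emerge \(\d+ of \d+\) "
    r"media-libs/mesa-(\d\S*) to "
)


def mesa_merges(lines):
    """Return (UTC datetime, version) for each completed mesa merge."""
    merges = []
    for line in lines:
        match = MERGE_RE.match(line)
        if match:
            when = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
            merges.append((when, match.group(2)))
    return merges
```

Feeding it the lines of emerge.log, for example
mesa_merges(open("/var/log/emerge.log")), lists every mesa version
change with its date, which helps pin down whether the freezes really
started after the upgrade.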

I've been looking for a good document with some recommendations on how
to use the various SysRq functions (in what sequence), but all I can
ever find is just lists of the commands with very brief descriptions
of each.




* Re: [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze
  2025-01-26  3:16     ` Grant Edwards
@ 2025-01-26  4:25       ` Dale
  2025-01-27 23:03       ` [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze [SOLVED] Grant Edwards
  1 sibling, 0 replies; 11+ messages in thread
From: Dale @ 2025-01-26  4:25 UTC
  To: gentoo-user

Grant Edwards wrote:
> On 2025-01-25, Dale <rdalek1967@gmail.com> wrote:
>> Grant Edwards wrote:
>>> On 2025-01-25, Grant Edwards <grant.b.edwards@gmail.com> wrote:
>>>> Starting about a week ago, my AMD system (AMD Ryzen 5 3400G with
>>>> Radeon Vega Graphics) has been freezing up multiple times per day --
>>>> always when in active use with X11. Before that, it had been reliable
>>>> since assembled (about 5 years ago).
>>>>
>>>> Ctrl-Alt-Backspace does nothing
>>>> Ctrl-Alt-F1 does nothing
>> I've accidentally seen this ages ago and this may not be there anymore. 
>> Isn't there some sort of output sent to ctrl alt F12?  If so, that may
>> shed some sort of light.  I think it is the same as messages or
>> something.  I saw it once when scrolling through the consoles looking
>> for something else.  I'm not real sure what I saw but looked like kernel
>> something.
> None of the other ctrl-alt-Fx keys work. I don't specifically remember
> trying ctrl-alt-F12, but I'll give that a go next time.
>
> I tried rolling back Xorg server, that didn't seem to help.
>
> Google has found reports of recent versions (24.3.x) of mesa being
> blamed for problems like this.  I see in the logs that mesa got
> upgraded from 24.2.8 to 24.3.3 on the 20th.  I've rolled that back, to
> see if that helps.  I thought it started a little earlier than that,
> but my memory may be playing tricks on me: it's been a long week.
>
> I've been looking for a good document with some recommendations on how
> to use the various SysRq functions (in what sequence), but all I can
> ever find is just lists of the commands with very brief descriptions of
> each.
>
>
>


Here are two links I found.  They should help.

https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html

https://linuxconfig.org/how-to-enable-all-sysrq-functions-on-linux

When I have a system lock up and nothing else works, I remember it
this way.  Reboot Even If System Utterly Broken.  R - E - I - S - U -
B.  Another trick, using the F option.  If something is using a lot of
memory and making your system use swap which makes a system very slow,
just Alt Sysrq F to kill it.  I've done that when Firefox gets really
hungry.  I can't recall if you need to do the R option first for the F
option or not.  It's been a while. 
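
The mnemonic can be written down as a small table. The sketch below is
Python; the action summaries are paraphrased from the kernel's sysrq
documentation and the helper name is hypothetical. It only prints the
sequence and deliberately does not write to /proc/sysrq-trigger:

```python
# Alt-SysRq keys used by the REISUB mnemonic, plus 'f' for the OOM killer.
SYSRQ_ACTIONS = {
    "r": "take the keyboard out of raw mode",
    "e": "send SIGTERM to all processes except init",
    "i": "send SIGKILL to all processes except init",
    "s": "sync all mounted filesystems",
    "u": "remount all filesystems read-only",
    "b": "reboot immediately, without syncing or unmounting",
    "f": "invoke the OOM killer on a memory hog",
}


def expand(sequence):
    """Return (key, action) pairs for a mnemonic such as 'REISUB'."""
    return [(key, SYSRQ_ACTIONS[key]) for key in sequence.lower()]


for key, action in expand("REISUB"):
    print(f"Alt-SysRq-{key}: {action}")
```

The order matters: processes are asked to exit and then killed before
the sync and read-only remount, so that the final reboot finds the
filesystems already flushed.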

Dale

:-)  :-) 



* Re: [gentoo-user] Troubleshooting AMD Radeon Vega system freeze
  2025-01-25 19:13 [gentoo-user] Troubleshooting AMD Radeon Vega system freeze Grant Edwards
  2025-01-25 19:39 ` [gentoo-user] " Grant Edwards
@ 2025-01-26 17:37 ` Daniel Frey
  2025-01-26 20:18   ` [gentoo-user] " Grant Edwards
  1 sibling, 1 reply; 11+ messages in thread
From: Daniel Frey @ 2025-01-26 17:37 UTC
  To: gentoo-user

On 1/25/25 11:13 AM, Grant Edwards wrote:
> Starting about a week ago, my AMD system (AMD Ryzen 5 3400G with
> Radeon Vega Graphics) has been freezing up multiple times per day --
> always when in active use with X11. Before that, it had been reliable
> since assembled (about 5 years ago).
> 
> Ctrl-Alt-Backspace does nothing
> Ctrl-Alt-F1 does nothing
> 
> The SysRq key does work (I'm able to sync/remount/reboot), but I'm
> unable to get the console restored using SysRq keys once it freezes.
> 
> When it's not frozen, I can use SysRq to kill X11 and switch back to
> the FB console, but I can't figure out how to get console output to
> show up: I see the SysRq commands being echoed on the console, but all
> output from them is going into the syslog.
> 
> The CPU was not at all heavily loaded, and temps and fan speeds
> are not noticeably different than when the machine is idle.
> 
> I've run MemTest86+ for 12 hours: no errors.
> 
> I've tried dropping back from kernel 6.1.121 to 6.1.118: no change.
> 
> Next, I suppose I should start rolling back X11 drivers?
> 
> --
> Grant
> 
> 
> 

I had a problem with two of my Ryzen systems that exhibited this 
behaviour. One has a G processor, the other doesn't.

Apparently Ryzen processors have an idle bug that locks up the system in 
this way. The bug manifests randomly when the CPU is idle. For me, if I 
left the PC on overnight it would always be hung up the next morning. It 
would also trigger if I started a long emerge and forgot about it - it 
would idle long enough that it would hang.

I had to update the BIOS on both machines, then change the "Power Supply 
Idle Control" to "Typical Current Idle". Any other setting and the bug 
manifests. Note this setting is for Asus motherboards; I would imagine 
other manufacturers have a similar setting but it may be named differently.

I did test it: I left both my PCs on for over 48 hours with no lockup.

Dan



* [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze
  2025-01-26 17:37 ` [gentoo-user] Troubleshooting AMD Radeon Vega system freeze Daniel Frey
@ 2025-01-26 20:18   ` Grant Edwards
  2025-01-26 21:43     ` Michael
  2025-01-28 15:04     ` Daniel Frey
  0 siblings, 2 replies; 11+ messages in thread
From: Grant Edwards @ 2025-01-26 20:18 UTC
  To: gentoo-user

On 2025-01-26, Daniel Frey <djqfrey@gmail.com> wrote:

> I had a problem with two of my Ryzen systems that exhibited this
> behaviour. One has a G processor, the other doesn't.
>
> Apparently Ryzen processors have an idle bug that locks up the
> system in this way. The bug manifests randomly when the CPU is
> idle. For me, if I left the PC on overnight it would always be hung
> up the next morning. It would also trigger if I started a long
> emerge and forgot about it - it would idle enough it would hang.

And that only hung user-space stuff?

I can still ping mine when it's frozen, and the SysRq key works
(except for commands to do with the framebuffer console).  Ssh doesn't
work and Ctrl-Alt-Fx doesn't work.

> I had to update the BIOS on both machines, then change the "Power Supply 
> Idle Control" to "Typical Current Idle". Any other setting and the bug 
> manifests. Note this setting is for Asus motherboards; I would imagine 
> other manufacturers have a similar setting but it may be named differently.
>
> I did test it, I left both my PCs on for over 48 hours and no lockup.

I don't think this is the same. My machine never locked up when idle.

It was always when doing something like resizing an X11 window.  I
could let it sit idle for days (either at the console prompt or with
X11 screen-saver active and a blanked screen). I could do anything I
wanted remotely via ssh.  It only seemed to lock up when I was doing
something in X11. It didn't have to be _much_ in X11 (didn't need to
be rendering video or 3D gaming). Just working with xemacs and xterms
seemed to be enough (though I probably had a Thunderbird window
sitting idle/iconified and a chrome window showing some
documentation).

Yesterday I downgraded mesa from 24.3.3 to 24.2.8, and it hasn't
frozen since -- though I also haven't been using it a lot since the
downgrade.  If I make it through a day of work tomorrow without a
lockup, then I'm going to blame mesa. During a normal work day last
week it would usually freeze a half-dozen times.

--
Grant







* Re: [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze
  2025-01-26 20:18   ` [gentoo-user] " Grant Edwards
@ 2025-01-26 21:43     ` Michael
  2025-01-28 15:04     ` Daniel Frey
  1 sibling, 0 replies; 11+ messages in thread
From: Michael @ 2025-01-26 21:43 UTC
  To: gentoo-user


On Sunday 26 January 2025 20:18:57 Greenwich Mean Time Grant Edwards wrote:
> On 2025-01-26, Daniel Frey <djqfrey@gmail.com> wrote:
> > I had a problem with two of my Ryzen systems that exhibited this
> > behaviour. One has a G processor, the other doesn't.
> > 
> > Apparently Ryzen processors have an idle bug that locks up the
> > system in this way. The bug manifests randomly when the CPU is
> > idle. For me, if I left the PC on overnight it would always be hung
> > up the next morning. It would also trigger if I started a long
> > emerge and forgot about it - it would idle enough it would hang.
> 
> And that only hung user-space stuff?
> 
> I can still ping mine when it's frozen, and the SysRq key works
> (except for commands to do with the framebuffer console).  Ssh doesn't
> work and Ctrl-Alt-Fx doesn't work.
> 
> > I had to update the BIOS on both machines, then change the "Power Supply
> > Idle Control" to "Typical Current Idle". Any other setting and the bug
> > manifests. Note this setting is for Asus motherboards; I would imagine
> > other manufacturers have a similar setting but it may be named
> > differently.
> > 
> > I did test it, I left both my PCs on for over 48 hours and no lockup.
> 
> I don't think this is the same. My machine never locked up when idle.
> 
> It was always when doing something like resizing an X11 window.  I
> could let it sit idle for days (either at the console prompt or with
> X11 screen-saver active and a blanked screen). I could do anything I
> wanted remotely via ssh.  It only seemed to lock up when I was doing
> something in X11. It didn't have to be _much_ in X11 (didn't need to
> be rendering video or 3D gaming). Just working with xemacs and xterms
> seemed to be enough (though I probably had a Thunderbird window
> sitting idle/iconified and a chrome window showing some
> documentation).
> 
> Yesterday I downgraded mesa from 24.3.3 to 24.2.8, and it hasn't
> frozen since -- though I also haven't been using it a lot since the
> downgrade.  If I make it through a day of work tomorrow without a
> lockup, then I'm going to blame mesa. During a normal work day last
> week it would usually freeze a half-dozen times.
> 
> --
> Grant

I have an older AMD system running a wayland desktop, which locks up if I 
update mesa/xorg in the background and do not restart/reboot after it is done.  
It tends to lock up with Firefox when moving its window between monitors, or 
when resizing windows in general.  The freezes feel a bit random, but the 
only common denominator is mesa and xorg driver updates that are not 
followed by restarting the session or rebooting the system.


* [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze [SOLVED]
  2025-01-26  3:16     ` Grant Edwards
  2025-01-26  4:25       ` Dale
@ 2025-01-27 23:03       ` Grant Edwards
  2025-01-27 23:29         ` Grant Edwards
  1 sibling, 1 reply; 11+ messages in thread
From: Grant Edwards @ 2025-01-27 23:03 UTC
  To: gentoo-user

On 2025-01-26, Grant Edwards <grant.b.edwards@gmail.com> wrote:
>>> On 2025-01-25, Grant Edwards <grant.b.edwards@gmail.com> wrote:
>>>> Starting about a week ago, my AMD system (AMD Ryzen 5 3400G with
>>>> Radeon Vega Graphics) has been freezing up multiple times per day --
>>>> always when in active use with X11. Before that, it had been reliable
>>>> since assembled (about 5 years ago).
>>>>
[...]
>
> I tried rolling back Xorg server, that didn't seem to help.
>
> Google has found reports of recent versions (24.3.x) of mesa being
> blamed for problems like this.  I see in the logs that mesa got
> upgraded from 24.2.8 to 24.3.3 on the 20th.  I've rolled that back, to
> see if that helps.

No freezes for over two days now (including a full day of work today).

It's definitely mesa 24.3.3 that's causing the lockups.

Is this something that needs to be filed as a bug with Gentoo, or just upstream?

-- 
Grant




* [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze [SOLVED]
  2025-01-27 23:03       ` [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze [SOLVED] Grant Edwards
@ 2025-01-27 23:29         ` Grant Edwards
  0 siblings, 0 replies; 11+ messages in thread
From: Grant Edwards @ 2025-01-27 23:29 UTC
  To: gentoo-user

On 2025-01-27, Grant Edwards <grant.b.edwards@gmail.com> wrote:
> On 2025-01-26, Grant Edwards <grant.b.edwards@gmail.com> wrote:
>>>> On 2025-01-25, Grant Edwards <grant.b.edwards@gmail.com> wrote:
>>>>> Starting about a week ago, my AMD system (AMD Ryzen 5 3400G with
>>>>> Radeon Vega Graphics) has been freezing up multiple times per day --
>>>>> always when in active use with X11. Before that, it had been reliable
>>>>> since assembled (about 5 years ago).
>>>>>
> [...]
>>
>> I tried rolling back Xorg server, that didn't seem to help.
>>
>> Google has found reports of recent versions (24.3.x) of mesa being
>> blamed for problems like this.  I see in the logs that mesa got
>> upgraded from 24.2.8 to 24.3.3 on the 20th.  I've rolled that back, to
>> see if that helps.
>
> No freezes for over two days now (including a full day of work today).
>
> It's definitely mesa 24.3.3 that's causing the lockups.

[sort of; see below]

> Is this something that needs to be filed as a bug with Gentoo, or just upstream?

And I found an excellent thread discussing this problem:

  https://bbs.archlinux.org/viewtopic.php?id=301798

I also found threads in Ubuntu forums talking (I think) about the same
problem.  But, as usual, the quality of the info in the archlinux
forum is several orders of magnitude better than that in the Ubuntu
forums where it's the usual blind leading the blind 3 (million)
stooges routine.

Based on info in the Arch Linux thread above, it looks like the root
cause is a race condition in some kernel code that wasn't triggered
until mesa started using some particular feature of the AMD GPU.
There seem to be some fixes/remediations proposed for both the kernel
and mesa, but nothing final. Apparently the race condition has been
there all along, but was only very rarely triggered before mesa 24.3.

There are also hints that a real fix might involve AMD GPU firmware
changes, which aren't going to happen.

--
Grant







* Re: [gentoo-user] Re: Troubleshooting AMD Radeon Vega system freeze
  2025-01-26 20:18   ` [gentoo-user] " Grant Edwards
  2025-01-26 21:43     ` Michael
@ 2025-01-28 15:04     ` Daniel Frey
  1 sibling, 0 replies; 11+ messages in thread
From: Daniel Frey @ 2025-01-28 15:04 UTC
  To: gentoo-user

On 1/26/25 12:18 PM, Grant Edwards wrote:
> And that only hung user-space stuff?
> 

For me, yes - I could ping it, but nothing worked at the physical console,
nor could I ssh in.

Dan

