* [gentoo-user] Invalid opcode after kernel update
From: Fernando Rodriguez @ 2023-09-17 21:49 UTC
To: gentoo-user
A few months ago, after updating my kernel, I started getting an invalid
opcode error during boot, in the init process of my initramfs (which I
did rebuild). Switching back to the old kernel and initramfs fixed the
problem, so I stayed on that kernel for a few months for lack of time.
Today I rebuilt the whole system using `emerge -e @world`. After that I
can boot the new kernel, but now some pre-compiled packages (and some
that emerge -e missed because the ebuild was masked) crash with an
illegal opcode. Chrome doesn't crash, but it renders only garbage for
web pages.
Does anyone have a clue what is happening? It's like the instruction set
changed after the kernel update (or was it the microcode?)
Thanks,
--
Fernando Rodriguez
* Re: [gentoo-user] Invalid opcode after kernel update
From: Alan Mackenzie @ 2023-09-17 22:03 UTC
To: gentoo-user
Hello, Fernando.
On Sun, Sep 17, 2023 at 17:49:22 -0400, Fernando Rodriguez wrote:
> A few months ago after updating my kernel I started getting an invalid
> opcode error during boot on the init process on my initramfs which I did
> rebuild. Switching to the old kernel and initramfs fixed the problem so
> I kept that kernel for a few months for lack of time.
> Today I rebuilt the whole system using `emerge -e @world` and after that
> I'm able to boot the new kernel but now some pre-compiled packages (and
> some that emerge -e missed because the ebuild was masked) crash with
> illegal opcode. In the case of chrome it's not crashing but it only
> renders garbage for webpages.
> Does anyone have a clue what is happening? It's like the instruction set
> changed after the kernel update (or was it the microcode?)
Could it be that you've got a sporadic RAM failure? Running the
standard RAM test (the one you boot into, I've forgotten its name) for
many hours might pin down the problem.
> Thanks,
> --
> Fernando Rodriguez
--
Alan Mackenzie (Nuremberg, Germany).
* Re: [gentoo-user] Invalid opcode after kernel update
From: Fernando Rodriguez @ 2023-09-18 15:04 UTC
To: gentoo-user
On 9/17/23 18:03, Alan Mackenzie wrote:
> Hello, Fernando.
>
> On Sun, Sep 17, 2023 at 17:49:22 -0400, Fernando Rodriguez wrote:
>> A few months ago after updating my kernel I started getting an invalid
>> opcode error during boot on the init process on my initramfs which I did
>> rebuild. Switching to the old kernel and initramfs fixed the problem so
>> I kept that kernel for a few months for lack of time.
>
>> Today I rebuilt the whole system using `emerge -e @world` and after that
>> I'm able to boot the new kernel but now some pre-compiled packages (and
>> some that emerge -e missed because the ebuild was masked) crash with
>> illegal opcode. In the case of chrome it's not crashing but it only
>> renders garbage for webpages.
>
>> Does anyone have a clue what is happening? It's like the instruction set
>> changed after the kernel update (or was it the microcode?)
>
> Could it be that you've got a sporadic RAM failure? Running the
> standard RAM test (the one you boot into, I've forgotten its name) for
> many hours might pin down the problem.
I ran the test to be sure, but the problem isn't sporadic: it happens
every time, with the same pre-built binaries. My last working kernel was
5.15.122; if I boot that kernel, everything works. Before the update
everything was built with -march=native, and before the 'emerge -e' I
switched to -mtune=generic. I don't think it was the flags that messed
it up but the act of rebuilding, because after rebuilding the whole
system I'm still having issues with pre-compiled binaries, and those
should be generic builds. Strangely, the same binaries that crash on the
host system run fine in a VM using hardware virtualization.
I will try running one under gdb to find out which instruction is
triggering the fault.
Thanks,
Fernando
* Re: [gentoo-user] Invalid opcode after kernel update
From: Fernando Rodriguez @ 2023-09-18 18:52 UTC
To: gentoo-user
On 9/18/23 11:04, Fernando Rodriguez wrote:
> On 9/17/23 18:03, Alan Mackenzie wrote:
> I will try to run it on gdb to find out which instruction is triggering
> the fault.
>
> Thanks,
> Fernando
>
The crash is happening on AVX2 instructions. My CPU is an Intel(R)
Core(TM) i7-8809G CPU @ 3.10GHz, which is supposed to support AVX2, but
I don't see it listed in /proc/cpuinfo. I can't reboot into the old
kernel right now, but I suspect it will show up there when I do, because
I vaguely remember seeing it. Any clues?
--
Fernando Rodriguez
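For anyone wanting to repeat the /proc/cpuinfo check described above, here is a minimal Python sketch. It assumes the usual x86 Linux layout, where a line starting with "flags" lists the CPU features the running kernel exposes. The helper names `cpu_flags` and `missing_flags` are made up for illustration, not taken from any tool.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match the key exactly so lines such as 'vmx flags' are ignored.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str, wanted: tuple[str, ...] = ("avx", "avx2")) -> list[str]:
    """List the wanted flags that the kernel does not report, in the order given."""
    present = cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    if path.exists():
        print("missing:", missing_flags(path.read_text()) or "none")
```

Running it once under the old kernel and once under the new one would show whether the flags really disappear between the two boots.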
* Re: [gentoo-user] Invalid opcode after kernel update
From: Fernando Rodriguez @ 2023-09-18 19:04 UTC
To: gentoo-user
On 9/18/23 14:52, Fernando Rodriguez wrote:
> On 9/18/23 11:04, Fernando Rodriguez wrote:
>> On 9/17/23 18:03, Alan Mackenzie wrote:
>> I will try to run it on gdb to find out which instruction is
>> triggering the fault.
>>
>> Thanks,
>> Fernando
>>
>
> The crash is happening on AVX2 instructions. My CPU is Intel(R) Core(TM)
> i7-8809G CPU @ 3.10GHz and it's supposed to have AVX2 but I don't see it
> listed on /proc/cpuinfo. I can't reboot into the old kernel right now
> but I suspect that when I do it will be there because I kind of remember
> seeing it there. Any clues?
>
Found this in my journal: "GDS: Microcode update needed! Disabling AVX
as mitigation." So I guess it's a microcode issue. I'm using dracut with
--early-microcode, I have CONFIG_MICROCODE_INTEL set, and I have the
latest (as of Friday) intel-microcode. I don't have initramfs enabled
for intel-microcode, but I never did and it was working. I'll try it
when I get back; gotta run now. Any more ideas?
--
Fernando Rodriguez
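Since the open question here is whether the updated microcode actually gets loaded at boot, one rough check is the revision the kernel reports per CPU. The sketch below assumes the x86 /proc/cpuinfo format, with one "microcode : 0x..." line per logical CPU. `microcode_revisions` is a hypothetical helper name. It does not know which revision contains the fix; it only shows what is currently loaded.

```python
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set[int]:
    """Collect the distinct microcode revisions reported across all CPUs."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            # Values look like '0xf4'; int() with base 16 accepts the 0x prefix.
            revisions.add(int(value.strip(), 16))
    return revisions


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    if path.exists():
        for rev in sorted(microcode_revisions(path.read_text())):
            print(hex(rev))
```

If the printed revision is the same before and after changing the initramfs setup, the early microcode update is probably not being applied.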
* Re: [gentoo-user] Invalid opcode after kernel update
From: Peter Böhm @ 2023-09-18 19:23 UTC
To: gentoo-user, Fernando Rodriguez
On Monday, 18 September 2023, 20:52:27 CEST, Fernando Rodriguez wrote:
> On 9/18/23 11:04, Fernando Rodriguez wrote:
> > On 9/17/23 18:03, Alan Mackenzie wrote:
> > I will try to run it on gdb to find out which instruction is triggering
> > the fault.
> >
> > Thanks,
> > Fernando
>
> The crash is happening on AVX2 instructions. My CPU is Intel(R) Core(TM)
> i7-8809G CPU @ 3.10GHz and it's supposed to have AVX2 but I don't see it
> listed on /proc/cpuinfo. I can't reboot into the old kernel right now
> but I suspect that when I do it will be there because I kind of remember
> seeing it there. Any clues?
This is Intel Downfall, also known as GDS (Gather Data Sampling).
You may want to read: https://www.phoronix.com/review/downfall
Regards,
Peter
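How the kernel is currently handling GDS can usually be read from sysfs. The following is a minimal sketch, assuming the kernel is new enough to expose /sys/devices/system/cpu/vulnerabilities/gather_data_sampling (kernels without the GDS patches will not have the file). The status strings matched below are the ones I believe the kernel's GDS documentation lists, so treat them as assumptions. `describe_gds` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed sysfs location; absent on kernels without the GDS patches.
GDS_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/gather_data_sampling")


def describe_gds(status: str) -> str:
    """Turn the kernel's GDS status string into a short explanation."""
    status = status.strip()
    if status == "Not affected":
        return "CPU is not affected by GDS"
    if "AVX disabled" in status:
        return "no fixed microcode loaded; kernel turned AVX off instead"
    if status.startswith("Mitigation: Microcode"):
        return "microcode mitigation active; AVX left enabled"
    if status.startswith("Vulnerable"):
        return "not mitigated; AVX left enabled"
    return "unrecognised status: " + status


if __name__ == "__main__":
    if GDS_STATUS.exists():
        print(describe_gds(GDS_STATUS.read_text()))
    else:
        print("no GDS status file on this kernel")
```

A status mentioning "AVX disabled" would match the journal message quoted earlier in the thread. The kernel documentation also describes a gather_data_sampling= boot parameter for changing this behaviour, at the cost of leaving the CPU unmitigated.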