* [gentoo-user] Kernel crash - howto find out what happened?
From: Alexander Puchmayr @ 2008-10-12 9:08 UTC
To: gentoo-user
Hi there!
My Gentoo system (an amd64 4400+, 2 GB RAM, nForce4 chipset) worked fine for
nearly two years, but now it freezes frequently, sometimes (not always)
with the Scroll Lock and Caps Lock LEDs blinking.
Since I'm using the box as a desktop, all I have is a frozen X server and no
way to switch to a console (maybe there's some hint there about what happened?).
How do I find out what happened and why it crashed? Modern systems have
MCE logs, but how do I read them in this case? After a reboot, all information
seems to be gone, since mcelog is always empty.
I assume there's a problem with some piece of hardware. I have already tested
the RAM with memtest86, which found no errors.
Thanks for suggestions
Alex
* Re: [gentoo-user] Kernel crash - howto find out what happened?
From: Erik Hahn @ 2008-10-12 9:16 UTC
To: gentoo-user
On Sun, Oct 12, 2008 at 11:08:57AM +0200, Alexander Puchmayr wrote:
> Since I'm using the box as desktop, I have only a frozen X-server and no
> possibility to switch to console (maybe there's some hint whats happened?).
>
> How do I find out what happened, why it crashed? Modern systems have
> MCE-logs, but how do I read it in this case? After reboot, all information
> seems to be gone since mcelog is always empty.
> Alex
If it's a kernel panic, you actually get debugging information on the
console. It's just hidden "behind" the X server. Maybe you can reproduce
the problem while working without X (if you can do your work purely from
the VTs).
Do you use any proprietary drivers?
-Erik
--
hackerkey://v4sw5hw2ln3pr5ck0ma2u7LwXm4l7Gi2e2t4b7Ken4/7a16s0r1p-5.62/-6.56g5OR
* Re: [gentoo-user] Kernel crash - howto find out what happened?
From: Alexander Puchmayr @ 2008-10-12 11:12 UTC
To: gentoo-user
On Sunday, 12 October 2008, Erik Hahn wrote:
> On Sun, Oct 12, 2008 at 11:08:57AM +0200, Alexander Puchmayr wrote:
> > Since I'm using the box as desktop, I have only a frozen X-server and
> > no possibility to switch to console (maybe there's some hint whats
> > happened?).
> >
> > How do I find out what happened, why it crashed? Modern systems have
> > MCE-logs, but how do I read it in this case? After reboot, all
> > information seems to be gone since mcelog is always empty.
> > Alex
>
> If it's a kernel panic you actually get debugging information on the
> console. It's just hidden "behind" the X server. Maybe you can reproduce
> the problem working without X (If you can do your work purely from the
> VTs)
I've tried, but unfortunately the X driver on my laptop (i965) also seems
to have stability problems: after about an hour it froze using 100%
CPU time and could not be killed (neither kill nor kill -9 worked). I guess
it didn't wake up from DPMS :-(
>
> Do you use any proprietary drivers?
>
On the desktop I have an NVIDIA card with the proprietary driver.
Alex
* Re: [gentoo-user] Kernel crash - howto find out what happened?
From: Alan McKinnon @ 2008-10-12 11:35 UTC
To: gentoo-user
On Sunday 12 October 2008 13:12:20 Alexander Puchmayr wrote:
> > If it's a kernel panic you actually get debugging information on the
> > console. It's just hidden "behind" the X server. Maybe you can reproduce
> > the problem working without X (If you can do your work purely from the
> > VTs)
>
> I've tried, but unfortunately, the X-Driver on my laptop (i965) does also
> seem to have stability problems, after ca an hour it froze using 100%
> cpu-time, unable to kill (nither kill or kill -9 did work). I guess it
> didn't wakeup from DPMS :-(
Here's a thought: if you have a spare machine, you could ssh in to your
desktop and continue to work normally. The ssh session would be tailing an
appropriate log, so even if the desktop goes south there's a good chance the
error message is still visible.
For something more persistent, you could try temporarily sending all logs to a
remote log server. Remote logging is quite efficient; I usually find the only
thing that gets in its way is a complete, instant kernel halt that brings the
whole machine down without warning. This is extremely rare on production
kernels.
--
alan dot mckinnon at gmail dot com
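
[Editor's sketch] The remote-logging idea above can be illustrated with a very small UDP receiver running on the spare machine. This is only a sketch under assumptions: it presumes the desktop's syslog daemon has been configured to forward messages over UDP, and the port number 5514, the file name remote.log, and the function names are made up for this example, not taken from the thread.

```python
import socket


def format_record(sender_ip: str, payload: bytes) -> str:
    """Turn one received datagram into a single printable log line."""
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    return f"{sender_ip} {text}"


def receive_logs(port: int = 5514, out_path: str = "remote.log",
                 max_messages=None) -> int:
    """Append every datagram arriving on `port` to `out_path`.

    Runs until `max_messages` datagrams were handled (forever if None)
    and returns the number of datagrams written.
    The port and file name are hypothetical defaults for this sketch.
    """
    handled = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(out_path, "a", encoding="utf-8") as log:
        sock.bind(("0.0.0.0", port))
        while max_messages is None or handled < max_messages:
            payload, (sender_ip, _sender_port) = sock.recvfrom(65535)
            log.write(format_record(sender_ip, payload) + "\n")
            log.flush()  # hand the line to the OS right away
            handled += 1
    return handled
```

Calling receive_logs() on the spare machine would then collect whatever the desktop manages to send before it dies. UDP is lossy, so the last message before a hard freeze may still never arrive.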
* Re: [gentoo-user] Kernel crash - howto find out what happened?
From: Alexander Puchmayr @ 2008-10-12 13:00 UTC
To: gentoo-user
On Sunday, 12 October 2008, Alan McKinnon wrote:
> On Sunday 12 October 2008 13:12:20 Alexander Puchmayr wrote:
> > > If it's a kernel panic you actually get debugging information on the
> > > console. It's just hidden "behind" the X server. Maybe you can
> > > reproduce the problem working without X (If you can do your work
> > > purely from the VTs)
> >
> > I've tried, but unfortunately, the X-Driver on my laptop (i965) does
> > also seem to have stability problems, after ca an hour it froze using
> > 100% cpu-time, unable to kill (nither kill or kill -9 did work). I
> > guess it didn't wakeup from DPMS :-(
>
> Here's a thought: if you have a spare machine, you could ssh in to your
> desktop and continue to work normally. The ssh session would be tailing
> an appropriate log, so even if the desktop goes south there's a good
> chance the error log is visible
>
> For something more persistent, you could try temporarily sending all logs
> to a remote log server. Remote logging is quite efficient, I usually find
> the only thing that gets in it's way is a complete instant kernel halt
> that brings the whole machine down without warning - this is extremely
> rare on production kernels
I really doubt that this works: the logger does not get the chance to write
anything once the kernel has crashed, neither to a local disk nor to a remote
host. It seems to be what you called the "instant kernel halt", and I have the
luck to be stuck with one of these rare cases :-(
But to give it a chance, I'm running a "cat /proc/kmsg" on the desktop,
started via ssh as you suggested.
Alex
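
[Editor's sketch] The "cat /proc/kmsg" over ssh approach keeps the output only in a terminal window. A small wrapper on the observing machine could also write each line to a local file with a timestamp, so the last messages survive if the window is closed. This is a rough sketch, not a tested tool: the host name passed in and the function names are hypothetical, and it assumes the ssh login is allowed to read /proc/kmsg (normally root only).

```python
import subprocess
import time


def ssh_kmsg_command(host: str) -> list:
    """Build the command that streams the remote kernel log over ssh."""
    return ["ssh", host, "cat", "/proc/kmsg"]


def stamp(line: str, now: float) -> str:
    """Prefix one kernel log line with a local wall-clock timestamp."""
    ts = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return f"{ts} {line.rstrip()}"


def capture(host: str, out_path: str) -> None:
    """Stream the remote kernel log into a local file until ssh exits."""
    with subprocess.Popen(ssh_kmsg_command(host),
                          stdout=subprocess.PIPE, text=True) as proc, \
            open(out_path, "a", encoding="utf-8") as out:
        for line in proc.stdout:
            out.write(stamp(line, time.time()) + "\n")
            out.flush()  # keep the local copy current line by line
```

Something like capture("root@desktop", "desktop-kmsg.log") would then run until the connection drops. Note that messages read from /proc/kmsg are consumed, so a local kernel log daemon reading the same file would compete for them.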
* Re: [gentoo-user] Kernel crash - howto find out what happened?
From: Duane Griffin @ 2008-10-13 15:09 UTC
To: gentoo-user
2008/10/12 Alexander Puchmayr <alexander.puchmayr@linznet.at>:
> MY gentoo system (an amd64@4400+, 2GB ram, nforce4-chipset)
> worked fine for nearly two years, but now it frequently freezes, sometimes
> (not always) scrollock and capslock LED blinking).
If you have another machine lying around, try setting up netconsole
and/or serial console logging. They should catch any dying messages
from your kernel. Blinking LEDs indicate a panic, which means you
should get a message in those cases, at least.
Using serial console is the easiest and most reliable way, but
requires a serial cable. Netconsole just uses ethernet but isn't as
reliable. Take a look at Documentation/serial-console.txt and
Documentation/networking/netconsole.txt under your kernel source
directory for more info.
Cheers,
Duane.
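
[Editor's sketch] For netconsole, the kernel documentation describes a parameter of the form netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]. The helper below only assembles that string; every address, port, and interface name in the usage note is a placeholder, and the function name is invented for this example.

```python
def netconsole_param(src_port, src_ip, dev, tgt_port, tgt_ip, tgt_mac=""):
    """Assemble a netconsole= parameter string.

    Layout (per Documentation/networking/netconsole.txt):
        netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
    Optional fields may be passed as empty strings.
    """
    return (f"netconsole={src_port}@{src_ip}/{dev},"
            f"{tgt_port}@{tgt_ip}/{tgt_mac}")


# Placeholder values only: crashing desktop 192.168.0.10 on eth0,
# log receiver 192.168.0.20 listening on UDP port 6666.
example = netconsole_param(6665, "192.168.0.10", "eth0",
                           6666, "192.168.0.20", "00:11:22:33:44:55")
```

The resulting string can be given to the netconsole module when loading it (modprobe netconsole followed by the string) or put on the kernel command line if netconsole is built in. On the receiving machine, any UDP listener on the target port, such as netcat, should show the messages.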
--
"I never could learn to drink that blood and call it wine" - Bob Dylan
* Re: [gentoo-user] Kernel crash - howto find out what happened?
From: Daniel da Veiga @ 2008-10-13 23:30 UTC
To: gentoo-user
On Sun, Oct 12, 2008 at 07:08, Alexander Puchmayr
<alexander.puchmayr@linznet.at> wrote:
> Hi there!
>
> MY gentoo system (an amd64@4400+, 2GB ram, nforce4-chipset) worked fine for
> nearly two years, but now it frequently freezes, sometimes (not always)
> scrollock and capslock LED blinking).
>
> Since I'm using the box as desktop, I have only a frozen X-server and no
> possibility to switch to console (maybe there's some hint whats happened?).
>
> How do I find out what happened, why it crashed? Modern systems have
> MCE-logs, but how do I read it in this case? After reboot, all information
> seems to be gone since mcelog is always empty.
>
> I assume there's some problem with some hardware, I already tested RAM with
> memtest86, but no errors.
>
I had one of these freezes today.
I simply killed X using CTRL+SYSREQ+K and got back a console with error messages.
Have you tried the SYSREQ keys?
--
Daniel da Veiga
* Re: [gentoo-user] Kernel crash - howto find out what happened?
From: Alexander Puchmayr @ 2008-10-14 16:31 UTC
To: gentoo-user
On Tuesday, 14 October 2008, Daniel da Veiga wrote:
> On Sun, Oct 12, 2008 at 07:08, Alexander Puchmayr
>
> <alexander.puchmayr@linznet.at> wrote:
> > Hi there!
> >
> > MY gentoo system (an amd64@4400+, 2GB ram, nforce4-chipset) worked fine
> > for nearly two years, but now it frequently freezes, sometimes (not
> > always) scrollock and capslock LED blinking).
> >
> > Since I'm using the box as desktop, I have only a frozen X-server and
> > no possibility to switch to console (maybe there's some hint whats
> > happened?).
> >
> > How do I find out what happened, why it crashed? Modern systems have
> > MCE-logs, but how do I read it in this case? After reboot, all
> > information seems to be gone since mcelog is always empty.
> >
> > I assume there's some problem with some hardware, I already tested RAM
> > with memtest86, but no errors.
>
> I had one of this freezes today.
> Simply killed X using CTRL+SYSREQ+K and got back a console with error
> messages.
>
> Have you tried the SYSREQ keys?
How does this work? I've tried it, but I couldn't get it working at all.
AFAIK, the first step is to compile CONFIG_MAGIC_SYSRQ into the kernel.
Then, make sure there's a "1" in /proc/sys/kernel/sysrq; well, there is.
/usr/src/linux/Documentation/sysrq.txt says to press "ALT-SysRq-<command key>".
I've tried it with SysRq=PrintScreen and cmd='h' for help, but nothing
happens, even under normal conditions. What did I do wrong?
Alex
* Re: [gentoo-user] Kernel crash - howto find out what happened?
From: Alex Schuster @ 2008-10-14 17:09 UTC
To: gentoo-user
Alexander Puchmayr writes:
> On Tuesday, 14 October 2008, Daniel da Veiga wrote:
> >
> > I had one of this freezes today.
> > Simply killed X using CTRL+SYSREQ+K and got back a console with error
> > messages.
> >
> > Have you tried the SYSREQ keys?
>
> How does this work? I've tried it but I didn't get this working at all.
> AFAIK, first step is to compile the CONFIG_MAGIC_SYSRQ into the kernel.
> Then, make sure there's a "1" in /proc/sys/kernel/sysrq; well it is.
> /usr/src/linux/Documentation/sysrq.txt says press "ALT-SysRq-<command
> key>", I've tried it out with SysRq=printScreen and cmd='h' for help,
> but nothing happens, even under normal conditions. What did I make
> wrong?
Try a key other than 'h'. The space key will show a little help, which is
probably what you expected to see with 'h'. Oh, and you need to be on a text
console (Ctrl-Alt-F1) to get visible output.
http://en.wikipedia.org/wiki/Magic_SysRq_key
Wonko
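
[Editor's sketch] To rule out the keyboard when SysRq seems dead, the setting can be checked and a SysRq function triggered without any key press. The sketch below decodes the value in /proc/sys/kernel/sysrq using the bit meanings listed in the kernel's sysrq documentation, and writes a command character to /proc/sysrq-trigger. The function names are invented here, and trigger_sysrq needs root.

```python
# Bit meanings of /proc/sys/kernel/sysrq for values greater than 1,
# as listed in the kernel's sysrq documentation.
SYSRQ_BITS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def describe_sysrq(value: int) -> list:
    """Explain a /proc/sys/kernel/sysrq value in words."""
    if value == 0:
        return ["disabled"]
    if value == 1:
        return ["all functions enabled"]
    return [name for bit, name in sorted(SYSRQ_BITS.items()) if value & bit]


def read_sysrq_setting(path: str = "/proc/sys/kernel/sysrq") -> int:
    """Read the current setting from procfs."""
    with open(path, encoding="ascii") as f:
        return int(f.read().strip())


def trigger_sysrq(key: str) -> None:
    """Invoke a SysRq function without the keyboard (requires root)."""
    with open("/proc/sysrq-trigger", "w", encoding="ascii") as f:
        f.write(key)
```

If trigger_sysrq("h") makes the help text show up in the kernel log while the key combination does nothing, the kernel side works and the problem is more likely in how the keyboard or X delivers the keys.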
* Re: [gentoo-user] Kernel crash - howto find out what happened?
From: Alexander Puchmayr @ 2008-10-19 9:58 UTC
To: gentoo-user
Hi there!
As my system froze again just now, I tried to reproduce the freeze using
some of the hints given to me in this thread, and made the following
observations:
* The system freezes under heavy I/O on my SATA hard disks, especially when
copying MPEG files (>2GB) from one disk to another.
* A "cat /proc/kmsg" started via ssh from another machine showed these as its
last lines:
<4>ata6: timeout waiting for ADMA IDLE, stat=0x440
<4>ata6: timeout waiting for ADMA LEGACY, stat=0x440
* SysRq does not work at all. (Why? I configured it identically to my
notebook; it works on the notebook but not on the desktop. There is simply no
reaction when pressing Alt-SysRq-something, even under normal conditions.)
The SATA controller is an NVIDIA one (onboard on my nForce-based mainboard),
driven by the sata_nv driver (the one from the kernel; no proprietary NVIDIA
chipset/SATA driver is installed). The kernel in question is
gentoo-2.6.24-r8; I'll try an upgrade to the latest stable Gentoo kernel.
Thanks to all who gave suggestions
Alex
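
[Editor's sketch] The "<4>" at the start of the lines quoted above is the kernel log level; 4 corresponds to KERN_WARNING. A few lines of Python can split such raw /proc/kmsg lines into level and message, which helps when filtering a long capture for the ata timeouts. The function names and the timeout pattern are choices made for this example only.

```python
import re

# Kernel log levels 0..7 in order (KERN_EMERG .. KERN_DEBUG).
KERN_LEVELS = ["emerg", "alert", "crit", "err",
               "warning", "notice", "info", "debug"]

_KMSG_RE = re.compile(r"^<(\d)>(.*)$")


def parse_kmsg_line(line: str):
    """Split a raw /proc/kmsg line into (level name, message).

    Returns (None, line) when the line has no recognised "<N>" prefix.
    """
    line = line.rstrip("\n")
    match = _KMSG_RE.match(line)
    if match is None:
        return None, line
    level = int(match.group(1))
    name = KERN_LEVELS[level] if level < len(KERN_LEVELS) else None
    return name, match.group(2)


def is_ata_timeout(message: str) -> bool:
    """True for messages shaped like the ata timeouts quoted above."""
    return re.match(r"^ata\d+: timeout waiting for ", message) is not None
```

For the first quoted line, parse_kmsg_line returns the level name "warning" together with the message text starting at "ata6:".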