public inbox for gentoo-user@lists.gentoo.org
* [gentoo-user] random, hard lockups
@ 2005-07-07 19:44 Matt Garman
From: Matt Garman @ 2005-07-07 19:44 UTC
  To: gentoo-user


My system has been experiencing random, hard lockups (I must
physically reboot) over the last year or so. So far the lockups have
been completely unpredictable, and they always occur when I'm not at
my computer (during the night, while I'm at work, etc.). When the
computer goes into this locked-up state, the monitor is blank (but
not in power-save mode) and the computer still responds to pings,
but I cannot ssh into it.

I just ran 14 hours of memtest86+ and found no errors.

I also checked the logs---nothing unusual there (I can't even
pinpoint exactly when the lockups occur).
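
One way to narrow down when a lockup happens would be a heartbeat file: a small process appends a timestamp at a fixed interval and forces it to disk, so the last entry before the reboot brackets the time of the hang. The following is only a minimal sketch; the log path and the 60-second interval are assumptions, and it only records when userspace stopped running, which may not coincide exactly with the underlying fault.

```python
import os
import time
from datetime import datetime, timezone


def write_heartbeat(path):
    """Append a UTC timestamp to path and force it out to disk."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    with open(path, "a") as f:
        f.write(stamp + "\n")
        f.flush()
        os.fsync(f.fileno())  # make sure the line survives a hard reset
    return stamp


def last_heartbeat(path):
    """Return the last non-empty line of path, or None if there is none."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    return lines[-1] if lines else None


def run(path, interval_seconds=60):
    """Write a heartbeat forever; meant to be started once at boot."""
    while True:
        write_heartbeat(path)
        time.sleep(interval_seconds)


# Hypothetical usage (path is an assumption, not from the original post):
#   run("/var/log/heartbeat.log")
# After the next forced reboot:
#   print(last_heartbeat("/var/log/heartbeat.log"))
```

After a lockup and reboot, the last line of the file gives the latest time at which the machine was still able to run a process and write to disk.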

Even worse, my computer may be fine for weeks or even months (i.e.
completely stable), then suddenly start locking up about once a
day.

Does anyone have any idea what the problem may be? For what it's
worth, I have a very high ERR count in /proc/interrupts:

# uptime
08:58:35 up  1:29, 12 users,  load average: 1.22, 1.28, 1.20

# cat /proc/interrupts
           CPU0
  0:    5391962          XT-PIC  timer
  1:       3486          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  5:     481356          XT-PIC  sym53c8xx, NVidia nForce2, ohci1394
  8:          2          XT-PIC  rtc
  9:          0          XT-PIC  acpi
 10:          0          XT-PIC  ohci_hcd
 11:     534284          XT-PIC  sym53c8xx, ohci_hcd, ehci_hcd, eth0, nvidia
 12:     115771          XT-PIC  i8042
 14:        473          XT-PIC  ide0
 15:         11          XT-PIC  ide1
NMI:          0
LOC:    5391944
ERR:      33336
MIS:          0


Note that the machine has only been up for 90 minutes and it has
already logged 33k ERRs. I don't know exactly what that means, but
my other two nforce2 boards have a zero ERR count.
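
To compare machines, or to see whether the ERR counter climbs steadily or in bursts, the count can be pulled out of the /proc/interrupts text and turned into a rate. This is a rough sketch only: the function names are made up for illustration, and the sample text is the tail of the output quoted above together with the 1:29 uptime (89 minutes).

```python
def err_count(interrupts_text):
    """Return the ERR counter from /proc/interrupts text, or None if absent."""
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if sep and label.strip() == "ERR":
            return int(rest.split()[0])
    return None


def errs_per_minute(count, uptime_minutes):
    """Average number of ERRs per minute since boot."""
    return count / uptime_minutes


# Tail of the output quoted above; uptime was 1:29, i.e. 89 minutes.
SAMPLE = """\
NMI:          0
LOC:    5391944
ERR:      33336
MIS:          0
"""

count = err_count(SAMPLE)
print(f"{count} ERRs, about {errs_per_minute(count, 89):.0f} per minute")
```

On a live system the same function could be fed the contents of /proc/interrupts read at two different times; the difference between the two counts shows whether the errors are still accumulating.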

For what it's worth, this computer has the following hardware: Asus
A7N8X Deluxe, AMD Athlon XP 2500 (Barton core), 2x512 MB RAM,
GeForce4 ti4200 AGP 8x video card, LSI Logic SCSI controller,
Fujitsu SCSI Drive, Samsung IDE drive.

Another thought: I see the following in my dmesg:


PCI: Using ACPI for IRQ routing
** PCI interrupts are no longer routed automatically.  If this
** causes a device to stop working, it is probably because the
** driver failed to call pci_enable_device().  As a temporary
** workaround, the "pci=routeirq" argument restores the old
** behavior.  If this argument makes the device work again,
** please email the output of "lspci" to bjorn.helgaas@hp.com
** so I can fix the driver.

In my kernel config, I have Processor Type and Features -> Local
APIC support on uniprocessors and IO-APIC support on uniprocessors
both enabled. However, as you can see above, the kernel is still
using XT-PIC. My other two nforce2 boards (with the same kernel
config) use IO-APIC. I'm not sure exactly what all this means, but
it may mean something to somebody. :)
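
For comparing this machine with the two boards that use IO-APIC, the controller column and any shared IRQ lines can be extracted from /proc/interrupts mechanically. The sketch below is illustrative only: the function name is hypothetical, and it assumes the layout shown above (one count column per CPU, then the controller name, then a comma-separated device list).

```python
def summarize_interrupts(text):
    """Return (set of controller names, {irq: [devices]} for shared IRQs)."""
    lines = [line for line in text.splitlines() if line.strip()]
    # The header row lists one CPUn token per count column.
    n_cpus = sum(1 for tok in lines[0].split() if tok.startswith("CPU")) or 1
    controllers = set()
    shared = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep or not label.strip().isdigit():
            continue  # skip NMI, LOC, ERR, MIS and similar rows
        parts = rest.split(None, n_cpus + 1)
        if len(parts) <= n_cpus:
            continue  # row has counts but no controller column
        controllers.add(parts[n_cpus])
        devices = parts[n_cpus + 1].split(", ") if len(parts) > n_cpus + 1 else []
        if len(devices) > 1:
            shared[int(label)] = [d.strip() for d in devices]
    return controllers, shared


# Rows copied from the output quoted earlier in this message.
SAMPLE = """\
           CPU0
  0:    5391962          XT-PIC  timer
  5:     481356          XT-PIC  sym53c8xx, NVidia nForce2, ohci1394
 11:     534284          XT-PIC  sym53c8xx, ohci_hcd, ehci_hcd, eth0, nvidia
 14:        473          XT-PIC  ide0
ERR:      33336
"""

controllers, shared = summarize_interrupts(SAMPLE)
print("controllers:", sorted(controllers))
for irq, devices in sorted(shared.items()):
    print(f"IRQ {irq} is shared by: {', '.join(devices)}")
```

Running the same function on the output from the other two boards would make it easy to see which devices end up sharing a line under XT-PIC but not under IO-APIC.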

Thanks for any help or suggestions!
Matt

p.s. I'd be happy to post my complete dmesg if anyone would like to
see it.  --MG

-- 
Matt Garman
email at: http://raw-sewage.net/index.php?file=email
-- 
gentoo-user@gentoo.org mailing list


