public inbox for gentoo-user@lists.gentoo.org
* [gentoo-user] Oops on heavy load
@ 2006-08-28 10:24 CapSel
  2006-08-28 10:49 ` Mick
  0 siblings, 1 reply; 3+ messages in thread
From: CapSel @ 2006-08-28 10:24 UTC
  To: gentoo-user

My server crashed twice with the same Oops info. It runs apache, mysql
and nfs (client only); it has a swap partition of more than 2GB (which
it does not use very much), 2GB RAM, and the disk is a Raptor on SATA...
I have two clones of this machine (computers with the same parts; the
system was copied with rsync) and only one of them has these Oopses. I
made the copies of gentoo like this:

   1 -> 2 -> 3

...and the one which crashed is "2", so if there had been any error
during the rsync from 1->2 it should be repeated on "3", but "3" and
"1" work all the time without any crashes.

When it crashes, some processes just stop: for example, ps ax stopped
in the middle of its list, ctrl+c did not help, and w (/bin/w) did not
show anything, but dmesg worked and I could log in through ssh to try
another command.

The server must work all the time...

Does anybody have an idea what caused this Oops?
Below I added the text from dmesg after this Oops. Thanks in advance for any help.

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000044
 printing eip:
c02c4887
*pde = 00000000
Oops: 0002 [#1]
SMP
Modules linked in: nfs lockd nfs_acl sunrpc floppy intel_agp agpgart e100
CPU:    0
EIP:    0060:[<c02c4887>]    Not tainted VLI
EFLAGS: 00010202   (2.6.17-gentoo-r4 #1)
EIP is at _spin_lock+0x0/0xf
eax: 00000044   ebx: f5064090   ecx: f5064c68   edx: f5064184
esi: f5064184   edi: 00000000   ebp: 00000001   esp: c2273ec8
ds: 007b   es: 007b   ss: 0068
Process kswapd0 (pid: 162, threadinfo=c2272000 task=c2201a50)
Stack: c014d48c f5064090 f5064098 00000000 0000002f c01604fb 0000002f 00000080
       c5e52248 c446cab8 00000000 00007080 00000081 c20fe520 c01605ce c01394db
       001c2000 00000000 00007080 00000003 00000000 00000000 0007172e 000000d0
Call Trace:
 <c014d48c> remove_inode_buffers+0x28/0x5b  <c01604fb> prune_icache+0xb8/0x177
 <c01605ce> shrink_icache_memory+0x14/0x2b  <c01394db> shrink_slab+0x13c/0x194
 <c013a651> balance_pgdat+0x219/0x335  <c013a859> kswapd+0xec/0xee
 <c0128712> autoremove_wake_function+0x0/0x2d
 <c0128712> autoremove_wake_function+0x0/0x2d
 <c013a76d> kswapd+0x0/0xee  <c0100e01> kernel_thread_helper+0x5/0xb
Code: 81 28 00 00 00 01 74 05 e8 3f e8 ff ff c3 ba 00 e0 ff ff 21 e2
81 42 14 00 01 00 00 f0 81 28 00 00 00 01 74 05 e8 22 e8 ff ff c3 <f0>
fe 08 79 09 f3 90 80 38 00 7e f9 eb f2 c3 f0 81 28 00 00 00
EIP: [<c02c4887>] _spin_lock+0x0/0xf SS:ESP 0068:c2273ec8
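
A note for readers decoding the dump above: on 32-bit x86 the number
after "Oops:" is the page-fault error code, and its low three bits say
whether the page was present, whether the access was a read or a write,
and whether the CPU was in user or kernel mode. The sketch below applies
that to the 0002 shown here and compares the faulting address with eax.
The helper name decode_oops_error_code is made up for this illustration;
it is a reading aid, not a diagnosis.

```python
def decode_oops_error_code(code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


# Values copied from the dmesg output above.
oops_line = "Oops: 0002 [#1]"
fault_address = 0x00000044
eax = 0x00000044

code = int(oops_line.split()[1], 16)
info = decode_oops_error_code(code)
print(info)

# The faulting address equals eax on entry to _spin_lock.
print(fault_address == eax)
```

Read this way, the dump describes a kernel-mode write to a not-present
page at 0x44, the same value eax held on entry to _spin_lock. That would
be consistent with a lock sitting at a small offset inside a structure
reached through a NULL pointer. Whether that pointer was NULL because of
bad RAM or because of a kernel bug cannot be told from the dump alone.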
-- 
gentoo-user@gentoo.org mailing list




* Re: [gentoo-user] Oops on heavy load
  2006-08-28 10:24 [gentoo-user] Oops on heavy load CapSel
@ 2006-08-28 10:49 ` Mick
  2006-08-28 12:30   ` Iain Buchanan
  0 siblings, 1 reply; 3+ messages in thread
From: Mick @ 2006-08-28 10:49 UTC
  To: gentoo-user


On Monday 28 August 2006 11:24, CapSel wrote:

> Process kswapd0 (pid: 162, threadinfo=c2272000 task=c2201a50)
> Stack: c014d48c f5064090 f5064098 00000000 0000002f c01604fb 0000002f
> 00000080 c5e52248 c446cab8 00000000 00007080 00000081 c20fe520 c01605ce
> c01394db 001c2000 00000000 00007080 00000003 00000000 00000000 0007172e
> 000000d0 Call Trace:
>  <c014d48c> remove_inode_buffers+0x28/0x5b  <c01604fb>
> prune_icache+0xb8/0x177 <c01605ce> shrink_icache_memory+0x14/0x2b 
> <c01394db> shrink_slab+0x13c/0x194 <c013a651> balance_pgdat+0x219/0x335 
> <c013a859> kswapd+0xec/0xee <c0128712> autoremove_wake_function+0x0/0x2d 
> <c0128712>

This reminds me of a box I had with faulty memory: it would sometimes crash
when it was about to start using swap.  If the buffering was too aggressive
the machine would crash.  If not, it would start swapping and carry on working
without further problems.  I changed the offending memory module and have had
no problems since.

Somebody else may know more with respect to the particular data dump you
provided; otherwise you could try troubleshooting it with some more
involved memory/swap tests (not just memtest86+).
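
As a rough idea of what a userspace memory test does, here is a minimal
sketch that fills a buffer with a few bit patterns and reads them back.
The function name pattern_test and the 16 MB size are arbitrary choices
for this example. It runs through the CPU caches and virtual memory, so
it cannot target specific physical pages and is no substitute for
memtest86+ or a dedicated tester; treat it as an illustration only.

```python
def pattern_test(size_bytes, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Write each byte pattern across a buffer and verify it on read-back.

    Returns a list of (pattern, first_bad_offsets) tuples; an empty list
    means every read-back matched what was written.
    """
    buf = bytearray(size_bytes)
    failures = []
    for pattern in patterns:
        expected = bytes([pattern]) * size_bytes
        buf[:] = expected              # write pass
        if buf != expected:            # read-back pass
            bad = [i for i in range(size_bytes) if buf[i] != pattern]
            failures.append((pattern, bad[:10]))
    return failures


if __name__ == "__main__":
    result = pattern_test(16 * 1024 * 1024)  # 16 MB, arbitrary
    print("no mismatches" if not result else result)
```

A clean run proves very little, since marginal RAM often fails only
under sustained load or at particular physical addresses. A mismatch,
on the other hand, would be a strong hint to look at the hardware.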

HTH.
-- 
Regards,
Mick



* Re: [gentoo-user] Oops on heavy load
  2006-08-28 10:49 ` Mick
@ 2006-08-28 12:30   ` Iain Buchanan
  0 siblings, 0 replies; 3+ messages in thread
From: Iain Buchanan @ 2006-08-28 12:30 UTC
  To: gentoo-user

On Mon, 2006-08-28 at 11:49 +0100, Mick wrote:
> On Monday 28 August 2006 11:24, CapSel wrote:
> 
> > Process kswapd0 (pid: 162, threadinfo=c2272000 task=c2201a50)
> > Stack: c014d48c f5064090 f5064098 00000000 0000002f c01604fb 0000002f
> > 00000080 c5e52248 c446cab8 00000000 00007080 00000081 c20fe520 c01605ce
> > c01394db 001c2000 00000000 00007080 00000003 00000000 00000000 0007172e
> > 000000d0 Call Trace:
> >  <c014d48c> remove_inode_buffers+0x28/0x5b  <c01604fb>
> > prune_icache+0xb8/0x177 <c01605ce> shrink_icache_memory+0x14/0x2b 
> > <c01394db> shrink_slab+0x13c/0x194 <c013a651> balance_pgdat+0x219/0x335 
> > <c013a859> kswapd+0xec/0xee <c0128712> autoremove_wake_function+0x0/0x2d 
> > <c0128712>
> 
> This reminds me of a box I had with faulty memory

I agree, it sounds suspiciously like faulty RAM, which often shows its
symptoms at high load (don't ask me why).

I find that re-seating the RAM may help, at least for a short period
(expect that the problem might return).

You can try memtest, but apparently it gets harder and harder to detect
memory problems as new techniques are developed by hardware
manufacturers to make memory faster.

I suggest trial and error: swap the RAM with one of the other
machines, or with a fresh set of RAM if you can get some. If the
symptoms follow the RAM from server to server then you know what the
problem is :)

HTH,
-- 
Iain Buchanan <iaindb at netspace dot net dot au>

The human race has one really effective weapon, and that is laughter.
		-- Mark Twain
