public inbox for gentoo-amd64@lists.gentoo.org
* [gentoo-amd64] Problem with emerge on a dual-processor machine
@ 2006-10-31 18:46 Vesna Petrovic
  2006-10-31 18:54 ` Bob Sanders
  2006-10-31 23:39 ` [gentoo-amd64] " Duncan
  0 siblings, 2 replies; 3+ messages in thread
From: Vesna Petrovic @ 2006-10-31 18:46 UTC
  To: gentoo-amd64


 Hello,

 I've just installed Gentoo on a dual-processor machine and now I'm running
into the following problem: when I start emerge, it stops at a random point
and one of the following things happens:
  - the machine freezes completely, so that I cannot switch to another
console or do anything else
  - if I already have multiple ssh sessions open, sometimes one of the
sessions remains alive, but invoking any command freezes that session. Any
attempt to kill a process has no effect.
  - the kernel reports a soft lockup on at least one CPU.

I'm running out of ideas about what to try next, so I thought I would ask
for help.
Here is what I checked and tried so far:
  - configured and built the kernel with SMP and NUMA support.
Triple-checked this.
  - both processors are detected and initialized at boot. ACPI is used for
SMP configuration information
  - processor temperatures: CPU0 32 C, CPU1 31 C, system 39 C. System is
located in a room with steady 20 C
  - disabled CPU#1 using
    echo 0 > /sys/devices/system/cpu/cpu1/online
    In this case, everything seems to work fine. This is the only way to
compile or emerge anything.
  - using MAKEOPTS="-j3". Tried with "-j2", but the same problem occurs.
  - checked if there are SMP specific USE flags, and the only one I could
find was for gimp.
  - experimented with different preemption models. The problem occurs with
all of them.
  - disabled APM and enabled ACPI 2.0 support. After I did this, I got
"kernel panic - killing interrupt handler ...."

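The CPU hot-unplug workaround from the list above can be wrapped in a small
helper for repeated testing. This is only a minimal sketch: the function
names and the SYSFS_CPU_ROOT variable are made up for illustration (the
variable exists so the helper can be pointed at a scratch directory instead
of the real sysfs), and it assumes a kernel built with CPU hotplug support
so that the per-CPU `online` file exists.

```shell
#!/bin/sh
# Hypothetical helper; names are illustrative, not part of any tool.
SYSFS_CPU_ROOT="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"

# set_cpu_online CPU STATE   (STATE: 0 = offline, 1 = online)
set_cpu_online() {
    cpu="$1"
    state="$2"
    ctl="$SYSFS_CPU_ROOT/cpu$cpu/online"
    case "$state" in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 2 ;;
    esac
    # Some CPUs (often cpu0) expose no writable 'online' file at all.
    if [ ! -w "$ctl" ]; then
        echo "cannot control cpu$cpu (no writable $ctl)" >&2
        return 1
    fi
    echo "$state" > "$ctl"
}

# get_cpu_online CPU: prints the current content of the 'online' file.
get_cpu_online() {
    cat "$SYSFS_CPU_ROOT/cpu$1/online"
}
```

Run as root, `set_cpu_online 1 0` before an emerge and `set_cpu_online 1 1`
afterwards would reproduce the manual workaround.
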
 The system has 2 AMD Opteron 252 processors and 5 disks (1 IDE Maxtor
6B200R0 and 4 SCSI Maxtor 6L300S0), plus a probably irrelevant ATAPI 48X
DVD-ROM DVD-R CD-R/RW drive. Other hardware:
  - Ethernet controller: Broadcom Corporation NetXtreme BCM5703X Gigabit
Ethernet (rev 02)
  - RAID bus controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid]
Serial ATA Controller (rev 02)
  - FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000
Controller (PHY/Link)
  - VGA compatible controller: ATI Technologies Inc RV350 AP [Radeon 9600]
  - Display controller (05:00.1): ATI Technologies Inc RV350 AP [Radeon 9600]
(Secondary)
  The kernel is version 2.6.17, built from gentoo-sources.

  Any idea what might be causing this problem? Bad kernel configuration? Bad
system configuration? Kernel bug? Portage bug? Defective processor? Problem
with disk access?
  I'm including a snapshot of the information I could retrieve while the
system remained somewhat responsive after the problem occurred.

  Kind regards,
    Vesna


odin ~ # uname -a
Linux odin 2.6.17-gentoo-r8 #7 SMP PREEMPT Tue Oct 31 12:10:14 EST 2006 x86_64 AMD Opteron(tm) Processor 252 GNU/Linux

top - 22:42:38 up  7:57,  3 users,  load average: 7.99, 7.71, 5.39
Tasks:  64 total,   8 running,  56 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0% us, 100.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu1  :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   6929848k total,   140192k used,  6789656k free,    15272k buffers
Swap:  5004236k total,        0k used,  5004236k free,    65328k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9811 root      16   0     0    0    0 R  100  0.0  18:27.43 emerge
    1 root      16   0  2608  572  488 S    0  0.0   0:00.46 init
    2 root      RT   0     0    0    0 R    0  0.0   0:00.00 migration/0
    3 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
    4 root      RT   0     0    0    0 R    0  0.0   0:00.00 watchdog/0
    5 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
    6 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/1
    7 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
    8 root      10  -5     0    0    0 R    0  0.0   0:00.00 events/0
    9 root      10  -5     0    0    0 S    0  0.0   0:00.00 events/1
   10 root      19  -5     0    0    0 S    0  0.0   0:00.00 khelper
   11 root      10  -5     0    0    0 S    0  0.0   0:00.00 kthread
   16 root      10  -5     0    0    0 R    0  0.0   0:00.00 kblockd/0
   17 root      10  -5     0    0    0 S    0  0.0   0:00.00 kblockd/1
   18 root      14  -5     0    0    0 S    0  0.0   0:00.00 kacpid
  103 root      10  -5     0    0    0 S    0  0.0   0:00.02 kseriod
  166 root      20   0     0    0    0 S    0  0.0   0:00.00 pdflush
  167 root      15   0     0    0    0 S    0  0.0   0:00.00 pdflush
  168 root      18   0     0    0    0 S    0  0.0   0:00.00 kswapd0
  169 root      15   0     0    0    0 S    0  0.0   0:00.00 kswapd1
  170 root      14  -5     0    0    0 S    0  0.0   0:00.00 aio/0
  171 root      10  -5     0    0    0 S    0  0.0   0:00.00 aio/1


top - 23:17:53 up  8:32,  3 users,  load average: 11.99, 11.92, 10.81
Tasks:  65 total,   8 running,  57 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0% us, 100.0% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu1  :  0.0% us,  0.0% sy,  0.0% ni,  0.0% id, 100.0% wa,  0.0% hi,  0.0% si
Mem:   6929848k total,   142444k used,  6787404k free,    15300k buffers
Swap:  5004236k total,        0k used,  5004236k free,    65560k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9811 root      16   0     0    0    0 R  100  0.0  53:43.13 emerge
    1 root      16   0  2608  572  488 S    0  0.0   0:00.46 init
    2 root      RT   0     0    0    0 R    0  0.0   0:00.00 migration/0
    3 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0
    4 root      RT   0     0    0    0 R    0  0.0   0:00.00 watchdog/0
    5 root      RT   0     0    0    0 S    0  0.0   0:00.00 migration/1
    6 root      34  19     0    0    0 S    0  0.0   0:00.00 ksoftirqd/1
    7 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
    8 root      10  -5     0    0    0 R    0  0.0   0:00.00 events/0
    9 root      10  -5     0    0    0 S    0  0.0   0:00.00 events/1
   10 root      19  -5     0    0    0 S    0  0.0   0:00.00 khelper
   11 root      10  -5     0    0    0 S    0  0.0   0:00.00 kthread
   16 root      10  -5     0    0    0 R    0  0.0   0:00.00 kblockd/0
   17 root      10  -5     0    0    0 S    0  0.0   0:00.00 kblockd/1
   18 root      14  -5     0    0    0 S    0  0.0   0:00.00 kacpid
  103 root      10  -5     0    0    0 S    0  0.0   0:00.02 kseriod
  166 root      20   0     0    0    0 S    0  0.0   0:00.00 pdflush
  167 root      15   0     0    0    0 D    0  0.0   0:00.00 pdflush
  168 root      18   0     0    0    0 S    0  0.0   0:00.00 kswapd0
  169 root      15   0     0    0    0 S    0  0.0   0:00.00 kswapd1
  170 root      14  -5     0    0    0 S    0  0.0   0:00.00 aio/0
  171 root      10  -5     0    0    0 S    0  0.0   0:00.00 aio/1


F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
4 S     0     1     0  0  76   0 -   652 -      ?        00:00:00 init
1 R     0     2     1  0 -40   - -     0 -      ?        00:00:00 migration/0
1 S     0     3     1  0  94  19 -     0 ksofti ?        00:00:00 ksoftirqd/0
5 R     0     4     1  0 -40   - -     0 -      ?        00:00:00 watchdog/0
1 S     0     5     1  0 -40   - -     0 migrat ?        00:00:00 migration/1
1 S     0     6     1  0  94  19 -     0 ksofti ?        00:00:00 ksoftirqd/1
5 S     0     7     1  0 -40   - -     0 watchd ?        00:00:00 watchdog/1
5 R     0     8     1  0  70  -5 -     0 -      ?        00:00:00 events/0
1 S     0     9     1  0  70  -5 -     0 worker ?        00:00:00 events/1
1 S     0    10     1  0  79  -5 -     0 worker ?        00:00:00 khelper
1 S     0    11     1  0  70  -5 -     0 worker ?        00:00:00 kthread
1 R     0    16    11  0  70  -5 -     0 -      ?        00:00:00 kblockd/0
1 S     0    17    11  0  70  -5 -     0 worker ?        00:00:00 kblockd/1
1 S     0    18    11  0  74  -5 -     0 worker ?        00:00:00 kacpid
1 S     0   103    11  0  70  -5 -     0 serio_ ?        00:00:00 kseriod
1 S     0   166    11  0  80   0 -     0 pdflus ?        00:00:00 pdflush
1 S     0   167    11  0  75   0 -     0 pdflus ?        00:00:00 pdflush
1 S     0   168     1  0  78   0 -     0 kswapd ?        00:00:00 kswapd0
1 S     0   169     1  0  75   0 -     0 kswapd ?        00:00:00 kswapd1
1 S     0   170    11  0  74  -5 -     0 worker ?        00:00:00 aio/0
1 S     0   171    11  0  70  -5 -     0 worker ?        00:00:00 aio/1
1 S     0   770    11  0  70  -5 -     0 worker ?        00:00:00 kpsmoused
1 S     0   818    11  0  70  -5 -     0 worker ?        00:00:00 ata/0
1 S     0   819    11  0  71  -5 -     0 worker ?        00:00:00 ata/1
1 S     0   821    11  0  71  -5 -     0 scsi_e ?        00:00:00 scsi_eh_0
1 S     0   822    11  0  71  -5 -     0 scsi_e ?        00:00:00 scsi_eh_1
1 S     0   823    11  0  71  -5 -     0 scsi_e ?        00:00:00 scsi_eh_2
1 S     0   824    11  0  70  -5 -     0 scsi_e ?        00:00:00 scsi_eh_3
1 S     0   850     1  0  75   0 -     0 -      ?        00:00:00 khpsbpkt
1 S     0   854     1  0  76   0 -     0 -      ?        00:00:00 knodemgrd_0
1 S     0   862    11  0  70  -5 -     0 kjourn ?        00:00:00 kjournald
5 S     0   973     1  0  78  -4 -  1764 -      ?        00:00:00 udevd
1 S     0  2119    11  0  71  -5 -     0 kjourn ?        00:00:00 kjournald
1 S     0  2123    11  0  71  -5 -     0 kjourn ?        00:00:00 kjournald
1 S     0  2129    11  0  71  -5 -     0 kjourn ?        00:00:00 kjournald
1 S     0  2134    11  0  71  -5 -     0 kjourn ?        00:00:00 kjournald
1 S     0  2139    11  0  71  -5 -     0 kjourn ?        00:00:00 kjournald
1 S     0  2144    11  0  70  -5 -     0 kjourn ?        00:00:00 kjournald
1 S     0  2149    11  0  71  -5 -     0 kjourn ?        00:00:00 kjournald
1 S     0  2157    11  0  72  -5 -     0 hub_th ?        00:00:00 khubd
5 S   111  4246     1  0  76   0 -  2248 -      ?        00:00:00 portmap
5 S     0  4314     1  0  84   0 -  5305 -      ?        00:00:00 ypbind
5 S 65534  4384     1  0  84   0 -  1463 -      ?        00:00:00 rpc.statd
1 S     0  4391    11  0  71  -5 -     0 worker ?        00:00:00 rpciod/0
1 S     0  4392    11  0  71  -5 -     0 worker ?        00:00:00 rpciod/1
1 S     0  4393     1  0  85   0 -     0 -      ?        00:00:00 lockd
1 S     0  4394     1  0  76   0 -  1462 -      ?        00:00:00 mount
1 S     0  4453     1  0  76   0 -  1461 -      ?        00:00:00 mount
5 S     0  4514     1  0  76   0 -  4294 -      ?        00:00:00 sshd
0 S     0  4585     1  0  77   0 -   917 -      tty1     00:00:00 agetty
0 S     0  4586     1  0  76   0 -   917 -      tty2     00:00:00 agetty
0 S     0  4587     1  0  76   0 -   917 -      tty3     00:00:00 agetty
0 S     0  4588     1  0  76   0 -   917 -      tty4     00:00:00 agetty
0 S     0  4589     1  0  76   0 -   916 -      tty5     00:00:00 agetty
0 S     0  4590     1  0  76   0 -   916 -      tty6     00:00:00 agetty
4 S     0 14875  4514  0  75   0 -  7073 -      ?        00:00:00 sshd
4 S     0 14878 14875  0  75   0 -  2548 wait   pts/0    00:00:00 bash
0 D     0 26815     1  0  77   0 -     0 exit   pts/0    00:00:00 cc1
4 R     0  9811 14878 86  76   0 -     0 -      pts/0    00:02:59 emerge
4 S     0 17651  4514  0  75   0 -  7036 -      ?        00:00:00 sshd
4 S     0 17654 17651  0  75   0 -  2547 wait   pts/1    00:00:00 bash
0 R     0 17661 17654  0  77   0 -  1019 -      pts/1    00:00:00 ps
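
In the process list above, the only task in uninterruptible sleep (state D)
is cc1, waiting in `exit`. A small filter can pull such rows out of a long
listing. This is a sketch, assuming input in the same column layout as above
(state in the second field), which is what `ps -el` prints; the function
name `list_blocked` is hypothetical.

```shell
#!/bin/sh
# list_blocked: read a ps -el style listing on stdin and print the header
# line plus every row whose state column (field 2) is D.
list_blocked() {
    awk 'NR == 1 || $2 == "D"'
}
```

Typical use would be `ps -el | list_blocked`, run from a session that is
still responsive.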


odin ~ # cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 37
model name      : AMD Opteron(tm) Processor 252
stepping        : 1
cpu MHz         : 2592.234
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm
bogomips        : 5189.92
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 37
model name      : AMD Opteron(tm) Processor 252
stepping        : 1
cpu MHz         : 2592.234
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm
bogomips        : 5184.39
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp
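
The MAKEOPTS="-j3" value mentioned earlier matches the common rule of thumb
of one job per CPU plus one. As a sketch, that value can be derived from
cpuinfo-style output like the above. The function names are hypothetical,
and the plus-one rule is a convention, not a requirement.

```shell
#!/bin/sh
# count_cpus [FILE]: count "processor : N" stanzas in a cpuinfo-style file
# (defaults to /proc/cpuinfo).
count_cpus() {
    grep -c '^processor[[:space:]]*:' "${1:-/proc/cpuinfo}"
}

# suggest_makeopts [FILE]: print a MAKEOPTS line using CPUs + 1 jobs.
suggest_makeopts() {
    n=$(count_cpus "$1")
    echo "MAKEOPTS=\"-j$((n + 1))\""
}
```

For a two-processor listing this prints MAKEOPTS="-j3".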
