public inbox for gentoo-user@lists.gentoo.org
* [gentoo-user] More emerge oddity in chroot
@ 2014-04-24 12:57 Peter Humphrey
  2014-04-24 21:05 ` Philip Webb
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Peter Humphrey @ 2014-04-24 12:57 UTC
  To: gentoo-user

Hello list,

I'm wearying of this chroot operation, and I must be sounding like a tyro.

The other day emerge started hanging at the end of compilation, thus:

# emerge -1 apache-tools

[...]
make[1]: Leaving directory `/var/tmp/portage/app-admin/apache-tools-2.2.25/work/httpd-2.2.25/support'
>>> Source compiled.
 * Skipping make test/check due to ebuild restriction.
>>> Test phase [disabled because of RESTRICT=test]: app-admin/apache-tools-2.2.25

>>> Install apache-tools-2.2.25 into /var/tmp/portage/app-admin/apache-tools-2.2.25/image/ category app-admin
make[1]: Entering directory `/var/tmp/portage/app-admin/apache-tools-2.2.25/work/httpd-2.2.25/support'
mkdir /var/tmp/portage/app-admin/apache-tools-2.2.25/image/usr
mkdir /var/tmp/portage/app-admin/apache-tools-2.2.25/image/usr/sbin
make[1]: Leaving directory `/var/tmp/portage/app-admin/apache-tools-2.2.25/work/httpd-2.2.25/support'
>>> Completed installing apache-tools-2.2.25 into /var/tmp/portage/app-admin/apache-tools-2.2.25/image/

strip: i686-pc-linux-gnu-strip --strip-unneeded -R .comment -R .GCC.command.line -R .note.gnu.gold-version
   usr/sbin/htpasswd
   usr/sbin/ab
   usr/sbin/rotatelogs
   usr/sbin/logresolve
   usr/sbin/htdigest
   usr/sbin/htdbm
   usr/sbin/htcacheclean
   usr/sbin/httxt2dbm
   usr/sbin/checkgid
>>> Done.

It never comes back from there, not even with a CTRL-C; I have to "kill -9" 
from another Konsole.

Grepping ps for emerge (before the kill!) shows:

20749 pts/1    DN+    0:03 /usr/bin/python3 /usr/bin/emerge --nospinner -1 apache-tools

Man ps says that the D means "uninterruptible sleep (usually IO)".
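
For anyone chasing a similar hang, here is a minimal sketch of how one might list every task in D state along with the kernel function it is sleeping in. It assumes a procps-style ps that accepts "-o wchan:32"; the helper name filter_dstate is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Keep only lines whose second column (STAT) starts with "D",
# i.e. tasks in uninterruptible sleep. Reads ps-style lines on stdin.
# filter_dstate is a hypothetical helper name, not a standard command.
filter_dstate() {
    awk '$2 ~ /^D/'
}

# Typical use, assuming procps ps; the header line drops out on its own
# because its second column is "STAT":
#   ps -eo pid,stat,wchan:32,args | filter_dstate
```

The WCHAN column is often the most useful hint here, since it names where in the kernel the task is blocked, which can help tell a disk wait from a network filesystem wait.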

So far I've done these things:

1.	Wiped the whole system and restored from backup (heavy overkill, but I 
wanted everything to be in the same, consistent state).
2.	Run bad-blocks tests on all partitions (though all but / and /boot are in 
logical volumes - I don't know to what extent that will have affected the 
results).
3.	Remerged portage.
4.	Recompiled the kernel, 3.12.13.
5.	Booted the previous good kernel, 3.10.32.
6.	Emerged natively on the Atom box for which the chroot is a build host.

At various stages I got bizarreness like the kernel panicking at boot on a 
previously good kernel config, before it even switched to the frame buffer; or, 
during shutdown, "exiting from KVM" or some such, even though I've never used 
KVM.

Portage works fine outside the chroot and is at the same version, 2.2.8-r1. It 
has python3 set in package.use in both cases, so I also tried:

7.	Removed python3 USE flag from sys-apps/portage.

Is my hardware dying? Or maybe I am. Time to seek intelligent conversation 
down the pub.  :-(

-- 
Regards
Peter



* Re: [gentoo-user] More emerge oddity in chroot
  2014-04-24 12:57 [gentoo-user] More emerge oddity in chroot Peter Humphrey
@ 2014-04-24 21:05 ` Philip Webb
  2014-04-25  8:16   ` Peter Humphrey
  2014-04-28 12:32 ` Peter Humphrey
  2014-05-08 23:34 ` [gentoo-user] More emerge oddity in chroot Peter Humphrey
  2 siblings, 1 reply; 8+ messages in thread
From: Philip Webb @ 2014-04-24 21:05 UTC
  To: gentoo-user

140424 Peter Humphrey wrote:
> I'm wearying of this chroot operation, and I must be sounding like a tyro.
> The other day emerge started hanging at the end of compilation, thus:
> # emerge -1 apache-tools
-- details snipped -- 
>>>> Completed installing apache-tools-2.2.25 into /var/tmp/portage/app-
-- details snipped -- 
>>>> Done.
> It never comes back from there, not even with a CTRL-C;
> I have to "kill -9" from another Konsole.
> Grepping ps for emerge (before the kill!) shows:
> 20749 pts/1    DN+    0:03 /usr/bin/python3 /usr/bin/emerge --nospinner -1 
> apache-tools
> Man ps says that the D means "uninterruptible sleep (usually IO)"
> Is my hardware dying ?

It looks like hardware : how old is it ?  what is its record ?

-- 
========================,,============================================
SUPPORT     ___________//___,   Philip Webb
ELECTRIC   /] [] [] [] [] []|   Cities Centre, University of Toronto
TRANSIT    `-O----------O---'   purslowatchassdotutorontodotca




* Re: [gentoo-user] More emerge oddity in chroot
  2014-04-24 21:05 ` Philip Webb
@ 2014-04-25  8:16   ` Peter Humphrey
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Humphrey @ 2014-04-25  8:16 UTC
  To: gentoo-user

On Thursday 24 Apr 2014 17:05:25 Philip Webb wrote:
> 140424 Peter Humphrey wrote:
> > I'm wearying of this chroot operation, and I must be sounding like a tyro.
> > The other day emerge started hanging at the end of compilation, thus:
> > # emerge -1 apache-tools
> 
> -- details snipped --
> 
> >>>> Completed installing apache-tools-2.2.25 into /var/tmp/portage/app-
> 
> -- details snipped --
> 
> >>>> Done.
> > 
> > It never comes back from there, not even with a CTRL-C;
> > I have to "kill -9" from another Konsole.
> > Grepping ps for emerge (before the kill!) shows:
> > 20749 pts/1    DN+    0:03 /usr/bin/python3 /usr/bin/emerge --nospinner -1
> > apache-tools
> > Man ps says that the D means "uninterruptible sleep (usually IO)"
> > Is my hardware dying ?
> 
> It looks like hardware : how old is it ?  what is its record ?

It's only 3 or 4 years old. Never given any trouble with Gentoo, though I 
never managed to install any other distro - whichever one it was it would 
always just stop responding to anything.

Looks like time to revisit the BIOS settings. I prefer to stick with 
unadventurous defaults rather than playing around with core voltages, 
nanosecond timings etc. I've already checked all the hardware (and software) I 
can think of that would cause errors at only a single point, so it's firmware 
next.

-- 
Regards
Peter




* Re: [gentoo-user] More emerge oddity in chroot
  2014-04-24 12:57 [gentoo-user] More emerge oddity in chroot Peter Humphrey
  2014-04-24 21:05 ` Philip Webb
@ 2014-04-28 12:32 ` Peter Humphrey
  2014-04-29  0:47   ` [gentoo-user] More emerge oddity in chroot - SOLVED Peter Humphrey
  2014-05-08 23:34 ` [gentoo-user] More emerge oddity in chroot Peter Humphrey
  2 siblings, 1 reply; 8+ messages in thread
From: Peter Humphrey @ 2014-04-28 12:32 UTC
  To: gentoo-user

On Thursday 24 Apr 2014 13:57:19 I wrote:

> So far I've done these things:
> 
> 1.	Wiped the whole system and restored from backup (heavy overkill, but I
> wanted everything to be in the same, consistent state).
> 2.	Run bad-blocks tests on all partitions (though all but / and /boot are in
> logical volumes - I don't know to what extent that will have affected the
> results).

--->8

Looking at bad-blocks again, I see from gkrellm that 'mkfs.ext4 -cc -L Atom 
/dev/vg7/atom' writes the test patterns to both the underlying physical disks, 
but it only reads back from one of them.

-- 
Regards
Peter



* Re: [gentoo-user] More emerge oddity in chroot - SOLVED
  2014-04-28 12:32 ` Peter Humphrey
@ 2014-04-29  0:47   ` Peter Humphrey
  2014-05-04 18:52     ` J. Roeleveld
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Humphrey @ 2014-04-29  0:47 UTC
  To: gentoo-user

On Monday 28 Apr 2014 13:32:05 I wrote:
> On Thursday 24 Apr 2014 13:57:19 I wrote:
> > So far I've done these things:
> > 
> > 1. Wiped the whole system and restored from backup (heavy overkill, but I
> > wanted everything to be in the same, consistent state).
> > 2. Run bad-blocks tests on all partitions (though all but / and /boot  are
> > in logical volumes - I don't know to what extent that will have affected
> > the results).
> 
> --->8
> 
> Looking at bad-blocks again, I see from gkrellm that 'mkfs.ext4 -cc -L Atom
> /dev/vg7/atom' writes the test patterns to both the underlying physical
> disks, but it only reads back from one of them

... so it isn't much use on a virtual disk.

Well, that was a long weekend.

The symptoms grew stranger and stranger, until I eventually discovered a
problem with IRQ 16.

/proc/interrupts includes this line:
 16:      0   302525      0      0   IO-APIC-fasteoi  ehci_hcd:usb1, nouveau

The source file /usr/src/linux/kernel/irq/spurious.c says:

/*
 * If 99,900 of the previous 100,000 interrupts have not been handled
 * then assume that the IRQ is stuck in some manner. Drop a diagnostic
 * and try to turn the IRQ off.
 *
 * (The other 100-of-100,000 interrupts may have been a correctly
 *  functioning device sharing an IRQ with the failing one)
 */

...and suggests booting with irqpoll.

So I added irqpoll to the kernel command line. It seemed to make no difference
at the time, but I haven't had any recurrence in the last two days. I see 
though that, according to gkrellm, I have core temps of 52 - 56C and the 
graphics card shows 59C. That shouldn't be hot enough to start raising 
spurious interrupts: the nVidia web site says to expect around 105C as a 
limit. Perhaps I should find a different slot for the Quadro FX580 card, to 
separate it from the usb interface.
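
To watch whether a shared IRQ such as this one is still firing, a small sketch along the following lines may help. It sums the per-CPU counters for one IRQ from /proc/interrupts-style input; the helper name irq_total is an assumption for illustration only.

```shell
#!/bin/sh
# Sum the per-CPU counters for a single IRQ.
# $1 = IRQ number; /proc/interrupts-style text on stdin.
# Prints "<irq> <total>". irq_total is a hypothetical helper name.
irq_total() {
    awk -v irq="$1" '
        $1 == (irq ":") {
            total = 0
            # The per-CPU counts are the purely numeric fields after the label;
            # stop at the first non-numeric field (the controller name).
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
                total += $i
            print irq, total
        }'
}

# Typical use:
#   irq_total 16 < /proc/interrupts
```

Running it twice a few seconds apart and comparing the totals gives a rough interrupt rate for that line. The rate alone does not say which of the devices sharing the IRQ is responsible.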

So, many hours and much rebuilding later, I've installed a new chroot for the 
Atom and it seems to be working as expected. Actually, I reinstalled the 
entire system to be safe, including re-creating the physical and logical 
volumes on the two SATA disks.

The question still remaining is what caused millions of spurious interrupts 
over a period of a week or so and then subsided. This is an Asus P7P55D 
motherboard (http://www.asus.com/Motherboards/P7P55D/).

-- 
Regards
Peter



* Re: [gentoo-user] More emerge oddity in chroot - SOLVED
  2014-04-29  0:47   ` [gentoo-user] More emerge oddity in chroot - SOLVED Peter Humphrey
@ 2014-05-04 18:52     ` J. Roeleveld
  2014-05-05  9:17       ` Peter Humphrey
  0 siblings, 1 reply; 8+ messages in thread
From: J. Roeleveld @ 2014-05-04 18:52 UTC
  To: gentoo-user

On Tuesday, April 29, 2014 01:47:08 AM Peter Humphrey wrote:
> On Monday 28 Apr 2014 13:32:05 I wrote:
> > On Thursday 24 Apr 2014 13:57:19 I wrote:
> > > So far I've done these things:
> > > 
> > > 1. Wiped the whole system and restored from backup (heavy overkill, but
> > > I wanted everything to be in the same, consistent state).
> > > 2. Run bad-blocks tests on all partitions (though all but / and /boot
> > > are in logical volumes - I don't know to what extent that will have
> > > affected the results).
> > 
> > --->8
> > 
> > > Looking at bad-blocks again, I see from gkrellm that 'mkfs.ext4 -cc -L
> > > Atom /dev/vg7/atom' writes the test patterns to both the underlying
> > > physical disks, but it only reads back from one of them
> 
> ... so it isn't much use on a virtual disk.

I thought the man-page for badblocks (it's a separate utility) actually says
it should be run on the physical disks?

> Well, that was a long weekend.
> 
> The symptoms grew stranger and stranger, until I eventually discovered a
> problem with IRQ 16.
> 
> /proc/interrupts includes this line:
>  16:      0   302525      0      0   IO-APIC-fasteoi  ehci_hcd:usb1, nouveau
> 
> The source file /usr/src/linux/kernel/irq/spurious.c says:
> 
> ...and suggests booting with irqpoll.
> 
> So I added irqpoll to the kernel command line. It seemed to make no
> difference at the time, but I haven't had any recurrence in the last two
> days. I see though that, according to gkrellm, I have core temps of 52 -
> 56C and the graphics card shows 59C. That shouldn't be hot enough to start
> raising spurious interrupts: the nVidia web site says to expect around 105C
> as a limit. Perhaps I should find a different slot for the Quadro FX580
> card, to separate it from the usb interface.

Always a good idea, if possible.
USB and Video are both heavily used items.

> So, many hours and much rebuilding later, I've installed a new chroot for
> the Atom and it seems to be working as expected. Actually, I reinstalled
> the entire system to be safe, including re-creating the physical and
> logical volumes on the two SATA disks.
> 
> The question still remaining is what caused millions of spurious interrupts
> over a period of a week or so and then subsided. This is an Asus P7P55D
> motherboard (http://www.asus.com/Motherboards/P7P55D/).

USB mouse, keyboard, hard drive, flash stick, ...?
I.e. anything plugged into USB can cause interrupts.

Also, some mainboards have additional items pre-connected to the USB-bus.

--
Joost



* Re: [gentoo-user] More emerge oddity in chroot - SOLVED
  2014-05-04 18:52     ` J. Roeleveld
@ 2014-05-05  9:17       ` Peter Humphrey
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Humphrey @ 2014-05-05  9:17 UTC
  To: gentoo-user

On Sunday 04 May 2014 20:52:13 J. Roeleveld wrote:
> On Tuesday, April 29, 2014 01:47:08 AM Peter Humphrey wrote:
> I thought the man-page for badblocks (it's a separate utility) actually says
> it should be run on the physical disks?

Could well do so - I stopped reading when I came to this:

"... it  is  strongly  recommended  that  users not run badblocks directly, 
but rather use the -c option of the e2fsck and mke2fs programs."

:-)

> > On Monday 28 Apr 2014 13:32:05 I wrote:

> > Perhaps I should find a different slot for the Quadro FX580
> > card, to separate it from the usb interface.
> 
> Always a good idea, if possible.
> USB and Video are both heavily used items.

Yes, you're right. I'll see what I can arrange. It must be time to get the 
duster out too. Also, this morning I see this in dmesg:

[66770.347672] nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 4 
[kwin[6587]] get 0x002007524c put 0x0020075704 ib_get 0x00000068 ib_put 
0x00000072 state 0x8000753c (err: INVALID_CMD) push 0x00406040
[66770.347697] nouveau E[   PFIFO][0000:01:00.0] DMA_PUSHER - ch 4 
[kwin[6587]] get 0x002037c000 put 0x002037c054 ib_get 0x00000069 ib_put 
0x00000073 state 0x80000000 (err: INVALID_CMD) push 0x00406040

> USB-mouse, keyboard,harddrive,flash-stick,.... ?
> Eg. anything plugged into USB can cause interrupts.

No USB mouse or keyboard here; still using PS/2. I usually have a memory stick 
plugged in but I removed it during the investigation. The external USB disk 
will have been powered up for some of the time.

> Also, some mainboards have additional items pre-connected to the USB-bus.

Time to stick my head into the nVidia docs...

-- 
Regards
Peter




* Re: [gentoo-user] More emerge oddity in chroot
  2014-04-24 12:57 [gentoo-user] More emerge oddity in chroot Peter Humphrey
  2014-04-24 21:05 ` Philip Webb
  2014-04-28 12:32 ` Peter Humphrey
@ 2014-05-08 23:34 ` Peter Humphrey
  2 siblings, 0 replies; 8+ messages in thread
From: Peter Humphrey @ 2014-05-08 23:34 UTC
  To: gentoo-user

On Thursday 24 April 2014 13:57:19 Peter Humphrey wrote:
> Hello list,
> 
> I'm wearying of this chroot operation, and I must be sounding like a tyro.
> 
> The other day emerge started hanging at the end of compilation, thus:
> 
> # emerge -1 apache-tools

[...]

> >>> Completed installing apache-tools-2.2.25 into /var/tmp/portage/app-admin/apache-tools-2.2.25/image/
> 
> strip: i686-pc-linux-gnu-strip --strip-unneeded -R .comment -R
> .GCC.command.line -R .note.gnu.gold-version
>    usr/sbin/htpasswd
>    usr/sbin/ab
>    usr/sbin/rotatelogs
>    usr/sbin/logresolve
>    usr/sbin/htdigest
>    usr/sbin/htdbm
>    usr/sbin/htcacheclean
>    usr/sbin/httxt2dbm
>    usr/sbin/checkgid
> 
> >>> Done.
> 
> It never comes back from there, not even with a CTRL-C; I have to "kill -9"
> from another Konsole.

I've found what was causing this. I can hardly believe it myself, but the 
evidence is conclusive.

In my /etc/init.d/atom start script I nfs-mounted the Atom's package 
directory, but for historical reasons (latterly approaching the hysterical) I 
was passing "-o vers=3". Once I removed that, portage sprang back into life.
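
A quick way to see which NFS version each mount actually ended up with is to read the options out of /proc/mounts. The following is only a sketch: the helper name nfs_versions is invented for illustration, and it assumes the kernel reports the version as a "vers=" (or "nfsvers=") entry in the options column.

```shell
#!/bin/sh
# Print "<mountpoint> <version>" for every NFS mount found in
# /proc/mounts-style input on stdin (columns: device mountpoint fstype options ...).
# nfs_versions is a hypothetical helper name.
nfs_versions() {
    awk '$3 ~ /^nfs4?$/ {
        vers = "unknown"
        n = split($4, opt, ",")
        for (i = 1; i <= n; i++) {
            if (opt[i] ~ /^(nfs)?vers=/) {
                sub(/^(nfs)?vers=/, "", opt[i])
                vers = opt[i]
            }
        }
        print $2, vers
    }'
}

# Typical use:
#   nfs_versions < /proc/mounts
```

This only shows what was negotiated; it does not explain why a forced "vers=3" would leave portage stuck in D state, which remains unexplained here.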

Go figure, as they say on the other side of the pond.

-- 
Regards
Peter



