Date: Sun, 18 Apr 2010 08:13:08 -0700
Subject: Re:
[gentoo-user] Re: initramfs & RAID at boot time
From: Mark Knecht
To: gentoo-user@lists.gentoo.org

On Sat, Apr 17, 2010 at 3:01 PM, Neil Bothwick wrote:
> On Sat, 17 Apr 2010 14:36:39 -0700, Mark Knecht wrote:
>
>> Empirically, anyway, there doesn't seem to be a problem. I built the
>> new kernel and it booted normally, so I think I'm misinterpreting what
>> was written in the Wiki, or the Wiki is wrong.
>
> As long as /boot is not on RAID, or is on RAID1, you don't need an
> initrd. I've been booting this system for years with / on RAID1 and
> everything else on RAID5.
>
> --
> Neil Bothwick

Neil,
   Completely agreed, and in fact that's the way I built my new system:
/boot is just a partition, and / is a RAID1 across three partitions
marked with the 0xfd partition type, using metadata=0.90 and assembled
by the kernel. I'm using WD RAID Edition drives and an Asus Rampage II
Extreme motherboard.

   It works; however, I keep running into the sort of thing I hit this
morning when booting - both md5 and md6 have problems. Random partitions
get dropped out. It's never the same ones, and sometimes it's only one
partition out of three on the same drive - this morning sdc5 and sdc6
weren't found until I rebooted, but sda3, sdb3 & sdc3 were. Flaky
hardware? What - the motherboard? The drives? I've noticed that entering
the BIOS setup screens before allowing grub to take over seems to
eliminate the problem. Timing?

mark@c2stable ~ $ cat /proc/mdstat
Personalities : [raid0] [raid1]
md6 : active raid1 sda6[0] sdb6[1]
      247416933 blocks super 1.1 [3/2] [UU_]

md11 : active raid0 sdd1[0] sde1[1]
      104871936 blocks super 1.1 512k chunks

md3 : active raid1 sdc3[2] sdb3[1] sda3[0]
      52436096 blocks [3/3] [UUU]

md5 : active raid1 sdb5[1] sda5[0]
      52436032 blocks [3/2] [UU_]

unused devices: <none>
mark@c2stable ~ $

   For clarity, md3 is the only array needed to boot the system.
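For anyone who wants to script around output like the above: a degraded
array shows up in /proc/mdstat as an underscore inside the status
brackets (e.g. [UU_]). A small sketch - the helper name is made up, and
it assumes a stock awk - that prints the names of the degraded arrays:

```shell
# Hypothetical helper: print the name of every md array whose status
# line shows a missing member (an underscore in the [UUU] bracket).
# Reads mdstat-format text, so it can be pointed at /proc/mdstat.
degraded_arrays() {
    awk '
        /^md/ { name = $1 }            # remember the array name
        /\[[U_]+\]$/ {                 # status line, e.g. [3/2] [UU_]
            if ($NF ~ /_/) print name  # underscore = dropped member
        }
    ' "$@"
}
```

On the mdstat output above, `degraded_arrays /proc/mdstat` would print
md6 and md5, and nothing for the healthy md3 or the raid0 md11.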
The other three RAIDs aren't required until I start running apps.
However, they are all being assembled by the kernel at boot time, and I
would prefer not to do that - or at least learn how not to.

   As to why they are being assembled: I suspect it's because I marked
them all with partition type 0xfd, which may not have been the best
choice. The kernel won't touch non-0xfd partitions, and mdadm could have
assembled them later:

c2stable ~ # fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x8b45be24

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1           7       56196   83  Linux
/dev/sda2               8         530     4200997+  82  Linux swap / Solaris
/dev/sda3             536        7063    52436160   fd  Linux raid autodetect
/dev/sda4            7064       60801   431650485    5  Extended
/dev/sda5            7064       13591    52436128+  fd  Linux raid autodetect
/dev/sda6           30000       60801   247417065   fd  Linux raid autodetect
c2stable ~ #

   However, the Gentoo Wiki says we are supposed to mark everything 0xfd:

http://en.gentoo-wiki.com/wiki/RAID/Software#Setup_Partitions

   I'm not sure whether that's good advice for RAIDs that could be
assembled later, but it's what I did, and it leads to the kernel trying
to do everything before the system is fully up and mdadm is actually
running.

   Anyway, the failures happen, so I can step through and fail, remove
and add the partition back to the array.
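One possible route, sketched below - treat it as an outline, back up the
partition table first, and note the device/partition numbers are this
box's: switch the data partitions from type fd to 83 so the kernel's
autodetect (which only picks up type-0xfd partitions carrying 0.90
metadata) skips them, then let mdadm assemble them from /etc/mdadm.conf
later in the boot sequence:

```shell
# Change the non-root members from fd to plain Linux (83); repeat for
# sdb and sdc, and leave sd?3 alone since / still needs autodetect.
sfdisk --change-id /dev/sda 5 83
sfdisk --change-id /dev/sda 6 83

# Record the arrays so mdadm can assemble them later - e.g. from an
# init script after all drives have shown up. This appends ARRAY lines
# of the form "ARRAY /dev/mdN metadata=... UUID=..." to the config:
mdadm --examine --scan >> /etc/mdadm.conf
mdadm --assemble --scan
```

These are administrative commands that need root and real member
devices, so take them as a sketch rather than something verified here.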
(In this case the fail and remove steps weren't actually necessary.)

c2stable ~ # mdadm /dev/md5 -f /dev/sdc5
mdadm: set device faulty failed for /dev/sdc5: No such device
c2stable ~ # mdadm /dev/md5 -r /dev/sdc5
mdadm: hot remove failed for /dev/sdc5: No such device or address
c2stable ~ # mdadm /dev/md5 -a /dev/sdc5
mdadm: re-added /dev/sdc5
c2stable ~ # mdadm /dev/md6 -a /dev/sdc6
mdadm: re-added /dev/sdc6
c2stable ~ #

   At this point md5 is repaired and I'm waiting on md6:

c2stable ~ # cat /proc/mdstat
Personalities : [raid0] [raid1]
md6 : active raid1 sdc6[2] sda6[0] sdb6[1]
      247416933 blocks super 1.1 [3/2] [UU_]
      [====>................]  recovery = 22.0% (54525440/247416933) finish=38.1min speed=84230K/sec

md11 : active raid0 sdd1[0] sde1[1]
      104871936 blocks super 1.1 512k chunks

md3 : active raid1 sdc3[2] sdb3[1] sda3[0]
      52436096 blocks [3/3] [UUU]

md5 : active raid1 sdc5[2] sdb5[1] sda5[0]
      52436032 blocks [3/3] [UUU]

unused devices: <none>
c2stable ~ #

   How do I get past this? It's happening 2-3 times a week! I figure
that if the kernel doesn't auto-assemble the RAIDs I don't need at boot,
then I can somehow check that all the partitions are ready to go before
I start them up. This morning's exercise will have taken an hour before
I can start using the machine.

- Mark
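The "check that all the partitions are ready" idea could look something
like this - a sketch only, with a made-up helper name, and the partition
and array names below are this system's, used as placeholders:

```shell
# Return success only when every named partition is present in a
# /proc/partitions-style table (first argument = path to the table).
members_present() {
    local table=$1; shift
    local p
    for p in "$@"; do
        grep -qw "$p" "$table" || return 1
    done
}

# Usage idea, e.g. from a local boot script (untested on real hardware):
#   until members_present /proc/partitions sda5 sdb5 sdc5; do sleep 1; done
#   mdadm --assemble /dev/md5 /dev/sda5 /dev/sdb5 /dev/sdc5
```

That way the array is only assembled once all three members exist, so a
slow-to-appear drive degrades the wait instead of degrading the array.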