Date: Sun, 18 Apr 2010 08:13:08 -0700
Subject: Re:
[gentoo-user] Re: initramfs & RAID at boot time
From: Mark Knecht
To: gentoo-user@lists.gentoo.org

On Sat, Apr 17, 2010 at 3:01 PM, Neil Bothwick wrote:
> On Sat, 17 Apr 2010 14:36:39 -0700, Mark Knecht wrote:
>
>> Empirically, anyway, there doesn't seem to be a problem. I built the
>> new kernel and it booted normally, so I think I'm misinterpreting what
>> was written in the Wiki, or the Wiki is wrong.
>
> As long as /boot is not on RAID, or is on RAID1, you don't need an
> initrd. I've been booting this system for years with / on RAID1 and
> everything else on RAID5.
>
> --
> Neil Bothwick

Neil,
   Completely agreed, and in fact that's the way I built my new system:
/boot is just a partition, and / is a RAID1 across three partitions
marked with the 0xfd partition type, using metadata=0.90 and assembled
by the kernel. I'm using WD RAID Edition drives and an Asus Rampage II
Extreme motherboard.

   It works; however, I keep running into the sort of thing I hit this
morning when booting - both md5 and md6 have problems. Random partitions
get dropped out. It's never the same ones, and sometimes it's only one
partition out of three on the same drive - this morning sdc5 and sdc6
weren't found until I rebooted, but sda3, sdb3 & sdc3 were. Flaky
hardware? What - the motherboard? The drives? I've noticed that entering
the BIOS setup screens before allowing grub to take over seems to
eliminate the problem. Timing?

mark@c2stable ~ $ cat /proc/mdstat
Personalities : [raid0] [raid1]
md6 : active raid1 sda6[0] sdb6[1]
      247416933 blocks super 1.1 [3/2] [UU_]

md11 : active raid0 sdd1[0] sde1[1]
      104871936 blocks super 1.1 512k chunks

md3 : active raid1 sdc3[2] sdb3[1] sda3[0]
      52436096 blocks [3/3] [UUU]

md5 : active raid1 sdb5[1] sda5[0]
      52436032 blocks [3/2] [UU_]

unused devices: <none>
mark@c2stable ~ $

   For clarity, md3 is the only array needed to boot the system.
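For anyone who wants to script around output like the above: a degraded
array shows up in /proc/mdstat as an underscore inside the status
brackets (e.g. [UU_]). A small sketch - the helper name is made up, and
it assumes a stock awk - that prints the names of the degraded arrays:

```shell
# Hypothetical helper: print the name of every md array whose status
# line shows a missing member (an underscore in the [UUU] bracket).
# Reads mdstat-format text, so it can be pointed at /proc/mdstat.
degraded_arrays() {
    awk '
        /^md/ { name = $1 }            # remember the array name
        /\[[U_]+\]$/ {                 # status line, e.g. [3/2] [UU_]
            if ($NF ~ /_/) print name  # underscore = dropped member
        }
    ' "$@"
}
```

On the mdstat output above, `degraded_arrays /proc/mdstat` would print
md6 and md5, and nothing for the healthy md3 or the raid0 md11.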
The other three RAIDs aren't required until I start running apps.
However, they are all being assembled by the kernel at boot time, and I
would prefer not to do that - or at least learn how not to.

   As to why they are being assembled: I suspect it's because I marked
them all with partition type 0xfd, which may not have been the best
choice. The kernel won't touch non-0xfd partitions, and mdadm could have
assembled them later:

c2stable ~ # fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x8b45be24

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1           7       56196   83  Linux
/dev/sda2               8         530     4200997+  82  Linux swap / Solaris
/dev/sda3             536        7063    52436160   fd  Linux raid autodetect
/dev/sda4            7064       60801   431650485    5  Extended
/dev/sda5            7064       13591    52436128+  fd  Linux raid autodetect
/dev/sda6           30000       60801   247417065   fd  Linux raid autodetect
c2stable ~ #

   However, the Gentoo Wiki says we are supposed to mark everything 0xfd:

http://en.gentoo-wiki.com/wiki/RAID/Software#Setup_Partitions

   I'm not sure whether that's good advice for RAIDs that could be
assembled later, but it's what I did, and it leads to the kernel trying
to do everything before the system is fully up and mdadm is actually
running.

   Anyway, the failures happen, so I can step through and fail, remove
and add the partition back to the array.
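One possible route, sketched below - treat it as an outline, back up the
partition table first, and note the device/partition numbers are this
box's: switch the data partitions from type fd to 83 so the kernel's
autodetect (which only picks up type-0xfd partitions carrying 0.90
metadata) skips them, then let mdadm assemble them from /etc/mdadm.conf
later in the boot sequence:

```shell
# Change the non-root members from fd to plain Linux (83); repeat for
# sdb and sdc, and leave sd?3 alone since / still needs autodetect.
sfdisk --change-id /dev/sda 5 83
sfdisk --change-id /dev/sda 6 83

# Record the arrays so mdadm can assemble them later - e.g. from an
# init script after all drives have shown up. This appends ARRAY lines
# of the form "ARRAY /dev/mdN metadata=... UUID=..." to the config:
mdadm --examine --scan >> /etc/mdadm.conf
mdadm --assemble --scan
```

These are administrative commands that need root and real member
devices, so take them as a sketch rather than something verified here.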
(In this case the fail and remove steps weren't actually necessary.)

c2stable ~ # mdadm /dev/md5 -f /dev/sdc5
mdadm: set device faulty failed for /dev/sdc5: No such device
c2stable ~ # mdadm /dev/md5 -r /dev/sdc5
mdadm: hot remove failed for /dev/sdc5: No such device or address
c2stable ~ # mdadm /dev/md5 -a /dev/sdc5
mdadm: re-added /dev/sdc5
c2stable ~ # mdadm /dev/md6 -a /dev/sdc6
mdadm: re-added /dev/sdc6
c2stable ~ #

   At this point md5 is repaired and I'm waiting on md6:

c2stable ~ # cat /proc/mdstat
Personalities : [raid0] [raid1]
md6 : active raid1 sdc6[2] sda6[0] sdb6[1]
      247416933 blocks super 1.1 [3/2] [UU_]
      [====>................]  recovery = 22.0% (54525440/247416933) finish=38.1min speed=84230K/sec

md11 : active raid0 sdd1[0] sde1[1]
      104871936 blocks super 1.1 512k chunks

md3 : active raid1 sdc3[2] sdb3[1] sda3[0]
      52436096 blocks [3/3] [UUU]

md5 : active raid1 sdc5[2] sdb5[1] sda5[0]
      52436032 blocks [3/3] [UUU]

unused devices: <none>
c2stable ~ #

   How do I get past this? It's happening 2-3 times a week! I figure
that if the kernel doesn't auto-assemble the RAIDs I don't need at boot,
then I can somehow check that all the partitions are ready to go before
I start them up. This morning's exercise will have taken an hour before
I can start using the machine.

- Mark
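The "check that all the partitions are ready" idea could look something
like this - a sketch only, with a made-up helper name, and the partition
and array names below are this system's, used as placeholders:

```shell
# Return success only when every named partition is present in a
# /proc/partitions-style table (first argument = path to the table).
members_present() {
    local table=$1; shift
    local p
    for p in "$@"; do
        grep -qw "$p" "$table" || return 1
    done
}

# Usage idea, e.g. from a local boot script (untested on real hardware):
#   until members_present /proc/partitions sda5 sdb5 sdc5; do sleep 1; done
#   mdadm --assemble /dev/md5 /dev/sda5 /dev/sdb5 /dev/sdc5
```

That way the array is only assembled once all three members exist, so a
slow-to-appear drive degrades the wait instead of degrading the array.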