From: Mark Knecht
To: gentoo-amd64@lists.gentoo.org
Date: Tue, 30 Mar 2010 13:26:59 -0700
Subject: Re: [gentoo-amd64] Re: RAID1 boot - no bootable media found
Message-ID: <5bdc1c8b1003301326g3bd92c72ra4c4585dbed88f69@mail.gmail.com>
References: <5bdc1c8b1003281014w666f1cf7o20beeb736aaf7319@mail.gmail.com> <5bdc1c8b1003300656u6d1f6aa4nea031e5a60f1492@mail.gmail.com>

On Tue, Mar 30, 2010 at 11:08 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> Mark Knecht posted on Tue, 30 Mar 2010 06:56:14 -0700 as excerpted:
>
>> 3) I LOVE your idea of managing 3 /boot partitions by hand instead of
>> using RAID. Easy to do, completely testable ahead of time. If I ensure
>> that every disk can boot then no matter what disk goes down the machine
>> still works, at least a little. Not that much work and even if I don't
>> do it for awhile it doesn't matter as I can do repairs without a CD.
>> (well....)
>
> That's one of those things you only tend to realize after running a RAID
> for awhile... and possibly after having grub die, for some reason I don't
> quite understand, on just a kernel update... and realizing that had I
> setup multiple independent /boot and boot-backup partitions instead of a
> single RAID-1 /boot, I'd have had the backups to boot to if I'd have
> needed it.
>
> So call it the voice of experience! =:^)
>
> Meanwhile, glad you figured the problem out. A boot-flag-requiring-BIOS...
> that'd explain the problem for both the RAID and no-RAID version!

I've set up a duplicate boot partition on sdb and it boots. However, one
thing I'm unsure of: when I change the hard drive boot order, does the old
sdb become the new sda because it's what got booted? Or is the order still
as it was? The answer determines which drive I point to in grub.conf. I can
figure this out later by putting something different on each drive and
looking. It might be system/BIOS dependent.
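One way to check the naming from the running system, as a rough sketch
only: it assumes the persistent symlinks that udev normally creates under
/dev/disk/by-id, and the function name map_disk_ids is made up for
illustration. Running it once per boot order and comparing the output should
show whether a given physical drive keeps its sdX name.

```python
import os


def map_disk_ids(by_id_dir="/dev/disk/by-id"):
    """Return {stable_id: kernel_name} for each symlink in by_id_dir.

    Assumption: by_id_dir holds udev-style symlinks such as
    ata-<model>_<serial> -> ../../sdb. Returns {} if the directory is absent.
    """
    mapping = {}
    if not os.path.isdir(by_id_dir):
        return mapping
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if os.path.islink(link):
            # Resolve the relative link target to its kernel name (sda, sdb, ...).
            mapping[name] = os.path.basename(os.path.realpath(link))
    return mapping


if __name__ == "__main__":
    # Print kernel name first so changes between boots are easy to spot.
    for stable_id, kernel_name in map_disk_ids().items():
        print(f"{kernel_name:8} {stable_id}")
```

The IDs include the drive model and serial number, so they stay tied to the
physical disk even if the kernel hands out sda/sdb differently.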
>
> 100% waits for long periods... I've seen a number of reasons for this.
> One key to remember is that I/O backups have a way of stopping many other
> things at times. Among the reasons I've seen:
>

OK, so some new information: another person on the RAID list is
experiencing something very similar with different hardware.

As for your ideas:

> 1a) Dying disk.
> 1b) hard to read data sectors.

All new drives. smartctl says no problems reading anything, and no
registered error correction has taken place.

>
> 2) DHCP

Not using it, at least not intentionally. That doesn't mean networking isn't
doing something strange.

>
> 3) suspend the disks after a period of inactivity

This could be part of what's going on, but I don't think it's the whole
story. My drives (WD Green 1TB drives) apparently park the heads after 8
seconds (yes, 8 seconds!) of inactivity to save power. Each time a drive
parks, it increments the Load_Cycle_Count SMART parameter. I've been
tracking this on the three drives in the system. The one I'm currently using
is incrementing, while the 2 that sit unused until I get RAID going again
are not. Possibly there is something about how these drives come out of park
that creates large delays once in a while.

OK, now the only problem with that analysis is that the other guy
experiencing this problem doesn't use this drive, so that explanation
requires that he has something similar happening in his drives.
Additionally, I just used one of these drives in my dad's new machine with a
different motherboard and didn't see this problem, or didn't notice it, but
I'll go study that and see what his system does.

>
> 4) I/O priority inversion on ext3

Now this one is an interesting idea. Maybe I should try a few different
file systems for no other reason than eliminating the file system type as
the cause. Good input.

Thanks for the ideas!

Cheers,
Mark
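
P.S. For tracking Load_Cycle_Count over time, something along these lines
might do. It is a rough sketch that assumes the usual `smartctl -A`
attribute table, with the attribute name in the second column and the raw
value in the tenth. Both function names are made up for illustration.

```python
import subprocess


def parse_load_cycle_count(smartctl_output):
    """Return the raw Load_Cycle_Count from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Assumed row layout:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] == "Load_Cycle_Count":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_load_cycle_count(device):
    """Run `smartctl -A` on a device (normally needs root) and parse it."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_load_cycle_count(result.stdout)
```

Calling read_load_cycle_count("/dev/sda") every few minutes and logging the
difference would show how often each drive is parking.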