public inbox for gentoo-amd64@lists.gentoo.org
* [gentoo-amd64] RAID1 boot - no bootable media found
@ 2010-03-28 17:14 Mark Knecht
  2010-03-30  6:39 ` [gentoo-amd64] " Duncan
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Knecht @ 2010-03-28 17:14 UTC
  To: gentoo-amd64

Hi all,
   Sorry for a bit of cross-posting between gentoo-amd64 and
linux-raid where I first posted this earlier this morning. I figure if
it's a RAID issue they will likely see the problem but if it's a
Gentoo problem (software or documentation) or a BIOS issue then likely
the very best folks in the world are here. Thanks in advance.

   The basic issue is trying to boot from a RAID1 boot partition using
grub-static. Apparently grub itself isn't found. See the link below
for the Gentoo instructions on doing this. Note that I'm using SATA ports
on the motherboard, whereas the guide's author had some sort of add-on
controller, although he was also using software RAID.

   Hopefully the post below is self explanatory but if not ask
questions. Since I made this post I've tried a couple more things:

1) Switching to AHCI in BIOS - no change

2) Documenting drive hook-up on the DX58SO motherboard

P0: drive 1
P1: CD_RW
P2: drive 2
P3: unused
P4: drive 3
P5: unused

3) Documenting codes shown on screen:
a) When Intel logo shows up:
BA
b) After the logo goes away
E7
E7
BA
BA

These appear to be POST codes from this page:

http://www.intel.com/support/motherboards/desktop/sb/CS-025434.htm

BA 	Detecting presence of a removable media (IDE, CD-ROM detection, etc.)
E7 	Waiting for user input

My motherboard is the last in the list at the bottom of the page.

The monitor is slow to display after changes so possibly they are two
identical strings of codes and I only see the last one the first time
through.

Thanks. More info follows below.

- Mark


[PREVIOUSLY POSTED TO LINUX-RAID]

Hi,
  I brought up new hardware yesterday for my first RAID install. I
followed this Gentoo page describing a software RAID1/LVM install:

http://www.gentoo.org/doc/en/gentoo-x86+raid+lvm2-quickinstall.xml

  Note that I followed this page verbatim, even where it wasn't what I
wanted, with these exceptions:

a) My RAID1 is 3 drives instead of 2
b) I'm AMD64 Gentoo based.
c) I used grub-static

 I did this install mostly just to get a first-hand feel for how to
do a RAID install and to try out some of the mdadm commands for real.
My intention was to blow away the install if I didn't like it and do
it again for real once I started to get a clearer picture about how
things worked. For instance, this set of instructions used RAID1 for
the /boot partition, which I wasn't sure about.

  NOTE: THIS INSTALL PUTS EVERYTHING ON RAID1. (/, /boot, everything)
I didn't start out thinking I wanted to do that.

  So, the first problem is that on the reboot to see if the install
worked, the Intel BIOS reports 'no bootable media found'. I am very
unclear on how any system boots from software RAID1 before any software
is loaded, assuming I understand the Gentoo instructions. The
instructions I used to install grub were

root (hd0,0)
setup (hd0)
root (hd1,0)
setup (hd1)
root (hd2,0)
setup (hd2)

but the system finds nothing to boot from. To me this sounds like a BIOS
issue, so looking around, I see I'm currently set up for compatibility
mode, but I would think that switching to AHCI support would be a better
long-term solution. Any chance this setting is the root cause?

  I can boot from CD and assemble the /boot RAID

livecd ~ # cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>
livecd ~ # mdadm --assemble /dev/md1 /dev/sda1 /dev/sdb1 /dev/sdc1
mdadm: /dev/md1 has been started with 3 drives.
livecd ~ # cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sda1[0] sdc1[2] sdb1[1]
     112320 blocks [3/3] [UUU]

unused devices: <none>
livecd ~ # mdadm --misc --stop /dev/md1
mdadm: stopped /dev/md1
livecd ~ # cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>
livecd ~ #


Everything I expect to see on /boot seems to be there when using ls.

  Note one possible clue - when the Intel BIOS screen first
comes up I see some hex digits flashing around in the lower right.
I've not seen this before on other machines, and I believe the
motherboard (DX58SO) does support some sort of RAID in hardware, so
maybe there's confusion there? I've not selected RAID in BIOS, but
possibly it's trying to be too clever?

  Let me know what other info might be needed. I have concerns about
this install and will likely blow it away today and do a new one but I
figured maybe there's an opportunity to learn here before I do that.

Cheers,
Mark




* [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-03-28 17:14 [gentoo-amd64] RAID1 boot - no bootable media found Mark Knecht
@ 2010-03-30  6:39 ` Duncan
  2010-03-30 13:56   ` Mark Knecht
  0 siblings, 1 reply; 12+ messages in thread
From: Duncan @ 2010-03-30  6:39 UTC
  To: gentoo-amd64

Mark Knecht posted on Sun, 28 Mar 2010 10:14:03 -0700 as excerpted:

> I brought up new hardware yesterday for my first RAID install. I
> followed this Gentoo page describing a software RAID1/LVM install:
> 
> http://www.gentoo.org/doc/en/gentoo-x86+raid+lvm2-quickinstall.xml
> 
>   Note that I followed this page verbatim, even if it wasn't what I
> wanted, with exceptions:
> 
> a) My RAID1 is 3 drives instead of 2
> b) I'm AMD64 Gentoo based.
> c) I used grub-static

Have you gotten anything off the other list?  I see no other replies here.
Do you still have that install, or are you starting over as you mentioned you might?

That post was a bit long to quote in full, and somewhat disordered to try 
to reply per element, so I just quoted the above and will cover a few 
things as I go.

1) I'm running kernel/md RAID here, too (and was formerly running LVM2, 
which is what I expect you mean by LVM, and I'll continue simply calling 
it LVM), so I know some about it.

2) The Gentoo instructions don't say to, but just in case... you didn't 
put /boot and / on LVM, only on the RAID-1, correct?  LVM is only for non-
root non-boot.  (Actually, you can put / on LVM, if and only if you run an 
initrd/initramfs, but it significantly complicates things.  Keeping / off 
of LVM simplifies things considerably, so I'd recommend it.)  This is 
because while the kernel can auto-detect and configure RAID (or the RAID 
config can be fed to it on the command line), it cannot by itself 
figure out how to configure LVM -- only the LVM userspace knows how to 
read and configure LVM, so an LVM userspace and config must be available 
before it can be loaded.  This can be accomplished by using an initrd/
initramfs with LVM loaded on it, but things are MUCH less complex if / 
isn't LVM, so the LVM tools can be loaded from the normal /.

3) You mention not quite understanding how /boot works on md/RAID -- how 
does grub know where to look?  Well, it only works on md/kernel RAID-1, 
and that only because RAID-1 is basically the same as a non-RAID setup, 
only instead of one disk, there's several, each a mirror duplicate of the 
others (but for a bit of RAID metadata).  Thus, grub basically treats each 
disk as if it wasn't in RAID, and it works, because it's organized almost 
the same as if it wasn't in RAID.  That's why you have to install grub 
separately to each disk, because it's treating them as separate disks, not 
RAID mirrors.  But it doesn't work with other RAID levels because they mix 
up data stripes, and grub doesn't know anything about that.

4) Due to personal experience recovering from a bad disk (pre-RAID, that's 
why I switched to RAID), I'd actually recommend putting everything portage 
touches or installs to on / as well.  That way, everything is kept in sync 
and you don't get into a situation where / including /bin /sbin and /etc 
are a snapshot from one point in time, while portage's database in /var/db 
is a different one, and stuff installed to /usr may be an entirely 
different one.  Not to mention /opt if you have anything installed 
there...  If all that's on /, then it should all remain in sync.  Plus 
then you don't have to worry about something boot-critical being installed 
to /usr, which isn't mounted until about midway thru the boot cycle.

4 cont) What then goes on other partitions is subdirs of the above: 
/usr/local, very likely, as you'll probably want to keep it if you 
reinstall; /home, for the same reason; /var/log, so a runaway log can't 
eat up all the space on / -- it's limited to eating up everything on the 
log partition; likely /tmp, which I have as tmpfs here but which otherwise 
you may well want to be RAID-0 for speed; /var/tmp, which here is a symlink 
to my /tmp so it's on tmpfs too; and very possibly /usr/src and the linux 
kernel tree it contains, as RAID-0 is fine for that since it can simply be 
redownloaded off the net if need be.  Same with your portage dir, 
/usr/portage by default, tho you can point it elsewhere if you want (maybe 
to the same partition holding /usr/src, but if you use FEATURES=buildpkg, 
you probably want your packagedir on something with some redundancy, so 
not on the same RAID-0), etc...  If you have a system-wide mail setup 
with multiple users, you may want a separate mail partition as well (if 
not, part of /home is fine).  Desktop users may well find a separate, 
likely BIG, partition for their media storage is useful, etc...  FWIW, 
the / partition on my ~amd64 workstation with kde4 is 5 gigs (according to 
df).  On my slightly more space constrained 32-bit netbook, it's 4 gigs.  
Used space on both is ~2.4 gigs, with the various other partitions as 
mentioned separate, but with everything portage touches on /.  (That 
compares to what appears to be a 1-gig / md3 root in the guide, with /var 
and /usr on their own partitions/volumes, but they have an 8 gig /usr, a 4 
gig /var, and a 4 gig /opt, totaling 17 gigs, that's mostly on that 4-5 
gig /, here.)
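
To make that concrete, here's a purely illustrative split (the md numbers 
and sizes are made up for the example, they're not the guide's):

/dev/md1   /boot        RAID-1, small (~100 MB)
/dev/md3   /            RAID-1, 4-5 gigs, everything portage touches
/dev/md5   /home        RAID-1
/dev/md6   /var/log     RAID-1, small
/dev/md7   /usr/src, /usr/portage    RAID-0 (easily redownloaded)
tmpfs      /tmp, /var/tmp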

5) The hexadecimal digits you mentioned during the BIOS POST process 
indicate, as you guessed, BIOS POST and config process progress.  I wasn't 
aware that they're documented, but as your board is an Intel and the link 
you mentioned appears to be Intel documentation for them, it seems in your 
case they are, which is nice. =:^)

6) Your BIOS has slightly different SATA choices than mine.  Here, I have 
RAID or JBOD (plain SATA, "just a bunch of disks") as my two choices.  
JBOD mode would compare to your AHCI, which is what I'd recommend. (Seems 
Intel wants AHCI to be a standard, thus killing the need for individual 
SATA controller drivers like the SATA_SIL drivers I run here.  That'd be 
nice, but I don't know how well it's being accepted by others.) 
Compatibility mode will likely slow things down, and RAID mode would be 
firmware based RAID, which on Linux would be supported by the device-
mapper (as is LVM2).  JBOD/SATA/AHCI mode, with md/kernel RAID, is 
generally considered a better choice than firmware RAID with device-mapper 
support, well, unless you need MSWormOS RAID compatibility, in which case 
the firmware/device-mapper mode is probably better as it's more compatible.

6 cont) So I'd recommend AHCI.  However, the on-disk layout may be 
different between compatibility and AHCI mode, so it's possible the disk 
won't be readable after switching and you'd need to repartition and 
reinstall, which you were planning on doing anyway, so no big deal.


OK, now that those are covered... what's wrong with your boot?

Well, there's two possibilities.  Either the BIOS isn't finding grub 
stage-1, or grub stage-1 is found and loaded, but it can't find stage 1.5 
or 2, depending on what it needs for your setup.  Either way, that's a 
grub problem.  As long as you didn't make the mistake of putting /boot on 
your LVM, which grub doesn't grok, and since it can pretend md/kernel 
RAID-1 is an ordinary disk, we really don't need to worry about the md/
RAID or LVM until you can at LEAST get to the grub menu/prompt.  

So we have a grub problem.  That's what we have to solve first, before we 
deal with anything else.

Based on that, here's the relevant excerpt from your post (well, after a 
bit of a detour I forgot to include in the above, so we'll call this point 
7):

> NOTE: THIS INSTALL PUTS EVERYTHING ON RAID1. (/, /boot, everything)
> I didn't start out thinking I wanted to do that.

7) Well, not quite.  /boot and / are on RAID-1, yes.  But the guide puts 
the LVM2 physical volume on md4, which is created as RAID-0/striped.  I 
don't really agree with that as striped is fast but has no redundancy.  
Why you'd put stuff like /home, /usr (including stuff you may well want to 
keep in /usr/local), /var (including portage's package database in /var/
db), and presumably additional partitions as you may create them (media 
and mail partitions were the examples I mentioned above) on a non-
redundant RAID-0, I don't know.  That'd be what I wanted on RAID-1, here, 
to make sure I still had copies of it if any of the disks died.

7 cont) Actually, given that md/raid is now partitionable (years ago it 
wasn't, with LVM traditionally layered on top to overcome that), and after 
some experience of my own with LVM, I decided it wasn't worth the hassle 
of the extra LVM layer here, and when I redid my system last year, I 
killed the LVM and just use partitioned md/kernel RAID now.  If you want 
the flexibility of LVM, great, but here, I decided it simply wasn't worth 
the extra hassle of maintaining it.  So I'd recommend NOT using LVM and 
thus not having to worry about it.  But it's your choice.

OK, now on to the grub issue...

> So, the first problem is that on the reboot to see if the install
> worked the Intel BIOS reports 'no bootable media found'. I am very
> unclear how any system boots software RAID1 before software is loaded,
> assuming I understand the Gentoo instructions. The instructions I used
> to install grub where
>
> root (hd0,0)
> setup (hd0)
> root (hd1,0)
> setup (hd1)
> root (hd2,0)
> setup (hd2)

That /looks/ correct.  But particularly with RAID, grub's mapping between 
BIOS drives, kernel drives and grub drives, sometimes gets mixed up.  
That's one of the things I always hate touching, since I'm never quite 
sure if it's going to work or not, or that I'm actually telling it to 
setup where I think I'm telling it to setup, until I actually test it.

Do you happen to have a floppy on that machine?  If so, probably the most 
error resistant way to handle it is to install grub to a floppy disk, 
which unlike thumb drives and possibly CD/DVD drives, has no potential to 
interfere with the hard drive order as seen by BIOS.  Then boot the floppy 
disk to the grub menu, and run the setup from there.
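
For reference, a grub boot floppy can be made by dd-ing the stage files 
straight onto it, roughly as the grub legacy manual describes (the 
/boot/grub paths below are just where grub-static normally puts them, 
adjust as needed); it boots to the grub prompt:

dd if=/boot/grub/stage1 of=/dev/fd0 bs=512 count=1
dd if=/boot/grub/stage2 of=/dev/fd0 bs=512 seek=1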

One thing I discovered here is that I could only setup one disk at a time, 
regardless of where I was doing it from (in Linux, from a floppy grub 
menu, or from a bootable USB stick's grub boot menu).  Changing the root 
would seem to work after the first setup, but the second setup would have 
some weird error and testing a boot from that disk wouldn't work, so 
obviously it didn't take.

But doing it a disk at a time, root (hd0,0) , setup (hd0), reboot (or 
restart grub if doing it from Linux), root (hd1,0), setup (hd1), reboot... 
same thing for each additional disk (you have three, I have four). THAT 
worked.
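
In practice, from the grub shell on the running system, that sequence 
looks something like this (a rough sketch only; which (hdN) is which 
depends on your /boot/grub/device.map, so double-check with grub's find 
command first):

grub --no-floppy
grub> find /grub/stage1     # lists the (hdN,0) devices grub sees its files on
grub> root (hd0,0)
grub> setup (hd0)
grub> quit
(reboot or restart the grub shell, then repeat with (hd1,0)/(hd1), then (hd2,0)/(hd2))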

However you do it, test them, both with all disks running, and with only 
one running (turn off or disconnect the others).  Having a RAID-1 system 
and installing grub to all the disks isn't going to do you a lot of good 
if when one dies, you find that it was the only one that had grub 
installed correctly!

There's another alternative that I'd actually recommend instead, however.  
The problem with a RAID-1 /boot is that if you somehow screw up something 
while updating /boot, since it's RAID-1, you've screwed it up on all 
mirrors of that RAID-1.  Since RAID-1 is simply mirroring the data across 
the multiple disks, it can be better to not RAID that partition at all, 
but to have each disk have its own /boot partition, un-RAIDed, which 
effectively becomes a /boot and one more (two in your case of three disks, 
three in my case of four disks, tho here I actually went with two separate 
RAID-1s instead) /boot backups.

That solves a couple problems at once.  First of all, when you first 
install, you install to just one, as an ordinary disk, test it, and when 
it's working and booting, you can copy that install to the others, and do 
the boot sector grub setup on each one separately, as its own disk, having 
tested that the first one is working.  Then you'd test each of the others 
as well.

Second, when you upgrade, especially when you upgrade grub, but also when 
you upgrade the kernel, you only upgrade the one.  If it works, great, you 
can then upgrade the others.  If it fails, no big deal, simply set your 
BIOS to boot from one of the others instead, and you're back to a known 
working config, since you had tested it after the /last/ upgrade, and you 
didn't yet do this upgrade to it since you were just testing this upgrade 
and it broke before you copied it to your backups.

So basically, the only difference here as opposed to the guide, is that 
you don't create /dev/md1, you configure and mount /dev/sda1 as /boot, and 
when you have your system up and running, /then/ you go back and setup 
/dev/sdb1 as your backup boot (say /mnt/boot/).  And when you get it setup 
and tested working, then you do the same thing for /dev/sdc1, except that 
you can use the same /mnt/boot/ backup mount-point when mounting it as 
well, since presumably you won't need both backups mounted at once.

Everything else will be the same, and as it was RAID-1/mirrored, you'll 
have about the same space in each /dev/sd[abc]1 partition as you did in 
the combined md1.
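
As a rough sketch of that backup setup (untested as written; the devices, 
filesystem and paths are examples only):

mkdir -p /mnt/boot
mount /dev/sdb1 /mnt/boot
cp -a /boot/. /mnt/boot/              # kernels, grub files, configs
nano /mnt/boot/grub/grub.conf         # adjust titles and root device for this disk
umount /mnt/boot
# then install grub's boot sector to that disk from the grub shell:
#   grub> root (hd1,0)
#   grub> setup (hd1)
# and repeat for /dev/sdc1, reusing the same /mnt/boot mount point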

As for upgrading the three separate /boot and backups, as I mentioned, 
when you upgrade grub, DEFINITELY only upgrade one at a time, and test 
that the upgrade worked and you can boot from it before you touch the 
others.  For kernel upgrades, it doesn't matter too much if the backups 
are a bit behind, so you don't have to upgrade them for every kernel 
upgrade.  If you run kernel rcs or git-kernels, as I do, I'd suggest 
upgrading the backups once per kernel release (so from 2.6.32 to 2.6.33, 
for instance), so the test kernels are only on the working /boot, not its 
backups, but the backups contain at least one version of the last two 
release kernels.  Pretty much the same if you run upstream stable kernels 
(so 2.6.33, 2.6.33.1, 2.6.33.2...), or Gentoo -rX kernels.  Keep at least 
one of each of the last two kernels on the backups, tested to boot of 
course, and only update the working /boot for the stable or -rX bumps.

If you only upgrade kernels once a kernel release cycle or less (maybe 
you're still running 2.6.28.x or something), then you probably want to 
upgrade and test the backups as soon as you've upgraded and tested a new 
kernel on the working /boot.
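
When you do refresh a backup after a tested upgrade, it's just a copy 
again, something like the following (illustrative; note the exclude, so 
each disk keeps its own grub.conf):

mount /dev/sdb1 /mnt/boot
rsync -a --delete --exclude=grub/grub.conf /boot/ /mnt/boot/
umount /mnt/boot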

Hope it helps...

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman





* Re: [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-03-30  6:39 ` [gentoo-amd64] " Duncan
@ 2010-03-30 13:56   ` Mark Knecht
  2010-03-30 18:08     ` Duncan
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Knecht @ 2010-03-30 13:56 UTC
  To: gentoo-amd64

On Mon, Mar 29, 2010 at 11:39 PM, Duncan <1i5t5.duncan@cox.net> wrote:
<SNIP>
>
> Hope it helps...
>
> --
> Duncan - List replies preferred.   No HTML msgs.

Immensely!

OK, a quick update and keeping it short for now:

1) I dumped the RAID install Sunday. It's new hardware and it wasn't
booting; I didn't know why, but I do know how to install Gentoo without
RAID, so I went for simple instead of what I wanted. That said, the
machine still didn't boot. Same "no bootable media" message. After
scratching my head for an hour it dawned on me that maybe this BIOS
actually requires /boot to be marked bootable. I changed that and the
non-RAID install booted and is running Gentoo at this time. This is
the only machine of the 10 in this house that requires that, but at least
it's a reasonable fix. The machine now runs XFCE & Gnome, X seems fine
so far, I haven't messed with sound, etc., so the bootable flag was the
key this time around.
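
(For reference, toggling that flag is just fdisk's 'a' command --
illustrative, run against whichever disk holds the /boot partition:

fdisk /dev/sda
Command (m for help): a          # toggle the bootable flag
Partition number (1-4): 1
Command (m for help): w          # write the table and quit
)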

2) Even non-RAID I'm having some troubles with the machine. (kernel
bugs in dmesg)  I've asked a question in the LKML and gotten one
response, as well as on the Linux-RAID list, but I'm not making much
headway there yet. I'll likely post something here today or tomorrow
in some other thread with a better title having to do with 100% waits
for long periods of time. Those are probably non-Gentoo so I am
hesitant to start a thread here and bother anyone but I suspect that
you or others will probably have some good ideas at least about what
to look at.

3) I LOVE your idea of managing 3 /boot partitions by hand instead of
using RAID. Easy to do, completely testable ahead of time. If I ensure
that every disk can boot then no matter what disk goes down the
machine still works, at least a little. Not that much work and even if
I don't do it for awhile it doesn't matter as I can do repairs without
a CD. (well....)

4) You're correct that the guide did md4 as striped. I forgot to say
that I didn't. I used RAID1 there also as my needs for this machine
are reliability not speed.

5) Last for now, I figured that once the machine was running non-RAID
I could always redo the RAID1 install from within Gentoo instead of
using the install CD. That's where I'm at now but not sure if I'll do
that yet due to the issue in #2.

As always, thanks very much for the detailed post. Lots of good stuff there!

Cheers,
Mark




* [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-03-30 13:56   ` Mark Knecht
@ 2010-03-30 18:08     ` Duncan
  2010-03-30 20:26       ` Mark Knecht
  0 siblings, 1 reply; 12+ messages in thread
From: Duncan @ 2010-03-30 18:08 UTC
  To: gentoo-amd64

Mark Knecht posted on Tue, 30 Mar 2010 06:56:14 -0700 as excerpted:

> 3) I LOVE your idea of managing 3 /boot partitions by hand instead of
> using RAID. Easy to do, completely testable ahead of time. If I ensure
> that every disk can boot then no matter what disk goes down the machine
> still works, at least a little. Not that much work and even if I don't
> do it for awhile it doesn't matter as I can do repairs without a CD.
> (well....)

That's one of those things you only tend to realize after running a RAID 
for awhile... and possibly after having grub die, for some reason I don't 
quite understand, on just a kernel update... and realizing that had I 
setup multiple independent /boot and boot-backup partitions instead of a 
single RAID-1 /boot, I'd have had the backups to boot to if I'd needed it.

So call it the voice of experience! =:^)

Meanwhile, glad you figured the problem out.  A boot-flag-requiring-
BIOS... that'd explain the problem for both the RAID and no-RAID version!

100% waits for long periods...  I've seen a number of reasons for this. 
One key to remember is that I/O backups have a way of stopping many other 
things at times.  Among the reasons I've seen:

1a) Dying disk.  I've had old disks that would sometimes take some time to 
respond, especially if they had gone to sleep.  If you hear several clicks 
(aka "the click of death") as it resets the disk and tries again... it's 
time to think about either replacing the old disk, or sending in the new 
one for a replacement.  

1b) Another form of that is hard-to-read data sectors.  It'll typically 
try to read a bad sector quite a number of times, often for several 
minutes at a time, before either giving up or reading it correctly.  
Again, if you're seeing this, get a new disk and get your data transferred 
before it's too late!

2) I think this one was fixed and I only read of it, I didn't experience 
it myself.  Back some time ago, if a network interface were active using 
DHCP, but couldn't get a response from a DHCP server, it could cause 
pretty much the entire system to hang for some time, every time the fake/
random address normally assigned from the zero-conf reserved netblock 
expired.  The system would try to find a DHCP server again, and again, if 
one didn't answer, would eventually assign the zero-conf block fake/random 
address again, but would cause a system hang of up to a minute (the default 
timeout, AFAIK), before it would do so.  Again, this /should/ have been 
fixed quite some time ago, but one can never be sure what similar symptom 
bug may be lurking in some hardware or other.

3) Back to disks, but not the harbinger of doom that #1 is, perhaps your 
system is simply set to suspend the disks after a period of inactivity, 
and it takes them some time to spin back up.  I've had this happen to me, 
but it was years ago and back on MS.  But because of the issues I've had 
more recently with (1), I'm sure it'd still be an issue in some 
configurations.  (Fortunately, laptop mode on my netbook with 120 gig SATA 
hard drive seems to work very well and almost invisibly, to the point I 
don't worry about disk sleep there at all, as the resume is smooth enough 
I basically don't even notice -- save for the extra hour and a half of 
runtime I normally get with laptop mode active! FWIW, the thing "just 
works" in terms of both suspend2ram and hibernate/suspend2disk, as well.  
=:^)

4) Kernels before... 2.6.30 I believe... could occasionally exhibit a read/
write I/O priority inversion on ext3.  The problem had existed for 
some time, but was attributed to the normal effects of the then-default 
data=ordered as opposed to data=writeback journaling, until some massive 
stability issues with ext4 (which ubuntu had just deployed as a non-
default option for their new installs, the problem came in combining that 
with stuff like the unstable black-box nVidia drivers, which crashed 
systems in the middle of writes on occasion!) prompted a reexamination of 
a number of related previous assumptions.  2.6.30 had a quick-fix.  2.6.31 
had better fixes, and additionally and quite controversially, switched 
ext3 defaults to data=writeback, which with the new fixes, was judged 
sufficiently stable to be the default.  (As a reiserfs user who lived thru 
the period before it got proper data=ordered, I'll never trust 
data=writeback again, so I disagree with Linus' decision to make it the 
ext3 default, but at least I can change that on the systems I run.)  So if 
you're running a kernel older than 2.6.30 or .31, this could potentially 
be an issue, tho it's unlikely to be /too/ bad under normal conditions.

Those are the possibilities I know of.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman





* Re: [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-03-30 18:08     ` Duncan
@ 2010-03-30 20:26       ` Mark Knecht
  2010-03-31  6:56         ` Duncan
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Knecht @ 2010-03-30 20:26 UTC
  To: gentoo-amd64

On Tue, Mar 30, 2010 at 11:08 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> Mark Knecht posted on Tue, 30 Mar 2010 06:56:14 -0700 as excerpted:
>
>> 3) I LOVE your idea of managing 3 /boot partitions by hand instead of
>> using RAID. Easy to do, completely testable ahead of time. If I ensure
>> that every disk can boot then no matter what disk goes down the machine
>> still works, at least a little. Not that much work and even if I don't
>> do it for awhile it doesn't matter as I can do repairs without a CD.
>> (well....)
>
> That's one of those things you only tend to realize after running a RAID
> for awhile... and possibly after having grub die, for some reason I don't
> quite understand, on just a kernel update... and realizing that had I
> setup multiple independent /boot and boot-backup partitions instead of a
> single RAID-1 /boot, I'd have had the backups to boot to if I'd have
> needed it.
>
> So call it the voice of experience! =:^)
>
> Meanwhile, glad you figured the problem out.  A boot-flag-requiring-
> BIOS... that'd explain the problem for both the RAID and no-RAID version!

I've set up a duplicate boot partition on sdb and it boots. However,
one thing I'm unsure of: when I change which hard drive boots, does the
old sdb become the new sda because it's what got booted? Or is the order
still as it was? The answer determines what I do in grub.conf as to
which drive I'm trying to use. I can figure this out later by putting
something different on each drive and looking. Might be system/BIOS
dependent.

>
> 100% waits for long periods...  I've seen a number of reasons for this.
> One key to remember is that I/O backups have a way of stopping many other
> things at times.  Among the reasons I've seen:
>

OK, some new information: another person on the RAID list is
experiencing something very similar with different hardware.

As for your ideas:

> 1a) Dying disk.
> 1b) hard to read data sectors.

All new drives, smartctl says no problems reading anything and no
registered error correction has taken place.

>
> 2) DHCP

Not using it, at least not intentionally. Doesn't mean networking
isn't doing something strange.

>
> 3) suspend the disks after a period of inactivity

This could be part of what's going on, but I don't think it's the
whole story. My drives (WD Green 1TB drives) apparently park the heads
after 8 seconds (yes 8 seconds!) of inactivity to save power. Each
time it parks it increments the Load_Cycle_Count SMART parameter. I've
been tracking this on the three drives in the system. The one I'm
currently using is incrementing while the 2 that sit unused until I
get RAID going again are not. Possibly there is something about how
these drives come out of park that creates large delays once in
awhile.
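
(For reference, I've just been watching it with something like the
following; attribute 193 is the park counter:

smartctl -A /dev/sda | grep -E 'Power_On_Hours|Load_Cycle_Count'
)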

OK, now the only problem with that analysis is that the other guy
experiencing this problem doesn't use this drive, so that theory
requires that he has something similar happening in his drives.
Additionally, I just used one of these drives in my dad's new machine
with a different motherboard and didn't see this problem, or didn't
notice it, but I'll go study that and see what his system does.

>
> 4) I/O priority inversion on ext3

Now this one is an interesting idea. Maybe I should try a few
different file systems for no other reason than eliminating the file
system type as the cause. Good input.

Thanks for the ideas!

Cheers,
Mark




* [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-03-30 20:26       ` Mark Knecht
@ 2010-03-31  6:56         ` Duncan
  2010-04-01 18:57           ` Mark Knecht
  0 siblings, 1 reply; 12+ messages in thread
From: Duncan @ 2010-03-31  6:56 UTC
  To: gentoo-amd64

Mark Knecht posted on Tue, 30 Mar 2010 13:26:59 -0700 as excerpted:

> I've set up a duplicate boot partition on sdb and it boots. However, one
> thing I'm unsure of: when I change which hard drive boots, does the old sdb
> become the new sda because it's what got booted? Or is the order still
> as it was? The answer determines what I do in grub.conf as to which
> drive I'm trying to use. I can figure this out later by putting
> something different on each drive and looking. Might be system/BIOS
> dependent.

That depends on your BIOS.  My current system (the workstation, now 6+ 
years old but still going strong as it was a $400+ server grade mobo) will 
boot from whatever disk I tell it to, but keeps the same BIOS disk order 
regardless -- unless I physically turn one or more of them off, of 
course.  My previous system would always switch the chosen boot drive to 
be the first one.  (I suppose it could be IDE vs. SATA as well, as the 
switcher was IDE, the stable one is SATA-1.)

So that's something I guess you figure out for yourself.  But it sounds 
like you're already well on your way...

>> 100% waits for long periods...

>> 1a) Dying disk.
>> 1b) hard to read data sectors.
> 
> All new drives, smartctl says no problems reading anything and no
> registered error correction has taken place.

That's good. =:^)  Tho of course there's an infant mortality period of the 
first few (1-3) months, too, before the statistics settle down.  So just 
because they're new doesn't necessarily mean they're not bad.

FWIW, I switched to RAID after having two drives go out at almost 
exactly the one-year point.  Needless to say I was a bit paranoid.  So when I 
got the new set to setup as RAID, the first thing I did (before I 
partitioned or otherwise put any data of value on them) was run
badblocks -w on all of them.  It took well over a day, actually ~3 days 
IIRC but don't hold me to the three.  Luckily, doing them in parallel 
didn't slow things down any, as it was the physical disk speed that was 
the bottleneck.  But I let the thing finish on all four drives, and none 
of them came up with a single badblock in any of the four patterns.  
Additionally, after writing and reading back the entire drive four times, 
smart still said nothing relocated or anything.  So I was happy.  And the 
drives have served me well, tho they're probably about at the end of their 
five year warranty right about now.

The point being... it /is/ actually possible to verify that they're 
working well before you fdisk/mkfs and load data.  Tho it does take 
awhile... days... on drives of modern size.

>> 3) suspend the disks after a period of inactivity
> 
> This could be part of what's going on, but I don't think it's the whole
> story. My drives (WD Green 1TB drives) apparently park the heads after 8
> seconds (yes 8 seconds!) of inactivity to save power. Each time it parks
> it increments the Load_Cycle_Count SMART parameter. I've been tracking
> this on the three drives in the system. The one I'm currently using is
> incrementing while the 2 that sit unused until I get RAID going again
> are not. Possibly there is something about how these drives come out of
> park that creates large delays once in awhile.

You may wish to take a second look at that, for an entirely /different/ 
reason.  If those are the ones I just googled on the WD site, they're 
rated 300K load/unload cycles.  Take a look at your BIOS spin-down 
settings, and use hdparm to get a look at the disk's powersaving and 
spindown settings.  You may wish to set the disks to something more 
reasonable, as with 8 second timeouts, that 300k cycles isn't going to 
last so long...

You may recall a couple years ago when Ubuntu accidentally shipped with 
laptop mode (or something, IDR the details) turned on by default, and 
people were watching their drives wear out before their eyes.  That's 
effectively what you're doing, with an 8-second idle timeout.  Most laptop 
drives (2.5" and 1.8") are designed for it.  Most 3.5" desktop/server 
drives are NOT designed for that tight an idle timeout spec, and in fact, 
may well last longer spinning at idle overnight, as opposed to shutting 
down every day even.

I'd at least look into it, as there's no use wearing the things out 
unnecessarily.  Maybe you'll decide to let them run that way and save the 
power, but you'll know about the available choices then, at least.
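
Something like the following will at least show and adjust what the 
drive's own power management reports (a sketch only -- note that on some 
WD Green firmware the 8-second park timer is a separate vendor setting 
that APM may not touch at all):

hdparm -B /dev/sda        # show the current APM (Advanced Power Management) level
hdparm -B 254 /dev/sda    # least aggressive APM level short of disabling it
hdparm -S 0 /dev/sda      # disable the drive's own standby/spin-down timer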

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman





* Re: [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-03-31  6:56         ` Duncan
@ 2010-04-01 18:57           ` Mark Knecht
  2010-04-02  9:43             ` Duncan
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Knecht @ 2010-04-01 18:57 UTC
  To: gentoo-amd64

A bit long in response. Sorry.

On Tue, Mar 30, 2010 at 11:56 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> Mark Knecht posted on Tue, 30 Mar 2010 13:26:59 -0700 as excerpted:
>
>> I've set up a duplicate boot partition on sdb and it boots. However, one
>> thing I'm unsure of: when I change which hard drive boots, does the old sdb
>> become the new sda because it's what got booted? Or is the order still
>> as it was? The answer determines what I do in grub.conf as to which
>> drive I'm trying to use. I can figure this out later by putting
>> something different on each drive and looking. Might be system/BIOS
>> dependent.
>
> That depends on your BIOS.  My current system (the workstation, now 6+
> years old but still going strong as it was a $400+ server grade mobo) will
> boot from whatever disk I tell it to, but keeps the same BIOS disk order
> regardless -- unless I physically turn one or more of them off, of
> course.  My previous system would always switch the chosen boot drive to
> be the first one.  (I suppose it could be IDE vs. SATA as well, as the
> switcher was IDE, the stable one is SATA-1.)
>
> So that's something I guess you figure out for yourself.  But it sounds
> like you're already well on your way...
>

It seems to be a constant mapping, meaning (I guess) that I need to
change the drive specs in grub.conf on the second drive to actually
use the second drive.

I made the titles for booting different for each grub.conf file to
ensure I was really getting grub from the second drive. My sda grub
boot menu says "2.6.33-gentoo booting from sda" on the first drive,
sdb on the second drive, etc.

<SNIP>
>
> The point being... it /is/ actually possible to verify that they're
> working well before you fdisk/mkfs and load data.  Tho it does take
> awhile... days... on drives of modern size.
>

I'm trying badblocks right now on sdc. using

badblocks -v /dev/sdc

Maybe I need to do something more strenuous? It looks like it will be
done in an hour or two. (i7-920 with SATA drives so it should be fast,
as long as I'm not just reading the buffers or something like that.)

Roughly speaking, 1TB read at 100MB/s should take 10,000 seconds or 2.7
hours. I'm at 18% after 28 minutes so that seems about right. (With no
errors so far, assuming I'm using the right command.)

>>> 3) suspend the disks after a period of inactivity
>>
>> This could be part of what's going on, but I don't think it's the whole
>> story. My drives (WD Green 1TB drives) apparently park the heads after 8
>> seconds (yes 8 seconds!) of inactivity to save power. Each time it parks
>> it increments the Load_Cycle_Count SMART parameter. I've been tracking
>> this on the three drives in the system. The one I'm currently using is
>> incrementing while the 2 that sit unused until I get RAID going again
>> are not. Possibly there is something about how these drives come out of
>> park that creates large delays once in awhile.
>
> You may wish to take a second look at that, for an entirely /different/
> reason.  If those are the ones I just googled on the WD site, they're
> rated 300K load/unload cycles.  Take a look at your BIOS spin-down
> settings, and use hdparm to get a look at the disk's powersaving and
> spindown settings.  You may wish to set the disks to something more
> reasonable, as with 8 second timeouts, that 300k cycles isn't going to
> last so long...

Very true. Here is the same drive model I put in a new machine for my
dad. It's been powered up and running Gentoo as a typical desktop
machine for about 50 days. He doesn't use it more than about an hour a
day on average. It's already hit 31K load/unload cycles. At 10% of
300K, that's about 1.5 years of life before I hit that spec. I've watched
his system a bit and it seems to add 1 to the count almost
exactly every 2 minutes on average. Is that a common cron job maybe?

I looked up the spec on all three WD lines - Green, Blue and Black.
All three were 300K cycles. This issue has come up on the RAID list.
It seems that some other people are seeing this and aren't exactly
sure what Linux is doing to cause this.

I'll study hdparm and BIOS when I can reboot.

My dad's current data:

gandalf ~ # smartctl -A /dev/sda
smartctl 5.39.1 2010-01-28 r3054 [x86_64-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   129   128   021    Pre-fail  Always       -       6525
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       21
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1183
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       20
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       5
193 Load_Cycle_Count        0x0032   190   190   000    Old_age   Always       -       31240
194 Temperature_Celsius     0x0022   121   116   000    Old_age   Always       -       26
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

gandalf ~ #


>
> You may recall a couple years ago when Ubuntu accidentally shipped with
> laptop mode (or something, IDR the details) turned on by default, and
> people were watching their drives wear out before their eyes.  That's
> effectively what you're doing, with an 8-second idle timeout.  Most laptop
> drives (2.5" and 1.8") are designed for it.  Most 3.5" desktop/server
> drives are NOT designed for that tight an idle timeout spec, and in fact,
> may well last longer spinning at idle overnight, as opposed to shutting
> down every day even.
>
> I'd at least look into it, as there's no use wearing the things out
> unnecessarily.  Maybe you'll decide to let them run that way and save the
> power, but you'll know about the available choices then, at least.
>

Yeah, that's important. Thanks. If I can solve all these RAID problems
then maybe I'll look at adding RAID to his box with better drives or
something.

Note that on my system only I'm seeing real problems in
/var/log/messages (non-RAID), like 1000's of these:

Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 45276264 on sda3
Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46309336 on sda3
Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46567488 on sda3
Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46567680 on sda3

or

Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555752 on sda3
Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555760 on sda3
Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555768 on sda3
Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555776 on sda3


However I see NONE of that on my dad's machine using the same drive
but different chipset.

The above problems seem to result in this sort of failure when I try
going with RAID, as I tried again this morning:

INFO: task kjournald:5064 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald     D ffff880028351580     0  5064      2 0x00000000
 ffff8801ac91a190 0000000000000046 0000000000000000 ffffffff81067110
 000000000000dcf8 ffff880180863fd8 0000000000011580 0000000000011580
 ffff88014165ba20 ffff8801ac89a834 ffff8801af920150 ffff8801ac91a418
Call Trace:
 [<ffffffff81067110>] ? __alloc_pages_nodemask+0xfa/0x58c
 [<ffffffff8129174a>] ? md_make_request+0xde/0x119
 [<ffffffff810a9576>] ? sync_buffer+0x0/0x40
 [<ffffffff81334305>] ? io_schedule+0x3e/0x54
 [<ffffffff810a95b1>] ? sync_buffer+0x3b/0x40
 [<ffffffff81334789>] ? __wait_on_bit+0x41/0x70
 [<ffffffff810a9576>] ? sync_buffer+0x0/0x40
 [<ffffffff81334823>] ? out_of_line_wait_on_bit+0x6b/0x77
 [<ffffffff81040a66>] ? wake_bit_function+0x0/0x23
 [<ffffffff8111f400>] ? journal_commit_transaction+0xb56/0x1112
 [<ffffffff81334280>] ? schedule+0x8f4/0x93b
 [<ffffffff81335e3d>] ? _raw_spin_lock_irqsave+0x18/0x34
 [<ffffffff81040a38>] ? autoremove_wake_function+0x0/0x2e
 [<ffffffff81335bcc>] ? _raw_spin_unlock_irqrestore+0x12/0x2c
 [<ffffffff8112278c>] ? kjournald+0xe2/0x20a
 [<ffffffff81040a38>] ? autoremove_wake_function+0x0/0x2e
 [<ffffffff811226aa>] ? kjournald+0x0/0x20a
 [<ffffffff81040665>] ? kthread+0x79/0x81
 [<ffffffff81002c94>] ? kernel_thread_helper+0x4/0x10
 [<ffffffff810405ec>] ? kthread+0x0/0x81
 [<ffffffff81002c90>] ? kernel_thread_helper+0x0/0x10

Thanks,
Mark




* [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-04-01 18:57           ` Mark Knecht
@ 2010-04-02  9:43             ` Duncan
  2010-04-02 17:18               ` Mark Knecht
  0 siblings, 1 reply; 12+ messages in thread
From: Duncan @ 2010-04-02  9:43 UTC
  To: gentoo-amd64

Mark Knecht posted on Thu, 01 Apr 2010 11:57:47 -0700 as excerpted:

> A bit long in response. Sorry.
> 
> On Tue, Mar 30, 2010 at 11:56 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>> Mark Knecht posted on Tue, 30 Mar 2010 13:26:59 -0700 as excerpted:
>>
>>> [W]hen I change which hard drive boots, does the old sdb become the new
>>> sda because it's what got booted? Or is the order still as it was?

>> That depends on your BIOS.

> It seems to be constant mapping meaning (I guess) that I need to change
> the drive specs in grub.conf on the second drive to actually use the
> second drive.
> 
> I made the titles for booting different for each grub.conf file to
> ensure I was really getting grub from the second drive. My sda grub boot
> menu says "2.6.33-gentoo booting from sda" on the first drive, sdb on
> the second drive, etc.

Making the titles different is a very good idea.  It's what I ended up 
doing too, as otherwise, it can get confusing pretty fast.

Something else you might want to do, for purposes of identifying the 
drives at the grub boot prompt if something goes wrong or you are 
otherwise trying to boot something on another drive, is create a (probably 
empty) differently named file on each one, say grub.sda, grub.sdb, etc.

That way, if you end up at the boot prompt you can do a find /grub.sda 
(or /grub/grub.sda, or whatever), and grub will return a list of the 
drives with that file, in this case, only one drive, thus identifying your 
normal sda drive.

You can of course do similar by cat-ing the grub.conf file on each one, 
since you are keeping your titles different, but that's a bit more 
complicated than simply doing a find on the appropriate file, to get your 
bearings straight on which is which in the event something screws up.
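
Concretely, something like this (the file names and mount point are just 
examples):

touch /boot/grub.sda                                      # on the /boot living on sda
mount /dev/sdb1 /mnt/boot && touch /mnt/boot/grub.sdb && umount /mnt/boot
# then later, at the grub prompt:
#   grub> find /grub.sda
# prints the (hdN,0) device(s) that contain that file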

>>
>> The point being... [using badblocks] it /is/ actually possible to
>> verify that they're working well before you fdisk/mkfs and load data.
>> Tho it does take awhile... days... on drives of modern size.
>>
> I'm trying badblocks right now on sdc. using
> 
> badblocks -v /dev/sdc
> 
> Maybe I need to do something more strenuous? It looks like it will be
> done in an hour or two. (i7-920 with SATA drives so it should be fast,
> as long as I'm not just reading the buffers or something like that.
> 
> Roughly speaking 1TB read at 100MB/S should take 10,000 seconds or 2.7
> hours. I'm at 18% after 28 minutes so that seems about right. (With no
> errors so far assuming I'm using the right command)

I used the -w switch here, which actually goes over the disk a total of 8 
times, alternating writing and then reading back to verify the written 
pattern, for four different write patterns (0xaa, 0x55, 0xff, 0x00, which 
is alternating 10101010, alternating 01010101, all ones, all zeroes).

But that's "data distructive".  IOW, it effectively wipes the disk.  Doing 
it when the disks were new, before I fdisked them let alone mkfs-ed and 
started loading data, was fine, but it's not something you do if you have 
unbacked up data on them that you want to keep!

Incidentally, that's not /quite/ the infamous US-DOD 7-pass wipe, as it's 
only 4 passes, but it should reasonably ensure against ordinary recovery, 
in any case, if you have reason to wipe your disks...  Well, except for 
any blocks the disk internals may have detected as bad and rewritten 
elsewhere, you can get the SMART data on that.  But a 4-pass wipe, as 
badblocks -w does, should certainly be good for the normal case, and is 
already way better than just an fdisk, which doesn't even wipe anything 
but the allocation tables!

But back to the timing.  Since the -w switch does a total of 8 passes (4 
each write and read, alternating) while you're doing just one with just
-v, it'll obviously take 8 times the time.  So 80K seconds... 22+ hours.

So I guess it's not days... just about a day.  (Probably something more, 
as the first part of the disk, near the outside edge, should go faster 
than the last part, so figure a bit over a day, maybe 30 hours...)
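
For reference, the destructive run I'm describing is simply the following 
-- and again, it WIPES the drive, so only use it on disks with nothing you 
want to keep:

badblocks -wsv /dev/sdc     # -w write-mode test, -s show progress, -v verbose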


[8 second spin-down timeouts]

> Very true. Here is the same drive model I put in a new machine for my
> dad. It's been powered up and running Gentoo as a typical desktop
> machine for about 50 days. He doesn't use it more than about an hour a
> day on average. It's already hit 31K load/unload cycles. At 10% of 300K,
> that's about 1.5 years of life before I hit that spec. I've watched his
> system a bit and it seems to add 1 to the count almost exactly
> every 2 minutes on average. Is that a common cron job maybe?

It's unlikely to be a cron job.  But check your logging, and check what 
sort of atime option you're using on your mounts.  (relatime is the new 
kernel default, but it was atime until relatively recently, say 2.6.30 or 
.31 or some such.  noatime is recommended unless you have something that 
actually depends on atime -- alpine is known to need it for mail, and some 
backup software uses it, tho little else on a modern system will.  I always 
use noatime on my real disk mounts, as opposed to say tmpfs, here.)  If 
there's something writing to the log every two minutes or less, and the 
buffers are set to time out dirty data and flush to disk every two 
minutes...  And simply accessing a file will change the atime on it if you 
have that turned on, necessitating a write to disk to update the 
atime, with those dirty buffers flushed every X minutes or seconds as well.
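
For example, a noatime mount in /etc/fstab looks something like this 
(the device and filesystem type are just placeholders):

/dev/sda3    /    ext3    noatime    0 1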

> I looked up the spec on all three WD lines - Green, Blue and Black. All
> three were 300K cycles. This issue has come up on the RAID list. It
> seems that some other people are seeing this and aren't exactly sure
> what Linux is doing to cause this.

It's probably not just Linux, but a combination of Linux and the drive 
defaults.

> I'll study hdparm and BIOS when I can reboot.
> 
> My dad's current data:

> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

>   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       21

>   9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1183

>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       20

> 193 Load_Cycle_Count        0x0032   190   190   000    Old_age   Always       -       31240

Here's my comparable numbers, several years old Seagate 7200.8 series:

  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       996

  9 Power_On_Hours          0x0032   066   066   000    Old_age   Always       -       30040

 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always       -       1045

Note that I don't have #193, the load-cycle counts.  There's a couple 
different technologies here.  The ramp-type load/unload yours uses is 
typical of the smaller 2.5" laptop drives.  These are designed for far 
shorter idle/standby timeouts and thus a far higher cycle count, load 
cycles, typical rating 300,000 to 600,000.  Standard desktop/server drives 
use a contact park method and a lower power cycle count, typically 50,000 
or so.  That's the difference.

At 300,000 load cycle count rating, your WDs are at the lower end of the 
ramp-type ratings, but still far higher than comparable contact power-
cycle ratings.  Even tho the ramp-type they use is good for far more 
cycles, as you mentioned, you're already at 10% after only 50 days.

My old Seagates, OTOH, about 4.5 years old best I can figure (I bought 
them around October, 30K operating hours ~3.5 years, and I run them most 
but not all of the time, so 4.5 years is a good estimate), rated for only 
50,000 contact start/stop cycles (they're NOT the ramp type), but SMART 
says only about 1000. or 2% of the rating, gone.  (If you check the stats 
they seem to recommend replacing at 20%, assuming that's a percentage, 
which looks likely, but either way, that's a metric I don't need to worry 
about any time soon.)

OTOH, at 30,000+ operating hours (about 3.5 years if on constantly, as I 
mentioned above), that one's running rather lower.  Again assuming it's a 
percentage metric, it would appear they rate them @ 90,000 hours.  (I 
looked up the specsheets tho, and couldn't see anything like that listed, 
only 5 years lifetime and warranty, which would be about half that, 45,000 
hours.  But given the 0 threshold there, it appears they expect the start-
stop cycles to be more critical, so they may indeed rate it 90,000 
operating hours.)  That'd be three and a half years of use, straight thru, 
so yeah, I've had them, probably four and a half years now, probably five in 
October -- I don't have them spin down at all and often leave my system on 
for days at a time, but not /all/ the time, so 3.5 years of use in 4.5 
years sounds reasonable.

> Yeah, that's important. Thanks. If I can solve all these RAID problems
> then maybe I'll look at adding RAID to his box with better drives or
> something.

One thing they recommend with RAID, which I did NOT do, BTW, and which I'm 
beginning to worry about since I'm approaching the end of my 5 year 
warranties, is buying either different brands or models, or at least 
ensuring you're getting different lot numbers of the same model.  The idea 
being, if they're all the same model and lot number, and they're all part 
of the same RAID so in similar operating conditions, they're all likely to 
go out pretty close to each other.  That's one reason to be glad I'm 
running 4-way RAID-1, I suppose, as one hopes that when they start going, 
even if they are the same model and lot number, at least one of the four 
can hang on long enough for me to buy replacements and transfer the 
critical data.  But I also have an external 1 TB USB I bought, kept off 
most of the time as opposed to the RAID disks which are on most of the 
time, that I've got an external backup on, as well as the backups on the 
RAIDs, tho the external one isn't as regularly synced.  But in the event 
all four RAID drives die on me, I HAVE test-booted from a USB thumb drive 
(the external 1TB isn't bootable -- good thing I tested, eh!), to the 
external 1TB, and CAN recover from it, if I HAVE to.

> Note that on my system only I'm seeing real problems in
> /var/log/messages (non-RAID), like 1000's of these:
> 
> Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 45276264 on sda3
> Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46309336 on sda3
> Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46567488 on sda3
> Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46567680 on sda3
> 
> or
> 
> Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555752 on
> sda3
> Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555760 on
> sda3
> Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555768 on
> sda3
> Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555776 on
> sda3

That doesn't look so good...

> However I see NONE of that on my dad's machine using the same drive but
> different chipset.
> 
> The above problems seem to result in this sort of problem when I try
> going with RAID as I tried again this morning:
> 
> INFO: task kjournald:5064 blocked for more than 120 seconds. "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.

[snipped the trace]

Ouch!  Blocked for 2 minutes...

Yes, between the logs and the 2-minute hung-task, that does look like some 
serious issues, chipset or other...

Talking about which...

Can you try different SATA cables?  I'm assuming you and your dad aren't 
using the same cables.  Maybe it's the cables, not the chipset.

Also, consider slowing the data down.  Disable UDMA or reduce it to a 
lower speed, or check the pinouts and try jumpering OPT1 to force SATA-1 
speeds (150 MB/sec instead of 300 MB/sec) as detailed here (watch the 
wrap!):

http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?
p_faqid=1337

If that solves the issue, then you know it's related to signal timing.  
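
If digging out the jumper is a pain, newer kernels can also force the 
link speed in software via libata.  A sketch, appended to the kernel line 
in grub.conf; the paths and root= device are placeholders, so keep 
whatever your grub.conf already has and just add the libata.force bit:

# applied to all ports:
kernel /boot/<your-kernel> root=/dev/md3 libata.force=1.5Gbps
# or just one port (ata1 here -- check dmesg for the numbering):
kernel /boot/<your-kernel> root=/dev/md3 libata.force=1:1.5Gbps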

Unfortunately, this can be mobo related.  I had very similar issues with 
memory at one point, and had to slow it down from the rated PC3200, to 
PC3000 speed (declock it from 200 MHz to 183 MHz), in the BIOS.  
Unfortunately, initially the BIOS didn't have a setting for that; it 
wasn't until a BIOS update that I got it.  Until I got the update and 
declocked it, it would work most of the time, but was borderline.  The 
thing was, the memory was solid and tested so in memtest86+, but that 
tests memory cells, not speed, and at the rated speed, that memory and 
that board just didn't like each other, and there'd be occasional issues 
(bunzip2 erroring out due to checksum mismatch was a common one, and 
occasional crashes). Ultimately, I fixed the problem when I upgraded 
memory.

So having experienced the issue with memory, I know exactly how 
frustrating it can be.  But if you slow it down with the jumper and it 
works, then you can try different cables, or take off the jumper and try 
lower UDMA speeds (but still higher than SATA-1/150MB/sec), using hdparm 
or something.  Or exchange either the drives or the mobo, if you can, or 
buy an add-on SATA card and disable the onboard one.

Oh, and double-check the kernel driver you are using for it as well.  
Maybe there's another that'll work better, or driver options you can feed 
to it, or something.
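
A quick way to see which driver actually claimed the controller, and what 
link speed the kernel negotiated at boot (just a sketch, adjust to taste):

lspci -k | grep -i -A2 sata
dmesg | grep -i 'sata link'

lspci -k prints a 'Kernel driver in use' line under each device, and the 
dmesg line should show something like 'SATA link up 3.0 Gbps'.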

Oh, and if you haven't re-fdisked, re-created the md devices, re-mkfsed, 
and reloaded the system from backup since you switched to AHCI, try that.  
AHCI and its kernel driver are almost certainly what you want, not 
compatibility mode, but the switch could potentially screw things up too, 
if you changed it and didn't redo the disks afterward.

I do wish you luck!  Seeing those errors brought back BAD memories of the 
memory problems I had, so while yours is disk not memory, I can definitely 
sympathize!

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman





* Re: [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-04-02  9:43             ` Duncan
@ 2010-04-02 17:18               ` Mark Knecht
  2010-04-03 23:13                 ` Mark Knecht
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Knecht @ 2010-04-02 17:18 UTC (permalink / raw
  To: gentoo-amd64

Good stuff. I'll snip out the less important parts to keep the response
shorter, but don't think for a second that I didn't appreciate it. I do!

On Fri, Apr 2, 2010 at 2:43 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> Mark Knecht posted on Thu, 01 Apr 2010 11:57:47 -0700 as excerpted:
<SNIP>
>
> Making the titles different is a very good idea.  It's what I ended up
> doing too, as otherwise, it can get confusing pretty fast.
>
> Something else you might want to do, for purposes of identifying the
> drives at the grub boot prompt if something goes wrong or you are
> otherwise trying to boot something on another drive, is create a (probably
> empty) differently named file on each one, say grub.sda, grub.sdb, etc.

I'll consider that, once I get the hard problems solved.
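
If I do, I gather it'd be something along these lines -- a sketch only, 
device names arbitrary, and assuming the /boot copies end up as 
independent partitions rather than members of one md mirror:

mount /dev/sda1 /mnt/tmp ; touch /mnt/tmp/grub.sda ; umount /mnt/tmp
mount /dev/sdb1 /mnt/tmp ; touch /mnt/tmp/grub.sdb ; umount /mnt/tmp

Then at the grub prompt, "find /grub.sda" should print which (hdX,Y) that 
file actually lives on.
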
<SNIP>
>> Roughly speaking 1TB read at 100MB/S should take 10,000 seconds or 2.7
>> hours. I'm at 18% after 28 minutes so that seems about right. (With no
>> errors so far assuming I'm using the right command)
>
> I used the -w switch here, which actually goes over the disk a total of 8
> times, alternating writing and then reading back to verify the written
> pattern, for four different write patterns (0xaa, 0x55, 0xff, 0x00, which
> is alternating 10101010, alternating 01010101, all ones, all zeroes).

OK, makes sense then.

I ran one pass of badblocks on each of the drives. No problem found.

I know some Linux folks don't like Spinrite but I've had good luck
with it so that's running now. The problem is it cannot test the drives
in parallel, and it looks like it wants at least 24 hours per drive, so
covering all three would take 3 days. I will likely let it run through
the first drive (I'm busy today) and then tomorrow drop back into Linux
and possibly try your badblocks on all 3 drives. I'm not overly concerned
about losing the install.
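
If I do go that route, I assume it'd be the destructive -w pass on each 
raw disk, backgrounded so the three run in parallel -- a sketch, device 
names as on my box, and it obviously wipes everything on the drives:

badblocks -wsv /dev/sda > bb.sda.log 2>&1 &
badblocks -wsv /dev/sdb > bb.sdb.log 2>&1 &
badblocks -wsv /dev/sdc > bb.sdc.log 2>&1 &
wait

(-w is the write-mode test, -s shows progress, -v logs each bad block 
found.)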

<SNIP>
>
>
> [8 second spin-down timeouts]
>
>> Very true. Here is the same drive model I put in a new machine for my
>> dad. It's been powered up and running Gentoo as a typical desktop
>> machine for about 50 days. He doesn't use it more than about an hour a
>> day on average. It's already hit 31K load/unload cycles. At 10% of 300K
>> that's about 1.5 years of life before I hit that spec. I've watched his
>> system a bit and his system seems to add 1 to the count almost exactly
>> every 2 minutes on average. Is that a common cron job maybe?
>
> It's unlikely to be a cron job.  But check your logging, and check what
> sort of atime you're using on your mounts (relatime is the new kernel
> default, but it was atime until relatively recently, say 2.6.30 or .31 or
> some such, and noatime is recommended unless you have something that
> actually depends on atime, alpine is known to need it for mail, and some
> backup software uses it, tho little else on a modern system will, I always
> use noatime on my real disk mounts, as opposed to say tmpfs, here).  If
> there's something writing to the log every two minutes or less, and the
> buffers are set to timeout dirty data and flush to disk every two
> minutes...  And simply accessing a file will change the atime on it if you
> have that turned on, thus necessitating a write to disk to update the
> atime, with those dirty buffers flushed every X minutes or seconds as well.

Here is the fstab from my dad's machine, which racks up 30
Load_Cycle_Counts an hour:

# NOTE: If your BOOT partition is ReiserFS, add the notail option to opts.
LABEL="myboot"          /boot           ext2            noauto,noatime  1 2
LABEL="myroot"          /               ext3            noatime         0 1
LABEL="myswap"          none            swap            sw              0 0
LABEL="homeherb"        /home/herb      ext3            noatime         0 1
/dev/cdrom              /mnt/cdrom      auto            noauto,ro       0 0
#/dev/fd0               /mnt/floppy     auto            noauto          0 0

# glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for
# POSIX shared memory (shm_open, shm_unlink).
# (tmpfs is a dynamically expandable/shrinkable ramdisk, and will
#  use almost no memory if not populated with files)
shm                     /dev/shm        tmpfs           nodev,nosuid,noexec     0 0

On the other hand there is some cron stuff going on every 10 minutes
or so. Possibly it's not 1 event every 2 minutes but rather 5 events
every 10 minutes?

Apr  2 07:10:01 gandalf cron[6310]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 07:20:01 gandalf cron[6322]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 07:30:01 gandalf cron[6335]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 07:40:01 gandalf cron[6348]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 07:50:01 gandalf cron[6361]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 07:59:01 gandalf cron[6374]: (root) CMD (rm -f
/var/spool/cron/lastrun/cron.hourly)
Apr  2 08:00:01 gandalf cron[6376]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 08:10:01 gandalf cron[6388]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 08:20:01 gandalf cron[6401]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 08:30:01 gandalf cron[6414]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 08:40:01 gandalf cron[6427]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 08:50:01 gandalf cron[6440]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 08:59:01 gandalf cron[6453]: (root) CMD (rm -f
/var/spool/cron/lastrun/cron.hourly)
Apr  2 09:00:01 gandalf cron[6455]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 09:10:01 gandalf cron[6467]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr  2 09:18:01 gandalf sshd[6479]: Accepted keyboard-interactive/pam
for root from 67.188.27.80 port 51981 ssh2
Apr  2 09:18:01 gandalf sshd[6479]: pam_unix(sshd:session): session
opened for user root by (uid=0)
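
One thing I can try, to pin down exactly what touches the disk between 
those cron runs, is the kernel's block-dump logging, which tags every 
read/write with the task that issued it.  A sketch -- the caveat is that 
klogd/syslogd will themselves write to disk while it's on, so it's best 
to read dmesg directly and keep the window short:

echo 1 > /proc/sys/vm/block_dump
sleep 300 ; dmesg | tail -n 200
echo 0 > /proc/sys/vm/block_dump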


<SNIP>
>
> Note that I don't have #193, the load-cycle counts.  There's a couple
> different technologies here.  The ramp-type load/unload yours uses is
> typical of the smaller 2.5" laptop drives.  These are designed for far
> shorter idle/standby timeouts and thus a far higher cycle count, load
> cycles, typical rating 300,000 to 600,000.  Standard desktop/server drives
> use a contact park method and a lower power cycle count, typically 50,000
> or so.  That's the difference.

I also purchased two Enterprise Edition drives - the 500GB size. They
are also spec'ed at 300K

http://www.wdc.com/en/products/products.asp?DriveID=489

My intention was to use them in a RAID0 and then back them up daily to
RAID1 for more safety. However I'm starting to think this TLER feature
may well be part of this problem. I don't want to start using them
however until I understand this 30/minute issue. No reason to wear
everything out!
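
The daily backup part is the easy bit -- something like this dropped into 
/etc/cron.daily would do, with the mount points made up for the sketch:

#!/bin/sh
# mirror the RAID0 working set onto the RAID1 copy
rsync -aHx --delete /mnt/raid0/ /mnt/raid1/backup-raid0/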

<SNIP>
>
> One thing they recommend with RAID, which I did NOT do, BTW, and which I'm
> beginning to worry about since I'm approaching the end of my 5 year
> warranties, is buying either different brands or models, or at least
> ensuring you're getting different lot numbers of the same model.  The idea
> being, if they're all the same model and lot number, and they're all part
> of the same RAID so in similar operating conditions, they're all likely to
> go out pretty close to each other.  That's one reason to be glad I'm
> running 4-way RAID-1, I suppose, as one hopes that when they start going,
> even if they are the same model and lot number, at least one of the four
> can hang on long enough for me to buy replacements and transfer the
> critical data.

Exactly! My plan for this box is a 3 disk RAID1 as 3 disks is all it will hold.

Most folks don't understand that if 1 drive has a 1% chance of failing,
then with 3 drives the chance that at least one fails is closer to 3%
(1 - 0.99^3, about 2.97%) -- and that assumes the failures are truly
independent. If they all come from the same lot and 1 fails, then it's
logically more likely that the other 2 will fail in the next few days or
weeks. Certainly much sooner than if they'd come from different companies.


<SNIP>
>>
>> INFO: task kjournald:5064 blocked for more than 120 seconds. "echo 0 >
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> [snipped the trace]
>
> Ouch!  Blocked for 2 minutes...
>
> Yes, between the logs and the 2-minute hung-task, that does look like some
> serious issues, chipset or other...
>
> Talking about which...
>
> Can you try different SATA cables?  I'm assuming you and your dad aren't
> using the same cables.  Maybe it's the cables, not the chipset.

Now that's an interesting thought. On my other machines I used the
cables Intel shipped with the MB. However in this case I couldn't,
because the SATA connectors don't point upward but come out
horizontally. Due to proximity to the drive cage I had to get 90-degree
cables, and all 3 drives are using those right now. I can switch two of
the drives to the Intel cables.

That said, Spinrite has been running for hours without any problem at
all, and it will tell me if there are delays, sectors not found, etc.,
so if the problem were as blatant as it appears to be when running
Linux then I really think I would have seen it by now. I would have
guessed badblocks would have shown it too, but possibly not.

>
> Also, consider slowing the data down.  Disable UDMA or reduce it to a
> lower speed, or check the pinouts and try jumpering OPT1 to force SATA-1
> speeds (150 MB/sec instead of 300 MB/sec) as detailed here (watch the
> wrap!):
>
> http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?
> p_faqid=1337
>
> If that solves the issue, then you know it's related to signal timing.

Will try it.

>
> Unfortunately, this can be mobo related.  I had very similar issues with
> memory at one point, and had to slow it down from the rated PC3200, to
> PC3000 speed (declock it from 200 MHz to 183 MHz), in the BIOS.
> Unfortunately, initially the BIOS didn't have a setting for that; it
> wasn't until a BIOS update that I got it.  Until I got the update and
> declocked it, it would work most of the time, but was borderline.  The
> thing was, the memory was solid and tested so in memtest86+, but that
> tests memory cells, not speed, and at the rated speed, that memory and
> that board just didn't like each other, and there'd be occasional issues
> (bunzip2 erroring out due to checksum mismatch was a common one, and
> occasional crashes). Ultimately, I fixed the problem when I upgraded
> memory.

OK, so I have 6 of these drives and multiple PCs. While not a perfect
test, I can try putting a couple into another machine and building a
2-drive RAID1 just to see what happens.
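
Roughly this, I'm thinking -- a sketch, with the partitions made up and 
0.90 metadata just so grub legacy could read a /boot on it if I go that 
far:

mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      --metadata=0.90 /dev/sdX1 /dev/sdY1
mkfs.ext3 /dev/md0
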
>
> So having experienced the issue with memory, I know exactly how
> frustrating it can be.  But if you slow it down with the jumper and it
> works, then you can try different cables, or take off the jumper and try
> lower UDMA speeds (but still higher than SATA-1/150MB/sec), using hdparm
> or something.  Or exchange either the drives or the mobo, if you can, or
> buy an add-on SATA card and disable the onboard one.
>
> Oh, and double-check the kernel driver you are using for it as well.
> Maybe there's another that'll work better, or driver options you can feed
> to it, or something.

The kernel driver is ahci. Don't know that I have any alternatives
when booting AHCI from BIOS, but I can look at the other modes with
other drivers and see if the problem still occurs. That's a bit of
work but probably worth it.

This is all a big table of experiments that should eventually narrow the
problem down to a single cause. (Hopefully!)

>
> Oh, and if you haven't re-fdisked, re-created the md devices, re-mkfsed,
> and reloaded the system from backup since you switched to AHCI, try that.
> AHCI and its kernel driver are almost certainly what you want, not
> compatibility mode, but the switch could potentially screw things up too,
> if you changed it and didn't redo the disks afterward.
>
> I do wish you luck!  Seeing those errors brought back BAD memories of the
> memory problems I had, so while yours is disk not memory, I can definitely
> sympathize!

As always, thanks for the help. I'm very interested, and yes, even a
little frustrated! ;-)

Cheers,
Mark




* Re: [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-04-02 17:18               ` Mark Knecht
@ 2010-04-03 23:13                 ` Mark Knecht
  2010-04-05 18:17                   ` Mark Knecht
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Knecht @ 2010-04-03 23:13 UTC (permalink / raw
  To: gentoo-amd64

On Fri, Apr 2, 2010 at 10:18 AM, Mark Knecht <markknecht@gmail.com> wrote:
<SNIP>
>
> I also purchased two Enterprise Edition drives - the 500GB size. They
> are also spec'ed at 300K
>
> http://www.wdc.com/en/products/products.asp?DriveID=489
>
> My intention was to use them in a RAID0 and then back them up daily to
> RAID1 for more safety. However I'm starting to think this TLER feature
> may well be part of this problem. I don't want to start using them
> however until I understand this 30/minute issue. No reason to wear
> everything out!
>
> <SNIP>

Duncan,
   Just a quick follow-up. I tested the WD Green drives for the last
36 hours in Spinrite and found no problems. I think the drives are OK.

   I then took the two 500GB Raid Enterprise drives above and made a
new RAID1 on which I did a quick Gentoo install. This uses the same
SATA controller, same cables and same drivers as I was using earlier
with the Green drives. While doing the install I saw no problems or
delays on them. Now I've got a new RAID1 install but cannot get it to
boot as grub doesn't seem to automatically assemble the RAID1 device.
I'm moderately confident that once I figure out how to get grub to
understand / is on RAID1 then I may be in good shape.
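
My working theory: with 0.90 metadata the RAID superblock sits at the end 
of each member partition, so grub legacy sees /boot as an ordinary 
filesystem; the part that actually has to assemble / is the kernel or an 
initramfs.  Something like this grub.conf entry is what I plan to try -- 
a sketch only, where the kernel/initramfs names, the md number and the 
genkernel-style domdadm option are my assumptions:

title Gentoo (RAID1 root)
root (hd0,0)
kernel /boot/kernel-2.6.33-gentoo root=/dev/md3 domdadm
initrd /boot/initramfs-2.6.33-gentoo

Failing that, partitions typed fd (Linux raid autodetect) plus raid1 
built into the kernel should let it auto-assemble 0.90 arrays on its own.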

   Thanks for your help,

Cheers,
Mark




* Re: [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-04-03 23:13                 ` Mark Knecht
@ 2010-04-05 18:17                   ` Mark Knecht
  2010-04-06 14:00                     ` Duncan
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Knecht @ 2010-04-05 18:17 UTC (permalink / raw
  To: gentoo-amd64

On Sat, Apr 3, 2010 at 4:13 PM, Mark Knecht <markknecht@gmail.com> wrote:
> On Fri, Apr 2, 2010 at 10:18 AM, Mark Knecht <markknecht@gmail.com> wrote:
> <SNIP>
>>
>> I also purchased two Enterprise Edition drives - the 500GB size. They
>> are also spec'ed at 300K
>>
>> http://www.wdc.com/en/products/products.asp?DriveID=489
>>
>> My intention was to use them in a RAID0 and then back them up daily to
>> RAID1 for more safety. However I'm starting to think this TLER feature
>> may well be part of this problem. I don't want to start using them
>> however until I understand this 30/minute issue. No reason to wear
>> everything out!
>>
>> <SNIP>
>
> Duncan,
>   Just a quick follow-up. I tested the WD Green drives for the last
> 36 hours in Spinrite and found no problems. I think the drives are OK.
>
>   I then took the two 500GB Raid Enterprise drives above and made a
> new RAID1 on which I did a quick Gentoo install. This uses the same
> SATA controller, same cables and same drivers as I was using earlier
> with the Green drives. While doing the install I saw no problems or
> delays on them. Now I've got a new RAID1 install but cannot get it to
> boot as grub doesn't seem to automatically assemble the RAID1 device.
> I'm moderately confident that once I figure out how to get grub to
> understand / is on RAID1 then I may be in good shape.
>
>   Thanks for your help,
>
> Cheers,
> Mark
>

OK, after a solid day of testing it really looks like the cause is the
WD10EARS drive. With RAID1 on these new enterprise drives I no longer
see 100% waits, and I can't find error messages anywhere. Additionally,
the early SMART data suggests there isn't the same LOAD_CYCLE_COUNT
incrementing problem either.

I sort of feel like this should be reported to someone - LKML or maybe
the SATA driver folks. It's completely reproducible, and I'd be happy
to help some developer debug it if they wanted to, but I'm not sure
how to go about something that specific.

Anyway, it appears my problems are solved. I'm using RAID for / and
your idea of multiple /boot partitions. I haven't tested the other
boot partitions yet, but I expect they will work.

Thanks,
Mark




* [gentoo-amd64] Re: RAID1 boot - no bootable media found
  2010-04-05 18:17                   ` Mark Knecht
@ 2010-04-06 14:00                     ` Duncan
  0 siblings, 0 replies; 12+ messages in thread
From: Duncan @ 2010-04-06 14:00 UTC (permalink / raw
  To: gentoo-amd64

Mark Knecht posted on Mon, 05 Apr 2010 11:17:21 -0700 as excerpted:

> Anyway, it appears my problems are solved. I'm using RAID for / and your
> idea of multiple /boot partitions. I haven't tested the other boot
> partitions yet, but I expect they will work.

Good to see that.  As you may have noticed, I've been busy and haven't 
replied to... this makes three of your posts (this is just a quick reply, 
not my usual detail).

I still have them marked unread, and will try to get back to them and see 
if there's any loose ends to comment on, soon (a week or so), tho no 
promises.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman






Thread overview: 12+ messages
2010-03-28 17:14 [gentoo-amd64] RAID1 boot - no bootable media found Mark Knecht
2010-03-30  6:39 ` [gentoo-amd64] " Duncan
2010-03-30 13:56   ` Mark Knecht
2010-03-30 18:08     ` Duncan
2010-03-30 20:26       ` Mark Knecht
2010-03-31  6:56         ` Duncan
2010-04-01 18:57           ` Mark Knecht
2010-04-02  9:43             ` Duncan
2010-04-02 17:18               ` Mark Knecht
2010-04-03 23:13                 ` Mark Knecht
2010-04-05 18:17                   ` Mark Knecht
2010-04-06 14:00                     ` Duncan
