From: Marc Joliet <marcec@gmx.de>
To: gentoo-amd64@lists.gentoo.org
Subject: Re: [gentoo-amd64] Soliciting new RAID ideas
Date: Wed, 28 May 2014 21:20:18 +0200
Message-ID: <20140528212018.04387c61@marcec>
In-Reply-To: <20140528152658.GA13493@sgi.com>
On Wed, 28 May 2014 08:26:58 -0700,
Bob Sanders <rsanders@sgi.com> wrote:
>
> Marc Joliet, mused, then expounded:
> > On Tue, 27 May 2014 15:39:38 -0700,
> > Bob Sanders <rsanders@sgi.com> wrote:
> >
> > While I am far from a filesystem/storage expert (I see myself as a mere user),
> > the cited threads lead me to believe that this is most likely an
> > overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would
> > suggest reading them in their entirety.
> >
> > [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832
> > [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871
> > [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877
> > [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821
> >
>
> FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad
> memory bit and no ECC memory:
>
> http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/
Thanks for explicitly linking that. I didn't read it the first time around,
but have now read through most of it and reread threads [0] and [3] above, and
I *think* I now understand the problem (and why it doesn't apply to BTRFS)
better.
IIUC, the claim is: data being written to disk must first pass through RAM,
where it is corrupted (by a permanent bit flip caused, e.g., by deteriorating
hardware). At some later point, when the data is read back from disk, it might
happen to be loaded at or near the damaged location in RAM, where it is
corrupted further. At this point the checksum fails, and ZFS corrects the data
in RAM (using parity information!), where it is immediately corrupted again
(apparently because it is corrected at the same physical location in RAM?
perhaps this is specific to correction via parity?). This *additionally*
corrupted data is then written back to disk (without any further checks).
So the point is that, apparently, without ECC RAM, you could get a (long-term)
cascade of errors, especially during a scrub. The likelihood of such permanent
RAM corruption happening in the first place is another question entirely.
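To make the claimed mechanism concrete, here is a minimal Python sketch of a
single RAM bit that is stuck at 1. It is only a toy model of the scenario
described in the FreeNAS post, not how either filesystem handles buffers; the
helper names (stuck_at_one, bit_errors) and the byte positions are made up for
illustration.

```python
def stuck_at_one(buf: bytes, byte_index: int, bit: int) -> bytes:
    """Return buf as it would read back if one bit of the RAM holding it were stuck at 1."""
    out = bytearray(buf)
    out[byte_index] |= 1 << bit
    return bytes(out)


def bit_errors(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equally long buffers."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Eight zero bytes, so that every stuck bit shows up as an error.
original = bytes(8)

# Write path: the data passes through the bad cell before it reaches the disk.
on_disk = stuck_at_one(original, byte_index=2, bit=0)

# Later read: the buffer is placed differently relative to the bad cell,
# so a different byte of the same data is hit.
read_back = stuck_at_one(on_disk, byte_index=5, bit=0)

# Even a perfect repair is damaged again if it is done in that same buffer.
repaired_in_place = stuck_at_one(original, byte_index=5, bit=0)

print(bit_errors(original, on_disk))            # errors after the write
print(bit_errors(original, read_back))          # errors after the later read
print(bit_errors(original, repaired_in_place))  # errors after an in-place "repair"
```

Each pass through the bad cell can only add errors, which is where the
supposed cascade would come from.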
The various posts in [0] then basically say that regardless of whether this
really is true of ZFS, it certainly doesn't apply to BTRFS, for various
reasons. I suppose this quote from [1] (see above) says it most clearly:
> In hxxp://forums.freenas.org/threads/ecc-vs-non-ecc-ram-and-zfs.15449, they talk about
> reconstructing corrupted data from parity information:
>
> > Ok, no problem. ZFS will check against its parity. Oops, the parity failed since we have a new corrupted
> > bit. Remember, the checksum data was calculated after the corruption from the first memory error
> > occurred. So now the parity data is used to "repair" the bad data. So the data is "fixed" in RAM.
>
> i.e. that there is parity information stored with every piece of data, and ZFS will "correct" errors
> automatically from the parity information. I start to suspect that there is confusion here between
> checksumming for data integrity and parity information. If this is really how ZFS works, then if memory
> corruption interferes with this process, then I can see how a scrub could be devastating. I don't know if
> ZFS really works like this. It sounds very odd to do this without an additional checksum check. This sounds
> very different to what you say below that btrfs does, which is only to check against redundantly-stored
> copies, which I agree sounds much safer.
The rest is also relevant, but I think the point that the data is corrected via
parity information, as opposed to using a known-good redundant copy of the data
(which I originally missed, and thus got confused), is the key point in
understanding the (supposed) difference in behaviour between ZFS and BTRFS.
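The difference between the two repair strategies can be sketched in a few
lines of Python. This is a toy model under my own assumptions, not code from
either filesystem: CRC32 stands in for whatever checksum is really used, XOR
parity over two blocks stands in for RAID-style parity, and the function names
are hypothetical. In particular, the "unchecked" parity repair only mirrors
what the FreeNAS post describes; as the quote above says, it is unclear
whether ZFS really works like that.

```python
import zlib


def crc(data: bytes) -> int:
    """Checksum stand-in; CRC32 is an assumption made for this sketch."""
    return zlib.crc32(data)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equally long blocks (simple two-block parity)."""
    return bytes(x ^ y for x, y in zip(a, b))


def repair_from_copies(copies, stored_crc):
    """Return the first copy that matches the stored checksum, else None.

    Nothing is accepted as a repair unless it verifies.
    """
    for candidate in copies:
        if crc(candidate) == stored_crc:
            return candidate
    return None


def repair_from_parity_unchecked(surviving_block: bytes, parity: bytes) -> bytes:
    """Rebuild a block from parity and return it without any checksum check."""
    return xor_bytes(surviving_block, parity)


good = b"important data"
stored_crc = crc(good)
damaged = b"important dat!"

# Redundant copies: a damaged copy is skipped, a verified one is used.
from_copies = repair_from_copies([damaged, good], stored_crc)
# If no copy verifies, the result is an error rather than a bad "repair".
no_good_copy = repair_from_copies([damaged, damaged], stored_crc)

# Parity: rebuild `good` from the other block and the parity block.
other = b"another block!"
parity = xor_bytes(good, other)
# One bit of the parity is flipped, e.g. while it sits in bad RAM.
bad_parity = bytes([parity[0] ^ 0x01]) + parity[1:]
rebuilt = repair_from_parity_unchecked(other, bad_parity)

print(from_copies == good)         # repair from a verified copy
print(no_good_copy is None)        # no verified copy available
print(rebuilt == good)             # unchecked parity rebuild
print(crc(rebuilt) == stored_crc)  # what a final checksum check would say
```

The last line is the interesting one: a checksum check after the parity
rebuild would reject the bad result, which is why doing the repair "without
an additional checksum check" sounds so odd.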
All this assumes, of course, that the FreeNAS forum post that ignited this
discussion is correct in the first place.
> Thanks Mark! Interesting discussion on btrfs.
>
> Bob
You're welcome! I agree, it's an interesting discussion. And regarding the
misspelling of my name: no problem :-) .
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup