From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 8 Jan 2013 23:45:04 +0200
From: Alan McKinnon <alan.mckinnon@gmail.com>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Re: OT: Fighting bit rot
Message-ID: <20130108234504.08c19c9c@khamul.example.com>
In-Reply-To: <kchtg5$dku$1@ger.gmane.org>
References: <50EB2BF7.4040109@binarywings.net>
	<20130108012016.2f02c68c@khamul.example.com>
	<50EBCA77.8030603@binarywings.net>
	<20130108095510.04f84040@khamul.example.com>
	<50EC4660.5090208@binarywings.net>
	<CAA2qdGUn8pf4WKsKugFeY20aXrciyQiwpigGVs+5xkjW4hbBsQ@mail.gmail.com>
	<kchtg5$dku$1@ger.gmane.org>
Organization: Internet Solutions
X-Mailer: Claws Mail 3.9.0 (GTK+ 2.24.14; x86_64-pc-linux-gnu)
Precedence: bulk
List-Post: <mailto:gentoo-user@lists.gentoo.org>
List-Help: <mailto:gentoo-user+help@lists.gentoo.org>
List-Unsubscribe: <mailto:gentoo-user+unsubscribe@lists.gentoo.org>
List-Subscribe: <mailto:gentoo-user+subscribe@lists.gentoo.org>
List-Id: Gentoo Linux mail <gentoo-user.gentoo.org>
X-BeenThere: gentoo-user@lists.gentoo.org
Reply-to: gentoo-user@lists.gentoo.org
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Tue, 8 Jan 2013 19:53:41 +0000 (UTC)
Grant Edwards <grant.b.edwards@gmail.com> wrote:

> On 2013-01-08, Pandu Poluan <pandu@poluan.info> wrote:
> > On Jan 8, 2013 11:20 PM, "Florian Philipp" <lists@binarywings.net>
> > wrote:
> >>
> >
> > -- snip --
> >
> >>
> >> Hmm, good idea, albeit similar to the `md5sum -c`. Either tool
> >> leaves you with the problem of distinguishing between legitimate
> >> changes (i.e. a user wrote to the file) and decay.
> >>
> >> When you have completely static content, md5sum, rsync and friends
> >> are sufficient. But if you have content that changes from time to
> >> time, the number of false-positives would be too high. In this
> >> case, I think you could easily distinguish by comparing both file
> >> content and time stamps.
> >>
> >> Now, that of course introduces the problem that decay could occur
> >> in the same time frame as a legitimate change, thus masking the
> >> decay. To reduce this risk, you have to reduce the checking
> >> interval.
> >>
> >> Regards,
> >> Florian Philipp
> >
> > IMO, we're all barking up the wrong tree here...
> >
> > Before a file's content can change without user involvement, bit
> > rot must first get through the checksum (CRC?) of the hard disk
> > itself. There will be no 'gradual degradation of data', just
> > 'catastrophic data loss'.
> 
> When a hard drive starts to fail, you don't unknowingly get back
> "rotten" data with some bits flipped.  You get either a "seek error"
> or "read error", and no data at all.  IIRC, the same is true for
> attempts to read a failing CD.

I see what Florian is getting at here, and he's perfectly correct.

We techie types often like to think our storage is purely binary: the
cells are either on or off, and they never change unless we
deliberately make them change. We think this way because we wrap our
storage in layers to make it look that way, in the style of an API.


The truth is that our storage is subject to decay. Hard drives are
magnetic at heart, and atoms have to align and stay aligned for the
drive to work. Floppies are far worse at this, but hard drives are
not immune. Writable CDs do not have physical pits and lands like
factory-pressed discs have; they use chemicals to make reflective and
non-reflective spots. The list of points of corruption is long, and
they all happen after the data has been committed to physical storage.

Worse, you only find out about the corruption by reading the data back;
there is no other way to discover whether the medium and the data are
still OK. He wants to read the medium occasionally and verify it while
the backups are still usable, and not wait for the point of no return:
the "read error" from a medium that has long since failed.
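
A minimal sketch of that kind of periodic check, assuming Python 3 and
a record of (mtime, SHA-256) pairs kept from a previous pass. The
function names (scan, compare) and the "suspect" label are made up for
illustration and do not come from any existing tool. The idea follows
Florian's suggestion of comparing both content and time stamps: if the
content changed but the mtime did not, nobody wrote to the file, so it
is worth checking against the backup.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root):
    """Return {relative path: (mtime_ns, sha256)} for regular files under root."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            if not os.path.isfile(full):
                continue  # skip broken symlinks, sockets and the like
            rel = os.path.relpath(full, root)
            state[rel] = (os.stat(full).st_mtime_ns, sha256_of(full))
    return state


def compare(old, new):
    """Classify the differences between an earlier scan and a fresh one."""
    report = {"suspect": [], "modified": [], "missing": [], "added": []}
    for rel, (old_mtime, old_hash) in sorted(old.items()):
        if rel not in new:
            report["missing"].append(rel)
            continue
        new_mtime, new_hash = new[rel]
        if new_hash != old_hash:
            # Same mtime but different content: no recorded write, so
            # treat it as possible decay. A changed mtime is assumed to
            # be a legitimate edit.
            key = "suspect" if new_mtime == old_mtime else "modified"
            report[key].append(rel)
    report["added"] = sorted(set(new) - set(old))
    return report
```

Run scan() from cron, keep the result somewhere other than the disk
being checked, and feed the previous and current results to compare().
As Florian noted, decay that lands in the same interval as a real write
shows up as "modified" and is masked, so a shorter interval helps. This
only detects a problem; the repair still has to come from a backup.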

Maybe Florian's data is valuable enough to warrant the effort. I
know mine isn't, but his might be.


-- 
Alan McKinnon
alan.mckinnon@gmail.com