Date: Tue, 8 Jan 2013 23:45:04 +0200
From: Alan McKinnon
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Re: OT: Fighting bit rot
Message-ID: <20130108234504.08c19c9c@khamul.example.com>

On Tue, 8 Jan 2013 19:53:41 +0000 (UTC)
Grant Edwards wrote:

> On 2013-01-08, Pandu Poluan wrote:
> > On Jan 8, 2013 11:20 PM, "Florian Philipp"
> > wrote:
> >>
> >
> > -- snip --
> >
> >>
> >> Hmm, good idea, albeit similar to the `md5sum -c`. Either tool
> >> leaves you with the problem of distinguishing between legitimate
> >> changes (i.e. a user wrote to the file) and decay.
> >>
> >> When you have completely static content, md5sum, rsync and friends
> >> are sufficient. But if you have content that changes from time to
> >> time, the number of false-positives would be too high. In this
> >> case, I think you could easily distinguish by comparing both file
> >> content and time stamps.
> >>
> >> Now, that of course introduces the problem that decay could occur
> >> in the same time frame as a legitimate change, thus masking the
> >> decay. To reduce this risk, you have to reduce the checking
> >> interval.
> >>
> >> Regards,
> >> Florian Philipp
> >
> > IMO, we're all barking up the wrong tree here...
> >
> > Before a file's content can change without user involvement, bit
> > rot must first get through the checksum (CRC?) of the hard disk
> > itself. There will be no 'gradual degradation of data', just
> > 'catastrophic data loss'.
>
> When a hard drive starts to fail, you don't unknowingly get back
> "rotten" data with some bits flipped. You get either a "seek error"
> or "read error", and no data at all. IIRC, the same is true for
> attempts to read a failing CD.

I see what Florian is getting at here, and he's perfectly correct.

We techie types often like to think our storage is purely binary: the
cells are either on or off, and they never change unless we
deliberately make them change. We think this way because we wrap our
storage in layers to make it look that way, in the style of an API.

The truth is that our storage is subject to decay. Hard drives are
magnetic at heart, and atoms have to align and stay aligned for the
drive to work. Floppies are far worse at this, but hard drives are not
immune. Writeable CDs do not have physical pits and lands like factory
original discs have; they use chemicals to make reflective and
non-reflective spots. The list of points of corruption is long, and
they all happen after the data has been committed to physical storage.

Worse, you only know about the corruption by reading it; there is no
other way to discover whether the medium and the data are still OK. He
wants to read the medium occasionally and verify it while the backups
are still usable, and not wait for the point of no return - the "read
error" from a medium that long since failed.

Maybe Florian's data is valuable enough to warrant the effort. I know
mine isn't, but his might be.

-- 
Alan McKinnon
alan.mckinnon@gmail.com
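
Florian's suggestion quoted above, comparing both file content and time
stamps, could be sketched roughly as below. This is only an
illustration under stated assumptions, not a tool anyone in the thread
posted: the function names (file_digest, snapshot, classify) and the
status labels are made up for the example, SHA-256 stands in for the
md5sum mentioned in the thread, and the mtime is assumed to change
whenever a user legitimately writes to a file.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map each regular file under root to (mtime_ns, digest)."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            mtime_ns = os.stat(path).st_mtime_ns
            result[path] = (mtime_ns, file_digest(path))
    return result


def classify(old, new):
    """Compare two snapshots and return {path: status}.

    'ok'       - digest unchanged
    'modified' - digest and mtime both changed (presumed user write)
    'suspect'  - digest changed but mtime did not (possible decay)
    'missing'  - file was in the old snapshot but is gone now
    """
    report = {}
    for path, (old_mtime, old_digest) in old.items():
        if path not in new:
            report[path] = "missing"
            continue
        new_mtime, new_digest = new[path]
        if new_digest == old_digest:
            report[path] = "ok"
        elif new_mtime == old_mtime:
            report[path] = "suspect"
        else:
            report[path] = "modified"
    return report
```

In use, one would store the result of snapshot() somewhere safe, take
a fresh snapshot on the next run, and look at the paths classify()
marks as 'suspect'. As Florian notes, decay that happens in the same
interval as a legitimate write would show up as 'modified' and be
masked, so a shorter checking interval reduces that risk.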