On Jan 9, 2013 2:06 AM, "Florian Philipp" wrote:
>
> On 08.01.2013 18:41, Pandu Poluan wrote:
> >
> > On Jan 8, 2013 11:20 PM, "Florian Philipp" wrote:
> >>
> >
> > -- snip --
> >
> [...]
> >>
> >> When you have completely static content, md5sum, rsync and friends
> >> are sufficient. But if you have content that changes from time to
> >> time, the number of false positives would be too high. In this case,
> >> I think you could easily distinguish by comparing both file content
> >> and time stamps.
> >>
> [...]
> >
> > IMO, we're all barking up the wrong tree here...
> >
> > Before a file's content can change without user involvement, bit rot
> > must first get through the checksum (CRC?) of the hard disk itself.
> > There will be no 'gradual degradation of data', just 'catastrophic
> > data loss'.
> >
>
> Unfortunately, that's only partly true. Latent disk errors are a
> well-researched topic [1-3]. CRCs are not perfectly reliable. The trick
> is to detect and correct errors while you still have valid backups or
> other types of redundancy.
>
> The only way to do this is regular scrubbing. That's why professional
> archival solutions offer some kind of self-healing, which is usually
> just the same as what I proposed (plus whatever on-access integrity
> checks the platform supports) [4].
>
> > I would rather focus my efforts on ensuring that my backups are
> > always restorable, at least until the most recent time of archival.
> >
>
> That's the point:
> a) You have to detect when you have to restore from backup.
> b) You have to verify that the backup itself is still valid.
> c) You have to avoid situations where undetected errors creep into the
>    backup.
>
> I'm not talking about a purely theoretical possibility. I have
> experienced just that: some data that I had kept lying around for years
> was corrupted.
>
> [1] Schwarz et al.: Disk Scrubbing in Large, Archival Storage Systems
>     http://www.cse.scu.edu/~tschwarz/Papers/mascots04.pdf
>
> [2] Baker et al.: A fresh look at the reliability of long-term digital
>     storage
>     http://arxiv.org/pdf/cs/0508130
>
> [3] Bairavasundaram et al.: An Analysis of Latent Sector Errors in Disk
>     Drives
>     http://bnrg.eecs.berkeley.edu/~randy/Courses/CS294.F07/11.1.pdf
>
> [4] http://uk.emc.com/collateral/analyst-reports/kci-evaluation-of-emc-centera.pdf
>
> Regards,
> Florian Philipp
>

Interesting reads... thanks for the links!

Hmm... if I were in your position, I think this is what I'd do:

1. Make a set of MD5 checksums, one per file, for ease of updating.
2. Compare a file's stored checksum against its actual content before
   opening it. If they don't match, notify.
3. When the file handle is closed, recalculate and store the new
   checksum.

Protect the set of MD5 checksums periodically using par2 (rough
sketches of both ideas are in the P.S. below). Also protect your
backups with par2, for that matter (that's what I always do when I
archive something to optical media).

Of course, you could use par2 outright to protect and error-correct
(ECC) your data, but the time needed to regenerate the .par2 files
*every time* would be too much, methinks...

Rgds,
--
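
P.S. A rough sketch of what I mean with steps 1-3, as a shell wrapper
around md5sum. The <file>.md5 naming and the $EDITOR call are just
illustrations, not a fixed recipe:

  #!/bin/sh
  # Per-file MD5 bookkeeping: verify before use, refresh after changes.
  f="$1"          # the file we want to open
  sum="$f.md5"    # one checksum file per data file, for easy updates

  # (1) create the checksum if it doesn't exist yet
  [ -f "$sum" ] || md5sum "$f" > "$sum"

  # (2) verify before opening; notify on mismatch
  if ! md5sum -c --quiet "$sum"; then
      echo "WARNING: checksum mismatch on $f -- possible corruption" >&2
      exit 1
  fi

  # open/edit the file (stand-in for whatever actually uses it)
  "${EDITOR:-vi}" "$f"

  # (3) the content may have changed legitimately; recalculate on close
  md5sum "$f" > "$sum"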
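
And the par2 part would look roughly like this (assuming par2cmdline;
the 10% redundancy and the file names are just examples, adjust to
taste):

  # create recovery blocks for the whole checksum set (or your archive)
  par2 create -r10 checksums.par2 *.md5

  # later, during a periodic scrub: verify, and repair only if needed
  par2 verify checksums.par2 || par2 repair checksums.par2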