From mboxrd@z Thu Jan 1 00:00:00 1970
Precedence: bulk
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-user@lists.gentoo.org
Reply-to: gentoo-user@lists.gentoo.org
MIME-Version: 1.0
References: <838C106F-28D0-4DFB-822F-520EAECEE034@stellar.eclipse.co.uk>
 <5bdc1c8b1003030600t18807114g3edc81fa6a86500e@mail.gmail.com>
Date: Wed, 3 Mar 2010 07:26:13 -0800
Message-ID:
 <5bdc1c8b1003030726u90f837fw16015f0e724251fb@mail.gmail.com>
Subject: Re: [gentoo-user] Filesystem corruption - reiserfs? - won't boot, "filesystem couldn't be fixed :("
From: Mark Knecht
To: gentoo-user@lists.gentoo.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
X-Archives-Salt: c5334251-ca88-455a-bbbd-9b41cee0dd28
X-Archives-Hash: 2f6a51c62de738c353f355275160a0e9

On Wed, Mar 3, 2010 at 6:26 AM, Stroller wrote:
>
> On 3 Mar 2010, at 14:00, Mark Knecht wrote:
>>
>> On Wed, Mar 3, 2010 at 4:24 AM, Stroller wrote:
>>>
>>> There seem to have been a few people posting with filesystem corruption in
>>> the last week or two. It seems to be my turn, so I hope it isn't contagious.
>>> The cause here is quite clear - whilst rummaging in the server cupboard
>>> yesterday, power to the machine was accidentally disconnected.
>>
>> ...
>> Sorry for your problems. I've had a rash of machine problems over
>> the last 6 weeks. No fun. I feel for you.
>>
>> In my most recent case what looked like a simple disk corruption
>> problem was really a prelude to the drive just plain going bad. Have
>> you tried smartctl to see what it says about the drive at this point?
>>
>> It would be even more frustrating to chroot in, do all the work,
>> think you had it fixed and then the underlying foundation of your
>> house crumbles beneath you 3 weeks from now.
>
> I don't think this is a problem.
> I would love to know what others think of
> the `smartctl` output:
>
>
> root@sysresccd /root % smartctl -H /dev/sda
> smartctl version 5.38 [i486-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> Please note the following marginal Attributes:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   9 Power_On_Seconds        0x0012   001   001   020    Old_age   Always   FAILING_NOW 44803h+12m+16s
>
> root@sysresccd /root % smartctl -i /dev/sda
> smartctl version 5.38 [i486-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF INFORMATION SECTION ===
> Model Family:     Fujitsu MPA..MPG series
> Device Model:     FUJITSU MPF3204AT
> Serial Number:    05030567
> Firmware Version: 0028
> User Capacity:    20,496,236,544 bytes
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   5
> ATA Standard is:  ATA/ATAPI-5 T13 1321D revision 1
> Local Time is:    Wed Mar  3 14:14:31 2010 UTC
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> root@sysresccd /root %
>
>
> This looks to me like smartctl is going "OMG! What an ancient drive!" - it's
> a 20gig EIDE drive and if my pocket calculator is correct (44803/24/365),
> it's seen 5 years of active use - and that's the "marginal attribute"
> referred to.
>
> Like I said, the power plug was accidentally pulled on this drive, so I'm
> inclined to attribute the corruption only to that, not to the drive actually
> failing.
>
> The drive is in a computer that has rarely been turned off in the last
> couple of years, and is also in a warm environment, conditions which are
> ideal. I appreciate the latter seems unintuitive, but in fact studies have
> showed that drives in somewhat warm environments last longer than those that
> are cooled.
>
> That it passes the "SMART overall-health self-assessment test" suggests to
> me that it is chugging away quite happily.
>
> I would have dismissed your concerns were it not for the capitalised
> "FAILING_NOW" in the output. Like I say, I think this is just smartctl
> declaring "OMG! this drive is old!", but I open this matter to the list for
> discussion (should you wish).
>
> I think I'm actually nearly ready to migrate off this system. The power was
> actually pulled as I installed 3 new (to me) rackmount machines in the
> server cupboard - the plan is to have identical machines running RAID, so
> that in the case of ANY problems I have spares available. I have take
> nightly backups of the important data on this machine, however I'd prefer it
> to run just a couple or a few weeks longer to allow me to migrate at my own
> leisure.
>
> Stroller.

I've had two machines go bad due to hard drive problems in the last 6
weeks. One drive was 4.5 years old, the other 6 years old.

I have no experience with SMART; I'm just learning about it. However,
the data is generated by the microcontroller in the hard drive, as per
the view of the drive manufacturer, so if the drive is telling you it's
failing then...

My 4.5-year-old drive actually stopped producing SMART output somewhere
along the way before it failed. I wasn't using SMART on the 6-year-old
drive at the time, so I had no data from it, but it was in an
environment where the UPS went through a lot of abuse.

It sounds like you have good backups, so just make sure they are good
and do what you want.

- Mark
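
For anyone checking the numbers in the quoted output, here is a minimal Python sketch. It parses the one marginal attribute row as posted, applies the usual SMART convention that an attribute counts as failed when its normalized VALUE is at or below THRESH, and redoes the 44803/24/365 estimate. The helper names (parse_attribute, raw_to_years) are made up for illustration and are not part of smartmontools, and the row is hard-coded from the message rather than read from a live drive.

```python
import re

# Attribute row copied from the quoted `smartctl -H` output, with the
# column padding collapsed to single spaces (hard-coded, not read live).
ROW = "9 Power_On_Seconds 0x0012 001 001 020 Old_age Always FAILING_NOW 44803h+12m+16s"


def parse_attribute(row):
    """Split one attribute row into named fields (hypothetical helper)."""
    fields = row.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),   # normalized VALUE column
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "type": fields[6],         # Pre-fail or Old_age
        "when_failed": fields[8],
        "raw": fields[9],
    }


def raw_to_years(raw):
    """Convert a raw value shaped like '44803h+12m+16s' into years."""
    m = re.fullmatch(r"(\d+)h\+(\d+)m\+(\d+)s", raw)
    hours = int(m.group(1)) + int(m.group(2)) / 60 + int(m.group(3)) / 3600
    return hours / 24 / 365


attr = parse_attribute(ROW)

# Assumed convention: the attribute is flagged when VALUE <= THRESH.
below_threshold = attr["value"] <= attr["thresh"]
years = raw_to_years(attr["raw"])

print(f"{attr['name']}: VALUE {attr['value']} vs THRESH {attr['thresh']}, "
      f"at or below threshold: {below_threshold}, type: {attr['type']}")
print(f"Power-on time: about {years:.1f} years")
```

The comparison shows why the row is marked FAILING_NOW (001 is below 020), and the arithmetic agrees with the "5 years of active use" estimate. The attribute type here is Old_age rather than Pre-fail, which is consistent with the overall health check still reporting PASSED.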