From: Dale
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] smartctrl drive error @60%
Date: Wed, 25 Jun 2014 08:15:57 -0500
Message-ID: <53AACB8D.6010300@gmail.com>
In-Reply-To: <53AAA791.4050506@thegeezer.net>

thegeezer wrote:
> On 06/25/2014 08:49 AM, Dale wrote:
>> thegeezer wrote:
>>> this is pretty bad.
>> Here is the output:
>>
>> root@fireball / # smartctl -a /dev/sdc
>> smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
>> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
>>
>> === START OF INFORMATION SECTION ===
>> Model Family:     Seagate Barracuda 7200.14 (AF)
>> Device Model:     ST3000DM001-9YN166
>> Serial Number:    Z1F0PKT5
>> LU WWN Device Id: 5 000c50 04d79e15c
>> Firmware Version: CC4C
>> User Capacity:    3,000,592,982,016 bytes [3.00 TB]
>> Sector Sizes:     512 bytes logical, 4096 bytes physical
>> Rotation Rate:    7200 rpm
>> Device is:        In smartctl database [for details use: -P show]
>> ATA Version is:   ATA8-ACS T13/1699-D revision 4
>> SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
>> Local Time is:    Wed Jun 25 02:46:39 2014 CDT
>>
>> ==> WARNING: A firmware update for this drive is available,
>> see the following Seagate web pages:
>> http://knowledge.seagate.com/articles/en_US/FAQ/207931en
>> http://knowledge.seagate.com/articles/en_US/FAQ/223651en
> interesting - not seen that before might be worth a nose

I was thinking the same thing myself.  How does it know there is an
update was another question I had.

>> SMART support is: Available - device has SMART capability.
>> SMART support is: Enabled
>>
>> === START OF READ SMART DATA SECTION ===
>> SMART overall-health self-assessment test result: PASSED
>>
>> General SMART Values:
>> Offline data collection status:  (0x00) Offline data collection activity
>>                                         was never started.
>>                                         Auto Offline Data Collection:
>>                                         Disabled.
>> Self-test execution status:      ( 118) The previous self-test completed having
>>                                         the read element of the test failed.
>> Total time to complete Offline
>> data collection:                 ( 584) seconds.
>> Offline data collection
>> capabilities:                    (0x73) SMART execute Offline immediate.
>>                                         Auto Offline data collection on/off support.
>>                                         Suspend Offline collection upon new
>>                                         command.
>>                                         No Offline surface scan supported.
>>                                         Self-test supported.
>>                                         Conveyance Self-test supported.
>>                                         Selective Self-test supported.
>> SMART capabilities:            (0x0003) Saves SMART data before entering
>>                                         power-saving mode.
>>                                         Supports SMART auto save timer.
>> Error logging capability:        (0x01) Error logging supported.
>>                                         General Purpose Logging supported.
>> Short self-test routine
>> recommended polling time:        (   1) minutes.
>> Extended self-test routine
>> recommended polling time:        ( 340) minutes.
>> Conveyance self-test routine
>> recommended polling time:        (   2) minutes.
>> SCT capabilities:              (0x3085) SCT Status supported.
>>
>> SMART Attributes Data Structure revision number: 10
>> Vendor Specific SMART Attributes with Thresholds:
>> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>>   1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       234421760
> you can happily ignore this error rate, it is usual for it to be high
> and there is hardware correction for it
>
>>   3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always       -       0
>>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       33
> 33 power cycles seem very low but further down we see the power on time
> is just under two years which is also erring towards the lighter side of
> the mtbf

About the only time I shut down is when the power fails.  My puter only
pulls about 150 watts so I just leave it running 24/7.

>
>>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
> zero reallocated sectors suggests there is space to do reallocation
>
>>   7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       99909120
>>   9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       16379
> almost two years of power on time
>
>>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
>>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       34
>> 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
>> 184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
>> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
>> 188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
>> 189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
>> 190 Airflow_Temperature_Cel 0x0022   069   063   045    Old_age   Always       -       31 (Min/Max 26/33)
>> 191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
>> 192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       9
>> 193 Load_Cycle_Count        0x0032   093   093   000    Old_age   Always       -       14284
>> 194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31 (0 17 0 0 0)
>> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       104
> 197
> this says there are 104 pending sectors i.e. bad blocks on the drive
> that have not been reallocated yet

Wonder why it hasn't?  Isn't it supposed to do that sort of thing itself?

>
>> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       104
> this says it was not able to reallocate. which is odd because of the
> entry 5 being zero

Uh oh.

>
>> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
>> 240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       15955h+37m+28.932s
>> 241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       52221690631887
>> 242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       74848968465606
>>
>> SMART Error Log Version: 1
>> No Errors Logged
>>
>> SMART Self-test log structure revision number 1
>> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
>> # 1  Extended offline    Completed: read failure       60%     16365         2905482560
>> # 2  Extended offline    Completed: read failure       60%     16352         2905482560
>> # 3  Extended offline    Completed without error       00%      8044         -
>> # 4  Extended offline    Completed without error       00%      3121         -
>> # 5  Extended offline    Completed without error       00%      1548         -
>> # 6  Short offline       Completed without error       00%      1141         -
>> # 7  Extended offline    Completed without error       00%       719         -
>> # 8  Extended offline    Completed without error       00%       525         -
>> # 9  Short offline       Completed without error       00%       516         -
>> #10  Extended offline    Completed without error       00%        18         -
>> #11  Extended offline    Completed without error       00%         5         -
>> #12  Short offline       Completed without error       00%         0         -
>>
>> SMART Selective self-test log data structure revision number 1
>>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>>     1        0        0  Not_testing
>>     2        0        0  Not_testing
>>     3        0        0  Not_testing
>>     4        0        0  Not_testing
>>     5        0        0  Not_testing
>> Selective self-test flags (0x0):
>>   After scanning selected spans, do NOT read-scan remainder of disk.
>> If Selective self-test is pending on power-up, resume after 0 minute delay.
>>
>> root@fireball / #
>>
>> Does that help shed any light on this situation?  If you need more
>> info, just let me know.  Off to newegg.  BRB
>>
>> Dale
>>
>> :-)  :-)
>>
> 104 bad blocks is not a sign of a happy disk.
> i would replace urgently
>
> also consider running smartd or a smartmonitor plugin for munin as the
> test log suggests you last ran a test after the first year of usage
>
>

I usually just run the test manually but I sort of had family stuff
going on for the past year, almost a year anyway.  Sort of behind on
things although I have been doing my normal updates.

I ordered a drive.  It should be here tomorrow.  In the meantime, I
shut down and re-seated all the cables, power too.  I got the test
running again but the results are a few hours off yet.  It did pass the
short test tho.  I'm not sure that it means much.

Thanks much.

Dale

:-)  :-)
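
For anyone who wants to repeat the check discussed in this thread without reading the table by eye, here is a rough Python sketch. It assumes the attribute rows are in the usual one-row-per-line smartctl layout (the rows in this mail were wrapped by the mail client, so the sample joins them back up). The names `parse_attributes`, `nonzero_watched` and `hours_to_years` are made up for this example and are not part of smartmontools. Watching attributes 5, 197 and 198 is only the rule of thumb used in the replies above, not an official failure test.

```python
import re

# Sample rows copied from the smartctl output quoted in this mail, with
# the mail-wrapped halves of each row joined back onto a single line.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       16379
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       104
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       104
"""

# ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE.
# Only the leading integer of RAW_VALUE is captured.
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]+\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(?P<raw>\d+)"
)

# Assumption: these are the IDs worth flagging when their raw value is not 0.
WATCHED_IDS = (5, 197, 198)


def parse_attributes(text):
    """Return {attribute_id: {name, value, worst, thresh, raw}} for each row found."""
    attrs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, blank line, or anything that is not an attribute row
        attrs[int(m.group("id"))] = {
            "name": m.group("name"),
            "value": int(m.group("value")),
            "worst": int(m.group("worst")),
            "thresh": int(m.group("thresh")),
            "raw": int(m.group("raw")),
        }
    return attrs


def nonzero_watched(attrs):
    """Return {attribute_id: raw} for watched attributes with a raw value above 0."""
    return {
        attr_id: attrs[attr_id]["raw"]
        for attr_id in WATCHED_IDS
        if attr_id in attrs and attrs[attr_id]["raw"] > 0
    }


def hours_to_years(hours):
    """Convert power-on hours to years, using 365-day years."""
    return hours / (24 * 365)


if __name__ == "__main__":
    attrs = parse_attributes(SAMPLE)
    for attr_id, raw in nonzero_watched(attrs).items():
        print(f"{attr_id} {attrs[attr_id]['name']}: raw value {raw}")
    print(f"power-on time: {hours_to_years(attrs[9]['raw']):.2f} years")
```

On the sample rows this flags 197 and 198 (104 each) and leaves 5 alone, which matches what was pointed out above. The same functions should work on text saved from `smartctl -A /dev/sdX`, as long as the rows are not wrapped.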