From: thegeezer
Date: Wed, 25 Jun 2014 11:42:25 +0100
To: gentoo-user@lists.gentoo.org
Reply-to: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] smartctrl drive error @60%
Message-ID: <53AAA791.4050506@thegeezer.net>
In-Reply-To: <53AA7EF5.2000903@gmail.com>

On 06/25/2014 08:49 AM, Dale wrote:
> thegeezer wrote:
>> this is pretty bad.
> Here is the output:
>
> root@fireball / # smartctl -a /dev/sdc
> smartctl 6.1 2013-03-16 r3800 [x86_64-linux-3.14.0-gentoo] (local build)
> Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
>
> === START OF INFORMATION SECTION ===
> Model Family: Seagate Barracuda 7200.14 (AF)
> Device Model: ST3000DM001-9YN166
> Serial Number: Z1F0PKT5
> LU WWN Device Id: 5 000c50 04d79e15c
> Firmware Version: CC4C
> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
> Sector Sizes: 512 bytes logical, 4096 bytes physical
> Rotation Rate: 7200 rpm
> Device is: In smartctl database [for details use: -P show]
> ATA Version is: ATA8-ACS T13/1699-D revision 4
> SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
> Local Time is: Wed Jun 25 02:46:39 2014 CDT
>
> ==> WARNING: A firmware update for this drive is available,
> see the following Seagate web pages:
> http://knowledge.seagate.com/articles/en_US/FAQ/207931en
> http://knowledge.seagate.com/articles/en_US/FAQ/223651en

interesting - not seen that before
might be worth a nose

> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status: (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection:
> Disabled.
> Self-test execution status: ( 118) The previous self-test completed
> having
> the read element of the test failed.
> Total time to complete Offline
> data collection: ( 584) seconds.
> Offline data collection
> capabilities: (0x73) SMART execute Offline immediate.
> Auto Offline data collection
> on/off support.
> Suspend Offline collection upon new
> command.
> No Offline surface scan supported.
> Self-test supported.
> Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities: (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability: (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: ( 1) minutes.
> Extended self-test routine
> recommended polling time: ( 340) minutes.
> Conveyance self-test routine
> recommended polling time: ( 2) minutes.
> SCT capabilities: (0x3085) SCT Status supported.
>
> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE
> UPDATED WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate 0x000f 119 099 006 Pre-fail
> Always - 234421760

You can happily ignore this error rate: it is usual for it to be high, and there is hardware correction for it.

> 3 Spin_Up_Time 0x0003 092 092 000 Pre-fail
> Always - 0
> 4 Start_Stop_Count 0x0032 100 100 020 Old_age
> Always - 33

33 power cycles seems very low, but further down we see the power-on time is just under two years, which is also on the lighter side of the MTBF.

> 5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail
> Always - 0

Zero reallocated sectors suggests there is still spare space to do reallocation.

> 7 Seek_Error_Rate 0x000f 079 060 030 Pre-fail
> Always - 99909120
> 9 Power_On_Hours 0x0032 082 082 000 Old_age
> Always - 16379

Almost two years of power-on time.

> 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail
> Always - 0
> 12 Power_Cycle_Count 0x0032 100 100 020 Old_age
> Always - 34
> 183 Runtime_Bad_Block 0x0032 100 100 000 Old_age
> Always - 0
> 184 End-to-End_Error 0x0032 100 100 099 Old_age
> Always - 0
> 187 Reported_Uncorrect 0x0032 100 100 000 Old_age
> Always - 0
> 188 Command_Timeout 0x0032 100 100 000 Old_age
> Always - 0 0 0
> 189 High_Fly_Writes 0x003a 100 100 000 Old_age
> Always - 0
> 190 Airflow_Temperature_Cel 0x0022 069 063 045 Old_age
> Always - 31 (Min/Max 26/33)
> 191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age
> Always - 0
> 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age
> Always - 9
> 193 Load_Cycle_Count 0x0032 093 093 000 Old_age
> Always - 14284
> 194 Temperature_Celsius 0x0022 031 040 000 Old_age
> Always - 31 (0 17 0 0 0)
> 197 Current_Pending_Sector 0x0012 100 100 000 Old_age
> Always - 104

Attribute 197 says there are 104 pending sectors, i.e. bad blocks on the drive that have not been reallocated yet.

> 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age
> Offline - 104

This says the offline scan could not correct those same 104 sectors, so nothing has been reallocated. That looks odd next to entry 5 being zero, though a pending sector is normally only remapped once it is written to again.

> 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age
> Always - 0
> 240 Head_Flying_Hours 0x0000 100 253 000 Old_age
> Offline - 15955h+37m+28.932s
> 241 Total_LBAs_Written 0x0000 100 253 000 Old_age
> Offline - 52221690631887
> 242 Total_LBAs_Read 0x0000 100 253 000 Old_age
> Offline - 74848968465606
>
> SMART Error Log Version: 1
> No Errors Logged
>
> SMART Self-test log structure revision number 1
> Num Test_Description Status Remaining
> LifeTime(hours) LBA_of_first_error
> # 1 Extended offline Completed: read failure 60%
> 16365 2905482560
> # 2 Extended offline Completed: read failure 60%
> 16352 2905482560
> # 3 Extended offline Completed without error 00%
> 8044 -
> # 4 Extended offline Completed without error 00%
> 3121 -
> # 5 Extended offline Completed without error 00%
> 1548 -
> # 6 Short offline Completed without error 00%
> 1141 -
> # 7 Extended offline Completed without error 00%
> 719 -
> # 8 Extended offline Completed without error 00%
> 525 -
> # 9 Short offline Completed without error 00%
> 516 -
> #10 Extended offline Completed without error 00%
> 18 -
> #11 Extended offline Completed without error 00%
> 5 -
> #12 Short offline Completed without error 00%
> 0 -
>
> SMART Selective self-test log data structure revision number 1
> SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
> 1 0 0 Not_testing
> 2 0 0 Not_testing
> 3 0 0 Not_testing
> 4 0 0 Not_testing
> 5 0 0 Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan
> remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
>
> root@fireball / #
>
> Does that help shed any light on this situation? If you need more
> info, just let me know. Off to newegg. BRB
>
> Dale
>
> :-) :-)
>

104 bad blocks is not a sign of a happy disk. I would replace it urgently.

Also consider running smartd or a smartmonitor plugin for munin, as the test log suggests the last test before these failures was run at about the one-year mark (8044 power-on hours).
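
For anyone who wants to script the same reading of the attribute table, here is a rough sketch. It is only an illustration and not part of smartmontools: the function names (parse_attributes, flag_problems, power_on_years) and the choice of which attributes to watch are assumptions made for this example, and the sample text is four rows from the report with the mail line-wrapping undone.

```python
# Watch list is an assumption for this sketch: attributes whose raw value
# is expected to stay at zero on a healthy disk.
WATCHED_IDS = (5, 197, 198)

# Four rows copied from the report, unwrapped to one line each.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       16379
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       104
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       104
"""


def parse_attributes(text):
    """Map attribute ID -> (name, raw value) for each attribute row in text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # An attribute row has at least 10 columns, a numeric ID and a hex flag.
        if len(fields) >= 10 and fields[0].isdigit() and fields[2].startswith("0x"):
            raw = fields[9]
            attrs[int(fields[0])] = (fields[1], int(raw) if raw.isdigit() else raw)
    return attrs


def flag_problems(attrs):
    """Return 'name = raw' for every watched attribute with a nonzero raw value."""
    problems = []
    for attr_id in WATCHED_IDS:
        name, raw = attrs.get(attr_id, ("", 0))
        if isinstance(raw, int) and raw > 0:
            problems.append(f"{name} = {raw}")
    return problems


def power_on_years(attrs):
    """Convert Power_On_Hours (attribute 9) to years of 365 days."""
    return attrs[9][1] / (24 * 365)


if __name__ == "__main__":
    attrs = parse_attributes(SAMPLE)
    for problem in flag_problems(attrs):
        print("needs attention:", problem)
    print(f"power-on time: {power_on_years(attrs):.2f} years")
```

Fed the full attribute section instead of SAMPLE, the same parser should skip the header and self-test log rows, since they do not have a numeric ID followed by a hex flag. For ongoing monitoring smartd is still the better tool; this only mirrors the manual reading done in this mail.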