From: Grant Edwards
To: gentoo-user@lists.gentoo.org
Subject: [gentoo-user] HD self test getting stuck at 90% remaining [solved]
Date: Fri, 2 Jun 2023 01:58:58 -0000 (UTC)

I just bought a new 6TB WD Red Plus drive to replace a couple older
drives (one of which had generated an uncorrectable error email from
smartd). I decided that I'd stick it in an external USB-3 SATA drive
"dock" thingy from Thermaltake and do some testing before installing
it.

Using smartctl, I ran a "short" self test and a "conveyance" test, and
both passed. Then I started a "long" (extended) self test. According
to the drive, that should take 640 minutes (a bit under 11 hours). It
almost immediately reported the test running with 90% remaining. A
couple hours later, it still said 90% remaining.

I stopped the test, ran another short test, messed around with some
other smartctl commands, and started another "long" test. Again, it
immediately said 90% remaining. After 20 hours, it still said 90%
remaining.

Tests getting stuck like that seem to be pretty common. Some time
spent Googling found me two suggestions:

 * The test is stalled because the disk is busy.
 * The test is stalled because the disk is idle.

Apparently, they're both valid. The self-test runs in the background
when the drive is not busy, so if the drive is heavily loaded, self
tests can take a lot longer. But, if the drive is _completely_ idle,
it might spin down and go to sleep (which pauses the test). This is
reportedly more likely to happen when the drive is attached via a
USB-SATA adapter [I don't know if that's true].

Sure enough, if I waited 5-10 minutes and asked for the test status, I
would hear the drive spin up when queried. So I ran a "watch" command
to query the drive every 10 seconds:

    watch -n10 smartctl -d sat -c /dev/sdc

That seems to keep the drive spinning without doing any actual R/W
ops, and within an hour the status changed to 80% remaining. About 6
hours after that, it's now 20% remaining.

So the moral of the story is: when running an extended self test on a
drive, you don't want it to be busy, but you also don't want it so
idle that it spins down and goes to sleep.

Maybe everybody else already knew that, but it took me an hour or two
to figure it out...
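
For anyone who would rather log the progress than stare at "watch", here is a rough sketch of the same polling idea as a couple of shell functions. The function names (remaining_pct, poll_selftest) are made up for illustration, and the parsing assumes that "smartctl -c" reports progress on a line of the form "NN% of test remaining." while a test is running. The "-d sat" option is only there because of the USB-SATA dock; other setups may not need it.

```shell
#!/bin/sh
# Sketch only: poll a drive's self-test status so it does not spin
# down, and log the remaining percentage. Function names are
# hypothetical, not part of smartmontools.

# Read `smartctl -c` output on stdin and print the number from the
# "NN% of test remaining." line. Prints nothing if there is no such
# line (assumed to mean that no self-test is in progress).
remaining_pct() {
    awk '/% of test remaining/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+%$/) { sub(/%/, "", $i); print $i; exit }
    }'
}

# Query DEVICE every INTERVAL seconds (default 10) until smartctl no
# longer reports a remaining percentage. Needs enough privilege to
# run smartctl on the device (usually root).
poll_selftest() {
    device=$1
    interval=${2:-10}
    while pct=$(smartctl -d sat -c "$device" | remaining_pct) &&
          [ -n "$pct" ]; do
        printf '%s: %s%% remaining\n' "$(date '+%H:%M:%S')" "$pct"
        sleep "$interval"
    done
}
```

After sourcing the file, something like "poll_selftest /dev/sdc 10" would do roughly what the watch command does, but it stops by itself once the drive stops reporting a percentage. Whether a query every 10 seconds is needed, or a longer interval would also keep the drive awake, I haven't tested.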