* [gentoo-user] Hard drive error from SMART
@ 2022-04-12  1:27 Dale
  2022-04-12  8:05 ` Wols Lists
                   ` (3 more replies)
  0 siblings, 4 replies; 57+ messages in thread
From: Dale @ 2022-04-12  1:27 UTC (permalink / raw)
  To: gentoo-user

Howdy,

As some know, I recently moved a LOT of data around. Seems to have stressed one of my drives. I got a email from SMART reporting a error. It's info:

The following warning/error was logged by the smartd daemon:

Device: /dev/sdd [SAT], 1 Currently unreadable (pending) sectors

The following warning/error was logged by the smartd daemon:

Device: /dev/sdd [SAT], 1 Offline uncorrectable sectors

This is from smartctl.

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   083   064   044    Pre-fail  Always       -       23544426
  3 Spin_Up_Time            0x0003   087   086   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       50
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4
  7 Seek_Error_Rate         0x000f   094   060   045    Pre-fail  Always       -       2694155454
  9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       24299 (121 195 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       35
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   086   000    Old_age   Always       -       14 14 14
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   061   059   040    Old_age   Always       -       39 (Min/Max 30/41)
191 G-Sense_Error_Rate      0x0032   092   092   000    Old_age   Always       -       17952
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       498
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       1044
194 Temperature_Celsius     0x0022   039   041   000    Old_age   Always       -       39 (0 18 0 0 0)
195 Hardware_ECC_Recovered  0x001a   031   001   000    Old_age   Always       -       23544426
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
203 Run_Out_Cancel          0x00b3   100   100   099    Pre-fail  Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       24215h+54m+57.249s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       18070332014
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       18343277504

The nutshell is #5 up there. #198 was a issue until I ran the long selftest. It moved to #5 plus added 3 or 4 it seems. According to google results, it should be fine for now. Still, a replacement drive is on the way and I've unmount the drives for that LVM. They still spinning and running a selftest but nothing else should be accessing them. This is also from the selftest.

SMART Self-test log structure revision number 1
Num  Test_Description    Status                         Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Self-test routine in progress  90%        24299            -
# 2  Short offline       Completed without error        00%        24298            -
# 3  Extended offline    Completed without error        00%        24291            -
# 4  Extended offline    Aborted by host                10%        24266            -
# 5  Short offline       Completed without error        00%        24218            -
# 6  Short offline       Completed without error        00%        24194            -
# 7  Short offline       Completed without error        00%        24171            -
# 8  Short offline       Completed without error        00%        24146            -

The one I aborted was because it was stuck on 10% for well over a day. The whole test doesn't take that long, or shouldn't anyway. I restarted it shortly after that. I might add, the test did take many hours longer than it estimated which from my past experience is quite odd. It's usually pretty accurate. Still, it completed and shows it passed, just has a boo boo on it. I also did a file system check it fixed a couple problems and a bunch of little things I see corrected often on bootup. Something about length of something. Seems trivial.

Given the low number and it showing it corrected that error, and then passed a short and long test, is this drive "safe enough" to keep in service? I have backups just in case but just curious what others know from experience. At least this isn't one of those nasty messages that the drive will die within 24 hours. I got one of those ages ago and it didn't miss it by much. A little over 30 hours or so later, it was a door stop. It would spin but it couldn't even be seen by the BIOS. Maybe drives are getting better and SMART is getting better as well.

Thoughts. Replace as soon as drive arrives or wait and see?

Dale

:-) :-)

^ permalink raw reply [flat|nested] 57+ messages in thread
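For anyone who wants to pull the interesting counters out of a table like the one in the message above, here is a minimal Python sketch. It assumes the ten-column layout shown in that smartctl output. The choice of attributes 5, 197 and 198 as the ones to watch is an assumption of this sketch, and the names `parse_attributes`, `watched_nonzero` and `SAMPLE` are made up for illustration.

```python
import re

# Attribute IDs watched for media trouble. This selection is an assumption
# of the sketch: reallocated, pending and offline-uncorrectable sectors.
WATCHED_IDS = {5, 197, 198}

# One row of the attribute table: nine whitespace-separated columns followed
# by RAW_VALUE, which may itself contain spaces.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(.*)$"
)


def parse_attributes(text):
    """Return {id: (name, raw_value)} for rows whose raw value starts with an integer."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        raw = match.group(10).split()
        # Keep only the leading integer, e.g. "39" from "39 (Min/Max 30/41)".
        if raw and raw[0].isdigit():
            attrs[int(match.group(1))] = (match.group(2), int(raw[0]))
    return attrs


def watched_nonzero(attrs):
    """Return the watched attributes whose raw value is above zero."""
    return {i: attrs[i] for i in sorted(WATCHED_IDS & attrs.keys()) if attrs[i][1] > 0}


# Three rows copied from the table in the message above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    print(watched_nonzero(parse_attributes(SAMPLE)))
```

Rows whose raw value is not a plain number, such as Head_Flying_Hours, are simply skipped by this sketch.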
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12  1:27 [gentoo-user] Hard drive error from SMART Dale
@ 2022-04-12  8:05 ` Wols Lists
  2022-04-12 13:01   ` Dale
  2022-04-12 14:20 ` Laurence Perkins
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 57+ messages in thread
From: Wols Lists @ 2022-04-12  8:05 UTC (permalink / raw)
  To: gentoo-user

On 12/04/2022 02:27, Dale wrote:
> The one I aborted was because it was stuck on 10% for well over a day.
> The whole test doesn't take that long, or shouldn't anyway. I restarted
> it shortly after that. I might add, the test did take many hours longer
> than it estimated which from my past experience is quite odd. It's
> usually pretty accurate. Still, it completed and shows it passed, just
> has a boo boo on it. I also did a file system check it fixed a couple
> problems and a bunch of little things I see corrected often on bootup.
> Something about length of something. Seems trivial.

Given that the firmware SOMETIMES gets its knickers in a twist, especially consumer drives (not sure what yours are?), and read errors are a dime a dozen, I wouldn't worry that much about ONE error.

Do another SMART test after your next reboot. Any NEW errors will be a red flag, but just this one again? Don't worry.

>
> Given the low number and it showing it corrected that error, and then
> passed a short and long test, is this drive "safe enough" to keep in
> service? I have backups just in case but just curious what others know
> from experience. At least this isn't one of those nasty messages that
> the drive will die within 24 hours. I got one of those ages ago and it
> didn't miss it by much. A little over 30 hours or so later, it was a
> door stop. It would spin but it couldn't even be seen by the BIOS.
> Maybe drives are getting better and SMART is getting better as well.

SMART is a lot better than it was, but remember, it only picks up wear and tear. Mechanical failure is just as deadly, and usually strikes out of the blue. I saw some stats somewhere it's something like 1/3, 2/3 wear and tear picked up by SMART, and mechanical failure undetectable by smart. Can't remember which stat was which.

>
> Thoughts. Replace as soon as drive arrives or wait and see?

If you get a couple of errors, then no more for months, the drive is probably fine. If you get new errors every time you test, ditch it ASAP.

Either way, make sure it's backed up!

Cheers,
Wol

^ permalink raw reply [flat|nested] 57+ messages in thread
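The rule of thumb in the reply above (a couple of errors that then stay put are tolerable, new errors on every test are not) could be written down roughly as follows. This is only a sketch: the snapshot format (a dict of attribute ID to raw count per test run), the watched IDs, the two-run streak and the function names are all assumptions made for illustration.

```python
def new_errors(previous, current, watched=(5, 197, 198)):
    """Return {attribute_id: increase} for watched counters that grew."""
    grown = {}
    for attr_id in watched:
        delta = current.get(attr_id, 0) - previous.get(attr_id, 0)
        if delta > 0:
            grown[attr_id] = delta
    return grown


def verdict(history, streak=2):
    """Classify a drive from raw-count snapshots, oldest first.

    "replace": counters grew in each of the last `streak` comparisons.
    "watch":   counters grew in the most recent comparison only.
    "stable":  no growth in the most recent comparison.
    """
    growths = [bool(new_errors(a, b)) for a, b in zip(history, history[1:])]
    if len(growths) >= streak and all(growths[-streak:]):
        return "replace"
    if growths and growths[-1]:
        return "watch"
    return "stable"


if __name__ == "__main__":
    # Reallocated count jumps to 4 once, then holds steady on the next test.
    print(verdict([{5: 0}, {5: 4}, {5: 4}]))
```

A pending sector that gets reallocated shows up as attribute 197 dropping and attribute 5 rising, so the sketch counts that as growth for that one comparison.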
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12  8:05 ` Wols Lists
@ 2022-04-12 13:01   ` Dale
  0 siblings, 0 replies; 57+ messages in thread
From: Dale @ 2022-04-12 13:01 UTC (permalink / raw)
  To: gentoo-user

Wols Lists wrote:
> On 12/04/2022 02:27, Dale wrote:
>> The one I aborted was because it was stuck on 10% for well over a day.
>> The whole test doesn't take that long, or shouldn't anyway. I restarted
>> it shortly after that. I might add, the test did take many hours longer
>> than it estimated which from my past experience is quite odd. It's
>> usually pretty accurate. Still, it completed and shows it passed, just
>> has a boo boo on it. I also did a file system check it fixed a couple
>> problems and a bunch of little things I see corrected often on bootup.
>> Something about length of something. Seems trivial.
>
> Given that the firmware SOMETIMES gets its knickers in a twist,
> especially consumer drives (not sure what yours are?), and read errors
> are a dime a dozen, I wouldn't worry that much about ONE error.
>
> Do another SMART test after your next reboot. Any NEW errors will be a
> red flag, but just this one again? Don't worry.

That seems to be what my google searches revealed. After all, nothing is perfect. I'm sometimes surprised that drives aren't shipped with a couple of these. I'll keep my backups up to date as usual tho. ;-)

>>
>> Given the low number and it showing it corrected that error, and then
>> passed a short and long test, is this drive "safe enough" to keep in
>> service? I have backups just in case but just curious what others know
>> from experience. At least this isn't one of those nasty messages that
>> the drive will die within 24 hours. I got one of those ages ago and it
>> didn't miss it by much. A little over 30 hours or so later, it was a
>> door stop. It would spin but it couldn't even be seen by the BIOS.
>> Maybe drives are getting better and SMART is getting better as well.
>
> SMART is a lot better than it was, but remember, it only picks up wear
> and tear. Mechanical failure is just as deadly, and usually strikes
> out of the blue. I saw some stats somewhere it's something like 1/3,
> 2/3 wear and tear picked up by SMART, and mechanical failure
> undetectable by smart. Can't remember which stat was which.

My understanding is that SMART detects media problems and sometimes even when a electronic component is getting out of spec. However, it is unlikely to detect that the spindle motor or the mechanism that moves the heads is about to go out. It can detect some things but not everything. From my understanding, it is mostly about monitoring the magnetic media itself. It is however, better than nothing at all.

>>
>> Thoughts. Replace as soon as drive arrives or wait and see?
>
> If you get a couple of errors, then no more for months, the drive is
> probably fine. If you get new errors every time you test, ditch it ASAP.
>
> Either way, make sure it's backed up!
>
> Cheers,
> Wol
>
>

Sounds like a plan. Drive should be here Friday. I'll keep a eye on it. It's down to 10% on long selftest and no errors reported yet. I'll keep the drive unmounted until Friday tho, just in case.

Thanks for the opinions.

Dale

:-) :-)

^ permalink raw reply [flat|nested] 57+ messages in thread
* RE: [gentoo-user] Hard drive error from SMART
  2022-04-12  1:27 [gentoo-user] Hard drive error from SMART Dale
  2022-04-12  8:05 ` Wols Lists
@ 2022-04-12 14:20 ` Laurence Perkins
  2022-04-12 14:57 ` Rich Freeman
  2022-04-15 15:49 ` Dale
  3 siblings, 0 replies; 57+ messages in thread
From: Laurence Perkins @ 2022-04-12 14:20 UTC (permalink / raw)
  To: gentoo-user@lists.gentoo.org

> -----Original Message-----
> From: Dale <rdalek1967@gmail.com>
> Sent: Monday, April 11, 2022 6:28 PM
> To: gentoo-user@lists.gentoo.org
> Subject: [gentoo-user] Hard drive error from SMART
>
> Given the low number and it showing it corrected that error, and then passed a short and long test, is this drive "safe enough" to keep in service? I have backups just in case but just curious what others know from experience. At least this isn't one of those nasty messages that the drive will die within 24 hours. I got one of those ages ago and it didn't miss it by much. A little over 30 hours or so later, it was a door stop. It would spin but it couldn't even be seen by the BIOS.
> Maybe drives are getting better and SMART is getting better as well.
>
> Thoughts. Replace as soon as drive arrives or wait and see?
>
> Dale
>
> :-) :-)
>

When it's just one or two errors like that and they don't keep going up I tend to treat it as an isolated incident, but the drive still goes into the pool I use with RAID just in case. Preferably a setup where you can lose more than one disk without losing the data.

Note that, depending on where the bad sector is, when it gets remapped the extra seek necessary to read that logical address could slow the drive down substantially. Make sure your filesystem's root inode or something doesn't end up on top of it.

Sometimes I miss the old drives where all this was handled by the OS and so you knew exactly what sector was bad and your filesystem could be told to just not use it. Made scanning for bad sectors more annoying, but deciding how bad the drive was rather easier.

LMP

^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12  1:27 [gentoo-user] Hard drive error from SMART Dale
  2022-04-12  8:05 ` Wols Lists
  2022-04-12 14:20 ` Laurence Perkins
@ 2022-04-12 14:57 ` Rich Freeman
  2022-04-12 17:08   ` Dale
  2022-04-15 15:49 ` Dale
  3 siblings, 1 reply; 57+ messages in thread
From: Rich Freeman @ 2022-04-12 14:57 UTC (permalink / raw)
  To: gentoo-user

On Mon, Apr 11, 2022 at 9:27 PM Dale <rdalek1967@gmail.com> wrote:
>
> Thoughts. Replace as soon as drive arrives or wait and see?
>

So, first of all just about all my hard drives are in a RAID at this point, so I have a higher tolerance for issues.

If a drive is under warranty I'll usually try to see if they will RMA it. More often than not they will, and in that case there is really no reason not to. I'll do advance shipping and replace the drive before sending the old one back so that I mostly have redundancy the whole time.

If it isn't under warranty then I'll scrub it and see what happens. I'll of course do SMART self-tests, but usually an error like this won't actually clear until you overwrite the offline sector so that the drive can reallocate it. A RAID scrub/resilver/etc will overwrite the sector with the correct contents which will allow this to happen. (Otherwise there is no way for the drive to recover - if it knew what was stored there it wouldn't have an error in the first place.)

If an error comes back then I'll replace the drive. My drives are pretty large at this point so I don't like keeping unreliable drives around. It just increases the risk of double failures, given that a large hard drive can take more than a day to replace. Write speeds just don't keep pace with capacities. I do have offline backups but I shudder at the thought of how long one of those would take to restore.

--
Rich

^ permalink raw reply [flat|nested] 57+ messages in thread
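The remark above that write speeds have not kept pace with capacities can be put into rough numbers. The sketch below only does the arithmetic for writing a whole drive once at a constant rate. The 16 TB and 4 TB capacities and the 150 MB/s sustained rate are assumed figures for illustration, not values from the thread, and a real rebuild on a busy array will usually be slower.

```python
def hours_to_fill(capacity_tb, write_mb_per_s):
    """Hours needed to write a whole drive once at a sustained rate."""
    capacity_bytes = capacity_tb * 1e12            # decimal terabytes, as drives are sold
    seconds = capacity_bytes / (write_mb_per_s * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    # Assumed example drives, both written at an assumed 150 MB/s.
    for size_tb in (4, 16):
        print(f"{size_tb} TB at 150 MB/s: {hours_to_fill(size_tb, 150):.1f} hours")
```

With those assumed figures the larger drive already needs more than a day for a single full pass, which is the window in which a second failure would hurt.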
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12 14:57 ` Rich Freeman
@ 2022-04-12 17:08   ` Dale
  2022-04-12 17:21     ` Laurence Perkins
                       ` (2 more replies)
  0 siblings, 3 replies; 57+ messages in thread
From: Dale @ 2022-04-12 17:08 UTC (permalink / raw)
  To: gentoo-user

Rich Freeman wrote:
> On Mon, Apr 11, 2022 at 9:27 PM Dale <rdalek1967@gmail.com> wrote:
>> Thoughts. Replace as soon as drive arrives or wait and see?
>>
> So, first of all just about all my hard drives are in a RAID at this
> point, so I have a higher tolerance for issues.
>
> If a drive is under warranty I'll usually try to see if they will RMA
> it. More often than not they will, and in that case there is really
> no reason not to. I'll do advance shipping and replace the drive
> before sending the old one back so that I mostly have redundancy the
> whole time.
>
> If it isn't under warranty then I'll scrub it and see what happens.
> I'll of course do SMART self-tests, but usually an error like this
> won't actually clear until you overwrite the offline sector so that
> the drive can reallocate it. A RAID scrub/resilver/etc will overwrite
> the sector with the correct contents which will allow this to happen.
> (Otherwise there is no way for the drive to recover - if it knew what
> was stored there it wouldn't have an error in the first place.)
>
> If an error comes back then I'll replace the drive. My drives are
> pretty large at this point so I don't like keeping unreliable drives
> around. It just increases the risk of double failures, given that a
> large hard drive can take more than a day to replace. Write speeds
> just don't keep pace with capacities. I do have offline backups but I
> shudder at the thought of how long one of those would take to restore.
>

Sadly, I don't have RAID here but to be honest, I really need to have it given the data and my recent luck with hard drives. Drives used to get dumped because they were just to small to use anymore. Nowadays, they seem to break in some fashion long before their usefulness ends their lives.

I remounted the drives and did a backup. For anyone running up on this, just in case one of the files got corrupted, I used a little trick to see if I can figure out which one may be bad if any. I took my rsync commands from my little script and ran them one at a time with --dry-run added. If a file was to be updated on the backup that I hadn't changed or added, I was going to check into it before updating my backups. It could be that the backup file was still good and the file on my drive reporting problems was bad. In that case, I would determine which was good and either restore it from backups or allow it to be updated if needed. Either way, I should have a good file since the drive claims to have fixed the problem. Now let us pray. :-D

Drive isn't under warranty. I may have to start buying new drives from dealers. Sometimes I find drives that are pulled from systems and have very few hours on them. Still, warranty may not last long. Saves a lot of money tho.

USPS claims drive is on the way. Left a distribution point and should update again when it gets close. First said Saturday, then said Friday. I think Friday is about right but if the wind blows right, maybe Thursday.

I hope I have another port and power cable plug for the swap out. At least now, I can unmount it and swap without a lot of rebooting. Since it's on LVM, that part is easy. Regretfully I have experience on that process. :/

Thanks to all.

Dale

:-) :-)

^ permalink raw reply [flat|nested] 57+ messages in thread
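A related check to the --dry-run trick described above can be sketched in Python. By default rsync decides what to transfer from file size and modification time, so a file whose contents were damaged without either of those changing would not show up in a plain dry run (rsync's --checksum option exists for that case). The sketch below looks for exactly those files by hashing both copies. The function names and the idea of passing the live tree and the backup tree as two directory roots are assumptions for illustration, not Dale's actual script.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def suspicious_files(source_root, backup_root):
    """Yield relative paths whose size and mtime match the backup copy
    but whose contents differ, which suggests silent corruption on one side."""
    for dirpath, _dirs, files in os.walk(source_root):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(backup_root, rel)
            if not os.path.isfile(dst):
                continue  # not in the backup yet, nothing to compare against
            s, d = os.stat(src), os.stat(dst)
            same_meta = s.st_size == d.st_size and int(s.st_mtime) == int(d.st_mtime)
            if same_meta and file_digest(src) != file_digest(dst):
                yield rel
```

The sketch only reports candidates. Deciding which of the two copies is the good one still has to be done by hand, as described in the message.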
* RE: [gentoo-user] Hard drive error from SMART 2022-04-12 17:08 ` Dale @ 2022-04-12 17:21 ` Laurence Perkins 2022-04-12 18:22 ` Dale 2022-04-12 19:17 ` Wols Lists 2022-04-12 17:39 ` Frank Steinmetzger 2022-04-12 19:27 ` Rich Freeman 2 siblings, 2 replies; 57+ messages in thread From: Laurence Perkins @ 2022-04-12 17:21 UTC (permalink / raw To: gentoo-user@lists.gentoo.org > -----Original Message----- > From: Dale <rdalek1967@gmail.com> > Sent: Tuesday, April 12, 2022 10:08 AM > To: gentoo-user@lists.gentoo.org > Subject: Re: [gentoo-user] Hard drive error from SMART > > Rich Freeman wrote: > > On Mon, Apr 11, 2022 at 9:27 PM Dale <rdalek1967@gmail.com> wrote: > >> Thoughts. Replace as soon as drive arrives or wait and see? > >> > > So, first of all just about all my hard drives are in a RAID at this > > point, so I have a higher tolerance for issues. > > > > If a drive is under warranty I'll usually try to see if they will RMA > > it. More often than not they will, and in that case there is really > > no reason not to. I'll do advance shipping and replace the drive > > before sending the old one back so that I mostly have redundancy the > > whole time. > > > > If it isn't under warranty then I'll scrub it and see what happens. > > I'll of course do SMART self-tests, but usually an error like this > > won't actually clear until you overwrite the offline sector so that > > the drive can reallocate it. A RAID scrub/resilver/etc will overwrite > > the sector with the correct contents which will allow this to happen. > > (Otherwise there is no way for the drive to recover - if it knew what > > was stored there it wouldn't have an error in the first place.) > > > > If an error comes back then I'll replace the drive. My drives are > > pretty large at this point so I don't like keeping unreliable drives > > around. It just increases the risk of double failures, given that a > > large hard drive can take more than a day to replace. 
Write speeds > > just don't keep pace with capacities. I do have offline backups but I > > shudder at the thought of how long one of those would take to restore. > > > > > Sadly, I don't have RAID here but to be honest, I really need to have it given the data and my recent luck with hard drives. Drives used to get dumped because they were just to small to use anymore. Nowadays, they seem to break in some fashion long before their usefulness ends their lives. > > I remounted the drives and did a backup. For anyone running up on this, just in case one of the files got corrupted, I used a little trick to see if I can figure out which one may be bad if any. I took my rsync commands from my little script and ran them one at a time with --dry-run added. If a file was to be updated on the backup that I hadn't changed or added, I was going to check into it before updating my backups. It could be that the backup file was still good and the file on my drive reporting problems was bad. In that case, I would determine which was good and either restore it from backups or allow it to be updated if needed. Either way, I should have a good file since the drive claims to have fixed the problem. Now let us pray. :-D > > Drive isn't under warranty. I may have to start buying new drives from dealers. Sometimes I find drives that are pulled from systems and have very few hours on them. Still, warranty may not last long. Saves a lot of money tho. > > USPS claims drive is on the way. Left a distribution point and should update again when it gets close. First said Saturday, then said Friday. I think Friday is about right but if the wind blows right, maybe Thursday. > > I hope I have another port and power cable plug for the swap out. At least now, I can unmount it and swap without a lot of rebooting. Since it's on LVM, that part is easy. Regretfully I have experience on that process. :/ > > Thanks to all. 
>
> Dale
>
> :-) :-)
>
>

You can get up to 16X SATA PCI-e cards these days for pretty cheap. So as long as you have the power to run another drive or two there's not much reason not to do RAID on the important stuff. Also, the SATA protocol allows for port expanders, which are also pretty cheap.

One of my favorite things about BTRFS is the data checksums. If the drive returns garbage, it turns into a read error. Also, if you can't do real RAID, but have excess space you can tell it to keep two copies of everything. Doesn't help with total drive failure, but does protect against the occasional failed sector. If you don't mind writes taking twice as long anyway.

LMP

^ permalink raw reply [flat|nested] 57+ messages in thread
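The two ideas in the message above, checksums that turn garbage into a read error and a second copy that can cover for a bad sector, can be illustrated with a toy in-memory store. This is not how BTRFS is implemented. The class name, the dict-based "disk" and the use of CRC32 from zlib are stand-ins chosen only to keep the sketch short.

```python
import zlib


class ChecksummedStore:
    """Toy block store keeping two copies of each block plus one checksum."""

    def __init__(self):
        self.copies = [{}, {}]   # two independent copies of every block
        self.sums = {}           # checksum recorded at write time

    def write(self, key, data):
        self.sums[key] = zlib.crc32(data)
        for copy in self.copies:
            copy[key] = data     # every write lands twice

    def read(self, key):
        # Return the first copy whose checksum still matches.
        for copy in self.copies:
            data = copy[key]
            if zlib.crc32(data) == self.sums[key]:
                return data
        # Every copy is damaged: report an error instead of returning garbage.
        raise IOError(f"checksum mismatch on every copy of {key!r}")


if __name__ == "__main__":
    store = ChecksummedStore()
    store.write("block0", b"important data")
    store.copies[0]["block0"] = b"imp0rtant data"   # simulate one bad sector
    print(store.read("block0"))                      # served from the second copy
```

Both copies live in the same object here, which mirrors the point made above: a second copy on the same drive covers a bad sector but not the loss of the whole drive.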
* Re: [gentoo-user] Hard drive error from SMART 2022-04-12 17:21 ` Laurence Perkins @ 2022-04-12 18:22 ` Dale 2022-04-12 19:41 ` Laurence Perkins 2022-04-12 19:17 ` Wols Lists 1 sibling, 1 reply; 57+ messages in thread From: Dale @ 2022-04-12 18:22 UTC (permalink / raw To: gentoo-user Laurence Perkins wrote: >> -----Original Message----- >> From: Dale <rdalek1967@gmail.com> >> Sent: Tuesday, April 12, 2022 10:08 AM >> To: gentoo-user@lists.gentoo.org >> Subject: Re: [gentoo-user] Hard drive error from SMART >> >> Rich Freeman wrote: >>> On Mon, Apr 11, 2022 at 9:27 PM Dale <rdalek1967@gmail.com> wrote: >>>> Thoughts. Replace as soon as drive arrives or wait and see? >>>> >>> So, first of all just about all my hard drives are in a RAID at this >>> point, so I have a higher tolerance for issues. >>> >>> If a drive is under warranty I'll usually try to see if they will RMA >>> it. More often than not they will, and in that case there is really >>> no reason not to. I'll do advance shipping and replace the drive >>> before sending the old one back so that I mostly have redundancy the >>> whole time. >>> >>> If it isn't under warranty then I'll scrub it and see what happens. >>> I'll of course do SMART self-tests, but usually an error like this >>> won't actually clear until you overwrite the offline sector so that >>> the drive can reallocate it. A RAID scrub/resilver/etc will overwrite >>> the sector with the correct contents which will allow this to happen. >>> (Otherwise there is no way for the drive to recover - if it knew what >>> was stored there it wouldn't have an error in the first place.) >>> >>> If an error comes back then I'll replace the drive. My drives are >>> pretty large at this point so I don't like keeping unreliable drives >>> around. It just increases the risk of double failures, given that a >>> large hard drive can take more than a day to replace. Write speeds >>> just don't keep pace with capacities. 
I do have offline backups but I >>> shudder at the thought of how long one of those would take to restore. >>> >> >> Sadly, I don't have RAID here but to be honest, I really need to have it given the data and my recent luck with hard drives. Drives used to get dumped because they were just to small to use anymore. Nowadays, they seem to break in some fashion long before their usefulness ends their lives. >> >> I remounted the drives and did a backup. For anyone running up on this, just in case one of the files got corrupted, I used a little trick to see if I can figure out which one may be bad if any. I took my rsync commands from my little script and ran them one at a time with --dry-run added. If a file was to be updated on the backup that I hadn't changed or added, I was going to check into it before updating my backups. It could be that the backup file was still good and the file on my drive reporting problems was bad. In that case, I would determine which was good and either restore it from backups or allow it to be updated if needed. Either way, I should have a good file since the drive claims to have fixed the problem. Now let us pray. :-D >> >> Drive isn't under warranty. I may have to start buying new drives from dealers. Sometimes I find drives that are pulled from systems and have very few hours on them. Still, warranty may not last long. Saves a lot of money tho. >> >> USPS claims drive is on the way. Left a distribution point and should update again when it gets close. First said Saturday, then said Friday. I think Friday is about right but if the wind blows right, maybe Thursday. >> >> I hope I have another port and power cable plug for the swap out. At least now, I can unmount it and swap without a lot of rebooting. Since it's on LVM, that part is easy. Regretfully I have experience on that process. :/ >> >> Thanks to all. >> >> Dale >> >> :-) :-) >> >> > You can get up to 16X SATA PCI-e cards these days for pretty cheap. 
So as long as you have the power to run another drive or two there's not much reason not to do RAID on the important stuff. Also, the SATA protocol allows for port expanders, which are also pretty cheap.
>
> One of my favorite things about BTRFS is the data checksums. If the drive returns garbage, it turns into a read error. Also, if you can't do real RAID, but have excess space you can tell it to keep two copies of everything. Doesn't help with total drive failure, but does protect against the occasional failed sector. If you don't mind writes taking twice as long anyway.
>
> LMP

I looked into a card a good while back and they were pretty pricey at the time. You happen to have some search terms I can search for on ebay, Amazon etc? I know some chipsets work better on Linux out of the box. I don't need to buy one that doesn't work or only works with the threat of a sledge hammer. lol I've also looked into that other thing, SAS? or something. It's been a while tho.

I'm pretty good at doing backups. I do Gentoo updates on Saturday, and sometimes Sunday. While the updates are downloading, I update my backups. It's almost like a religion for me. I was just more cautious earlier. I suspect a file could be corrupted somewhere but wanted to be sure it wasn't something important. I have some files that if lost, I may not can download again. They don't exist. A few I got from some Govt archive that are really old but since removed, or at least I can't find them anymore.

I've given serious thought to switching to BTRFS. Thing is, I'm still trying to get LVM figured out. Plus, LVM is well maintained and should be for a good long while, plus it works for me. Still, if I could afford to have several new drives all at once, I'd certainly play with it. It could very well be better. The one thing I wish, LVM had a GUI where you could do everything from it. During my recent rearrangement of drives, I learned that you can't do a lot of things within webmin. It does some things but not everything. Plus, you have to have a running GUI to use it. In that case, I had to unmount /home which meant no KDE, so no Webmin either. Still, that could cause trouble too. I dunno.

Thanks.

Dale

:-) :-)

^ permalink raw reply [flat|nested] 57+ messages in thread
* RE: [gentoo-user] Hard drive error from SMART 2022-04-12 18:22 ` Dale @ 2022-04-12 19:41 ` Laurence Perkins 2022-04-12 21:50 ` Wol 2022-04-12 21:59 ` Dale 0 siblings, 2 replies; 57+ messages in thread From: Laurence Perkins @ 2022-04-12 19:41 UTC (permalink / raw To: gentoo-user@lists.gentoo.org >-----Original Message----- >From: Dale <rdalek1967@gmail.com> >Sent: Tuesday, April 12, 2022 11:22 AM >To: gentoo-user@lists.gentoo.org >Subject: Re: [gentoo-user] Hard drive error from SMART > >Laurence Perkins wrote: >>> -----Original Message----- >>> From: Dale <rdalek1967@gmail.com> >>> Sent: Tuesday, April 12, 2022 10:08 AM >>> To: gentoo-user@lists.gentoo.org >>> Subject: Re: [gentoo-user] Hard drive error from SMART >>> >>> Rich Freeman wrote: >>>> On Mon, Apr 11, 2022 at 9:27 PM Dale <rdalek1967@gmail.com> wrote: >>>>> Thoughts. Replace as soon as drive arrives or wait and see? >>>>> >>>> So, first of all just about all my hard drives are in a RAID at this >>>> point, so I have a higher tolerance for issues. >>>> >>>> If a drive is under warranty I'll usually try to see if they will >>>> RMA it. More often than not they will, and in that case there is >>>> really no reason not to. I'll do advance shipping and replace the >>>> drive before sending the old one back so that I mostly have >>>> redundancy the whole time. >>>> >>>> If it isn't under warranty then I'll scrub it and see what happens. >>>> I'll of course do SMART self-tests, but usually an error like this >>>> won't actually clear until you overwrite the offline sector so that >>>> the drive can reallocate it. A RAID scrub/resilver/etc will >>>> overwrite the sector with the correct contents which will allow this to happen. >>>> (Otherwise there is no way for the drive to recover - if it knew >>>> what was stored there it wouldn't have an error in the first place.) >>>> >>>> If an error comes back then I'll replace the drive. 
My drives are >>>> pretty large at this point so I don't like keeping unreliable drives >>>> around. It just increases the risk of double failures, given that a >>>> large hard drive can take more than a day to replace. Write speeds >>>> just don't keep pace with capacities. I do have offline backups but >>>> I shudder at the thought of how long one of those would take to restore. >>>> >>> >>> Sadly, I don't have RAID here but to be honest, I really need to have it given the data and my recent luck with hard drives. Drives used to get dumped because they were just to small to use anymore. Nowadays, they seem to break in some fashion long before their usefulness ends their lives. >>> >>> I remounted the drives and did a backup. For anyone running up on >>> this, just in case one of the files got corrupted, I used a little >>> trick to see if I can figure out which one may be bad if any. I took >>> my rsync commands from my little script and ran them one at a time >>> with --dry-run added. If a file was to be updated on the backup that >>> I hadn't changed or added, I was going to check into it before >>> updating my backups. It could be that the backup file was still good >>> and the file on my drive reporting problems was bad. In that case, I >>> would determine which was good and either restore it from backups or >>> allow it to be updated if needed. Either way, I should have a good >>> file since the drive claims to have fixed the problem. Now let us >>> pray. :-D >>> >>> Drive isn't under warranty. I may have to start buying new drives from dealers. Sometimes I find drives that are pulled from systems and have very few hours on them. Still, warranty may not last long. Saves a lot of money tho. >>> >>> USPS claims drive is on the way. Left a distribution point and should update again when it gets close. First said Saturday, then said Friday. I think Friday is about right but if the wind blows right, maybe Thursday. 
>>> >>> I hope I have another port and power cable plug for the swap out. At >>> least now, I can unmount it and swap without a lot of rebooting. >>> Since it's on LVM, that part is easy. Regretfully I have experience >>> on that process. :/ >>> >>> Thanks to all. >>> >>> Dale >>> >>> :-) :-) >>> >>> >> You can get up to 16X SATA PCI-e cards these days for pretty cheap. So as long as you have the power to run another drive or two there's not much reason not to do RAID on the important stuff. Also, the SATA protocol allows for port expanders, which are also pretty cheap. >> >> One of my favorite things about BTRFS is the data checksums. If the drive returns garbage, it turns into a read error. Also, if you can't do real RAID, but have excess space you can tell it to keep two copies of everything. Doesn't help with total drive failure, but does protect against the occasional failed sector. If you don't mind writes taking twice as long anyway. >> >> LMP > > >I looked into a card a good while back and they were pretty pricey at the time. You happen to have some search terms I can search for on ebay, Amazon etc? I know some chipsets work better on Linux out of the box. I don't need to buy one that doesn't work or only works with the threat of a sledge hammer. lol I've also looked into that other thing, SAS? or something. It's been a while tho. > >I'm pretty good at doing backups. I do Gentoo updates on Saturday, and sometimes Sunday. While the updates are downloading, I update my backups. It's almost like a religion for me. I was just more cautious earlier. I suspect a file could be corrupted somewhere but wanted to be sure it wasn't something important. I have some files that if lost, I may not can download again. They don't exist. A few I got from some Govt archive that are really old but since removed, or at least I can't find them anymore. > >I've given serious thought to switching to BTRFS. Thing is, I'm still trying to get LVM figured out. 
Plus, LVM is well maintained and should be for a good long while, plus it works for me. Still, if I could afford to have several new drives all at once, I'd certainly play with it. It could very well be better. The one thing I wish, LVM had a GUI where you could do everything from it. During my recent rearrangement of drives, I learned that you can't do a lot of things within webmin. It does some things but not everything. Plus, you have to have a running GUI to use it. In that case, I had to unmount /home which meant no KDE, so no Webmin either. Still, that could cause trouble too. I dunno. > >Thanks. > >Dale > >:-) :-) > > I went with a couple of https://www.amazon.com/MZHOU-Profile-Bracket-Support-Converter/dp/B08L7W8QFT/ in a couple different sizes for two of my mass storage systems and they seem to be doing OK. The difference between the cheap vendors and the expensive vendors these days tends to be quality control. So plug it in, load it up, run it hard for a few hours. If it doesn't die relatively quickly you're usually good. Especially if you have RAID with checksums it's difficult for a controller to mangle things too badly even if it does have an issue. Remember: Data does not exist if it doesn't exist in at least three places. So you still want off-site backups in case your house burns down. Especially for irreplaceable things. If you have friends who also want off-site backups and you leave your machines running all the time then tahoe-lafs is pretty decent. For that matter they don't even have to really be friends, you really only have to be able to trust them to not selfishly hog all the space. I use BTRFS RAID1 for a lot of stuff. So far it's been pretty good at catching dropped bits and recovering from failures. It has a bit of the RAID issue where a drive could fail while you're doing a recovery since it only guarantees integrity with one dud drive regardless of the number of drives in the pool. 
But since each chunk is only written to two drives instead of spread across all of them the rebuild time stays relatively short and even if another drive does fail you'll only lose some of the data instead of all of it. This also means that the wasted space when your drives aren't all the same size is kept to a minimum. ZFS and similar are arguably better for larger arrays, but are also more hassle to set up. LVM is good for being able to swap out drives easily but with the modern, huge drives you really want data checksums if you can get them. Otherwise all it takes is a flipped bit somewhere to wreck your data and drive firmware doesn't always notice. I think you can do that with LVM, but I've never looked into it for certain. LMP ^ permalink raw reply [flat|nested] 57+ messages in thread
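The point about mixed drive sizes can be checked with a little arithmetic. Below is a minimal sketch, assuming only what the message says: every chunk is stored on exactly two different drives. The function name and the example sizes are made up for illustration, and a real filesystem loses a bit more to metadata and chunk granularity.

```python
def raid1_usable(sizes):
    """Upper bound on usable space when every chunk is kept on two different drives.

    `sizes` is a list of drive capacities in any one unit (e.g. TB).
    """
    total = sum(sizes)
    largest = max(sizes)
    # Two copies of everything means at most half the raw space is usable.
    # In addition, every chunk on the largest drive needs a partner chunk on
    # some other drive, so the remaining drives cap what the largest can hold.
    return min(total / 2, total - largest)


if __name__ == "__main__":
    for pool in ([4, 4, 8], [2, 2, 8], [6, 6]):
        print(pool, "->", raid1_usable(pool))
```

With 4 + 4 + 8 the whole pool can be paired up and half the raw space is usable; with 2 + 2 + 8 the big drive runs out of partners and only 4 is usable.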
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12 19:41 ` Laurence Perkins
@ 2022-04-12 21:50 ` Wol
  2022-04-12 22:05 ` Laurence Perkins
  2022-04-12 22:16 ` Julien Roy
  2022-04-12 21:59 ` Dale
  1 sibling, 2 replies; 57+ messages in thread
From: Wol @ 2022-04-12 21:50 UTC
To: gentoo-user

On 12/04/2022 20:41, Laurence Perkins wrote:
> LVM is good for being able to swap out drives easily but with the modern, huge drives you really want data checksums if you can get them. Otherwise all it takes is a flipped bit somewhere to wreck your data and drive firmware doesn't always notice. I think you can do that with LVM, but I've never looked into it for certain.

Look at that link for my system that I posted. I use dm-integrity, so a flipped bit will trigger a failure at the raid-5 level and recover.

For those people looking at btrfs - note that parity-raid (5 or 6) is not a wise idea at the moment so you don't get two-failure protection ...

Cheers,
Wol
* RE: [gentoo-user] Hard drive error from SMART
  2022-04-12 21:50 ` Wol
@ 2022-04-12 22:05 ` Laurence Perkins
  2022-04-12 22:16 ` Julien Roy
  1 sibling, 0 replies; 57+ messages in thread
From: Laurence Perkins @ 2022-04-12 22:05 UTC
To: gentoo-user@lists.gentoo.org

>-----Original Message-----
>From: Wol <antlists@youngman.org.uk>
>Sent: Tuesday, April 12, 2022 2:51 PM
>To: gentoo-user@lists.gentoo.org
>Subject: Re: [gentoo-user] Hard drive error from SMART
>
>On 12/04/2022 20:41, Laurence Perkins wrote:
>> LVM is good for being able to swap out drives easily but with the modern, huge drives you really want data checksums if you can get them. Otherwise all it takes is a flipped bit somewhere to wreck your data and drive firmware doesn't always notice. I think you can do that with LVM, but I've never looked into it for certain.
>
>Look at that link for my system that I posted. I use dm-integrity, so a flipped bit will trigger a failure at the raid-5 level and recover.
>
>For those people looking at btrfs - note that parity-raid (5 or 6) is not a wise idea at the moment so you don't get two-failure protection ...

Specifically, if the system crashes or has a power failure there may be some data left hanging until it can complete a scrub. Disk failures during that period may lose some of said data. How much of a risk that is depends on the stability of your power and kernel and how much data turnover you have.

I only use it on systems with UPS power and additional backups. It needs careful monitoring of the drives too, since system crashes due to drive failures can leave you in rather a sticky mess.

>
>Cheers,
>Wol
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12 21:50 ` Wol
  2022-04-12 22:05 ` Laurence Perkins
@ 2022-04-12 22:16 ` Julien Roy
  1 sibling, 0 replies; 57+ messages in thread
From: Julien Roy @ 2022-04-12 22:16 UTC
To: Gentoo User

> For those people looking at btrfs - note that parity-raid (5 or 6) is not a wise idea at the moment so you don't get two-failure protection ...
>
> Cheers,
> Wol
>

I've been reading that this is less and less true. The write-hole issue is rather old now (first reported around 2016, I think?). From what I read from various sources, the developers have made some progress and the problem is getting harder and harder to reproduce; see, for instance, [1]. Some people still recommend using RAID1 for the metadata and RAID5/6 for the data, just in case.

Julien

[1] https://unixsheikh.com/articles/battle-testing-zfs-btrfs-and-mdadm-dm.html#btrfs-raid-5
* Re: [gentoo-user] Hard drive error from SMART 2022-04-12 19:41 ` Laurence Perkins 2022-04-12 21:50 ` Wol @ 2022-04-12 21:59 ` Dale 1 sibling, 0 replies; 57+ messages in thread From: Dale @ 2022-04-12 21:59 UTC (permalink / raw To: gentoo-user Laurence Perkins wrote: > I went with a couple of https://www.amazon.com/MZHOU-Profile-Bracket-Support-Converter/dp/B08L7W8QFT/ in a couple different sizes for two of my mass storage systems and they seem to be doing OK. > > The difference between the cheap vendors and the expensive vendors these days tends to be quality control. So plug it in, load it up, run it hard for a few hours. If it doesn't die relatively quickly you're usually good. > > Especially if you have RAID with checksums it's difficult for a controller to mangle things too badly even if it does have an issue. > > Remember: Data does not exist if it doesn't exist in at least three places. So you still want off-site backups in case your house burns down. Especially for irreplaceable things. > > If you have friends who also want off-site backups and you leave your machines running all the time then tahoe-lafs is pretty decent. For that matter they don't even have to really be friends, you really only have to be able to trust them to not selfishly hog all the space. > > I use BTRFS RAID1 for a lot of stuff. So far it's been pretty good at catching dropped bits and recovering from failures. It has a bit of the RAID issue where a drive could fail while you're doing a recovery since it only guarantees integrity with one dud drive regardless of the number of drives in the pool. But since each chunk is only written to two drives instead of spread across all of them the rebuild time stays relatively short and even if another drive does fail you'll only lose some of the data instead of all of it. This also means that the wasted space when your drives aren't all the same size is kept to a minimum. 
>
> ZFS and similar are arguably better for larger arrays, but are also more hassle to set up.
>
> LVM is good for being able to swap out drives easily but with the modern, huge drives you really want data checksums if you can get them. Otherwise all it takes is a flipped bit somewhere to wreck your data and drive firmware doesn't always notice. I think you can do that with LVM, but I've never looked into it for certain.
>
> LMP

I looked at that card and read some of the reviews. Some claim they had issues but I suspect a driver problem. Can you do a lspci -k and see what driver it uses for that card on your system? If yours works fine, I'd want to use the same driver.

That is a lot of drives tho. I need to build a NAS thingy. lol

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12 17:21 ` Laurence Perkins
  2022-04-12 18:22 ` Dale
@ 2022-04-12 19:17 ` Wols Lists
  2022-04-12 22:00 ` Dale
  1 sibling, 1 reply; 57+ messages in thread
From: Wols Lists @ 2022-04-12 19:17 UTC
To: gentoo-user

On 12/04/2022 18:21, Laurence Perkins wrote:
> You can get up to 16X SATA PCI-e cards these days for pretty cheap. So as long as you have the power to run another drive or two there's not much reason not to do RAID on the important stuff. Also, the SATA protocol allows for port expanders, which are also pretty cheap.
>
> One of my favorite things about BTRFS is the data checksums. If the drive returns garbage, it turns into a read error. Also, if you can't do real RAID, but have excess space you can tell it to keep two copies of everything. Doesn't help with total drive failure, but does protect against the occasional failed sector. If you don't mind writes taking twice as long anyway.

https://raid.wiki.kernel.org/index.php/Linux_Raid

https://raid.wiki.kernel.org/index.php/System2020

That system in the second link is the system being used to type this message ...

Cheers,
Wol
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12 19:17 ` Wols Lists
@ 2022-04-12 22:00 ` Dale
  2022-04-12 22:03 ` Mark Knecht
  0 siblings, 1 reply; 57+ messages in thread
From: Dale @ 2022-04-12 22:00 UTC
To: gentoo-user

Wols Lists wrote:
> On 12/04/2022 18:21, Laurence Perkins wrote:
>> You can get up to 16X SATA PCI-e cards these days for pretty cheap.
>> So as long as you have the power to run another drive or two there's
>> not much reason not to do RAID on the important stuff. Also, the
>> SATA protocol allows for port expanders, which are also pretty cheap.
>>
>> One of my favorite things about BTRFS is the data checksums. If the
>> drive returns garbage, it turns into a read error. Also, if you
>> can't do real RAID, but have excess space you can tell it to keep two
>> copies of everything. Doesn't help with total drive failure, but
>> does protect against the occasional failed sector. If you don't mind
>> writes taking twice as long anyway.
>
> https://raid.wiki.kernel.org/index.php/Linux_Raid
>
> https://raid.wiki.kernel.org/index.php/System2020
>
> That system in the second link is the system being used to type this
> message ...
>
> Cheers,
> Wol
>

Neat setup. I need something similar for a NAS setup thingy. Just got way too much going on right now.

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12 22:00 ` Dale
@ 2022-04-12 22:03 ` Mark Knecht
  0 siblings, 0 replies; 57+ messages in thread
From: Mark Knecht @ 2022-04-12 22:03 UTC
To: Gentoo User

On Tue, Apr 12, 2022 at 3:01 PM Dale <rdalek1967@gmail.com> wrote:
<SNIP>
> Neat setup. I need something similar for a NAS setup thingy. Just got
> way too much going on right now.
>
> Dale
>
> :-) :-)
>

LOL. Watching this thread made me start a round of backups to my NAS thingy, Dale. ;-)

Mark
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12 17:08 ` Dale
  2022-04-12 17:21 ` Laurence Perkins
@ 2022-04-12 17:39 ` Frank Steinmetzger
  2022-04-12 18:09 ` Laurence Perkins
  2022-04-12 19:27 ` Rich Freeman
  2 siblings, 1 reply; 57+ messages in thread
From: Frank Steinmetzger @ 2022-04-12 17:39 UTC
To: gentoo-user

On Tue, Apr 12, 2022 at 12:08:24PM -0500, Dale wrote:
> Rich Freeman wrote:
> > On Mon, Apr 11, 2022 at 9:27 PM Dale <rdalek1967@gmail.com> wrote:
> >> Thoughts. Replace as soon as drive arrives or wait and see?
> >>
> > So, first of all just about all my hard drives are in a RAID at this
> > point, so I have a higher tolerance for issues.

> Sadly, I don't have RAID here but to be honest, I really need to have it
> given the data and my recent luck with hard drives.

Plus, if you do a Raid 5 or Raid-Z1, you use your capacity more efficiently with just three drives. However, when I was building my NAS 5½ years ago, there was already an article about Raid-5 becoming obsolete due to the ever rising drive capacity. If you have a failed drive and need to replace and rebuild, the chance that another drive fails during the rebuild rises with the drive capacity.

> Drives used to get dumped because they were just too small to use anymore.
> Nowadays, they seem to break in some fashion long before their usefulness
> ends their lives.

I recently bought a passive mini-pc (zotac zbox) and just for the fun of it installed a 160 GB HDD that maxes out at around 40 MiB/s. You do NOT want to run a modern Linux desktop on such a drive. :D

> I remounted the drives and did a backup. For anyone running up on this,
> just in case one of the files got corrupted, I used a little trick to
> see if I can figure out which one may be bad if any. I took my rsync
> commands from my little script and ran them one at a time with --dry-run
> added.

I actually developed a tool for that.
It creates and checks md5 checksums recursively and *per directory*. Whenever I copy stuff from somewhere, like a music album, I do an immediate md5 run on that directory. And when I later copy that stuff around, I simply run the tool again on the copy (after the FS cache was flushed, for example by unmounting and remounting) to see whether the checksums are still valid. You can find it on github: https://github.com/felf/dh It’s a single-file python application, because I couldn’t be bothered with the myriad ways of creating a python package. ;-) -- Grüße | Greetings | Salut | Qapla’ Please do not share anything from, with or about me on any social network. A horse comes into a bar. Barkeep: “Hey!” Horse: “Sure.” [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 57+ messages in thread
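The per-directory idea described above can be sketched in a few lines. This is a minimal illustration and not the code of the dh tool: the checksum file name, the function names and the line format are assumptions made up for this example.

```python
import hashlib
import os

CHECKSUM_NAME = "Checksums.md5"  # hypothetical file name, chosen for this sketch


def md5_of(path, bufsize=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(bufsize), b""):
            digest.update(block)
    return digest.hexdigest()


def create(root):
    """Write one checksum file into every directory below root that has files."""
    for dirpath, _dirs, files in os.walk(root):
        names = sorted(f for f in files if f != CHECKSUM_NAME)
        if not names:
            continue
        target = os.path.join(dirpath, CHECKSUM_NAME)
        with open(target, "w", encoding="utf-8") as out:
            for name in names:
                out.write(f"{md5_of(os.path.join(dirpath, name))}  {name}\n")


def check(root):
    """Return the paths that are missing or no longer match their recorded hash."""
    bad = []
    for dirpath, _dirs, files in os.walk(root):
        if CHECKSUM_NAME not in files:
            continue
        with open(os.path.join(dirpath, CHECKSUM_NAME), encoding="utf-8") as fh:
            for line in fh:
                recorded, name = line.rstrip("\n").split("  ", 1)
                path = os.path.join(dirpath, name)
                if not os.path.isfile(path) or md5_of(path) != recorded:
                    bad.append(path)
    return bad
```

Because each directory carries its own checksum file, a single album directory can be copied elsewhere and verified there with `check()` without touching any checksum file higher up the tree.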
* RE: [gentoo-user] Hard drive error from SMART
  2022-04-12 17:39 ` Frank Steinmetzger
@ 2022-04-12 18:09 ` Laurence Perkins
  2022-04-12 18:54 ` Frank Steinmetzger
  0 siblings, 1 reply; 57+ messages in thread
From: Laurence Perkins @ 2022-04-12 18:09 UTC
To: gentoo-user@lists.gentoo.org

> -----Original Message-----
> From: Frank Steinmetzger <Warp_7@gmx.de>
> Sent: Tuesday, April 12, 2022 10:39 AM
> To: gentoo-user@lists.gentoo.org
> Subject: Re: [gentoo-user] Hard drive error from SMART
>
>
> I actually developed a tool for that. It creates and checks md5 checksums recursively and *per directory*. Whenever I copy stuff from somewhere, like a music album, I do an immediate md5 run on that directory. And when I later copy that stuff around, I simply run the tool again on the copy (after the FS cache was flushed, for example by unmounting and remounting) to see whether the checksums are still valid.
>
> You can find it on github: https://github.com/felf/dh It’s a single-file python application, because I couldn’t be bothered with the myriad ways of creating a python package. ;-)
>
> --
> Grüße | Greetings | Salut | Qapla’
> Please do not share anything from, with or about me on any social network.
>
> A horse comes into a bar.
> Barkeep: “Hey!”
> Horse: “Sure.”
>

There's also app-crypt/md5deep

It does a number of hashes, is threaded, and has options for piecewise hashing and a matching mode for using the hashes to find duplicates. It also has a number of input and output filters for those cases where you don't want to hash everything. It can also output a number of formats, but reformatting is generally trivial.

LMP
* Re: [gentoo-user] Hard drive error from SMART
  2022-04-12 18:09 ` Laurence Perkins
@ 2022-04-12 18:54 ` Frank Steinmetzger
  0 siblings, 0 replies; 57+ messages in thread
From: Frank Steinmetzger @ 2022-04-12 18:54 UTC
To: gentoo-user

On Tue, Apr 12, 2022 at 06:09:13PM +0000, Laurence Perkins wrote:

> > I actually developed a tool for that. It creates and checks md5
> > checksums recursively and *per directory*. Whenever I copy stuff from
> > somewhere, like a music album, I do an immediate md5 run on that
> > directory. And when I later copy that stuff around, I simply run the
> > tool again on the copy (after the FS cache was flushed, for example by
> > unmounting and remounting) to see whether the checksums are still valid.
> >
> There's also app-crypt/md5deep
>
> Does a number of hashes, is threaded, has options for piecewise hashing and a matching mode for using the hashes to find duplicates. Also a number of input and output filters for those cases where you don't want to hash everything.

I knew about md5deep when I started with my own tool (as can be read in the readme ;-) ). But md5deep used one single md5 file at a tree’s root, whereas I wanted one file per directory in a tree. The reason being that I wanted to be able to copy individual directories and still check their hashes without editing checksum files.

--
Grüße | Greetings | Salut | Qapla’
Please do not share anything from, with or about me on any social network.

If you were born feet-first, then, for a short moment, you wore your mother as a hat.
* Re: [gentoo-user] Hard drive error from SMART 2022-04-12 17:08 ` Dale 2022-04-12 17:21 ` Laurence Perkins 2022-04-12 17:39 ` Frank Steinmetzger @ 2022-04-12 19:27 ` Rich Freeman 2022-04-12 22:03 ` Dale 2 siblings, 1 reply; 57+ messages in thread From: Rich Freeman @ 2022-04-12 19:27 UTC (permalink / raw To: gentoo-user On Tue, Apr 12, 2022 at 1:08 PM Dale <rdalek1967@gmail.com> wrote: > > I remounted the drives and did a backup. For anyone running up on this, > just in case one of the files got corrupted, I used a little trick to > see if I can figure out which one may be bad if any. I took my rsync > commands from my little script and ran them one at a time with --dry-run > added. If a file was to be updated on the backup that I hadn't changed > or added, I was going to check into it before updating my backups. Unless you're using the --checksum option on rsync this isn't likely to be effective. By default rsync only looks at size and mtime, so it isn't going to back up a file unless you intentionally changed it. If data was silently corrupted this wouldn't detect a change at all without the --checksum option. Ultimately if you care about silent corruptions you're best off using a solution that actually achieves this. btrfs, zfs, or something whipped up with dm-integrity would be best. At a file level you could store multiple files and hashes, or use a solution like PAR2. Plain mdadm raid1 will fix issues if the drive detects and reports errors (the drive typically has a checksum to do this, but it is a black box and may not always work). The other solutions will reliably detect and possibly recover errors even if the drive fails to detect them (a so-called silent error). Just about all my linux data these days is on a solution that detects silent errors - zfs or lizardfs. 
On ssd-based systems where I don't want to invest in mirroring I still run zfs to detect errors and just use frequent backups (ssds are small anyway so they're cheap to frequently back up, especially if they're on zfs where there are send-based backup scripts for this, and typically this is for OS drives where things don't change much anyway). -- Rich ^ permalink raw reply [flat|nested] 57+ messages in thread
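To make the size-and-mtime point concrete, here is a minimal sketch that mimics the two kinds of comparison. It is not rsync's code; both function names are hypothetical, and SHA-256 is used only as a stand-in for whatever hash a real tool applies.

```python
import hashlib
import os


def quick_check_differs(src, dst):
    """Size-and-mtime comparison, the kind of test described above as the default."""
    a, b = os.stat(src), os.stat(dst)
    # Whole seconds are compared because some filesystems store coarser timestamps.
    return a.st_size != b.st_size or int(a.st_mtime) != int(b.st_mtime)


def content_differs(src, dst):
    """Full-content comparison via a hash, in the spirit of a checksum option."""
    def digest(path):
        h = hashlib.sha256()
        with open(path, "rb") as fh:
            for block in iter(lambda: fh.read(1 << 20), b""):
                h.update(block)
        return h.hexdigest()
    return digest(src) != digest(dst)
```

A file whose bytes were corrupted on disk keeps its size and modification time, so `quick_check_differs()` reports no change while `content_differs()` does. The price is that both copies have to be read in full.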
* Re: [gentoo-user] Hard drive error from SMART 2022-04-12 19:27 ` Rich Freeman @ 2022-04-12 22:03 ` Dale 2022-04-12 22:49 ` Frank Steinmetzger 0 siblings, 1 reply; 57+ messages in thread From: Dale @ 2022-04-12 22:03 UTC (permalink / raw To: gentoo-user Rich Freeman wrote: > On Tue, Apr 12, 2022 at 1:08 PM Dale <rdalek1967@gmail.com> wrote: >> I remounted the drives and did a backup. For anyone running up on this, >> just in case one of the files got corrupted, I used a little trick to >> see if I can figure out which one may be bad if any. I took my rsync >> commands from my little script and ran them one at a time with --dry-run >> added. If a file was to be updated on the backup that I hadn't changed >> or added, I was going to check into it before updating my backups. > Unless you're using the --checksum option on rsync this isn't likely > to be effective. By default rsync only looks at size and mtime, so it > isn't going to back up a file unless you intentionally changed it. If > data was silently corrupted this wouldn't detect a change at all > without the --checksum option. > > Ultimately if you care about silent corruptions you're best off using > a solution that actually achieves this. btrfs, zfs, or something > whipped up with dm-integrity would be best. At a file level you could > store multiple files and hashes, or use a solution like PAR2. Plain > mdadm raid1 will fix issues if the drive detects and reports errors > (the drive typically has a checksum to do this, but it is a black box > and may not always work). The other solutions will reliably detect > and possibly recover errors even if the drive fails to detect them (a > so-called silent error). > > Just about all my linux data these days is on a solution that detects > silent errors - zfs or lizardfs. 
On ssd-based systems where I don't > want to invest in mirroring I still run zfs to detect errors and just > use frequent backups (ssds are small anyway so they're cheap to > frequently back up, especially if they're on zfs where there are > send-based backup scripts for this, and typically this is for OS > drives where things don't change much anyway). > My hope was if it was corrupted and something changed then I'd see it in the list. If nothing changed then rsync wouldn't change anything on the backups either. I'll look into that option tho. May be something for the future. ;-) I suspect it would slow things down quite a bit tho. Dale :-) :-) ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [gentoo-user] Hard drive error from SMART 2022-04-12 22:03 ` Dale @ 2022-04-12 22:49 ` Frank Steinmetzger 2022-04-12 23:01 ` Dale 0 siblings, 1 reply; 57+ messages in thread From: Frank Steinmetzger @ 2022-04-12 22:49 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 1624 bytes --] Am Tue, Apr 12, 2022 at 05:03:01PM -0500 schrieb Dale: > Rich Freeman wrote: > > On Tue, Apr 12, 2022 at 1:08 PM Dale <rdalek1967@gmail.com> wrote: > >> I remounted the drives and did a backup. For anyone running up on this, > >> just in case one of the files got corrupted, I used a little trick to > >> see if I can figure out which one may be bad if any. I took my rsync > >> commands from my little script and ran them one at a time with --dry-run > >> added. If a file was to be updated on the backup that I hadn't changed > >> or added, I was going to check into it before updating my backups. > > Unless you're using the --checksum option on rsync this isn't likely > > to be effective. > My hope was if it was corrupted and something changed then I'd see it in > the list. If nothing changed then rsync wouldn't change anything on the > backups either. I'll look into that option tho. May be something for > the future. ;-) I suspect it would slow things down quite a bit tho. The advantage of an integrity scheme (like ZFS or comparing with a checksum file) over your rsync approach is that you only need to read all the datas™ from one drive instead of two. Plus: if rsync actually detects a change, it doesn’t know which of the two drives introduced the error. You need to find out yourself after the fact (which probably won’t be hard, but still, it’s one more manual step). -- Grüße | Greetings | Salut | Qapla’ Please do not share anything from, with or about me on any social network. 
“An itching nose must be scratched.” … Kosh (Star Wreck)
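The "which of the two copies is the bad one" problem goes away once a digest was recorded while the file was known to be good. A minimal sketch of that idea follows; the function names are hypothetical and SHA-256 is an arbitrary choice for the example.

```python
import hashlib


def sha256_of(path, bufsize=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(bufsize), b""):
            h.update(block)
    return h.hexdigest()


def good_copies(recorded_digest, paths):
    """Return the paths whose content still matches the digest recorded earlier."""
    return [p for p in paths if sha256_of(p) == recorded_digest]
```

Comparing two copies only against each other says that they differ; comparing each against the recorded digest says which one changed, and each drive only has to be read once.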
* Re: [gentoo-user] Hard drive error from SMART 2022-04-12 22:49 ` Frank Steinmetzger @ 2022-04-12 23:01 ` Dale 2022-04-12 23:59 ` Frank Steinmetzger 0 siblings, 1 reply; 57+ messages in thread From: Dale @ 2022-04-12 23:01 UTC (permalink / raw To: gentoo-user Frank Steinmetzger wrote: > Am Tue, Apr 12, 2022 at 05:03:01PM -0500 schrieb Dale: >> Rich Freeman wrote: >>> On Tue, Apr 12, 2022 at 1:08 PM Dale <rdalek1967@gmail.com> wrote: >>>> I remounted the drives and did a backup. For anyone running up on this, >>>> just in case one of the files got corrupted, I used a little trick to >>>> see if I can figure out which one may be bad if any. I took my rsync >>>> commands from my little script and ran them one at a time with --dry-run >>>> added. If a file was to be updated on the backup that I hadn't changed >>>> or added, I was going to check into it before updating my backups. >>> Unless you're using the --checksum option on rsync this isn't likely >>> to be effective. >> My hope was if it was corrupted and something changed then I'd see it in >> the list. If nothing changed then rsync wouldn't change anything on the >> backups either. I'll look into that option tho. May be something for >> the future. ;-) I suspect it would slow things down quite a bit tho. > The advantage of an integrity scheme (like ZFS or comparing with a checksum > file) over your rsync approach is that you only need to read all the datas™ > from one drive instead of two. Plus: if rsync actually detects a change, it > doesn’t know which of the two drives introduced the error. You need to find > out yourself after the fact (which probably won’t be hard, but still, it’s > one more manual step). > In this case, if something had changed, I'd have no problem manually checking the file to be sure which was good and which was bad. Given the error is recent on my drive, I'd suspect the backups to still be a good file. 
For that reason, I'd suspect the backup file to be good and therefore not one to be overwritten. I was trying to avoid a bad file replacing a good file on the backup, which then destroys all good files and leaves only bad ones. This is why I like that SMART at least let me know there is a problem.

Sometimes things have to be done manually, which is often the best way. Just depends on the situation I guess.

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART 2022-04-12 23:01 ` Dale @ 2022-04-12 23:59 ` Frank Steinmetzger 2022-04-13 0:43 ` Dale 0 siblings, 1 reply; 57+ messages in thread From: Frank Steinmetzger @ 2022-04-12 23:59 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 1791 bytes --] Am Tue, Apr 12, 2022 at 06:01:11PM -0500 schrieb Dale: > > The advantage of an integrity scheme (like ZFS or comparing with a checksum > > file) over your rsync approach is that you only need to read all the datas™ > > from one drive instead of two. Plus: if rsync actually detects a change, it > > doesn’t know which of the two drives introduced the error. You need to find > > out yourself after the fact (which probably won’t be hard, but still, it’s > > one more manual step). > > In this case, if something had changed, I'd have no problem manually > checking the file to be sure which was good and which was bad. Consider a big video file, which I know you like to accumulate from youtube and the likes. How do you find out the broken one? By watching it and trying to find the one image or audio frame that is garbled? The drive might return zeros or other garbage (bit flip) instead of actual content without SMART noticing it (uncorrectable error). > Given > the error is recent on my drive, I'd suspect the backups to still be a > good file. For that reason, I'd suspect the backup file to be good > therefore not to be overwritten. I was trying to avoid a bad file > replacing a good file on the backup which then destroys all good files > and leaves only bad ones. This is why I like that SMART at least let me > know there is a problem. I also tend to rely on smart, but it’s not all-knowing and probably not infallible. > Sometimes things has to be done manually which is often the best way. > Just depends on the situation I guess. -- Grüße | Greetings | Salut | Qapla’ Please do not share anything from, with or about me on any social network. 
The only thing still keeping me here is Earth’s gravity.
* Re: [gentoo-user] Hard drive error from SMART 2022-04-12 23:59 ` Frank Steinmetzger @ 2022-04-13 0:43 ` Dale 0 siblings, 0 replies; 57+ messages in thread From: Dale @ 2022-04-13 0:43 UTC (permalink / raw To: gentoo-user Frank Steinmetzger wrote: > Am Tue, Apr 12, 2022 at 06:01:11PM -0500 schrieb Dale: > >>> The advantage of an integrity scheme (like ZFS or comparing with a checksum >>> file) over your rsync approach is that you only need to read all the datas™ >>> from one drive instead of two. Plus: if rsync actually detects a change, it >>> doesn’t know which of the two drives introduced the error. You need to find >>> out yourself after the fact (which probably won’t be hard, but still, it’s >>> one more manual step). >> In this case, if something had changed, I'd have no problem manually >> checking the file to be sure which was good and which was bad. > Consider a big video file, which I know you like to accumulate from youtube > and the likes. How do you find out the broken one? By watching it and trying > to find the one image or audio frame that is garbled? The drive might return > zeros or other garbage (bit flip) instead of actual content without SMART > noticing it (uncorrectable error). > In this case, I'd likely rename one file and keep them both until I can figure out which is good. That said, I'd certainly keep the backup copy because odds are, it is good since the error came well after my last backup. At this point tho, I don't know what file was on that bad spot. >> Given >> the error is recent on my drive, I'd suspect the backups to still be a >> good file. For that reason, I'd suspect the backup file to be good >> therefore not to be overwritten. I was trying to avoid a bad file >> replacing a good file on the backup which then destroys all good files >> and leaves only bad ones. This is why I like that SMART at least let me >> know there is a problem. > I also tend to rely on smart, but it’s not all-knowing and probably not > infallible. 
> > This is very true. I mentioned elsewhere that things like spindle motor failure or the motor that moves the heads are usually not detectable. Some component failures can be detected but not all or even most from what I've read. Basically, the best you can hope for is SMART seeing a bad spot on the media itself. That it seems it can detect most of the time. TL;DR next two paragraphs. Just a interesting story along this line. I used to work in parts at a fortune 500 office company. We had millions of dollars of just computer stuff in inventory just for computers. That was in early 90's. They also had copiers and their parts, paper etc etc. We used a NCR computer for a computer system for the whole company. At the end of the building was a speed bump so people wouldn't go flying down the one lane road between the building and fence on the property line. One day a large truck almost empty went a little faster than normal over the last speed bump. It shook the building to the point I could feel it about 150 feet away. The computer room was like 50 feet away from that side of the building. It seems the hard drive felt it very well. One, maybe more, of the head(s) got under the media and started peeling it off the platter and made a really ugly screeching sound. No routine shutdown, they just pulled the plug. As you can imagine tho, it did no good. Even way back then drives of that speed were spinning fast enough. I suspect even by the time a person could blink it was way past fixing. That of course was way before SMART came along but SMART would never be able to predict such a failure. Even NCR said it was likely a 1 in a million chance that the truck hits just when the head was moving over a weak spot. Several thousand dollars later, and a private plane bringing in a new drive, the drive was replaced. Of course, the idiot in charge had no backups that were of any use. All of them were several weeks old, likely over a month. 
Luckily he stayed far away from me for at least a month. Otherwise, I'd likely still be in jail, with my hands around the neck of his corpse. :-@ SMART isn't a sure thing but it can help in some cases which is better than nothing at all. Dale :-) :-) ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-15 15:49 UTC
To: gentoo-user

Howdy,

I got the drive and pvmove is doing its thing. I would like to unplug one of the drives and physically move them around without shutting down my system. Is there a way to tell LVM to disable the drives while I'm doing this and restart them when done? I found the command vgchange -a n <name> but I'm not sure if that is correct. Honestly, I want to be really sure before I unplug things. I assume the "n" changes to "y" to restart them?

Thanks.

Dale

:-) :-)

P. S. BTW, the drive has passed two new tests with no error. The tests are slower than usual tho. I'm not sure why.
* Re: [gentoo-user] Hard drive error from SMART
From: John Covici @ 2022-04-15 16:50 UTC
To: gentoo-user

On Fri, 15 Apr 2022 11:49:21 -0400, Dale wrote:
>
> Howdy,
>
> I got the drive and pvmove is doing its thing. I would like to unplug
> one of the drives and physically move them around without shutting down
> my system. Is there a way to tell LVM to disable the drives while I'm
> doing this and restart them when done? I found the command vgchange -a
> n <name> but I'm not sure if that is correct. Honestly, I want to be
> really sure before I unplug things. I assume the "n" changes to "y" to
> restart them?
>
> Thanks.
>
> Dale
>
> :-) :-)
>
> P. S. BTW, the drive has passed two new tests with no error. The tests
> are slower than usual tho. I'm not sure why tho.
>

No, you can't do that till the pvmove is over.

--
Your life is like a penny. You're going to lose it. The question is:
How do you spend it?

John Covici wb2una
covici@ccs.covici.com
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-15 17:29 UTC
To: gentoo-user

John Covici wrote:
> On Fri, 15 Apr 2022 11:49:21 -0400,
> Dale wrote:
>> Howdy,
>>
>> I got the drive and pvmove is doing its thing. I would like to unplug
>> one of the drives and physically move them around without shutting down
>> my system. Is there a way to tell LVM to disable the drives while I'm
>> doing this and restart them when done? I found the command vgchange -a
>> n <name> but I'm not sure if that is correct. Honestly, I want to be
>> really sure before I unplug things. I assume the "n" changes to "y" to
>> restart them?
>>
>> Thanks.
>>
>> Dale
>>
>> :-) :-)
>>
>> P. S. BTW, the drive has passed two new tests with no error. The tests
>> are slower than usual tho. I'm not sure why tho.
>>
> No, you can't do that till the pvmove is over.
>

Yea. I was planning to wait until pvmove was done. It actually finished not too long after I sent the message. It was what prompted me to see if this is possible. I found a page that talks about it but the info didn't explain it much. I'm pretty sure that is the right command but given the limited info, I wasn't sure. Reading the man page helped a little but I still wasn't 100% sure then either. Thing is, I only have to unplug and move one of the two drives on that group.

Sounds like the right command tho. If not, someone speak up.

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-16 1:43 UTC
To: gentoo-user

Dale wrote:
> John Covici wrote:
>> On Fri, 15 Apr 2022 11:49:21 -0400,
>> Dale wrote:
>>> Howdy,
>>>
>>> I got the drive and pvmove is doing its thing. I would like to unplug
>>> one of the drives and physically move them around without shutting down
>>> my system. Is there a way to tell LVM to disable the drives while I'm
>>> doing this and restart them when done? I found the command vgchange -a
>>> n <name> but I'm not sure if that is correct. Honestly, I want to be
>>> really sure before I unplug things. I assume the "n" changes to "y" to
>>> restart them?
>>>
>>> Thanks.
>>>
>>> Dale
>>>
>>> :-) :-)
>>>
>>> P. S. BTW, the drive has passed two new tests with no error. The tests
>>> are slower than usual tho. I'm not sure why tho.
>>>
>> No, you can't do that till the pvmove is over.
>>
>
> Yea. I was planning to wait until pvmove was done. It actually
> finished not too long after I sent the message. It was what prompted me
> to see if this is possible. I found a page that talks about it but the
> info didn't explain it much. I'm pretty sure that is the right command
> but given the limited info, I wasn't sure. Reading the man page helped
> a little but still wasn't 100% sure then either. Thing is, I only have
> to unplug and move one of the two drives on that group.
>
> Sounds like the right command tho. If not, someone speak up.
>
> Dale
>
> :-) :-)
>

For anyone searching and running across this thread: that command did work to disable the drive. I'm not sure if I should have used pvchange to disable /dev/sdk1 or not. The problem I did run into was getting it back. I ran the command to enable it (the same vgchange command with "y" instead of "n") but it didn't work as expected. I had files missing. So, I unmounted it and ran pvscan, vgscan and lvscan, in that order. I then ran the enable command again to be sure and remounted the logical volume. It worked that time. All files were there. So, either one has to rescan them or I should have also run pvchange to disable as well. Maybe someone else can expand on this.

While I'm at it: is there a way to reset the sdk part? The old drive was sdd and I was hoping that when I moved the new drive, the name would change with it. The reason is, usually when I hook up my external drives, they use sdk. I'm sort of set up for that. A couple other things use sdk as well. I'm not sure if there is an easy way to do that or not. Wonder if it will reset when I reboot???

Dale

:-) :-)
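The sequence reported above can be sketched as a pair of shell functions. This is only a sketch: the volume group name and mount point are placeholders, the mount point is assumed to have an fstab entry, and whether the three rescans are strictly required is not settled in this thread. Setting `DRY_RUN=1` prints the commands instead of running them, which may be useful for checking the order before touching real drives.

```shell
#!/bin/sh
# Sketch only; "run", "vg_offline" and "vg_online" are illustrative names.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Unmount the filesystem at $2 and deactivate all logical volumes
# in volume group $1 so its drives can be unplugged.
vg_offline() {
    vg=$1
    mountpoint=$2
    run umount "$mountpoint" &&
    run vgchange -a n "$vg"
}

# After the drives are plugged back in: rescan, reactivate, remount.
# "mount <dir>" relies on an fstab entry for that directory.
vg_online() {
    vg=$1
    mountpoint=$2
    run pvscan &&
    run vgscan &&
    run lvscan &&
    run vgchange -a y "$vg" &&
    run mount "$mountpoint"
}
```

For example, `DRY_RUN=1 vg_offline myvg /mnt/data` would only print the two commands. `myvg` and `/mnt/data` are hypothetical names.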
* Re: [gentoo-user] Hard drive error from SMART
From: Frank Steinmetzger @ 2022-04-16 13:16 UTC
To: gentoo-user

On Fri, Apr 15, 2022 at 10:49:21AM -0500, Dale wrote:
> Howdy,
>
> I got the drive and pvmove is doing its thing. I would like to unplug
> one of the drives and physically move them around without shutting down
> my system. Is there a way to tell LVM to disable the drives while I'm
> doing this and restart them when done?

Be aware that SATA hot-plugging must be enabled in the BIOS for each individual SATA port (at least that’s the case on my board). I’m not sure what a difference it actually makes, though.

--
Grüße | Greetings | Salut | Qapla’
Please do not share anything from, with or about me on any social network.

Be regular. Eat cron flakes.
* Re: [gentoo-user] Hard drive error from SMART 2022-04-16 13:16 ` Frank Steinmetzger @ 2022-04-16 14:59 ` Dale 2022-04-16 17:20 ` Rich Freeman 2022-04-16 17:22 ` Michael 0 siblings, 2 replies; 57+ messages in thread From: Dale @ 2022-04-16 14:59 UTC (permalink / raw To: gentoo-user Frank Steinmetzger wrote: > Am Fri, Apr 15, 2022 at 10:49:21AM -0500 schrieb Dale: >> Howdy, >> >> I got the drive and pvmove is doing its thing. I would like to unplug >> one of the drives and physically move them around without shutting down >> my system. Is there a way to tell LVM to disable the drives while I'm >> doing this and restart them when done? > Be aware that SATA hot-plugging must be enabled in the BIOS for each > individual SATA port (at least that’s the case on my board). I’m not sure > what a difference it actually makes, though. > I enabled that the first time I cut the system on after building it. I couldn't think of any reason not to have it enabled really. It would be like making USB require rebooting before plugging/unplugging something. Certainly better than the old IDE days. I have googled and can not find a way to reset udev and it naming drives. I may have to rework some things since the drive kept the sdk instead of switching to sdd when I made the physical change. Thing is, I suspect it will when I reboot the next time. It also triggered messages from SMART too. It got upset that it couldn't find sdd anymore. Dale :-) :-) ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [gentoo-user] Hard drive error from SMART 2022-04-16 14:59 ` Dale @ 2022-04-16 17:20 ` Rich Freeman 2022-04-16 17:45 ` Dale 2022-04-16 17:22 ` Michael 1 sibling, 1 reply; 57+ messages in thread From: Rich Freeman @ 2022-04-16 17:20 UTC (permalink / raw To: gentoo-user On Sat, Apr 16, 2022 at 10:59 AM Dale <rdalek1967@gmail.com> wrote: > > I have googled and can not find a way to reset udev and it naming > drives. I may have to rework some things since the drive kept the sdk > instead of switching to sdd when I made the physical change. Thing is, > I suspect it will when I reboot the next time. IMO it is best to make that not matter. If you're referencing drives by letter in configuration files, you're just asking for some change to re-order things and cause problems. You're using LVM, so all the drives should be assembled based on their embedded metadata. It is fine to reference whatever temporary device name you're using when running pvmove/pvcreate since that doesn't really get stored anywhere. If you are directly mounting anything without using LVM then it is best to use labels/uuids/etc to identify partitions. > It also triggered > messages from SMART too. It got upset that it couldn't find sdd anymore. That is typical when hotswapping. I believe smartd only scans drives at startup, and of course if a drive does go offline it isn't a bad thing that it is noisy about it. From a quick read of the manpage SIGHUP might or might not get it to rescan the drives, and if not you can just restart it. The daemon works by polling so if there are any pending issues they should still get picked up after restarting the daemon. -- Rich ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [gentoo-user] Hard drive error from SMART 2022-04-16 17:20 ` Rich Freeman @ 2022-04-16 17:45 ` Dale 2022-04-16 18:58 ` Neil Bothwick 2022-04-16 19:41 ` Frank Steinmetzger 0 siblings, 2 replies; 57+ messages in thread From: Dale @ 2022-04-16 17:45 UTC (permalink / raw To: gentoo-user Rich Freeman wrote: > On Sat, Apr 16, 2022 at 10:59 AM Dale <rdalek1967@gmail.com> wrote: >> I have googled and can not find a way to reset udev and it naming >> drives. I may have to rework some things since the drive kept the sdk >> instead of switching to sdd when I made the physical change. Thing is, >> I suspect it will when I reboot the next time. > IMO it is best to make that not matter. If you're referencing drives > by letter in configuration files, you're just asking for some change > to re-order things and cause problems. > > You're using LVM, so all the drives should be assembled based on their > embedded metadata. It is fine to reference whatever temporary device > name you're using when running pvmove/pvcreate since that doesn't > really get stored anywhere. If you are directly mounting anything > without using LVM then it is best to use labels/uuids/etc to identify > partitions. I have to use sd** when using cryptsetup to decrypt the drive. I haven't found a way around that that is easier yet. My command was something like cryptsetup open /dev/sdk1 <name> and then it asks for the password. After that, I use UUID and a entry in fstab to mount. If there is a easier way, I'm open to it. I have three external drives and as long as I only power them up one at a time, they all used sdk. Now they use sdd and I keep trying to type in sdk, from habit. :/ My next project, find a good external drive enclosure like the three I got now. They no longer available tho. I like them because they have a fan, a eSATA port and a nifty display to let me know things are working. Really a good price for the features. I don't like USB connected drives. Long story. 
>> It also triggered >> messages from SMART too. It got upset that it couldn't find sdd anymore. > That is typical when hotswapping. I believe smartd only scans drives > at startup, and of course if a drive does go offline it isn't a bad > thing that it is noisy about it. From a quick read of the manpage > SIGHUP might or might not get it to rescan the drives, and if not you > can just restart it. The daemon works by polling so if there are any > pending issues they should still get picked up after restarting the > daemon. > Yea, it is a good thing. I just disabled it for sdd, enabled for the new sdk and restarted the service. It was happy then but getting a email from SMART always makes my heart beat a few extra beats and sometimes causes me to swallow big too. It's rarely good news. Maybe the next reboot will sort things out. Then I get to switch everything back to the old way again. :/ Dale :-) :-) ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [gentoo-user] Hard drive error from SMART
From: Neil Bothwick @ 2022-04-16 18:58 UTC
To: gentoo-user

On Sat, 16 Apr 2022 12:45:20 -0500, Dale wrote:

> > You're using LVM, so all the drives should be assembled based on their
> > embedded metadata. It is fine to reference whatever temporary device
> > name you're using when running pvmove/pvcreate since that doesn't
> > really get stored anywhere. If you are directly mounting anything
> > without using LVM then it is best to use labels/uuids/etc to identify
> > partitions.
>
> I have to use sd** when using cryptsetup to decrypt the drive. I
> haven't found a way around that that is easier yet. My command was
> something like cryptsetup open /dev/sdk1 <name> and then it asks for the
> password.

Use /dev/disk/by-partlabel/foo or /dev/disk/by-partuuid/bar.

--
Neil Bothwick

If Yoda so strong in force is, why words in right order he cannot put?
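On udev-based systems the persistent names are symlinks under `/dev/disk/by-partlabel/`, `/dev/disk/by-partuuid/` and similar directories, each pointing at the current kernel name. A small sketch for looking up which persistent names belong to a device such as `sdk1` follows. The function name `persistent_names` is made up, and the optional second argument exists only so the lookup can be tried against a different directory.

```shell
#!/bin/sh
# Print every persistent alias under /dev/disk/by-*/ that currently
# points at the kernel device name given in $1 (for example sdk1).
# $2 optionally overrides the base directory (default: /dev/disk).
persistent_names() {
    target=$1
    base=${2:-/dev/disk}
    for link in "$base"/by-*/*; do
        # Skip anything that is not a symlink (also covers an unmatched glob).
        [ -L "$link" ] || continue
        if [ "$(basename "$(readlink -f "$link")")" = "$target" ]; then
            printf '%s\n' "$link"
        fi
    done
}
```

Running `persistent_names sdk1` once and then using one of the printed paths, as in `cryptsetup open /dev/disk/by-partuuid/<id> <name>`, keeps working when the kernel hands out a different sdX letter.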
* Re: [gentoo-user] Hard drive error from SMART
From: Alan Mackenzie @ 2022-04-16 20:09 UTC
To: gentoo-user

On Sat, Apr 16, 2022 at 19:58:22 +0100, Neil Bothwick wrote:

> --
> Neil Bothwick

> If Yoda so strong in force is, why words in right order he cannot put?

Perhaps because his mother tongue is German. :-)

--
Alan Mackenzie (Nuremberg, Germany).
* Re: [gentoo-user] Hard drive error from SMART
From: Neil Bothwick @ 2022-04-16 21:40 UTC
To: gentoo-user

On Sat, 16 Apr 2022 20:09:23 +0000, Alan Mackenzie wrote:

> > If Yoda so strong in force is, why words in right order he cannot
> > put?
>
> Perhaps because his mother tongue is German. :-)

RLFO

--
Neil Bothwick

SITCOM: Single Income, Two Children, Oppressive Mortgage
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-16 22:39 UTC
To: gentoo-user

Neil Bothwick wrote:
> On Sat, 16 Apr 2022 12:45:20 -0500, Dale wrote:
>
>>> You're using LVM, so all the drives should be assembled based on their
>>> embedded metadata. It is fine to reference whatever temporary device
>>> name you're using when running pvmove/pvcreate since that doesn't
>>> really get stored anywhere. If you are directly mounting anything
>>> without using LVM then it is best to use labels/uuids/etc to identify
>>> partitions.
>> I have to use sd** when using cryptsetup to decrypt the drive. I
>> haven't found a way around that that is easier yet. My command was
>> something like cryptsetup open /dev/sdk1 <name> and then it asks for the
>> password.
> Use /dev/disk/by-partlabel/foo or /dev/disk/by-partuuid/bar.
>
>

That's even more typing than /dev/sdk. Some things I do easily by using tab completion and all. When mounting, I let fstab remember the UUID for it. Very little typing and I don't have to remember things. ;-) It's not like UUIDs are made to be remembered either. :-[

I think I put a label on the drive but things are a bit different when using cryptsetup. At least I think they are.

The easiest thing would be just having the replacement drive as sdd again and my external drive as sdk. I still think a reboot is going to correct this. I can't imagine it not, given how the drives are plugged in. I just wish there was an easy solution in the meantime. To be honest, I've had several times where this would come in handy. This is just yet another one.

Your way would be consistent tho. If I could script this, it would be the best way to do it. Script it once, done. Of course, we know my scripting skills are minimal at best. If you could say I even have scripting skills. lol

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART
From: Rich Freeman @ 2022-04-17 3:13 UTC
To: gentoo-user

On Sat, Apr 16, 2022 at 6:39 PM Dale <rdalek1967@gmail.com> wrote:
>
> Neil Bothwick wrote:
> > Use /dev/disk/by-partlabel/foo or /dev/disk/by-partuuid/bar.
> >
>
> That's even more typing than /dev/sdk. Some things I do easily by using
> tab completion and all. When mounting, I let fstab remember the UUID
> for it.

That's what copy/paste is for. How often are you editing your crypttab anyway? This way when you move drives around they still work.

> It's not like UUIDs are made to remember either.

blkid is your friend.

This is for config files, not random mounting/unmounting. I use the dynamic device nodes all the time if I'm just plugging a drive in and looking at it. However, if I'm going to put it in a config file I use a persistent ID so that I'm not running into breakage anytime things change.

When I'm setting it up it is just a few extra seconds to look up the UUID and copy/paste it. When the system randomly breaks I have to go digging through logs and config files to figure out what went wrong. It pays for me to spend a little more time on getting my config right when everything is fresh in my head, because when I'm troubleshooting it will take a little while just to figure out what I did when I set it up.

Here is an example of one of my cryptsetup files:

    cd1 UUID="1cbd5860-3469-41f7-8658-acd83d1957a0" /cd1.key

(This is using a random key stored in a file, which works for this particular situation. Obviously the drive is only as secure as that file.)

The corresponding drive blkid output is:

    /dev/sdb1: UUID="1cbd5860-3469-41f7-8658-acd83d1957a0" TYPE="crypto_LUKS" PARTUUID="a4a383a8-24c2-f74b-94d8-ca4ffc366327"

Oh, and look at that - the first drive I set up on this system is actually the second drive that got assigned a device name. It was probably /dev/sda1 when I first set it up, and I added another drive since then.

The contained drive shows up as:

    /dev/mapper/cd1: UUID="a2721813-4d10-4f69-ab2a-4beb0d6e95d7" TYPE="ext4"

(No LVM here - this is storage for a distributed filesystem so the volume management is effectively above the filesystem level. I can add other drives to the cluster and they're in the pool, and if I want to move data off this drive I can just edit a config file and the data will be moved while online. The encryption is mainly so that if a drive fails I don't have to worry about anybody recovering data from it.)

--
Rich
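The step from blkid output to a config entry like Rich's can be sketched as a small filter. This is an illustration only: the function name `crypttab_line` is made up, the output simply copies the three-field layout of Rich's example, and using `none` as the key field to get a passphrase prompt is an assumption that should be checked against crypttab(5) on the system in question.

```shell
#!/bin/sh
# Read one line of blkid output on stdin and print a config entry of the
# form: <name> UUID="<uuid>" <keyfile>
# $1 = mapping name, $2 = key file path (default "none", assumed to mean
# "prompt for a passphrase"). Only LUKS containers are accepted.
crypttab_line() {
    name=$1
    key=${2:-none}
    line=$(cat)
    case $line in
        *'TYPE="crypto_LUKS"'*) ;;
        *) echo "not a LUKS container: $line" >&2; return 1 ;;
    esac
    # Match ' UUID="..."' with a leading space so PARTUUID is not picked up.
    uuid=$(printf '%s\n' "$line" | sed -n 's/.* UUID="\([^"]*\)".*/\1/p')
    [ -n "$uuid" ] || return 1
    printf '%s UUID="%s" %s\n' "$name" "$uuid" "$key"
}
```

Typical use would be something like `blkid /dev/sdb1 | crypttab_line cd1 /cd1.key`, run once while the device letter is known, after which the letter no longer matters.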
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-17 4:44 UTC
To: gentoo-user

Rich Freeman wrote:
> On Sat, Apr 16, 2022 at 6:39 PM Dale <rdalek1967@gmail.com> wrote:
>> Neil Bothwick wrote:
>>> Use /dev/disk/by-partlabel/foo or /dev/disk/by-partuuid/bar.
>>>
>> That's even more typing than /dev/sdk. Some things I do easily by using
>> tab completion and all. When mounting, I let fstab remember the UUID
>> for it.
> That's what copy/paste is for. How often are you editing your
> crypttab anyway? This way when you move drives around they still
> work.

What is crypttab? I type in the command manually. It's what the howtos showed. I can't find a crypttab file. This may make things easier. My usual names are 8tb, 6tb and pri, short for private. Ran out of other names. ROFL

>
>> It's not like UUIDs are made to remember either.
> blkid is your friend.
>
> This is for config files, not random mounting/unmounting. I use the
> dynamic device nodes all the time if I'm just plugging a drive in and
> looking at it. However, if I'm going to put it in a config file I use
> a persistent ID so that I'm not running into breakage anytime things
> change.
>
> When I'm setting it up it is just a few extra seconds to look up the
> UUID and copy/paste it. When the system randomly breaks I have to go
> digging through logs and config files to figure out what went wrong.
> It pays for me to spend a little more time on getting my config right
> when everything is fresh in my head, because when I'm troubleshooting
> it will take a little while just to figure out what I did when I set
> it up.
>
> Here is an example of one of my cryptsetup files:
> cd1 UUID="1cbd5860-3469-41f7-8658-acd83d1957a0" /cd1.key
>
> (This is using a random key stored in a file, which works for this
> particular situation. Obviously the drive is only as secure as that
> file.)
>
> The corresponding drive blkid output is:
> /dev/sdb1: UUID="1cbd5860-3469-41f7-8658-acd83d1957a0"
> TYPE="crypto_LUKS" PARTUUID="a4a383a8-24c2-f74b-94d8-ca4ffc366327"
>
> Oh, and look at that - the first drive I set up on this system is
> actually the second drive that got assigned a device name. It was
> probably /dev/sda1 when I first set it up, and I added another drive
> since then.
>
> The contained drive shows up as:
> /dev/mapper/cd1: UUID="a2721813-4d10-4f69-ab2a-4beb0d6e95d7" TYPE="ext4"
>
> (No LVM here - this is storage for a distributed filesystem so the
> volume management is effectively above the filesystem level. I can
> add other drives to the cluster and they're in the pool, and if I want
> to move data off this drive I can just edit a config file and the data
> will be moved while online. The encryption is mainly so that if a
> drive fails I don't have to worry about anybody recovering data from
> it.)
>

I use passwords here. I just type in sdk1 and it worked before this drive move. I never tried to go any further than the howtos I found about using cryptsetup. No clue on the file. I don't see one here and don't recall reading about it either. Gonna google on that a bit. Interesting.

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART
From: Neil Bothwick @ 2022-04-17 8:34 UTC
To: gentoo-user

On Sat, 16 Apr 2022 23:44:58 -0500, Dale wrote:

> >> That's even more typing than /dev/sdk. Some things I do easily by
> >> using tab completion and all. When mounting, I let fstab remember
> >> the UUID for it.
> > That's what copy/paste is for. How often are you editing your
> > crypttab anyway? This way when you move drives around they still
> > work.
>
> What is crypttab? I type in the command manually.

Then use a shell alias, even less typing.

--
Neil Bothwick

Ninety-Ninety Rule Of Project Schedules - The first ninety percent of the task takes ninety percent of the time, and the last ten percent takes the other ninety percent of the time.
* Re: [gentoo-user] Hard drive error from SMART 2022-04-17 8:34 ` Neil Bothwick @ 2022-04-17 18:45 ` Dale 2022-04-18 13:21 ` Neil Bothwick 0 siblings, 1 reply; 57+ messages in thread From: Dale @ 2022-04-17 18:45 UTC (permalink / raw To: gentoo-user Neil Bothwick wrote: > On Sat, 16 Apr 2022 23:44:58 -0500, Dale wrote: > >>>> That's even more typing than /dev/sdk. Some things I do easily by >>>> using tab completion and all. When mounting, I let fstab remember >>>> the UUID for it. >>> That's what copy/paste is for. How often are you editing your >>> crypttab anyway? This way when you move drives around they still >>> work. >> What is crypttab? I type in the command manually. > Then use a shell alias, even less typing. > > I've done a couple basic alias things here but never grasped it enough to do anything beyond making ls run with -al each time. I think there is another one I did but it was long ago. I'd have to dig to find it. My biggest thing, I'm so used to using sdk1 that I'm likely to have to hit the backspace key quite often until this gets sorted out. My OS stuff is on sda, sdb, sdc and was on sdd. Anything above that was external. If one of the storms knocks my lights out, I may get a chance to reboot. See if that fixes things. Dale :-) :-) ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [gentoo-user] Hard drive error from SMART
From: Neil Bothwick @ 2022-04-18 13:21 UTC
To: gentoo-user

On Sun, 17 Apr 2022 13:45:39 -0500, Dale wrote:

> >> What is crypttab? I type in the command manually.
> > Then use a shell alias, even less typing.
>
> I've done a couple basic alias things here but never grasped it enough
> to do anything beyond making ls run with -al each time. I think there
> is another one I did but it was long ago. I'd have to dig to find it.

    alias docrypt='cryptsetup whatever you normally type'

Put that in your profile and you can then open the encrypted drives by typing docrypt. And if your setup changes, you change the alias but the command you type stays the same.

Or you could use a shell script to open and mount with one command.

    #!/bin/sh
    cryptsetup whatever
    mount whatever

--
Neil Bothwick

Celery is not food. It is a member of the plywood family.
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-18 14:06 UTC
To: gentoo-user

Neil Bothwick wrote:
> On Sun, 17 Apr 2022 13:45:39 -0500, Dale wrote:
>
>>>> What is crypttab? I type in the command manually.
>>> Then use a shell alias, even less typing.
>> I've done a couple basic alias things here but never grasped it enough
>> to do anything beyond making ls run with -al each time. I think there
>> is another one I did but it was long ago. I'd have to dig to find it.
> alias docrypt='cryptsetup whatever you normally type'
>
> Put that in your profile and you can then mount open the encrypted drives
> by typing docrypt. And if your setup changes, you change the alias but the
> command you type stays the same.
>
> Or you could use a shell script to open and mount with one command.
>
> #!/bin/sh
> cryptsetup whatever
> mount whatever

I have to enter a password in the middle of that. I don't know how that would work. As I've said before, my "scripts" are so simple, they may not even be called scripts. They're just files with commands in them.

If nothing changes when I get around to rebooting, I'll get into this some more.

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART
From: Neil Bothwick @ 2022-04-18 18:57 UTC
To: gentoo-user

On Mon, 18 Apr 2022 09:06:11 -0500, Dale wrote:

> > #!/bin/sh
> > cryptsetup whatever
> > mount whatever
>
> I have to enter a password in the middle of that. I don't know how that
> would work. As I've said before, my "scripts" are so simple, they may
> not even be called scripts. They're just files with commands in them.
>
> If nothing changes when I get around to rebooting, I'll get into this
> some more.

It will prompt for the password, just as if you ran the command manually.

--
Neil Bothwick

One of the nice things about standards is that there are so many of them.
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-29 15:39 UTC
To: gentoo-user

Neil Bothwick wrote:
> On Mon, 18 Apr 2022 09:06:11 -0500, Dale wrote:
>
>>> #!/bin/sh
>>> cryptsetup whatever
>>> mount whatever
>>
>> I have to enter a password in the middle of that. I don't know how that
>> would work. As I've said before, my "scripts" are so simple, they may
>> not even be called scripts. They're just files with commands in them.
>>
>> If nothing changes when I get around to rebooting, I'll get into this
>> some more.
> It will prompt for the password, just as if you ran the command manually.

Finally got around to trying this. I went to town today and locked it up before I left. When I came back, used your little script trick and it worked great. It mounts and everything for me. Now I'll make one to umount and close as well. No prompting so it should be easy enough.

Thanks for the tip.

Dale

:-) :-)
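The umount-and-close counterpart mentioned here might look something like the sketch below. The mapper name and mount point are made-up placeholders rather than values from the thread, and it assumes a root shell and `cryptsetup close <name>`.

```shell
# Placeholder values, not taken from the thread: adjust to your own setup.
CRYPT_NAME=backup        # hypothetical /dev/mapper name used when opening
CRYPT_MNT=/mnt/backup    # hypothetical mount point

# Unmount the filesystem first, then close the encrypted mapping.
undocrypt() {
    # Stop here if the unmount fails (for example, the mount is still busy),
    # so the mapping is never closed underneath a mounted filesystem.
    umount "$CRYPT_MNT" || return 1
    cryptsetup close "$CRYPT_NAME"
}
```

Neither step asks for a passphrase, so `undocrypt` should run straight through unless something still has files open on the mount.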
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-17 18:38 UTC
To: gentoo-user

Rich Freeman wrote:
> On Sat, Apr 16, 2022 at 6:39 PM Dale <rdalek1967@gmail.com> wrote:
>> Neil Bothwick wrote:
>>> Use /dev/disk/by-partlabel/foo or /dev/disk/by-partuuid/bar.
>>>
>> That's even more typing than /dev/sdk. Some things I do easily by using
>> tab completion and all. When mounting, I let fstab remember the UUID
>> for it.
> That's what copy/paste is for. How often are you editing your
> crypttab anyway? This way when you move drives around they still
> work.

I did a google search for crypttab. After reading what its purpose is, I see why I don't have one. It seems it is more for decrypting and mounting things during bootup. I don't need to mount encrypted data to boot up or even log into KDE. I just need it to access data when needed. Most of the encrypted data that I access often is actually my external drives. When I leave home, I close the encrypted data. When I get home, I open it and remount it. If I need it for something.

One day I may encrypt my /home directory. Maybe. I don't really see the need since any data I want protected can just be put on the encrypted part I have now.

Anyway, I suspect when I reboot, this will be back to the old way. I thought I was going to have an opportunity to do that last night. My lights went off for a few seconds. UPS kicked in and they came back on. It's not over yet tho. ;-)

Or am I missing something?

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART
From: Frank Steinmetzger @ 2022-04-16 19:41 UTC
To: gentoo-user

On Sat, Apr 16, 2022 at 12:45:20PM -0500, Dale wrote:

> My next project, find a good external drive enclosure like the three I
> got now. They no longer available tho. I like them because they have a
> fan, a eSATA port and a nifty display to let me know things are
> working. Really a good price for the features. I don't like USB
> connected drives. Long story.

How about a table-top dock?
- no cable salad, caused by each enclosure having its own power supply and data cable
- disks are used “naked”, so no heat buildup and you are more flexible

Here are some models with eSATA:
https://skinflint.co.uk/?cat=hddocks&xf=4426_eSATA
And one of them even has four slots → even fewer cables.

That’s of course if you use the disks intermittently and store them away in between. If you plan on running them for longer durations at a time, it may be better to use a proper enclosure, in order to protect the disks from physical influences (impacts, short-circuits). Also, those SATA connectors are not designed to be connected often. I think I read about 50 cycles somewhere.

--
Grüße | Greetings | Salut | Qapla’
Please do not share anything from, with or about me on any social network.

The knowing don’t talk much, the talking don’t know much.
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-16 22:52 UTC
To: gentoo-user

Frank Steinmetzger wrote:
> On Sat, Apr 16, 2022 at 12:45:20PM -0500, Dale wrote:
>
>> My next project, find a good external drive enclosure like the three I
>> got now. They no longer available tho. I like them because they have a
>> fan, a eSATA port and a nifty display to let me know things are
>> working. Really a good price for the features. I don't like USB
>> connected drives. Long story.
> How about a table-top dock?
> - no cable salad, caused by each enclosure having its own power supply and
> data cable
> - disks are used “naked”, so no heat buildup and you are more flexible
>
> Here are some models with eSATA:
> https://skinflint.co.uk/?cat=hddocks&xf=4426_eSATA
> And one of them even has four slots → even fewer cables.
>
> That’s of course if you use the disks intermittently and store them away
> inbetween. If you plan on running them for longer durations at a time, it
> may be better to use a proper enclosure, in order to protect the disks from
> physical influences (impacts, short-circuits). Also, those SATA connectors
> are not designed to be connected often. I think I read about 50 cycles
> somewhere.

I've looked into those. They do have advantages for sure. One, the bare drives take up less room in my fire safe. Lots smaller than the enclosures I have now. My concern has always been the plugging/unplugging a lot and dust when not in use. I didn't know how long those connectors are supposed to last but the bad thing is, when it goes, the drive is gone too, plus the data. I do like that it is in open air which takes care of cooling pretty well. I do my backups once a week so it isn't as often as some situations but it isn't rare either.

I've found an enclosure since my post but got to wait until next income boost to get one. May buy a few of them if I can. I think the ones I found have fans but no display but that's OK. I really like having the fan more than the display. It likely doesn't help with huge airflow but it gives it some airflow.

I'm running pretty short on space in my case. I have a Cooler Master HAF-932 case. I'm out of 3.5" spots. I need to get some 5 1/4" to 3.5" adapters. I got some plastic thingys but they don't work in my case. It has that push button thingy and the plastic adapter is too loose for my comfort. Plus, it has little cooling too. The 3.5" bays have that big fan blowing on them.

Working on a plan. Maybe this is a good excuse to start working on a NAS. :/

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART
From: Mark Knecht @ 2022-04-16 23:01 UTC
To: Gentoo User

On Sat, Apr 16, 2022 at 3:53 PM Dale <rdalek1967@gmail.com> wrote:
<SNIP>
> Maybe this is a good excuse
> to start working on a NAS. :/

That's my vote. (For the second time)

I'm using a FreeBSD NAS (TrueNAS) but they recently came out with a Linux version which you might be more comfortable with. If you use a 1Gb/s or higher network connection it's quite fast.

You can also go the Synology route via Amazon. You can get a 2-disk NAS chassis which does RAID for around $250 last time I looked.

Good luck whatever you do.

Mark
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-17 1:05 UTC
To: gentoo-user

Mark Knecht wrote:
> On Sat, Apr 16, 2022 at 3:53 PM Dale <rdalek1967@gmail.com> wrote:
> <SNIP>
>> Maybe this is a good excuse
>> to start working on a NAS. :/
> That's my vote. (For the second time)
>
> I'm using a FreeBSD Nas (TrueNAS) but they recently came out with a
> Linux version which you might be more comfortable with. If you use a
> 1Gb/S or higher network connection it's quite fast.
>
> You can also go the Synology route via Amazon. You can get a 2-disk
> NAS chassis which does RAID for around $250 last time I looked.
>
> Good luck whatever you do.
>
> Mark

Other than being another piece of equipment running up a light bill, it is the best way to deal with this. The way I'm doing now is a bit of a struggle at times. I just need to get other things done first, from a money perspective which inflation isn't helping on. A trip to the grocery store is no fun anymore.

One of these days tho. I just gotta do it.

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART
From: Mark Knecht @ 2022-04-17 13:52 UTC
To: Gentoo User

On Sat, Apr 16, 2022 at 6:06 PM Dale <rdalek1967@gmail.com> wrote:
>
> Mark Knecht wrote:
> > On Sat, Apr 16, 2022 at 3:53 PM Dale <rdalek1967@gmail.com> wrote:
> > <SNIP>
> >> Maybe this is a good excuse
> >> to start working on a NAS. :/
> > That's my vote. (For the second time)
> >
> > I'm using a FreeBSD Nas (TrueNAS) but they recently came out with a
> > Linux version which you might be more comfortable with. If you use a
> > 1Gb/S or higher network connection it's quite fast.
> >
> > You can also go the Synology route via Amazon. You can get a 2-disk
> > NAS chassis which does RAID for around $250 last time I looked.
> >
> > Good luck whatever you do.
> >
> > Mark
>
> Other than being another piece of equipment running up a light bill, it
> is the best way to deal with this. The way I'm doing now is a bit of a
> struggle at times. I just need to get other things done first, from a
> money perspective which inflation isn't helping on. A trip to the
> grocery story is no fun anymore.
>
> One of these days tho. I just gotta do it.
>
> Dale

I hear you about groceries and inflation. Wol pushed me to build my first one just using an old computer. I had an old machine - case, power supply with a bad motherboard so I purchased an i3-2120 CPU @ 3.30GHz motherboard with 8GB memory used at a computer store for $40. Surprisingly that's more than enough CPU & memory for basic backups. No matter what you're going to have to pay for the drives whether they go in your box, in external cases or in a backup machine.

I only turn it on to do backups or to retrieve data so not much electricity.
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-17 17:22 UTC
To: gentoo-user

Mark Knecht wrote:
> On Sat, Apr 16, 2022 at 6:06 PM Dale <rdalek1967@gmail.com> wrote:
>> Mark Knecht wrote:
>>> On Sat, Apr 16, 2022 at 3:53 PM Dale <rdalek1967@gmail.com> wrote:
>>> <SNIP>
>>>> Maybe this is a good excuse
>>>> to start working on a NAS. :/
>>> That's my vote. (For the second time)
>>>
>>> I'm using a FreeBSD Nas (TrueNAS) but they recently came out with a
>>> Linux version which you might be more comfortable with. If you use a
>>> 1Gb/S or higher network connection it's quite fast.
>>>
>>> You can also go the Synology route via Amazon. You can get a 2-disk
>>> NAS chassis which does RAID for around $250 last time I looked.
>>>
>>> Good luck whatever you do.
>>>
>>> Mark
>> Other than being another piece of equipment running up a light bill, it
>> is the best way to deal with this. The way I'm doing now is a bit of a
>> struggle at times. I just need to get other things done first, from a
>> money perspective which inflation isn't helping on. A trip to the
>> grocery story is no fun anymore.
>>
>> One of these days tho. I just gotta do it.
>>
>> Dale
> I hear you about groceries and inflation. Wol pushed me to build my
> first one just using an old computer. I had an old machine - case,
> power supply with a bad motherboard so I purchased an i3-2120 CPU @
> 3.30GHz motherboard with 8GB memory used at a computer store for $40.
> Surprisingly that's more than enough CPU & memory for basic backups.
> No matter what you're going to have to pay for the drives whether they
> go in your box, in external cases or in a backup machine.
>
> I only turn it on to do backups or to retrieve data so not much electricity.

I was wanting to have a NAS that also puts video on my TV. That way I can turn off my puter and still watch TV. It would be as much a media system as a NAS. I have a mobo, ram and I think I have an extra video card somewhere. I'd need a case, power supply and such. I'd also need a place to put all this which is going to be interesting. I'd want plenty of hard drive bays tho. I found a fractal 804 case that caught my eye. Can't recall all the details tho.

Still, needs money and right now, I got too many other coals in the fire. Plus, I'm trying to figure out this crypttab thing. From what I've read, it is for opening encrypted drives during boot up which is not really what I want. I can boot and login into my KDE without anything encrypted being mounted. Kinda like this new setup really.

I'll be so glad when fiber internet gets here. I think I'm going with the 500Mb/sec plan. Costs about the same as my current 1.5Mb/sec plan. lol

Dale

:-) :-)
* Re: [gentoo-user] Hard drive error from SMART
From: Mark Knecht @ 2022-04-17 17:36 UTC
To: Gentoo User

On Sun, Apr 17, 2022 at 10:22 AM Dale <rdalek1967@gmail.com> wrote:
<SNIP>
>
> I was wanting to have a NAS that also puts video on my TV. That way I
> can turn off my puter and still watch TV. It would be as much a media
> system as a NAS. I have a mobo, ram and I think I have a extra video
> card somewhere. I'd need a case, power supply and such. I'd also need
> a place to put all this which is going to be interesting. I'd want
> plenty of hard drive bays tho. I found a fractal 804 case that caught
> my eye. Can't recall all the details tho.
>
> Still, needs money and right now, I got to many other coals in the
> fire. Plus, I'm trying to figure out this crypttab thing. From what
> I've read, it is for opening encrypted drives during boot up which is
> not really what I want. I can boot and login into my KDE without
> anything encrypted being mounted. Kinda like this new setup really.
>
> I'll be so glad when fiber internet gets here. I think I'm going with
> the 500Mb/sec plan. Costs about the same as my current 1.5Mb/sec plan.
> lol
>
> Dale

I believe all of that can be done on TrueNAS, and most likely with any of the prepackaged boxes like Synology, but I've not done it myself.

Most modern flatscreens can access NAS servers and play video and/or music over the network so the NAS server itself need not have a GPU. I did put a VGA in both of mine as building them is easier, but it wasn't strictly necessary. TrueNAS can be built on a headless machine if you know the IP address.

As for FreeBSD, they have 'jails' which I think are more or less chroot environments, so you can put whatever MythTV is called these days in a jail and run it from there. People do that with DNS, network monitors and all sorts of things. (Assuming you have enough compute power.)

No need to do any of this now. It's good that you're thinking about solutions so that when the money comes along you'll be ready.

Cheers,
Mark
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-17 18:03 UTC
To: gentoo-user

Mark Knecht wrote:
> On Sun, Apr 17, 2022 at 10:22 AM Dale <rdalek1967@gmail.com> wrote:
> <SNIP>
>> I was wanting to have a NAS that also puts video on my TV. That way I
>> can turn off my puter and still watch TV. It would be as much a media
>> system as a NAS. I have a mobo, ram and I think I have a extra video
>> card somewhere. I'd need a case, power supply and such. I'd also need
>> a place to put all this which is going to be interesting. I'd want
>> plenty of hard drive bays tho. I found a fractal 804 case that caught
>> my eye. Can't recall all the details tho.
>>
>> Still, needs money and right now, I got to many other coals in the
>> fire. Plus, I'm trying to figure out this crypttab thing. From what
>> I've read, it is for opening encrypted drives during boot up which is
>> not really what I want. I can boot and login into my KDE without
>> anything encrypted being mounted. Kinda like this new setup really.
>>
>> I'll be so glad when fiber internet gets here. I think I'm going with
>> the 500Mb/sec plan. Costs about the same as my current 1.5Mb/sec plan.
>> lol
>>
>> Dale
> I believe all of that can be done on TrueNAS, and most likely with
> any of the prepackaged boxes like Synology, but I've not do it myself.
>
> Most modern flatscreens can access NAS servers and play video
> and or music over the network so the NAS server itself
> need not have a GPU. I did put a VGA in both of mine as building
> them is easier, but it wasn't strictly necessary. TrueNAS can be
> built on a headless machine if you know the IP address.
>
> As for FreeBSD, they have 'jails' which I think are more or less
> chroot environments, so you can put whatever MythTV is called
> these days in a jail and run it from there. People do that with
> DNS, network monitors and all sorts of things. (Assuming
> you have enough compute power.)
>
> No need to do any of this now. It's good that you're thinking
> about solutions so that when the money comes along you'll
> be ready.
>
> Cheers,
> Mark

When I bought my current TV, I avoided the smart ones. At the time, it was new technology and people were talking about how buggy it was so I bought a regular TV. If I had to buy one today, I'd buy a smart one. They seem to work pretty well now. Nice and stable at least. Still, I check to make sure whatever I buy is based on Linux as its OS. One can usually check the manual and see the copyright notice in the last few pages. It mentions the kernel. If it mentions windoze, I move on. LQ is almost always Linux based.

I'm at the point where I know I need to do this. It's just getting there. I even thought about putting the OS on a USB stick. After all, once booted, it won't access the stick very often. I could even load it into memory at boot up and it not even need the stick at all once booted. Like is done with some Gentoo install media.

One of these days.

Dale

:-) :-)

P. S. New drive seems to be working fine. Now to figure out what to do with old one. :-D
* Re: [gentoo-user] Hard drive error from SMART
From: Mark Knecht @ 2022-04-17 18:44 UTC
To: Gentoo User

On Sun, Apr 17, 2022 at 11:04 AM Dale <rdalek1967@gmail.com> wrote:
<SNIP>
>
> When I bought my current TV, I avoided the smart ones. At the time, it
> was new technology and people were talking about how buggy it was so I
> bought a regular TV. If I had to buy one today, I'd buy a smart one.
> They seem to work pretty well now. Nice and stable at least. Still, I
> check to make sure whatever I buy is based on Linux as its OS. One can
> usually check the manual and see the copyright notice in the last few
> pages. It mentions the kernel. If it mentions windoze, I move on. LQ
> is almost always Linux based.
>
> I'm at the point where I know I need to do this. It's just getting
> there. I even thought about putting the OS on a USB stick. After all,
> once booted, it won't access the stick very often. I could even load it
> into memory at boot up and it not even need the stick at all once
> booted. Like is done with some Gentoo install media.
>
> One of these days.

Fair enough. You might also investigate whether a newer Roku/AppleTV type machine will access a network share. I suspect they will.

TrueNAS will run from a USB stick. You'll need two - one for the setup media and a second to install it to, but after that you only need storage drives to hold your backups or media.

I think a NAS for backups and media playback makes sense. You want the machine on most of the time, but if you shut it down it won't generally stop you from using your main computer. On the other hand, with NVMe drives in my new machine I have no spinning media so I use the NAS as a network store much as you envision watching movies on your TV, but for me it's mostly astrophotography data.

Have fun. Happy Easter if you celebrate it. Happy Sunday if you don't.

Cheers,
Mark
* Re: [gentoo-user] Hard drive error from SMART
From: Michael @ 2022-04-16 17:22 UTC
To: gentoo-user

On Saturday, 16 April 2022 15:59:25 BST Dale wrote:
> Frank Steinmetzger wrote:
> > On Fri, Apr 15, 2022 at 10:49:21AM -0500, Dale wrote:
> >> Howdy,
> >>
> >> I got the drive and pvmove is doing its thing. I would like to unplug
> >> one of the drives and physically move them around without shutting down
> >> my system. Is there a way to tell LVM to disable the drives while I'm
> >> doing this and restart them when done?
> >
> > Be aware that SATA hot-plugging must be enabled in the BIOS for each
> > individual SATA port (at least that’s the case on my board). I’m not sure
> > what a difference it actually makes, though.
>
> I enabled that the first time I cut the system on after building it. I
> couldn't think of any reason not to have it enabled really. It would be
> like making USB require rebooting before plugging/unplugging something.
> Certainly better than the old IDE days.
>
> I have googled and can not find a way to reset udev and it naming
> drives. I may have to rework some things since the drive kept the sdk
> instead of switching to sdd when I made the physical change. Thing is,
> I suspect it will when I reboot the next time. It also triggered
> messages from SMART too. It got upset that it couldn't find sdd anymore.
>
> Dale
>
> :-) :-)

Have a look at this post. It explains why you could end up with a race condition if you set up udev rules to name disks in a different order than what the kernel assigns:

https://www.linuxquestions.org/questions/linux-hardware-18/udev-persistent-disk-name-4175450519/#post4893847
* Re: [gentoo-user] Hard drive error from SMART
From: Dale @ 2022-04-16 18:01 UTC
To: gentoo-user

Michael wrote:
> On Saturday, 16 April 2022 15:59:25 BST Dale wrote:
>> Frank Steinmetzger wrote:
>>> On Fri, Apr 15, 2022 at 10:49:21AM -0500, Dale wrote:
>>>> Howdy,
>>>>
>>>> I got the drive and pvmove is doing its thing. I would like to unplug
>>>> one of the drives and physically move them around without shutting down
>>>> my system. Is there a way to tell LVM to disable the drives while I'm
>>>> doing this and restart them when done?
>>> Be aware that SATA hot-plugging must be enabled in the BIOS for each
>>> individual SATA port (at least that’s the case on my board). I’m not sure
>>> what a difference it actually makes, though.
>> I enabled that the first time I cut the system on after building it. I
>> couldn't think of any reason not to have it enabled really. It would be
>> like making USB require rebooting before plugging/unplugging something.
>> Certainly better than the old IDE days.
>>
>> I have googled and can not find a way to reset udev and it naming
>> drives. I may have to rework some things since the drive kept the sdk
>> instead of switching to sdd when I made the physical change. Thing is,
>> I suspect it will when I reboot the next time. It also triggered
>> messages from SMART too. It got upset that it couldn't find sdd anymore.
>>
>> Dale
>>
>> :-) :-)
> Have a look at this post. It explains why you could end up with a race
> condition if you set up udev rules to name disks in different order than what
> the kernel assigns:
>
> https://www.linuxquestions.org/questions/linux-hardware-18/udev-persistent-disk-name-4175450519/#post4893847

I think I've read about that before. Gonna read it in a minute. What I'd like is a way to reset it back to like it would be with a fresh install for example. I figure there is a config file somewhere that stores this sort of thing but no clue where it is tho.

Oh well. Maybe one day. ;-)

Dale

:-) :-)
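For context, and as far as I understand it, the sdX letters are handed out by the kernel in the order it detects the disks, so there is likely no config file to reset. What udev does maintain are symlinks under /dev/disk/ (by-id, by-uuid, by-partuuid, by-partlabel) that follow the same disk or partition wherever it gets plugged in. The small sketch below prints where each of those symlinks currently points; the function name is made up for illustration, and it assumes a `readlink` that supports `-f` (GNU coreutils or busybox).

```shell
# Print "stable-name -> current kernel device" for every symlink in a
# /dev/disk/by-* directory (defaults to /dev/disk/by-id).
show_stable_names() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        # Strip the directory part and resolve the link to its real target.
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

Running `show_stable_names /dev/disk/by-partuuid` before and after moving a drive would show the right-hand side changing (sdd one time, sdk the next) while the left-hand name stays the same, which is why those names are the safer thing to put in scripts and smartd settings.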