From: Mark Knecht
Date: Sun, 3 May 2020 11:29:42 -0700
Subject: Re: [gentoo-user] which linux RAID setup to choose?
To: Gentoo User
Reply-To: gentoo-user@lists.gentoo.org
List-Id: Gentoo Linux mail


On Sun, May 3, 2020 at 10:56 AM Caveman Al Toraboran <toraboracaveman@protonmail.com> wrote:
>
> On Sunday, May 3, 2020 1:23 PM, Wols Lists <antlists@youngman.org.uk> wrote:
>
> > For anything above raid 1, MAKE SURE your drives support SCT/ERC. For
> > example, Seagate Barracudas are very popular desktop drives, but I guess
> > maybe HALF of the emails asking for help recovering an array on the raid
> > list involve them dying ...
> >
> > (I've got two :-( but my new system - when I get it running - has
> > ironwolves instead.)
>
> that's very scary.
>
> just to double check: are those help emails about
> linux's software RAID? or is it about hardware
> RAIDs?
>
> the reason i ask about software vs. hardware, is
> because of this wiki article [1] which seems to
> suggest that mdadm handles error recovery by
> waiting for up to 30 seconds (set in
> /sys/block/sd*/device/timeout) after which the
> device is reset.
>
> am i missing something? to me it seems that [1]
> seems to suggest that linux software raid has a
> reliable way to handle the issue? since i guess
> all disks support resetting well?
>
> [1] https://en.wikipedia.org/wiki/Error_recovery_control#Software_RAID

When doing Linux RAID, hardware or software, make sure you get a RAID-aware drive that supports TLER (Time Limited Error Recovery) or whatever the vendor that makes your drive calls it. Typically this is set at about 7 seconds, guaranteeing that no matter what's going on the drive will respond to the upper layers (mdadm) to let it know it's alive. A non-RAID drive with no TLER feature will respond when it's ready, and typically if that takes longer than 30 seconds then the RAID subsystem kicks the drive and you have to re-add it. While there's nothing 'technically' wrong with the storage, when the RAID rebuilds you eventually hit another one of these >30 second waits, another drive gets kicked, and you're dead.
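
To make the timeout mismatch concrete, here's a rough Python sketch of the per-drive check I have in mind. It's only an illustration under a few assumptions: the drive's error recovery limit is given in tenths of a second (the way smartctl's scterc report prints it, as far as I know), the kernel timeout lives in /sys/block/<dev>/device/timeout as mentioned above, and the function names and the 180 second fallback are mine, not anything mdadm or the kernel provides.

```python
from pathlib import Path
from typing import Optional

DEFAULT_KERNEL_TIMEOUT_S = 30    # the 30 second kernel timeout discussed in this thread
FALLBACK_KERNEL_TIMEOUT_S = 180  # assumption: a generous value for drives without TLER/ERC


def read_kernel_timeout(dev: str, sys_root: str = "/sys/block") -> int:
    """Return the command timeout in seconds for a device such as 'sda'."""
    return int(Path(sys_root, dev, "device", "timeout").read_text().strip())


def timeout_is_safe(erc_deciseconds: Optional[int], kernel_timeout_s: int) -> bool:
    """True if the drive is guaranteed to answer before the kernel gives up.

    erc_deciseconds is the drive's error recovery limit in tenths of a
    second (70 means 7.0 s). In this sketch, None or 0 means the limit is
    unsupported or disabled, so the drive may keep retrying indefinitely.
    """
    if not erc_deciseconds:
        return False
    return erc_deciseconds / 10 < kernel_timeout_s


def suggested_kernel_timeout(erc_deciseconds: Optional[int]) -> int:
    """Keep the default timeout when the drive's limit fits, else raise it."""
    if timeout_is_safe(erc_deciseconds, DEFAULT_KERNEL_TIMEOUT_S):
        return DEFAULT_KERNEL_TIMEOUT_S
    return FALLBACK_KERNEL_TIMEOUT_S
```

On a real box you'd feed timeout_is_safe() the limit your SMART tool reports plus read_kernel_timeout("sda"). The point is just that a 7 second limit sits well under 30 seconds, while a drive with no limit can never promise that.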

I've used the WD Reds and WD Golds (no longer sold) and never had any problem.

Build a RAID with a WD Green and you're in for trouble. ;-)))

HTH,
Mark

