public inbox for gentoo-user@lists.gentoo.org
From: Bill Kenworthy <billk@iinet.net.au>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Hard drive storage questions
Date: Fri, 9 Nov 2018 16:17:59 +0800
Message-ID: <f0fefde7-a601-2b8d-be5f-9f9a953ba801@iinet.net.au>
In-Reply-To: <CAGfcS_nJCc3L7v7zupYD_214gUGNev6Gmm170c8MQ0PxApuqVA@mail.gmail.com>

On 09/11/18 10:29, Rich Freeman wrote:
> On Thu, Nov 8, 2018 at 8:16 PM Dale <rdalek1967@gmail.com> wrote:
>> I'm trying to come up with a
>> plan that allows me to grow easier and without having to worry about
>> running out of motherboard based ports.
>>
> So, this is an issue I've been changing my mind on over the years.
> There are a few common approaches:
>
> * Find ways to cram a lot of drives on one host
> * Use a patchwork of NAS devices or improvised hosts sharing over
> samba/nfs/etc and end up with a mess of mount points.
> * Use a distributed FS
>
> Right now I'm mainly using the first approach, and I'm trying to move
> to the last.  The middle option has never appealed to me.
>
> So, to do more of what you're doing in the most efficient way
> possible, I recommend finding used LSI HBA cards.  These have mini-SAS
> ports on them, and one of these can be attached to a breakout cable
> that gets you 4 SATA ports.  I just picked up two of these for $20
> each on ebay (used) and they have 4 mini-SAS ports each, which is
> capacity for 16 SATA drives per card.  Typically these have 4x or
> larger PCIe interfaces, so you'll need a large slot, or one with a
> cutout.  You'd have to do the math but I suspect that if the card+MB
> supports PCIe 3.0 you're not losing much if you cram it into a smaller
> slot.  If most of the drives are idle most of the time then that also
> demands less bandwidth.  16 fully busy hard drives obviously can put
> out a lot of data if reading sequentially.
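
A rough back-of-envelope check of that, sketched in Python (the ~985 MB/s
per PCIe 3.0 lane and ~200 MB/s sequential per drive figures are my
assumptions, not numbers from above):

    # Does a PCIe 3.0 x4 slot have enough bandwidth for 16 spinning drives
    # all reading sequentially at once?  Assumed figures only.
    PCIE3_LANE_MBS = 985        # approx. usable MB/s per PCIe 3.0 lane
    HDD_SEQ_MBS = 200           # optimistic sequential MB/s for one HDD

    def headroom(lanes, drives):
        slot = lanes * PCIE3_LANE_MBS
        need = drives * HDD_SEQ_MBS
        return slot, need, slot / need

    for lanes in (4, 8):
        slot, need, ratio = headroom(lanes, 16)
        print(f"x{lanes}: {slot} MB/s slot vs {need} MB/s of drives "
              f"({ratio:.1f}x headroom)")

So a x4 slot is roughly break-even even in the worst case of all 16 drives
streaming at once, and anything less busy fits comfortably.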
>
> You can of course get more consumer-oriented SATA cards, but you're
> lucky to get 2-4 SATA ports on a card that runs you $30.  The mini-SAS
> HBAs get you a LOT more drives per PCIe slot, and your PCIe slots are
> your main limiting factor assuming you have power and case space.
>
> Oh, and those HBA cards need to be flashed into "IT" mode - they're
> often sold this way, but if they support RAID you want to flash the IT
> firmware that just makes them into a bunch of standalone SATA ports.
> This is usually a PITA that involves DOS or whatever, but I have
> noticed some of the software needed in the Gentoo repo.
>
> If you go that route it is just like having a ton of SATA ports in
> your system - they just show up as sda...sdz and so on (no idea where
> it goes after that).  Software-wise you just keep doing what you're
> already doing (though you should be seriously considering
> mdadm/zfs/btrfs/whatever at that point).
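
For what it's worth, after sdz the kernel just keeps extending the suffix
the way spreadsheet columns do: sdaa, sdab, and so on.  A little Python
sketch of the naming scheme (an illustration of the pattern, not the
kernel code):

    import string

    def sd_name(index):
        """0 -> 'sda', 25 -> 'sdz', 26 -> 'sdaa', 27 -> 'sdab', ..."""
        suffix = ""
        index += 1              # bijective base-26, like spreadsheet columns
        while index > 0:
            index, rem = divmod(index - 1, 26)
            suffix = string.ascii_lowercase[rem] + suffix
        return "sd" + suffix

    print([sd_name(i) for i in (0, 25, 26, 27, 701, 702)])
    # ['sda', 'sdz', 'sdaa', 'sdab', 'sdzz', 'sdaaa']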
>
> That is the more traditional route.
>
> Now let me talk about distributed filesystems, which is the more
> scalable approach.  I'm getting tired of being limited by SATA ports,
> and cases, and such.  I'm also frustrated with some of zfs's
> inflexibility around removing drives.  These are constraints that make
> upgrading painful, and often inefficient.  Distributed filesystems
> offer a different solution.
>
> A distributed filesystem spreads its storage across many hosts, with
> an arbitrary number of drives per host (more or less).  So, you can
> add more hosts, add more drives to a host, and so on.  That means
> you're never forced to try to find a way to cram a few more drives in
> one host.  The resulting filesystem appears as one gigantic filesystem
> (unless you want to split it up), which means no mess of nfs
> mountpoints and so on, and all the other headaches of nfs.  Just as
> with RAID these support redundancy, except now you can lose entire
> hosts without issue.  With many of them you can even tell it which
> PDU/rack/whatever each host is plugged into, and it will place the data
> so that you can lose all the hosts in one rack without losing anything.
> You can also mount the filesystem
> on as many hosts as you want at the same time.
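
To make the failure-domain idea concrete, here is a toy Python sketch of
rack-aware replica placement (an illustration only - not the actual
placement logic of Ceph, LizardFS or anything else, and the host/rack
names are made up):

    import itertools

    HOSTS = {                   # hypothetical host -> rack map
        "node1": "rack-a", "node2": "rack-a",
        "node3": "rack-b", "node4": "rack-b",
        "node5": "rack-c",
    }

    def place_replicas(chunk_id, copies=3):
        """Pick hosts for one chunk so that no two copies share a rack."""
        ordered = list(HOSTS)
        start = chunk_id % len(ordered)   # rotate start so chunks spread out
        chosen, racks_used = [], set()
        for name in itertools.islice(itertools.cycle(ordered),
                                     start, start + len(ordered)):
            if HOSTS[name] not in racks_used:
                chosen.append(name)
                racks_used.add(HOSTS[name])
            if len(chosen) == copies:
                break
        return chosen

    for cid in range(4):
        print(cid, place_replicas(cid))
    # every chunk gets one copy in each of rack-a, rack-b and rack-c, so
    # losing an entire rack (or PDU) never takes out all copies of anything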
>
> They do tend to be a bit more complex.  The big players can scale VERY
> large - thousands of drives easily.  Everything seems to be moving
> towards Ceph/CephFS.  If you were hosting a datacenter full of
> VMs/containers/etc I'd be telling you to host it on Ceph.  However,
> for small scale (which you definitely are right now), I'm not thrilled
> with it.  Due to the way it allocates data (hash-based), anytime
> anything changes you end up having to move all the data around in the
> cluster, and all the reports I've read suggest it doesn't perform all
> that great if you only have a few nodes.  Ceph storage nodes are also
> RAM-hungry, and I want to run these on ARM to save power; few ARM
> boards have that kind of RAM, and those that do are very expensive.
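
To illustrate the rebalancing point: with purely hash-based placement,
changing the node count remaps most objects.  A deliberately naive Python
sketch (plain modulo hashing - Ceph's CRUSH is much smarter and moves far
less data, but small clusters still feel it):

    import hashlib

    def node_for(obj, n_nodes):
        h = int(hashlib.sha256(obj.encode()).hexdigest(), 16)
        return h % n_nodes

    objects = [f"obj-{i}" for i in range(10_000)]
    before = {o: node_for(o, 4) for o in objects}    # 4-node cluster
    after = {o: node_for(o, 5) for o in objects}     # add a fifth node
    moved = sum(before[o] != after[o] for o in objects)
    print(f"{moved / len(objects):.0%} of objects change node going 4 -> 5")
    # roughly 80% move with naive modulo placement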
>
> Personally I'm working on deploying a cluster of a few nodes running
> LizardFS, which is basically a fork/derivative of MooseFS.  While it
> won't scale nearly as well, below 100 nodes should be fine, and in
> particular it sounds like it works fairly well with only a few nodes.
> It has its pros and cons, but for my needs it should be sufficient.
> It also isn't RAM-hungry.  I'm going to be testing it on some
> RockPro64s, with the LSI HBAs.
>
> I did note that Gentoo lacks a LizardFS client.  I suspect I'll be
> looking to fix that - I'm sure the moosefs ebuild would be a good
> starting point.  I'm probably going to be a wimp and run the storage
> nodes on Ubuntu or whatever upstream targets - they're basically
> appliances as far as I'm concerned.
>
> So, those are the two routes I'd recommend.  Just get yourself an HBA
> if you only want a few more drives.  If you see your needs expanding
> then consider a distributed filesystem.  The advantage of the latter
> is that you can keep expanding it however you want with additional
> drives/nodes/whatever.  If you're going over 20 nodes I'd use Ceph for
> sure - IMO that seems to be the future of this space.
>
I'll second your comments on Ceph after my experience - a great idea for
large-scale systems, but performance is quite poor on small ones.  It needs
at least gigabit links on two separate networks, and only one or two drives
per host, to work properly.

I think I'll give lizardfs a go - an interesting read.


BillK