From: thegeezer@thegeezer.net
To: gentoo-amd64@lists.gentoo.org
Subject: Re: [gentoo-amd64] Soliciting new RAID ideas
Date: Wed, 28 May 2014 00:38:03 +0100

On 2014-05-27 23:58, Harry Holt wrote:
> On May 27, 2014 6:39 PM, "Bob Sanders" wrote:
> >
> > Mark Knecht mused, then expounded:
> > > Hi all,
> > >    The list is quiet. Please excuse me waking it up. (Or trying
> > > to...) ;-)
> > >
> > >    I'm at the point where I'm a few months from running out of
> > > disk space on my RAID6, so I'm considering how to move forward. I
> > > thought I'd check in here and get any ideas folks have. Thanks in
> > > advance.
> >
> > Beware - if Adobe acroread is used and you opt for a 3TB home
> > directory, there is a chance it will not work. Or more
> > specifically, acroread is still 32-bit. It's only something I've
> > seen with the xfs filesystem, and Adobe has ignored it for approx.
> > 3 years now.
> >
> > >    The system is Gentoo 64-bit, mostly stable, using an i7-980x
> > > Extreme Edition processor with 24GB DRAM. Large chassis, 6
> > > removable HD bays, room for 6 other drives, a large power supply.
> > >
> > >    The disk subsystem is a 1.4TB RAID6 built from five SATA2
> > > 500GB WD RAID Edition 3 drives. The RAID has not had a single
> > > glitch in the 4+ years I've used this machine.
> > >
> > >    Generally there are 4 classes of data on the RAID:
> > >
> > > 1) Gentoo (obviously), configs backed up every weekend. I plan to
> > > rebuild from scratch using existing configs if there's a failure.
> > > Being down for a couple of days is not an issue.
> > > 2) VMs - about 300GB. Loaded every morning, stopped & saved every
> > > night, backed up every weekend.
> > > 3) Financial data - lots of it - stocks, futures, options, etc.
> > > Performance requirements are pretty low. Backed up every weekend.
> > > 4) Video files - backed up to a different location than items
> > > 1/2/3 whenever there are changes.
> > >
> > >    After eclean-dist/eclean-pkg I'm down to about 80GB free, and
> > > this will fill up in 3-6 months, so it's time to make some
> > > changes.
> > >
> > >    My thoughts:
> > >
> > > 1) Buy three (or even just two) 5400 RPM 3TB WD Red drives and go
> > > with RAID1.
> > > This would use the internal SATA2 ports, so it wouldn't be the
> > > highest performance, but likely a lot better than my SATA2 RAID6.
> > >
> > > 2) Buy two 7200 RPM 3TB WD Red drives and an LSI Logic hardware
> > > RAID controller. This would be SATA3, so probably way more
> > > performance than I have now. MUCH more expensive though.
> >
> > RAID 1 is fine, RAID 10 is better, but consumes 4 drives and SATA
> > ports.
> >
> > > 3) #1 + an SSD. I have an unused 120GB SSD, so I could get
> > > another, make a 2-disk RAID1, put Gentoo on that and everything
> > > else on the newer 3TB drives. More complex, probably lower
> > > reliability, and I'm not sure I gain much.
> > >
> > >    Beyond this I need to talk file system types. I'm fat, dumb
> > > and happy with Ext4 and don't really relish dealing with new
> > > stuff, but now's the time to at least look.
> >
> > If you change, do not use ZFS (and possibly BTRFS) if the system
> > does not have ECC DRAM. A single, unnoticed memory error can
> > corrupt the data pool and be written to the file system, which
> > effectively renders it corrupt without a way to recover.
> >
> > FWIW - a Synology DS414slim can hold 4 x 1TB WD Red NAS 2.5" drives
> > and provide a boot over NFS or iSCSI to your VMs. The downside is
> > the NAS box and drives would go for a bit north of $636. The upside
> > is all your movies and VM files could move off your workstation,
> > and the workstation would still host the VMs via a mount of the NAS
> > box.
>
> +1 for the Synology NAS boxes, those things are awesome: fast,
> reliable, upgradable (if you buy a larger one), and the best value
> available for iSCSI-attached VMs.

While I agree with the +1 for iSCSI storage, there are a few drawbacks.
Yes, the modularity is awesome -- it's super simple to spin up a backup
system and "move" data with a simple connection command. A top tip is
to have the "data" part of the VM as an iSCSI connection too, so you
can easily detach it and reattach it to another VM (a rough sketch of
the commands is just below). However, depending on the VMs you have,
you will probably start needing more than one gigabit connection to max
out speeds: 1-gigabit ethernet is not the same as 6-gigabit SATA3, and
spinning rust is not the same as SSD.
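To make the detach/reattach concrete, with open-iscsi it goes roughly
like this -- the portal address and IQN here are made-up examples, not
anything a Synology would actually publish:

    # ask the NAS what targets it offers (portal IP is an example)
    iscsiadm -m discovery -t sendtargets -p 192.168.1.50

    # attach the data LUN; it shows up as a new /dev/sdX
    iscsiadm -m node -T iqn.2014-05.net.example:vmdata \
        -p 192.168.1.50 --login

    # cleanly detach before logging in from another vm's host
    iscsiadm -m node -T iqn.2014-05.net.example:vmdata \
        -p 192.168.1.50 --logout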
Looking at the spec of the existing workstation, I'd be tempted to stay
with mdadm rather than a hardware RAID card (which is probably running
embedded anyway) -- though with that i7, you have disabled turbo boost,
right? An interesting comparison would be PCI Express speed vs. the
motherboard's SATA-to-CPU bridge speed: obviously spinning disks will
not max out 6Gbit, and the motherboard may not give you 6x 6Gbit of
real throughput, whereas a dedicated hardware RAID card _might_, if it
had intelligent caching.

Other fun to look at would be LVM, because I personally think it's
awesome. For example, the first half of a spinning disk is
substantially faster than the second half, due to the longer tracks on
the outer part of the platter, so I split each disk into three
partitions -- fast, medium, slow -- and add them all to an LVM volume
group. You can then group the fast partitions into one RAID, the
mediums into another, and the slows into a third (rough commands at the
end of this mail); mdadm allows similar configs with partitions.

ZFS lost its lustre for me when the minimum requirement became 1GB of
RAM per terabyte... I may have my gigabytes and gigabits mixed up on
this one; happy for someone to correct me. BTRFS looks very, very
interesting to me, though I still haven't played with it -- mostly for
the checksums, as the rest I can do with LVM.

You might also like to consider fun with deduplication: a RAID base,
with LVM on top, with block-level dedupe a la lessfs, then LVM inside
the deduped LVM (yeah, I know I'm sick, but the doctor tells me the
layers of abstraction eventually combine happily :) -- but I'm not sure
you'll get much benefit from virtual machines and movies being deduped.

If you add an SSD into the mix, you can also look at SSD caching layers
such as bcache and dm-cache, or even just move the journal of your ext4
partition there instead (sketch below).

Crucially, you need to think about which issues you _need_ to solve and
which you would merely like to solve. Space is obviously an issue, and
performance is not really an issue for you. Depending on your budget, a
pair of large SATA drives + mdadm will be ideal; if you already had
LVM, you could simply 'move' then 'enlarge' your existing stuff (tm) --
again, rough commands below. I'd like to know how BTRFS would do the
same, if anyone can tell me.

You have RAID6 because you probably know that RAID5 is just waiting for
trouble, so I'd probably start looking at BTRFS for your financial
data, to get it checksummed (last sketch below). Also consider ECC
memory if your motherboard supports it: never mind the hosing of
filesystems -- if you are running VMs, you do _not_ want memory errors
making them behave oddly or worse, and if you have lots of active
financial data (Bloomberg + analytics) you run the risk of the
butterfly effect producing odd results.
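For the fast/medium/slow LVM trick mentioned earlier, the commands are
roughly these -- device names, sizes and the vg name are all made up,
and it assumes an lvm2 new enough to have --type raid1 (otherwise one
mdadm array per speed class works the same way):

    # sdb and sdc partitioned identically into outer (fast), middle,
    # and inner (slow) thirds beforehand, e.g. with parted
    pvcreate /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdc1 /dev/sdc2 /dev/sdc3
    vgcreate vg0 /dev/sdb1 /dev/sdb2 /dev/sdb3 /dev/sdc1 /dev/sdc2 /dev/sdc3

    # mirror fast with fast and slow with slow across the two disks
    lvcreate --type raid1 -m1 -L 400G -n fast vg0 /dev/sdb1 /dev/sdc1
    lvcreate --type raid1 -m1 -L 400G -n slow vg0 /dev/sdb3 /dev/sdc3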
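Moving the ext4 journal to the SSD looks roughly like this -- device
names are examples again, and the filesystem has to be unmounted while
you do it:

    # make a dedicated journal device on an ssd partition; its block
    # size must match the filesystem's
    mke2fs -O journal_dev -b 4096 /dev/sdd1

    # drop the internal journal, then attach the external one
    tune2fs -O ^has_journal /dev/md0
    tune2fs -j -J device=/dev/sdd1 /dev/md0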
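The 'move' then 'enlarge' dance with a pair of new big drives under
mdadm would be something like the following, assuming (names all made
up) the current PV is /dev/md0, in a volume group vg0, holding an LV
called data:

    # mirror the two new drives
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdd /dev/sde

    # bring the new array into the volume group and migrate, all online
    pvcreate /dev/md1
    vgextend vg0 /dev/md1
    pvmove /dev/md0          # shuffle every extent off the old raid
    vgreduce vg0 /dev/md0    # old array can now be retired

    # grow the lv and the filesystem in one go
    lvextend -r -L +2T /dev/vg0/data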
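And for checksummed financial data, a minimal btrfs mirror plus a
periodic scrub -- devices and mountpoint made up once more:

    # data and metadata both mirrored and checksummed
    mkfs.btrfs -d raid1 -m raid1 /dev/sdf /dev/sdg
    mount /dev/sdf /mnt/financial

    # re-verify checksums and repair from the good copy, e.g. weekly
    btrfs scrub start /mnt/financial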