* [gentoo-user] File system testing
@ 2014-09-16 19:07 James
  2014-09-17  7:45 ` J. Roeleveld
  ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: James @ 2014-09-16 19:07 UTC (permalink / raw
To: gentoo-user

Hello,

By now many are familiar with my keen interest in clustering Gentoo
systems. Most cluster technologies use a distributed file system on top
of the local (HDD/SSD) file system. Naturally, not all file systems,
particularly the distributed ones, come with straightforward
instructions. Also, a device file system such as XFS and a distributed
file system layered on top of it may not pair very well. So a variety of
testing is something I'm researching. Eliminating any file system listed
below, based on Gentoo user experience, is most welcome information, as
are tips and tricks for setting up any of these file systems.

Distributed File Systems (DFS):
HDFS (poor performance)
Lustre
Ceph
XtreemFS
GlusterFS
MooseFS
FhGFS (BeeGFS) soon to be entirely open sourced?
Any other distributed file systems I should consider using?

Local (Device) File Systems (LFS):
btrfs
zfs
ext4
xfs

Obviously I do not want to test all combinations of DFS/LFS, so your
comments are extremely welcome, as is any and all related information.

James

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [gentoo-user] File system testing 2014-09-16 19:07 [gentoo-user] File system testing James @ 2014-09-17 7:45 ` J. Roeleveld 2014-09-17 15:55 ` [gentoo-user] " James 2014-09-17 18:10 ` [gentoo-user] " Hervé Guillemet 2014-09-25 20:47 ` [gentoo-user] " thegeezer 2 siblings, 1 reply; 25+ messages in thread From: J. Roeleveld @ 2014-09-17 7:45 UTC (permalink / raw To: gentoo-user On Tuesday, September 16, 2014 07:07:38 PM James wrote: > Hello, > > By now many are familiar with my keen interest in clustering gentoo > systems. So, what most cluster technologies use is a distributed file > system on top of the local (HD/SDD) file system. Naturally not > all file systems, particularly the distributed file systems, have > straightforward instructions. Also, an device file system, such as > XFS and a distibuted (on top of the device file system) combination > may not work very well when paired. So a variety of testing is > something I'm researching. Eliminiation of either file system > listed below, due to Gentoo User Experience is most welcome information, > as well as tips and tricks to setting up any file system. > > > Distributed File Systems (DFS): > HDFS (poor performance) > Lustre > Ceph > XtreemFS > GlusterFS > MooseFS > FhGFS (BeeGFS) soon to be entirely open sourced? > Any other distributed file systems I should consider using? > > Local (Device) File Systems LFS: > btrfs > zfs > ext4 > xfs > > Obviously I do not what to test all combinations of DFS/LocalFS > so your comments are extremely welcome as is any and all > related information. > > James James, Is my understanding correct that the top list all require one of the bottom list? Eg. the "clustering" FSs only ensure the files on the LFSs are duplicated/spread over the various nodes? I would normally expect the clustering FS to be either the full layer or a clustered block-device where an FS can be placed on top. Otherwise it seems more like a network filesystem with caching options (See AFS). I am also interested in these filesystems, but for a slightly different scenario: - 2 servers in remote locations (different offices) - 1 of these has all the files stored (server A) at the main office - The other (server B - remote office) needs to "offer" all files from serverA When server B needs to supply a file, it needs to check if the local copy is still the "valid" version. If yes, supply the local copy, otherwise download from server A. When a file is changed, server A needs to be updated. While server B is sharing a file, the file needs to be locked on server A preventing simultaneous updates. I prefer not to supply the same amount of storage at server B as server A has. The remote location generally only needs access to 5% of the total amount of files stored on server A. But not always the same 5%. Does anyone know of a filesystem that can handle this? -- Joost ^ permalink raw reply [flat|nested] 25+ messages in thread
* [gentoo-user] Re: File system testing
  2014-09-17  7:45 ` J. Roeleveld
@ 2014-09-17 15:55 ` James
  2014-09-17 19:34 ` J. Roeleveld
  0 siblings, 1 reply; 25+ messages in thread
From: James @ 2014-09-17 15:55 UTC (permalink / raw
To: gentoo-user

J. Roeleveld <joost <at> antarean.org> writes:

> > Distributed File Systems (DFS):
> > Local (Device) File Systems (LFS):
>
> Is my understanding correct that the top list all require one of
> the bottom list?
> Eg. the "clustering" FSs only ensure the files on the LFSs are
> duplicated/spread over the various nodes?
>
> I would normally expect the clustering FS to be either the full layer
> or a clustered block-device where an FS can be placed on top.

I have not performed these installations yet. My research indicates
that first you put the local FS on the drive, just like any installation
of Linux. Then you put the distributed FS on top of this. Some DFS might
not require an LFS, but FhGFS does, and so does HDFS. I will not actually
be able to answer your questions accurately until I start to build up the
3-system cluster (a week or 2 away is my best guess).

> Otherwise it seems more like a network filesystem with caching
> options (See AFS).

OK, I'll add AFS. You may be correct on this one, or AFS might be both.

> I am also interested in these filesystems, but for a slightly different
> scenario:

OK, so as the "test-dummy-crash-victim" I'd be honored to have you,
Alan, Neil, Mic, etc. back-seat-drive on this adventure! (The more
I read, the more it's time for bourbon, bash, and a bit of cursing
to get started...)

> - 2 servers in remote locations (different offices)
> - 1 of these has all the files stored (server A) at the main office
> - The other (server B - remote office) needs to "offer" all files
> from serverA When server B needs to supply a file, it needs to
> check if the local copy is still the "valid" version.
> If yes, supply the local copy, otherwise download
> from server A. When a file is changed, server A needs to be updated.
> While server B is sharing a file, the file needs to be locked on server A
> preventing simultaneous updates.

OOch, file locking (precious tells me that is always tricky).
(psst, systemd is causing fits for the clustering geniuses;
some are espousing a variety of cgroup gymnastics for phantom kills)

Spark is fault tolerant, regardless of node/memory/drive failures
above the fault tolerance that a file system configuration may support.
In fact, lost files can be 'regenerated', but it is computationally
expensive. You have to get your file system(s) set up, then install
mesos-0.20.0 and then Spark. I have Mesos mostly ready. I should
have Spark in alpha-beta this weekend. I'm fairly clueless on the
DFS/LFS issue, so a DFS that needs no LFS might be a good first choice
for testing the (3) system cluster.

> I prefer not to supply the same amount of storage at server B as
> server A has. The remote location generally only needs access to 5% of
> the total amount of files stored on server A. But not always the same 5%.
> Does anyone know of a filesystem that can handle this?

So in clustering, from what I have read, there are all kinds of files
passed around between the nodes and the master(s). Many are critical
files not part of the application or scientific calculations.
So in time, I think in a clustering environment, all you seek is
very possible, but it's a hunch, gut feeling, not fact.
I'd put RAID mirrors underneath that system, if it makes sense, for now,
or just dd the stuff with a script or something kludgy (Alan is the
king of kludge....)

On Gentoo Planet one of the devs has "Consul" in his overlays. Read
up on that for ideas that may be relevant to what you need.

> Joost

James

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [gentoo-user] Re: File system testing
  2014-09-17 15:55 ` [gentoo-user] " James
@ 2014-09-17 19:34 ` J. Roeleveld
  2014-09-17 20:20 ` Alec Ten Harmsel
  0 siblings, 1 reply; 25+ messages in thread
From: J. Roeleveld @ 2014-09-17 19:34 UTC (permalink / raw
To: gentoo-user

[-- Attachment #1: Type: text/plain, Size: 6077 bytes --]

On Wednesday, September 17, 2014 03:55:56 PM James wrote:
> J. Roeleveld <joost <at> antarean.org> writes:
> > > Distributed File Systems (DFS):
> > > Local (Device) File Systems (LFS):
> >
> > Is my understanding correct that the top list all require one of
> > the bottom list?
> > Eg. the "clustering" FSs only ensure the files on the LFSs are
> > duplicated/spread over the various nodes?
> >
> > I would normally expect the clustering FS to be either the full layer
> > or a clustered block-device where an FS can be placed on top.
>
> I have not performed these installations yet. My research indicates
> that first you put the local FS on the drive, just like any installation
> of Linux. Then you put the distributed FS on top of this. Some DFS might
> not require an LFS, but FhGFS does, and so does HDFS. I will not actually
> be able to answer your questions accurately until I start to build up the
> 3-system cluster (a week or 2 away is my best guess).

Playing around with clusters is on my list, but due to other activities
having a higher priority, I haven't had much time yet.

> > Otherwise it seems more like a network filesystem with caching
> > options (See AFS).
>
> OK, I'll add AFS. You may be correct on this one, or AFS might be both.

Personally, I would read up on these and see how they work. Then, based
on that, decide if they are likely to assist in the specific situation
you are interested in.

AFS, NFS, CIFS,... can be used for clusters, but, apart from NFS, I
wouldn't expect much performance out of them.
If you need it to be fault-tolerant and not overly rely on a single point
of failure, I wouldn't be using any of these. Only AFS, from my original
investigation, showed some fault-tolerance, but needed too many resources
(disk-space) on the clients.

> > I am also interested in these filesystems, but for a slightly different
> > scenario:
>
> OK, so as the "test-dummy-crash-victim" I'd be honored to have you,
> Alan, Neil, Mic, etc. back-seat-drive on this adventure! (The more
> I read, the more it's time for bourbon, bash, and a bit of cursing
> to get started...)

Good luck and even though I'd love to join in with the testing, I simply
do not have the time to keep up. I would probably just slow you down.

> > - 2 servers in remote locations (different offices)
> > - 1 of these has all the files stored (server A) at the main office
> > - The other (server B - remote office) needs to "offer" all files
> > from serverA When server B needs to supply a file, it needs to
> > check if the local copy is still the "valid" version.
> > If yes, supply the local copy, otherwise download
> > from server A. When a file is changed, server A needs to be updated.
> > While server B is sharing a file, the file needs to be locked on server A
> > preventing simultaneous updates.
>
> OOch, file locking (precious tells me that is always tricky).

I need it to be locked on server A while server B has a proper write-lock
to avoid 2 modifications competing with each other.

> (psst, systemd is causing fits for the clustering geniuses;
> some are espousing a variety of cgroup gymnastics for phantom kills)

phantom kills?
> Spark is fault tolerant, regardless of node/memory/drive failures
> above the fault tolerance that a file system configuration may support.
> In fact, lost files can be 'regenerated', but it is computationally
> expensive.

Too much for me.

> You have to get your file system(s) set up, then install
> mesos-0.20.0 and then Spark. I have Mesos mostly ready. I should
> have Spark in alpha-beta this weekend. I'm fairly clueless on the
> DFS/LFS issue, so a DFS that needs no LFS might be a good first choice
> for testing the (3) system cluster.

That, or a 4th node acting like a NAS sharing the filesystem over NFS.

> > I prefer not to supply the same amount of storage at server B as
> > server A has. The remote location generally only needs access to 5% of
> > the total amount of files stored on server A. But not always the same 5%.
> > Does anyone know of a filesystem that can handle this?
>
> So in clustering, from what I have read, there are all kinds of files
> passed around between the nodes and the master(s). Many are critical
> files not part of the application or scientific calculations.
> So in time, I think in a clustering environment, all you seek is
> very possible, but it's a hunch, gut feeling, not fact. I'd put
> RAID mirrors underneath that system, if it makes sense, for now,
> or just dd the stuff with a script or something kludgy (Alan is the
> king of kludge....)

Hmm... mirroring between servers. Always an option, except it will not
work for me in this case:
1) The remote location will have a domestic ADSL line. I'll be lucky if
it has a 500kbps uplink.
2) Server A, currently, has around 7TB of current data that also needs
to be available at the remote site.

With an 8mbps downlink, waiting for a file to be copied to the remote
site is acceptable. After modifications, the new version can be copied
back to server A slowly during network-idle-time, or when server A
actually needs it.
If there is constant mirroring between A and B, the 500kbps (if I am
lucky) will be insufficient.

> On Gentoo Planet one of the devs has "Consul" in his overlays. Read
> up on that for ideas that may be relevant to what you need.

Assuming the following is the website: http://www.consul.io/intro/vs/
Then this seems more a tool to replace Nagios, Puppet and similar. It
doesn't have any magic inside to actually distribute a filesystem in such
a way that, when a file is "cached" at the local site, you don't have to
wait for it to download from the remote site; that any changes to the
file are copied to the master store automagically; that it is intelligent
enough to invalidate local copies only when the master copy got changed;
that it distributes write-locks to ensure edits can occur only via 1
server at a time; and that every user will always get the latest version,
regardless of where/when it was last edited.

--
Joost

> > > Joost
> > James

[-- Attachment #2: Type: text/html, Size: 23291 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread
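To put the bandwidth constraint above into rough numbers, here is a
minimal back-of-the-envelope sketch in Python. It assumes only the
figures quoted in this thread (8mbps down, 500kbps up, 7TB of data) and
ignores protocol overhead, so real transfer times would be somewhat
worse:

    def transfer_time_hours(size_bytes, link_bits_per_sec):
        # hours needed to move size_bytes over a link of the given raw speed
        return size_bytes * 8 / link_bits_per_sec / 3600.0

    GB = 1024 ** 3
    TB = 1024 ** 4

    # one 2 GB file pulled down to the remote office at 8 Mbit/s
    print("2 GB down: %.1f hours" % transfer_time_hours(2 * GB, 8e6))      # ~0.6 h
    # the same file pushed back up at 500 kbit/s
    print("2 GB up:   %.1f hours" % transfer_time_hours(2 * GB, 500e3))    # ~9.5 h
    # a full mirror of the 7 TB dataset over the 8 Mbit/s downlink
    print("7 TB down: %.0f days" % (transfer_time_hours(7 * TB, 8e6) / 24))  # ~89 days

A full mirror over such a link would take months, which is why a
cache-on-demand approach (fetch only the ~5% that is actually needed) is
the only realistic option for the remote office.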
* Re: [gentoo-user] Re: File system testing 2014-09-17 19:34 ` J. Roeleveld @ 2014-09-17 20:20 ` Alec Ten Harmsel 2014-09-17 20:56 ` James ` (2 more replies) 0 siblings, 3 replies; 25+ messages in thread From: Alec Ten Harmsel @ 2014-09-17 20:20 UTC (permalink / raw To: gentoo-user As far as HDFS goes, I would only set that up if you will use it for Hadoop or related tools. It's highly specific, and the performance is not good unless you're doing a massively parallel read (what it was designed for). I can elaborate why if anyone is actually interested. We use Lustre for our high performance general storage. I don't have any numbers, but I'm pretty sure it is *really* fast (10Gbit/s over IB sounds familiar, but don't quote me on that). > > Personally, I would read up on these and see how they work. Then, > based on that, decide if they are likely to assist in the specific > situation you are interested in. > Always good advice. Alec ^ permalink raw reply [flat|nested] 25+ messages in thread
* [gentoo-user] Re: File system testing
  2014-09-17 20:20 ` Alec Ten Harmsel
@ 2014-09-17 20:56 ` James
  2014-09-18  8:24 ` J. Roeleveld
  0 siblings, 1 reply; 25+ messages in thread
From: James @ 2014-09-17 20:56 UTC (permalink / raw
To: gentoo-user

Alec Ten Harmsel <alec <at> alectenharmsel.com> writes:

> As far as HDFS goes, I would only set that up if you will use it for
> Hadoop or related tools. It's highly specific, and the performance is
> not good unless you're doing a massively parallel read (what it was
> designed for). I can elaborate why if anyone is actually interested.

Actually, from my research and my goal (one really big scientific
simulation running constantly), many folks are recommending to skip
Hadoop/HDFS altogether and go straight to Mesos/Spark. RDD (in-memory)
cluster calculations are at the heart of my needs. The opposite end of
the spectrum, loads of small files and small apps, I dunno about, but
I'm all ears.

In the end, my (3) node scientific cluster will morph and support
the typical myriad of networked applications, but I can take
a few years to figure that out, or just copy what smart guys like
you and Joost do.....

> We use Lustre for our high performance general storage. I don't have any
> numbers, but I'm pretty sure it is *really* fast (10Gbit/s over IB
> sounds familiar, but don't quote me on that).

At UMich, you guys should test the FhGFS/btrfs combo. The folks
at UCI swear by it, although they are only publishing a wee bit
(you know, water cooler gossip)...... Surely the Wolverines do not
want those Californians getting up on them?

Are you guys planning a Mesos/Spark test?

> > Personally, I would read up on these and see how they work. Then,
> > based on that, decide if they are likely to assist in the specific
> > situation you are interested in.

It's a ton of reading. It's not apples-to-apple_cider type of reading.
My head hurts.....

I'm leaning to these DFS/LFS combinations (2):
Lustre/btrfs and FhGFS/btrfs

Thoughts/comments?

James

^ permalink raw reply	[flat|nested] 25+ messages in thread
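Since Mesos/Spark keeps coming up, a minimal PySpark sketch of the kind
of in-memory RDD computation being discussed may be useful here. This is
a generic Monte Carlo example, not the actual simulation; it assumes a
working Spark installation, and the "local[*]" master URL is only for a
single-machine smoke test (on the 3-node cluster it would point at the
Mesos master instead):

    import random
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "rdd-smoke-test")

    def inside(_):
        # one random point in the unit square; 1 if it lands inside the quarter circle
        x, y = random.random(), random.random()
        return 1 if x * x + y * y < 1.0 else 0

    n = 10 * 1000 * 1000
    # parallelize() spreads the samples over the workers; cache() keeps the
    # RDD in memory so repeated passes never touch the disk
    samples = sc.parallelize(range(n), 64).cache()
    count = samples.map(inside).reduce(lambda a, b: a + b)
    print("pi is roughly %f" % (4.0 * count / n))

    sc.stop()

The point of the RDD model is exactly what is described above: the
working set lives in RAM across the cluster, and lost partitions can be
recomputed from their lineage (at some computational cost) rather than
re-read from a distributed filesystem.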
* Re: [gentoo-user] Re: File system testing 2014-09-17 20:56 ` James @ 2014-09-18 8:24 ` J. Roeleveld 2014-09-18 9:48 ` Rich Freeman 2014-09-19 13:41 ` James 0 siblings, 2 replies; 25+ messages in thread From: J. Roeleveld @ 2014-09-18 8:24 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 3874 bytes --] On Wednesday, September 17, 2014 08:56:28 PM James wrote: > Alec Ten Harmsel <alec <at> alectenharmsel.com> writes: > > As far as HDFS goes, I would only set that up if you will use it for > > Hadoop or related tools. It's highly specific, and the performance is > > not good unless you're doing a massively parallel read (what it was > > designed for). I can elaborate why if anyone is actually interested. > > Acutally, from my research and my goal (one really big scientific simulation > running constantly). Out of curiosity, what do you want to simulate? > Many folks are recommending to skip Hadoop/HDFS all > together I agree, Hadoop/HDFS is for data analysis. Like building a profile about people based on the information companies like Facebook, Google, NSA, Walmart, Governments, Banks,.... collect about their customers/users/citizens/slaves/.... > and go straight to mesos/spark. RDD (in-memory) cluster > calculations are at the heart of my needs. The opposite end of the > spectrum, loads of small files and small apps; I dunno about, but, I'm all > ears. > In the end, my (3) node scientific cluster will morph and support > the typical myriad of networked applications, but I can take > a few years to figure that out, or just copy what smart guys like > you and joost do..... Nope, I'm simply following what you do and provide suggestions where I can. Most of the clusters and distributed computing stuff I do is based on adding machines to distribute the load. But the mechanisms for these are implemented in the applications I work with, not what I design underneath. The filesystems I am interested in are different to the ones you want. I need to provided access to software installation files to a VM server and access to documentation which is created by the users. The VM server is physically next to what I already mentioned as server A. Access to the VM from the remote site will be using remote desktop connections. But to allow faster and easier access to the documentation, I need a server B at the remote site which functions as described. AFS might be suitable, but I need to be able to layer Samba on top of that to allow a seamless operation. I don't want the laptops to have their own cache and then having to figure out how to solve the multiple different changes to documents containing layouts. (MS Word and OpenDocument files) > > We use Lustre for our high performance general storage. I don't have any > > numbers, but I'm pretty sure it is *really* fast (10Gbit/s over IB > > sounds familiar, but don't quote me on that). > > AT Umich, you guys should test the FhGFS/btrfs combo. The folks > at UCI swear about it, although they are only publishing a wee bit. > (you know, water cooler gossip)...... Surely the Wolverines do not > want those californians getting up on them? > > Are you guys planning a mesos/spark test? > > > > Personally, I would read up on these and see how they work. Then, > > > based on that, decide if they are likely to assist in the specific > > > situation you are interested in. > > It's a ton of reading. It's not apples-to-apple_cider type of reading. > My head hurts..... Take a walk outside. 
Clear air should help you with the headaches :P > I'm leaning to DFS/LFS > > (2) Luster/btrfs and FhGFS/btrfs > > Thoughts/comments? I have insufficient knowledge to advise on either of these. One question, why BTRFS instead of ZFS? My current understanding is: - ZFS is production ready, but due to licensing issues, not included in the kernel - BTRFS is included, but not yet production ready with all planned features For me, Raid6-like functionality is an absolute requirement and latest I know is that that isn't implemented in BTRFS yet. Does anyone know when that will be implemented and reliable? Eg. what time-frame are we talking about? -- Joost [-- Attachment #2: Type: text/html, Size: 14985 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] Re: File system testing 2014-09-18 8:24 ` J. Roeleveld @ 2014-09-18 9:48 ` Rich Freeman 2014-09-18 10:22 ` J. Roeleveld 2014-09-19 13:41 ` James 1 sibling, 1 reply; 25+ messages in thread From: Rich Freeman @ 2014-09-18 9:48 UTC (permalink / raw To: gentoo-user The HTML...it hurts my eyes... :) On Thu, Sep 18, 2014 at 4:24 AM, J. Roeleveld <joost@antarean.org> wrote: > > On Wednesday, September 17, 2014 08:56:28 PM James wrote: > >> Alec Ten Harmsel <alec <at> alectenharmsel.com> writes: > >> > As far as HDFS goes, I would only set that up if you will use it for >> > Hadoop or related tools. It's highly specific, and the performance is >> > not good unless you're doing a massively parallel read (what it was >> > designed for). I can elaborate why if anyone is actually interested. > FYI - one very big limitation of hdfs is its minimum filesize is something huge like 1MB or something like that. Hadoop was designed to take a REALLY big input file and chunk it up. If you use hdfs to store something like /usr/portage it will turn into the sort of monstrosity that you'd actually need a cluster to store. > > My current understanding is: > > - ZFS is production ready, but due to licensing issues, not included in the > kernel > > - BTRFS is included, but not yet production ready with all planned features > Your understanding of their maturity is fairly accurate. They also aren't 100% moving in the same direction - btrfs aims more to be a general-purpose filesystem replacement especially for smaller systems, and zfs is more focused on the enterprise, so it lacks features like raid reshaping (who needs to add 1 disk to a raid5 when you can just add 5 more disks to your 30 disk storage system). I think btrfs has a bit more hope of being an ext4 replacement some day for both this reason and the licensing issue. That in no way detracts from the usefulness of zfs, especially for larger deployments where the few areas where btrfs is more flexible would probably be looked at as gimmicks (kind of like being able to build your whole OS from source :) ). > For me, Raid6-like functionality is an absolute requirement and latest I > know is that that isn't implemented in BTRFS yet. Does anyone know when that > will be implemented and reliable? Eg. what time-frame are we talking about? > I suspect we're talking months before it is really implemented, and much longer before it is reliable. Right now btrfs can write raid6, but it can't really read it. That is, it operates just fine until you actually lose a disk containing something other than parity, and then it loses access to the data. This code is only in the kernel for development purposes and nobody advocates using it for production. Most of the code in btrfs which is reliable has been around for years, like raid1 support, and obviously it will be years until the raid5/6 code reaches that point. I am using btrfs mainly because once that day comes it will be much easier to migrate to it from btrfs raid1 than from zfs (which has no mechanism for migrating raid levels in-place (that is, within an existing vdev) - you would need to add new drives to the pool, migrate the data, and remove the old drives from the pool, which is nice if you have a big stack of drives and spare sata ports lying around like you would in a SAN). -- Rich ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] Re: File system testing 2014-09-18 9:48 ` Rich Freeman @ 2014-09-18 10:22 ` J. Roeleveld 0 siblings, 0 replies; 25+ messages in thread From: J. Roeleveld @ 2014-09-18 10:22 UTC (permalink / raw To: gentoo-user On Thursday, September 18, 2014 05:48:58 AM Rich Freeman wrote: > The HTML...it hurts my eyes... :) Apologies. > > My current understanding is: > > > > - ZFS is production ready, but due to licensing issues, not included in > > the > > kernel > > > > - BTRFS is included, but not yet production ready with all planned > > features > > Your understanding of their maturity is fairly accurate. They also > aren't 100% moving in the same direction - btrfs aims more to be a > general-purpose filesystem replacement especially for smaller systems, > and zfs is more focused on the enterprise, so it lacks features like > raid reshaping (who needs to add 1 disk to a raid5 when you can just > add 5 more disks to your 30 disk storage system). Thank you for this info. I wasn't aware of this difference in 'design'. Sounds like ZFS will be more suited for me then. > I think btrfs has a bit more hope of being an ext4 replacement some > day for both this reason and the licensing issue. That in no way > detracts from the usefulness of zfs, especially for larger deployments > where the few areas where btrfs is more flexible would probably be > looked at as gimmicks (kind of like being able to build your whole OS > from source :) ). Next time I am rebuilding the desktops, I will likely switch them to BTRFS. Sounds like BTRFS will be more suited there. > > For me, Raid6-like functionality is an absolute requirement and latest I > > know is that that isn't implemented in BTRFS yet. Does anyone know when > > that will be implemented and reliable? Eg. what time-frame are we talking > > about? > I suspect we're talking months before it is really implemented, and > much longer before it is reliable. Right now btrfs can write raid6, > but it can't really read it. That is, it operates just fine until you > actually lose a disk containing something other than parity, and then > it loses access to the data. This code is only in the kernel for > development purposes and nobody advocates using it for production. > Most of the code in btrfs which is reliable has been around for years, > like raid1 support, and obviously it will be years until the raid5/6 > code reaches that point. I am using btrfs mainly because once that > day comes it will be much easier to migrate to it from btrfs raid1 > than from zfs (which has no mechanism for migrating raid levels > in-place (that is, within an existing vdev) - you would need to add > new drives to the pool, migrate the data, and remove the old drives > from the pool, which is nice if you have a big stack of drives and > spare sata ports lying around like you would in a SAN). Exactly, although I prefer not to change the filesystem on a live system anytime soon. When it comes to redoing the filesystem like that, restoring from backups will be the fastest solution. -- Joost ^ permalink raw reply [flat|nested] 25+ messages in thread
* [gentoo-user] Re: File system testing
  2014-09-18  8:24 ` J. Roeleveld
  2014-09-18  9:48 ` Rich Freeman
@ 2014-09-19 13:41 ` James
  2014-09-19 14:56 ` Rich Freeman
  2014-09-19 15:02 ` J. Roeleveld
  1 sibling, 2 replies; 25+ messages in thread
From: James @ 2014-09-19 13:41 UTC (permalink / raw
To: gentoo-user

J. Roeleveld <joost <at> antarean.org> writes:

> Out of curiosity, what do you want to simulate?

Subsurface flows in porous media, AKA carbon sequestration by injection
wells. You know, providing proof that those that remove hydrocarbons
actually put the CO2 back and significantly mitigate the effects of
their ventures.

It's like this. I have been struggling with my 17-year-old "genius" son,
who is a year away from entering medical school, with learning
responsibility. So I got him a hyperactive, highly intelligent
(mix-Doberman) puppy to nurture, raise, train, love and be responsible
for. It's one genius pup teaching another pup about being responsible.
So goes the earl_bidness.......imho.

> > Many folks are recommending to skip Hadoop/HDFS all together
>
> I agree, Hadoop/HDFS is for data analysis. Like building a profile
> about people based on the information companies like Facebook,
> Google, NSA, Walmart, Governments, Banks,.... collect about their
> customers/users/citizens/slaves/....
>
> > and go straight to mesos/spark. RDD (in-memory) cluster
> > calculations are at the heart of my needs. The opposite end of the
> > spectrum, loads of small files and small apps; I dunno about, but, I'm all
> > ears.
> > In the end, my (3) node scientific cluster will morph and support
> > the typical myriad of networked applications, but I can take
> > a few years to figure that out, or just copy what smart guys like
> > you and joost do.....
>
> Nope, I'm simply following what you do and provide suggestions where I can.
> Most of the clusters and distributed computing stuff I do is based on
> adding machines to distribute the load. But the mechanisms for these are
> implemented in the applications I work with, not what I design underneath.
> The filesystems I am interested in are different to the ones you want.

Maybe. I do not know what I want yet. My vision is very lightweight
workstations running lxqt (small memory footprint) or such, and a
bad_arse cluster for the heavy lifting running on whatever heterogeneous
resources I have. From what I've read, the cluster and the file systems
are all redundant at the cluster level (mesos/spark anyway), regardless
of what any given processor/system is doing. All of Alan's fantasies
(needs) can be realized once the cluster stuff is mastered. (chronos,
ansible etc etc)

> I need to provided access to software installation files to a VM server
> and access to documentation which is created by the users. The
> VM server is physically next to what I already mentioned as server A.
> Access to the VM from the remote site will be using remote desktop
> connections. But to allow faster and easier access to the
> documentation, I need a server B at the remote site which functions as
> described. AFS might be suitable, but I need to be able to layer Samba
> on top of that to allow a seamless operation.
> I don't want the laptops to have their own cache and then having to
> figure out how to solve the multiple different changes to documents
> containing layouts. (MS Word and OpenDocument files)

OK, so your customers (hyperactive problem users) interface to your
cluster to do their work. When finished, you write things out to other
servers with all of the VM servers.
Lots of really cool tools are emerging in the cluster space.
I think these folks have mesos + spark + samba + nfs all in one box. [1]
Build rather than purchase? We have to figure out what you and Alan
need, on a cluster, because it is what most folks need/want. It's the
admin_advantage part of the cluster. (There are also the Big Science
(me) and web-centric needs. Right now they are related projects, but
things will coalesce, imho.) There is even "Spark_sql" for postgres
admins [2].

[1] http://www.quantaqct.com/en/01_product/02_detail.php?mid=29&sid=162&id=163&qs=102
[2] https://spark.apache.org/sql/

> > > We use Lustre for our high performance general storage. I don't
> > > have any numbers, but I'm pretty sure it is *really* fast (10Gbit/s
> > > over IB sounds familiar, but don't quote me on that).
> >
> > At UMich, you guys should test the FhGFS/btrfs combo. The folks
> > at UCI swear by it, although they are only publishing a wee bit
> > (you know, water cooler gossip)...... Surely the Wolverines do not
> > want those Californians getting up on them?
> >
> > Are you guys planning a Mesos/Spark test?
> >
> > > > Personally, I would read up on these and see how they work. Then,
> > > > based on that, decide if they are likely to assist in the specific
> > > > situation you are interested in.
> >
> > It's a ton of reading. It's not apples-to-apple_cider type of reading.
> > My head hurts.....
>
> Take a walk outside. Clear air should help you with the headaches :P

Basketball, Boobs and Bourbon used to work quite well. Now it's mostly
basketball, but I'm working on someone "very cute"......

> > I'm leaning to these DFS/LFS combinations (2):
> > Lustre/btrfs and FhGFS/btrfs
>
> I have insufficient knowledge to advise on either of these.
> One question, why BTRFS instead of ZFS?

I think btrfs has tremendous potential. I tried ZFS a few times,
but the installs are not part of Gentoo, so they got borked; uEFI,
GRUB, UUIDs, etc. were also in the mix. That was almost a year ago.
For whatever reason, the clustering folks I have read and communicated
with are using ext4, xfs and btrfs. Prolly mostly because those are the
ones mostly used in their (systemd-inspired) distros....?

> My current understanding is: - ZFS is production ready, but due to
> licensing issues, not included in the kernel - BTRFS is included, but
> not yet production ready with all planned features.

Yep, the license issue with ZFS is a real killer for me. Besides,
as an old state-machine C hack, anything with B-trees is fabulous.
Prejudices? Yep, but here I'm sticking with my gut. Multi-port
RAM can do marvelous things with B-tree data structures. The
rest will become available/stable. Simply, I just trust btrfs, in
my gut.

> For me, Raid6-like functionality is an absolute requirement and latest I
> know is that that isn't implemented in BTRFS yet. Does anyone know when
> that will be implemented and reliable? Eg. what time-frame are we
> talking about?

Now we are "communicating"! We have different visions. I want cheap,
mirrored HDs on a small number of processors (less than 16 for now).
I want max RAM of the highest performance possible. I want my redundancy
in my cluster, with my cluster software deciding when/where/how often
to write out to HD. If the max RAM is not enough, then SSD will sit
between the RAM and HD. Also, know this: the GPU will be assimilated
into the processors, just like the FPUs were, some decades ago. Remember
the i386 and the i387 math coprocessor chip? The good folks at OpenGL,
gcc (GNU) and others will soon (eventually?)
give us compilers to automagically use the GPU (and all of that
blazingly fast RAM therein), as slave to Alan's admin authority (some
bullshit like that).

So, my "Epiphany" is this. The bitches at systemd are to be renamed
"StripperD", as they will manage the boot cycle (how fast you can
go down (save power) and come back up (online)). The cluster will rule
off of your hardware, like a "Sheikh" with "the ring that rules them
all", and be the driver of the garbage collection processes. The cluster
will be like the "knights of the round table"; each node helping, and
standing in for those other nodes (nobles) that stumble, always with
extra resources, triple/quad redundancy, solving problems before that
kernel-based "piece of" has a chance to do anything other than "go down"
or "come up" online.

We shall see just who the master is of my hardware!
The saddest thing for me is that when I ranted about billion-dollar
companies corrupting the kernel development process, I did not even have
those {hat-wearing losers} in mind. They are irrelevant. I was thinking
about those semiconductor companies. You know, the ones that accept
billions of dollars from the NSA and private spooks to embed hardware
inside of hardware. The ones that can use "white noise" as a
communications channel. The ones that can tap a fiber optic cable, with
penetration. Those are the ones to focus on. Not a bunch of
"silly boyz"......

My new K_main{} has highlighted a path to neuter systemd.
But I do like how StripperD moves up and down, very quickly.

Cool huh?
It's PARTY TIME!

> Joost

James

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [gentoo-user] Re: File system testing 2014-09-19 13:41 ` James @ 2014-09-19 14:56 ` Rich Freeman 2014-09-19 15:06 ` J. Roeleveld 2014-09-19 15:02 ` J. Roeleveld 1 sibling, 1 reply; 25+ messages in thread From: Rich Freeman @ 2014-09-19 14:56 UTC (permalink / raw To: gentoo-user On Fri, Sep 19, 2014 at 9:41 AM, James <wireless@tampabay.rr.com> wrote: > > I think btrfs has tremendous potential. I tried ZFS a few times, > but the installs are not part of gentoo, so they got borked > uEFI, grubs to uuids, etc etc also were in the mix. That was almost > a year ago. For what ever reason the clustering folks I have > read and communicated with are using ext4, xfs and btrfs. Prolly > mostly because those are mostly used in their (systemd) inspired) > distros....? I do think that btrfs in the long-term is more likely to be mainstream on linux, but I wouldn't be surprised if getting zfs working on Gentoo is much easier now. Richard Yao is both a Gentoo dev and significant zfs on linux contributor, so I suspect he is doing much of the latter on the former. > > Yep. the license issue with ZFS is a real killer for me. Besides, > as an old state-machine, C hack, anything with B-tree is fabulous. > Prejudices? Yep, but here, I'm sticking with my gut. Multi port > ram can do mavelous things with Btree data structures. The > rest will become available/stable. Simply, I just trust btrfs, in > my gut. I don't know enough about zfs to compare them, but the design of btrfs has a certain amount of beauty/symmetry/etc to it IMHO. I only have studied it enough to be dangerous and give some intro talks to my LUG, but just about everything is stored in b-trees, the design allows both fixed and non-fixed length nodes within the trees, and just about everything about the filesystem is dynamic other than the superblocks, which do little more than ID the filesystem and point to the current tree roots. The important stuff is all replicated and versioned. I wouldn't be surprised if it shared many of these design features with other modern filesystems, and I do not profess to be an expert on modern filesystem design, so I won't make any claims about btrfs being better/worse than other filesystems in this regard. However, I would say that anybody interested in data structures would do well to study it. -- Rich ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] Re: File system testing 2014-09-19 14:56 ` Rich Freeman @ 2014-09-19 15:06 ` J. Roeleveld 0 siblings, 0 replies; 25+ messages in thread From: J. Roeleveld @ 2014-09-19 15:06 UTC (permalink / raw To: gentoo-user On Friday, September 19, 2014 10:56:59 AM Rich Freeman wrote: > On Fri, Sep 19, 2014 at 9:41 AM, James <wireless@tampabay.rr.com> wrote: > > I think btrfs has tremendous potential. I tried ZFS a few times, > > but the installs are not part of gentoo, so they got borked > > uEFI, grubs to uuids, etc etc also were in the mix. That was almost > > a year ago. For what ever reason the clustering folks I have > > read and communicated with are using ext4, xfs and btrfs. Prolly > > mostly because those are mostly used in their (systemd) inspired) > > distros....? > > I do think that btrfs in the long-term is more likely to be mainstream > on linux, but I wouldn't be surprised if getting zfs working on Gentoo > is much easier now. Richard Yao is both a Gentoo dev and significant > zfs on linux contributor, so I suspect he is doing much of the latter > on the former. Don't have the link handy, but there is an howto about it that, when followed, will give a ZFS pool running on Gentoo in a very short time. (emerge zfs is the longest part of the whole thing) Not even needed to reboot. > > Yep. the license issue with ZFS is a real killer for me. Besides, > > as an old state-machine, C hack, anything with B-tree is fabulous. > > Prejudices? Yep, but here, I'm sticking with my gut. Multi port > > ram can do mavelous things with Btree data structures. The > > rest will become available/stable. Simply, I just trust btrfs, in > > my gut. > > I don't know enough about zfs to compare them, but the design of btrfs > has a certain amount of beauty/symmetry/etc to it IMHO. I only have > studied it enough to be dangerous and give some intro talks to my LUG, > but just about everything is stored in b-trees, the design allows both > fixed and non-fixed length nodes within the trees, and just about > everything about the filesystem is dynamic other than the superblocks, > which do little more than ID the filesystem and point to the current > tree roots. The important stuff is all replicated and versioned. > > I wouldn't be surprised if it shared many of these design features > with other modern filesystems, and I do not profess to be an expert on > modern filesystem design, so I won't make any claims about btrfs being > better/worse than other filesystems in this regard. However, I would > say that anybody interested in data structures would do well to study > it. I like the idea of both and hope BTRFS will also come with the raid-6-like features and good support for larger drive counts (I've got 16 available for the filestorage) to make it, for me, a viable alternative to ZFS. -- Joost ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] Re: File system testing 2014-09-19 13:41 ` James 2014-09-19 14:56 ` Rich Freeman @ 2014-09-19 15:02 ` J. Roeleveld 1 sibling, 0 replies; 25+ messages in thread From: J. Roeleveld @ 2014-09-19 15:02 UTC (permalink / raw To: gentoo-user On Friday, September 19, 2014 01:41:26 PM James wrote: > J. Roeleveld <joost <at> antarean.org> writes: > > Out of curiosity, what do you want to simulate? > > subsurface flows in porous medium. AKA carbon sequestration > by injection wells. You know, provide proof that those > that remove hydrocarbons and actuall put the CO2 back > and significantly mitigate the effects of their ventures. Interesting topic. Can't provide advice on that topic. > It's like this. I have been stuggling with my 17 year old "genius" > son who is a year away from entering medical school, with > learning responsibility. So I got him a hyperactive, highly > intelligent (mix-doberman) puppy to nurture, raise, train, love > and be resonsible for. It's one genious pup, teaching another > pup about being responsible. Overactive kids, always fun. I try to keep mine busy without computers and TVs for now. (She's going to be 3 in November) > So goes the earl_bidness.......imho. > > > > Many folks are recommending to skip Hadoop/HDFS all together > > > > I agree, Hadoop/HDFS is for data analysis. Like building a profile > > about people based on the information companies like Facebook, > > Google, NSA, Walmart, Governments, Banks,.... collect about their > > customers/users/citizens/slaves/.... > > > > > and go straight to mesos/spark. RDD (in-memory) cluster > > > calculations are at the heart of my needs. The opposite end of the > > > spectrum, loads of small files and small apps; I dunno about, but, I'm > > > all > > > ears. > > > In the end, my (3) node scientific cluster will morph and support > > > the typical myriad of networked applications, but I can take > > > a few years to figure that out, or just copy what smart guys like > > > you and joost do..... > > > > > > Nope, I'm simply following what you do and provide suggestions where I > > can. > > Most of the clusters and distributed computing stuff I do is based on > > adding machines to distribute the load. But the mechanisms for these are > > implemented in the applications I work with, not what I design underneath. > > The filesystems I am interested in are different to the ones you want. > > Maybe. I do not know what I want yet. My vision is very light weight > workstations running lxqt (small memory footprint) or such, and a bad_arse > cluster for the heavy lifting running on whatever heterogenous resoruces I > have. From what I've read, the cluster and the file systems are all > redundant that the cluster level (mesos/spark anyway) regardless of one any > give processor/system is doing. All of Alans fantasies (needs) can be > realized once the cluster stuff is master. (chronos, ansible etc etc). Alan = your son? or? I would, from the workstation point of view, keep the cluster as a single entity, to keep things easier. A cluster FS for workstation/desktop use is generally not suitable for a High Performance Cluster (HPC) (or vice-versa) > > I need to provided access to software installation files to a VM server > > and access to documentation which is created by the users. The > > VM server is physically next to what I already mentioned as server A. > > Access to the VM from the remote site will be using remote desktop > > connections. 
But to allow faster and easier access to the > > documentation, I need a server B at the remote site which functions as > > described. AFS might be suitable, but I need to be able to layer Samba > > on top of that to allow a seamless operation. > > I don't want the laptops to have their own cache and then having to > > figure out how to solve the multiple different changes to documents > > containing layouts. (MS Word and OpenDocument files). > > Ok so your customers (hperactive problem users) inteface to your cluster > to do their work. When finished you write things out to other servers > with all of the VM servers. Lots of really cool tools are emerging > in the cluster space. Actually, slightly different scenario. Most work is done at customers systems. Occasionally we need to test software versions prior to implementing these at customers. For that, we use VMs. The VM-server we have is currently sufficient for this. When it isn't, we'll need to add a 2nd VMserver. On the NAS, we store: - Documentation about customers + Howto documents on how to best install the software. - Installation files downloaded from vendors (We also deal with older versions that are no longer available. We need to have our own collection to handle that) As we are looking into also working from a different location, we need: - Access to the VM-server (easy, using VPN and Remote Desktops) - Access to the files (I prefer to have a local 'cache' at the remote location) It's the access to files part where I need to have some sort of "distributed" filesystem. > I think these folks have mesos + spark + samba + nfs all in one box. [1] > [1] > http://www.quantaqct.com/en/01_product/02_detail.php?mid=29&sid=162&id=163&q > s=102 Had a quick look, these use MS Windows Storage 2012, this is only failover on the storage side. I don't see anything related to what we are working with. > Build rather than purchase? WE have to figure out what you and Alan need, on > a cluster, because it is what most folks need/want. It the admin_advantage > part of cluster. (There also the Big Science (me) and Web centric needs. > Right now they are realted project, but things will coalesce, imho. There > is even "Spark_sql" for postgres admins [2]. > > > [2] https://spark.apache.org/sql/ Hmm.... that is interesting. > > > > We use Lustre for our high performance general storage. I don't > > > > have any numbers, but I'm pretty sure it is *really* fast (10Gbit/s > > > > over IB sounds familiar, but don't quote me on that). > > > > > > AT Umich, you guys should test the FhGFS/btrfs combo. The folks > > > at UCI swear about it, although they are only publishing a wee bit. > > > (you know, water cooler gossip)...... Surely the Wolverines do not > > > want those californians getting up on them? > > > > > > Are you guys planning a mesos/spark test? > > > > > > > > Personally, I would read up on these and see how they work. Then, > > > > > based on that, decide if they are likely to assist in the specific > > > > > situation you are interested in. > > > > > > It's a ton of reading. It's not apples-to-apple_cider type of reading. > > > My head hurts..... > > > > Take a walk outside. Clear air should help you with the headaches :P > > Basketball, Boobs and Burbon use to work quite well. Now it's mostly > basketball, but I'm working on someone "very cute"...... Cloning? Genetics? Now that I am interested in. I could do with a couple of clones. ;) Btw, there are women who know more about some aspects of IT then you and me put together. 
Some of those even manage to look great as well ;)

> > > I'm leaning to these DFS/LFS combinations (2):
> > > Lustre/btrfs and FhGFS/btrfs
> >
> > I have insufficient knowledge to advise on either of these.
> > One question, why BTRFS instead of ZFS?
>
> I think btrfs has tremendous potential. I tried ZFS a few times,
> but the installs are not part of Gentoo, so they got borked; uEFI,
> GRUB, UUIDs, etc. were also in the mix. That was almost a year ago.

I did a quick test with Gentoo and ZFS. With the current documentation
and ebuilds, it actually is quite simple to get to use, provided you
don't intend to use it for the root filesystem.

> For whatever reason, the clustering folks I have read and communicated
> with are using ext4, xfs and btrfs. Prolly mostly because those are the
> ones mostly used in their (systemd-inspired) distros....?

I think mostly because they are included natively in the kernel, and
when dealing with HPC, you don't want to use a filesystem that is known
to eat memory for breakfast.
When I switch the NAS over to ZFS, I will be using a dedicated machine
with 16GB of memory. Probably going to increase that to 32GB not too
long after.

> > My current understanding is: - ZFS is production ready, but due to
> > licensing issues, not included in the kernel - BTRFS is included, but
> > not yet production ready with all planned features.
>
> Yep, the license issue with ZFS is a real killer for me. Besides,
> as an old state-machine C hack, anything with B-trees is fabulous.
> Prejudices? Yep, but here I'm sticking with my gut. Multi-port
> RAM can do marvelous things with B-tree data structures. The
> rest will become available/stable. Simply, I just trust btrfs, in
> my gut.

I think both are stable and usable, with the limitations I currently
see, as confirmed by Rich.

> > For me, Raid6-like functionality is an absolute requirement and latest I
> > know is that that isn't implemented in BTRFS yet. Does anyone know when
> > that will be implemented and reliable? Eg. what time-frame are we
> > talking about?
>
> Now we are "communicating"! We have different visions. I want cheap,
> mirrored HDs on a small number of processors (less than 16 for now).
> I want max RAM of the highest performance possible. I want my redundancy
> in my cluster, with my cluster software deciding when/where/how often
> to write out to HD. If the max RAM is not enough, then SSD will sit
> between the RAM and HD. Also, know this: the GPU will be assimilated
> into the processors, just like the FPUs were, some decades ago. Remember
> the i386 and the i387 math coprocessor chip? The good folks at OpenGL,
> gcc (GNU) and others will soon (eventually?)
> give us compilers to automagically use the GPU (and all of that
> blazingly fast RAM therein), as slave to Alan's admin authority (some
> bullshit like that).

Yep, and for HPC and VMs, you want to keep as much memory available for
what matters. For a file storage cluster, memory is there to assist the
serving of files. (As that is what matters there)

> So, my "Epiphany" is this. The bitches at systemd are to be renamed
> "StripperD", as they will manage the boot cycle (how fast you can
> go down (save power) and come back up (online)). The cluster will rule
> off of your hardware, like a "Sheikh" with "the ring that rules them
> all", and be the driver of the garbage collection processes.

Aargh, garbage collectors... They tend to spring into action when least
convenient... Try to be able to control when they start cleaning.
> The cluster will be like the "knights of the round table"; each node
> helping, and standing in for those other nodes (nobles) that stumble,
> always with extra resources, triple/quad redundancy, solving problems
> before that kernel-based "piece of" has a chance to do anything other
> than "go down" or "come up" online.

Interesting, need to parse this slowly over the weekend.

> We shall see just who the master is of my hardware!
> The saddest thing for me is that when I ranted about billion-dollar
> companies corrupting the kernel development process, I did not even have
> those {hat-wearing losers} in mind. They are irrelevant. I was thinking
> about those semiconductor companies. You know, the ones that accept
> billions of dollars from the NSA and private spooks to embed hardware
> inside of hardware. The ones that can use "white noise" as a
> communications channel. The ones that can tap a fiber optic cable, with
> penetration. Those are the ones to focus on. Not a bunch of
> "silly boyz"......

For that, you need to keep the important sensitive data off the grid.

> My new K_main{} has highlighted a path to neuter systemd.
> But I do like how StripperD moves up and down, very quickly.

I don't care about boot times or shutdown times. If I did, I'd invest in
high-speed RAM disks and SSDs. Having 50 of the fastest SSDs in a Raid-0
config will give more data than the rest of the system can handle ;)
If you then use that for VMs which can also keep the entire virtual disk
in memory, you really are flying with performance.
That's why in-memory systems are becoming popular again.

> Cool huh?
> It's PARTY TIME!

Parties are nice...

--
Joost

^ permalink raw reply	[flat|nested] 25+ messages in thread
* Re: [gentoo-user] Re: File system testing
  2014-09-17 20:20 ` Alec Ten Harmsel
  2014-09-17 20:56 ` James
@ 2014-09-18 8:04 ` J. Roeleveld
  2014-09-18 9:17 ` Kerin Millar
  2 siblings, 0 replies; 25+ messages in thread
From: J. Roeleveld @ 2014-09-18 8:04 UTC (permalink / raw
To: gentoo-user

[-- Attachment #1: Type: text/plain, Size: 1657 bytes --]

On Wednesday, September 17, 2014 04:20:24 PM Alec Ten Harmsel wrote:
> As far as HDFS goes, I would only set that up if you will use it for
> Hadoop or related tools. It's highly specific, and the performance is
> not good unless you're doing a massively parallel read (what it was
> designed for). I can elaborate why if anyone is actually interested.
>
> We use Lustre for our high performance general storage. I don't have any
> numbers, but I'm pretty sure it is *really* fast (10Gbit/s over IB
> sounds familiar, but don't quote me on that).

I think any shared filesystem will be fast if you have a lot of
bandwidth :)
When comparing network filesystems it makes sense to keep the hardware
identical and express the overhead as a percentage.
E.g. what is the theoretical maximum speed for the network used
(10Gbit/s), and what is the actual maximum speed you get with:
1) a single really large file (200GB)
2) a lot (100,000) of smaller files (2MB)
Then you can make an estimate of what to expect when using a 1Gbit/s
network.

I somehow don't expect James to have InfiniBand available for his
research?
Personally, when choosing between InfiniBand and Ethernet, I'm tempted
to go with dedicated bonded 10Gbit/s links because of the price
difference. (A quick bit of research shows me that InfiniBand is about
3x as expensive for the same throughput)

> > Personally, I would read up on these and see how they work. Then,
> > based on that, decide if they are likely to assist in the specific
> > situation you are interested in.
>
> Always good advice.

It saves time to do some simple research (the reading type) before
actually doing tests.

--
Joost

[-- Attachment #2: Type: text/html, Size: 6197 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread
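A small sketch of the percentage calculation described above, in Python.
The measured throughput numbers here are made-up placeholders, only
meant to show how an efficiency figure measured on a 10Gbit/s test
network could be carried over to a 1Gbit/s estimate:

    def overhead_pct(theoretical_bps, measured_bps):
        # overhead as a percentage of the theoretical link speed
        return 100.0 * (1.0 - measured_bps / theoretical_bps)

    link_10g = 10e9                      # 10 Gbit/s test network
    measurements = {
        "one 200GB file":      7.2e9,    # hypothetical measured throughput
        "100,000 x 2MB files": 1.1e9,    # hypothetical measured throughput
    }

    for label, measured in measurements.items():
        efficiency = measured / link_10g
        est_on_1g = efficiency * 1e9     # rough expectation on a 1 Gbit/s link
        print("%-20s %3.0f%% overhead, ~%3.0f Mbit/s expected on 1 Gbit/s"
              % (label, overhead_pct(link_10g, measured), est_on_1g / 1e6))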
* Re: [gentoo-user] Re: File system testing 2014-09-17 20:20 ` Alec Ten Harmsel 2014-09-17 20:56 ` James 2014-09-18 8:04 ` J. Roeleveld @ 2014-09-18 9:17 ` Kerin Millar 2014-09-18 13:12 ` Alec Ten Harmsel 2 siblings, 1 reply; 25+ messages in thread From: Kerin Millar @ 2014-09-18 9:17 UTC (permalink / raw To: gentoo-user On 17/09/2014 21:20, Alec Ten Harmsel wrote: > As far as HDFS goes, I would only set that up if you will use it for > Hadoop or related tools. It's highly specific, and the performance is > not good unless you're doing a massively parallel read (what it was > designed for). I can elaborate why if anyone is actually interested. I, for one, am very interested. --Kerin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] Re: File system testing 2014-09-18 9:17 ` Kerin Millar @ 2014-09-18 13:12 ` Alec Ten Harmsel 2014-09-19 15:21 ` Kerin Millar 0 siblings, 1 reply; 25+ messages in thread From: Alec Ten Harmsel @ 2014-09-18 13:12 UTC (permalink / raw To: gentoo-user On 09/18/2014 05:17 AM, Kerin Millar wrote: > On 17/09/2014 21:20, Alec Ten Harmsel wrote: >> As far as HDFS goes, I would only set that up if you will use it for >> Hadoop or related tools. It's highly specific, and the performance is >> not good unless you're doing a massively parallel read (what it was >> designed for). I can elaborate why if anyone is actually interested. > > I, for one, am very interested. > > --Kerin > Alright, here goes: Rich Freeman wrote: > FYI - one very big limitation of hdfs is its minimum filesize is > something huge like 1MB or something like that. Hadoop was designed > to take a REALLY big input file and chunk it up. If you use hdfs to > store something like /usr/portage it will turn into the sort of > monstrosity that you'd actually need a cluster to store. This is exactly correct, except we run with a block size of 128MB, and a large cluster will typically have a block size of 256MB or even 512MB. HDFS has two main components: a NameNode, which keeps track of which blocks are a part of which file (in memory), and the DataNodes that actually store the blocks. No data ever flows through the NameNode; it negotiates transfers between the client and DataNodes and negotiates transfers for jobs. Since the NameNode stores metadata in-memory, small files are bad because RAM gets wasted. What exactly is Hadoop/HDFS used for? The most common uses are generating search indices on data (which is a batch job) and doing non-realtime processing of log streams and/or data streams (another batch job) and allowing a large number of analysts to run disparate queries on the same large dataset (another batch job). Batch processing - processing the entire dataset - is really where Hadoop shines. When you put a file into HDFS, it gets split based on the block size. This is done so that a parallel read will be really fast - each map task reads in a single block and processes it. Ergo, if you put in a 1GB file with a 128MB block size and run a MapReduce job, 8 map tasks will be launched. If you put in a 1TB file, 8192 tasks would be launched. Tuning the block size is important to optimize the overhead of launching tasks vs. potentially under-utilizing a cluster. Typically, a cluster with a lot of data has a bigger block size. The downsides of HDFS: * Seeked reads are not supported afaik because no one needs that for batch processing * Seeked writes into an existing file are not supported because either blocks would be added in the middle of a file and wouldn't be 128MB, or existing blocks would be edited, resulting in blocks larger than 128MB. Both of these scenarios are bad. Since HDFS users typically do not need seeked reads or seeked writes, these downsides aren't really a big deal. If something's not clear, let me know. Alec ^ permalink raw reply [flat|nested] 25+ messages in thread
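The block arithmetic above is easy to sanity-check; a minimal Python sketch, where the ~150 bytes of NameNode heap per file or block object is a commonly quoted ballpark figure and an assumption here, not a number taken from this thread:

    import math

    BLOCK = 128 * 1024**2                      # 128MB block size, as above

    def map_tasks(file_size, block=BLOCK):
        # One map task is launched per block of the input file.
        return math.ceil(file_size / block)

    print(map_tasks(1 * 1024**3))              # 1GB file  ->  8 tasks
    print(map_tasks(1 * 1024**4))              # 1TB file  ->  8192 tasks

    def namenode_heap(num_files, blocks_per_file=1, bytes_per_object=150):
        # The NameNode keeps one in-memory record per file and per block,
        # which is why millions of tiny files waste RAM without ever
        # filling their blocks.
        return num_files * (1 + blocks_per_file) * bytes_per_object

    print(namenode_heap(10 * 10**6))           # ten million small files -> ~3GB of heap

The same arithmetic shows why a /usr/portage-style tree of tiny files is a poor fit: every file still costs metadata records while filling only a fraction of a single block.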
* Re: [gentoo-user] Re: File system testing 2014-09-18 13:12 ` Alec Ten Harmsel @ 2014-09-19 15:21 ` Kerin Millar 0 siblings, 0 replies; 25+ messages in thread From: Kerin Millar @ 2014-09-19 15:21 UTC (permalink / raw To: gentoo-user On 18/09/2014 14:12, Alec Ten Harmsel wrote: > > On 09/18/2014 05:17 AM, Kerin Millar wrote: >> On 17/09/2014 21:20, Alec Ten Harmsel wrote: >>> As far as HDFS goes, I would only set that up if you will use it for >>> Hadoop or related tools. It's highly specific, and the performance is >>> not good unless you're doing a massively parallel read (what it was >>> designed for). I can elaborate why if anyone is actually interested. >> >> I, for one, am very interested. >> >> --Kerin >> > > Alright, here goes: > > Rich Freeman wrote: > >> FYI - one very big limitation of hdfs is its minimum filesize is >> something huge like 1MB or something like that. Hadoop was designed >> to take a REALLY big input file and chunk it up. If you use hdfs to >> store something like /usr/portage it will turn into the sort of >> monstrosity that you'd actually need a cluster to store. > > This is exactly correct, except we run with a block size of 128MB, and a large cluster will typically have a block size of 256MB or even 512MB. > > HDFS has two main components: a NameNode, which keeps track of which blocks are a part of which file (in memory), and the DataNodes that actually store the blocks. No data ever flows through the NameNode; it negotiates transfers between the client and DataNodes and negotiates transfers for jobs. Since the NameNode stores metadata in-memory, small files are bad because RAM gets wasted. > > What exactly is Hadoop/HDFS used for? The most common uses are generating search indices on data (which is a batch job) and doing non-realtime processing of log streams and/or data streams (another batch job) and allowing a large number of analysts to run disparate queries on the same large dataset (another batch job). Batch processing - processing the entire dataset - is really where Hadoop shines. > > When you put a file into HDFS, it gets split based on the block size. This is done so that a parallel read will be really fast - each map task reads in a single block and processes it. Ergo, if you put in a 1GB file with a 128MB block size and run a MapReduce job, 8 map tasks will be launched. If you put in a 1TB file, 8192 tasks would be launched. Tuning the block size is important to optimize the overhead of launching tasks vs. potentially under-utilizing a cluster. Typically, a cluster with a lot of data has a bigger block size. > > The downsides of HDFS: > * Seeked reads are not supported afaik because no one needs that for batch processing > * Seeked writes into an existing file are not supported because either blocks would be added in the middle of a file and wouldn't be 128MB, or existing blocks would be edited, resulting in blocks larger than 128MB. Both of these scenarios are bad. > > Since HDFS users typically do not need seeked reads or seeked writes, these downsides aren't really a big deal. > > If something's not clear, let me know. Thank you for taking the time to explain. --Kerin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] File system testing 2014-09-16 19:07 [gentoo-user] File system testing James 2014-09-17 7:45 ` J. Roeleveld @ 2014-09-17 18:10 ` Hervé Guillemet 2014-09-17 18:21 ` J. Roeleveld 2014-09-18 15:32 ` [gentoo-user] " James 2014-09-25 20:47 ` [gentoo-user] " thegeezer 2 siblings, 2 replies; 25+ messages in thread From: Hervé Guillemet @ 2014-09-17 18:10 UTC (permalink / raw To: gentoo-user On 16/09/2014 21:07, James wrote: > > By now many are familiar with my keen interest in clustering gentoo > systems. So, what most cluster technologies use is a distributed file > system on top of the local (HD/SDD) file system. Naturally not > all file systems, particularly the distributed file systems, have > straightforward instructions. Also, an device file system, such as > XFS and a distibuted (on top of the device file system) combination > may not work very well when paired. So a variety of testing is > something I'm researching. Eliminiation of either file system > listed below, due to Gentoo User Experience is most welcome information, > as well as tips and tricks to setting up any file system. Hi James, Have you found this document: http://hal.inria.fr/hal-00789086/PDF/a_survey_of_dfs.pdf On a related matter, I'd like to host my own file server on a dedicated box so that I can access my working files from several locations. I'd like it to be fast and secure, and I don't mind if the files are replicated on each workstation. What would be the best tools for this? -- Hervé ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] File system testing 2014-09-17 18:10 ` [gentoo-user] " Hervé Guillemet @ 2014-09-17 18:21 ` J. Roeleveld 2014-09-17 21:05 ` [gentoo-user] " James ` (2 more replies) 2014-09-18 15:32 ` [gentoo-user] " James 1 sibling, 3 replies; 25+ messages in thread From: J. Roeleveld @ 2014-09-17 18:21 UTC (permalink / raw To: gentoo-user On 17 September 2014 20:10:57 CEST, "Hervé Guillemet" <herve@guillemet.org> wrote: >On 16/09/2014 21:07, James wrote: >> >> By now many are familiar with my keen interest in clustering gentoo >> systems. So, what most cluster technologies use is a distributed file >> system on top of the local (HD/SDD) file system. Naturally not >> all file systems, particularly the distributed file systems, have >> straightforward instructions. Also, an device file system, such as >> XFS and a distibuted (on top of the device file system) combination >> may not work very well when paired. So a variety of testing is >> something I'm researching. Eliminiation of either file system >> listed below, due to Gentoo User Experience is most welcome >information, >> as well as tips and tricks to setting up any file system. > >Hi James, > >Have you found this document: > >http://hal.inria.fr/hal-00789086/PDF/a_survey_of_dfs.pdf > >On a related matter, I'd like to host my own file server on a dedicated >box so that I can access my working files from several locations. I'd >like it to be fast and secure, and I don't mind if the files are >replicated on each workstation. What would be the best tools for this >? AFS has caching and can survive temporary disappearance of the server. For me, I need to be able to provide Samba filesharing on top of that layer on 2 different locations as I don't see the network bandwidth to be sufficient for normal operations. (ADSL uplinks tend to be dead slow) -- Joost -- Sent from my Android device with K-9 Mail. Please excuse my brevity. ^ permalink raw reply [flat|nested] 25+ messages in thread
* [gentoo-user] Re: File system testing 2014-09-17 18:21 ` J. Roeleveld @ 2014-09-17 21:05 ` James 2014-09-18 7:29 ` J. Roeleveld 2014-09-18 8:28 ` [gentoo-user] " Kerin Millar 2014-09-25 20:56 ` thegeezer 2 siblings, 1 reply; 25+ messages in thread From: James @ 2014-09-17 21:05 UTC (permalink / raw To: gentoo-user J. Roeleveld <joost <at> antarean.org> writes: > AFS has caching and can survive temporary disappearance of the server. Excellent for low bandwidth connections. Most DFS have mechanisms to deal with transient failures, but not as generous on the time-scale as AFS. I believe, if I recall correctly, these high-latency, low-bandwidth recovery mechanisms were key design parameters baked in during the CMU development cycles of AFS? While attractive for your situation, these features might actually be detrimental to a high-performance distributed cluster's needs for a DFS? > For me, I need to be able to provide Samba filesharing on top of that > layer on 2 different locations as I don't see the network bandwidth to > be sufficient for normal operations. (ADSL uplinks tend to be dead slow) Yea, I'm not going to be testing OpenAFS for my needs, unless I read some compelling published data on its applicability to high-end clusters as a best-choice DFS..... It's probably great for SETI etc etc. James ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] Re: File system testing 2014-09-17 21:05 ` [gentoo-user] " James @ 2014-09-18 7:29 ` J. Roeleveld 0 siblings, 0 replies; 25+ messages in thread From: J. Roeleveld @ 2014-09-18 7:29 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 1672 bytes --] On Wednesday, September 17, 2014 09:05:09 PM James wrote: > J. Roeleveld <joost <at> antarean.org> writes: > > AFS has caching and can survive temporary disappearance of the server. > > Excellent for low bandwidth connections. Most DFS have mechanisms to > deal with transient failures, but not as generous on the time-scale > as AFS. I believe, if I recall correctly, these high-latency, low-bandwidth > recovery mechanisms were key design parameters baked in during the > CMU development cycles of AFS? > > While attractive for your situation, these features might actually > be detrimental to a high-performance distributed cluster's needs for > a DFS? I tend to agree. I'm not sure how up-to-date AFS is, but from re-reading the Wikipedia pages, it sounds like what I need. Provided I can get it to work together with Samba. I need to allow MS Windows laptops access to the files at the remote location. > > For me, I need to be able to provide Samba filesharing on top of that > > layer on 2 different locations as I don't see the network bandwidth to > > be sufficient for normal operations. (ADSL uplinks tend to be dead slow) > > Yea, I'm not going to be testing OpenAFS for my needs, unless I read > some compelling published data on its applicability to high-end > clusters as a best-choice DFS..... I wouldn't either. > It's probably great for SETI etc etc. Doubtful :) Did you see the following Wikipedia page: http://en.wikipedia.org/wiki/List_of_file_systems It contains a nice long list of various distributed, clustered,.... filesystems. I just miss an indication of how well these are still supported and on which OSes these (can) work. -- Joost [-- Attachment #2: Type: text/html, Size: 7704 bytes --] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] File system testing 2014-09-17 18:21 ` J. Roeleveld 2014-09-17 21:05 ` [gentoo-user] " James @ 2014-09-18 8:28 ` Kerin Millar 2014-09-25 20:56 ` thegeezer 2 siblings, 0 replies; 25+ messages in thread From: Kerin Millar @ 2014-09-18 8:28 UTC (permalink / raw To: gentoo-user On 17/09/2014 19:21, J. Roeleveld wrote: > On 17 September 2014 20:10:57 CEST, "Hervé Guillemet" <herve@guillemet.org> wrote: >> On 16/09/2014 21:07, James wrote: >>> >>> By now many are familiar with my keen interest in clustering gentoo >>> systems. So, what most cluster technologies use is a distributed file >>> system on top of the local (HD/SDD) file system. Naturally not >>> all file systems, particularly the distributed file systems, have >>> straightforward instructions. Also, an device file system, such as >>> XFS and a distibuted (on top of the device file system) combination >>> may not work very well when paired. So a variety of testing is >>> something I'm researching. Eliminiation of either file system >>> listed below, due to Gentoo User Experience is most welcome >> information, >>> as well as tips and tricks to setting up any file system. >> >> Hi James, >> >> Have you found this document: >> >> http://hal.inria.fr/hal-00789086/PDF/a_survey_of_dfs.pdf >> >> On a related matter, I'd like to host my own file server on a dedicated >> box so that I can access my working files from several locations. I'd >> like it to be fast and secure, and I don't mind if the files are >> replicated on each workstation. What would be the best tools for this >> ? > > AFS has caching and can survive temporary disappearance of the server. > > For me, I need to be able to provide Samba filesharing on top of that layer on 2 different locations as I don't see the network bandwidth to be sufficient for normal operations. (ADSL uplinks tend to be dead slow) You might try GlusterFS with two replicating bricks. The latest version of Samba in portage includes a VFS plugin that can integrate GlusterFS volumes via GFAPI. --Kerin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] File system testing 2014-09-17 18:21 ` J. Roeleveld 2014-09-17 21:05 ` [gentoo-user] " James 2014-09-18 8:28 ` [gentoo-user] " Kerin Millar @ 2014-09-25 20:56 ` thegeezer 2 siblings, 0 replies; 25+ messages in thread From: thegeezer @ 2014-09-25 20:56 UTC (permalink / raw To: gentoo-user On 17/09/14 19:21, J. Roeleveld wrote: > AFS has caching and can survive temporary disappearance of the server. > For me, I need to be able to provide Samba filesharing on top of that > layer on 2 different locations as I don't see the network bandwidth to > be sufficient for normal operations. (ADSL uplinks tend to be dead > slow) -- Joost Riverbed WAN appliances were always great for this. I would have loved to see an open source version of their hash-zip-send as it worked amazingly well. However, from [1] you can mount.cifs with the fsc option, and perhaps (sorry, not tried myself) then use something like cachefilesd to get a controlled size and location for that cache? Also, [2] might be of interest to you " fsc Enable local disk caching using FS-Cache (off by default). This option could be useful to improve performance on a slow link, heavily loaded server and/or network where reading from the disk is faster than reading from the server (over the network). This could also impact scalability positively as the number of calls to the server are reduced. However, local caching is not suitable for all workloads for e.g. read-once type workloads. So, you need to consider carefully your workload/scenario before using this option. Currently, local disk caching is functional for CIFS files opened as read-only. " [1] https://www.kernel.org/doc/readme/Documentation-filesystems-cifs-README [2] http://www.cyberciti.biz/faq/centos-redhat-install-configure-cachefilesd-for-nfs/ ^ permalink raw reply [flat|nested] 25+ messages in thread
* [gentoo-user] Re: File system testing 2014-09-17 18:10 ` [gentoo-user] " Hervé Guillemet 2014-09-17 18:21 ` J. Roeleveld @ 2014-09-18 15:32 ` James 1 sibling, 0 replies; 25+ messages in thread From: James @ 2014-09-18 15:32 UTC (permalink / raw To: gentoo-user Hervé Guillemet <herve <at> guillemet.org> writes: > > On 16/09/2014 21:07, James wrote: > > > > By now many are familiar with my keen interest in clustering gentoo > > systems. So, what most cluster technologies use is a distributed file > > system on top of the local (HD/SDD) file system. > Have you found this document: > http://hal.inria.fr/hal-00789086/PDF/a_survey_of_dfs.pdf Hello Herve, Yes, I read the document and it is a good introduction to some of my issues on which file system(s) to use for clustering. But it's more of a survey than a comparison/benchmark study, which would be really beneficial. DFS are moving so fast now, and their setups and features are rarely a one-to-one match. For example, (currently) the best load balancing you find is actually in the apps that run above the cluster software. [1] Some of the performance/resource-utilization of the file systems/resources is determined by real-time analytics with graphical displays. I'm not sure that load balancing even belongs in a DFS, yet in the paper you reference, it was prominently discussed. Things are moving so fast in the distributed-*/cluster/cluster-tools/cluster-apps space that one really needs a system set up to apply almost daily patches for testing. I never realized just how much reading is necessary to understand the current landscape in clustering. I'm trying to figure out an ecosystem where gentoo folks can experiment with mesos clustering for scientific applications. After that, the more general case should be mature enough for general purpose applications. I'm avoiding the clustered web arena, as that is just too much for me to digest; so somebody else could champion that part of all of those Apache-cluster technologies. Thanks for the document link! James [1] ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [gentoo-user] File system testing 2014-09-16 19:07 [gentoo-user] File system testing James 2014-09-17 7:45 ` J. Roeleveld 2014-09-17 18:10 ` [gentoo-user] " Hervé Guillemet @ 2014-09-25 20:47 ` thegeezer 2 siblings, 0 replies; 25+ messages in thread From: thegeezer @ 2014-09-25 20:47 UTC (permalink / raw To: gentoo-user On 16/09/14 20:07, James wrote: > Hello, > > By now many are familiar with my keen interest in clustering gentoo > systems. So, what most cluster technologies use is a distributed file > system on top of the local (HD/SDD) file system. Naturally not > all file systems, particularly the distributed file systems, have > straightforward instructions. Also, an device file system, such as > XFS and a distibuted (on top of the device file system) combination > may not work very well when paired. So a variety of testing is > something I'm researching. Eliminiation of either file system > listed below, due to Gentoo User Experience is most welcome information, > as well as tips and tricks to setting up any file system. > > > Distributed File Systems (DFS): > HDFS (poor performance) > Lustre > Ceph > XtreemFS > GlusterFS > MooseFS > FhGFS (BeeGFS) soon to be entirely open sourced? > Any other distributed file systems I should consider using? > > Local (Device) File Systems LFS: > btrfs > zfs > ext4 > xfs > > Obviously I do not what to test all combinations of DFS/LocalFS > so your comments are extremely welcome as is any and all > related information. > > James > > howdy, you might also like to see about GFS2, OCFS and OrangeFS. GFS2 for me was major effort to get going on gentoo, OCFS worked almost out of the box, but is from oracle. in all cases writes were the biggest hurdle for me due to the distributed lock mechanisms ymmv ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2014-09-25 20:57 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-09-16 19:07 [gentoo-user] File system testing James 2014-09-17 7:45 ` J. Roeleveld 2014-09-17 15:55 ` [gentoo-user] " James 2014-09-17 19:34 ` J. Roeleveld 2014-09-17 20:20 ` Alec Ten Harmsel 2014-09-17 20:56 ` James 2014-09-18 8:24 ` J. Roeleveld 2014-09-18 9:48 ` Rich Freeman 2014-09-18 10:22 ` J. Roeleveld 2014-09-19 13:41 ` James 2014-09-19 14:56 ` Rich Freeman 2014-09-19 15:06 ` J. Roeleveld 2014-09-19 15:02 ` J. Roeleveld 2014-09-18 8:04 ` J. Roeleveld 2014-09-18 9:17 ` Kerin Millar 2014-09-18 13:12 ` Alec Ten Harmsel 2014-09-19 15:21 ` Kerin Millar 2014-09-17 18:10 ` [gentoo-user] " Hervé Guillemet 2014-09-17 18:21 ` J. Roeleveld 2014-09-17 21:05 ` [gentoo-user] " James 2014-09-18 7:29 ` J. Roeleveld 2014-09-18 8:28 ` [gentoo-user] " Kerin Millar 2014-09-25 20:56 ` thegeezer 2014-09-18 15:32 ` [gentoo-user] " James 2014-09-25 20:47 ` [gentoo-user] " thegeezer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox