* [gentoo-amd64] Soliciting new RAID ideas
From: Mark Knecht @ 2014-05-27 22:13 UTC
To: Gentoo AMD64

Hi all,
   The list is quiet. Please excuse me waking it up. (Or trying to...) ;-)

   I'm at the point where I'm a few months from running out of disk space on my RAID6, so I'm considering how to move forward. I thought I'd check in here and get any ideas folks have. Thanks in advance.

   The system is 64-bit Gentoo, mostly stable, using an i7-980x Extreme Edition processor with 24GB DRAM. Large chassis, 6 removable HD bays, room for 6 other drives, a large power supply.

   The disk subsystem is a 1.4TB RAID6 built from five SATA2 500GB WD RAID Edition 3 drives. The RAID has not had a single glitch in the 4+ years I've used this machine.

   Generally there are 4 classes of data on the RAID:

1) Gentoo (obviously), configs backed up every weekend. I plan to rebuild from scratch using existing configs if there's a failure. Being down for a couple of days is not an issue.
2) VMs - about 300GB. Loaded every morning, stopped & saved every night, backed up every weekend.
3) Financial data - lots of it - stocks, futures, options, etc. Performance requirements are pretty low. Backed up every weekend.
4) Video files - backed up to a different location than items 1/2/3 whenever there are changes.

   After eclean-dist/eclean-pkg I'm down to about 80GB free, and this will fill up in 3-6 months, so it's time to make some changes.

   My thoughts:

1) Buy three (or even just two) 5400 RPM 3TB WD Red drives and go with RAID1. This would use the internal SATA2 ports, so it wouldn't be the highest performance, but likely a lot better than my SATA2 RAID6.

2) Buy two 7200 RPM 3TB WD Red drives and an LSI Logic hardware RAID controller. This would be SATA3, so probably way more performance than I have now. MUCH more expensive, though.

3) #1 + an SSD. I have an unused 120GB SSD, so I could get another, make a 2-disk RAID1, put Gentoo on that, and everything else on the newer 3TB drives. More complex, probably lower reliability, and I'm not sure I gain much.

   Beyond this I need to talk file system types. I'm fat, dumb, and happy with Ext4 and don't really relish dealing with new stuff, but now's the time to at least look.

   Anyway, that's the basic outline. Any thoughts, ideas, corrections, expansions, etc., I'm very interested in talking about.

Cheers,
Mark
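For concreteness, option #1 above maps to a short mdadm sequence, roughly as follows (a minimal sketch only; /dev/sdb1 and /dev/sdc1 are placeholder partitions, and note that drives over 2TiB need GPT partition tables):

    # partition each new 3TB drive with a single GPT partition first,
    # then assemble the two partitions into a RAID1 array
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
    mkfs.ext4 /dev/md0
    # record the array so it assembles at boot
    mdadm --detail --scan >> /etc/mdadm.conf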
* Re: [gentoo-amd64] Soliciting new RAID ideas
From: Bob Sanders @ 2014-05-27 22:39 UTC
To: gentoo-amd64

Mark Knecht, mused, then expounded:
> Hi all,
>    The list is quiet. Please excuse me waking it up. (Or trying to...) ;-)
>
>    I'm at the point where I'm a few months from running out of disk
> space on my RAID6, so I'm considering how to move forward. [...]

Beware - if Adobe acroread is used, and you opt for a 3TB home directory, there is a chance it will not work. Or more specifically, acroread is still 32-bit. It's only something I've seen with the xfs filesystem. And Adobe has ignored it for approx. 3 yrs now.

> 1) Buy three (or even just two) 5400 RPM 3TB WD Red drives and go with
> RAID1. [...]
>
> 2) Buy two 7200 RPM 3TB WD Red drives and an LSI Logic hardware RAID
> controller. [...]

RAID 1 is fine; RAID 10 is better, but consumes 4 drives and SATA ports.

>    Beyond this I need to talk file system types. I'm fat, dumb, and
> happy with Ext4 and don't really relish dealing with new stuff, but
> now's the time to at least look.

If you change, do not use ZFS, and possibly BTRFS, if the system does not have ECC DRAM. A single, unnoticed memory error (the kind ECC would have caught) can corrupt the data pool and be written to the file system, which effectively renders it corrupt without a way to recover.

FWIW - a Synology DS414slim can hold 4 x 1TB WD Red NAS 2.5" drives and provide boot over NFS or iSCSI to your VMs. The downside is the NAS box and drives would go for a bit north of $636. The upside is all your movies and VM files could move off your workstation, and the workstation would still host the VMs via a mount of the NAS box.
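For reference, the RAID 10 Bob mentions is essentially a one-liner with mdadm (a sketch; the four device names are placeholders):

    # four partitions, striped across two mirrored pairs;
    # the default "near-2" layout keeps each block mirrored on 2 of the 4 drives
    mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[b-e]1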
* Re: [gentoo-amd64] Soliciting new RAID ideas
From: Harry Holt @ 2014-05-27 22:58 UTC
To: gentoo-amd64

On May 27, 2014 6:39 PM, "Bob Sanders" <rsanders@sgi.com> wrote:
> FWIW - a Synology DS414slim can hold 4 x 1TB WD Red NAS 2.5" drives and
> provide boot over NFS or iSCSI to your VMs. The downside is the NAS box
> and drives would go for a bit north of $636. The upside is all your
> movies and VM files could move off your workstation, and the workstation
> would still host the VMs via a mount of the NAS box.

+1 for the Synology NAS boxes, those things are awesome: fast, reliable, upgradable (if you buy a larger one), and the best value available for iSCSI-attached VMs.
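For anyone curious what attaching such a LUN looks like from the Gentoo side, open-iscsi does it in two steps (a sketch; the portal address and target IQN are placeholders for whatever the NAS actually exposes):

    # discover targets offered by the NAS
    iscsiadm -m discovery -t sendtargets -p 192.168.1.50
    # log in; the LUN then shows up as an ordinary /dev/sdX block device
    iscsiadm -m node -T iqn.2000-01.com.synology:nas.vmstore -p 192.168.1.50 --login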
* Re: [gentoo-amd64] Soliciting new RAID ideas
From: thegeezer @ 2014-05-27 23:38 UTC
To: gentoo-amd64

On 2014-05-27 23:58, Harry Holt wrote:
[...]
> > FWIW - a Synology DS414slim can hold 4 x 1TB WD Red NAS 2.5" drives and
> > provide boot over NFS or iSCSI to your VMs. [...]
>
> +1 for the Synology NAS boxes, those things are awesome: fast,
> reliable, upgradable (if you buy a larger one), and the best value
> available for iSCSI-attached VMs.

while i agree on the +1 for iscsi storage, there are a few drawbacks. yes the modularity is awesome primarily -- super simple to spin up a backup system and "move" data with a simple connection command. also a top tip would be to have the "data" part of the vm as an iscsi connection too, so you can easily detach/reattach it to another vm. however, depending on the vm's you have, you will probably start needing more than one gigabit connection to max out speeds: 1-gigabit ethernet is not the same as 6-gigabit sata3, and spinning rust is not the same as ssd.

looking at the spec of the existing workstation, i'd be tempted to stay with mdadm rather than a hardware raid card (which is probably running embedded anyway) -- though with that i7 you have disabled turboboost, right? what would be an interesting comparison is pci-express speed vs motherboard sata - cpu bridge speed; obviously spinning disks will not max 6gbit, and the motherboard may not give you 6x 6gbit real throughput, whereas dedicated hardware raid _might_ do if it had intelligent caching.

other fun to look at would be lvm, cos i personally think it's awesome. for an example: the first half of a spinning disk is substantially faster than the second half, due to the tracks on the outer part, so i split each disk into three partitions -- fast, med, slow -- and add them to an lvm volume group. you can then group the fasts into a raid, the mediums into a raid, and the slows into a raid too (see the sketch below); mdadm allows similar configs with partitions.

ZFS for me lost its lustre when the minimum requirement was 1GB RAM per terabyte... i may have my gigabytes and gigabits mixed up on this one, happy for someone to correct me. BTRFS looks very very interesting to me, though i've still not played with it -- mostly for the checksums; the rest i can do with lvm.

you might also like to consider fun with deduplication, by having a raid base, with lvm on top, with block-level dedupe ala lessfs, then lvm inside the deduped lvm (yeah i know i'm sick, but the doctor tells me the layers of abstraction eventually combine happily :) but i'm not sure you'll get much benefit from virtual machines and movies being deduped. if you add an ssd into the mix you can also look at devicemapper caches such as bcache and dm-cache, or even just moving the journal of your ext4 partition there instead.

crucially you need to think about the issues you _need_ to solve and those that you would like to solve. space is obviously one issue, and performance is not really an issue for you. depending on your budget, a pair of large sata drives + mdadm will be ideal; if you had lvm already you could simply 'move' then 'enlarge' your existing stuff (tm) : i'd like to know how btrfs would do the same, for anyone who can let me know. you have raid6 because you probably know that raid5 is just waiting for trouble, so i'd probably start looking at btrfs for your financial data to be checksummed.
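a minimal sketch of that fast/med/slow split, assuming two disks each already carved into three partitions (outer tracks first); all device and volume names here are placeholders:

    # mirror like-for-like speed tiers across the two disks
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1   # fast tier
    mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2   # medium tier
    mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3   # slow tier
    # pool all three tiers into one volume group
    pvcreate /dev/md1 /dev/md2 /dev/md3
    vgcreate vg0 /dev/md1 /dev/md2 /dev/md3
    # pin a volume to a specific tier by naming the PV it may allocate from
    lvcreate -L 300G -n vms vg0 /dev/md1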
also consider ECC memory if your motherboard supports it. never mind the hosing of filesystems: if you are running vm's you do _not_ want memory making them behave oddly or worse, and if you have lots of active financial data (bloomberg + analytics) you run the risk of the butterfly effect producing odd results.
* Re: [gentoo-amd64] Soliciting new RAID ideas
From: Rich Freeman @ 2014-05-28 0:26 UTC
To: gentoo-amd64

On Tue, May 27, 2014 at 7:38 PM, <thegeezer@thegeezer.net> wrote:
> if you had lvm already you could
> simply 'move' then 'enlarge' your existing stuff (tm)

Yup - if you're not running btrfs/zfs you probably should be running lvm. One thing I would do is back up your lvm metadata when it changes - I once got burned by an lvm error of some kind, and an fsck scrambled the living daylights out of my disk (an fsck on one ext3 partition scrambled a different partition). That is pretty rare though (but I did find one or two mentions online of similar situations).

> : i'd like to know how
> btrfs would do the same for anyone who can let me know.

A btrfs filesystem pools storage. You can add devices to the pool, and remove devices from the pool. If you remove a device with data on it, the data will get moved. When adding devices btrfs does not automatically shuffle data around - you can issue a balance command to do so, but I wouldn't do this until you're done adding/removing drives. A nice thing about btrfs is that devices do not have to be of the same size, and it generally does the right thing.

The downside of btrfs right now for raid is that raid5/6 are still very experimental. They will support reshaping though, which is one of the reasons I've stayed away from zfs. Zfs also lets you add/remove devices from a pool, but it does not allow you to reshape a raid.

Rich
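On the metadata-backup point: LVM already keeps automatic copies under /etc/lvm/archive, but an explicit backup to somewhere off the affected disks might look like this (a sketch; the volume group name vg0 and the destination path are placeholders):

    # write the current metadata for vg0 to a dated file
    vgcfgbackup -f /root/lvm-backup/vg0-$(date +%F).vg vg0
    # restore later with:
    #   vgcfgrestore -f /root/lvm-backup/vg0-2014-05-28.vg vg0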
* [gentoo-amd64] btrfs Was: Soliciting new RAID ideas
From: Duncan @ 2014-05-28 3:12 UTC
To: gentoo-amd64

thegeezer posted on Wed, 28 May 2014 00:38:03 +0100 as excerpted:

> depending on your budget, a pair of large sata drives + mdadm will be
> ideal; if you had lvm already you could simply 'move' then 'enlarge'
> your existing stuff (tm) : i'd like to know how btrfs would do the same,
> for anyone who can let me know.
> you have raid6 because you probably know that raid5 is just waiting for
> trouble, so i'd probably start looking at btrfs for your financial data
> to be checksummed.

Given that I'm a regular on the btrfs list as well as running it myself, I'm likely to know more about it than most. Here's a whirlwind rundown with a strong emphasis on practical points a lot of people miss (IOW, I'm skipping a lot of the commonly covered and obvious stuff). Point 6 below directly answers your move/enlarge question. Meanwhile, points 1, 7 and 8 are critically important, as we see a lot of people on the btrfs list getting them wrong.

1) Since there's raid5/6 discussion on the thread... Don't use btrfs raid56 modes at this time, except purely for playing around with trashable or fully backed-up data. The implementation as introduced isn't code-complete, and while the operational runtime side works, recovery from dropped devices, not so much. Thus, in terms of data safety you're effectively running a slow raid0 with lots of extra overhead, which can be considered trash if a device drops -- with the sole benefit that when the raid56-mode recovery code gets merged (and has been tested for a kernel cycle or two to work out the initial bugs), you'll then get what amounts to a "free" upgrade to the raid5 or raid6 mode you had originally configured, since it was doing the operational parity calculation and writes all along; the code to actually use them for recovery simply wasn't there yet.

2) Btrfs raid0, raid1 and raid10 modes, along with single mode (on a single device or multiple devices) and dup mode (on a single device, metadata is by default duplicated -- two copies -- except on ssd, where the default is only a single copy, since some ssds dedup anyway), are reasonably mature and stable. To the same point as btrfs in general, anyway, which is to say "mostly stable, keep your backups fresh, but you're not /too/ likely to have to use them." There are still enough bugs being fixed in each kernel release, however, that running the latest stable series is /strongly/ recommended, as your data is at risk from known-fixed bugs (even if at this point they only tend to hit the corner-cases) if you're not doing so.

3) It's worth noting that btrfs treats data and metadata separately -- when you do a mkfs.btrfs, you can configure redundancy modes separately for each, the single-device default being (as above) dup metadata (except for ssd) and single data, the multi-device default being raid1 metadata, single data.
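As a concrete illustration of point 3, the profiles are chosen at mkfs time and can be inspected afterwards (a sketch; the device names and mount point are placeholders):

    # two-device filesystem with raid1 for both data and metadata
    mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
    # after mounting, show which profile each block-group type uses
    btrfs filesystem df /mnt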
4) FWIW, most of my btrfs-formatted partitions are dual-device raid1 mode for both data and metadata, on ssd. (Second backup is reiserfs on spinning rust, just in case some Armageddon bug eats all the btrfs, working copy and first backup, at the same time; btrfs is stable enough now that's extremely unlikely, but I didn't consider it so back when I set things up nearly a year ago now.)

The reason for my raid1 mode choice isn't that of ordinary raid1; it's specifically due to btrfs' checksumming and data-integrity features -- if one copy fails its checksum, btrfs will, IF IT HAS ANOTHER COPY TO TRY, check the second copy and, if it's good, use it and rewrite the bad copy. Btrfs scrub allows checking the entire filesystem for checksum errors and restoring any errors it finds from good copies where possible.

Obviously, the default single data mode (or raid0) won't have a second copy to check and rewrite from, while raid1 (and raid10) modes will. (So will dup-mode metadata on a single device, but with one exception, dup mode isn't allowed for data, only metadata -- the exception being the mixed-blockgroup mode that mixes data and metadata together; that's the default on filesystems under 1 GiB but isn't recommended on large filesystems for performance reasons.) So I wanted a second copy of both data and metadata to take advantage of btrfs' data-integrity and scrub features, and with btrfs raid1 mode, I get both that and the traditional raid1 device-loss protection as well. =:^)

5) It's worth noting that as of now, btrfs raid1 mode is only two-way-mirrored, no matter how many devices are configured into the filesystem. N-way mirroring is the next feature on the roadmap after the raid56 work is completed, but given how nearly every btrfs feature has taken much longer to complete than originally planned, I'm not expecting it until sometime next year, now. Which is unfortunate, as my risk-vs-cost sweet spot would be 3-way mirroring, covering the case where *TWO* copies of a block fail checksum. Oh well, it's coming, even if it seems at this point like the proverbial carrot dangling off a stick held in front of the donkey.

6) Btrfs handles moving then enlarging (parallel to LVM) using btrfs device add/delete, to add or delete a device to/from a filesystem (moving the content off a to-be-deleted device in the process), plus btrfs balance, to restripe/convert/rebalance between devices as well as to free allocated-but-empty data and metadata chunks back to unallocated. There's also btrfs resize, but that's more like the conventional filesystem resize command, resizing the part of the filesystem on an individual device (partitioned/virtual or whole physical device).

So to add a device, you'd btrfs device add, then btrfs balance, with an optional conversion to a different redundancy mode if desired, to rebalance the existing data and metadata onto that device. (Without the rebalance it would be used for new chunks, but existing data and metadata chunks would stay where they were. I'll omit the "chunk definition" discussion in the interest of brevity.) To delete a device, you'd btrfs device delete, which moves all the data on that device onto the other devices in the filesystem, after which it can be removed.
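In command form, the add-then-rebalance sequence of point 6 looks roughly like this (a sketch; device names and the mount point are placeholders):

    # grow the pool onto a new device, then spread existing chunks over all devices
    btrfs device add /dev/sdd /mnt
    btrfs balance start /mnt
    # optionally convert profiles while balancing, e.g.:
    #   btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt
    # shrink: migrates all chunks off the named device before removing it
    btrfs device delete /dev/sdb /mnt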
7) Given the thread, I'd be remiss to omit this one. VM images and other large "internal-rewrite-pattern" files (large database files, etc.) need special treatment on btrfs, at least currently. As such, btrfs may not be the greatest solution for Mark (tho it would work fine with special procedures), given the several VMs he runs. This one unfortunately hits a lot of people. =:^( But here's a heads-up, so it doesn't have to hit anyone reading this! =:^)

As a property of the technology, any copy-on-write-based filesystem is going to find files where various bits of existing data within the file are repeatedly rewritten (as opposed to new data simply being appended -- think of a log file or a live-stored audio/video stream) extremely challenging to deal with. The problem is that unlike ordinary filesystems, which rewrite data in place such that a file continues to occupy the same extents as it did before, copy-on-write filesystems write a changed block to a different location.

While COW does mean atomic updates and thus more reliability, since either the new data or the old data should exist, never an unpredictable mixture of the two, as a result of the above rewrite pattern this type of internally-rewritten file gets **HEAVILY** fragmented over time. We've had filefrag reports of several-gig files with over 100K extents! Obviously, this isn't going to be the most efficient file in the world to access!

For smaller files, up to a couple hundred MiB or perhaps a bit more, btrfs has the autodefrag mount option, which can help a lot. With this option enabled, whenever a block of a file is changed and rewritten, thus written elsewhere, btrfs queues up a rewrite of the entire file to happen in the background. The rewrite is done sequentially, thus defragging the file. This works quite well for firefox's sqlite database files, for instance, as they're internal-rewrite-pattern, but small enough that autodefrag handles them reasonably nicely.

But this solution doesn't scale so well as the file size increases toward and past a GiB, particularly for files with a continuous stream of internal rewrites, such as can happen with an operating VM writing to its virtual storage device. At some point, the stream of writes comes in faster than the file can be rewritten, and things start to back up!

To deal with this case, there's the NOCOW file attribute, set with chattr +C. However, to be effective, this attribute must be set when the file is empty, before it has existing content. The easiest way to do that is to set the attribute on the directory that will contain the files. While it doesn't affect the directory itself, newly created files within that directory inherit the NOCOW attribute before they have data, thus allowing it to work without having to worry about it that much. For existing files, create a new directory, set its NOCOW attribute, and COPY (don't move, and don't use cp --reflink) the existing files into it.

Once you have your large internal-rewrite-pattern files set NOCOW, btrfs will rewrite them in place as an ordinary filesystem would, thus avoiding the problem.
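Spelled out as commands, and assuming placeholder paths, that setup step looks like:

    # new directory gets the NOCOW attribute before any files exist in it
    mkdir /mnt/vmimages
    chattr +C /mnt/vmimages
    # plain copy so the new file inherits NOCOW (not mv, not cp --reflink)
    cp /mnt/old/win7.img /mnt/vmimages/
    # verify: the 'C' flag should show on both directory and file
    lsattr -d /mnt/vmimages
    lsattr /mnt/vmimages/win7.img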
Except for one thing. I haven't mentioned btrfs snapshots yet, as that feature, but for this caveat, is covered well enough elsewhere. But here's the problem. A snapshot locks the existing file data in place. As a result, the first write to a block within a file after a snapshot MUST be COW, even if the file is otherwise set NOCOW. If only the occasional one-off snapshot is done, it's not /too/ bad, as all the internal file writes between snapshots are NOCOW; it's only the first write to each file block after a snapshot that must be COW. But many people and distros are script-automating their snapshots in order to have rollback capabilities, and on btrfs, snapshots are (ordinarily) light enough that people are sometimes configuring a snapshot a minute!

If only a minute's changes can be written to the existing location before there's a snapshot and changes must be written to a new location, then another snapshot and yet another location... basically, the NOCOW we set on that file isn't doing us any good!

8) So I'm making this a separate point, as it's important and a lot of people get it wrong. NOCOW and snapshots don't mix!

There is, however, a (partial) workaround. Because snapshots stop at btrfs subvolume boundaries, if you put your large VM images and similar large internal-rewrite-pattern files (databases, etc.) in subvolumes, making that directory I suggested above a full subvolume, not just a NOCOW directory, snapshots of the parent subvolume will not include the VM-images subvolume, thus leaving the VM images alone. This solves the snapshot-broken-NOCOW and thus the fragmentation issue, but it DOES mean that those VM images must be backed up using more conventional methods, since snapshotting won't work for them.

9) Some other still partially broken bits of btrfs include:

9a) Quotas: Just don't use them on btrfs at this point. Performance doesn't scale (altho there's a rewrite in progress), and they are buggy. Additionally, the scaling interaction with snapshots is geometrically negative, sometimes requiring 64 GiB of RAM or more, and coming to a near standstill at that, for users with enough quota-groups and enough snapshots. If you need quotas, use a more traditional filesystem with stable quota support. Hopefully by this time next year...

9b) Snapshot-aware defrag: This was enabled at one point, but simply didn't scale once it turned out people were doing things like per-minute snapshots and thus had thousands and thousands of snapshots. So it has been disabled for the time being. Btrfs defrag will defrag the working copy it is run on, but currently doesn't account for snapshots, so data that was fragmented at snapshot time gets duplicated as it is defragmented. However, they plan to re-enable the feature once they have rewritten various bits to scale far better than they do at present.

9c) Send and receive: Btrfs send and receive are a very nice feature that can make backups far faster, with far less data transferred. They're great when they work. Unfortunately, there are still various corner-cases where they don't. (As an example, a recent fix was for the case where subdir B was nested inside subdir A for the first, full send/receive, but later the relationship was reversed, with subdir B made the parent of subdir A. Until the recent fix, send/receive couldn't handle that sort of corner-case.) You can go ahead and use it if it's working for you, as if it finishes without error, the copy should be 100% reliable. However, have an alternate plan for backups in case you suddenly hit one of those corner-cases and send/receive quits working.

Of course it's worth mentioning that b and c deal with features that most filesystems don't have at all, so with the exception of quotas, it's not like something's broken on btrfs that works on other filesystems. Instead, these features are (nearly) unique to btrfs, so even if they come with certain limitations, that's still better than not having the option of using the feature at all, because it simply doesn't exist on the other filesystem!
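For reference, the incremental send/receive pattern from 9c looks roughly like this (a sketch; the subvolume and backup paths are placeholders, and send operates on read-only snapshots, hence the -r):

    # read-only snapshot of the subvolume to be backed up
    btrfs subvolume snapshot -r /home /home/.snap-today
    # send only the delta against an earlier snapshot to the backup filesystem
    btrfs send -p /home/.snap-yesterday /home/.snap-today | btrfs receive /mnt/backup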
10) Btrfs in general is headed toward stable now, and a lot of people, including me, have used it for a significant amount of time without problems, but it's still new enough that you're strongly urged to make and test your backups, because by not doing so, you're stating by your actions, if not your words, that you simply don't care if some as-yet undiscovered and unfixed bug in the filesystem eats your data.

For similar reasons, altho already mentioned above: run the latest stable kernel from the latest stable kernel series, at the oldest, and consider running rc kernels from at least rc2 or so (by which time any real data-eating bugs, in btrfs or elsewhere, should be found and fixed, or at least published). Anything older, and you are literally risking your data to known and fixed bugs.

As is said, take reasonable care and you're much less likely to be the statistic!

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: [gentoo-amd64] btrfs Was: Soliciting new RAID ideas
From: thegeezer @ 2014-05-28 7:29 UTC
To: gentoo-amd64

top man, thanks for the detail and the tips!
* Re: [gentoo-amd64] btrfs Was: Soliciting new RAID ideas
From: Marc Joliet @ 2014-05-28 20:32 UTC
To: gentoo-amd64

(Dammit, it seems that I've developed a habit of writing somewhat long-winded emails :-/ . Sorry!)

Am Wed, 28 May 2014 08:29:07 +0100 schrieb thegeezer <thegeezer@thegeezer.net>:

> top man, thanks for the detail and the tips!

I second this :) . In fact, I think I'll link to it in my btrfs thread on gentoo-user.

I do have a question for Duncan (or anybody else who knows, but I know that Duncan is fairly active on the BTRFS ML), though:

How does btrfs handle checksum errors on a single drive (or when self-healing fails)? That is, does it return a hard error, rendering the file unreadable, or is it possible to read from a corrupted file?

Sadly, I don't remember finding the answer to this in my own research into BTRFS before I made the switch (my thread is here: [0]), and searching online now hasn't revealed anything; all I can find are mentions of its self-healing capability. I *think* BTRFS treats this as a hard error? But I'm just not sure. (I feel kind of stupid, because I'm sure I saw the answer in some of the emails on linux-btrfs that I read through via GMANE.)

I ask because I'm considering converting the 2TB data partition on my 3TB external hard drive from NTFS to BTRFS [1]. It primarily contains media files, where random corruption is decidedly *not* the end of the world. However, it also contains ISOs and other large files where corruption matters more, but which are not important enough to land on my BTRFS RAID (on the other hand, my music collection is ;-) ).

In any case, reconstructing a corrupted file can be fairly difficult: it might involve re-ripping a (game) disc, or it might be something I got from a friend, delaying file recovery until I can get it again, or the file might be a youtube download (or a conference video, or something from archive.org, or ...) and I have to track it down online again. However, I might want to *know* that a file is corrupt, so that I *can* reconstruct it if I want to.

The obvious answer, retrieving from backup, is difficult to implement, since I would need an additional external drive for that. Also, the files are not *that* important, e.g., in the case of a youtube download, where most of the time I delete the file afterwards anyway.

(It seems to me that the optimal solution would be to use some sort of NAS, with a multi-device ZFS or BTRFS file system, in place of an external hard drive; I expect to go that route in the future, when I can afford it.)

[0] http://thread.gmane.org/gmane.linux.gentoo.user/274236
[1] I used NTFS under the assumption that I might want to keep the drive Windows-compatible (for family), but have decided that I don't really care, since the drive is pretty much permanently attached to my desktop (it also has an EXT4 partition for automatic local backups, so removing it would be less than optimal ;-) ).

-- 
Marc Joliet
--
"People who think they know everything really annoy
those of us who know we don't" - Bjarne Stroustrup
* [gentoo-amd64] Re: btrfs Was: Soliciting new RAID ideas
From: Duncan @ 2014-05-29 6:41 UTC
To: gentoo-amd64

Marc Joliet posted on Wed, 28 May 2014 22:32:47 +0200 as excerpted:

> (Dammit, it seems that I've developed a habit of writing somewhat
> long-winded emails :-/ . Sorry!)

You? <looking this way and that> What does that make mine? =:^)

>> top man, thanks for the detail and the tips!
>
> I second this :) . In fact, I think I'll link to it in my btrfs thread
> on gentoo-user.

Thanks. I was on the user list for a short time back in 2004 when I first started with gentoo, but back then it was mostly x86, while my interest was amd64, and the amd64 list was active enough back then that I didn't really feel the need for the mostly-x86 user list, so I unsubscribed and never got around to subscribing again when the amd64 list traffic mostly dried up. But if it'll help people there... go right ahead and link or repost.

(Also, anyone who wants to put it up on the gentoo wiki, go ahead. I work best on newsgroups and mailing lists, and find wikis, like most of the web, in practice read-only for my usage. I'll read up on them, but somehow never get around to actually writing anything on them, even if it would in theory save me a bunch of time, since I could write stuff once and link it instead of repeating it on the lists.)

> How does btrfs handle checksum errors on a single drive (or when
> self-healing fails)?
>
> That is, does it return a hard error, rendering the file unreadable, or
> is it possible to read from a corrupted file?

As you suspect, it's a hard error.

There has been developer discussion on the btrfs list of some sort of mount option or the like that would allow retrieval even with bad checksums, presumably with dmesg then being the only indication something was wrong -- in case it's a simple single bit-flip or the like in something like text, where it should be obvious, or media, where it'll likely not even be noticed -- but I've not seen an actual patch for it. Presumably it'll eventually happen, but for now there are a lot more potential features and bug fixes to code up than developers and time in their days to code them, so no idea when. I guess when the right person gets that itch to scratch.

Which is yet another reason I have chosen raid1 mode for both data and metadata, and am eagerly awaiting the N-way-mirroring code in order to let me do 3-way as well, because I'd really /hate/ to think it's just a bitflip, yet not have any way at all to get to it.

Which of course makes it that much more critical to keep your backups as current as you're willing to risk losing, *AND* test that they're actually recoverable, as well.

(FWIW here, while I do have backups, they aren't always current. Still, for my purposes the *REAL* backups are the experiences and knowledge in my head. As long as I have that, I can recreate the real valuable stuff, and to the extent that I can't, I don't consider it /that/ valuable. And if I lose those REAL backups... well, I won't have enough left then to realize what I've lost, will I? That's ultimately the attitude I take, appreciating the real important stuff for what it is, and the rest, well, if it comes to it, I lose what I lose. But yes, I do still keep backups, actually multiple levels deep, tho as I said they aren't always current.)

However, one trick that I alluded to, which actually turned out to be an accidental side-effect of fixing an entirely different problem, is setting mixed-blockgroup mode at mkfs.btrfs time and selecting dup mode for both data and metadata as well. (In mixed mode, data and metadata must be set the same, and the default except on ssd is then dup, but the point here is to ensure dup, not single.) As I said, the reason mixed mode is there is to deal with really small filesystems, and it's the default for under a gig. And there's definitely a performance cost as well as the double-space cost when using dup. But it *DOES* allow one to run dup mode for both data and metadata, and some users are willing to pay its performance costs for the additional data integrity it offers (see the sketch below).

Certainly, if you can possibly do two devices, the paired-device raid1 mode is preferable, but for instance my netbook has only a single SATA port, so either mixed-bg and dup mode, or partitioning up and using two partitions to fake two devices for raid1 mode, are what I'm likely to do. (I actually don't know which I'll do, as I haven't messed with the netbook in a while, but I have an SSD already lying around to throw in it and I keep thinking about it, and with its single SATA port, it's a perfect example of sometimes not being /able/ to run two devices. OTOH, I might just throw some money at it and buy a full 64-bit replacement machine, thus allowing me to use the 64-bit packages I build for my main machine on the (new) little one too, and thus do away with the 32-bit chroot on my main machine that I use as a build image for the netbook.)

(I snipped it there to reply to this bit first, as it was a straightforward answer. I'll go back and read the rest now, to see if there's anything else I want to reply to.)

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
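A minimal sketch of that mixed-bg + dup invocation, assuming a placeholder single-device partition:

    # mixed block groups (-M) share chunks between data and metadata,
    # which is what lets data use the dup profile here
    mkfs.btrfs -M -m dup -d dup /dev/sda4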
* Re: [gentoo-amd64] Re: btrfs Was: Soliciting new RAID ideas
From: Marc Joliet @ 2014-05-29 17:57 UTC
To: gentoo-amd64

Am Thu, 29 May 2014 06:41:14 +0000 (UTC) schrieb Duncan <1i5t5.duncan@cox.net>:

>> (Dammit, it seems that I've developed a habit of writing somewhat
>> long-winded emails :-/ . Sorry!)
>
> You? <looking this way and that> What does that make mine? =:^)

Novels, duh ;-) .

> Thanks. I was on the user list for a short time back in 2004 when I
> first started with gentoo [...] But if it'll help people there... go
> right ahead and link or repost.

I ended up simply forwarding it, as opposed to bumping my inactive thread.

> (Also, anyone who wants to put it up on the gentoo wiki, go ahead. [...])

Heh, the only wiki I ever edited was at my old student job. But yeah, I don't feel comfortable enough in my BTRFS knowledge to write a wiki entry myself.

>> How does btrfs handle checksum errors on a single drive (or when
>> self-healing fails)?
>
> As you suspect, it's a hard error.

Damn >:-( .

> There has been developer discussion on the btrfs list of some sort of
> mount option or the like that would allow retrieval even with bad
> checksums [...] I guess when the right person gets that itch to scratch.

That's really too bad; I guess this isn't a situation that arises often for BTRFS users.

> Which of course makes it that much more critical to keep your backups as
> current as you're willing to risk losing, *AND* test that they're
> actually recoverable, as well.

Of course, but like I said, I can't back up this one data partition. I do have backups for everything on my desktop computer, though, which are on the other partition of this external drive.

> (FWIW here, while I do have backups, they aren't always current. [...])

Hehe, good philosophy :-) .

> However, one trick that I alluded to [...] is setting mixed-blockgroup
> mode at mkfs.btrfs time and selecting dup mode for both data and
> metadata as well. [...]

That is an interesting idea. I might consider that. Or I might just create a third partition and make a RAID 1 out of those, once I know how much space my backups will ultimately take. But really, why is there no dup for data?

(I only set up my backups about a month ago, just before my migration to BTRFS, using rsnapshot, and the backups aren't fully there yet; the one monthly backup is still missing, and I wanted to wait a bit after that to see how much space the backups ultimately require. Plus, I might back up (parts of) my laptop to there, too, although there isn't that much stuff on it that isn't already synchronised in some other fashion, so it's not decided yet.)

> Certainly, if you can possibly do two devices, the paired-device raid1
> mode is preferable, but for instance my netbook has only a single SATA
> port [...]

Ah, you mentioned the RAID 1 idea already :-) .

-- 
Marc Joliet
--
"People who think they know everything really annoy
those of us who know we don't" - Bjarne Stroustrup
* Re: [gentoo-amd64] Re: btrfs Was: Soliciting new RAID ideas
From: Rich Freeman @ 2014-05-29 17:59 UTC
To: gentoo-amd64

On Thu, May 29, 2014 at 1:57 PM, Marc Joliet <marcec@gmx.de> wrote:
> Am Thu, 29 May 2014 06:41:14 +0000 (UTC) schrieb Duncan <1i5t5.duncan@cox.net>:
>> Thanks. I was on the user list for a short time back in 2004 when I
>> first started with gentoo [...] But if it'll help people there... go
>> right ahead and link or repost.
>
> I ended up simply forwarding it, as opposed to bumping my inactive thread.

When was the last time we actually had an amd64-specific discussion on this list? Part of me wonders if the list ought to be retired. It made a lot more sense back when amd64 was fairly experimental and prone to fairly unique issues. I deleted my 32-bit chroot some time ago.

Rich
* Re: [gentoo-amd64] Re: btrfs Was: Soliciting new RAID ideas
From: Mark Knecht @ 2014-05-29 18:25 UTC
To: Gentoo AMD64

On Thu, May 29, 2014 at 10:59 AM, Rich Freeman <rich0@gentoo.org> wrote:
> When was the last time we actually had an amd64-specific discussion on
> this list? Part of me wonders if the list ought to be retired. It
> made a lot more sense back when amd64 was fairly experimental and
> prone to fairly unique issues. I deleted my 32-bit chroot some time
> ago.
>
> Rich

I completely understand your point, but in my case, after about a decade on gentoo-user, I quit posting to gentoo-user completely due to the attitudes of some folks there, flame posts, put-downs, etc. I have no idea how it is now, but I have no real desire to go back there.

The two things I really value about this list are the quality of the posts as well as the very civil way folks treat each other.

Just my 2 cents.

Cheers,
Mark
* Re: [gentoo-amd64] Re: btrfs Was: Soliciting new RAID ideas
From: Frank Peters @ 2014-05-29 21:05 UTC
To: gentoo-amd64

On Thu, 29 May 2014 13:59:25 -0400 Rich Freeman <rich0@gentoo.org> wrote:
>
> When was the last time we actually had an amd64-specific discussion on
> this list? Part of me wonders if the list ought to be retired. It
> made a lot more sense back when amd64 was fairly experimental and
> prone to fairly unique issues.
>

There may not be any amd64 issues, but there certainly are a lot of gripes.

For those who operate a pure 64-bit system (no multilib), there is a fair amount of highly useful software that has not yet been updated to be 64-bit clean. For example, Adobe PDF Reader, Foxit PDF Reader, and the Intel ICC compiler are still 32-bit. I wish these folks would get with the modern trends.

Frank Peters
* [gentoo-amd64] amd64 list, still useful? Was: btrfs
From: Duncan @ 2014-05-30 2:04 UTC
To: gentoo-amd64

Frank Peters posted on Thu, 29 May 2014 17:05:26 -0400 as excerpted:

> There may not be any amd64 issues, but there certainly are a lot of
> gripes.
>
> For those who operate a pure 64-bit system (no multilib), there is a
> fair amount of highly useful software that has not yet been updated to
> be 64-bit clean. For example, Adobe PDF Reader, Foxit PDF Reader, and
> the Intel ICC compiler are still 32-bit. I wish these folks would get
> with the modern trends.

FWIW, I'm no-multilib as well, but I guess for a different reason.

I don't do proprietary, and in general couldn't even if I wanted to, since I cannot and will not agree to the EULAs, so non-free software that hasn't been ported to amd64 is of no concern to me, except that it's yet another case where authors chose not to respect my rights, so I simply don't use their software.

Meanwhile, all the software I actually use has long since been ported, and I no longer even use grub-static, since I've switched to grub2, which builds just fine on amd64. So there's literally no reason for me to run multilib at all. In fact, when I switched over some years ago, I had already had various problems due to the 32-bit side, which I never used except to build toolchain 32-bit support, breaking. As a result, simply switching to no-multilib significantly decomplicated life and resulted in far faster gcc and glibc rebuilds as well, and there was literally no downside whatsoever, except that I had to run grub-static for a couple of years.

Tho I do still have a 32-bit chroot as the build-root for my 32-bit-only netbook. But by policy I don't keep anything private on the netbook, and I actually don't use it as a NET-book anyway, only connecting it via ethernet here at home. (I never did get the wifi working on it; I tried at one point, but apparently there was some bug in the kernel wifi driver at that point and I couldn't connect, and I simply never bothered since.) So security isn't a huge deal on it, and I actually haven't updated it in a couple of years now, to the point that I'd have severe problems updating it using a current gentoo tree due to EAPI upgrade issues, so I'd have to do staggered updates using archived trees. At this point that means I'll probably just do a full from-stage3 rebuild at some point... if I even bother at all. I might actually just hardware-upgrade to a 64-bit machine, so that I can use my main system's binpkgs for both machines.

Meanwhile, Mark's reasons for staying on this list, as opposed to the general user list, are more or less mine as well. I never actually saw the negatives he saw there, and there was a time when there was an attack on me here, which I'll never forget, as it was quite an experience seeing other regulars, and lurkers too, come out of the woodwork to defend me. I knew a lot of folks liked my posts due to thanks now and again, but WOW, I had no idea I had benefitted THAT many lurkers along with the others, and it was quite humbling indeed to see them post perhaps their only post in years to defend me! Certainly a life-changing experience!

Rather, in my case it is more that I remember the high traffic of the user list and kind of like the lower but perhaps higher-quality traffic here, tho at times it's /too/ low-traffic these days. Probably at some point I'll get back to the user list, but if this list were to shut down, I'd still miss it, because while there's not a lot of traffic here these days, the signal-to-noise ratio really is about the highest I can imagine.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: [gentoo-amd64] amd64 list, still useful? Was: btrfs 2014-05-30 2:04 ` [gentoo-amd64] amd64 list, still useful? Was: btrfs Duncan @ 2014-05-30 2:44 ` Frank Peters 2014-05-30 6:25 ` [gentoo-amd64] " Duncan 2014-06-04 16:41 ` [gentoo-amd64] " Mark Knecht 1 sibling, 1 reply; 30+ messages in thread From: Frank Peters @ 2014-05-30 2:44 UTC (permalink / raw To: gentoo-amd64 On Fri, 30 May 2014 02:04:39 +0000 (UTC) Duncan <1i5t5.duncan@cox.net> wrote: > > FWIW, I'm no-multilib as well, but I guess for a different reason. > > I don't do proprietary and in general couldn't even if I wanted to, since > I cannot and will not agree to the EULAs, so non-free software that > hasn't been amd64 ported is of no concern to me, > It's not just proprietary software that lags behind. I continue to encounter FOSS packages from time to time that are still 32-bit only. One example, for audio enthusiasts, is the excellent AudioCutter: http://www.virtualworlds.de/AudioCutter/ (There are many other examples but at this moment I can't recall any specific names so you'll just have to trust me). However, when it comes to the PDF file format it is hard to beat the proprietary Foxit Reader. With FOSS, only evince comes close, but evince lacks a lot of capability and seems to be buggy in places. AMD64 should be the standard but many projects refuse to update since reliance on multi-lib is so much simpler. As a consequence we 64-bit purists are at a disadvantage. Frank Peters ^ permalink raw reply [flat|nested] 30+ messages in thread
* [gentoo-amd64] Re: amd64 list, still useful? Was: btrfs 2014-05-30 2:44 ` Frank Peters @ 2014-05-30 6:25 ` Duncan 0 siblings, 0 replies; 30+ messages in thread From: Duncan @ 2014-05-30 6:25 UTC (permalink / raw To: gentoo-amd64 Frank Peters posted on Thu, 29 May 2014 22:44:05 -0400 as excerpted: > On Fri, 30 May 2014 02:04:39 +0000 (UTC) > Duncan <1i5t5.duncan@cox.net> wrote: > >> FWIW, I'm no-multilib as well, but I guess for a different reason. >> >> I don't do proprietary [...] >> > It's not just proprietary software that lags behind. I continue to > encounter FOSS packages from time to time that are still 32-bit only. > > One example, for audio enthusiasts, is the excellent AudioCutter: > http://www.virtualworlds.de/AudioCutter/ I'm not saying 32-bit-only FLOSS isn't out there, only that by now, and actually from 2010 or so (to pick the turn of the decade as a convenient date, one could actually say by 2008 or so), it's increasingly non-mainstream. There's the occasional exception, but for most people, either their 32-bit concerns are proprietary only, or there's a more mainstream 64-bit alternative. Luckily for me, my interests are mainstream enough... > (There are many other examples but at this moment I can't recall any > specific names so you'll just have to trust me). > > However, when it comes to the PDF file format it is hard to beat the > proprietary Foxit Reader. With FOSS, only evince comes close, but evince > lacks a lot of capability and seems to be buggy in places. I should explicitly mention that I'm all for people making their own decisions regarding proprietary. Because I know if someone had tried to push me before I was ready, even while I was preparing for my ultimate switch, the results would have been nothing but negative. So everyone must move when they are ready, and if that time never comes, well... But at the same time, that decision is behind me personally, and there's simply no way I'm going back to the days of proprietary. As for pdf, I'm running (semantic-desktop-stripped) kde and okular, and have been reasonably happy with it. Where I've seen people complain about PDF readability or compatibility and have checked, okular has done well enough for me, to the point I never saw what they were complaining about. Meanwhile, even if I did find some PDF that nothing I could run would handle, that would simply mean I'd not read that pdf, tho if it was worth it I could envision taking it to the library to read or to a printer to have them print it out or something. But I wouldn't install anything proprietary on my own systems to read it. There are too many other things to do in the world to worry about missing what's in one pdf, especially if it meant my freedom was on the line. > AMD64 should be the standard but many projects refuse to update since > reliance on multi-lib is so much simpler. As a consequence we 64-bit > purists are at a disadvantage. True at times. Luckily, those times aren't so frequent these days. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] amd64 list, still useful? Was: btrfs 2014-05-30 2:04 ` [gentoo-amd64] amd64 list, still useful? Was: btrfs Duncan 2014-05-30 2:44 ` Frank Peters @ 2014-06-04 16:41 ` Mark Knecht 2014-06-05 2:00 ` [gentoo-amd64] " Duncan 1 sibling, 1 reply; 30+ messages in thread From: Mark Knecht @ 2014-06-04 16:41 UTC (permalink / raw To: Gentoo AMD64 On Thu, May 29, 2014 at 7:04 PM, Duncan <1i5t5.duncan@cox.net> wrote: <SNIP> > Meanwhile, Mark's reasons for staying on this list, as opposed to the > general user list, are more or less mine, as well. <SNIP> > > Rather, in my case it is more that I remember the high traffic of the > user list and kind of like the lower but perhaps higher quality traffic > here, tho at times it's /too/ low traffic, these days. Probably at some > point I'll get back to the user list, but if this list were to shut down, > I'd still miss it, because while there's not a lot of traffic here these > days, the signal to noise ratio really is about the highest I can imagine. > > -- > Duncan - List replies preferred. No HTML msgs. > "Every nonfree program has a lord, a master -- > and if you use the program, he is your master." Richard Stallman Hi Duncan, There is an in-progress, higher-energy thread on gentoo-user with folks getting upset (my interpretation) about systemd and support for suspend/resume features. I only found it because I ran into an emerge block and went looking for a solution. (In my case it was -upower as a new use flag setting.) Anyway, I prefer it here. If I were reading that thread in real time I know I'd be responding to a few things even though I don't have anything of much value to add. It's just my nature in the presence of threads like that! ;-) Cheers, Mark P.S. - BTW - I love your long answers although I seldom have time to read them when they arrive. Stay true. They are of value. ^ permalink raw reply [flat|nested] 30+ messages in thread
* [gentoo-amd64] Re: amd64 list, still useful? Was: btrfs 2014-06-04 16:41 ` [gentoo-amd64] " Mark Knecht @ 2014-06-05 2:00 ` Duncan 2014-06-05 18:59 ` Mark Knecht [not found] ` <Alo71o01J1aVA4001lo9xP> 0 siblings, 2 replies; 30+ messages in thread From: Duncan @ 2014-06-05 2:00 UTC (permalink / raw To: gentoo-amd64 Mark Knecht posted on Wed, 04 Jun 2014 09:41:30 -0700 as excerpted: > There is an in-progress, higher-energy thread on gentoo-user with folks > getting upset (my interpretation) about systemd and support for > suspend/resume features. I only found it because I ran into an emerge > block and went looking for a solution. (In my case it was -upower as a > new use flag setting.) Yeah. I saw the original dev-list thread on the topic, before it all hit the tree (and continuing now), which is a big part of why I subscribe to the dev-list, to get heads-up about things like that. What happened from the dev-list perspective is that after upower dropped about half the original package as systemd replaced that functionality, the gentoo maintainers split the package in half, keeping the still-included functionality under the original upower name, with the dropped portion in a new, basically-gentoo-as-upstream package, upower-pm-utils. But to the gentoo maintainer the portage output was sufficient that between emerge --pretend --tree --unordered-display and eix upower, what was needed was self-evident, so he didn't judge a news item necessary. What a lot of other users (including me) AND devs are telling him is that he's apparently too close to the problem to see that it's not as obvious as he thinks, and a news item really is necessary. Compounding the problem for users is that few users actually pulled in upower on their own and don't really know or care about it -- it's pulled in due to default desktop-profile use-flags as it's the way most desktops handle suspend/hibernate. Further, certain desktop dependencies apparently got default-order reversed on the alternative-deps, so portage tries to fill the dep with systemd instead of the other package. Unfortunately that's turning everybody's world upside down, as suddenly portage wants to pull in systemd *AND* there's all these blockers! Meanwhile, even tho he didn't originally think it necessary, once pretty much all gentoo userspace (forums, irc, lists, various blogs...) erupted in chaos, the gentoo maintainer decided that even tho he didn't quite understand /why/ a news item was needed, that was the best way to get the message out as to how to fix things and to calm things back down. But, policy is that such news items must be posted to the gentoo-dev list for (ideally) several days of comment before they're committed, and a good policy it is in general too, because the news items generally turn out FAR better with multiple people looking over the drafts and making suggestions, than the single-person first-drafts tend to be! In cases such as this, however, the comment time is shortened to only a day or two unless something seriously wrong comes up in the process, and while I've not synced for a few days, I'd guess that news item has either hit before I send this, or certainly if not, it'll hit within a matter of hours. Once the news item hits, for people that actually read them at least, the problem should be pretty much eliminated, as there are appropriate instructions for how to fix the blocker, etc. So things should really be simmering back down pretty shortly.
=:^) Meanwhile, in the larger perspective of things, it's just a relatively minor goof that as usual is fixed in a couple days. No big deal, except that /this/ goof happens to include the uber-lightning-rod package that is systemd. Be that as it may, the world isn't ending, and the problem is indeed still fixed up within a couple days, as usual, with information, some reliable, some not so reliable, available via the usual channels for those who don't want to wait. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 30+ messages in thread
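As a concrete sketch of the diagnosis described in the message above (the commands and package atoms are the ones named in the thread; actual output depends on the tree snapshot you have synced):

    # show what is pulling upower in, as a dependency tree rather than a flat list
    emerge --pretend --tree --unordered-display sys-power/upower

    # compare the two halves of the split as the tree now offers them
    eix sys-power/upower sys-power/upower-pm-utils

Nothing here modifies the system; both commands only report.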
* Re: [gentoo-amd64] Re: amd64 list, still useful? Was: btrfs 2014-06-05 2:00 ` [gentoo-amd64] " Duncan @ 2014-06-05 18:59 ` Mark Knecht 2014-06-06 12:11 ` Duncan [not found] ` <Alo71o01J1aVA4001lo9xP> 1 sibling, 1 reply; 30+ messages in thread From: Mark Knecht @ 2014-06-05 18:59 UTC (permalink / raw To: Gentoo AMD64 On Wed, Jun 4, 2014 at 7:00 PM, Duncan <1i5t5.duncan@cox.net> wrote: > Mark Knecht posted on Wed, 04 Jun 2014 09:41:30 -0700 as excerpted: > >> There is an in-progress, higher-energy thread on gentoo-user with folks >> getting upset (my interpretation) about systemd and support for >> suspend/resume features. I only found it because I ran into an emerge >> block and went looking for a solution. (In my case it was -upower as a >> new use flag setting.) > > Yeah. I saw the original dev-list thread on the topic, before it all hit > the tree (and continuing now), which is a big part of why I subscribe to > the dev-list, to get heads-up about things like that. > Maybe all Gentoo users should subscribe! Over time we would likely all get a bit smarter. ;-) ;-) ;-) > What happened from the dev-list perspective is that after upower dropped > about half the original package as systemd replaced that functionality, > the gentoo maintainers split the package in half, keeping the still-included > functionality under the original upower name, with the dropped portion in > a new, basically-gentoo-as-upstream package, upower-pm-utils. > I certainly have no issue with the basics of what they did, but more in a second. > But to the gentoo maintainer the portage output was sufficient that > between emerge --pretend --tree --unordered-display and eix upower, what > was needed was self-evident, so he didn't judge a news item necessary. > What a lot of other users (including me) AND devs are telling him is that > he's apparently too close to the problem to see that it's not as obvious > as he thinks, and a news item really is necessary. > Yeah, this was likely the issue. One comment in the -user thread on this subject was that at least one -dev-type thinks users should be reading change logs to figure this stuff out. I no longer remember how long I've run Gentoo but it's well beyond a decade at this point. Daniel Robbins was certainly participating. I was working at a company from mid-1999 to 2004 when I started. I can only say that I've never read a change log in that whole time. > Compounding the problem for users is that few users actually pulled in > upower on their own and don't really know or care about it -- it's pulled > in due to default desktop-profile use-flags as it's the way most desktops > handle suspend/hibernate. As is the case for me using kde-meta. However, while I figured out pretty quickly that I could set -upower on kdelibs without any build or boot issues, I soon discovered that flag goes beyond my simplistic view of suspend/resume, which I have never used. It also covers _everything_ in the Power Management section of systemsettings, which means I lost my ability in KDE to control what I suspect are DPMS timeout settings on my monitors. I'll either have to learn how to do that outside of KDE or reinstall the newer upower-pm-utils package. > Further, certain desktop dependencies > apparently got default-order reversed on the alternative-deps, so portage > tries to fill the dep with systemd instead of the other package. > Unfortunately that's turning everybody's world upside down, as suddenly > portage wants to pull in systemd *AND* there's all these blockers!
> Yeah, that's what got me to look at gentoo-user and find the problem. Lots of blocks involving systemd. <SNIP> > So things should really be simmering back down pretty shortly. =:^) > Meanwhile, in the larger perspective of things, it's just a relatively > minor goof that as usual is fixed in a couple days. No big deal, except > that /this/ goof happens to include the uber-lightning-rod package that > is systemd. Be that as it may, the world isn't ending, and the problem > is indeed still fixed up within a couple days, as usual, with > information, some reliable, some not so reliable, available via the usual > channels for those who don't want to wait. > This stuff does happen once in a while. I'm surprised it doesn't happen more often, actually, so for the most part the release process is pretty good. WRT systemd, my real problem with this latest issue is the systemd profile issue, and beyond that there doesn't seem to be a systemd-oriented new-machine install document. In my study getting ready to build a new RAID (probably will be 2-drive 3TB RAID1) I wondered if I should give in to this portage pressure and go systemd. When I start looking, all I find are documents that seem to assume a pretty high understanding of systemd which doesn't represent my current education or abilities. Seems to me that if the gentoo devs interested in seeing systemd gain traction were serious, this would be a high-priority job. All we get today is http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?full=1#book_part1_chap12 which to me says it's not what Gentoo developers want Gentoo users to use. Of course, that's just me. Take care, Mark ^ permalink raw reply [flat|nested] 30+ messages in thread
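For reference, the two workarounds Mark describes come down to roughly the following. This is a sketch under the assumption that kde-base/kdelibs is the atom carrying the flag, as his message indicates; version bounds are omitted:

    # /etc/portage/package.use -- option 1: drop the upower dependency
    # (as noted above, this also disables KDE's Power Management settings)
    kde-base/kdelibs -upower

Or, option 2, keep suspend/power-management support without systemd via the split-out, gentoo-maintained package:

    emerge --ask sys-power/upower-pm-utils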
* [gentoo-amd64] Re: amd64 list, still useful? Was: btrfs 2014-06-05 18:59 ` Mark Knecht @ 2014-06-06 12:11 ` Duncan 0 siblings, 0 replies; 30+ messages in thread From: Duncan @ 2014-06-06 12:11 UTC (permalink / raw To: gentoo-amd64 Mark Knecht posted on Thu, 05 Jun 2014 11:59:23 -0700 as excerpted: > Yeah, this was likely the issue. One comment in the -user thread on this > subject was that at least one -dev-type thinks users should be reading > change logs to figure this stuff out. I no longer remember how long I've > run Gentoo but it's well beyond a decade at this point. Daniel Robbins > was certainly participating. I was working at a company from mid-1999 to > 2004 when I started. I can only say that I've never read a change log in > that whole time. Wow. I read 'em routinely. There are actually four different types of "changelogs" I read, more or less often and closely, depending on the package and how closely I'm following it. 1) The gentoo package changelogs (as found in the gentoo tree) don't normally contain a lot of information about the upstream package or changes between versions, so I don't read them all the time, but I *DO* read the gentoo package changelog much of the time when I see a -rX bump for something I already have installed at the same upstream version number, because in that case I normally want to know what the gentoo package maintainer considered important enough for a revision bump and the resulting rebuild trigger for users with that same upstream version already installed, instead of simply fixing it in the ebuild without a revision bump. These can be security bumps, for instance, and if so I want to know how bad it was and what my risk was before the update. Another common reason is config changes or patches that might affect me in other ways, that as an admin responsible for the wellbeing of my gentoo system (a responsibility I take very seriously), I want to know about. Additionally, the gentoo package changelogs contain dates for version introduction into the tree as well as stabilization on the various archs, and eventually, for removal from the tree. Most of the time when I'm checking on these, it's to help someone on some other list figure out a dependency on their non-gentoo distro, or figure out how far behind my ~amd64 installation their version is and how outdated, etc. Other times it can be basically the same stuff, but for another gentooer on stable instead of my ~amd64 plus selected live-packages system. At least back when Zac was portage dev lead, because gentoo /is/ upstream for portage and because portage changes are critical to a gentoo system's wellbeing, the portage package changelog was far more detailed than most others, including bug report numbers and mentioning big feature changes. I followed the portage changelog very closely and looked up every bug number mentioned to see what sort of changes were being made and why. 2) While not technically "changelog" files, git logs are in fact generally much more detailed changelogs, and for stuff like kde, where I run live-git-branch packages from the gentoo/kde project overlay, every time I do a sync (of both the main gentoo tree and the few overlays I run), I FAITHFULLY run git log on the overlay trees to see what updated since I last synced, and how. 
For the overlays, I follow *EVERY* *SINGLE* *CHANGE*, at minimum reading the git-log entry which lists the files involved as well, and if there are patches introduced that interest me, I'll use git show to pull up the full git diffs and see what actually changed, line by line in the source code. 3) Similarly, for various upstream packages I follow upstream's changelog or news files as well, not /too/ closely for most packages, but for a lot of packages, closely enough to at least be aware of major feature updates, both so I can make use of those features, and because they might affect config files that I'll be etc-updating in short order, after the package upgrade. 4) For a lot of packages that I run the live-git version of, I'll use the smart-live-rebuild output to get the upstream git commit IDs, and will then do a git log with those IDs to see what changed there, as well. For a few packages, I have a different script that I run that does an individual git pull for the package, and I git log it if there are changes, before I even run smart-live-rebuild to catch the others at all. Until I recently switched to systemd, I was one of the few non-dev users actually running openrc-9999, precisely so I COULD follow individual git commit updates, and I found and filed a number of bugs that then got fixed before a release version ever made them generally available even to ~arch users. Similarly, I've been involved with upstream pan (the news client I follow this list with, among other things) for over a decade now, helping out on its mailing list and now filling the local application historian role as well, tho I'm not a dev, and I follow its git logs *VERY* closely. Lately I've been active both as a btrfs user and on the btrfs list, and follow the btrfs-progs git commit log closely as well. For kde, where I'm on the kde4 development branch, I don't follow the git logs /quite/ that closely, but I do keep an eye on them, particularly for kdelibs, kde-baseapps and kde-workspace. I have my own scripts that I use for updating the kernel so I don't use gentoo's kernel packages at all, but there too I run (mainline Linus) kernel git, and while I don't follow individual commits especially during the merge window, I often follow the mainline merge-commits, and follow things more closely as commits slow down later in the cycle. As with openrc, I've bisected, filed and gotten fixed a number of bugs in pre-releases over the years before they hit full releases. So I must confess it's a bit hard to imagine someone who hasn't read a single changelog in at least the decade I've been on gentoo, particularly since following them to at least /some/ extent is IMO part of what being a good sysadmin, responsible for at least their own system if no others, is all about. While I certainly don't expect people to follow changes as closely as I do, not viewing even /one/ changelog over the course of at least a decade... let's put it this way, it's not something /I'd/ be proud to admit in public.
OTOH, that you've gone this long without it and are still here discussing and running gentoo definitely *IS* a testament to how good the gentoo devs (and tester-users like me filing bugs to be fixed before things hit stable, and sometimes before they hit a release or the ~arch tree at all) actually are in general at keeping things actually working for people, a bit of hiccup now and again for a few days, but basically nothing that's not fixed in a few days, and nothing that tends to actually eat systems to the point that you're not here, a decade later. That's SAYING something! =:^) > This stuff does happen once in a while. I'm surprised it doesn't happen > more often, actually, so for the most part the release process is pretty > good. =:^) > WRT systemd, my real problem with this latest issue is the systemd > profile issue, and beyond that there doesn't seem to be a systemd-oriented > new-machine install document. In my study getting ready to > build a new RAID (probably will be 2-drive 3TB RAID1) I wondered if I > should give in to this portage pressure and go systemd. When I start > looking, all I find are documents that seem to assume a pretty high > understanding of systemd which doesn't represent my current education or > abilities. Seems to me that if the gentoo devs interested in seeing systemd > gain traction were serious, this would be a high-priority job. All we get > today is > > http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?full=1#book_part1_chap12 > > which to me says it's not what Gentoo developers want Gentoo users to > use. > Of course, that's just me. You're actually correct. Mainline gentoo remains openrc, and that's likely to remain the case for some time. Systemd is certainly available as an option, and more and more people are switching to it, but even after it's well documented in the handbook, openrc will continue to be an option for the foreseeable future, with several devs having stated quite specifically that they use gentoo and depend on openrc in their jobs, so whatever /else/ may happen to gentoo, openrc is *NOT* about to become unsupported, as I said, for the foreseeable future (which in practice means at LEAST two years out as it'd take that long to switch over -- remember how long it took to stabilize baselayout-2 -- and more likely at least five years, even if they voted that as a goal right now!). OTOH, individual packages and specific desktop projects can change dependencies based on what upstream supports, and gentoo/gnome is only supporting systemd for some elements now. Luckily for gentoo/kdeers, upstream kde has committed to maintaining more systemd independence than has gnome, including with kde 5 frameworks. And the modularization of kde-frameworks should make that much easier too, over time, altho individual kde packages may eventually require systemd. OTOH, gentooers have it better than most in that they have more choice about actually installing individual packages, as well as keeping upstream-optional dependencies actually optional. We did almost lose the ability to opt out of semantic-desktop, but fortunately saner heads prevailed, and had they not done so in gentoo/kde, a number of us users were making plans for a user-supported overlay similar to the user-supported kde-sunset for kde3 users, to maintain semantic-desktop-less kde4 at least until kde5/frameworks, at which point we hoped upstream policies would bring back the option due to its modularity.
But while I actually had to maintain the semantic-desktop-less ebuild patches locally for a while in order to continue following kde-live-branch, and I guess ~arch users faced the problem for a shorter time, the policy thankfully reverted before stable users had to make that painful choice. But various devs have made it VERY clear, gentoo as a whole isn't going to get anywhere /close/ to losing the openrc option, as I said, for the foreseeable future. And well before that were to happen, or even if gentoo really expected stable users to switch to systemd in quantity, there'd be MUCH better documentation, just as that was a prerequisite to the stable-side baselayout-2, one of the big reasons it took years. Again, that's exactly why users worried about gentoo suddenly switching to systemd, as it /looked/ like it might be doing here, have nothing to worry about for at **LEAST** two years. A ship as big as gentoo simply doesn't turn on a dime, nor can it be forced to, and even were the council to suddenly get a brain transplant and vote that it should be the goal today, it'd take years to actually implement, including for many gentoo devs *AND* their employers. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 30+ messages in thread
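For anyone wanting to pick up the review habits Duncan describes above, the mechanics are small. A sketch only, with hypothetical paths (adjust to wherever your sync tool keeps the overlay checkout; in-tree ChangeLog files were per-package under /usr/portage at the time):

    # the in-tree package changelog
    less /usr/portage/sys-power/upower/ChangeLog

    # overlay review after a sync; ORIG_HEAD is set by git pull,
    # so this range is exactly what the last sync brought in
    cd /var/lib/layman/kde
    git log --stat ORIG_HEAD..HEAD
    git show <commit-id>    # full diff of any commit that looks interesting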
[parent not found: <Alo71o01J1aVA4001lo9xP>]
* [gentoo-amd64] Re: amd64 list, still useful? Was: btrfs [not found] ` <Alo71o01J1aVA4001lo9xP> @ 2014-06-06 17:07 ` Duncan 0 siblings, 0 replies; 30+ messages in thread From: Duncan @ 2014-06-06 17:07 UTC (permalink / raw To: Martin, gentoo-amd64 On Thu, 05 Jun 2014 22:48:07 +0100 Martin <m_btrfs@ml1.co.uk> wrote: > Resend (gmane appears to be losing my email for this list... :-( ) OK, forwarding to the list too (with a bit less snippage than normal, to keep your message intact as I'm relaying) and replying below. > > On 05/06/14 16:35, Martin wrote: > > On 05/06/14 03:00, Duncan wrote: > >> So things should really be simmering back down pretty shortly. > >> =:^) > > Thanks for the good summary. > > > > Yep, I hit all the red "B" blockers... Quickly saw it was upower and > > some confusion with systemd even though I've not selected systemd > > anywhere and... > > > > I was too rushed to investigate much further and so added into my > > /etc/portage/package.mask: > > > > # Avoid pulling in systemd! > > =sys-power/upower-0.9.23-r3 > > > > > > Thanks for letting me know to await the news item and for the bits > > to settle... [Just forwarding that part and would delete it as I'm not replying to it, were I not forwarding it for you too. But I'm replying to the below.] > > As for systemd... I'm just wondering if the various heated air being > > generated/wasted is as much rushed arrogance on the part of the > > implementation as due to the grand ripples of change. > > > > The recent kernel DoS debacle due to misusing the long used kernel > > debug showed a certain 'cavalier' attitude to taking over > > functionality without a wider concern or caution to keep projects > > outside of systemd undisturbed... Or at least conscientiously > > minimise disturbance... Agreed, and for quite some time that attitude was why I was delaying my own switch, tho I expected I'd eventually make it. But backing up a bit to reveal the larger picture... Developers in general aren't always exactly known for their ability to get along with each other or with others or necessarily the wider community. Certainly there are many examples of this in the FLOSS community, and from what I've read of the proprietary development community it's no different, save much of it happens behind closed doors, with public appearances moderated by the PR folks. Actually, I've a personal experience that rather changed my own outlook on things, that I believe explains a lot of the problem here. The following gets a bit franker and more personally honest than most discussions and I'm not really comfortable saying it, but it's important enough not to skip as it illustrates a point I want to make. I don't ordinarily talk about myself in this way, but the fact is, on most tests I score well above 90 percentile IQ. Typically, depending on the test and whether I'm hitting my groove that day or not, I run 95-97 percentile on average in most areas (tho in composition I'm relatively low for me, 70s). (FWIW, I've always been slightly frustrated. The MENSA cutoff is supposed to be 98 percentile and I typically score tantalizingly close, but not quite! It'd be nice... =:^( ) In technology and computer areas I'd guess I'm a bit higher, perhaps 98 percentile or so. 95 percentile means about 19 out of 20 people score lower, 98 percentile is 49 out of 50.
But, this level of attainment presents its own set of difficulties, difficulties I'm intimately familiar with, but obviously not to the level these /real/ geniuses, the big hero coders of our community, are. I still remember the day I actually realized what dealing with a mentally challenged individual actually was, back in about 8th grade or so. He had come to visit a next door neighbor and we set out to climb a local butte, me not yet understanding his difficulty -- I knew there was /something/ different about him, but I didn't know what, I just accepted it, and him, as basically my equal, as I had been taught to accept and treat everyone. But climbing this butte didn't simply involve a hike, as is the case with many hills/buttes. It involved a bit of relatively minor technical climbing, "chimneying", etc. I had done it with a group previously, but wanted to try it again, for the exercise and challenge. But I didn't want to do it alone, and this guy was agreeable to trying it, so we set out. Everything went well, considering, but it did take somewhat longer than I had planned and our ride back got a bit worried and alerted the authorities. Fortunately, they didn't have to pull us off the mountain (or scrape us from the bottom of the chimney), but we got in a bit of trouble. When I got home, Mom asked me why on earth I'd take a r* guy up a mountain like that. I was flabbergasted! I didn't know! And to think I took him on that climb that was slightly challenging for me (something I'm not sure my Mom knew, and that I didn't tell her!), what must it have been for him? I was perhaps rather fortunate something /didn't/ happen, altho now I realize that despite (or even perhaps because of) his challenge he was remarkably resilient, and may well have picked himself up and continued better than I would have if something had gone wrong and either one of us was hurt. That night or perhaps the next day, as I thought about it, I realized what had happened. I was so used to, as a matter of course, dropping to whatever level was required to meet people at their own level and treat them as equal, that I didn't realize I was even doing it. To me it was just the way one interacted with others. What I had originally noticed different about him, that I couldn't put into words before as I simply didn't have the experience or concept, was that I had to drop a bit more than normal, but I was so used to doing it for pretty much everyone, that I didn't even realize I was doing it, or know what it was... until I was forcibly confronted with the fact that this guy was (to others) noticeably below average. But to me he was simply a bit more of the normal that I always did, and that I thought was just the way it was to interact with /anyone/. Since then I've obviously become a bit wiser in the ways of the world, but realistically, I really do seldom meet people /really/ my equal in the real world, and that has really distorted my experience, and to some extent my attitude and picture of the world. But that was only experiencing the one side. I consider myself fortunate to have actually had the opposite experience as well. A bit over a decade ago I was with a Linux- and Unix-friendly ISP that had a lot of real knowledgeable folks as customers, including one guy that was one of only about a dozen with direct commit privs to one of the BSDs, and several others that were in the same knowledge class.
While I may well be in the 95-97 percentile range, for the first time in my life I was well outranked, as several of these guys were at the 99th percentile or better I'm sure, plus they had likely decades of experience I didn't (as a newbie fresh from the MS side of the track) have! That was a humbling experience indeed! To that point, I had been used to being at least /equal/ to pretty much anyone I met, and enough above most that even if I happened to be wrong I knew more about the situation than pretty much anyone else, that I could out-argue them even in my wrongness. Here the situation was exactly reversed, *I* was the know-nothing, the slow guy that everyone else had to wait for while someone patiently explained what was going on so I could follow along! I **VERY** **QUICKLY** learned how to shut up and simply read the discussion as it happened around me, learning from the masters and occasionally asking a question or two, and to be *VERY* sure I could back up any claims I DID make, because if I was wrong, for the first time in my life I was pretty much guaranteed to be called on it, and there was no bluffing my way out of that fix with THESE guys! That had roughly the same level of effect on me as the earlier experience, but at the opposite end, something I rather badly needed as I NEEDED a bit of humbling at that point! Now here's the critical point that I've been so brutally honest to try to present: What happens to the *REAL* 99 percentilers, the guys who *NEVER* have that sort of humbling "OOPS, I screwed up and better shut up! These guys know more than me and if I'm wrong they're not afraid to demonstrate exactly why and how!" ... experience? Unfortunately, a lot of them are a**h***s! Why? Because they're at the top of their class and they know it. Nobody can prove them wrong, and if somehow someone does, they simply don't know how to react, as it's an experience they very rarely have. Even on things they know are simply opinion, they're so used to having absolutely zero peers around that can actually challenge them on it, that they simply don't know /how/ to take a real, honest challenge when it comes. Which BTW is one of the things I find so amazing about Linus Torvalds. I doubt many would argue that he's at the 99 percentile point, yet somehow he's a real person, still approachable, and unlike most folks at his level, actually able to deal with people! At the other end are people like Hans Reiser. He was and is a filesystem genius, and reiser4 was years before its time, yet never got into the kernel despite years of trying, because he was absolutely horrible at interpersonal relations and nobody anywhere near his level could work with him, because he simply didn't know how to be wrong. Unfortunately learning that was literally a fatal experience for his wife. =:^( Take it from someone who is in many areas 90 percentile plus, but who counts that experience sitting at the feet of /real/ masters as perhaps the single most fortunate and critical experience in his life, because he learned how to be wrong, that's NOT an easy lesson to learn, but it's an *EXTREMELY* critical lesson to learn! Think about that the next time you see something like that kernel command-line debug thing go down. Poettering and Sievers are extremely bright men, genius, top of their class. And Poettering in particular is a gifted speaker as well (researching systemd I watched a number of videos of presentations he has done on the subject, he really IS an impressive and gifted speaker!).
But, they don't take being wrong well at all, and they have a sense of entitlement due to their sense of ultimate rightness. Nevertheless, however one might dislike and distrust the personality behind them, both systemd and reiserfs (and later reiser4) were/are top of their class for their time, unique and ahead of their time in many ways. There's no arguing that. I didn't and don't like Hans Reiser, but I used his filesystem (reiser3), and still use it on my spinning rust drives altho I've switched to the still not fully mature btrfs on my newer ssds. Unlike Reiser, I don't know so much about Poettering's and Sievers' personal lives and I surely hope they don't end up where Reiser did for similar reasons. But similar to Reiser, I use their software, systemd, now. And there's no arguing the fact, it's /good/, even if not exactly stable, because they continue to "gray goo" anything in their path, and haven't yet taken the time necessary to truly stabilize the thing. While I never used it, from what I have read, PulseAudio was much the same way as long as Poettering was involved -- it never truly stabilized until he lost interest and left. Unfortunately I think that's likely to be the case with systemd as well; it won't really stabilize until Poettering loses interest and moves on to something else. And for people who depend on stable, I really doubt I'll be able to recommend it (if you can avoid the gray goo, I really don't know if that will remain possible if he doesn't lose interest in another couple years) until then. But it /is/ good, good enough it's taking the Linux world by storm, gray goo and all. If systemd could just be left alone to stabilize for a year or so, I think it'd be good, /incredibly/ good, and a lot of holdouts, like I was until recently, would find little reason not to switch, once it was allowed to stabilize. But when that's likely to happen (presumably after Poettering moves on), I really haven't the foggiest. Meanwhile I'm leading edge enough (I'm running git kde4 and kernel, after all), and (fortunately) good enough at troubleshooting Linux boot issues when I have to, that I decided it was time, for me anyway. So as you can see, while I've succumbed now, I really do still have mixed feelings on it all. But meanwhile, try applying the "do they actually know how to be wrong" theory the next time you see something happening elsewhere, too. It's surprising just how much of the FLOSS-world feuding it explains!... Tho this is one area I'd be /very/ happy if I /was/ wrong about, and suddenly all these definite top-of-their-field coders started getting along with each other! Well, we can hope, anyway (and while we're at hoping, hope the lesson in being wrong isn't data eating code teaching them how to be wrong, or security code either, as seems to have been the recent case with openssl!). -- Duncan - No HTML messages please, as they are filtered as spam. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-27 22:39 ` Bob Sanders 2014-05-27 22:58 ` Harry Holt @ 2014-05-27 23:32 ` Mark Knecht 2014-05-27 23:51 ` Marc Joliet 2 siblings, 0 replies; 30+ messages in thread From: Mark Knecht @ 2014-05-27 23:32 UTC (permalink / raw To: Gentoo AMD64 On Tue, May 27, 2014 at 3:39 PM, Bob Sanders <rsanders@sgi.com> wrote: > Mark Knecht, mused, then expounded: >> Hi all, >> The list is quiet. Please excuse me waking it up. (Or trying to...) ;-) >> >> I'm at the point where I'm a few months from running out of disk >> space on my RAID6 so I'm considering how to move forward. I thought >> I'd check in here and get any ideas folks have. Thanks in advance. >> > > Beware - if Adobe acroread is used, and you opt for a 3TB home > directory, there is a chance it will not work. Or more specifically, > acroread is still 32-bit. It's only something I've seen with the xfs > filesystem. And Adobe has ignored it for approx. 3yrs now. > acroread isn't critical to me but it does get used now and then so thanks for the heads-up. <SNIP> > > RAID 1 is fine, RAID 10 is better, but consumes 4 drives and SATA ports. Humm...I suppose I might consider building a 4-drive 1TB RAID10 from my existing 500GB RE3 drives, and then buy a couple of 2TB Red drives and do a RAID1 for data storage. If I did that I'd end up with 6 drives in the box, 4 of them old, but old ain't necessarily bad. ;-) However, that forces me to manage what data goes where instead of just a big, flat RAID1, which is going to be easy to live with. Still, it would probably save some money. <SNIP> > > If you change, do not use ZFS and possibly BTRFS if the system does not > have ECC DRAM. A single, unnoticed, ECC error can corrupt the data pool > and be written to the file system, which effectively renders it corrupt > without a way to recover. Thanks. No ECC and no real interest in doing anything very exotic. > > FWIW - a Synology DS414slim can hold 4 x 1TB WD Red NAS 2.5" drives and > provide a boot of nfs or iSCSI to your VMs. The downside is the NAS box > and drives would go for a bit north of $636. The upside is all your > movies and VM files could move off your workstation and the workstation > would still host the VMs via a mount of the NAS box. > NAS is an interesting idea. I'll do a little study but my initial feeling is that it's more money than I really want to spend. Summer's coming. Time for Margaritas! Thanks, Mark ^ permalink raw reply [flat|nested] 30+ messages in thread
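For scale, the two-array layout Mark muses about above would look something like this with mdadm. A sketch only: the device names are hypothetical, and mdadm --create is destructive, so double-check the names against your own hardware first:

    # four old 500GB RE3 drives as a ~1TB RAID10
    mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1

    # two new Red drives as the big, flat RAID1 for bulk data
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sd[ef]1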
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-27 22:39 ` Bob Sanders 2014-05-27 22:58 ` Harry Holt 2014-05-27 23:32 ` [gentoo-amd64] Soliciting new RAID ideas Mark Knecht @ 2014-05-27 23:51 ` Marc Joliet 2014-05-28 15:26 ` Bob Sanders 2 siblings, 1 reply; 30+ messages in thread From: Marc Joliet @ 2014-05-27 23:51 UTC (permalink / raw To: gentoo-amd64 [-- Attachment #1: Type: text/plain, Size: 1618 bytes --] Am Tue, 27 May 2014 15:39:38 -0700 schrieb Bob Sanders <rsanders@sgi.com>: > Mark Knecht, mused, then expounded: [...] > > Beyond this I need to talk file system types. I'm fat dumb and > > happy with Ext4 and don't really relish dealing with new stuff but > > now's the time to at least look. > > > > If you change, do not use ZFS and possibly BTRFS if the system does not > have ECC DRAM. A single, unnoticed, ECC error can corrupt the data pool > and be written to the file system, which effectively renders it corrupt > without a way to recover. [...] As someone who recently switched an mdraid to BTRFS (with / on EXT4 on an SSD, which will be migrated at a later point, once I feel more at ease with BTRFS), I was curious about this, so I googled it. I found two threads, [0] and [3], which dispute (and most likely refute) this notion that BTRFS is more susceptible to memory errors than other file systems. While I am far from a filesystem/storage expert (I see myself as a mere user), the cited threads lead me to believe that this is most likely an overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would suggest reading them in their entirety. [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 HTH -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-27 23:51 ` Marc Joliet @ 2014-05-28 15:26 ` Bob Sanders 2014-05-28 15:28 ` Bob Sanders ` (2 more replies) 0 siblings, 3 replies; 30+ messages in thread From: Bob Sanders @ 2014-05-28 15:26 UTC (permalink / raw To: gentoo-amd64 Marc Joliet, mused, then expounded: > Am Tue, 27 May 2014 15:39:38 -0700 > schrieb Bob Sanders <rsanders@sgi.com>: > > While I am far from a filesystem/storage expert (I see myself as a mere user), > the cited threads lead me to believe that this is most likely an > overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would > suggest reading them in their entirety. > > [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 > [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 > [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 > [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 > FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad memory bit and no ECC memory: http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/ Thanks Mark! Interesting discussion on btrfs. Bob > HTH > -- > Marc Joliet > -- > "People who think they know everything really annoy those of us who know we > don't" - Bjarne Stroustrup -- - ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-28 15:26 ` Bob Sanders @ 2014-05-28 15:28 ` Bob Sanders 2014-05-28 16:10 ` Rich Freeman 2014-05-28 19:20 ` Marc Joliet 2 siblings, 0 replies; 30+ messages in thread From: Bob Sanders @ 2014-05-28 15:28 UTC (permalink / raw To: gentoo-amd64 Bob Sanders, mused, then expounded: > > Marc Joliet, mused, then expounded: > > Am Tue, 27 May 2014 15:39:38 -0700 > > schrieb Bob Sanders <rsanders@sgi.com>: > > > > While I am far from a filesystem/storage expert (I see myself as a mere user), > > the cited threads lead me to believe that this is most likely an > > overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would > > suggest reading them in their entirety. > > > > [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 > > [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 > > [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 > > [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 > > > > FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad > memory bit and no ECC memory: > > http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/ > > > Thanks Mark! Interesting discussion on btrfs. > Apologies - that should have been - Thanks Marc! > Bob > > > HTH > > -- > > Marc Joliet > > -- > > "People who think they know everything really annoy those of us who know we > > don't" - Bjarne Stroustrup > > > > -- > - > > -- - ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-28 15:26 ` Bob Sanders 2014-05-28 15:28 ` Bob Sanders @ 2014-05-28 16:10 ` Rich Freeman 2014-05-28 19:20 ` Marc Joliet 2 siblings, 0 replies; 30+ messages in thread From: Rich Freeman @ 2014-05-28 16:10 UTC (permalink / raw To: gentoo-amd64 On Wed, May 28, 2014 at 11:26 AM, Bob Sanders <rsanders@sgi.com> wrote: > Marc Joliet, mused, then expounded: >> Am Tue, 27 May 2014 15:39:38 -0700 >> schrieb Bob Sanders <rsanders@sgi.com>: >> >> While I am far from a filesystem/storage expert (I see myself as a mere user), >> the cited threads lead me to believe that this is most likely an >> overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would >> suggest reading them in their entirety. >> >> [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 >> [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 >> [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 >> [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 >> > > FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad > memory bit and no ECC memory: > > http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/ > I don't think that anybody debates that if you use btrfs/zfs with non-ECC RAM you can potentially lose some of the protection afforded by the checksumming. What I'd question is that this is some concern unique to btrfs/zfs. I'd think the same failure modes would all apply to any other filesystem. So, the message should be that ECC RAM is better than non-ECC RAM, not that those who use non-ECC RAM are better off using ext4 instead of zfs/btrfs. I'd think that any RAM-related issue that would impact zfs/btrfs would affect ext4 just as badly, and with ext4 you're also vulnerable to all the non-RAM-related errors that checksumming was created to solve. If your RAM is bad then all kinds of stuff can go wrong. Ditto for your cache memory in the CPU, logic circuitry in the CPU, your busses, etc. Most systems are not fault-tolerant of these system components and the cost to make them fault-tolerant tends to be fairly high. On the other hand, the good news is that you're far more likely to have problems with data stored on a disk than in RAM, which is probably why we haven't bothered to improve the other components. Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-28 15:26 ` Bob Sanders 2014-05-28 15:28 ` Bob Sanders 2014-05-28 16:10 ` Rich Freeman @ 2014-05-28 19:20 ` Marc Joliet 2014-05-28 19:56 ` Bob Sanders 2014-05-29 7:08 ` [gentoo-amd64] " Duncan 2 siblings, 2 replies; 30+ messages in thread From: Marc Joliet @ 2014-05-28 19:20 UTC (permalink / raw To: gentoo-amd64 [-- Attachment #1: Type: text/plain, Size: 4318 bytes --] Am Wed, 28 May 2014 08:26:58 -0700 schrieb Bob Sanders <rsanders@sgi.com>: > > Marc Joliet, mused, then expounded: > > Am Tue, 27 May 2014 15:39:38 -0700 > > schrieb Bob Sanders <rsanders@sgi.com>: > > > > While I am far from a filesystem/storage expert (I see myself as a mere user), > > the cited threads lead me to believe that this is most likely an > > overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would > > suggest reading them in their entirety. > > > > [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 > > [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 > > [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 > > [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 > > > > FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad > memory bit and no ECC memory: > > http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/ Thanks for explicitly linking that. I didn't read it the first time around, but just read through most of it, then reread the threads [0] and [3] above and *think* that I understand the problem (and how it doesn't apply to BTRFS) better now. IIUC, the claim is: data is written to disk, but it must go through the RAM first, obviously, where it is corrupted (due to a permanent bit flip caused, e.g., by deteriorating hardware). At some later point, when the data is read back from disk, it might happen to load around the damaged location in RAM, where it is further corrupted. At this point the checksum fails, and ZFS corrects the data in RAM (using parity information!), where it is immediately corrupted again (because apparently it is corrected at the same physical location in RAM? perhaps this is specific to correction via parity?). This *additionally* corrupted data is then written back to disk (without any further checks). So the point is that, apparently, without ECC RAM, you could get a (long-term) cascade of errors, especially during a scrub. The likelihood of such permanent RAM corruption happening in the first place is another question entirely. The various posts in [0] then basically say that regardless of whether this really is true of ZFS, it certainly doesn't apply to BTRFS, for various reasons. I suppose this quote from [1] (see above) says it most clearly: > In hxxp://forums.freenas.org/threads/ecc-vs-non-ecc-ram-and-zfs.15449, they talk about > reconstructing corrupted data from parity information: > > > Ok, no problem. ZFS will check against its parity. Oops, the parity failed since we have a new corrupted > bit. Remember, the checksum data was calculated after the corruption from the first memory error > occurred. So now the parity data is used to "repair" the bad data. So the data is "fixed" in RAM. > > i.e. that there is parity information stored with every piece of data, and ZFS will "correct" errors > automatically from the parity information. I start to suspect that there is confusion here between > checksumming for data integrity and parity information. 
If this is really how ZFS works, then if memory > corruption interferes with this process, then I can see how a scrub could be devastating. I don't know if > ZFS really works like this. It sounds very odd to do this without an additional checksum check. This sounds > very different to what you say below that btrfs does, which is only to check against redundantly-stored > copies, which I agree sounds much safer. The rest is also relevant, but I think the fact that the data is corrected via parity information, as opposed to using a known-good redundant copy of the data (which I originally missed, and thus got confused), is the key point in understanding the (supposed) difference in behaviour between ZFS and BTRFS. All this assumes, of course, that the FreeNAS forum post that ignited this discussion is correct in the first place. > Thanks Mark! Interesting discussion on btrfs. > > Bob You're welcome! I agree, it's an interesting discussion. And regarding the misspelling of my name: no problem :-) . -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
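Since a scrub is the operation at the center of this scenario, for reference this is what one looks like on btrfs, with a hypothetical mount point. On a multi-device btrfs filesystem a scrub verifies checksums on every copy and rewrites a bad block from a checksum-valid copy where one exists:

    btrfs scrub start /mnt/pool     # kick off a background scrub
    btrfs scrub status /mnt/pool    # progress plus error counters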
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-28 19:20 ` Marc Joliet @ 2014-05-28 19:56 ` Bob Sanders 0 siblings, 0 replies; 30+ messages in thread From: Bob Sanders @ 2014-05-28 19:56 UTC (permalink / raw To: gentoo-amd64 Marc Joliet, mused, then expounded: > Am Wed, 28 May 2014 08:26:58 -0700 > schrieb Bob Sanders <rsanders@sgi.com>: > > > > > Marc Joliet, mused, then expounded: > > > Am Tue, 27 May 2014 15:39:38 -0700 > > > schrieb Bob Sanders <rsanders@sgi.com>: > > > > > > While I am far from a filesystem/storage expert (I see myself as a mere user), > > > the cited threads lead me to believe that this is most likely an > > > overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would > > > suggest reading them in their entirety. > > > > > > [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 > > > [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 > > > [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 > > > [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 > > > > > > > FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad > > memory bit and no ECC memory: > > Just to beat this dead horse some more, an analysis of an academic study on drive failures - http://storagemojo.com/2007/02/20/everything-you-know-about-disks-is-wrong/ And it links to the actual study here - https://www.usenix.org/legacy/events/fast07/tech/schroeder.html - which shows that memory has a fairly high failure rate as well, though the focus is on hard drives. > > http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/ > > Thanks for explicitly linking that. I didn't read it the first time around, > but just read through most of it, then reread the threads [0] and [3] above and > *think* that I understand the problem (and how it doesn't apply to BTRFS) > better now. > > IIUC, the claim is: data is written to disk, but it must go through the RAM > first, obviously, where it is corrupted (due to a permanent bit flip caused, > e.g., by deteriorating hardware). At some later point, when the data is read > back from disk, it might happen to load around the damaged location in RAM, > where it is further corrupted. At this point the checksum fails, and ZFS > corrects the data in RAM (using parity information!), where it is immediately > corrupted again (because apparently it is corrected at the same physical > location in RAM? perhaps this is specific to correction via parity?). This > *additionally* corrupted data is then written back to disk (without any further > checks). > > So the point is that, apparently, without ECC RAM, you could get a (long-term) > cascade of errors, especially during a scrub. The likelihood of such permanent > RAM corruption happening in the first place is another question entirely. > > The various posts in [0] then basically say that regardless of whether this > really is true of ZFS, it certainly doesn't apply to BTRFS, for various > reasons. I suppose this quote from [1] (see above) says it most clearly: > > > In hxxp://forums.freenas.org/threads/ecc-vs-non-ecc-ram-and-zfs.15449, they talk about > > reconstructing corrupted data from parity information: > > > > > Ok, no problem. ZFS will check against its parity. Oops, the parity failed since we have a new corrupted > > bit. Remember, the checksum data was calculated after the corruption from the first memory error > > occurred.
So now the parity data is used to "repair" the bad data. So the data is "fixed" in RAM. > > > > i.e. that there is parity information stored with every piece of data, and ZFS will "correct" errors > > automatically from the parity information. I start to suspect that there is confusion here between > > checksumming for data integrity and parity information. If this is really how ZFS works, then if memory > > corruption interferes with this process, then I can see how a scrub could be devastating. I don't know if > > ZFS really works like this. It sounds very odd to do this without an additional checksum check. This sounds > > very different to what you say below that btrfs does, which is only to check against redundantly-stored > > copies, which I agree sounds much safer. > > The rest is also relevant, but I think the point that the data is corrected via > parity information, as opposed to using a known-good redundant copy of the data > (which I originally missed, and thus got confused), is the key point in > understanding the (supposed) difference in behaviour between ZFS and BTRFS. > > All this assumes, of course, that the FreeNAS forum post that ignited this > discussion is correct in the first place. > > > Thanks Mark! Interesting discussion on btrfs. > > > > Bob > > You're welcome! I agree, it's an interesting discussion. And regarding the > misspelling of my name: no problem :-) . > > -- > Marc Joliet > -- > "People who think they know everything really annoy those of us who know we > don't" - Bjarne Stroustrup -- - ^ permalink raw reply [flat|nested] 30+ messages in thread
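For contrast, here is a toy sketch of the redundant-copy behaviour the btrfs-list posts describe: a mirror copy is only used for repair if it independently passes its own checksum. Again, this is illustrative pseudologic with an invented checksum (btrfs actually uses crc32c), not btrfs's real scrub code:

    from typing import List, Optional

    def checksum(data: bytes) -> int:
        """Same invented stand-in checksum as above; btrfs really uses crc32c."""
        return sum(data) & 0xFFFFFFFF

    def scrub_mirror_block(copies: List[bytes], stored_sum: int) -> Optional[bytes]:
        """Toy model of redundant-copy repair: a copy is only treated as
        good if it verifies against the stored checksum by itself.  If no
        copy verifies, the error is reported rather than 'repaired'."""
        for copy in copies:
            if checksum(copy) == stored_sum:
                return copy   # verified copy: safe to rewrite the bad mirror
        return None           # unrecoverable: flag it, never write back a guess

    # One mirror copy is corrupt, the other still verifies, so repair is safe.
    good, bad = b"important data", b"importent data"
    assert scrub_mirror_block([bad, good], checksum(good)) == good

The design difference the thread is circling around: in this model nothing unverified is ever promoted to "good" and written back, so a RAM fault can spoil a single repair attempt but cannot silently bless corrupted data.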
* [gentoo-amd64] Re: Soliciting new RAID ideas
  2014-05-28 19:20 ` Marc Joliet
  2014-05-28 19:56 ` Bob Sanders
@ 2014-05-29  7:08 ` Duncan
  1 sibling, 0 replies; 30+ messages in thread
From: Duncan @ 2014-05-29 7:08 UTC (permalink / raw)
To: gentoo-amd64

Marc Joliet posted on Wed, 28 May 2014 21:20:18 +0200 as excerpted:

> On Wed, 28 May 2014 08:26:58 -0700, Bob Sanders
> <rsanders@sgi.com> wrote:
>
>> Marc Joliet, mused, then expounded: [snipped]
>
>> Thanks Mark! Interesting discussion on btrfs.
>>
>> [followup] Apologies - that should have been - Thanks Marc!
>
> You're welcome! I agree, it's an interesting discussion. And regarding
> the misspelling of my name: no problem :-) .

=:^)

But seriously, thanks, Bob, for pointing out the misspelling. There's a
Mark (with a k) who's quite active on the btrfs list (and has in fact
done quite a bit of testing on the raid56 stuff, and written most of
several related pages on the btrfs wiki), and I guess my brain has so
associated him with the btrfs discussion context that, without actually
thinking about it, I assumed this was the same "Mark" here.

So pointing out that it's actually Marc-with-a-c here alerted me to the
fact that it's not the same person, and very possibly saved a very
confused Duncan from making quite a fool of himself in some future post,
either here or there. So thanks VERY MUCH, Bob! =:^)

(FWIW, my first name is John. But at least in my generation there are so
many Johns around, and Duncan as a last name isn't uncommon either, that
there are quite a few John Duncans around too, and it's all horribly
confusing. I even worked with a Donna at one point, and in a fairly noisy
environment all you hear of either name is the -on- bit, so we were
always both or neither answering calls for either one of us, since
neither of us could easily hear which name had actually been called. So I
switched to the mononym "Duncan", which has been MUCH less confusing over
the decades I've been using it. Anyway, I can definitely identify with
first-name confusion.) =:^)

--
Duncan - List replies preferred.  No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas
  2014-05-27 22:13 [gentoo-amd64] Soliciting new RAID ideas Mark Knecht
  2014-05-27 22:39 ` Bob Sanders
@ 2014-05-27 23:05 ` Alex Alexander
  1 sibling, 0 replies; 30+ messages in thread
From: Alex Alexander @ 2014-05-27 23:05 UTC (permalink / raw)
To: Gentoo-amd64

[-- Attachment #1: Type: text/plain, Size: 1400 bytes --]

On Wed, May 28, 2014 at 1:13 AM, Mark Knecht <markknecht@gmail.com> wrote:

> 1) Buy three (or even just two) 5400 RPM 3TB WD Red drives and go with
> RAID1. This would use the internal SATA2 ports so it wouldn't be the
> highest performance but likely a lot better than my SATA2 RAID6.
>

This. Thinking ahead is important - drives tend to fill up faster when
you have more free space available. Get three drives if possible, go
RAID5, then when you run out of space (you will), just add one more and
you're happy again.

This setup has one more advantage: you get to keep your old drives and
re-use them. One interesting idea would be to use 3 of your old drives in
a RAID5 setup for Gentoo. It wouldn't be as fast as a couple of SSDs, but
you're already used to that speed, and you instantly get two spare drives
in case one of the old drives fails. You could also use the spare space
on this array for backups of critical stuff from the main RAID.

You can always switch to SSDs for the main system later :)

> Beyond this I need to talk file system types. I'm fat dumb and
> happy with Ext4 and don't really relish dealing with new stuff but
> now's the time to at least look.

New tech is nice, but I'd stick with ext4. Data is one of the few things
on my systems that I don't like to toy with.

Cheers,
--
Alex Alexander
+ wired
+ www.linuxized.com
+ www.leetworks.com

[-- Attachment #2: Type: text/html, Size: 2248 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread
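To put rough numbers on the grow-as-you-go suggestion, a back-of-envelope sketch (actual usable sizes will come out somewhat lower once md metadata and filesystem overhead are taken out):

    def raid5_usable(drives: int, size_tb: float) -> float:
        """RAID5 usable capacity: one drive's worth of space goes to parity."""
        return (drives - 1) * size_tb

    print(raid5_usable(3, 3.0))  # three 3TB Reds          -> 6.0 TB usable
    print(raid5_usable(4, 3.0))  # grow by one more drive  -> 9.0 TB usable
    print(raid5_usable(3, 0.5))  # three old 500GB drives  -> 1.0 TB for Gentoo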
end of thread, other threads: [~2014-06-06 17:07 UTC | newest]

Thread overview: 30+ messages

2014-05-27 22:13 [gentoo-amd64] Soliciting new RAID ideas Mark Knecht
2014-05-27 22:39 ` Bob Sanders
2014-05-27 22:58 ` Harry Holt
2014-05-27 23:38 ` thegeezer
2014-05-28  0:26 ` Rich Freeman
2014-05-28  3:12 ` [gentoo-amd64] btrfs Was: " Duncan
2014-05-28  7:29 ` thegeezer
2014-05-28 20:32 ` Marc Joliet
2014-05-29  6:41 ` [gentoo-amd64] " Duncan
2014-05-29 17:57 ` Marc Joliet
2014-05-29 17:59 ` Rich Freeman
2014-05-29 18:25 ` Mark Knecht
2014-05-29 21:05 ` Frank Peters
2014-05-30  2:04 ` [gentoo-amd64] amd64 list, still useful? Was: btrfs Duncan
2014-05-30  2:44 ` Frank Peters
2014-05-30  6:25 ` [gentoo-amd64] " Duncan
2014-06-04 16:41 ` [gentoo-amd64] " Mark Knecht
2014-06-05  2:00 ` [gentoo-amd64] " Duncan
2014-06-05 18:59 ` Mark Knecht
2014-06-06 12:11 ` Duncan
[not found] ` <Alo71o01J1aVA4001lo9xP>
2014-06-06 17:07 ` Duncan
2014-05-27 23:32 ` [gentoo-amd64] Soliciting new RAID ideas Mark Knecht
2014-05-27 23:51 ` Marc Joliet
2014-05-28 15:26 ` Bob Sanders
2014-05-28 15:28 ` Bob Sanders
2014-05-28 16:10 ` Rich Freeman
2014-05-28 19:20 ` Marc Joliet
2014-05-28 19:56 ` Bob Sanders
2014-05-29  7:08 ` [gentoo-amd64] " Duncan
2014-05-27 23:05 ` [gentoo-amd64] " Alex Alexander