* [gentoo-amd64] Soliciting new RAID ideas
From: Mark Knecht @ 2014-05-27 22:13 UTC
To: Gentoo AMD64

Hi all,
   The list is quiet. Please excuse me waking it up. (Or trying to...) ;-)

   I'm at the point where I'm a few months from running out of disk space on my RAID6, so I'm considering how to move forward. I thought I'd check in here and get any ideas folks have. Thanks in advance.

   The system is 64-bit Gentoo, mostly stable, using an i7-980x Extreme Edition processor with 24GB DRAM. Large chassis, 6 removable HD bays, room for 6 other drives, a large power supply.

   The disk subsystem is a 1.4TB RAID6 built from five SATA2 500GB WD RAID Edition 3 drives. The RAID has not had a single glitch in the 4+ years I've used this machine.

   Generally there are 4 classes of data on the RAID:

1) Gentoo (obviously), configs backed up every weekend. I plan to rebuild from scratch using existing configs if there's a failure. Being down for a couple of days is not an issue.
2) VMs - about 300GB. Loaded every morning, stopped & saved every night, backed up every weekend.
3) Financial data - lots of it - stocks, futures, options, etc. Performance requirements are pretty low. Backed up every weekend.
4) Video files - backed up to a different location than items 1/2/3 whenever there are changes.

   After eclean-dist/eclean-pkg I'm down to about 80GB free, and this will fill up in 3-6 months, so it's time to make some changes.

   My thoughts:

1) Buy three (or even just two) 5400 RPM 3TB WD Red drives and go with RAID1. This would use the internal SATA2 ports, so it wouldn't be the highest performance, but likely a lot better than my SATA2 RAID6.

2) Buy two 7200 RPM 3TB WD Red drives and an LSI Logic hardware RAID controller. This would be SATA3, so probably way more performance than I have now. MUCH more expensive, though.

3) #1 + an SSD. I have an unused 120GB SSD, so I could get another, make a 2-disk RAID1, put Gentoo on that, and everything else on the newer 3TB drives. More complex, probably lower reliability, and I'm not sure I gain much.

   Beyond this I need to talk file system types. I'm fat, dumb, and happy with Ext4 and don't really relish dealing with new stuff, but now's the time to at least look.

   Anyway, that's the basic outline. Any thoughts, ideas, corrections, expansions, etc., I'm very interested in talking about.

Cheers,
Mark
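For concreteness, option #1 above maps to a short mdadm sequence, roughly as follows (a minimal sketch only; /dev/sdb1 and /dev/sdc1 are placeholder partitions, and note that drives over 2TiB need GPT partition tables):

    # partition each new 3TB drive with a single GPT partition first,
    # then assemble the two partitions into a RAID1 array
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
    mkfs.ext4 /dev/md0
    # record the array so it assembles at boot
    mdadm --detail --scan >> /etc/mdadm.conf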
* Re: [gentoo-amd64] Soliciting new RAID ideas
From: Bob Sanders @ 2014-05-27 22:39 UTC
To: gentoo-amd64

Mark Knecht, mused, then expounded:
> Hi all,
>    The list is quiet. Please excuse me waking it up. (Or trying to...) ;-)
>
>    I'm at the point where I'm a few months from running out of disk
> space on my RAID6, so I'm considering how to move forward. [...]

Beware - if Adobe acroread is used, and you opt for a 3TB home directory, there is a chance it will not work. Or more specifically, acroread is still 32-bit. It's only something I've seen with the xfs filesystem. And Adobe has ignored it for approx. 3 yrs now.

> 1) Buy three (or even just two) 5400 RPM 3TB WD Red drives and go with
> RAID1. [...]
>
> 2) Buy two 7200 RPM 3TB WD Red drives and an LSI Logic hardware RAID
> controller. [...]

RAID 1 is fine; RAID 10 is better, but consumes 4 drives and SATA ports.

>    Beyond this I need to talk file system types. I'm fat, dumb, and
> happy with Ext4 and don't really relish dealing with new stuff, but
> now's the time to at least look.

If you change, do not use ZFS, and possibly BTRFS, if the system does not have ECC DRAM. A single, unnoticed memory error (the kind ECC would have caught) can corrupt the data pool and be written to the file system, which effectively renders it corrupt without a way to recover.

FWIW - a Synology DS414slim can hold 4 x 1TB WD Red NAS 2.5" drives and provide boot over NFS or iSCSI to your VMs. The downside is the NAS box and drives would go for a bit north of $636. The upside is all your movies and VM files could move off your workstation, and the workstation would still host the VMs via a mount of the NAS box.
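For reference, the RAID 10 Bob mentions is essentially a one-liner with mdadm (a sketch; the four device names are placeholders):

    # four partitions, striped across two mirrored pairs;
    # the default "near-2" layout keeps each block mirrored on 2 of the 4 drives
    mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[b-e]1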
* Re: [gentoo-amd64] Soliciting new RAID ideas
From: Harry Holt @ 2014-05-27 22:58 UTC
To: gentoo-amd64

On May 27, 2014 6:39 PM, "Bob Sanders" <rsanders@sgi.com> wrote:
> FWIW - a Synology DS414slim can hold 4 x 1TB WD Red NAS 2.5" drives and
> provide boot over NFS or iSCSI to your VMs. The downside is the NAS box
> and drives would go for a bit north of $636. The upside is all your
> movies and VM files could move off your workstation, and the workstation
> would still host the VMs via a mount of the NAS box.

+1 for the Synology NAS boxes, those things are awesome: fast, reliable, upgradable (if you buy a larger one), and the best value available for iSCSI-attached VMs.
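For anyone curious what attaching such a LUN looks like from the Gentoo side, open-iscsi does it in two steps (a sketch; the portal address and target IQN are placeholders for whatever the NAS actually exposes):

    # discover targets offered by the NAS
    iscsiadm -m discovery -t sendtargets -p 192.168.1.50
    # log in; the LUN then shows up as an ordinary /dev/sdX block device
    iscsiadm -m node -T iqn.2000-01.com.synology:nas.vmstore -p 192.168.1.50 --login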
* Re: [gentoo-amd64] Soliciting new RAID ideas
From: thegeezer @ 2014-05-27 23:38 UTC
To: gentoo-amd64

On 2014-05-27 23:58, Harry Holt wrote:
[...]
> > FWIW - a Synology DS414slim can hold 4 x 1TB WD Red NAS 2.5" drives and
> > provide boot over NFS or iSCSI to your VMs. [...]
>
> +1 for the Synology NAS boxes, those things are awesome: fast,
> reliable, upgradable (if you buy a larger one), and the best value
> available for iSCSI-attached VMs.

while i agree on the +1 for iscsi storage, there are a few drawbacks. yes the modularity is awesome primarily -- super simple to spin up a backup system and "move" data with a simple connection command. also a top tip would be to have the "data" part of the vm as an iscsi connection too, so you can easily detach/reattach it to another vm. however, depending on the vm's you have, you will probably start needing more than one gigabit connection to max out speeds: 1-gigabit ethernet is not the same as 6-gigabit sata3, and spinning rust is not the same as ssd.

looking at the spec of the existing workstation, i'd be tempted to stay with mdadm rather than a hardware raid card (which is probably running embedded anyway) -- though with that i7 you have disabled turboboost, right? what would be an interesting comparison is pci-express speed vs motherboard sata - cpu bridge speed; obviously spinning disks will not max 6gbit, and the motherboard may not give you 6x 6gbit real throughput, whereas dedicated hardware raid _might_ do if it had intelligent caching.

other fun to look at would be lvm, cos i personally think it's awesome. for an example: the first half of a spinning disk is substantially faster than the second half, due to the tracks on the outer part, so i split each disk into three partitions -- fast, med, slow -- and add them to an lvm volume group. you can then group the fasts into a raid, the mediums into a raid, and the slows into a raid too (see the sketch below); mdadm allows similar configs with partitions.

ZFS for me lost its lustre when the minimum requirement was 1GB RAM per terabyte... i may have my gigabytes and gigabits mixed up on this one, happy for someone to correct me. BTRFS looks very very interesting to me, though i've still not played with it -- mostly for the checksums; the rest i can do with lvm.

you might also like to consider fun with deduplication, by having a raid base, with lvm on top, with block-level dedupe ala lessfs, then lvm inside the deduped lvm (yeah i know i'm sick, but the doctor tells me the layers of abstraction eventually combine happily :) but i'm not sure you'll get much benefit from virtual machines and movies being deduped. if you add an ssd into the mix you can also look at devicemapper caches such as bcache and dm-cache, or even just moving the journal of your ext4 partition there instead.

crucially you need to think about the issues you _need_ to solve and those that you would like to solve. space is obviously one issue, and performance is not really an issue for you. depending on your budget, a pair of large sata drives + mdadm will be ideal; if you had lvm already you could simply 'move' then 'enlarge' your existing stuff (tm) : i'd like to know how btrfs would do the same, for anyone who can let me know. you have raid6 because you probably know that raid5 is just waiting for trouble, so i'd probably start looking at btrfs for your financial data to be checksummed.
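a minimal sketch of that fast/med/slow split, assuming two disks each already carved into three partitions (outer tracks first); all device and volume names here are placeholders:

    # mirror like-for-like speed tiers across the two disks
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1   # fast tier
    mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2   # medium tier
    mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3   # slow tier
    # pool all three tiers into one volume group
    pvcreate /dev/md1 /dev/md2 /dev/md3
    vgcreate vg0 /dev/md1 /dev/md2 /dev/md3
    # pin a volume to a specific tier by naming the PV it may allocate from
    lvcreate -L 300G -n vms vg0 /dev/md1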
also consider ECC memory if your motherboard supports it. never mind the hosing of filesystems: if you are running vm's you do _not_ want memory making them behave oddly or worse, and if you have lots of active financial data (bloomberg + analytics) you run the risk of the butterfly effect producing odd results.
* Re: [gentoo-amd64] Soliciting new RAID ideas
From: Rich Freeman @ 2014-05-28 0:26 UTC
To: gentoo-amd64

On Tue, May 27, 2014 at 7:38 PM, <thegeezer@thegeezer.net> wrote:
> if you had lvm already you could
> simply 'move' then 'enlarge' your existing stuff (tm)

Yup - if you're not running btrfs/zfs you probably should be running lvm. One thing I would do is back up your lvm metadata when it changes - I once got burned by an lvm error of some kind, and an fsck scrambled the living daylights out of my disk (an fsck on one ext3 partition scrambled a different partition). That is pretty rare though (but I did find one or two mentions online of similar situations).

> : i'd like to know how
> btrfs would do the same for anyone who can let me know.

A btrfs filesystem pools storage. You can add devices to the pool, and remove devices from the pool. If you remove a device with data on it, the data will get moved. When adding devices btrfs does not automatically shuffle data around - you can issue a balance command to do so, but I wouldn't do this until you're done adding/removing drives. A nice thing about btrfs is that devices do not have to be of the same size, and it generally does the right thing.

The downside of btrfs right now for raid is that raid5/6 are still very experimental. They will support reshaping though, which is one of the reasons I've stayed away from zfs. Zfs also lets you add/remove devices from a pool, but it does not allow you to reshape a raid.

Rich
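On the metadata-backup point: LVM already keeps automatic copies under /etc/lvm/archive, but an explicit backup to somewhere off the affected disks might look like this (a sketch; the volume group name vg0 and the destination path are placeholders):

    # write the current metadata for vg0 to a dated file
    vgcfgbackup -f /root/lvm-backup/vg0-$(date +%F).vg vg0
    # restore later with:
    #   vgcfgrestore -f /root/lvm-backup/vg0-2014-05-28.vg vg0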
* [gentoo-amd64] btrfs Was: Soliciting new RAID ideas
From: Duncan @ 2014-05-28 3:12 UTC
To: gentoo-amd64

thegeezer posted on Wed, 28 May 2014 00:38:03 +0100 as excerpted:

> depending on your budget, a pair of large sata drives + mdadm will be
> ideal; if you had lvm already you could simply 'move' then 'enlarge'
> your existing stuff (tm) : i'd like to know how btrfs would do the same,
> for anyone who can let me know.
> you have raid6 because you probably know that raid5 is just waiting for
> trouble, so i'd probably start looking at btrfs for your financial data
> to be checksummed.

Given that I'm a regular on the btrfs list as well as running it myself, I'm likely to know more about it than most. Here's a whirlwind rundown with a strong emphasis on practical points a lot of people miss (IOW, I'm skipping a lot of the commonly covered and obvious stuff). Point 6 below directly answers your move/enlarge question. Meanwhile, points 1, 7 and 8 are critically important, as we see a lot of people on the btrfs list getting them wrong.

1) Since there's raid5/6 discussion on the thread... Don't use btrfs raid56 modes at this time, except purely for playing around with trashable or fully backed-up data. The implementation as introduced isn't code-complete, and while the operational runtime side works, recovery from dropped devices, not so much. Thus, in terms of data safety you're effectively running a slow raid0 with lots of extra overhead, which can be considered trash if a device drops -- with the sole benefit that when the raid56-mode recovery code gets merged (and has been tested for a kernel cycle or two to work out the initial bugs), you'll then get what amounts to a "free" upgrade to the raid5 or raid6 mode you had originally configured, since it was doing the operational parity calculation and writes all along; the code to actually use them for recovery simply wasn't there yet.

2) Btrfs raid0, raid1 and raid10 modes, along with single mode (on a single device or multiple devices) and dup mode (on a single device, metadata is by default duplicated -- two copies -- except on ssd, where the default is only a single copy, since some ssds dedup anyway), are reasonably mature and stable. To the same point as btrfs in general, anyway, which is to say "mostly stable, keep your backups fresh, but you're not /too/ likely to have to use them." There are still enough bugs being fixed in each kernel release, however, that running the latest stable series is /strongly/ recommended, as your data is at risk from known-fixed bugs (even if at this point they only tend to hit the corner-cases) if you're not doing so.

3) It's worth noting that btrfs treats data and metadata separately -- when you do a mkfs.btrfs, you can configure redundancy modes separately for each, the single-device default being (as above) dup metadata (except for ssd) and single data, the multi-device default being raid1 metadata, single data.
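As a concrete illustration of point 3, the profiles are chosen at mkfs time and can be inspected afterwards (a sketch; the device names and mount point are placeholders):

    # two-device filesystem with raid1 for both data and metadata
    mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
    # after mounting, show which profile each block-group type uses
    btrfs filesystem df /mnt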
4) FWIW, most of my btrfs-formatted partitions are dual-device raid1 mode for both data and metadata, on ssd. (Second backup is reiserfs on spinning rust, just in case some Armageddon bug eats all the btrfs, working copy and first backup, at the same time; btrfs is stable enough now that's extremely unlikely, but I didn't consider it so back when I set things up nearly a year ago now.)

The reason for my raid1 mode choice isn't that of ordinary raid1; it's specifically due to btrfs' checksumming and data-integrity features -- if one copy fails its checksum, btrfs will, IF IT HAS ANOTHER COPY TO TRY, check the second copy and, if it's good, use it and rewrite the bad copy. Btrfs scrub allows checking the entire filesystem for checksum errors and restoring any errors it finds from good copies where possible.

Obviously, the default single data mode (or raid0) won't have a second copy to check and rewrite from, while raid1 (and raid10) modes will. (So will dup-mode metadata on a single device, but with one exception, dup mode isn't allowed for data, only metadata -- the exception being the mixed-blockgroup mode that mixes data and metadata together; that's the default on filesystems under 1 GiB but isn't recommended on large filesystems for performance reasons.) So I wanted a second copy of both data and metadata to take advantage of btrfs' data-integrity and scrub features, and with btrfs raid1 mode, I get both that and the traditional raid1 device-loss protection as well. =:^)

5) It's worth noting that as of now, btrfs raid1 mode is only two-way-mirrored, no matter how many devices are configured into the filesystem. N-way mirroring is the next feature on the roadmap after the raid56 work is completed, but given how nearly every btrfs feature has taken much longer to complete than originally planned, I'm not expecting it until sometime next year, now. Which is unfortunate, as my risk-vs-cost sweet spot would be 3-way mirroring, covering the case where *TWO* copies of a block fail checksum. Oh well, it's coming, even if it seems at this point like the proverbial carrot dangling off a stick held in front of the donkey.

6) Btrfs handles moving then enlarging (parallel to LVM) using btrfs device add/delete, to add or delete a device to/from a filesystem (moving the content off a to-be-deleted device in the process), plus btrfs balance, to restripe/convert/rebalance between devices as well as to free allocated-but-empty data and metadata chunks back to unallocated. There's also btrfs resize, but that's more like the conventional filesystem resize command, resizing the part of the filesystem on an individual device (partitioned/virtual or whole physical device).

So to add a device, you'd btrfs device add, then btrfs balance, with an optional conversion to a different redundancy mode if desired, to rebalance the existing data and metadata onto that device. (Without the rebalance it would be used for new chunks, but existing data and metadata chunks would stay where they were. I'll omit the "chunk definition" discussion in the interest of brevity.) To delete a device, you'd btrfs device delete, which moves all the data on that device onto the other devices in the filesystem, after which it can be removed.
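In command form, the add-then-rebalance sequence of point 6 looks roughly like this (a sketch; device names and the mount point are placeholders):

    # grow the pool onto a new device, then spread existing chunks over all devices
    btrfs device add /dev/sdd /mnt
    btrfs balance start /mnt
    # optionally convert profiles while balancing, e.g.:
    #   btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt
    # shrink: migrates all chunks off the named device before removing it
    btrfs device delete /dev/sdb /mnt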
7) Given the thread, I'd be remiss to omit this one. VM images and other large "internal-rewrite-pattern" files (large database files, etc.) need special treatment on btrfs, at least currently. As such, btrfs may not be the greatest solution for Mark (tho it would work fine with special procedures), given the several VMs he runs. This one unfortunately hits a lot of people. =:^( But here's a heads-up, so it doesn't have to hit anyone reading this! =:^)

As a property of the technology, any copy-on-write-based filesystem is going to find files where various bits of existing data within the file are repeatedly rewritten (as opposed to new data simply being appended -- think of a log file or a live-stored audio/video stream) extremely challenging to deal with. The problem is that unlike ordinary filesystems, which rewrite data in place such that a file continues to occupy the same extents as it did before, copy-on-write filesystems write a changed block to a different location.

While COW does mean atomic updates and thus more reliability, since either the new data or the old data should exist, never an unpredictable mixture of the two, as a result of the above rewrite pattern this type of internally-rewritten file gets **HEAVILY** fragmented over time. We've had filefrag reports of several-gig files with over 100K extents! Obviously, this isn't going to be the most efficient file in the world to access!

For smaller files, up to a couple hundred MiB or perhaps a bit more, btrfs has the autodefrag mount option, which can help a lot. With this option enabled, whenever a block of a file is changed and rewritten, thus written elsewhere, btrfs queues up a rewrite of the entire file to happen in the background. The rewrite is done sequentially, thus defragging the file. This works quite well for firefox's sqlite database files, for instance, as they're internal-rewrite-pattern, but small enough that autodefrag handles them reasonably nicely.

But this solution doesn't scale so well as the file size increases toward and past a GiB, particularly for files with a continuous stream of internal rewrites, such as can happen with an operating VM writing to its virtual storage device. At some point, the stream of writes comes in faster than the file can be rewritten, and things start to back up!

To deal with this case, there's the NOCOW file attribute, set with chattr +C. However, to be effective, this attribute must be set when the file is empty, before it has existing content. The easiest way to do that is to set the attribute on the directory that will contain the files. While it doesn't affect the directory itself, newly created files within that directory inherit the NOCOW attribute before they have data, thus allowing it to work without having to worry about it that much. For existing files, create a new directory, set its NOCOW attribute, and COPY (don't move, and don't use cp --reflink) the existing files into it.

Once you have your large internal-rewrite-pattern files set NOCOW, btrfs will rewrite them in place as an ordinary filesystem would, thus avoiding the problem.
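Spelled out as commands, and assuming placeholder paths, that setup step looks like:

    # new directory gets the NOCOW attribute before any files exist in it
    mkdir /mnt/vmimages
    chattr +C /mnt/vmimages
    # plain copy so the new file inherits NOCOW (not mv, not cp --reflink)
    cp /mnt/old/win7.img /mnt/vmimages/
    # verify: the 'C' flag should show on both directory and file
    lsattr -d /mnt/vmimages
    lsattr /mnt/vmimages/win7.img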
Except for one thing. I haven't mentioned btrfs snapshots yet, as that feature, but for this caveat, is covered well enough elsewhere. But here's the problem. A snapshot locks the existing file data in place. As a result, the first write to a block within a file after a snapshot MUST be COW, even if the file is otherwise set NOCOW. If only the occasional one-off snapshot is done, it's not /too/ bad, as all the internal file writes between snapshots are NOCOW; it's only the first write to each file block after a snapshot that must be COW. But many people and distros are script-automating their snapshots in order to have rollback capabilities, and on btrfs, snapshots are (ordinarily) light enough that people are sometimes configuring a snapshot a minute!

If only a minute's changes can be written to the existing location before there's a snapshot and changes must be written to a new location, then another snapshot and yet another location... basically, the NOCOW we set on that file isn't doing us any good!

8) So I'm making this a separate point, as it's important and a lot of people get it wrong. NOCOW and snapshots don't mix!

There is, however, a (partial) workaround. Because snapshots stop at btrfs subvolume boundaries, if you put your large VM images and similar large internal-rewrite-pattern files (databases, etc.) in subvolumes, making that directory I suggested above a full subvolume, not just a NOCOW directory, snapshots of the parent subvolume will not include the VM-images subvolume, thus leaving the VM images alone. This solves the snapshot-broken-NOCOW and thus the fragmentation issue, but it DOES mean that those VM images must be backed up using more conventional methods, since snapshotting won't work for them.

9) Some other still partially broken bits of btrfs include:

9a) Quotas: Just don't use them on btrfs at this point. Performance doesn't scale (altho there's a rewrite in progress), and they are buggy. Additionally, the scaling interaction with snapshots is geometrically negative, sometimes requiring 64 GiB of RAM or more, and coming to a near standstill at that, for users with enough quota-groups and enough snapshots. If you need quotas, use a more traditional filesystem with stable quota support. Hopefully by this time next year...

9b) Snapshot-aware defrag: This was enabled at one point, but simply didn't scale once it turned out people were doing things like per-minute snapshots and thus had thousands and thousands of snapshots. So it has been disabled for the time being. Btrfs defrag will defrag the working copy it is run on, but currently doesn't account for snapshots, so data that was fragmented at snapshot time gets duplicated as it is defragmented. However, they plan to re-enable the feature once they have rewritten various bits to scale far better than they do at present.

9c) Send and receive: Btrfs send and receive are a very nice feature that can make backups far faster, with far less data transferred. They're great when they work. Unfortunately, there are still various corner-cases where they don't. (As an example, a recent fix was for the case where subdir B was nested inside subdir A for the first, full send/receive, but later the relationship was reversed, with subdir B made the parent of subdir A. Until the recent fix, send/receive couldn't handle that sort of corner-case.) You can go ahead and use it if it's working for you, as if it finishes without error, the copy should be 100% reliable. However, have an alternate plan for backups in case you suddenly hit one of those corner-cases and send/receive quits working.

Of course it's worth mentioning that b and c deal with features that most filesystems don't have at all, so with the exception of quotas, it's not like something's broken on btrfs that works on other filesystems. Instead, these features are (nearly) unique to btrfs, so even if they come with certain limitations, that's still better than not having the option of using the feature at all, because it simply doesn't exist on the other filesystem!
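For reference, the incremental send/receive pattern from 9c looks roughly like this (a sketch; the subvolume and backup paths are placeholders, and send operates on read-only snapshots, hence the -r):

    # read-only snapshot of the subvolume to be backed up
    btrfs subvolume snapshot -r /home /home/.snap-today
    # send only the delta against an earlier snapshot to the backup filesystem
    btrfs send -p /home/.snap-yesterday /home/.snap-today | btrfs receive /mnt/backup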
10) Btrfs in general is headed toward stable now, and a lot of people, including me, have used it for a significant amount of time without problems, but it's still new enough that you're strongly urged to make and test your backups, because by not doing so, you're stating by your actions, if not your words, that you simply don't care if some as-yet undiscovered and unfixed bug in the filesystem eats your data.

For similar reasons, altho already mentioned above: run the latest stable kernel from the latest stable kernel series, at the oldest, and consider running rc kernels from at least rc2 or so (by which time any real data-eating bugs, in btrfs or elsewhere, should be found and fixed, or at least published). Anything older, and you are literally risking your data to known and fixed bugs.

As is said, take reasonable care and you're much less likely to be the statistic!

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: [gentoo-amd64] btrfs Was: Soliciting new RAID ideas
From: thegeezer @ 2014-05-28 7:29 UTC
To: gentoo-amd64

top man, thanks for the detail and the tips!
* Re: [gentoo-amd64] btrfs Was: Soliciting new RAID ideas
From: Marc Joliet @ 2014-05-28 20:32 UTC
To: gentoo-amd64

(Dammit, it seems that I've developed a habit of writing somewhat long-winded emails :-/ . Sorry!)

Am Wed, 28 May 2014 08:29:07 +0100 schrieb thegeezer <thegeezer@thegeezer.net>:

> top man, thanks for the detail and the tips!

I second this :) . In fact, I think I'll link to it in my btrfs thread on gentoo-user.

I do have a question for Duncan (or anybody else who knows, but I know that Duncan is fairly active on the BTRFS ML), though:

How does btrfs handle checksum errors on a single drive (or when self-healing fails)? That is, does it return a hard error, rendering the file unreadable, or is it possible to read from a corrupted file?

Sadly, I don't remember finding the answer to this in my own research into BTRFS before I made the switch (my thread is here: [0]), and searching online now hasn't revealed anything; all I can find are mentions of its self-healing capability. I *think* BTRFS treats this as a hard error? But I'm just not sure. (I feel kind of stupid, because I'm sure I saw the answer in some of the emails on linux-btrfs that I read through via GMANE.)

I ask because I'm considering converting the 2TB data partition on my 3TB external hard drive from NTFS to BTRFS [1]. It primarily contains media files, where random corruption is decidedly *not* the end of the world. However, it also contains ISOs and other large files where corruption matters more, but which are not important enough to land on my BTRFS RAID (on the other hand, my music collection is ;-) ).

In any case, reconstructing a corrupted file can be fairly difficult: it might involve re-ripping a (game) disc, or it might be something I got from a friend, delaying file recovery until I can get it again, or the file might be a youtube download (or a conference video, or something from archive.org, or ...) and I have to track it down online again. However, I might want to *know* that a file is corrupt, so that I *can* reconstruct it if I want to.

The obvious answer, retrieving from backup, is difficult to implement, since I would need an additional external drive for that. Also, the files are not *that* important, e.g., in the case of a youtube download, where most of the time I delete the file afterwards anyway.

(It seems to me that the optimal solution would be to use some sort of NAS, with a multi-device ZFS or BTRFS file system, in place of an external hard drive; I expect to go that route in the future, when I can afford it.)

[0] http://thread.gmane.org/gmane.linux.gentoo.user/274236
[1] I used NTFS under the assumption that I might want to keep the drive Windows-compatible (for family), but have decided that I don't really care, since the drive is pretty much permanently attached to my desktop (it also has an EXT4 partition for automatic local backups, so removing it would be less than optimal ;-) ).

-- 
Marc Joliet
--
"People who think they know everything really annoy
those of us who know we don't" - Bjarne Stroustrup
* [gentoo-amd64] Re: btrfs Was: Soliciting new RAID ideas
From: Duncan @ 2014-05-29 6:41 UTC
To: gentoo-amd64

Marc Joliet posted on Wed, 28 May 2014 22:32:47 +0200 as excerpted:

> (Dammit, it seems that I've developed a habit of writing somewhat
> long-winded emails :-/ . Sorry!)

You? <looking this way and that> What does that make mine? =:^)

>> top man, thanks for the detail and the tips!
>
> I second this :) . In fact, I think I'll link to it in my btrfs thread
> on gentoo-user.

Thanks. I was on the user list for a short time back in 2004 when I first started with gentoo, but back then it was mostly x86, while my interest was amd64, and the amd64 list was active enough back then that I didn't really feel the need for the mostly-x86 user list, so I unsubscribed and never got around to subscribing again when the amd64 list traffic mostly dried up. But if it'll help people there... go right ahead and link or repost.

(Also, anyone who wants to put it up on the gentoo wiki, go ahead. I work best on newsgroups and mailing lists, and find wikis, like most of the web, in practice read-only for my usage. I'll read up on them, but somehow never get around to actually writing anything on them, even if it would in theory save me a bunch of time, since I could write stuff once and link it instead of repeating it on the lists.)

> How does btrfs handle checksum errors on a single drive (or when
> self-healing fails)?
>
> That is, does it return a hard error, rendering the file unreadable, or
> is it possible to read from a corrupted file?

As you suspect, it's a hard error.

There has been developer discussion on the btrfs list of some sort of mount option or the like that would allow retrieval even with bad checksums, presumably with dmesg then being the only indication something was wrong -- in case it's a simple single bit-flip or the like in something like text, where it should be obvious, or media, where it'll likely not even be noticed -- but I've not seen an actual patch for it. Presumably it'll eventually happen, but for now there are a lot more potential features and bug fixes to code up than developers and time in their days to code them, so no idea when. I guess when the right person gets that itch to scratch.

Which is yet another reason I have chosen raid1 mode for both data and metadata, and am eagerly awaiting the N-way-mirroring code in order to let me do 3-way as well, because I'd really /hate/ to think it's just a bitflip, yet not have any way at all to get to it.

Which of course makes it that much more critical to keep your backups as current as you're willing to risk losing, *AND* test that they're actually recoverable, as well.

(FWIW here, while I do have backups, they aren't always current. Still, for my purposes the *REAL* backups are the experiences and knowledge in my head. As long as I have that, I can recreate the real valuable stuff, and to the extent that I can't, I don't consider it /that/ valuable. And if I lose those REAL backups... well, I won't have enough left then to realize what I've lost, will I? That's ultimately the attitude I take, appreciating the real important stuff for what it is, and the rest, well, if it comes to it, I lose what I lose. But yes, I do still keep backups, actually multiple levels deep, tho as I said they aren't always current.)

However, one trick that I alluded to, which actually turned out to be an accidental side-effect of fixing an entirely different problem, is setting mixed-blockgroup mode at mkfs.btrfs time and selecting dup mode for both data and metadata as well. (In mixed mode, data and metadata must be set the same, and the default except on ssd is then dup, but the point here is to ensure dup, not single.) As I said, the reason mixed mode is there is to deal with really small filesystems, and it's the default for under a gig. And there's definitely a performance cost as well as the double-space cost when using dup. But it *DOES* allow one to run dup mode for both data and metadata, and some users are willing to pay its performance costs for the additional data integrity it offers (see the sketch below).

Certainly, if you can possibly do two devices, the paired-device raid1 mode is preferable, but for instance my netbook has only a single SATA port, so either mixed-bg and dup mode, or partitioning up and using two partitions to fake two devices for raid1 mode, are what I'm likely to do. (I actually don't know which I'll do, as I haven't messed with the netbook in a while, but I have an SSD already lying around to throw in it and I keep thinking about it, and with its single SATA port, it's a perfect example of sometimes not being /able/ to run two devices. OTOH, I might just throw some money at it and buy a full 64-bit replacement machine, thus allowing me to use the 64-bit packages I build for my main machine on the (new) little one too, and thus do away with the 32-bit chroot on my main machine that I use as a build image for the netbook.)

(I snipped it there to reply to this bit first, as it was a straightforward answer. I'll go back and read the rest now, to see if there's anything else I want to reply to.)

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
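A minimal sketch of that mixed-bg + dup invocation, assuming a placeholder single-device partition:

    # mixed block groups (-M) share chunks between data and metadata,
    # which is what lets data use the dup profile here
    mkfs.btrfs -M -m dup -d dup /dev/sda4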
* Re: [gentoo-amd64] Re: btrfs Was: Soliciting new RAID ideas
From: Marc Joliet @ 2014-05-29 17:57 UTC
To: gentoo-amd64

Am Thu, 29 May 2014 06:41:14 +0000 (UTC) schrieb Duncan <1i5t5.duncan@cox.net>:

>> (Dammit, it seems that I've developed a habit of writing somewhat
>> long-winded emails :-/ . Sorry!)
>
> You? <looking this way and that> What does that make mine? =:^)

Novels, duh ;-) .

> Thanks. I was on the user list for a short time back in 2004 when I
> first started with gentoo [...] But if it'll help people there... go
> right ahead and link or repost.

I ended up simply forwarding it, as opposed to bumping my inactive thread.

> (Also, anyone who wants to put it up on the gentoo wiki, go ahead. [...])

Heh, the only wiki I ever edited was at my old student job. But yeah, I don't feel comfortable enough in my BTRFS knowledge to write a wiki entry myself.

>> How does btrfs handle checksum errors on a single drive (or when
>> self-healing fails)?
>
> As you suspect, it's a hard error.

Damn >:-( .

> There has been developer discussion on the btrfs list of some sort of
> mount option or the like that would allow retrieval even with bad
> checksums [...] I guess when the right person gets that itch to scratch.

That's really too bad; I guess this isn't a situation that arises often for BTRFS users.

> Which of course makes it that much more critical to keep your backups as
> current as you're willing to risk losing, *AND* test that they're
> actually recoverable, as well.

Of course, but like I said, I can't back up this one data partition. I do have backups for everything on my desktop computer, though, which are on the other partition of this external drive.

> (FWIW here, while I do have backups, they aren't always current. [...])

Hehe, good philosophy :-) .

> However, one trick that I alluded to [...] is setting mixed-blockgroup
> mode at mkfs.btrfs time and selecting dup mode for both data and
> metadata as well. [...]

That is an interesting idea. I might consider that. Or I might just create a third partition and make a RAID 1 out of those, once I know how much space my backups will ultimately take. But really, why is there no dup for data?

(I only set up my backups about a month ago, just before my migration to BTRFS, using rsnapshot, and the backups aren't fully there yet; the one monthly backup is still missing, and I wanted to wait a bit after that to see how much space the backups ultimately require. Plus, I might back up (parts of) my laptop to there, too, although there isn't that much stuff on it that isn't already synchronised in some other fashion, so it's not decided yet.)

> Certainly, if you can possibly do two devices, the paired-device raid1
> mode is preferable, but for instance my netbook has only a single SATA
> port [...]

Ah, you mentioned the RAID 1 idea already :-) .

-- 
Marc Joliet
--
"People who think they know everything really annoy
those of us who know we don't" - Bjarne Stroustrup
* Re: [gentoo-amd64] Re: btrfs Was: Soliciting new RAID ideas
From: Rich Freeman @ 2014-05-29 17:59 UTC
To: gentoo-amd64

On Thu, May 29, 2014 at 1:57 PM, Marc Joliet <marcec@gmx.de> wrote:
> Am Thu, 29 May 2014 06:41:14 +0000 (UTC) schrieb Duncan <1i5t5.duncan@cox.net>:
>> Thanks. I was on the user list for a short time back in 2004 when I
>> first started with gentoo [...] But if it'll help people there... go
>> right ahead and link or repost.
>
> I ended up simply forwarding it, as opposed to bumping my inactive thread.

When was the last time we actually had an amd64-specific discussion on this list? Part of me wonders if the list ought to be retired. It made a lot more sense back when amd64 was fairly experimental and prone to fairly unique issues. I deleted my 32-bit chroot some time ago.

Rich
* Re: [gentoo-amd64] Re: btrfs Was: Soliciting new RAID ideas
From: Mark Knecht @ 2014-05-29 18:25 UTC
To: Gentoo AMD64

On Thu, May 29, 2014 at 10:59 AM, Rich Freeman <rich0@gentoo.org> wrote:
> When was the last time we actually had an amd64-specific discussion on
> this list? Part of me wonders if the list ought to be retired. It
> made a lot more sense back when amd64 was fairly experimental and
> prone to fairly unique issues. I deleted my 32-bit chroot some time
> ago.
>
> Rich

I completely understand your point, but in my case, after about a decade on gentoo-user, I quit posting to gentoo-user completely due to the attitudes of some folks there, flame posts, put-downs, etc. I have no idea how it is now, but I have no real desire to go back there.

The two things I really value about this list are the quality of the posts as well as the very civil way folks treat each other.

Just my 2 cents.

Cheers,
Mark
* Re: [gentoo-amd64] Re: btrfs Was: Soliciting new RAID ideas
From: Frank Peters @ 2014-05-29 21:05 UTC
To: gentoo-amd64

On Thu, 29 May 2014 13:59:25 -0400 Rich Freeman <rich0@gentoo.org> wrote:
>
> When was the last time we actually had an amd64-specific discussion on
> this list? Part of me wonders if the list ought to be retired. It
> made a lot more sense back when amd64 was fairly experimental and
> prone to fairly unique issues.
>

There may not be any amd64 issues, but there certainly are a lot of gripes.

For those who operate a pure 64-bit system (no multilib), there is a fair amount of highly useful software that has not yet been updated to be 64-bit clean. For example, Adobe PDF Reader, Foxit PDF Reader, and the Intel ICC compiler are still 32-bit. I wish these folks would get with the modern trends.

Frank Peters
* [gentoo-amd64] amd64 list, still useful? Was: btrfs
From: Duncan @ 2014-05-30 2:04 UTC
To: gentoo-amd64

Frank Peters posted on Thu, 29 May 2014 17:05:26 -0400 as excerpted:

> There may not be any amd64 issues, but there certainly are a lot of
> gripes.
>
> For those who operate a pure 64-bit system (no multilib), there is a
> fair amount of highly useful software that has not yet been updated to
> be 64-bit clean. For example, Adobe PDF Reader, Foxit PDF Reader, and
> the Intel ICC compiler are still 32-bit. I wish these folks would get
> with the modern trends.

FWIW, I'm no-multilib as well, but I guess for a different reason.

I don't do proprietary, and in general couldn't even if I wanted to, since I cannot and will not agree to the EULAs, so non-free software that hasn't been ported to amd64 is of no concern to me, except that it's yet another case where authors chose not to respect my rights, so I simply don't use their software.

Meanwhile, all the software I actually use has long since been ported, and I no longer even use grub-static, since I've switched to grub2, which builds just fine on amd64. So there's literally no reason for me to run multilib at all. In fact, when I switched over some years ago, I had already had various problems due to the 32-bit side, which I never used except to build toolchain 32-bit support, breaking. As a result, simply switching to no-multilib significantly decomplicated life and resulted in far faster gcc and glibc rebuilds as well, and there was literally no downside whatsoever, except that I had to run grub-static for a couple of years.

Tho I do still have a 32-bit chroot as the build-root for my 32-bit-only netbook. But by policy I don't keep anything private on the netbook, and I actually don't use it as a NET-book anyway, only connecting it via ethernet here at home. (I never did get the wifi working on it; I tried at one point, but apparently there was some bug in the kernel wifi driver at that point and I couldn't connect, and I simply never bothered since.) So security isn't a huge deal on it, and I actually haven't updated it in a couple of years now, to the point that I'd have severe problems updating it using a current gentoo tree due to EAPI upgrade issues, so I'd have to do staggered updates using archived trees. At this point that means I'll probably just do a full from-stage3 rebuild at some point... if I even bother at all. I might actually just hardware-upgrade to a 64-bit machine, so that I can use my main system's binpkgs for both machines.

Meanwhile, Mark's reasons for staying on this list, as opposed to the general user list, are more or less mine as well. I never actually saw the negatives he saw there, and there was a time when there was an attack on me here, which I'll never forget, as it was quite an experience seeing other regulars, and lurkers too, come out of the woodwork to defend me. I knew a lot of folks liked my posts due to thanks now and again, but WOW, I had no idea I had benefitted THAT many lurkers along with the others, and it was quite humbling indeed to see them post perhaps their only post in years to defend me! Certainly a life-changing experience!

Rather, in my case it is more that I remember the high traffic of the user list and kind of like the lower but perhaps higher-quality traffic here, tho at times it's /too/ low-traffic these days. Probably at some point I'll get back to the user list, but if this list were to shut down, I'd still miss it, because while there's not a lot of traffic here these days, the signal-to-noise ratio really is about the highest I can imagine.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: [gentoo-amd64] amd64 list, still useful? Was: btrfs 2014-05-30 2:04 ` [gentoo-amd64] amd64 list, still useful? Was: btrfs Duncan @ 2014-05-30 2:44 ` Frank Peters 2014-05-30 6:25 ` [gentoo-amd64] " Duncan 2014-06-04 16:41 ` [gentoo-amd64] " Mark Knecht 1 sibling, 1 reply; 30+ messages in thread From: Frank Peters @ 2014-05-30 2:44 UTC (permalink / raw To: gentoo-amd64 On Fri, 30 May 2014 02:04:39 +0000 (UTC) Duncan <1i5t5.duncan@cox.net> wrote: > > FWIW, I'm no-multilib as well, but I guess for a different reason. > > I don't do proprietary and in general couldn't even if I wanted to, since > I cannot and will not agree to the EULAs, so non-free software that > hasn't been amd64 ported is of no concern to me, > It's not just proprietary software that lags behind. I continue to encounter FOSS packages from time to time that are still 32-bit only. One example, for audio enthusiasts, is the excellent AudioCutter: http://www.virtualworlds.de/AudioCutter/ (There are many other examples but at this moment I can't recall any specific names so you'll just have to trust me). However, when it comes to the PDF file format it is hard to beat the proprietary Foxit Reader. With FOSS, only evince comes close, but evince lacks a lot of capability and seems to be buggy in places. AMD64 should be the standard but many projects refuse to update since reliance on multi-lib is so much simpler. As a consequence we 64-bit purists are at a disadvantage. Frank Peters ^ permalink raw reply [flat|nested] 30+ messages in thread
* [gentoo-amd64] Re: amd64 list, still useful? Was: btrfs 2014-05-30 2:44 ` Frank Peters @ 2014-05-30 6:25 ` Duncan 0 siblings, 0 replies; 30+ messages in thread From: Duncan @ 2014-05-30 6:25 UTC (permalink / raw To: gentoo-amd64 Frank Peters posted on Thu, 29 May 2014 22:44:05 -0400 as excerpted: > On Fri, 30 May 2014 02:04:39 +0000 (UTC) > Duncan <1i5t5.duncan@cox.net> wrote: > >> FWIW, I'm no-multilib as well, but I guess for a different reason. >> >> I don't do proprietary [...] >> > It's not just proprietary software that lags behind. I continue to > encounter FOSS packages from time to time that are still 32-bit only. > > One example, for audio enthusiasts, is the excellent AudioCutter: > http://www.virtualworlds.de/AudioCutter/ I'm not saying 32-bit-only FLOSS isn't out there, only that by now, and actually from 2010 or so (to pick the turn of the decade as a convenient date, one could actually say by 2008 or so), it's increasingly non-mainstream. There's the occasional exception, but for most people, either their 32-bit concerns are proprietary only, or there's a more mainstream 64-bit alternative. Luckily for me, my interests are mainstream enough... > (There are many other examples but at this moment I can't recall any > specific names so you'll just have to trust me). > > However, when it comes to the PDF file format it is hard to beat the > proprietary Foxit Reader. With FOSS, only evince comes close, but evince > lacks a lot of capability and seems to be buggy in places. I should explicitly mention that I'm all for people making their own decisions regarding proprietary. Because I know if someone had tried to push me before I was ready, even while I was preparing for my ultimate switch, the results would have been nothing but negative. So everyone must move when they are ready, and if that time never comes, well... But at the same time, that decision is behind me personally, and there's simply no way I'm going back to the days of proprietary. As for pdf, I'm running (semantic-desktop-stripped) kde and okular, and have been reasonably happy with it. Where I've seen people complain about PDF readability or compatibility and have checked, okular has done well enough for me, to the point I never saw what they were complaining about. Meanwhile, even if I did find some PDF that nothing I could run would handle, that would simply mean I'd not read that pdf, tho if it was worth it I could envision taking it to the library to read or to a printer to have them print it out or something. But I wouldn't install anything proprietary on my own systems to read it. There are too many other things to do in the world to worry about missing what's in one pdf, especially if it meant my freedom was on the line. > AMD64 should be the standard but many projects refuse to update since > reliance on multi-lib is so much simpler. As a consequence we 64-bit > purists are at a disadvantage. True at times. Luckily, those times aren't so frequent these days. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] amd64 list, still useful? Was: btrfs 2014-05-30 2:04 ` [gentoo-amd64] amd64 list, still useful? Was: btrfs Duncan 2014-05-30 2:44 ` Frank Peters @ 2014-06-04 16:41 ` Mark Knecht 2014-06-05 2:00 ` [gentoo-amd64] " Duncan 1 sibling, 1 reply; 30+ messages in thread From: Mark Knecht @ 2014-06-04 16:41 UTC (permalink / raw To: Gentoo AMD64 On Thu, May 29, 2014 at 7:04 PM, Duncan <1i5t5.duncan@cox.net> wrote: <SNIP> > Meanwhile, Mark's reasons for staying on this list, as opposed to the > general user list, are more or less mine, as well. <SNIP> > > Rather, in my case it is more that I remember the high traffic of the > user list and kind of like the lower but perhaps higher quality traffic > here, tho at times it's /too/ low traffic, these days. Probably at some > point I'll get back to the user list, but if this list were to shut down, > I'd still miss it, because while there's not a lot of traffic here these > days, the signal to noise ratio really is about the highest I can imagine. > > -- > Duncan - List replies preferred. No HTML msgs. > "Every nonfree program has a lord, a master -- > and if you use the program, he is your master." Richard Stallman Hi Duncan, There is an in-progress, higher-energy thread on gentoo-user with folks getting upset (my interpretation) about systemd and support for suspend/resume features. I only found it because I ran into an emerge block and went looking for a solution. (In my case it was -upower as a new use flag setting.) Anyway, I prefer it here. If I were reading that thread in real time I know I'd be responding to a few things even though I don't have anything of much value to add. It's just my nature in the presence of threads like that! ;-) Cheers, Mark P.S. - BTW - I love your long answers although I seldom have time to read them when they arrive. Stay true. They are of value. ^ permalink raw reply [flat|nested] 30+ messages in thread
* [gentoo-amd64] Re: amd64 list, still useful? Was: btrfs 2014-06-04 16:41 ` [gentoo-amd64] " Mark Knecht @ 2014-06-05 2:00 ` Duncan 2014-06-05 18:59 ` Mark Knecht [not found] ` <Alo71o01J1aVA4001lo9xP> 0 siblings, 2 replies; 30+ messages in thread From: Duncan @ 2014-06-05 2:00 UTC (permalink / raw To: gentoo-amd64 Mark Knecht posted on Wed, 04 Jun 2014 09:41:30 -0700 as excerpted: > There is an in-progress, higher-energy thread on gentoo-user with folks > getting upset (my interpretation) about systemd and support for > suspend/resume features. I only found it because I ran into an emerge > block and went looking for a solution. (In my case it was -upower as a > new use flag setting.) Yeah. I saw the original dev-list thread on the topic, before it all hit the tree (and continuing now), which is a big part of why I subscribe to the dev-list, to get heads-up about things like that. What happened from the dev-list perspective is that after upower dropped about half the original package as systemd replaced that functionality, the gentoo maintainers split the package in half, keeping the still-included functionality under the original upower name, with the dropped portion in a new, basically-gentoo-as-upstream package, upower-pm-utils. But to the gentoo maintainer the portage output was sufficient that between emerge --pretend --tree --unordered-display and eix upower, what was needed was self-evident, so he didn't judge a news item necessary. What a lot of other users (including me) AND devs are telling him is that he's apparently too close to the problem to see that it's not as obvious as he thinks, and a news item really is necessary. Compounding the problem for users is that few users actually pulled in upower on their own and don't really know or care about it -- it's pulled in due to default desktop-profile use-flags as it's the way most desktops handle suspend/hibernate. Further, certain desktop dependencies apparently got default-order reversed on the alternative-deps, so portage tries to fill the dep with systemd instead of the other package. Unfortunately that's turning everybody's world upside down, as suddenly portage wants to pull in systemd *AND* there's all these blockers! Meanwhile, even tho he didn't originally think it necessary, once pretty much all gentoo userspace (forums, irc, lists, various blogs...) erupted in chaos, the gentoo maintainer decided that even tho he didn't quite understand /why/ a news item was needed, that was the best way to get the message out as to how to fix things and to calm things back down. But, policy is that such news items must be posted to the gentoo-dev list for (ideally) several days of comment before they're committed, and a good policy it is in general too, because the news items generally turn out FAR better with multiple people looking over the drafts and making suggestions, than the single-person first-drafts tend to be! In cases such as this, however, the comment time is shortened to only a day or two unless something seriously wrong comes up in the process, and while I've not synced for a few days, I'd guess that news item has either hit before I send this, or certainly if not, it'll hit within a matter of hours. Once the news item hits, for people that actually read them at least, the problem should be pretty much eliminated, as there are appropriate instructions for how to fix the blocker, etc. So things should really be simmering back down pretty shortly.
=:^) Meanwhile, in the larger perspective of things, it's just a relatively minor goof that as usual is fixed in a couple days. No big deal, except that /this/ goof happens to include the uber-lightning-rod package that is systemd. Be that as it may, the world isn't ending, and the problem is indeed still fixed up within a couple days, as usual, with information, some reliable, some not so reliable, available via the usual channels for those who don't want to wait. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 30+ messages in thread
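As a concrete sketch of the diagnosis described in the message above (the commands and package atoms are the ones named in the thread; actual output depends on the tree snapshot you have synced):

    # show what is pulling upower in, as a dependency tree rather than a flat list
    emerge --pretend --tree --unordered-display sys-power/upower

    # compare the two halves of the split as the tree now offers them
    eix sys-power/upower sys-power/upower-pm-utils

Nothing here modifies the system; both commands only report.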
* Re: [gentoo-amd64] Re: amd64 list, still useful? Was: btrfs 2014-06-05 2:00 ` [gentoo-amd64] " Duncan @ 2014-06-05 18:59 ` Mark Knecht 2014-06-06 12:11 ` Duncan [not found] ` <Alo71o01J1aVA4001lo9xP> 1 sibling, 1 reply; 30+ messages in thread From: Mark Knecht @ 2014-06-05 18:59 UTC (permalink / raw To: Gentoo AMD64 On Wed, Jun 4, 2014 at 7:00 PM, Duncan <1i5t5.duncan@cox.net> wrote: > Mark Knecht posted on Wed, 04 Jun 2014 09:41:30 -0700 as excerpted: > >> There is an in-progress, higher-energy thread on gentoo-user with folks >> getting upset (my interpretation) about systemd and support for >> suspend/resume features. I only found it because I ran into an emerge >> block and went looking for a solution. (In my case it was -upower as a >> new use flag setting.) > > Yeah. I saw the original dev-list thread on the topic, before it all hit > the tree (and continuing now), which is a big part of why I subscribe to > the dev-list, to get heads-up about things like that. > Maybe all Gentoo users should subscribe! Over time we would likely all get a bit smarter. ;-) ;-) ;-) > What happened from the dev-list perspective is that after upower dropped > about half the original package as systemd replaced that functionality, > the gentoo maintainers split the package in half, keeping the still-included > functionality under the original upower name, with the dropped portion in > a new, basically-gentoo-as-upstream package, upower-pm-utils. > I certainly have no issue with the basics of what they did, but more in a second. > But to the gentoo maintainer the portage output was sufficient that > between emerge --pretend --tree --unordered-display and eix upower, what > was needed was self-evident, so he didn't judge a news item necessary. > What a lot of other users (including me) AND devs are telling him is that > he's apparently too close to the problem to see that it's not as obvious > as he thinks, and a news item really is necessary. > Yeah, this was likely the issue. One comment in the -user thread on this subject was that at least one -dev-type thinks users should be reading change logs to figure this stuff out. I no longer remember how long I've run Gentoo but it's well beyond a decade at this point. Daniel Robbins was certainly participating. I was working at a company from mid-1999 to 2004 when I started. I can only say that I've never read a change log in that whole time. > Compounding the problem for users is that few users actually pulled in > upower on their own and don't really know or care about it -- it's pulled > in due to default desktop-profile use-flags as it's the way most desktops > handle suspend/hibernate. As is the case for me using kde-meta. However, while I figured out pretty quickly that I could set -upower on kdelibs without any build or boot issues, I soon discovered that flag goes beyond my simplistic view of suspend/resume, which I have never used. It also covers _everything_ in the Power Management section of systemsettings, which means I lost my ability in KDE to control what I suspect are DPMS timeout settings on my monitors. I'll either have to learn how to do that outside of KDE or reinstall the newer upower-pm-utils package. > Further, certain desktop dependencies > apparently got default-order reversed on the alternative-deps, so portage > tries to fill the dep with systemd instead of the other package. > Unfortunately that's turning everybody's world upside down, as suddenly > portage wants to pull in systemd *AND* there's all these blockers!
> Yeah, that's what got me to look at gentoo-user and find the problem. Lots of blocks involving systemd. <SNIP> > So things should really be simmering back down pretty shortly. =:^) > Meanwhile, in the larger perspective of things, it's just a relatively > minor goof that as usual is fixed in a couple days. No big deal, except > that /this/ goof happens to include the uber-lightning-rod package that > is systemd. Be that as it may, the world isn't ending, and the problem > is indeed still fixed up within a couple days, as usual, with > information, some reliable, some not so reliable, available via the usual > channels for those who don't want to wait. > This stuff does happen once in a while. I'm surprised it doesn't happen more often, actually, so for the most part the release process is pretty good. WRT systemd, my real problem with this latest issue is the systemd profile issue, and beyond that there doesn't seem to be a systemd-oriented new-machine install document. In my study getting ready to build a new RAID (probably will be 2-drive 3TB RAID1) I wondered if I should give in to this portage pressure and go systemd. When I start looking, all I find are documents that seem to assume a pretty high understanding of systemd which doesn't represent my current education or abilities. Seems to me that if the gentoo devs interested in seeing systemd gain traction were serious, this would be a high-priority job. All we get today is http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?full=1#book_part1_chap12 which to me says it's not what Gentoo developers want Gentoo users to use. Of course, that's just me. Take care, Mark ^ permalink raw reply [flat|nested] 30+ messages in thread
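For reference, the two workarounds Mark describes come down to roughly the following. This is a sketch under the assumption that kde-base/kdelibs is the atom carrying the flag, as his message indicates; version bounds are omitted:

    # /etc/portage/package.use -- option 1: drop the upower dependency
    # (as noted above, this also disables KDE's Power Management settings)
    kde-base/kdelibs -upower

Or, option 2, keep suspend/power-management support without systemd via the split-out, gentoo-maintained package:

    emerge --ask sys-power/upower-pm-utils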
* [gentoo-amd64] Re: amd64 list, still useful? Was: btrfs 2014-06-05 18:59 ` Mark Knecht @ 2014-06-06 12:11 ` Duncan 0 siblings, 0 replies; 30+ messages in thread From: Duncan @ 2014-06-06 12:11 UTC (permalink / raw To: gentoo-amd64 Mark Knecht posted on Thu, 05 Jun 2014 11:59:23 -0700 as excerpted: > Yeah, this was likely the issue. One comment in the -user thread on this > subject was that at least one -dev-type thinks users should be reading > change logs to figure this stuff out. I no longer remember how long I've > run Gentoo but it's well beyond a decade at this point. Daniel Robbins > was certainly participating. I was working at a company from mid-1999 to > 2004 when I started. I can only say that I've never read a change log in > that whole time. Wow. I read 'em routinely. There are actually four different types of "changelogs" I read, more or less often and closely, depending on the package and how closely I'm following it. 1) The gentoo package changelogs (as found in the gentoo tree) don't normally contain a lot of information about the upstream package or changes between versions, so I don't read them all the time, but I *DO* read the gentoo package changelog much of the time when I see a -rX bump for something I already have installed at the same upstream version number, because in that case I normally want to know what the gentoo package maintainer considered important enough for a revision bump and the resulting rebuild trigger for users with that same upstream version already installed, instead of simply fixing it in the ebuild without a revision bump. These can be security bumps, for instance, and if so I want to know how bad it was and what my risk was before the update. Another common reason is config changes or patches that might affect me in other ways, that as an admin responsible for the wellbeing of my gentoo system (a responsibility I take very seriously), I want to know about. Additionally, the gentoo package changelogs contain dates for version introduction into the tree as well as stabilization on the various archs, and eventually, for removal from the tree. Most of the time when I'm checking on these, it's to help someone on some other list figure out a dependency on their non-gentoo distro, or figure out how far behind my ~amd64 installation their version is and how outdated, etc. Other times it can be basically the same stuff, but for another gentooer on stable instead of my ~amd64 plus selected live-packages system. At least back when Zac was portage dev lead, because gentoo /is/ upstream for portage and because portage changes are critical to a gentoo system's wellbeing, the portage package changelog was far more detailed than most others, including bug report numbers and mentioning big feature changes. I followed the portage changelog very closely and looked up every bug number mentioned to see what sort of changes were being made and why. 2) While not technically "changelog" files, git logs are in fact generally much more detailed changelogs, and for stuff like kde, where I run live-git-branch packages from the gentoo/kde project overlay, every time I do a sync (of both the main gentoo tree and the few overlays I run), I FAITHFULLY run git log on the overlay trees to see what updated since I last synced, and how. 
For the overlays, I follow *EVERY* *SINGLE* *CHANGE*, at minimum reading the git-log entry which lists the files involved as well, and if there are patches introduced that interest me, I'll use git show to pull up the full git diffs and see what actually changed, line by line in the source code. 3) Similarly, for various upstream packages I follow upstream's changelog or news files as well, not /too/ closely for most packages, but for a lot of packages, closely enough to at least be aware of major feature updates, both so I can make use of those features, and because they might affect config files that I'll be etc-updating in short order, after the package upgrade. 4) For a lot of packages that I run the live-git version of, I'll use the smart-live-rebuild output to get the upstream git commit IDs, and will then do a git log with those IDs to see what changed there, as well. For a few packages, I have a different script that I run that does an individual git pull for the package, and I git log it if there are changes, before I even run smart-live-rebuild to catch the others at all. Until I recently switched to systemd, I was one of the few non-dev users actually running openrc-9999, precisely so I COULD follow individual git commit updates, and I found and filed a number of bugs that then got fixed before a release version ever made them generally available even to ~arch users. Similarly, I've been involved with upstream pan (the news client I follow this list with, among other things) for over a decade now, helping out on its mailing list and now filling the local application historian role as well, tho I'm not a dev, and I follow its git logs *VERY* closely. Lately I've been active both as a btrfs user and on the btrfs list, and follow the btrfs-progs git commit log closely as well. For kde, where I'm on the kde4 development branch, I don't follow the git logs /quite/ that closely, but I do keep an eye on them, particularly for kdelibs, kde-baseapps and kde-workspace. I have my own scripts that I use for updating the kernel so I don't use gentoo's kernel packages at all, but there too I run (mainline Linus) kernel git, and while I don't follow individual commits especially during the merge window, I often follow the mainline merge-commits, and follow things more closely as commits slow down later in the cycle. As with openrc, I've bisected, filed and gotten fixed a number of bugs in pre-releases over the years before they hit full releases. So I must confess it's a bit hard to imagine someone who hasn't read a single changelog in at least the decade I've been on gentoo, particularly since following them to at least /some/ extent is IMO part of what being a good sysadmin, responsible for at least their own system if no others, is all about. While I certainly don't expect people to follow changes as closely as I do, not viewing even /one/ changelog over the course of at least a decade... let's put it this way, it's not something /I'd/ be proud to admit in public.
OTOH, that you've gone this long without it and are still here discussing and running gentoo definitely *IS* a testament to how good the gentoo devs (and tester-users like me filing bugs to be fixed before things hit stable, and sometimes before they hit a release or the ~arch tree at all) actually are in general at keeping things actually working for people, a bit of hiccup now and again for a few days, but basically nothing that's not fixed in a few days, and nothing that tends to actually eat systems to the point that you're not here, a decade later. That's SAYING something! =:^) > This stuff does happen once in a while. I'm surprised it doesn't happen > more often, actually, so for the most part the release process is pretty > good. =:^) > WRT systemd, my real problem with this latest issue is the systemd > profile issue, and beyond that there doesn't seem to be a systemd-oriented > new-machine install document. In my study getting ready to > build a new RAID (probably will be 2-drive 3TB RAID1) I wondered if I > should give in to this portage pressure and go systemd. When I start > looking, all I find are documents that seem to assume a pretty high > understanding of systemd which doesn't represent my current education or > abilities. Seems to me that if the gentoo devs interested in seeing systemd > gain traction were serious, this would be a high-priority job. All we get > today is > > http://www.gentoo.org/doc/en/handbook/handbook-amd64.xml?full=1#book_part1_chap12 > > which to me says it's not what Gentoo developers want Gentoo users to > use. > Of course, that's just me. You're actually correct. Mainline gentoo remains openrc, and that's likely to remain the case for some time. Systemd is certainly available as an option, and more and more people are switching to it, but even after it's well documented in the handbook, openrc will continue to be an option for the foreseeable future, with several devs having stated quite specifically that they use gentoo and depend on openrc in their jobs, so whatever /else/ may happen to gentoo, openrc is *NOT* about to become unsupported, as I said, for the foreseeable future (which in practice means at LEAST two years out as it'd take that long to switch over -- remember how long it took to stabilize baselayout-2 -- and more likely at least five years, even if they voted that as a goal right now!). OTOH, individual packages and specific desktop projects can change dependencies based on what upstream supports, and gentoo/gnome is only supporting systemd for some elements now. Luckily for gentoo/kdeers, upstream kde has committed to maintaining more systemd independence than has gnome, including with kde 5 frameworks. And the modularization of kde-frameworks should make that much easier too, over time, altho individual kde packages may eventually require systemd. OTOH, gentooers have it better than most in that they have more choice about actually installing individual packages, as well as keeping upstream-optional dependencies actually optional. We did almost lose the ability to opt out of semantic-desktop, but fortunately saner heads prevailed, and had they not done so in gentoo/kde, a number of us users were making plans for a user-supported overlay similar to the user-supported kde-sunset for kde3 users, to maintain semantic-desktop-less kde4 at least until kde5/frameworks, at which point we hoped upstream policies would bring back the option due to its modularity.
But while I actually had to maintain the semantic-desktop-less ebuild patches locally for a while in order to continue following kde-live-branch, and I guess ~arch users faced the problem for a shorter time, the policy thankfully reverted before stable users had to make that painful choice. But various devs have made it VERY clear, gentoo as a whole isn't going to get anywhere /close/ to losing the openrc option, as I said, for the foreseeable future. And well before that were to happen, or even if gentoo really expected stable users to switch to systemd in quantity, there'd be MUCH better documentation, just as that was a prerequisite to the stable-side baselayout-2, one of the big reasons it took years. Again, that's exactly why users worried about gentoo suddenly switching to systemd, as it /looked/ like it might be doing here, have nothing to worry about for at **LEAST** two years. A ship as big as gentoo simply doesn't turn on a dime, nor can it be forced to, and even were the council to suddenly get a brain transplant and vote that it should be the goal today, it'd take years to actually implement, including for many gentoo devs *AND* their employers. -- Duncan - List replies preferred. No HTML msgs. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 30+ messages in thread
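For anyone wanting to pick up the review habits Duncan describes above, the mechanics are small. A sketch only, with hypothetical paths (adjust to wherever your sync tool keeps the overlay checkout; in-tree ChangeLog files were per-package under /usr/portage at the time):

    # the in-tree package changelog
    less /usr/portage/sys-power/upower/ChangeLog

    # overlay review after a sync; ORIG_HEAD is set by git pull,
    # so this range is exactly what the last sync brought in
    cd /var/lib/layman/kde
    git log --stat ORIG_HEAD..HEAD
    git show <commit-id>    # full diff of any commit that looks interesting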
[parent not found: <Alo71o01J1aVA4001lo9xP>]
* [gentoo-amd64] Re: amd64 list, still useful? Was: btrfs [not found] ` <Alo71o01J1aVA4001lo9xP> @ 2014-06-06 17:07 ` Duncan 0 siblings, 0 replies; 30+ messages in thread From: Duncan @ 2014-06-06 17:07 UTC (permalink / raw To: Martin, gentoo-amd64 On Thu, 05 Jun 2014 22:48:07 +0100 Martin <m_btrfs@ml1.co.uk> wrote: > Resend (gmane appears to be losing my email for this list... :-( ) OK, forwarding to the list too (with a bit less snippage than normal, to keep your message intact as I'm relaying) and replying below. > > On 05/06/14 16:35, Martin wrote: > > On 05/06/14 03:00, Duncan wrote: > >> So things should really be simmering back down pretty shortly. > >> =:^) > > Thanks for the good summary. > > > > Yep, I hit all the red "B" blockers... Quickly saw it was upower and > > some confusion with systemd even though I've not selected systemd > > anywhere and... > > > > I was too rushed to investigate much further and so added into my > > /etc/portage/package.mask: > > > > # Avoid pulling in systemd! > > =sys-power/upower-0.9.23-r3 > > > > > > Thanks for letting me know to await the news item and for the bits > > to settle... [Just forwarding that part and would delete it as I'm not replying to it, were I not forwarding it for you too. But I'm replying to the below.] > > As for systemd... I'm just wondering if the various heated air being > > generated/wasted is as much rushed arrogance on the part of the > > implementation as due to the grand ripples of change. > > > > The recent kernel DoS debacle due to misusing the long used kernel > > debug showed a certain 'cavalier' attitude to taking over > > functionality without a wider concern or caution to keep projects > > outside of systemd undisturbed... Or at least conscientiously > > minimise disturbance... Agreed, and for quite some time that attitude was why I was delaying my own switch, tho I expected I'd eventually make it. But backing up a bit to reveal the larger picture... Developers in general aren't always exactly known for their ability to get along with each other or with others or necessarily the wider community. Certainly there are many examples of this in the FLOSS community, and from what I've read of the proprietary development community it's no different, save much of it happens behind closed doors, with public appearances moderated by the PR folks. Actually, I've a personal experience that rather changed my own outlook on things, that I believe explains a lot of the problem here. The following gets a bit franker and more personally honest than most discussions and I'm not really comfortable saying it, but it's important enough not to skip as it illustrates a point I want to make. I don't ordinarily talk about myself in this way, but the fact is, on most tests I score well above 90 percentile IQ. Typically, depending on the test and whether I'm hitting my groove that day or not, I run 95-97 percentile on average in most areas (tho in composition I'm relatively low for me, 70s). (FWIW, I've always been slightly frustrated. The MENSA cutoff is supposed to be 98 percentile and I typically score tantalizingly close, but not quite! It'd be nice... =:^( ) In technology and computer areas I'd guess I'm a bit higher, perhaps 98 percentile or so. 95 percentile means about 19 out of 20 people score lower, 98 percentile is 49 out of 50.
But, this level of attainment presents its own set of difficulties, difficulties I'm intimately familiar with, but obviously not to the level these /real/ geniuses, the big hero coders of our community, are. I still remember the day I actually realized what dealing with a mentally challenged individual actually was, back in about 8th grade or so. He had come to visit a next door neighbor and we set out to climb a local butte, me not yet understanding his difficulty -- I knew there was /something/ different about him, but I didn't know what, I just accepted it, and him, as basically my equal, as I had been taught to accept and treat everyone. But climbing this butte didn't simply involve a hike, as is the case with many hills/buttes. It involved a bit of relatively minor technical climbing, "chimneying", etc. I had done it with a group previously, but wanted to try it again, for the exercise and challenge. But I didn't want to do it alone, and this guy was agreeable to trying it, so we set out. Everything went well, considering, but it did take somewhat longer than I had planned and our ride back got a bit worried and alerted the authorities. Fortunately, they didn't have to pull us off the mountain (or scrape us from the bottom of the chimney), but we got in a bit of trouble. When I got home, Mom asked me why on earth I'd take a r* guy up a mountain like that. I was flabbergasted! I didn't know! And to think I took him on that climb that was slightly challenging for me (something I'm not sure my Mom knew, and that I didn't tell her!), what must it have been for him? I was perhaps rather fortunate something /didn't/ happen, altho now I realize that despite (or even perhaps because of) his challenge he was remarkably resilient, and may well have picked himself up and continued better than I would have if something had gone wrong and either one of us was hurt. That night or perhaps the next day, as I thought about it, I realized what had happened. I was so used to, as a matter of course, dropping to whatever level was required to meet people at their own level and treat them as equal, that I didn't realize I was even doing it. To me it was just the way one interacted with others. What I had originally noticed different about him, that I couldn't put into words before as I simply didn't have the experience or concept, was that I had to drop a bit more than normal, but I was so used to doing it for pretty much everyone, that I didn't even realize I was doing it, or know what it was... until I was forcibly confronted with the fact that this guy was (to others) noticeably below average. But to me he was simply a bit more of the normal that I always did, and that I thought was just the way it was to interact with /anyone/. Since then I've obviously become a bit wiser in the ways of the world, but realistically, I really do seldom meet people /really/ my equal in the real world, and that has really distorted my experience, and to some extent my attitude and picture of the world. But that was only experiencing the one side. I consider myself fortunate to have actually had the opposite experience as well. A bit over a decade ago I was with a Linux- and Unix-friendly ISP that had a lot of real knowledgeable folks as customers, including one guy that was one of only about a dozen with direct commit privs to one of the BSDs, and several others that were in the same knowledge class.
While I may well be in the 95-97 percentile range, for the first time in my life I was well outranked, as several of these guys were at the 99th percentile or better I'm sure, plus they had likely decades of experience I didn't (as a newbie fresh from the MS side of the track) have! That was a humbling experience indeed! To that point, I had been used to being at least /equal/ to pretty much anyone I met, and enough above most that even if I happened to be wrong I knew more about the situation than pretty much anyone else, that I could out-argue them even in my wrongness. Here the situation was exactly reversed, *I* was the know-nothing, the slow guy that everyone else had to wait for while someone patiently explained what was going on so I could follow along! I **VERY** **QUICKLY** learned how to shut up and simply read the discussion as it happened around me, learning from the masters and occasionally asking a question or two, and to be *VERY* sure I could back up any claims I DID make, because if I was wrong, for the first time in my life I was pretty much guaranteed to be called on it, and there was no bluffing my way out of that fix with THESE guys! That had roughly the same level of effect on me as the earlier experience, but at the opposite end, something I rather badly needed as I NEEDED a bit of humbling at that point! Now here's the critical point that I've been so brutally honest to try to present: What happens to the *REAL* 99 percentilers, the guys who *NEVER* have that sort of humbling "OOPS, I screwed up and better shut up! These guys know more than me and if I'm wrong they're not afraid to demonstrate exactly why and how!" ... experience? Unfortunately, a lot of them are a**h***s! Why? Because they're at the top of their class and they know it. Nobody can prove them wrong, and if somehow someone does, they simply don't know how to react, as it's an experience they very rarely have. Even on things they know are simply opinion, they're so used to having absolutely zero peers around that can actually challenge them on it, that they simply don't know /how/ to take a real, honest challenge when it comes. Which BTW is one of the things I find so amazing about Linus Torvalds. I doubt many would argue that he's at the 99 percentile point, yet somehow he's a real person, still approachable, and unlike most folks at his level, actually able to deal with people! At the other end are people like Hans Reiser. He was and is a filesystem genius, and reiser4 was years before its time, yet never got into the kernel despite years of trying, because he was absolutely horrible at interpersonal relations and nobody anywhere near his level could work with him, because he simply didn't know how to be wrong. Unfortunately learning that was literally a fatal experience for his wife. =:^( Take it from someone who is in many areas 90 percentile plus, but who counts that experience sitting at the feet of /real/ masters as perhaps the single most fortunate and critical experience in his life, because he learned how to be wrong, that's NOT an easy lesson to learn, but it's an *EXTREMELY* critical lesson to learn! Think about that the next time you see something like that kernel command-line debug thing go down. Poettering and Sievers are extremely bright men, genius, top of their class. And Poettering in particular is a gifted speaker as well (researching systemd I watched a number of videos of presentations he has done on the subject, he really IS an impressive and gifted speaker!).
But, they don't take being wrong well at all, and they have a sense of entitlement due to their sense of ultimate rightness. Nevertheless, however one might dislike and distrust the personality behind them, both systemd and reiserfs (and later reiser4) were/are top of their class for their time, unique and ahead of their time in many ways. There's no arguing that. I didn't and don't like Hans Reiser, but I used his filesystem (reiser3), and still use it on my spinning rust drives altho I've switched to the still not fully mature btrfs on my newer ssds. Unlike Reiser, I don't know so much about Poettering's and Sievers' personal lives and I surely hope they don't end up where Reiser did for similar reasons. But similar to Reiser, I use their software, systemd, now. And there's no arguing the fact, it's /good/, even if not exactly stable, because they continue to "gray goo" anything in their path, and haven't yet taken the time necessary to truly stabilize the thing. While I never used it, from what I have read, PulseAudio was much the same way as long as Poettering was involved -- it never truly stabilized until he lost interest and left. Unfortunately I think that's likely to be the case with systemd as well; it won't really stabilize until Poettering loses interest and moves on to something else. And for people who depend on stable, I really doubt I'll be able to recommend it (if you can avoid the gray goo, I really don't know if that will remain possible if he doesn't lose interest in another couple years) until then. But it /is/ good, good enough it's taking the Linux world by storm, gray goo and all. If systemd could just be left alone to stabilize for a year or so, I think it'd be good, /incredibly/ good, and a lot of holdouts, like I was until recently, would find little reason not to switch, once it was allowed to stabilize. But when that's likely to happen (presumably after Poettering moves on), I really haven't the foggiest. Meanwhile I'm leading edge enough (I'm running git kde4 and kernel, after all), and (fortunately) good enough at troubleshooting Linux boot issues when I have to, that I decided it was time, for me anyway. So as you can see, while I've succumbed now, I really do still have mixed feelings on it all. But meanwhile, try applying the "do they actually know how to be wrong" theory the next time you see something happening elsewhere, too. It's surprising just how much of the FLOSS-world feuding it explains!... Tho this is one area I'd be /very/ happy if I /was/ wrong about, and suddenly all these definite top-of-their-field coders started getting along with each other! Well, we can hope, anyway (and while we're at hoping, hope the lesson in being wrong isn't data eating code teaching them how to be wrong, or security code either, as seems to have been the recent case with openssl!). -- Duncan - No HTML messages please, as they are filtered as spam. "Every nonfree program has a lord, a master -- and if you use the program, he is your master." Richard Stallman ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-27 22:39 ` Bob Sanders 2014-05-27 22:58 ` Harry Holt @ 2014-05-27 23:32 ` Mark Knecht 2014-05-27 23:51 ` Marc Joliet 2 siblings, 0 replies; 30+ messages in thread From: Mark Knecht @ 2014-05-27 23:32 UTC (permalink / raw To: Gentoo AMD64 On Tue, May 27, 2014 at 3:39 PM, Bob Sanders <rsanders@sgi.com> wrote: > Mark Knecht, mused, then expounded: >> Hi all, >> The list is quiet. Please excuse me waking it up. (Or trying to...) ;-) >> >> I'm at the point where I'm a few months from running out of disk >> space on my RAID6 so I'm considering how to move forward. I thought >> I'd check in here and get any ideas folks have. Thanks in advance. >> > > Beware - if Adobe acroread is used, and you opt for a 3TB home > directory, there is a chance it will not work. Or more specifically, > acroread is still 32-bit. It's only something I've seen with the xfs > filesystem. And Adobe has ignored it for approx. 3yrs now. > acroread isn't critical to me but it does get used now and then so thanks for the heads-up. <SNIP> > > RAID 1 is fine, RAID 10 is better, but consumes 4 drives and SATA ports. Humm...I suppose I might consider building a 4-drive 1TB RAID10 from my existing 500GB RE3 drives, and then buy a couple of 2TB Red drives and do a RAID1 for data storage. If I did that I'd end up with 6 drives in the box, 4 of them old, but old ain't necessarily bad. ;-) However, that forces me to manage what data goes where instead of just a big, flat RAID1, which is going to be easy to live with. Still, it would probably save some money. <SNIP> > > If you change, do not use ZFS and possibly BTRFS if the system does not > have ECC DRAM. A single, unnoticed, ECC error can corrupt the data pool > and be written to the file system, which effectively renders it corrupt > without a way to recover. Thanks. No ECC and no real interest in doing anything very exotic. > > FWIW - a Synology DS414slim can hold 4 x 1TB WD Red NAS 2.5" drives and > provide a boot of nfs or iSCSI to your VMs. The downside is the NAS box > and drives would go for a bit north of $636. The upside is all your > movies and VM files could move off your workstation and the workstation > would still host the VMs via a mount of the NAS box. > NAS is an interesting idea. I'll do a little study but my initial feeling is that it's more money than I really want to spend. Summer's coming. Time for Margaritas! Thanks, Mark ^ permalink raw reply [flat|nested] 30+ messages in thread
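For scale, the two-array layout Mark muses about above would look something like this with mdadm. A sketch only: the device names are hypothetical, and mdadm --create is destructive, so double-check the names against your own hardware first:

    # four old 500GB RE3 drives as a ~1TB RAID10
    mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1

    # two new Red drives as the big, flat RAID1 for bulk data
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sd[ef]1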
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-27 22:39 ` Bob Sanders 2014-05-27 22:58 ` Harry Holt 2014-05-27 23:32 ` [gentoo-amd64] Soliciting new RAID ideas Mark Knecht @ 2014-05-27 23:51 ` Marc Joliet 2014-05-28 15:26 ` Bob Sanders 2 siblings, 1 reply; 30+ messages in thread From: Marc Joliet @ 2014-05-27 23:51 UTC (permalink / raw To: gentoo-amd64 [-- Attachment #1: Type: text/plain, Size: 1618 bytes --] Am Tue, 27 May 2014 15:39:38 -0700 schrieb Bob Sanders <rsanders@sgi.com>: > Mark Knecht, mused, then expounded: [...] > > Beyond this I need to talk file system types. I'm fat dumb and > > happy with Ext4 and don't really relish dealing with new stuff but > > now's the time to at least look. > > > > If you change, do not use ZFS and possibly BTRFS if the system does not > have ECC DRAM. A single, unnoticed, ECC error can corrupt the data pool > and be written to the file system, which effectively renders it corrupt > without a way to recover. [...] As someone who recently switched an mdraid to BTRFS (with / on EXT4 on an SSD, which will be migrated at a later point, once I feel more at ease with BTRFS), I was curious about this, so I googled it. I found two threads, [0] and [3], which dispute (and most likely refute) this notion that BTRFS is more susceptible to memory errors than other file systems. While I am far from a filesystem/storage expert (I see myself as a mere user), the cited threads lead me to believe that this is most likely an overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would suggest reading them in their entirety. [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 HTH -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-27 23:51 ` Marc Joliet @ 2014-05-28 15:26 ` Bob Sanders 2014-05-28 15:28 ` Bob Sanders ` (2 more replies) 0 siblings, 3 replies; 30+ messages in thread From: Bob Sanders @ 2014-05-28 15:26 UTC (permalink / raw To: gentoo-amd64 Marc Joliet, mused, then expounded: > Am Tue, 27 May 2014 15:39:38 -0700 > schrieb Bob Sanders <rsanders@sgi.com>: > > While I am far from a filesystem/storage expert (I see myself as a mere user), > the cited threads lead me to believe that this is most likely an > overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would > suggest reading them in their entirety. > > [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 > [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 > [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 > [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 > FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad memory bit and no ECC memory: http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/ Thanks Mark! Interesting discussion on btrfs. Bob > HTH > -- > Marc Joliet > -- > "People who think they know everything really annoy those of us who know we > don't" - Bjarne Stroustrup -- - ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-28 15:26 ` Bob Sanders @ 2014-05-28 15:28 ` Bob Sanders 2014-05-28 16:10 ` Rich Freeman 2014-05-28 19:20 ` Marc Joliet 2 siblings, 0 replies; 30+ messages in thread From: Bob Sanders @ 2014-05-28 15:28 UTC (permalink / raw To: gentoo-amd64 Bob Sanders, mused, then expounded: > > Marc Joliet, mused, then expounded: > > Am Tue, 27 May 2014 15:39:38 -0700 > > schrieb Bob Sanders <rsanders@sgi.com>: > > > > While I am far from a filesystem/storage expert (I see myself as a mere user), > > the cited threads lead me to believe that this is most likely an > > overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would > > suggest reading them in their entirety. > > > > [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 > > [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 > > [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 > > [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 > > > > FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad > memory bit and no ECC memory: > > http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/ > > > Thanks Mark! Interesting discussion on btrfs. > Apologies - that should have been - Thanks Marc! > Bob > > > HTH > > -- > > Marc Joliet > > -- > > "People who think they know everything really annoy those of us who know we > > don't" - Bjarne Stroustrup > > > > -- > - > > -- - ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-28 15:26 ` Bob Sanders 2014-05-28 15:28 ` Bob Sanders @ 2014-05-28 16:10 ` Rich Freeman 2014-05-28 19:20 ` Marc Joliet 2 siblings, 0 replies; 30+ messages in thread From: Rich Freeman @ 2014-05-28 16:10 UTC (permalink / raw To: gentoo-amd64 On Wed, May 28, 2014 at 11:26 AM, Bob Sanders <rsanders@sgi.com> wrote: > Marc Joliet, mused, then expounded: >> Am Tue, 27 May 2014 15:39:38 -0700 >> schrieb Bob Sanders <rsanders@sgi.com>: >> >> While I am far from a filesystem/storage expert (I see myself as a mere user), >> the cited threads lead me to believe that this is most likely an >> overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would >> suggest reading them in their entirety. >> >> [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 >> [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 >> [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 >> [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 >> > > FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad > memory bit and no ECC memory: > > http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/ > I don't think that anybody debates that if you use btrfs/zfs with non-ECC RAM you can potentially lose some of the protection afforded by the checksumming. What I'd question is that this is some concern unique to btrfs/zfs. I'd think the same failure modes would all apply to any other filesystem. So, the message should be that ECC RAM is better than non-ECC RAM, not that those who use non-ECC RAM are better off using ext4 instead of zfs/btrfs. I'd think that any RAM-related issue that would impact zfs/btrfs would affect ext4 just as badly, and with ext4 you're also vulnerable to all the non-RAM-related errors that checksumming was created to solve. If your RAM is bad then all kinds of stuff can go wrong. Ditto for your cache memory in the CPU, logic circuitry in the CPU, your busses, etc. Most systems are not fault-tolerant of these system components and the cost to make them fault-tolerant tends to be fairly high. On the other hand, the good news is that you're far more likely to have problems with data stored on a disk than in RAM, which is probably why we haven't bothered to improve the other components. Rich ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-28 15:26 ` Bob Sanders 2014-05-28 15:28 ` Bob Sanders 2014-05-28 16:10 ` Rich Freeman @ 2014-05-28 19:20 ` Marc Joliet 2014-05-28 19:56 ` Bob Sanders 2014-05-29 7:08 ` [gentoo-amd64] " Duncan 2 siblings, 2 replies; 30+ messages in thread From: Marc Joliet @ 2014-05-28 19:20 UTC (permalink / raw To: gentoo-amd64 [-- Attachment #1: Type: text/plain, Size: 4318 bytes --] Am Wed, 28 May 2014 08:26:58 -0700 schrieb Bob Sanders <rsanders@sgi.com>: > > Marc Joliet, mused, then expounded: > > Am Tue, 27 May 2014 15:39:38 -0700 > > schrieb Bob Sanders <rsanders@sgi.com>: > > > > While I am far from a filesystem/storage expert (I see myself as a mere user), > > the cited threads lead me to believe that this is most likely an > > overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would > > suggest reading them in their entirety. > > > > [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 > > [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 > > [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 > > [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 > > > > FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad > memory bit and no ECC memory: > > http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/ Thanks for explicitly linking that. I didn't read it the first time around, but just read through most of it, then reread the threads [0] and [3] above and *think* that I understand the problem (and how it doesn't apply to BTRFS) better now. IIUC, the claim is: data is written to disk, but it must go through the RAM first, obviously, where it is corrupted (due to a permanent bit flip caused, e.g., by deteriorating hardware). At some later point, when the data is read back from disk, it might happen to load around the damaged location in RAM, where it is further corrupted. At this point the checksum fails, and ZFS corrects the data in RAM (using parity information!), where it is immediately corrupted again (because apparently it is corrected at the same physical location in RAM? perhaps this is specific to correction via parity?). This *additionally* corrupted data is then written back to disk (without any further checks). So the point is that, apparently, without ECC RAM, you could get a (long-term) cascade of errors, especially during a scrub. The likelihood of such permanent RAM corruption happening in the first place is another question entirely. The various posts in [0] then basically say that regardless of whether this really is true of ZFS, it certainly doesn't apply to BTRFS, for various reasons. I suppose this quote from [1] (see above) says it most clearly: > In hxxp://forums.freenas.org/threads/ecc-vs-non-ecc-ram-and-zfs.15449, they talk about > reconstructing corrupted data from parity information: > > > Ok, no problem. ZFS will check against its parity. Oops, the parity failed since we have a new corrupted > bit. Remember, the checksum data was calculated after the corruption from the first memory error > occurred. So now the parity data is used to "repair" the bad data. So the data is "fixed" in RAM. > > i.e. that there is parity information stored with every piece of data, and ZFS will "correct" errors > automatically from the parity information. I start to suspect that there is confusion here between > checksumming for data integrity and parity information. 
If this is really how ZFS works, then if memory > corruption interferes with this process, then I can see how a scrub could be devastating. I don't know if > ZFS really works like this. It sounds very odd to do this without an additional checksum check. This sounds > very different to what you say below that btrfs does, which is only to check against redundantly-stored > copies, which I agree sounds much safer. The rest is also relevant, but I think the fact that the data is corrected via parity information, as opposed to using a known-good redundant copy of the data (which I originally missed, and thus got confused), is the key point in understanding the (supposed) difference in behaviour between ZFS and BTRFS. All this assumes, of course, that the FreeNAS forum post that ignited this discussion is correct in the first place. > Thanks Mark! Interesting discussion on btrfs. > > Bob You're welcome! I agree, it's an interesting discussion. And regarding the misspelling of my name: no problem :-) . -- Marc Joliet -- "People who think they know everything really annoy those of us who know we don't" - Bjarne Stroustrup [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
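Since a scrub is the operation at the center of this scenario, for reference this is what one looks like on btrfs, with a hypothetical mount point. On a multi-device btrfs filesystem a scrub verifies checksums on every copy and rewrites a bad block from a checksum-valid copy where one exists:

    btrfs scrub start /mnt/pool     # kick off a background scrub
    btrfs scrub status /mnt/pool    # progress plus error counters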
* Re: [gentoo-amd64] Soliciting new RAID ideas 2014-05-28 19:20 ` Marc Joliet @ 2014-05-28 19:56 ` Bob Sanders 0 siblings, 0 replies; 30+ messages in thread From: Bob Sanders @ 2014-05-28 19:56 UTC (permalink / raw To: gentoo-amd64 Marc Joliet, mused, then expounded: > Am Wed, 28 May 2014 08:26:58 -0700 > schrieb Bob Sanders <rsanders@sgi.com>: > > > > > Marc Joliet, mused, then expounded: > > > Am Tue, 27 May 2014 15:39:38 -0700 > > > schrieb Bob Sanders <rsanders@sgi.com>: > > > > > > While I am far from a filesystem/storage expert (I see myself as a mere user), > > > the cited threads lead me to believe that this is most likely an > > > overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would > > > suggest reading them in their entirety. > > > > > > [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832 > > > [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871 > > > [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877 > > > [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821 > > > > > > > FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad > > memory bit and no ECC memory: > > Just to beat this dead horse some more, an analysis of an academic study on drive failures - http://storagemojo.com/2007/02/20/everything-you-know-about-disks-is-wrong/ And it links to the actual study here - https://www.usenix.org/legacy/events/fast07/tech/schroeder.html - which shows that memory has a fairly high failure rate as well, though the focus is on hard drives. > > http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/ > > Thanks for explicitly linking that. I didn't read it the first time around, > but just read through most of it, then reread the threads [0] and [3] above and > *think* that I understand the problem (and how it doesn't apply to BTRFS) > better now. > > IIUC, the claim is: data is written to disk, but it must go through the RAM > first, obviously, where it is corrupted (due to a permanent bit flip caused, > e.g., by deteriorating hardware). At some later point, when the data is read > back from disk, it might happen to load around the damaged location in RAM, > where it is further corrupted. At this point the checksum fails, and ZFS > corrects the data in RAM (using parity information!), where it is immediately > corrupted again (because apparently it is corrected at the same physical > location in RAM? perhaps this is specific to correction via parity?). This > *additionally* corrupted data is then written back to disk (without any further > checks). > > So the point is that, apparently, without ECC RAM, you could get a (long-term) > cascade of errors, especially during a scrub. The likelihood of such permanent > RAM corruption happening in the first place is another question entirely. > > The various posts in [0] then basically say that regardless of whether this > really is true of ZFS, it certainly doesn't apply to BTRFS, for various > reasons. I suppose this quote from [1] (see above) says it most clearly: > > > In hxxp://forums.freenas.org/threads/ecc-vs-non-ecc-ram-and-zfs.15449, they talk about > > reconstructing corrupted data from parity information: > > > > > Ok, no problem. ZFS will check against its parity. Oops, the parity failed since we have a new corrupted > > bit. Remember, the checksum data was calculated after the corruption from the first memory error > > occurred.
So now the parity data is used to "repair" the bad data. So the data is "fixed" in RAM. > > > > i.e. that there is parity information stored with every piece of data, and ZFS will "correct" errors > > automatically from the parity information. I start to suspect that there is confusion here between > > checksumming for data integrity and parity information. If this is really how ZFS works, then if memory > > corruption interferes with this process, then I can see how a scrub could be devastating. I don't know if > > ZFS really works like this. It sounds very odd to do this without an additional checksum check. This sounds > > very different to what you say below that btrfs does, which is only to check against redundantly-stored > > copies, which I agree sounds much safer. > > The rest is also relevant, but I think the point that the data is corrected via > parity information, as opposed to using a known-good redundant copy of the data > (which I originally missed, and thus got confused), is the key point in > understanding the (supposed) difference in behaviour between ZFS and BTRFS. > > All this assumes, of course, that the FreeNAS forum post that ignited this > discussion is correct in the first place. > > > Thanks Mark! Interesting discussion on btrfs. > > > > Bob > > You're welcome! I agree, it's an interesting discussion. And regarding the > misspelling of my name: no problem :-) . > > -- > Marc Joliet > -- > "People who think they know everything really annoy those of us who know we > don't" - Bjarne Stroustrup -- - ^ permalink raw reply [flat|nested] 30+ messages in thread
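For contrast, here is a toy sketch of the redundant-copy behaviour the btrfs-list posts describe: a mirror copy is only used for repair if it independently passes its own checksum. Again, this is illustrative pseudologic with an invented checksum (btrfs actually uses crc32c), not btrfs's real scrub code:

    from typing import List, Optional

    def checksum(data: bytes) -> int:
        """Same invented stand-in checksum as above; btrfs really uses crc32c."""
        return sum(data) & 0xFFFFFFFF

    def scrub_mirror_block(copies: List[bytes], stored_sum: int) -> Optional[bytes]:
        """Toy model of redundant-copy repair: a copy is only treated as
        good if it verifies against the stored checksum by itself.  If no
        copy verifies, the error is reported rather than 'repaired'."""
        for copy in copies:
            if checksum(copy) == stored_sum:
                return copy   # verified copy: safe to rewrite the bad mirror
        return None           # unrecoverable: flag it, never write back a guess

    # One mirror copy is corrupt, the other still verifies, so repair is safe.
    good, bad = b"important data", b"importent data"
    assert scrub_mirror_block([bad, good], checksum(good)) == good

The design difference the thread is circling around: in this model nothing unverified is ever promoted to "good" and written back, so a RAM fault can spoil a single repair attempt but cannot silently bless corrupted data.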
* [gentoo-amd64] Re: Soliciting new RAID ideas
  2014-05-28 19:20 ` Marc Joliet
  2014-05-28 19:56 ` Bob Sanders
@ 2014-05-29  7:08 ` Duncan
  1 sibling, 0 replies; 30+ messages in thread
From: Duncan @ 2014-05-29 7:08 UTC (permalink / raw)
To: gentoo-amd64

Marc Joliet posted on Wed, 28 May 2014 21:20:18 +0200 as excerpted:

> On Wed, 28 May 2014 08:26:58 -0700, Bob Sanders
> <rsanders@sgi.com> wrote:
>
>> Marc Joliet, mused, then expounded: [snipped]
>
>> Thanks Mark! Interesting discussion on btrfs.
>>
>> [followup] Apologies - that should have been - Thanks Marc!
>
> You're welcome! I agree, it's an interesting discussion. And regarding
> the misspelling of my name: no problem :-) .

=:^)

But seriously, thanks, Bob, for pointing out the misspelling. There's a
Mark (with a k) who's quite active on the btrfs list (and has in fact
done quite a bit of testing on the raid56 stuff, and written most of
several related pages on the btrfs wiki), and I guess my brain has so
associated him with the btrfs discussion context that, without actually
thinking about it, I assumed this was the same "Mark" here.

So pointing out that it's actually Marc-with-a-c here alerted me to the
fact that it's not the same person, and very possibly saved a very
confused Duncan from making quite a fool of himself in some future post,
either here or there. So thanks VERY MUCH, Bob! =:^)

(FWIW, my first name is John. But at least in my generation there are so
many Johns around, and Duncan as a last name isn't uncommon either, that
there are quite a few John Duncans around too, and it's all horribly
confusing. I even worked with a Donna at one point, and in a fairly noisy
environment all you hear of either name is the -on- bit, so we were
always both or neither answering calls for either one of us, since
neither of us could easily hear which name had actually been called. So I
switched to the mononym "Duncan", which has been MUCH less confusing over
the decades I've been using it. Anyway, I can definitely identify with
first-name confusion.) =:^)

--
Duncan - List replies preferred.  No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

^ permalink raw reply	[flat|nested] 30+ messages in thread
* Re: [gentoo-amd64] Soliciting new RAID ideas
  2014-05-27 22:13 [gentoo-amd64] Soliciting new RAID ideas Mark Knecht
  2014-05-27 22:39 ` Bob Sanders
@ 2014-05-27 23:05 ` Alex Alexander
  1 sibling, 0 replies; 30+ messages in thread
From: Alex Alexander @ 2014-05-27 23:05 UTC (permalink / raw)
To: Gentoo-amd64

[-- Attachment #1: Type: text/plain, Size: 1400 bytes --]

On Wed, May 28, 2014 at 1:13 AM, Mark Knecht <markknecht@gmail.com> wrote:

> 1) Buy three (or even just two) 5400 RPM 3TB WD Red drives and go with
> RAID1. This would use the internal SATA2 ports so it wouldn't be the
> highest performance but likely a lot better than my SATA2 RAID6.
>

This. Thinking ahead is important - drives tend to fill up faster when
you have more free space available. Get three drives if possible, go
RAID5, then when you run out of space (you will), just add one more and
you're happy again.

This setup has one more advantage: you get to keep your old drives and
re-use them. One interesting idea would be to use 3 of your old drives in
a RAID5 setup for Gentoo. It wouldn't be as fast as a couple of SSDs, but
you're already used to that speed, and you instantly get two spare drives
in case one of the old drives fails. You could also use the spare space
on this array for backups of critical stuff from the main RAID.

You can always switch to SSDs for the main system later :)

> Beyond this I need to talk file system types. I'm fat dumb and
> happy with Ext4 and don't really relish dealing with new stuff but
> now's the time to at least look.

New tech is nice, but I'd stick with ext4. Data is one of the few things
on my systems that I don't like to toy with.

Cheers,
--
Alex Alexander
+ wired
+ www.linuxized.com
+ www.leetworks.com

[-- Attachment #2: Type: text/html, Size: 2248 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread
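To put rough numbers on the grow-as-you-go suggestion, a back-of-envelope sketch (actual usable sizes will come out somewhat lower once md metadata and filesystem overhead are taken out):

    def raid5_usable(drives: int, size_tb: float) -> float:
        """RAID5 usable capacity: one drive's worth of space goes to parity."""
        return (drives - 1) * size_tb

    print(raid5_usable(3, 3.0))  # three 3TB Reds          -> 6.0 TB usable
    print(raid5_usable(4, 3.0))  # grow by one more drive  -> 9.0 TB usable
    print(raid5_usable(3, 0.5))  # three old 500GB drives  -> 1.0 TB for Gentoo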
end of thread, other threads: [~2014-06-06 17:07 UTC | newest]

Thread overview: 30+ messages

2014-05-27 22:13 [gentoo-amd64] Soliciting new RAID ideas Mark Knecht
2014-05-27 22:39 ` Bob Sanders
2014-05-27 22:58 ` Harry Holt
2014-05-27 23:38 ` thegeezer
2014-05-28  0:26 ` Rich Freeman
2014-05-28  3:12 ` [gentoo-amd64] btrfs Was: " Duncan
2014-05-28  7:29 ` thegeezer
2014-05-28 20:32 ` Marc Joliet
2014-05-29  6:41 ` [gentoo-amd64] " Duncan
2014-05-29 17:57 ` Marc Joliet
2014-05-29 17:59 ` Rich Freeman
2014-05-29 18:25 ` Mark Knecht
2014-05-29 21:05 ` Frank Peters
2014-05-30  2:04 ` [gentoo-amd64] amd64 list, still useful? Was: btrfs Duncan
2014-05-30  2:44 ` Frank Peters
2014-05-30  6:25 ` [gentoo-amd64] " Duncan
2014-06-04 16:41 ` [gentoo-amd64] " Mark Knecht
2014-06-05  2:00 ` [gentoo-amd64] " Duncan
2014-06-05 18:59 ` Mark Knecht
2014-06-06 12:11 ` Duncan
[not found] ` <Alo71o01J1aVA4001lo9xP>
2014-06-06 17:07 ` Duncan
2014-05-27 23:32 ` [gentoo-amd64] Soliciting new RAID ideas Mark Knecht
2014-05-27 23:51 ` Marc Joliet
2014-05-28 15:26 ` Bob Sanders
2014-05-28 15:28 ` Bob Sanders
2014-05-28 16:10 ` Rich Freeman
2014-05-28 19:20 ` Marc Joliet
2014-05-28 19:56 ` Bob Sanders
2014-05-29  7:08 ` [gentoo-amd64] " Duncan
2014-05-27 23:05 ` [gentoo-amd64] " Alex Alexander