From: Michael <confabulate@kintzios.com>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Hard drive and maximum data percentage.
Date: Sun, 02 Feb 2025 11:00:13 +0000
Message-ID: <22583521.EfDdHjke4D@rogueboard>
In-Reply-To: <CAGfcS_mh-BdaU2XN-=zx69h2F26RXuMREwKhCSJ+tG_wA_RMiQ@mail.gmail.com>
On Sunday 2 February 2025 02:07:07 Greenwich Mean Time Rich Freeman wrote:
> On Sat, Feb 1, 2025 at 8:40 PM Dale <rdalek1967@gmail.com> wrote:
> > Rich Freeman wrote:
> > > Now, if you were running btrfs or cephfs or some other exotic
> > > filesystems, then it would be a whole different matter,
> >
> > I could see
> > some RAID systems having issues but not some of the more advanced file
> > systems that are designed to handle large amounts of data.
>
> Those are "RAID-like" systems, which is part of why they struggle when
> full. Unlike traditional RAID they also don't require identical
> drives for replication, which can make things tricky when they start
> to get full and finding blocks that meet the replication requirements
> becomes difficult.
>
> With a COW approach like btrfs you also have the issue that altering
> the metadata requires free space. To delete a file you first write
> new metadata that deallocates the space for the file, then you update
> the pointers to make it part of the disk metadata. Since the metadata
> is stored in a tree, updating a leaf node requires modifying all of
> its parents up to the root, which requires making new copies of them.
> It isn't until the entire branch of the tree is copied that you can
> delete the old version of it. The advantage of this approach is that
> it is very safe, and accomplishes the equivalent of full data
> journaling without actually having to make more than one write of
> things. If that operation is aborted the tree just points at the old
> metadata and the in-progress copies are inside of free space, ignored
> by the filesystem, and thus they just get overwritten the next time
> the operation is attempted.
>
> For something like ceph it isn't really much of a downside since it
> is intended to be professionally managed. For something like btrfs
> it seems like more of an issue as it was intended to be a
> general-purpose filesystem for desktops/etc, and so it would be
> desirable to make it less likely to break when it runs low on space.
> However, that's just one of many ways to break btrfs, so... :)
>
> In any case, running out of space is one of those things that becomes
> more of an issue the more complicated the metadata gets. For
> something simple like ext4 that just overwrites stuff in place by
> default it isn't a big deal at all.
I've had /var/cache/distfiles on ext4 fill up more than a dozen times,
because I forgot to run eclean-dist and didn't get a chance to tweak
partitions in time to accommodate a larger fs. Similarly, I've had / on
ext4 fill up on a number of occasions over the years. Both of the ext4
filesystems mentioned above were created with default options, hence -m, the
percentage of blocks reserved for the OS, would have been 5%. I cannot recall
ever losing data or ending up with a corrupted fs. Removing a few files to
create some empty space allowed the file which didn't fit before to be written
successfully, and that was that. Resuming whatever process had been stopped
(typically emerge) allowed it to complete.
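For reference, the reserved percentage on an existing ext4 fs can be checked
with tune2fs. A minimal sketch (the device path and block counts below are
made-up examples, not my actual setup):

```shell
# Check reserved blocks on an ext4 fs (needs root; /dev/sda3 is an example):
#   tune2fs -l /dev/sda3 | grep -i 'reserved block count'
#
# With the default mke2fs -m 5, the share held back for root works out to:
total_blocks=26214400                    # e.g. a 100 GiB fs with 4 KiB blocks
reserved=$(( total_blocks * 5 / 100 ))
echo "reserved for root: $reserved blocks"
#
# And distfiles can be pruned before the fs fills up with:
#   eclean-dist --deep
```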
I also had smaller single btrfs partitions fill up a couple of times. I
didn't lose any data, but then again these were standalone filesystems, not
part of some ill-advised buggy btrfs RAID5 configuration.
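For what it's worth, when a btrfs gets close to full it can usually be nursed
back with a filtered balance. A rough sketch (the mount point and the Size/Used
figures are illustrative examples; the actual commands need root):

```shell
# Inspect chunk allocation on a nearly-full btrfs (/mnt is an example):
#   btrfs filesystem usage /mnt
#
# Repack data chunks that are less than 10% used, returning space to the
# unallocated pool so metadata writes can succeed again:
#   btrfs balance start -dusage=10 /mnt
#
# The -dusage=N filter only rewrites chunks under N% utilisation. Given the
# Size/Used figures reported by the usage command, utilisation is simply:
size=500   # GiB allocated to data chunks (example figure)
used=350   # GiB actually used (example figure)
echo "data chunk utilisation: $(( used * 100 / size ))%"
```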
I don't deal with data volumes of the size Dale is playing with, so I can't
comment on the suitability of different filesystems for such a use case.