* [gentoo-user] Fast file system for cache directory with lot's of files @ 2012-08-13 13:16 Michael Hampicke 2012-08-13 13:22 ` Nilesh Govindrajan ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Michael Hampicke @ 2012-08-13 13:16 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 1060 bytes --] Howdy gentooers, I am looking for a filesystem that performs well for a cache directory. Here's some data on that dir: - cache for prescaled image files + metadata files - nested directory structure ( 20/2022/202231/*files* ) - about 20GB - 100,000 directories - about 2 million files The system has 2x Intel Xeon quad-cores (Nehalem), 16GB of RAM and two 10,000rpm hard drives running a RAID1. Up until now I was using ext4 with noatime, but I am not happy with its performance. Finding and deleting old files with 'find' is incredibly slow, so I am looking for a filesystem that performs better. The first candidate that came to mind was reiserfs, but last time I tried it, it became slower over time (fragmentation?). Currently I am running a test with btrfs and so far I am quite happy with it, as it is much faster in my use case. Do you guys have any other suggestions? How about JFS? I used that on my old NAS box because of its low CPU usage. Should I give reiser4 a try, or better leave it be given Hans Reiser's current status? Thx in advance, Mike [-- Attachment #2: Type: text/html, Size: 1283 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
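For reference, the cleanup pass described above usually boils down to a single find invocation over the cache root. A minimal sketch only, assuming a hypothetical cache path of /var/cache/images and a 30-day retention (neither is stated in the thread):

  # delete regular files untouched for more than 30 days, then prune
  # any directories the deletions have left empty
  find /var/cache/images -type f -mtime +30 -delete
  find /var/cache/images -mindepth 1 -type d -empty -delete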
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 13:16 [gentoo-user] Fast file system for cache directory with lot's of files Michael Hampicke @ 2012-08-13 13:22 ` Nilesh Govindrajan 2012-08-13 13:54 ` Michael Hampicke 2012-08-13 14:38 ` Daniel Troeder 2012-08-13 20:13 ` Paul Hartman 2 siblings, 1 reply; 41+ messages in thread From: Nilesh Govindrajan @ 2012-08-13 13:22 UTC (permalink / raw To: gentoo-user On Mon 13 Aug 2012 06:46:53 PM IST, Michael Hampicke wrote: > Howdy gentooers, > > I am looking for a filesystem that perfomes well for a cache > directory. Here's some data on that dir: > - cache for prescaled images files + metadata files > - nested directory structure ( 20/2022/202231/*files* ) > - about 20GB > - 100.000 directories > - about 2 million files > > The system has 2x Intel Xon Quad-cores (Nehalem), 16GB of RAM and two > 10.000rpm hard drives running a RAID1. > > Up until now I was using ext4 with noatime, but I am not happy with > it's performence. Finding and deleting old files with 'find' is > incredible slow, so I am looking for a filesystem that performs > better. First candiate that came to mind was reiserfs, but last time I > tried it, it became slower over time (fragmentation?). > Currently I am running a test with btrfs and so far I am quiet happy > with it as it is much faster in my use case. > > Do you guys have any other suggestions? How about JFS? I used that on > my old NAS box because of it's low cpu usage. Should I give reiser4 a > try, or better leave it be given Hans Reiser's current status? > > Thx in advance, > Mike You should have a look at xfs. I used to use ext4 earlier, traversing through /usr/portage used to be very slow. When I switched xfs, speed increased drastically. This might be kind of unrelated, but makes sense. -- Nilesh Govindrajan http://nileshgr.com ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 13:22 ` Nilesh Govindrajan @ 2012-08-13 13:54 ` Michael Hampicke 2012-08-13 14:19 ` Pandu Poluan 2012-08-13 14:40 ` Dale 0 siblings, 2 replies; 41+ messages in thread From: Michael Hampicke @ 2012-08-13 13:54 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 524 bytes --] > > You should have a look at xfs. > > I used to use ext4 earlier, traversing through /usr/portage used to be > very slow. When I switched xfs, speed increased drastically. > > This might be kind of unrelated, but makes sense. I guess traversing through directories may be faster with XFS, but in my experience ext4 performs better than XFS in regard to operations (cp, rm) on small files. I read that there are some tuning options for XFS and small files, but I have never tried them. But if someone seconds XFS I will try it too. [-- Attachment #2: Type: text/html, Size: 782 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
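The XFS small-file tuning hinted at here usually means a larger inode size at mkfs time plus bigger log buffers at mount time. A sketch only, with a hypothetical device and mount point; the values are illustrative assumptions, not recommendations from the thread:

  # larger inodes keep more metadata (and small extended attributes) inline
  mkfs.xfs -i size=512 /dev/sdb1
  # bigger in-memory log buffers help metadata-heavy workloads
  mount -o noatime,logbufs=8,logbsize=256k /dev/sdb1 /var/cache/images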
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 13:54 ` Michael Hampicke @ 2012-08-13 14:19 ` Pandu Poluan 2012-08-13 14:42 ` Michael Hampicke 2012-08-13 14:40 ` Dale 1 sibling, 1 reply; 41+ messages in thread From: Pandu Poluan @ 2012-08-13 14:19 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 734 bytes --] On Aug 13, 2012 9:01 PM, "Michael Hampicke" <mgehampicke@gmail.com> wrote: >> >> You should have a look at xfs. >> >> I used to use ext4 earlier, traversing through /usr/portage used to be very slow. When I switched xfs, speed increased drastically. >> >> This might be kind of unrelated, but makes sense. > > > I guess traversing through directories may be faster with XFS, but in my experience ext4 perfoms better than XFS in regard to operations (cp, rm) on small files. > I read that there are some tuning options for XFS and small files, but never tried it. > > But if somone seconds XFS I will try it too. Have you indexed your ext4 partition? # tune2fs -O dir_index /dev/your_partition # e2fsck -D /dev/your_partition Rgds, [-- Attachment #2: Type: text/html, Size: 955 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
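Whether dir_index is already enabled can be checked before rebuilding anything; note that e2fsck -D has to run on an unmounted (or read-only mounted) filesystem. A quick check, again with a placeholder device name:

  # list enabled features; look for "dir_index" in the output
  tune2fs -l /dev/your_partition | grep -i features
  # dumpe2fs shows the same information
  dumpe2fs -h /dev/your_partition | grep -i features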
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 14:19 ` Pandu Poluan @ 2012-08-13 14:42 ` Michael Hampicke 2012-08-13 14:52 ` Michael Mol 0 siblings, 1 reply; 41+ messages in thread From: Michael Hampicke @ 2012-08-13 14:42 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 254 bytes --] > > Have you indexed your ext4 partition? > > # tune2fs -O dir_index /dev/your_partition > # e2fsck -D /dev/your_partition > Hi, dir_index is active. I guess that's why delete operations take as long as they do (the index has to be updated every time). [-- Attachment #2: Type: text/html, Size: 424 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 14:42 ` Michael Hampicke @ 2012-08-13 14:52 ` Michael Mol 2012-08-13 15:26 ` Michael Hampicke 2012-08-13 17:14 ` Florian Philipp 0 siblings, 2 replies; 41+ messages in thread From: Michael Mol @ 2012-08-13 14:52 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 434 bytes --] On Mon, Aug 13, 2012 at 10:42 AM, Michael Hampicke <mgehampicke@gmail.com>wrote: > Have you indexed your ext4 partition? >> >> # tune2fs -O dir_index /dev/your_partition >> # e2fsck -D /dev/your_partition >> > Hi, the dir_index is active. I guess that's why delete operations take as > long as they take (index has to be updated every time) > 1) Scan for files to remove 2) disable index 3) Remove files 4) enable index ? -- :wq [-- Attachment #2: Type: text/html, Size: 947 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 14:52 ` Michael Mol @ 2012-08-13 15:26 ` Michael Hampicke 2012-08-13 15:52 ` Michael Mol 2012-08-13 17:14 ` Florian Philipp 1 sibling, 1 reply; 41+ messages in thread From: Michael Hampicke @ 2012-08-13 15:26 UTC (permalink / raw To: gentoo-user Am 13.08.2012 16:52, schrieb Michael Mol: > On Mon, Aug 13, 2012 at 10:42 AM, Michael Hampicke <mgehampicke@gmail.com>wrote: > >> Have you indexed your ext4 partition? >>> >>> # tune2fs -O dir_index /dev/your_partition >>> # e2fsck -D /dev/your_partition >>> >> Hi, the dir_index is active. I guess that's why delete operations take as >> long as they take (index has to be updated every time) >> > > 1) Scan for files to remove > 2) disable index > 3) Remove files > 4) enable index > > ? > That's what I love about gentoo-users :) - I would never have thought of that myself. I will try this and see how much of a performance gain there is. Disabling the index should only require a 'mount -o remount', I guess. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 15:26 ` Michael Hampicke @ 2012-08-13 15:52 ` Michael Mol 0 siblings, 0 replies; 41+ messages in thread From: Michael Mol @ 2012-08-13 15:52 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 1407 bytes --] On Mon, Aug 13, 2012 at 11:26 AM, Michael Hampicke <mgehampicke@gmail.com>wrote: > Am 13.08.2012 16:52, schrieb Michael Mol: > > On Mon, Aug 13, 2012 at 10:42 AM, Michael Hampicke < > mgehampicke@gmail.com>wrote: > > > >> Have you indexed your ext4 partition? > >>> > >>> # tune2fs -O dir_index /dev/your_partition > >>> # e2fsck -D /dev/your_partition > >>> > >> Hi, the dir_index is active. I guess that's why delete operations take > as > >> long as they take (index has to be updated every time) > >> > > > > 1) Scan for files to remove > > 2) disable index > > 3) Remove files > > 4) enable index > > > > ? > > > > That's what I love about gentoo-users :) , I would never have thought of > that myself. I will try this and see how much of an performance gain > there is. Disabling the index should only require a 'mount -o remount' I > guess. > > It's the same logic as behind database programming; do a bulk modification, then update the index afterwards. The index update will take longer than that for a single file, but it has the potential to be more efficient in bulk operations. (It'd be nice if ext4 supported transactional behaviors, where you could defer index updates until after a commit, but I don't think it (or any filesystem on Linux) does.) You *should* be able to enable/disable indexes on a per-directory basis, so if your search pattern is confined, I'd go that route. -- :wq [-- Attachment #2: Type: text/html, Size: 2009 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
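As a concrete sketch of the disable/delete/re-enable cycle being proposed: dir_index is a superblock feature flag, so it is toggled with tune2fs rather than a mount option, and the rebuild step (e2fsck -D) needs the filesystem offline. Paths and device names below are hypothetical, and whether this actually speeds up the bulk delete is exactly what the thread is debating:

  umount /var/cache/images
  tune2fs -O ^dir_index /dev/sdb1       # drop the htree index feature
  mount /dev/sdb1 /var/cache/images
  find /var/cache/images -type f -mtime +30 -delete   # the bulk delete
  umount /var/cache/images
  tune2fs -O dir_index /dev/sdb1        # re-enable the feature
  e2fsck -fD /dev/sdb1                  # rebuild/optimize the directory indexes
  mount /dev/sdb1 /var/cache/images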
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 14:52 ` Michael Mol 2012-08-13 15:26 ` Michael Hampicke @ 2012-08-13 17:14 ` Florian Philipp 2012-08-13 18:18 ` Michael Hampicke 1 sibling, 1 reply; 41+ messages in thread From: Florian Philipp @ 2012-08-13 17:14 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 1352 bytes --] Am 13.08.2012 16:52, schrieb Michael Mol: > On Mon, Aug 13, 2012 at 10:42 AM, Michael Hampicke > <mgehampicke@gmail.com <mailto:mgehampicke@gmail.com>> wrote: > > Have you indexed your ext4 partition? > > # tune2fs -O dir_index /dev/your_partition > # e2fsck -D /dev/your_partition > > Hi, the dir_index is active. I guess that's why delete operations > take as long as they take (index has to be updated every time) > > > 1) Scan for files to remove > 2) disable index > 3) Remove files > 4) enable index > > ? > > -- > :wq Other things to think about: 1. Play around with data=journal/writeback/ordered. IIRC, data=journal actually used to improve performance depending on the workload as it delays random IO in favor of sequential IO (when updating the journal). 2. Increase the journal size. 3. Take a look at `man 1 chattr`. Especially the 'T' attribute. Of course this only helps after re-allocating everything. 4. Try parallelizing. Ext4 requires relatively few locks nowadays (since 2.6.39 IIRC). For example: find $TOP_DIR -mindepth 1 -maxdepth 1 -print0 | \ xargs -0 -n 1 -r -P 4 -I '{}' find '{}' -type f 5. Use a separate device for the journal. 6. Temporarily deactivate the journal with tune2fs similar to MM's idea. Regards, Florian Philipp [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 262 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
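Points 2 and 3 above map to short command sequences; a sketch with hypothetical device and path names (the 400 MB journal size is an arbitrary example):

  # 2. grow the journal: drop the existing one, then recreate it larger
  #    (run with the filesystem unmounted; size is in megabytes)
  tune2fs -O ^has_journal /dev/sdb1
  tune2fs -j -J size=400 /dev/sdb1
  # 3. mark the cache root as the top of a directory hierarchy so the
  #    block allocator spreads its subdirectories across block groups
  chattr +T /var/cache/images
  lsattr -d /var/cache/images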
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 17:14 ` Florian Philipp @ 2012-08-13 18:18 ` Michael Hampicke 2012-08-14 14:00 ` Florian Philipp 0 siblings, 1 reply; 41+ messages in thread From: Michael Hampicke @ 2012-08-13 18:18 UTC (permalink / raw To: gentoo-user Am 13.08.2012 19:14, schrieb Florian Philipp: > Am 13.08.2012 16:52, schrieb Michael Mol: >> On Mon, Aug 13, 2012 at 10:42 AM, Michael Hampicke >> <mgehampicke@gmail.com <mailto:mgehampicke@gmail.com>> wrote: >> >> Have you indexed your ext4 partition? >> >> # tune2fs -O dir_index /dev/your_partition >> # e2fsck -D /dev/your_partition >> >> Hi, the dir_index is active. I guess that's why delete operations >> take as long as they take (index has to be updated every time) >> >> >> 1) Scan for files to remove >> 2) disable index >> 3) Remove files >> 4) enable index >> >> ? >> >> -- >> :wq > > Other things to think about: > > 1. Play around with data=journal/writeback/ordered. IIRC, data=journal > actually used to improve performance depending on the workload as it > delays random IO in favor of sequential IO (when updating the journal). > > 2. Increase the journal size. > > 3. Take a look at `man 1 chattr`. Especially the 'T' attribute. Of > course this only helps after re-allocating everything. > > 4. Try parallelizing. Ext4 requires relatively few locks nowadays (since > 2.6.39 IIRC). For example: > find $TOP_DIR -mindepth 1 -maxdepth 1 -print0 | \ > xargs -0 -n 1 -r -P 4 -I '{}' find '{}' -type f > > 5. Use a separate device for the journal. > > 6. Temporarily deactivate the journal with tune2fs similar to MM's idea. > > Regards, > Florian Philipp > Trying out different journal modes/options was already on my list, but the manpage on chattr regarding the T attribute is an interesting read. Definitely worth trying. Parallelizing multiple finds was something I already tried, but the only thing that increased was the IO wait :) But now, having read all the suggestions in this thread, I might try it again. A separate device for the journal is a good idea, but not possible atm (the machine is abroad in a data center). ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 18:18 ` Michael Hampicke @ 2012-08-14 14:00 ` Florian Philipp 2012-08-14 17:42 ` Michael Hampicke 0 siblings, 1 reply; 41+ messages in thread From: Florian Philipp @ 2012-08-14 14:00 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 3176 bytes --] Am 13.08.2012 20:18, schrieb Michael Hampicke: > Am 13.08.2012 19:14, schrieb Florian Philipp: >> Am 13.08.2012 16:52, schrieb Michael Mol: >>> On Mon, Aug 13, 2012 at 10:42 AM, Michael Hampicke >>> <mgehampicke@gmail.com <mailto:mgehampicke@gmail.com>> wrote: >>> >>> Have you indexed your ext4 partition? >>> >>> # tune2fs -O dir_index /dev/your_partition >>> # e2fsck -D /dev/your_partition >>> >>> Hi, the dir_index is active. I guess that's why delete operations >>> take as long as they take (index has to be updated every time) >>> >>> >>> 1) Scan for files to remove >>> 2) disable index >>> 3) Remove files >>> 4) enable index >>> >>> ? >>> >>> -- >>> :wq >> >> Other things to think about: >> >> 1. Play around with data=journal/writeback/ordered. IIRC, data=journal >> actually used to improve performance depending on the workload as it >> delays random IO in favor of sequential IO (when updating the journal). >> >> 2. Increase the journal size. >> >> 3. Take a look at `man 1 chattr`. Especially the 'T' attribute. Of >> course this only helps after re-allocating everything. >> >> 4. Try parallelizing. Ext4 requires relatively few locks nowadays (since >> 2.6.39 IIRC). For example: >> find $TOP_DIR -mindepth 1 -maxdepth 1 -print0 | \ >> xargs -0 -n 1 -r -P 4 -I '{}' find '{}' -type f >> >> 5. Use a separate device for the journal. >> >> 6. Temporarily deactivate the journal with tune2fs similar to MM's idea. >> >> Regards, >> Florian Philipp >> > > Trying out different journals-/options was already on my list, but the > manpage on chattr regarding the T attribute is an interesting read. > Definitely worth trying. > > Parallelizing multiple finds was something I already did, but the only > thing that increased was the IO wait :) But now having read all the > suggestions in this thread, I might try it again. > > Separate device for the journal is a good idea, but not possible atm > (machine is abroad in a data center) > Something else I just remembered. I guess it doesn't help you with your current problem but it might come in handy when working with such large cache dirs: I once wrote a script that sorts files by their starting physical block. This improved reading them quite a bit (2 minutes instead of 11 minutes for copying the portage tree). It's a terrible kludge, will probably fail when passing FS boundaries or a thousand other oddities, and requires root for some very scary programs. I never had the time to finish an improved C version. Anyway, maybe it helps you:
#!/bin/bash
#
# Example below copies /usr/portage/* to /tmp/portage.
# Replace /usr/portage with the input directory.
# Replace `cpio` with whatever does the actual work. Input is a
# \0-delimited file list.
#
FIFO=/tmp/$(uuidgen).fifo
mkfifo "$FIFO"
find /usr/portage -type f -fprintf "$FIFO" 'bmap <%i> 0\n' -print0 |
tr '\n\0' '\0\n' |
paste <(
  debugfs -f "$FIFO" /dev/mapper/vg-portage |
  grep -E '^[[:digit:]]+'
) - |
sort -k 1,1n |
cut -f 2- |
tr '\n\0' '\0\n' |
cpio -p0 --make-directories /tmp/portage/
unlink "$FIFO"
[-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 262 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 14:00 ` Florian Philipp @ 2012-08-14 17:42 ` Michael Hampicke 0 siblings, 0 replies; 41+ messages in thread From: Michael Hampicke @ 2012-08-14 17:42 UTC (permalink / raw To: gentoo-user Am 14.08.2012 16:00, schrieb Florian Philipp: > Am 13.08.2012 20:18, schrieb Michael Hampicke: >> Am 13.08.2012 19:14, schrieb Florian Philipp: >>> Am 13.08.2012 16:52, schrieb Michael Mol: >>>> On Mon, Aug 13, 2012 at 10:42 AM, Michael Hampicke >>>> <mgehampicke@gmail.com <mailto:mgehampicke@gmail.com>> wrote: >>>> >>>> Have you indexed your ext4 partition? >>>> >>>> # tune2fs -O dir_index /dev/your_partition >>>> # e2fsck -D /dev/your_partition >>>> >>>> Hi, the dir_index is active. I guess that's why delete operations >>>> take as long as they take (index has to be updated every time) >>>> >>>> >>>> 1) Scan for files to remove >>>> 2) disable index >>>> 3) Remove files >>>> 4) enable index >>>> >>>> ? >>>> >>>> -- >>>> :wq >>> >>> Other things to think about: >>> >>> 1. Play around with data=journal/writeback/ordered. IIRC, data=journal >>> actually used to improve performance depending on the workload as it >>> delays random IO in favor of sequential IO (when updating the journal). >>> >>> 2. Increase the journal size. >>> >>> 3. Take a look at `man 1 chattr`. Especially the 'T' attribute. Of >>> course this only helps after re-allocating everything. >>> >>> 4. Try parallelizing. Ext4 requires relatively few locks nowadays (since >>> 2.6.39 IIRC). For example: >>> find $TOP_DIR -mindepth 1 -maxdepth 1 -print0 | \ >>> xargs -0 -n 1 -r -P 4 -I '{}' find '{}' -type f >>> >>> 5. Use a separate device for the journal. >>> >>> 6. Temporarily deactivate the journal with tune2fs similar to MM's idea. >>> >>> Regards, >>> Florian Philipp >>> >> >> Trying out different journals-/options was already on my list, but the >> manpage on chattr regarding the T attribute is an interesting read. >> Definitely worth trying. >> >> Parallelizing multiple finds was something I already did, but the only >> thing that increased was the IO wait :) But now having read all the >> suggestions in this thread, I might try it again. >> >> Separate device for the journal is a good idea, but not possible atm >> (machine is abroad in a data center) >> > > Something else I just remembered. I guess it doesn't help you with your > current problem but it might come in handy when working with such large > cache dirs: I once wrote a script that sorts files by their starting > physical block. This improved reading them quite a bit (2 minutes > instead of 11 minutes for copying the portage tree). > > It's a terrible clutch, will probably fail when passing FS boundaries or > a thousand other oddities and requires root for some very scary > programs. I never had the time to finish an improved C version. Anyway, > maybe it helps you: > > #!/bin/bash > # > # Example below copies /usr/portage/* to /tmp/portage. > # Replace /usr/portage with the input directory. > # Replace `cpio` with whatever does the actual work. Input is a > # \0-delimited file list. 
> # > FIFO=/tmp/$(uuidgen).fifo > mkfifo "$FIFO" > find /usr/portage -type f -fprintf "$FIFO" 'bmap <%i> 0\n' -print0 | > tr '\n\0' '\0\n' | > paste <( > debugfs -f "$FIFO" /dev/mapper/vg-portage | > grep -E '^[[:digit:]]+' > ) - | > sort -k 1,1n | > cut -f 2- | > tr '\n\0' '\0\n' | > cpio -p0 --make-directories /tmp/portage/ > unlink "$FIFO" > No, I don't think that's practicable with the number of files in my setup. To be honest, currently I am quite happy with the performance of btrfs. Running through the directory tree only takes 1/10th of the time it took with ext4, and deletes are pretty fast as well. I'm sure there's still room for more improvement, but right now it's much better than it was before. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 13:54 ` Michael Hampicke 2012-08-13 14:19 ` Pandu Poluan @ 2012-08-13 14:40 ` Dale 2012-08-13 14:58 ` Michael Hampicke 1 sibling, 1 reply; 41+ messages in thread From: Dale @ 2012-08-13 14:40 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 807 bytes --] Michael Hampicke wrote: > > You should have a look at xfs. > > I used to use ext4 earlier, traversing through /usr/portage used > to be very slow. When I switched xfs, speed increased drastically. > > This might be kind of unrelated, but makes sense. > > > I guess traversing through directories may be faster with XFS, but in > my experience ext4 perfoms better than XFS in regard to operations > (cp, rm) on small files. > I read that there are some tuning options for XFS and small files, but > never tried it. > > But if somone seconds XFS I will try it too. It's been a while since I messed with this but isn't XFS the one that hates power failures and such? Dale :-) :-) -- I am only responsible for what I said ... Not for what you understood or how you interpreted my words! [-- Attachment #2: Type: text/html, Size: 1715 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 14:40 ` Dale @ 2012-08-13 14:58 ` Michael Hampicke 2012-08-13 15:20 ` Nilesh Govindrajan 0 siblings, 1 reply; 41+ messages in thread From: Michael Hampicke @ 2012-08-13 14:58 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 967 bytes --] > > I guess traversing through directories may be faster with XFS, but in my > experience ext4 perfoms better than XFS in regard to operations (cp, rm) on > small files. > I read that there are some tuning options for XFS and small files, but > never tried it. > > But if somone seconds XFS I will try it too. > > > It's been a while since I messed with this but isn't XFS the one that > hates power failures and such? > > Dale > > :-) :-) > > -- > I am only responsible for what I said ... Not for what you understood or how you interpreted my words! > > Well, it's the delayed allocation of XFS (which prevents fragmentation) that does not like sudden power losses :) But ext4 has that too; you can disable it, though - that should be true for XFS too. But the power situation in the datacenter has never been a problem so far, and even if the cache partition gets screwed, we can always rebuild it. Takes a few hours, but it would not be the end of the world :) [-- Attachment #2: Type: text/html, Size: 1549 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
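On ext4 the knob in question is the nodelalloc mount option; a sketch of what turning delayed allocation off for the cache partition could look like in fstab (device and mount point are hypothetical, and it usually costs some write performance):

  /dev/sdb1  /var/cache/images  ext4  noatime,nodelalloc  0 2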
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 14:58 ` Michael Hampicke @ 2012-08-13 15:20 ` Nilesh Govindrajan 0 siblings, 0 replies; 41+ messages in thread From: Nilesh Govindrajan @ 2012-08-13 15:20 UTC (permalink / raw To: gentoo-user On Mon 13 Aug 2012 08:28:15 PM IST, Michael Hampicke wrote: >> I guess traversing through directories may be faster with XFS, >> but in my experience ext4 perfoms better than XFS in regard to >> operations (cp, rm) on small files. >> I read that there are some tuning options for XFS and small >> files, but never tried it. >> >> But if somone seconds XFS I will try it too. > > It's been a while since I messed with this but isn't XFS the one > that hates power failures and such? > > Dale > > :-) :-) > > -- > I am only responsible for what I said ... Not for what you understood or how you interpreted my words! > > Well, it's the delayed allocation of XFS (which prevents > fragmentation) that does not like sudden power losses :) But ext4 has > that too, you can disable it though - that should be true for XFS too. > But the power situation in the datacenter has never been a problem so > far, and even if the cache partition get's screwed, we can always > rebuild it. Takes a few hours, but it would not be the end of the world :) Yes, XFS hates power failures. I got a giant UPS for my home desktop to use XFS because of its excellent performance ;-) -- Nilesh Govindrajan http://nileshgr.com ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 13:16 [gentoo-user] Fast file system for cache directory with lot's of files Michael Hampicke 2012-08-13 13:22 ` Nilesh Govindrajan @ 2012-08-13 14:38 ` Daniel Troeder 2012-08-13 14:53 ` Michael Hampicke 2012-08-13 20:13 ` Paul Hartman 2 siblings, 1 reply; 41+ messages in thread From: Daniel Troeder @ 2012-08-13 14:38 UTC (permalink / raw To: gentoo-user On 13.08.2012 15:16, Michael Hampicke wrote: > - about 20GB > - 100.000 directories > - about 2 million files > > The system has 2x Intel Xon Quad-cores (Nehalem), 16GB of RAM and two > 10.000rpm hard drives running a RAID1. 1st thought: switch to SSDs. 2nd thought: maybe lots of writes? -> get an SSD for the fs metadata. 3rd thought: purging old files with "find"? Your cache system should have some kind of DB that holds that information. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 14:38 ` Daniel Troeder @ 2012-08-13 14:53 ` Michael Hampicke 2012-08-14 8:21 ` Daniel Troeder 0 siblings, 1 reply; 41+ messages in thread From: Michael Hampicke @ 2012-08-13 14:53 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 833 bytes --] 2012/8/13 Daniel Troeder <daniel@admin-box.com> > On 13.08.2012 15:16, Michael Hampicke wrote: > > - about 20GB > > - 100.000 directories > > - about 2 million files > > > > The system has 2x Intel Xon Quad-cores (Nehalem), 16GB of RAM and two > > 10.000rpm hard drives running a RAID1. > 1st thought: switch to SSDs > 2nd thought: maybe lots of writes? -> get a SSD for the fs metadata > 3rd thought: purging old files with "find"? your cache system should > have some kind of DB that holds that information. > > 1: SSDs are not possible atm. The machine is in a data center abroad. 2: Writes are not that much of a problem at this time. 3: Well, it's a 3rd party application that - in theory - should take care of removing old files. Sadly, it does not work as it's supposed to; as time passes, the number of orphans grows :( [-- Attachment #2: Type: text/html, Size: 1210 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 14:53 ` Michael Hampicke @ 2012-08-14 8:21 ` Daniel Troeder 2012-08-14 9:46 ` Neil Bothwick 2012-08-14 17:45 ` Michael Hampicke 0 siblings, 2 replies; 41+ messages in thread From: Daniel Troeder @ 2012-08-14 8:21 UTC (permalink / raw To: gentoo-user On 13.08.2012 16:53, Michael Hampicke wrote: > 2012/8/13 Daniel Troeder <daniel@admin-box.com > 3rd thought: purging old files with "find"? your cache system should > have some kind of DB that holds that information. > 3: Well, it's a 3rd party application that - in theory - should take > care of removing old files. Sadly, it does not work as it's supposed to > be, While time passes the number of orphans grow :( There is also the possibility to write a really small daemon (less than 50 lines of C) that registers with inotify for the entire fs and journals the file activity to a sqlite-db. A simple sql-query from a cron/bash script will then give you all the files to delete with paths. It will probably be less work to write the daemon than to do 40 fs-benchmarks - and the result will be the most efficient. ^ permalink raw reply [flat|nested] 41+ messages in thread
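Before writing a C daemon, the same idea can be prototyped in a few lines of shell with inotifywait (from inotify-tools) feeding sqlite3. This is only a sketch: the cache path and database location are made-up placeholders, paths containing a single quote would need escaping, and spawning one sqlite3 process per event would need batching in real use:

  DB=/var/lib/cachefiles.db
  sqlite3 "$DB" 'CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, added INTEGER);'
  # record every file created or moved into the cache tree
  inotifywait -rm -e create -e moved_to --format '%w%f' /var/cache/images |
  while IFS= read -r f; do
      sqlite3 "$DB" "INSERT OR REPLACE INTO files VALUES ('$f', strftime('%s','now'));"
  done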
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 8:21 ` Daniel Troeder @ 2012-08-14 9:46 ` Neil Bothwick 2012-08-14 13:00 ` Florian Philipp 2012-08-14 13:54 ` Daniel Troeder 2012-08-14 17:45 ` Michael Hampicke 1 sibling, 2 replies; 41+ messages in thread From: Neil Bothwick @ 2012-08-14 9:46 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 437 bytes --] On Tue, 14 Aug 2012 10:21:54 +0200, Daniel Troeder wrote: > There is also the possibility to write a really small daemon (less than > 50 lines of C) that registers with inotify for the entire fs and > journals the file activity to a sqlite-db. sys-process/incron ? -- Neil Bothwick A friend of mine sent me a postcard with a satellite photo of the entire planet on it, and on the back he wrote, "Wish you were here." [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 9:46 ` Neil Bothwick @ 2012-08-14 13:00 ` Florian Philipp 2012-08-14 13:54 ` Daniel Troeder 1 sibling, 0 replies; 41+ messages in thread From: Florian Philipp @ 2012-08-14 13:00 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 524 bytes --] Am 14.08.2012 11:46, schrieb Neil Bothwick: > On Tue, 14 Aug 2012 10:21:54 +0200, Daniel Troeder wrote: > >> There is also the possibility to write a really small daemon (less than >> 50 lines of C) that registers with inotify for the entire fs and >> journals the file activity to a sqlite-db. > > sys-process/incron ? > > I think in order to make it work, you have to increase the number of file descriptors available to inotify. See /proc/sys/fs/inotify/max_user_watches Regards, Florian Philipp [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 262 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
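With roughly 100,000 directories, a recursive watch needs one inotify watch per directory, so the default limit (often 8192) is far too low. Raising it is a one-liner; the figure below is just an example sized for this cache:

  cat /proc/sys/fs/inotify/max_user_watches                         # current limit
  sysctl fs.inotify.max_user_watches=200000                         # for the running system
  echo 'fs.inotify.max_user_watches = 200000' >> /etc/sysctl.conf   # persist across reboots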
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 9:46 ` Neil Bothwick 2012-08-14 13:00 ` Florian Philipp @ 2012-08-14 13:54 ` Daniel Troeder 2012-08-14 15:09 ` Florian Philipp 2012-08-16 16:54 ` Neil Bothwick 1 sibling, 2 replies; 41+ messages in thread From: Daniel Troeder @ 2012-08-14 13:54 UTC (permalink / raw To: gentoo-user On 14.08.2012 11:46, Neil Bothwick wrote: > On Tue, 14 Aug 2012 10:21:54 +0200, Daniel Troeder wrote: > >> There is also the possibility to write a really small daemon (less than >> 50 lines of C) that registers with inotify for the entire fs and >> journals the file activity to a sqlite-db. > > sys-process/incron ? Uh... didn't know that one! ... very interesting :) Have you used it? How does it perform if there are lots of modifications going on? Does it have a throttle against fork bombing? must-read-myself-a-little..... An incron line # sqlite3 /file.sql 'INSERT filename, date INTO table' would be inefficient, because it spawns lots of processes, but it would be very nice to simply test out the idea. Then a # sqlite3 /file.sql 'SELECT filename FROM table SORTBY date < date-30days' or something to get the files older than 30 days, and voilà :) ^ permalink raw reply [flat|nested] 41+ messages in thread
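The two pseudo-queries above would look roughly like this in real sqlite3 syntax (database path, table and column names are made up for the example):

  # record a new cache file
  sqlite3 /file.db "INSERT OR REPLACE INTO files (path, added) VALUES ('$f', strftime('%s','now'));"
  # list everything older than 30 days
  sqlite3 /file.db "SELECT path FROM files WHERE added < strftime('%s','now') - 30*24*3600;"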
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 13:54 ` Daniel Troeder @ 2012-08-14 15:09 ` Florian Philipp 2012-08-14 15:33 ` Florian Philipp 0 siblings, 1 reply; 41+ messages in thread From: Florian Philipp @ 2012-08-14 15:09 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 2227 bytes --] Am 14.08.2012 15:54, schrieb Daniel Troeder: > On 14.08.2012 11:46, Neil Bothwick wrote: >> On Tue, 14 Aug 2012 10:21:54 +0200, Daniel Troeder wrote: >> >>> There is also the possibility to write a really small daemon (less than >>> 50 lines of C) that registers with inotify for the entire fs and >>> journals the file activity to a sqlite-db. >> >> sys-process/incron ? > Uh... didn't know that one! ... very interesting :) > > Have you used it? > How does it perform if there are lots of modifications going on? > Does it have a throttle against fork bombing? > must-read-myself-a-little..... > > A incron line > # sqlite3 /file.sql 'INSERT filename, date INTO table' > would be inefficient, because it spawn lots of processes, but it would > be very nice to simply test out the idea. Then a > # sqlite3 /file.sql 'SELECT filename FROM table SORTBY date < date-30days' > or something to get the files older than 30 days, and voilá :) > > Maybe inotifywait is better for this kind of batch job. Collecting events:
inotifywait -rm -e CREATE,DELETE --timefmt '%s' --format \
  "$(printf '%%T\t%%e\t%%w%%f')" /tmp > events.tbl
# the printf is there because inotifywait's format does not
# recognize common escapes like \t
# Output format:
# Seconds since epoch \t CREATE/DELETE \t file name \n
Filtering events:
sort --stable -k3 events.tbl | awk '
function update() { line=$0; exists= $2=="DELETE" ? 0 : 1; file=$3 }
NR==1{ update(); next }
{ if($3!=file && exists==1){ print line } update() }'
# Sorts by file name while preserving temporal order.
# Uses awk to suppress output of files that have been deleted.
# Output: Last CREATE event for each existing file
Retrieving files created 30+ days ago:
awk -v newest=$(date -d -5seconds +%s) '
$1>newest{ nextfile }
{ print $3 }'
Remarks: The awk scripts need some improvement if you have to handle whitespaces in filenames but with the input format, it should be able to work with everything except newlines. Inotifywait itself is utterly useless when dealing with newlines in file names unless you want to put some serious effort into sanitizing the output. Regards, Florian Philipp [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 262 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 15:09 ` Florian Philipp @ 2012-08-14 15:33 ` Florian Philipp 0 siblings, 0 replies; 41+ messages in thread From: Florian Philipp @ 2012-08-14 15:33 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 213 bytes --] Am 14.08.2012 17:09, schrieb Florian Philipp: > > Retrieving files created 30+ days ago: > awk -v newest=$(date -d -5seconds +%s) ' > $1>newest{ nextfile } > { print $3 }' > s/-5seconds/-30days/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 262 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 13:54 ` Daniel Troeder 2012-08-14 15:09 ` Florian Philipp @ 2012-08-16 16:54 ` Neil Bothwick 1 sibling, 0 replies; 41+ messages in thread From: Neil Bothwick @ 2012-08-16 16:54 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 499 bytes --] On Tue, 14 Aug 2012 15:54:17 +0200, Daniel Troeder wrote: > > sys-process/incron ? > Uh... didn't know that one! ... very interesting :) > > Have you used it? Yes... > How does it perform if there are lots of modifications going on? > Does it have a throttle against fork bombing? > must-read-myself-a-little..... but only for fairly infrequently written locations. I have no idea how well it scales. -- Neil Bothwick Hors d'oeuvres: 3 sandwiches cut into 40 pieces. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 8:21 ` Daniel Troeder 2012-08-14 9:46 ` Neil Bothwick @ 2012-08-14 17:45 ` Michael Hampicke 1 sibling, 0 replies; 41+ messages in thread From: Michael Hampicke @ 2012-08-14 17:45 UTC (permalink / raw To: gentoo-user Am 14.08.2012 10:21, schrieb Daniel Troeder: > On 13.08.2012 16:53, Michael Hampicke wrote: >> 2012/8/13 Daniel Troeder <daniel@admin-box.com >> 3rd thought: purging old files with "find"? your cache system should >> have some kind of DB that holds that information. >> 3: Well, it's a 3rd party application that - in theory - should take >> care of removing old files. Sadly, it does not work as it's supposed to >> be, While time passes the number of orphans grow :( > There is also the possibility to write a really small daemon (less than > 50 lines of C) that registers with inotify for the entire fs and > journals the file activity to a sqlite-db. > > A simple sql-query from a cron/bash script will then give you all the > files to delete with paths. > > It will probably be less work to write the daemon than to do 40 > fs-benchmarks - and the result will be the most efficient. > That is an interesting idea, but I have never used inotify on such a huge file base, so I am not sure what impact that has in terms of CPU cycles being used. But I am going to try this on some snowy winter weekend :) ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 13:16 [gentoo-user] Fast file system for cache directory with lot's of files Michael Hampicke 2012-08-13 13:22 ` Nilesh Govindrajan 2012-08-13 14:38 ` Daniel Troeder @ 2012-08-13 20:13 ` Paul Hartman 2012-08-13 20:41 ` Volker Armin Hemmann 2012-08-14 2:07 ` Adam Carter 2 siblings, 2 replies; 41+ messages in thread From: Paul Hartman @ 2012-08-13 20:13 UTC (permalink / raw To: gentoo-user On Mon, Aug 13, 2012 at 8:16 AM, Michael Hampicke <mgehampicke@gmail.com> wrote: > Howdy gentooers, > > I am looking for a filesystem that perfomes well for a cache directory. > Here's some data on that dir: > - cache for prescaled images files + metadata files > - nested directory structure ( 20/2022/202231/*files* ) > - about 20GB > - 100.000 directories > - about 2 million files > > The system has 2x Intel Xon Quad-cores (Nehalem), 16GB of RAM and two > 10.000rpm hard drives running a RAID1. > > Up until now I was using ext4 with noatime, but I am not happy with it's > performence. Finding and deleting old files with 'find' is incredible slow, > so I am looking for a filesystem that performs better. First candiate that > came to mind was reiserfs, but last time I tried it, it became slower over > time (fragmentation?). > Currently I am running a test with btrfs and so far I am quiet happy with it > as it is much faster in my use case. > > Do you guys have any other suggestions? How about JFS? I used that on my old > NAS box because of it's low cpu usage. Should I give reiser4 a try, or > better leave it be given Hans Reiser's current status? I think btrfs probably is meant to provide a lot of the modern features like reiser4 or xfs (tail-packing, indexing, compression, snapshots, subvolumes, etc). Don't know if it is considered stable enough for your usage but at least it is under active development and funded by large names. I think if you would consider reiser4 as a possibility then you should consider btrfs as well. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 20:13 ` Paul Hartman @ 2012-08-13 20:41 ` Volker Armin Hemmann 2012-08-14 2:07 ` Adam Carter 1 sibling, 0 replies; 41+ messages in thread From: Volker Armin Hemmann @ 2012-08-13 20:41 UTC (permalink / raw To: gentoo-user; +Cc: Paul Hartman Am Montag, 13. August 2012, 15:13:03 schrieb Paul Hartman: > On Mon, Aug 13, 2012 at 8:16 AM, Michael Hampicke <mgehampicke@gmail.com> wrote: > > Howdy gentooers, > > > > I am looking for a filesystem that perfomes well for a cache directory. > > Here's some data on that dir: > > - cache for prescaled images files + metadata files > > - nested directory structure ( 20/2022/202231/*files* ) > > - about 20GB > > - 100.000 directories > > - about 2 million files > > > > The system has 2x Intel Xon Quad-cores (Nehalem), 16GB of RAM and two > > 10.000rpm hard drives running a RAID1. > > > > Up until now I was using ext4 with noatime, but I am not happy with it's > > performence. Finding and deleting old files with 'find' is incredible > > slow, > > so I am looking for a filesystem that performs better. First candiate that > > came to mind was reiserfs, but last time I tried it, it became slower over > > time (fragmentation?). > > Currently I am running a test with btrfs and so far I am quiet happy with > > it as it is much faster in my use case. > > > > Do you guys have any other suggestions? How about JFS? I used that on my > > old NAS box because of it's low cpu usage. Should I give reiser4 a try, > > or better leave it be given Hans Reiser's current status? > > I think btrfs probably is meant to provide a lot of the modern > features like reiser4 or xfs (tail-packing, indexing, compression, > snapshots, subvolumes, etc). Don't know if it is considered stable > enough for your usage but at least it is under active development and > funded by large names. I think if you would consider reiser4 as a > possibility then you should consider btrfs as well. reiser4 has one feature btrfs and every other filesystem is missing: atomic operations. Which is a wonderful feature. Too bad 'politics' killed reiser4. -- #163933 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-13 20:13 ` Paul Hartman 2012-08-13 20:41 ` Volker Armin Hemmann @ 2012-08-14 2:07 ` Adam Carter 2012-08-14 16:36 ` Helmut Jarausch 1 sibling, 1 reply; 41+ messages in thread From: Adam Carter @ 2012-08-14 2:07 UTC (permalink / raw To: gentoo-user > I think btrfs probably is meant to provide a lot of the modern > features like reiser4 or xfs Unfortunately btrfs is still generally slower than ext4, for example. Check out http://openbenchmarking.org/, e.g. http://openbenchmarking.org/s/ext4%20btrfs The OS will use any spare RAM for disk caching, so if there's not much else running on that box, most of your content will be served from RAM. It may be that whatever fs you choose won't make that much of a difference anyway. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 2:07 ` Adam Carter @ 2012-08-14 16:36 ` Helmut Jarausch 2012-08-14 17:05 ` Pandu Poluan 2012-08-15 7:31 ` Bill Kenworthy 0 siblings, 2 replies; 41+ messages in thread From: Helmut Jarausch @ 2012-08-14 16:36 UTC (permalink / raw To: gentoo-user On 08/14/2012 04:07:39 AM, Adam Carter wrote: > > I think btrfs probably is meant to provide a lot of the modern > > features like reiser4 or xfs > > Unfortunately btrfs is still generally slower than ext4 for example. > Checkout http://openbenchmarking.org/, eg > http://openbenchmarking.org/s/ext4%20btrfs > > The OS will use any spare RAM for disk caching, so if there's not much > else running on that box, most of your content will be served from > RAM. It may be that whatever fs you choose wont make that much of a > difference anyways. > If one can run a recent kernel (3.5.x), btrfs seems quite stable (it's used by some distributions and by Oracle for real work). Most benchmarks don't use compression since other filesystems can't use it, but that's unfair. With compression, one needs to read much less data (my /usr partition is less than 50% of the size of an ext4 partition; savings on the root partition are even higher). I'm using the mount options compress=lzo,noacl,noatime,autodefrag,space_cache, which require a recent kernel. I'd give it a try. Helmut. ^ permalink raw reply [flat|nested] 41+ messages in thread
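For reference, the options Helmut lists would go into fstab roughly like this; the device and mount point are hypothetical placeholders for the cache partition:

  /dev/sdb1  /var/cache/images  btrfs  compress=lzo,noacl,noatime,autodefrag,space_cache  0 0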
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 16:36 ` Helmut Jarausch @ 2012-08-14 17:05 ` Pandu Poluan 2012-08-14 17:21 ` Jason Weisberger ` (2 more replies) 2012-08-15 7:31 ` Bill Kenworthy 1 sibling, 3 replies; 41+ messages in thread From: Pandu Poluan @ 2012-08-14 17:05 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 1311 bytes --] On Aug 14, 2012 11:42 PM, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de> wrote: > > On 08/14/2012 04:07:39 AM, Adam Carter wrote: >> >> > I think btrfs probably is meant to provide a lot of the modern >> > features like reiser4 or xfs >> >> Unfortunately btrfs is still generally slower than ext4 for example. >> Checkout http://openbenchmarking.org/, eg >> http://openbenchmarking.org/s/ext4%20btrfs >> >> The OS will use any spare RAM for disk caching, so if there's not much >> else running on that box, most of your content will be served from >> RAM. It may be that whatever fs you choose wont make that much of a >> difference anyways. >> > > If one can run a recent kernel (3.5.x) btrfs seems quite stable (It's used by some distribution and Oracle for real work) > Most benchmark don't use compression since other FS can't use it. But that's unfair. With compression, one needs to read > much less data (my /usr partition has less than 50% of an ext4 partition, savings with the root partition are even higher). > > I'm using the mount options compress=lzo,noacl,noatime,autodefrag,space_cache which require a recent kernel. > > I'd give it a try. > > Helmut. > Are the support tools for btrfs (fsck, defrag, etc.) already complete? If so, I certainly would like to take it out for a spin... Rgds, [-- Attachment #2: Type: text/html, Size: 1773 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 17:05 ` Pandu Poluan @ 2012-08-14 17:21 ` Jason Weisberger 2012-08-14 17:42 ` Volker Armin Hemmann 2012-08-14 17:48 ` Michael Hampicke 2012-08-14 17:42 ` Volker Armin Hemmann 2012-08-14 19:39 ` Paul Hartman 2 siblings, 2 replies; 41+ messages in thread From: Jason Weisberger @ 2012-08-14 17:21 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 1560 bytes --] Sure, but wouldn't compression make write operations slower? And isn't he looking for performance? On Aug 14, 2012 1:14 PM, "Pandu Poluan" <pandu@poluan.info> wrote: > > On Aug 14, 2012 11:42 PM, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de> > wrote: > > > > On 08/14/2012 04:07:39 AM, Adam Carter wrote: > >> > >> > I think btrfs probably is meant to provide a lot of the modern > >> > features like reiser4 or xfs > >> > >> Unfortunately btrfs is still generally slower than ext4 for example. > >> Checkout http://openbenchmarking.org/, eg > >> http://openbenchmarking.org/s/ext4%20btrfs > >> > >> The OS will use any spare RAM for disk caching, so if there's not much > >> else running on that box, most of your content will be served from > >> RAM. It may be that whatever fs you choose wont make that much of a > >> difference anyways. > >> > > > > If one can run a recent kernel (3.5.x) btrfs seems quite stable (It's > used by some distribution and Oracle for real work) > > Most benchmark don't use compression since other FS can't use it. But > that's unfair. With compression, one needs to read > > much less data (my /usr partition has less than 50% of an ext4 > partition, savings with the root partition are even higher). > > > > I'm using the mount options > compress=lzo,noacl,noatime,autodefrag,space_cache which require a recent > kernel. > > > > I'd give it a try. > > > > Helmut. > > > > Are the support tools for btrfs (fsck, defrag, etc.) already complete? > > If so, I certainly would like to take it out for a spin... > > Rgds, > > [-- Attachment #2: Type: text/html, Size: 2274 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 17:21 ` Jason Weisberger @ 2012-08-14 17:42 ` Volker Armin Hemmann 2012-08-14 17:50 ` Michael Hampicke 0 siblings, 1 reply; 41+ messages in thread From: Volker Armin Hemmann @ 2012-08-14 17:42 UTC (permalink / raw To: gentoo-user; +Cc: Jason Weisberger Am Dienstag, 14. August 2012, 13:21:35 schrieb Jason Weisberger: > Sure, but wouldn't compression make write operations slower? And isn't he > looking for performance? Not really, as long as the CPU can compress faster than the disk can write stuff. More interesting: is btrfs trying to be smart - only compressing compressible stuff? -- #163933 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 17:42 ` Volker Armin Hemmann @ 2012-08-14 17:50 ` Michael Hampicke 2012-08-14 19:55 ` Alecks Gates 0 siblings, 1 reply; 41+ messages in thread From: Michael Hampicke @ 2012-08-14 17:50 UTC (permalink / raw To: gentoo-user Am 14.08.2012 19:42, schrieb Volker Armin Hemmann: > Am Dienstag, 14. August 2012, 13:21:35 schrieb Jason Weisberger: >> Sure, but wouldn't compression make write operations slower? And isn't he >> looking for performance? > > not really. As long as the CPU can compress faster than the disk can write > stuff. > > More interessting: is btrfs trying to be smart - only compressing compressible > stuff? > It does do that, but letting btrfs check whether the files are already compressed, when you already know that they are, is a waste of CPU cycles :) ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 17:50 ` Michael Hampicke @ 2012-08-14 19:55 ` Alecks Gates 2012-08-14 20:17 ` Michael Mol 0 siblings, 1 reply; 41+ messages in thread From: Alecks Gates @ 2012-08-14 19:55 UTC (permalink / raw To: gentoo-user On Tue, Aug 14, 2012 at 12:50 PM, Michael Hampicke <gentoo-user@hadt.biz> wrote: > Am 14.08.2012 19:42, schrieb Volker Armin Hemmann: >> Am Dienstag, 14. August 2012, 13:21:35 schrieb Jason Weisberger: >>> Sure, but wouldn't compression make write operations slower? And isn't he >>> looking for performance? >> >> not really. As long as the CPU can compress faster than the disk can write >> stuff. >> >> More interessting: is btrfs trying to be smart - only compressing compressible >> stuff? >> > > It does do that, but letting btrfs check if the files are already > compressed, if you know, that they are compressed, is a waste of cpu > cycles :) > Also look into the difference between compress and compress-force[0]. I wonder how much overhead checking whether or not to compress a file costs. I use mount options similar to Helmut and get great results: defaults,autodefrag,space_cache,compress=lzo,subvol=@,relatime But most of my data is compressible. Compression makes such a huge difference, it surprises me. Apparently on this Ubuntu system it automatically makes use of all files on / as a subvolume in "@". Interesting. Anyway, btrfs-progs does include basic fsck now but I wouldn't use it for anything serious[1]. [0] https://btrfs.wiki.kernel.org/index.php/Mount_options [1] https://btrfs.wiki.kernel.org/index.php/Btrfsck ^ permalink raw reply [flat|nested] 41+ messages in thread
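The difference referenced in [0], in mount-option form: compress=lzo lets btrfs skip files it judges incompressible, while compress-force=lzo compresses everything regardless. For a cache of already-compressed JPEGs, as described earlier in the thread, plain compress (or no compression at all) is the variant that fits; device and mount point below are placeholders:

  mount -o compress=lzo       /dev/sdb1 /mnt/test   # heuristic: skip incompressible files
  mount -o compress-force=lzo /dev/sdb1 /mnt/test   # compress everything, no heuristic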
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 19:55 ` Alecks Gates @ 2012-08-14 20:17 ` Michael Mol 2012-08-14 20:57 ` Alecks Gates 0 siblings, 1 reply; 41+ messages in thread From: Michael Mol @ 2012-08-14 20:17 UTC (permalink / raw To: gentoo-user [-- Attachment #1: Type: text/plain, Size: 1505 bytes --] On Tue, Aug 14, 2012 at 3:55 PM, Alecks Gates <alecks.g@gmail.com> wrote: > On Tue, Aug 14, 2012 at 12:50 PM, Michael Hampicke <gentoo-user@hadt.biz> > wrote: > > Am 14.08.2012 19:42, schrieb Volker Armin Hemmann: > >> Am Dienstag, 14. August 2012, 13:21:35 schrieb Jason Weisberger: > >>> Sure, but wouldn't compression make write operations slower? And > isn't he > >>> looking for performance? > >> > >> not really. As long as the CPU can compress faster than the disk can > write > >> stuff. > >> > >> More interessting: is btrfs trying to be smart - only compressing > compressible > >> stuff? > >> > > > > It does do that, but letting btrfs check if the files are already > > compressed, if you know, that they are compressed, is a waste of cpu > > cycles :) > > > > Also look into the difference between compress and compress-force[0]. > I wonder how much overhead checking whether or not to compress a file > costs. I use mount options similar to Helmut and get great results: > defaults,autodefrag,space_cache,compress=lzo,subvol=@,relatime > > But most of my data is compressible. Compression makes such a huge > difference, it surprises me. Apparently on this Ubuntu system it > automatically makes use of all files on / as a subvolume in "@". > Interesting. > Huge difference, how? Could we see some bonnie++ comparisons between the various configurations we've discussed for ext4 and btrfs? Depending on the results, it might be getting time for me to take the plunge myself. -- :wq [-- Attachment #2: Type: text/html, Size: 2089 bytes --] ^ permalink raw reply [flat|nested] 41+ messages in thread
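A bonnie++ run geared towards the small-file pattern discussed in this thread might look like the following; the mount point, sizes and file counts are assumptions for illustration (with bonnie++, -n is given in multiples of 1024 files):

  # 32 GiB of sequential I/O plus 128k small files of 0-16 KiB spread over 100 subdirectories
  bonnie++ -d /mnt/test -s 32g -n 128:16384:0:100 -u nobody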
* Re: [gentoo-user] Fast file system for cache directory with lot's of files 2012-08-14 20:17 ` Michael Mol @ 2012-08-14 20:57 ` Alecks Gates 0 siblings, 0 replies; 41+ messages in thread From: Alecks Gates @ 2012-08-14 20:57 UTC (permalink / raw To: gentoo-user On Tue, Aug 14, 2012 at 3:17 PM, Michael Mol <mikemol@gmail.com> wrote: > On Tue, Aug 14, 2012 at 3:55 PM, Alecks Gates <alecks.g@gmail.com> wrote: >> >> On Tue, Aug 14, 2012 at 12:50 PM, Michael Hampicke <gentoo-user@hadt.biz> >> wrote: >> > Am 14.08.2012 19:42, schrieb Volker Armin Hemmann: >> >> Am Dienstag, 14. August 2012, 13:21:35 schrieb Jason Weisberger: >> >>> Sure, but wouldn't compression make write operations slower? And >> >>> isn't he >> >>> looking for performance? >> >> >> >> not really. As long as the CPU can compress faster than the disk can >> >> write >> >> stuff. >> >> >> >> More interessting: is btrfs trying to be smart - only compressing >> >> compressible >> >> stuff? >> >> >> > >> > It does do that, but letting btrfs check if the files are already >> > compressed, if you know, that they are compressed, is a waste of cpu >> > cycles :) >> > >> >> Also look into the difference between compress and compress-force[0]. >> I wonder how much overhead checking whether or not to compress a file >> costs. I use mount options similar to Helmut and get great results: >> defaults,autodefrag,space_cache,compress=lzo,subvol=@,relatime >> >> But most of my data is compressible. Compression makes such a huge >> difference, it surprises me. Apparently on this Ubuntu system it >> automatically makes use of all files on / as a subvolume in "@". >> Interesting. > > > Huge difference, how? > > Could we see some bonnie++ comparisons between the various configurations > we've discussed for ext4 and btrfs? Depending on the results, it might be > getting time for me to take the plunge myself. > > -- > :wq Check out some of the benchmarks on Phoronix[0]. It's definitely not a win-win scenario, but it seems to be great at random writes and compiling. And a lot of those wins are without compress=lzo enabled, so it only gets better. I'm not going to say it's the absolute best out there (because it isn't, of course), but it's at least worth checking into. I'm using a standard 2.5" HDD like in this[1] so perhaps that's why I see the results. [0] http://www.phoronix.com/scan.php?page=search&q=Btrfs [1] http://www.phoronix.com/scan.php?page=article&item=btrfs_old_linux31 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files
  2012-08-14 17:21 ` Jason Weisberger
  2012-08-14 17:42 ` Volker Armin Hemmann
@ 2012-08-14 17:48 ` Michael Hampicke
  1 sibling, 0 replies; 41+ messages in thread
From: Michael Hampicke @ 2012-08-14 17:48 UTC (permalink / raw)
To: gentoo-user

On 14.08.2012 19:21, Jason Weisberger wrote:
> Sure, but wouldn't compression make write operations slower? And isn't he
> looking for performance?
> On Aug 14, 2012 1:14 PM, "Pandu Poluan" <pandu@poluan.info> wrote:
>
>>
>> On Aug 14, 2012 11:42 PM, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de>
>> wrote:
>>>
>>> On 08/14/2012 04:07:39 AM, Adam Carter wrote:
>>>>
>>>>> I think btrfs probably is meant to provide a lot of the modern
>>>>> features like reiser4 or xfs
>>>>
>>>> Unfortunately btrfs is still generally slower than ext4, for example.
>>>> Check out http://openbenchmarking.org/, e.g.
>>>> http://openbenchmarking.org/s/ext4%20btrfs
>>>>
>>>> The OS will use any spare RAM for disk caching, so if there's not much
>>>> else running on that box, most of your content will be served from
>>>> RAM. It may be that whatever fs you choose won't make that much of a
>>>> difference anyway.
>>>>
>>>
>>> If one can run a recent kernel (3.5.x), btrfs seems quite stable (it's
>>> used by some distributions and by Oracle for real work).
>>> Most benchmarks don't use compression since other filesystems can't use
>>> it. But that's unfair. With compression, one needs to read much less
>>> data (my /usr partition takes less than 50% of the space it needs on
>>> ext4; the savings on the root partition are even higher).
>>>
>>> I'm using the mount options
>>> compress=lzo,noacl,noatime,autodefrag,space_cache, which require a
>>> recent kernel.
>>>
>>> I'd give it a try.
>>>
>>> Helmut.
>>>
>>
>> Are the support tools for btrfs (fsck, defrag, etc.) already complete?
>>
>> If so, I certainly would like to take it out for a spin...
>>
>> Rgds,
>>

I have enough CPU power at hand for compression, so I guess that should
not be the issue. But the cache dir mostly consists of prescaled JPEG
images, so compressing them again would not give me any benefit, speed-
or size-wise.

^ permalink raw reply	[flat|nested] 41+ messages in thread
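A quick way to check that assumption on a sample from the cache — the path is made up, and lzop is used only because it applies the same LZO algorithm that compress=lzo would:

  # Compare the original size of one prescaled image with its LZO-compressed size.
  stat -c %s /data/cache/sample.jpg
  lzop -c /data/cache/sample.jpg | wc -c
  # For JPEG data the two numbers usually end up within a few percent of each
  # other, so forcing compression would cost CPU without saving any space.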
* Re: [gentoo-user] Fast file system for cache directory with lot's of files
  2012-08-14 17:05 ` Pandu Poluan
  2012-08-14 17:21 ` Jason Weisberger
@ 2012-08-14 17:42 ` Volker Armin Hemmann
  2012-08-14 19:39 ` Paul Hartman
  2 siblings, 0 replies; 41+ messages in thread
From: Volker Armin Hemmann @ 2012-08-14 17:42 UTC (permalink / raw)
To: gentoo-user; +Cc: Pandu Poluan

On Wednesday, 15 August 2012, 00:05:40, Pandu Poluan wrote:
>
> Are the support tools for btrfs (fsck, defrag, etc.) already complete?

No.

--
#163933

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files
  2012-08-14 17:05 ` Pandu Poluan
  2012-08-14 17:21 ` Jason Weisberger
  2012-08-14 17:42 ` Volker Armin Hemmann
@ 2012-08-14 19:39 ` Paul Hartman
  2 siblings, 0 replies; 41+ messages in thread
From: Paul Hartman @ 2012-08-14 19:39 UTC (permalink / raw)
To: gentoo-user

On Tue, Aug 14, 2012 at 12:05 PM, Pandu Poluan <pandu@poluan.info> wrote:
>
> On Aug 14, 2012 11:42 PM, "Helmut Jarausch" <jarausch@igpm.rwth-aachen.de>
> wrote:
>>
>> On 08/14/2012 04:07:39 AM, Adam Carter wrote:
>>>
>>> > I think btrfs probably is meant to provide a lot of the modern
>>> > features like reiser4 or xfs
>>>
>>> Unfortunately btrfs is still generally slower than ext4, for example.
>>> Check out http://openbenchmarking.org/, e.g.
>>> http://openbenchmarking.org/s/ext4%20btrfs
>>>
>>> The OS will use any spare RAM for disk caching, so if there's not much
>>> else running on that box, most of your content will be served from
>>> RAM. It may be that whatever fs you choose won't make that much of a
>>> difference anyway.
>>>
>>
>> If one can run a recent kernel (3.5.x), btrfs seems quite stable (it's
>> used by some distributions and by Oracle for real work).
>> Most benchmarks don't use compression since other filesystems can't use
>> it. But that's unfair. With compression, one needs to read much less
>> data (my /usr partition takes less than 50% of the space it needs on
>> ext4; the savings on the root partition are even higher).
>>
>> I'm using the mount options
>> compress=lzo,noacl,noatime,autodefrag,space_cache, which require a
>> recent kernel.
>>
>> I'd give it a try.
>>
>> Helmut.
>>
>
> Are the support tools for btrfs (fsck, defrag, etc.) already complete?

Do they exist? Yes (sys-fs/btrfs-progs).

Are they complete? Probably not...

^ permalink raw reply	[flat|nested] 41+ messages in thread
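For reference, a short sketch of what installing and poking at sys-fs/btrfs-progs looks like on Gentoo — the device and mount point are placeholders, and the comment on btrfsck reflects the incomplete state discussed in this thread rather than any guarantee:

  emerge --ask sys-fs/btrfs-progs

  btrfs filesystem show              # list btrfs filesystems and their member devices
  btrfs filesystem df /data/cache    # space usage split into data / metadata / system
  btrfsck /dev/sdX1                  # offline check of an unmounted device; at this
                                     # point it mostly detects problems rather than
                                     # repairing them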
* Re: [gentoo-user] Fast file system for cache directory with lot's of files
  2012-08-14 16:36 ` Helmut Jarausch
  2012-08-14 17:05 ` Pandu Poluan
@ 2012-08-15  7:31 ` Bill Kenworthy
  2012-08-15  8:13 ` Bill Kenworthy
  1 sibling, 1 reply; 41+ messages in thread
From: Bill Kenworthy @ 2012-08-15 7:31 UTC (permalink / raw)
To: gentoo-user

On Tue, 2012-08-14 at 18:36 +0200, Helmut Jarausch wrote:
> On 08/14/2012 04:07:39 AM, Adam Carter wrote:
> > > I think btrfs probably is meant to provide a lot of the modern
> > > features like reiser4 or xfs
> >
> > Unfortunately btrfs is still generally slower than ext4, for example.
> > Check out http://openbenchmarking.org/, e.g.
> > http://openbenchmarking.org/s/ext4%20btrfs
> >
> > The OS will use any spare RAM for disk caching, so if there's not much
> > else running on that box, most of your content will be served from
> > RAM. It may be that whatever fs you choose won't make that much of a
> > difference anyway.
> >
>
> If one can run a recent kernel (3.5.x), btrfs seems quite stable (it's
> used by some distributions and by Oracle for real work).
> Most benchmarks don't use compression since other filesystems can't use
> it. But that's unfair. With compression, one needs to read much less
> data (my /usr partition takes less than 50% of the space it needs on
> ext4; the savings on the root partition are even higher).
>
> I'm using the mount options
> compress=lzo,noacl,noatime,autodefrag,space_cache, which require a
> recent kernel.
>
> I'd give it a try.
>
> Helmut.
>

What's the latest on fsck tools for btrfs? Useful ones are still not
available, right? The reason I am asking is that it's not an easy
question to google, and my last attempt to use btrfs for serious work
ended in tears when I couldn't rescue a corrupted file system.

BillK

^ permalink raw reply	[flat|nested] 41+ messages in thread
* Re: [gentoo-user] Fast file system for cache directory with lot's of files
  2012-08-15  7:31 ` Bill Kenworthy
@ 2012-08-15  8:13 ` Bill Kenworthy
  0 siblings, 0 replies; 41+ messages in thread
From: Bill Kenworthy @ 2012-08-15 8:13 UTC (permalink / raw)
To: gentoo-user

On Wed, 2012-08-15 at 15:31 +0800, Bill Kenworthy wrote:
> On Tue, 2012-08-14 at 18:36 +0200, Helmut Jarausch wrote:
> > On 08/14/2012 04:07:39 AM, Adam Carter wrote:
> > > > I think btrfs probably is meant to provide a lot of the modern
> > > > features like reiser4 or xfs
> > >
> > > Unfortunately btrfs is still generally slower than ext4, for example.
> > > Check out http://openbenchmarking.org/, e.g.
> > > http://openbenchmarking.org/s/ext4%20btrfs
> > >
> > > The OS will use any spare RAM for disk caching, so if there's not much
> > > else running on that box, most of your content will be served from
> > > RAM. It may be that whatever fs you choose won't make that much of a
> > > difference anyway.
> > >
> >
> > If one can run a recent kernel (3.5.x), btrfs seems quite stable (it's
> > used by some distributions and by Oracle for real work).
> > Most benchmarks don't use compression since other filesystems can't use
> > it. But that's unfair. With compression, one needs to read much less
> > data (my /usr partition takes less than 50% of the space it needs on
> > ext4; the savings on the root partition are even higher).
> >
> > I'm using the mount options
> > compress=lzo,noacl,noatime,autodefrag,space_cache, which require a
> > recent kernel.
> >
> > I'd give it a try.
> >
> > Helmut.
> >
>
> What's the latest on fsck tools for btrfs? Useful ones are still not
> available, right? The reason I am asking is that it's not an easy
> question to google, and my last attempt to use btrfs for serious work
> ended in tears when I couldn't rescue a corrupted file system.
>
> BillK

Sorry, replying to myself to clarify... I sent this while I was still
reading the backlog, before the statement that the tools are incomplete.
My question is more along the lines of: do they actually work? (Which
was answered as "I do not know" in the posted links, which are probably
old.)

Another point I just noticed is btrfs's inability to support swapfiles.
Also, in the past OO (OpenOffice) would not compile on a btrfs
(/tmp/portage) filesystem, as it did something that basically killed
everything - other packages were fine. Then there was a certain man page
I couldn't back up to a btrfs file system, and ~/.gvfs files that hung
the system when I tried to put them on btrfs. Hopefully these issues
have been fixed.

BillK

^ permalink raw reply	[flat|nested] 41+ messages in thread
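One check that was already usable for the "do they work?" question, even with the offline fsck incomplete, is the online scrub. It verifies checksums rather than the full tree structure, so it is not a substitute for a real fsck, but it does catch silent corruption — a sketch, with the mount point again being a placeholder:

  # Scrub reads all data and metadata and verifies it against the stored
  # checksums, reporting (and, on redundant profiles, repairing) bad blocks
  # while the filesystem stays mounted.
  btrfs scrub start -B /data/cache   # -B keeps it in the foreground and prints a summary
  btrfs scrub status /data/cache     # progress and error counts for a backgrounded run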
end of thread, other threads:[~2012-08-16 16:56 UTC | newest]

Thread overview: 41+ messages
2012-08-13 13:16 [gentoo-user] Fast file system for cache directory with lot's of files Michael Hampicke
2012-08-13 13:22 ` Nilesh Govindrajan
2012-08-13 13:54 ` Michael Hampicke
2012-08-13 14:19 ` Pandu Poluan
2012-08-13 14:42 ` Michael Hampicke
2012-08-13 14:52 ` Michael Mol
2012-08-13 15:26 ` Michael Hampicke
2012-08-13 15:52 ` Michael Mol
2012-08-13 17:14 ` Florian Philipp
2012-08-13 18:18 ` Michael Hampicke
2012-08-14 14:00 ` Florian Philipp
2012-08-14 17:42 ` Michael Hampicke
2012-08-13 14:40 ` Dale
2012-08-13 14:58 ` Michael Hampicke
2012-08-13 15:20 ` Nilesh Govindrajan
2012-08-13 14:38 ` Daniel Troeder
2012-08-13 14:53 ` Michael Hampicke
2012-08-14  8:21 ` Daniel Troeder
2012-08-14  9:46 ` Neil Bothwick
2012-08-14 13:00 ` Florian Philipp
2012-08-14 13:54 ` Daniel Troeder
2012-08-14 15:09 ` Florian Philipp
2012-08-14 15:33 ` Florian Philipp
2012-08-16 16:54 ` Neil Bothwick
2012-08-14 17:45 ` Michael Hampicke
2012-08-13 20:13 ` Paul Hartman
2012-08-13 20:41 ` Volker Armin Hemmann
2012-08-14  2:07 ` Adam Carter
2012-08-14 16:36 ` Helmut Jarausch
2012-08-14 17:05 ` Pandu Poluan
2012-08-14 17:21 ` Jason Weisberger
2012-08-14 17:42 ` Volker Armin Hemmann
2012-08-14 17:50 ` Michael Hampicke
2012-08-14 19:55 ` Alecks Gates
2012-08-14 20:17 ` Michael Mol
2012-08-14 20:57 ` Alecks Gates
2012-08-14 17:48 ` Michael Hampicke
2012-08-14 17:42 ` Volker Armin Hemmann
2012-08-14 19:39 ` Paul Hartman
2012-08-15  7:31 ` Bill Kenworthy
2012-08-15  8:13 ` Bill Kenworthy