* [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ [not found] <1d4ac47c28706094230cb2c4e6ee1c1c71629aa0.arfrever@gentoo> @ 2011-11-26 10:58 ` Fabian Groffen 2011-11-26 11:26 ` Nirbheek Chauhan 2011-11-27 22:28 ` Arfrever Frehtes Taifersar Arahesis 0 siblings, 2 replies; 20+ messages in thread From: Fabian Groffen @ 2011-11-26 10:58 UTC (permalink / raw To: gentoo-dev, Arfrever Frehtes Taifersar Arahesis [-- Attachment #1: Type: text/plain, Size: 1381 bytes --] Attempt 2 from correct email address, sorry for any duplicate in advance. On 26-11-2011 01:54:35 +0000, Arfrever Frehtes Taifersar Arahesis wrote: > commit: 1d4ac47c28706094230cb2c4e6ee1c1c71629aa0 > T> Org> > AuthorDate: Sat Nov 26 01:52:49 2011 +0000 > Commit: Arfrever Frehtes Taifersar Arahesis <arfrever <AT> gentoo <DOT> org> > CommitDate: Sat Nov 26 01:52:49 2011 +0000 > URL: http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1d4ac47c > > dblink.mergeme(): Merge files in alphabetic order. What's the advantage of this? I don't really like to pay for sorting a potentially huge list just for some eye-candy. (That's omitted by default these days anyway...) Any other opinions on this one? > --- > pym/portage/dbapi/vartree.py | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/pym/portage/dbapi/vartree.py b/pym/portage/dbapi/vartree.py > index dd74c10..099164a 100644 > --- a/pym/portage/dbapi/vartree.py > +++ b/pym/portage/dbapi/vartree.py > @@ -3981,7 +3981,7 @@ class dblink(object): > mergelist = stufftomerge > offset = "" > > - for i, x in enumerate(mergelist): > + for i, x in enumerate(sorted(mergelist)): > > mysrc = join(srcroot, offset, x) > mydest = join(destroot, offset, x) > -- Fabian Groffen Gentoo on a different level [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 195 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 10:58 ` [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ Fabian Groffen @ 2011-11-26 11:26 ` Nirbheek Chauhan 2011-11-26 11:37 ` "Paweł Hajdan, Jr." [not found] ` <20111126113830.GC37825@gentoo.org> 2011-11-27 22:28 ` Arfrever Frehtes Taifersar Arahesis 1 sibling, 2 replies; 20+ messages in thread From: Nirbheek Chauhan @ 2011-11-26 11:26 UTC (permalink / raw To: gentoo-dev, Arfrever Frehtes Taifersar Arahesis On Sat, Nov 26, 2011 at 4:28 PM, Fabian Groffen <grobian@gentoo.org> wrote: > On 26-11-2011 01:54:35 +0000, Arfrever Frehtes Taifersar Arahesis wrote: >> commit: 1d4ac47c28706094230cb2c4e6ee1c1c71629aa0 >> T> Org> >> AuthorDate: Sat Nov 26 01:52:49 2011 +0000 >> Commit: Arfrever Frehtes Taifersar Arahesis <arfrever <AT> gentoo <DOT> org> >> CommitDate: Sat Nov 26 01:52:49 2011 +0000 >> URL: http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1d4ac47c >> >> dblink.mergeme(): Merge files in alphabetic order. > > What's the advantage of this? I don't really like to pay for sorting a > potentially huge list just for some eye-candy. (That's omitted by > default these days anyway...) > Any other opinions on this one? > If it should be sorted[1], it should really be sorted in the reverse order of distfile-download size. That would be extremely useful on systems with slow internet connections. Too many times have I sat waiting for libreoffice-bin to download while a webkit-gtk recompile waits in the queue. We already have the information during dependency resolution with --verbose, and it costs very little. Besides, sorting even 30,000 entries (if you're merging every ebuild in portage) should not take more than a few secs. 1. I'm obviously assuming that dep nodes that do not depend on each other would be sorted -- ~Nirbheek Chauhan Gentoo GNOME+Mozilla Team ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 11:26 ` Nirbheek Chauhan @ 2011-11-26 11:37 ` "Paweł Hajdan, Jr." [not found] ` <20111126113830.GC37825@gentoo.org> 1 sibling, 0 replies; 20+ messages in thread From: "Paweł Hajdan, Jr." @ 2011-11-26 11:37 UTC (permalink / raw To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 480 bytes --] On 11/26/11 12:26 PM, Nirbheek Chauhan wrote: > If it should be sorted[1], it should really be sorted in the reverse > order of distfile-download size. That would be extremely useful on > systems with slow internet connections. [...] > > 1. I'm obviously assuming that dep nodes that do not depend on each > other would be sorted Seconded. I think practical reasons are more important than an arbitrary order, and I'd also benefit from this download-oriented order. [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <20111126113830.GC37825@gentoo.org>]
* [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ [not found] ` <20111126113830.GC37825@gentoo.org> @ 2011-11-26 12:50 ` Nirbheek Chauhan 2011-11-26 12:59 ` Ciaran McCreesh 2011-11-26 16:19 ` Mike Frysinger 0 siblings, 2 replies; 20+ messages in thread From: Nirbheek Chauhan @ 2011-11-26 12:50 UTC (permalink / raw) To: Fabian Groffen; +Cc: gentoo-portage-dev, Gentoo Dev

On Sat, Nov 26, 2011 at 5:08 PM, Fabian Groffen <grobian@gentoo.org> wrote:
> On 26-11-2011 16:56:41 +0530, Nirbheek Chauhan wrote:
>> [...] Besides, sorting even 30,000
>> entries (if you're merging every ebuild in portage) should not take
>> more than a few secs.
>
> A linux kernel has around that much of files, and I really wonder if
> it's worth waiting a couple of seconds (probably more on sparc and arm
> systems) just because then the files are in sorted order.
>

I'm not sure the two are really comparable. However, looking at a
simple string sort on 30,000 strings, I don't see it taking a
significant amount of time at all:

import random
import time

t1 = time.time()
# Build 30,000 distinct strings in random order.
a = list(range(100000, 130000))  # list() so that shuffle() also works on Python 3
random.shuffle(a)
b = [str(i) for i in a]
t2 = time.time()
b.sort()
t3 = time.time()
print(t2-t1)  # time to build the shuffled list
print(t3-t2)  # time to sort it
----
0.0682320594788
0.0464689731598

>> 1. I'm obviously assuming that dep nodes that do not depend on each
>> other would be sorted
>
> I think this is per package.
>

Actually, reading the code it seems that it's about the file merge
order of a single package. My participation in this entire discussion
is m00t. Never mind. :p

--
~Nirbheek Chauhan
Gentoo GNOME+Mozilla Team

^ permalink raw reply [flat|nested] 20+ messages in thread
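A variant of the timing above, as a rough sketch only: it sorts path-like strings rather than numeric strings and uses timeit to take the best of several runs. The path layout and the names make_paths and time_sort are made up for illustration; nothing here comes from Portage, and the numbers will differ per machine.

```python
import random
import timeit


def make_paths(n=30000, seed=0):
    """Build n hypothetical path-like strings in shuffled order."""
    rng = random.Random(seed)
    paths = ["usr/src/linux/dir%03d/file%05d.c" % (i % 500, i) for i in range(n)]
    rng.shuffle(paths)
    return paths


def time_sort(paths, repeat=5):
    """Return the best wall-clock time in seconds of sorted(paths)."""
    return min(timeit.repeat(lambda: sorted(paths), number=1, repeat=repeat))


if __name__ == "__main__":
    paths = make_paths()
    print("sorting %d paths took %.4f s" % (len(paths), time_sort(paths)))
```

This only measures the in-memory sort. It says nothing about whether merging files in that order is faster or slower on disk.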
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 12:50 ` Nirbheek Chauhan @ 2011-11-26 12:59 ` Ciaran McCreesh 2011-11-26 13:44 ` Rich Freeman 2011-11-26 16:19 ` Mike Frysinger 1 sibling, 1 reply; 20+ messages in thread From: Ciaran McCreesh @ 2011-11-26 12:59 UTC (permalink / raw To: gentoo-dev [-- Attachment #1: Type: text/plain, Size: 413 bytes --] On Sat, 26 Nov 2011 18:20:27 +0530 Nirbheek Chauhan <nirbheek@gentoo.org> wrote: > Actually, reading the code it seems that it's about the file merge > order of a single package. My participation in this entire discussion > is m00t. Never mind. :p ...in which case it's often an awful lot faster to sort by inode, not by filename. Try it when installing a kernel sources package. -- Ciaran McCreesh [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
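A minimal sketch of what sorting by inode could look like, assuming the merge list is a list of paths relative to some source root. The names inode_order, root and relpaths are hypothetical, and this is not how Portage's mergeme() is written. Whether inode order actually reduces seeks depends on the filesystem.

```python
import os


def inode_order(root, relpaths):
    """Return relpaths sorted by the inode number of each entry under root.

    Entries that cannot be stat'ed are placed last, in their original
    relative order (sorted() is stable).
    """
    def key(rel):
        try:
            return (0, os.lstat(os.path.join(root, rel)).st_ino)
        except OSError:
            return (1, 0)

    return sorted(relpaths, key=key)


if __name__ == "__main__":
    # Example: order the entries of the current directory by inode.
    for name in inode_order(".", os.listdir(".")):
        print(name)
```

The extra lstat() per file is not free, although a merge typically has to stat each file anyway.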
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 12:59 ` Ciaran McCreesh @ 2011-11-26 13:44 ` Rich Freeman 2011-11-26 15:09 ` Michał Górny 2011-11-26 15:50 ` Nirbheek Chauhan 0 siblings, 2 replies; 20+ messages in thread From: Rich Freeman @ 2011-11-26 13:44 UTC (permalink / raw To: gentoo-dev On Sat, Nov 26, 2011 at 7:59 AM, Ciaran McCreesh <ciaran.mccreesh@googlemail.com> wrote: > On Sat, 26 Nov 2011 18:20:27 +0530 > Nirbheek Chauhan <nirbheek@gentoo.org> wrote: >> Actually, reading the code it seems that it's about the file merge >> order of a single package. My participation in this entire discussion >> is m00t. Never mind. :p > > ...in which case it's often an awful lot faster to sort by inode, not by > filename. Try it when installing a kernel sources package. I can believe it. Btrfs added inode-order directory indexes precisely for this reason. I'd have to look up the details but I think it was designed to return the directories in this order to function calls so that anything that iterates through the tree would get this optimization by default. Of course, if you then resort the list first you lose that. (It also has the ext3 dir_index-style indexes for named file lookups.) Oh, on the topic of btrfs, if any emerge operations do file copies, adding --reflink=auto to the cp command will GREATLY improve performance. That does a copy-on-write copy - it behaves like a hard-link as far as time to create goes, but it behaves like a full copy as far as modifications not being shared goes. It also uses almost no additional disk space until the content starts to diverge between the copies. Setting reflink=auto should be safe on non-COW filesystems as it will fall back to a normal copy if the operation isn't supported. It is available in stable coreutils. 
Some speculate that this option could increase fragmentation (both copies will share extents from the original file, and have some extents of their own), but btrfs doesn't overwrite anything in-place so fragmentation is a potential issue with any file modification (change one byte in the middle of a file and you get a new record somewhere with one byte in it and a bunch of pointers in the metadata saying "stick this byte here" - though for one byte I'm guessing it would end up in the metadata tree much as ext3 stores small files in their inodes so the one byte would be in ram when the pointer to it is loaded). Rich ^ permalink raw reply [flat|nested] 20+ messages in thread
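For code that copies files from Python instead of calling cp, the "try a reflink, fall back to a normal copy" behaviour of --reflink=auto can be approximated roughly as below. This is a sketch under assumptions: it is Linux-only, the FICLONE value is the one used on x86 and ARM (some architectures encode ioctl numbers differently), and copy_reflink_auto is a made-up name. It copies file data only, not ownership or permissions.

```python
import fcntl
import os
import shutil

# FICLONE from <linux/fs.h>, i.e. _IOW(0x94, 9, int).
# Assumption: value as encoded on x86/ARM Linux.
FICLONE = 0x40049409


def copy_reflink_auto(src, dst):
    """Copy src to dst, sharing extents when the filesystem allows it.

    Returns True if a reflink clone was made, False if a plain copy was used.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            # Clone not supported here (for example ext4, tmpfs, or
            # src and dst on different filesystems): fall back below.
            pass
    shutil.copyfile(src, dst)
    return False


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "src")
        with open(src, "wb") as f:
            f.write(b"example data\n")
        cloned = copy_reflink_auto(src, os.path.join(d, "dst"))
        print("reflinked" if cloned else "plain copy")
```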
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 13:44 ` Rich Freeman @ 2011-11-26 15:09 ` Michał Górny 2011-11-26 15:25 ` Rich Freeman 2011-11-26 15:58 ` Nirbheek Chauhan 2011-11-26 15:50 ` Nirbheek Chauhan 1 sibling, 2 replies; 20+ messages in thread From: Michał Górny @ 2011-11-26 15:09 UTC (permalink / raw To: gentoo-dev; +Cc: rich0 [-- Attachment #1: Type: text/plain, Size: 857 bytes --] On Sat, 26 Nov 2011 08:44:28 -0500 Rich Freeman <rich0@gentoo.org> wrote: > Oh, on the topic of btrfs, if any emerge operations do file copies, > adding --reflink=auto to the cp command will GREATLY improve > performance. That does a copy-on-write copy - it behaves like a > hard-link as far as time to create goes, but it behaves like a full > copy as far as modifications not being shared goes. [...] We don't rely on external tools to do the copying. AFAIR it uses Python's shutil module which is rather poor. I'm slowly working on creating atomic-install tool for merging this more optimally [1]. But in this particular case, I don't think COW is particularly useful. If it works only on filesystem bounds, we could move the file directly anyway. [1]:https://github.com/mgorny/atomic-install -- Best regards, Michał Górny [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 15:09 ` Michał Górny @ 2011-11-26 15:25 ` Rich Freeman 2011-11-26 16:00 ` Michał Górny 2011-11-26 15:58 ` Nirbheek Chauhan 1 sibling, 1 reply; 20+ messages in thread From: Rich Freeman @ 2011-11-26 15:25 UTC (permalink / raw To: Michał Górny; +Cc: gentoo-dev On Sat, Nov 26, 2011 at 10:09 AM, Michał Górny <mgorny@gentoo.org> wrote: > But in this particular case, I don't think COW is particularly useful. > If it works only on filesystem bounds, we could move the file directly > anyway. Yup - I would only use it if you really are doing a copy and not a move (neglecting the fact that the implementation of a cross-filesystem move does a copy first). I imagine many ebuilds do copy operations internally, but probably not to an extent where it would make much difference. I'm not sure how doins/dobin/etc are implemented - I think they're copies and so allowing for the fact that not everybody uses a tmpfs it might make sense to fix those. Rich ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 15:25 ` Rich Freeman @ 2011-11-26 16:00 ` Michał Górny 0 siblings, 0 replies; 20+ messages in thread From: Michał Górny @ 2011-11-26 16:00 UTC (permalink / raw To: gentoo-dev; +Cc: rich0 [-- Attachment #1: Type: text/plain, Size: 981 bytes --] On Sat, 26 Nov 2011 10:25:15 -0500 Rich Freeman <rich0@gentoo.org> wrote: > On Sat, Nov 26, 2011 at 10:09 AM, Michał Górny <mgorny@gentoo.org> > wrote: > > But in this particular case, I don't think COW is particularly > > useful. If it works only on filesystem bounds, we could move the > > file directly anyway. > > Yup - I would only use it if you really are doing a copy and not a > move (neglecting the fact that the implementation of a > cross-filesystem move does a copy first). I imagine many ebuilds do > copy operations internally, but probably not to an extent where it > would make much difference. I'm not sure how doins/dobin/etc are > implemented - I think they're copies and so allowing for the fact that > not everybody uses a tmpfs it might make sense to fix those. AFAICS doins uses 'install' mostly, and sometimes 'cp' (with symlinks). I don't see any variant of '--reflink' option for 'install'. -- Best regards, Michał Górny [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 15:09 ` Michał Górny 2011-11-26 15:25 ` Rich Freeman @ 2011-11-26 15:58 ` Nirbheek Chauhan 2011-11-26 16:08 ` Michał Górny 1 sibling, 1 reply; 20+ messages in thread From: Nirbheek Chauhan @ 2011-11-26 15:58 UTC (permalink / raw) To: gentoo-dev; +Cc: rich0 On Sat, Nov 26, 2011 at 8:39 PM, Michał Górny <mgorny@gentoo.org> wrote: > But in this particular case, I don't think COW is particularly useful. > If it works only on filesystem bounds, we could move the file directly > anyway. > There are still a few specific cases in which CoW would indeed be useful. IIRC, reflinking of files works across btrfs *subvolumes*, and such a copy would normally be detected as a cross-device move. Another use would be a patch-merge which makes use of *ranged reflinks* to only CoW copy those parts of the file that were changed[1]. rsync has support for this, but only while appending to files (--append-verify --no-whole-file). 1. Somewhat like rope data structures, with the caveat that ranges must be block-size aligned. -- ~Nirbheek Chauhan Gentoo GNOME+Mozilla Team ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 15:58 ` Nirbheek Chauhan @ 2011-11-26 16:08 ` Michał Górny 2011-11-26 16:31 ` Nirbheek Chauhan 0 siblings, 1 reply; 20+ messages in thread From: Michał Górny @ 2011-11-26 16:08 UTC (permalink / raw To: gentoo-dev; +Cc: nirbheek, rich0 [-- Attachment #1: Type: text/plain, Size: 1022 bytes --] On Sat, 26 Nov 2011 21:28:51 +0530 Nirbheek Chauhan <nirbheek@gentoo.org> wrote: > On Sat, Nov 26, 2011 at 8:39 PM, Michał Górny <mgorny@gentoo.org> > wrote: > > But in this particular case, I don't think COW is particularly > > useful. If it works only on filesystem bounds, we could move the > > file directly anyway. > > > > There are still a few specific cases in which CoW would indeed be > useful. IIRC, reflinking of files works across btrfs *subvolumes*, and > such a copy would normally be detected as a cross-device move. For such a thing, shouldn't rename() work neat anyway? > Another use would be an patch-merge which makes use of *ranged > reflinks* to only CoW copy those parts of the file that were > changed[1]. rsync has support for this, but only while appending to > files (--append-verify --no-whole-file). So, it'd be like: 1) CoW-dup old file, 2) patch-merge into the duped old file, 3) replace. Am I understanding correctly? -- Best regards, Michał Górny [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 16:08 ` Michał Górny @ 2011-11-26 16:31 ` Nirbheek Chauhan 0 siblings, 0 replies; 20+ messages in thread From: Nirbheek Chauhan @ 2011-11-26 16:31 UTC (permalink / raw To: Michał Górny; +Cc: gentoo-dev, rich0 On Sat, Nov 26, 2011 at 9:38 PM, Michał Górny <mgorny@gentoo.org> wrote: > On Sat, 26 Nov 2011 21:28:51 +0530 > Nirbheek Chauhan <nirbheek@gentoo.org> wrote: >> There are still a few specific cases in which CoW would indeed be >> useful. IIRC, reflinking of files works across btrfs *subvolumes*, and >> such a copy would normally be detected as a cross-device move. > > For such a thing, shouldn't rename() work neat anyway? > No, because reflink is an ioctl that works directly on the FS level by sharing data blocks, and should theoretically not bother about the file hierarchy. On the other hand, rename() is a userland API and must behave itself. >> Another use would be an patch-merge which makes use of *ranged >> reflinks* to only CoW copy those parts of the file that were >> changed[1]. rsync has support for this, but only while appending to >> files (--append-verify --no-whole-file). > > So, it'd be like: > 1) CoW-dup old file, > 2) patch-merge into the duped old file, > 3) replace. > > Am I understanding correctly? > You can do that, or perhaps you can just do the patch-merge in-place. Not sure about the crash guarantees in the latter case. The former (rename) is documented here: https://btrfs.wiki.kernel.org/articles/f/a/q/FAQ_1fe9.html#What_are_the_crash_guarantees_of_overwrite-by-rename.3F But in all this, the hard part is really the "patch-merge" for anything except appends. ;) -- ~Nirbheek Chauhan Gentoo GNOME+Mozilla Team ^ permalink raw reply [flat|nested] 20+ messages in thread
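The "write a new copy, then replace" variant (overwrite-by-rename) can be sketched in Python roughly as follows. This is a generic illustration, not Portage code: replace_atomically is a hypothetical name, and the crash guarantees you actually get depend on the filesystem and its mount options.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Write data (bytes) to a temp file next to path, then rename it over path.

    The temp file is created in the same directory so that the final
    rename stays within one filesystem.
    """
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the new contents to disk first
        os.replace(tmp, path)     # rename over the old file
    except BaseException:
        # Best-effort cleanup of the temp file, then re-raise.
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise


if __name__ == "__main__":
    replace_atomically("example.txt", b"new contents\n")
```

A reader of path sees either the old contents or the new ones, never a half-written file. The in-place patch-merge alternative does not have that property.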
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 13:44 ` Rich Freeman 2011-11-26 15:09 ` Michał Górny @ 2011-11-26 15:50 ` Nirbheek Chauhan 1 sibling, 0 replies; 20+ messages in thread From: Nirbheek Chauhan @ 2011-11-26 15:50 UTC (permalink / raw To: gentoo-dev On Sat, Nov 26, 2011 at 7:14 PM, Rich Freeman <rich0@gentoo.org> wrote: > isn't supported. It is available in stable coreutils. Some speculate > that this option could increase fragmentation (both copies will share > extents from the original file, and have some extents of their own), > but btrfs doesn't overwrite anything in-place so fragmentation is a > potential issue with any file modification (change one byte in the Adding to your comments on this: To mitigate such issues, newer versions of the btrfs fs driver have automatic online defragmentation as well. Works quite well for moderate fragmentation. A particularly ghastly example where fragmentation issues become pathological in nature are files that are fsync()ed very frequently. A typical example are the *.sqlite files in ~/.mozilla which easily get hundreds or even thousands of fragments after a few hours worth of firefox usage (can be verified with filefrag). To fix such things, regular online defragmentation of those specific files can be done using `btrfs fi defrag <file>`. -- ~Nirbheek Chauhan Gentoo GNOME+Mozilla Team ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 12:50 ` Nirbheek Chauhan 2011-11-26 12:59 ` Ciaran McCreesh @ 2011-11-26 16:19 ` Mike Frysinger 2011-11-26 16:34 ` Nirbheek Chauhan 2011-11-27 21:44 ` [gentoo-portage-dev] " Zac Medico 1 sibling, 2 replies; 20+ messages in thread From: Mike Frysinger @ 2011-11-26 16:19 UTC (permalink / raw To: gentoo-dev; +Cc: Nirbheek Chauhan, Fabian Groffen, gentoo-portage-dev [-- Attachment #1: Type: Text/Plain, Size: 963 bytes --] On Saturday 26 November 2011 07:50:27 Nirbheek Chauhan wrote: > On Sat, Nov 26, 2011 at 5:08 PM, Fabian Groffen wrote: > > On 26-11-2011 16:56:41 +0530, Nirbheek Chauhan wrote: > >> [...] Besides, sorting even 30,000 > >> entries (if you're merging every ebuild in portage) should not take > >> more than a few secs. > > > > A linux kernel has around that much of files, and I really wonder if > > it's worth waiting a couple of seconds (probably more on sparc and arm > > systems) just because then the files are in sorted order. > > I'm not sure the two are really comparable. However, looking at a > simple string sort on 30,000 strings, I don't see it taking a > significant amount of time at all: sure, it's probably not significantly higher, but i also can't see any point in sorting the entries. we've been doing fine so far in the 10+ years of it being unsorted. so unless Arfrever has a compelling reason, time to revert. -mike [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 16:19 ` Mike Frysinger @ 2011-11-26 16:34 ` Nirbheek Chauhan 2011-11-27 21:44 ` [gentoo-portage-dev] " Zac Medico 1 sibling, 0 replies; 20+ messages in thread From: Nirbheek Chauhan @ 2011-11-26 16:34 UTC (permalink / raw To: Mike Frysinger; +Cc: gentoo-dev, Fabian Groffen, gentoo-portage-dev On Sat, Nov 26, 2011 at 9:49 PM, Mike Frysinger <vapier@gentoo.org> wrote: > On Saturday 26 November 2011 07:50:27 Nirbheek Chauhan wrote: >> I'm not sure the two are really comparable. However, looking at a >> simple string sort on 30,000 strings, I don't see it taking a >> significant amount of time at all: > > sure, it's probably not significantly higher, but i also can't see any point in > sorting the entries. we've been doing fine so far in the 10+ years of it being > unsorted. so unless Arfrever has a compelling reason, time to revert. > -mike > I agree. My argument was that if sorting has some benefit, the cost is negligible, and it should be done properly. If the benefits of the sorting are non-existent, or if it causes problems as Ciaran pointed out, then sorting should not be done. -- ~Nirbheek Chauhan Gentoo GNOME+Mozilla Team ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-portage-dev] Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 16:19 ` Mike Frysinger 2011-11-26 16:34 ` Nirbheek Chauhan @ 2011-11-27 21:44 ` Zac Medico 1 sibling, 0 replies; 20+ messages in thread From: Zac Medico @ 2011-11-27 21:44 UTC (permalink / raw To: gentoo-portage-dev Cc: Mike Frysinger, gentoo-dev, Nirbheek Chauhan, Fabian Groffen On 11/26/2011 08:19 AM, Mike Frysinger wrote: > On Saturday 26 November 2011 07:50:27 Nirbheek Chauhan wrote: >> On Sat, Nov 26, 2011 at 5:08 PM, Fabian Groffen wrote: >>> On 26-11-2011 16:56:41 +0530, Nirbheek Chauhan wrote: >>>> [...] Besides, sorting even 30,000 >>>> entries (if you're merging every ebuild in portage) should not take >>>> more than a few secs. >>> >>> A linux kernel has around that much of files, and I really wonder if >>> it's worth waiting a couple of seconds (probably more on sparc and arm >>> systems) just because then the files are in sorted order. >> >> I'm not sure the two are really comparable. However, looking at a >> simple string sort on 30,000 strings, I don't see it taking a >> significant amount of time at all: > > sure, it's probably not significantly higher, but i also can't see any point in > sorting the entries. we've been doing fine so far in the 10+ years of it being > unsorted. so unless Arfrever has a compelling reason, time to revert. > -mike Okay, reverted: http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=7c5b170d47ab054bc3f8a7778dd3f8139c1239c6 -- Thanks, Zac ^ permalink raw reply [flat|nested] 20+ messages in thread
* [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-26 10:58 ` [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ Fabian Groffen 2011-11-26 11:26 ` Nirbheek Chauhan @ 2011-11-27 22:28 ` Arfrever Frehtes Taifersar Arahesis 2011-11-28 8:06 ` Michał Górny 2011-11-30 20:31 ` Mike Frysinger 1 sibling, 2 replies; 20+ messages in thread From: Arfrever Frehtes Taifersar Arahesis @ 2011-11-27 22:28 UTC (permalink / raw) To: Gentoo Development [-- Attachment #1: Type: Text/Plain, Size: 897 bytes --] 2011-11-26 11:58:22 Fabian Groffen wrote: > On 26-11-2011 01:54:35 +0000, Arfrever Frehtes Taifersar Arahesis wrote: > > commit: 1d4ac47c28706094230cb2c4e6ee1c1c71629aa0 > > T> Org> > > AuthorDate: Sat Nov 26 01:52:49 2011 +0000 > > Commit: Arfrever Frehtes Taifersar Arahesis <arfrever <AT> gentoo <DOT> org> > > CommitDate: Sat Nov 26 01:52:49 2011 +0000 > > URL: http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1d4ac47c > > > > dblink.mergeme(): Merge files in alphabetic order. > > What's the advantage of this? The advantage is that it makes it easier to review the output of `emerge` and verify that the correct files have been installed. > I don't really like to pay for sorting a potentially huge list just for some eye-candy. Sorting the file list of the Linux sources takes less than 0.05 s. -- Arfrever Frehtes Taifersar Arahesis [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 836 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-27 22:28 ` Arfrever Frehtes Taifersar Arahesis @ 2011-11-28 8:06 ` Michał Górny 2011-11-28 12:57 ` Rich Freeman 2011-11-30 20:31 ` Mike Frysinger 1 sibling, 1 reply; 20+ messages in thread From: Michał Górny @ 2011-11-28 8:06 UTC (permalink / raw To: gentoo-dev; +Cc: arfrever [-- Attachment #1: Type: text/plain, Size: 1116 bytes --] On Sun, 27 Nov 2011 23:28:12 +0100 Arfrever Frehtes Taifersar Arahesis <arfrever@gentoo.org> wrote: > 2011-11-26 11:58:22 Fabian Groffen napisał(a): > > On 26-11-2011 01:54:35 +0000, Arfrever Frehtes Taifersar Arahesis > > wrote: > > > commit: 1d4ac47c28706094230cb2c4e6ee1c1c71629aa0 > > > T> Org> > > > AuthorDate: Sat Nov 26 01:52:49 2011 +0000 > > > Commit: Arfrever Frehtes Taifersar Arahesis <arfrever <AT> > > > gentoo <DOT> org> CommitDate: Sat Nov 26 01:52:49 2011 +0000 > > > URL: > > > http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1d4ac47c > > > > > > dblink.mergeme(): Merge files in alphabetic order. > > > > What's the advantage of this? > > The advantage is that this allows to easier review output of `emerge` > to verify if correct files have been installed. > > > I don't really like to pay for sorting a potentially huge list just > > for some eye-candy. > > Time of sorting in case of Linux sources is less than 0.05 s. Have you considered time overhead of moving files in unnatural order? -- Best regards, Michał Górny [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-28 8:06 ` Michał Górny @ 2011-11-28 12:57 ` Rich Freeman 0 siblings, 0 replies; 20+ messages in thread From: Rich Freeman @ 2011-11-28 12:57 UTC (permalink / raw) To: gentoo-dev; +Cc: arfrever On Mon, Nov 28, 2011 at 3:06 AM, Michał Górny <mgorny@gentoo.org> wrote: > Have you considered time overhead of moving files in unnatural order? Rather than re-discuss this point it would probably be better for everybody to just read through the entire thread again, particularly Ciaran's post and its follow-ups. My understanding is that the patch has already been reverted. Now, if somebody has a suggestion for how to sort the files in such a way as to improve the performance (assuming that it doesn't already happen in inode order) that would certainly take things forward. Sorting lists in RAM is cheap, disk seeks are expensive. Of course, if you're using tmpfs it is either all in RAM or in swap in the first place, and I'm not sure if swap brings in additional considerations here. Rich ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ 2011-11-27 22:28 ` Arfrever Frehtes Taifersar Arahesis 2011-11-28 8:06 ` Michał Górny @ 2011-11-30 20:31 ` Mike Frysinger 1 sibling, 0 replies; 20+ messages in thread From: Mike Frysinger @ 2011-11-30 20:31 UTC (permalink / raw) To: gentoo-dev; +Cc: Arfrever Frehtes Taifersar Arahesis [-- Attachment #1: Type: Text/Plain, Size: 1299 bytes --]

On Sunday 27 November 2011 17:28:12 Arfrever Frehtes Taifersar Arahesis wrote:
> 2011-11-26 11:58:22 Fabian Groffen wrote:
> > On 26-11-2011 01:54:35 +0000, Arfrever Frehtes Taifersar Arahesis wrote:
> > > commit: 1d4ac47c28706094230cb2c4e6ee1c1c71629aa0
> > > T> Org>
> > > AuthorDate: Sat Nov 26 01:52:49 2011 +0000
> > > Commit: Arfrever Frehtes Taifersar Arahesis <arfrever <AT> gentoo <DOT> org>
> > > CommitDate: Sat Nov 26 01:52:49 2011 +0000
> > > URL: http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1d4ac47c
> > >
> > > dblink.mergeme(): Merge files in alphabetic order.
> >
> > What's the advantage of this?
>
> The advantage is that this allows to easier review output of `emerge` to
> verify if correct files have been installed.

then you should be sorting the output and not the db

along these lines, if you're going to be making commits to important repos,
please use changelogs that are actually useful. the changelog here:
    dblink.mergeme(): Merge files in alphabetic order.
is utterly and completely useless. you need to document *why* you're changing
things first and foremost, and only then can you delve into details as to why
the implementation you're committing makes sense.
-mike

[-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply [flat|nested] 20+ messages in thread
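One way to read "sort the output, not the merge order" is sketched below: merge in whatever order the list arrives, collect the display lines, and sort only those. Everything here is illustrative. merge_and_report and merge_one are hypothetical names, the ">>> " prefix merely imitates emerge-style output, and the real mergeme() is structured differently.

```python
def merge_and_report(mergelist, merge_one):
    """Merge entries in the order given, then return display lines sorted by path.

    merge_one is a callable that performs the actual merge of a single path.
    """
    report = []
    for path in mergelist:
        merge_one(path)  # the on-disk order is left untouched
        report.append(">>> " + path)
    return sorted(report)


if __name__ == "__main__":
    merged = []
    for line in merge_and_report(["usr/bin/b", "etc/a", "usr/lib/c"], merged.append):
        print(line)
```

The trade-off is that the lines can only be printed once the merge has finished, not while it is running.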