public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
       [not found] <1d4ac47c28706094230cb2c4e6ee1c1c71629aa0.arfrever@gentoo>
@ 2011-11-26 10:58 ` Fabian Groffen
  2011-11-26 11:26   ` Nirbheek Chauhan
  2011-11-27 22:28   ` Arfrever Frehtes Taifersar Arahesis
  0 siblings, 2 replies; 20+ messages in thread
From: Fabian Groffen @ 2011-11-26 10:58 UTC
  To: gentoo-dev, Arfrever Frehtes Taifersar Arahesis

[-- Attachment #1: Type: text/plain, Size: 1381 bytes --]

Second attempt, this time from the correct email address; apologies in
advance for any duplicates.

On 26-11-2011 01:54:35 +0000, Arfrever Frehtes Taifersar Arahesis wrote:
> commit:     1d4ac47c28706094230cb2c4e6ee1c1c71629aa0
> Author:     Arfrever Frehtes Taifersar Arahesis <Arfrever <AT> Gentoo <DOT> Org>
> AuthorDate: Sat Nov 26 01:52:49 2011 +0000
> Commit:     Arfrever Frehtes Taifersar Arahesis <arfrever <AT> gentoo <DOT> org>
> CommitDate: Sat Nov 26 01:52:49 2011 +0000
> URL:        http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1d4ac47c
> 
> dblink.mergeme(): Merge files in alphabetic order.

What's the advantage of this?  I don't really like to pay for sorting a
potentially huge list just for some eye-candy.  (That's omitted by
default these days anyway...)
Any other opinions on this one?

> ---
>  pym/portage/dbapi/vartree.py |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/pym/portage/dbapi/vartree.py b/pym/portage/dbapi/vartree.py
> index dd74c10..099164a 100644
> --- a/pym/portage/dbapi/vartree.py
> +++ b/pym/portage/dbapi/vartree.py
> @@ -3981,7 +3981,7 @@ class dblink(object):
>  			mergelist = stufftomerge
>  			offset = ""
>  
> -		for i, x in enumerate(mergelist):
> +		for i, x in enumerate(sorted(mergelist)):
>  
>  			mysrc = join(srcroot, offset, x)
>  			mydest = join(destroot, offset, x)
> 

-- 
Fabian Groffen
Gentoo on a different level

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]


* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 10:58 ` [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ Fabian Groffen
@ 2011-11-26 11:26   ` Nirbheek Chauhan
  2011-11-26 11:37     ` "Paweł Hajdan, Jr."
       [not found]     ` <20111126113830.GC37825@gentoo.org>
  2011-11-27 22:28   ` Arfrever Frehtes Taifersar Arahesis
  1 sibling, 2 replies; 20+ messages in thread
From: Nirbheek Chauhan @ 2011-11-26 11:26 UTC
  To: gentoo-dev, Arfrever Frehtes Taifersar Arahesis

On Sat, Nov 26, 2011 at 4:28 PM, Fabian Groffen <grobian@gentoo.org> wrote:
> On 26-11-2011 01:54:35 +0000, Arfrever Frehtes Taifersar Arahesis wrote:
>> commit:     1d4ac47c28706094230cb2c4e6ee1c1c71629aa0
>> Author:     Arfrever Frehtes Taifersar Arahesis <Arfrever <AT> Gentoo <DOT> Org>
>> AuthorDate: Sat Nov 26 01:52:49 2011 +0000
>> Commit:     Arfrever Frehtes Taifersar Arahesis <arfrever <AT> gentoo <DOT> org>
>> CommitDate: Sat Nov 26 01:52:49 2011 +0000
>> URL:        http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1d4ac47c
>>
>> dblink.mergeme(): Merge files in alphabetic order.
>
> What's the advantage of this?  I don't really like to pay for sorting a
> potentially huge list just for some eye-candy.  (That's omitted by
> default these days anyway...)
> Any other opinions on this one?
>

If it should be sorted[1], it should really be sorted in the reverse
order of distfile-download size. That would be extremely useful on
systems with slow internet connections. Too many times have I sat
waiting for libreoffice-bin to download while a webkit-gtk recompile
waits in the queue.

We already have the information during dependency resolution with
--verbose, and it costs very little. Besides, sorting even 30,000
entries (if you're merging every ebuild in portage) should not take
more than a few secs.

1. I'm obviously assuming that dep nodes that do not depend on each
other would be sorted

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team




* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 11:26   ` Nirbheek Chauhan
@ 2011-11-26 11:37     ` "Paweł Hajdan, Jr."
       [not found]     ` <20111126113830.GC37825@gentoo.org>
  1 sibling, 0 replies; 20+ messages in thread
From: "Paweł Hajdan, Jr." @ 2011-11-26 11:37 UTC
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 480 bytes --]

On 11/26/11 12:26 PM, Nirbheek Chauhan wrote:
> If it should be sorted[1], it should really be sorted in the reverse
> order of distfile-download size. That would be extremely useful on
> systems with slow internet connections. [...]
> 
> 1. I'm obviously assuming that dep nodes that do not depend on each
> other would be sorted

Seconded. I think practical reasons are more important than an arbitrary
order, and I'd also benefit from this download-oriented order.


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 203 bytes --]


* [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
       [not found]     ` <20111126113830.GC37825@gentoo.org>
@ 2011-11-26 12:50       ` Nirbheek Chauhan
  2011-11-26 12:59         ` Ciaran McCreesh
  2011-11-26 16:19         ` Mike Frysinger
  0 siblings, 2 replies; 20+ messages in thread
From: Nirbheek Chauhan @ 2011-11-26 12:50 UTC
  To: Fabian Groffen; +Cc: gentoo-portage-dev, Gentoo Dev

On Sat, Nov 26, 2011 at 5:08 PM, Fabian Groffen <grobian@gentoo.org> wrote:
> On 26-11-2011 16:56:41 +0530, Nirbheek Chauhan wrote:
>> [...] Besides, sorting even 30,000
>> entries (if you're merging every ebuild in portage) should not take
>> more than a few secs.
>
> A linux kernel has around that much of files, and I really wonder if
> it's worth waiting a couple of seconds (probably more on sparc and arm
> systems) just because then the files are in sorted order.
>

I'm not sure the two are really comparable. However, looking at a
simple string sort on 30,000 strings, I don't see it taking a
significant amount of time at all:

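import random
import time

t1 = time.time()
# Build 30,000 distinct integers and shuffle them; list() is needed on
# Python 3, where range() is not a mutable sequence.
a = list(range(100000, 130000))
random.shuffle(a)
b = [str(i) for i in a]
t2 = time.time()
b.sort()  # the string sort being measured
t3 = time.time()
print(t2 - t1)  # setup time in seconds
print(t3 - t2)  # sort time in seconds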

----
0.0682320594788
0.0464689731598


>> 1. I'm obviously assuming that dep nodes that do not depend on each
>> other would be sorted
>
> I think this is per package.
>

Actually, reading the code it seems that it's about the file merge
order of a single package. My participation in this entire discussion
is m00t. Never mind. :p

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team




* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 12:50       ` Nirbheek Chauhan
@ 2011-11-26 12:59         ` Ciaran McCreesh
  2011-11-26 13:44           ` Rich Freeman
  2011-11-26 16:19         ` Mike Frysinger
  1 sibling, 1 reply; 20+ messages in thread
From: Ciaran McCreesh @ 2011-11-26 12:59 UTC
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 413 bytes --]

On Sat, 26 Nov 2011 18:20:27 +0530
Nirbheek Chauhan <nirbheek@gentoo.org> wrote:
> Actually, reading the code it seems that it's about the file merge
> order of a single package. My participation in this entire discussion
> is m00t. Never mind. :p

...in which case it's often an awful lot faster to sort by inode, not by
filename. Try it when installing a kernel sources package.
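
A minimal sketch of what ordering a file list by inode could look like
in Python. The function name `sort_by_inode` and its arguments are made
up for illustration; this is not code from Portage, and whether it
actually helps depends on the filesystem.

```python
import os


def sort_by_inode(root, relpaths):
    """Return relpaths ordered by the inode number of each file under root."""
    def inode(rel):
        # lstat() so that a symlink is ordered by its own inode,
        # not by the inode of whatever it points to.
        return os.lstat(os.path.join(root, rel)).st_ino

    return sorted(relpaths, key=inode)
```

Note that this costs one lstat() per file before any copying starts, so
it only pays off if the later reads really are seek-bound.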

-- 
Ciaran McCreesh

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]


* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 12:59         ` Ciaran McCreesh
@ 2011-11-26 13:44           ` Rich Freeman
  2011-11-26 15:09             ` Michał Górny
  2011-11-26 15:50             ` Nirbheek Chauhan
  0 siblings, 2 replies; 20+ messages in thread
From: Rich Freeman @ 2011-11-26 13:44 UTC
  To: gentoo-dev

On Sat, Nov 26, 2011 at 7:59 AM, Ciaran McCreesh
<ciaran.mccreesh@googlemail.com> wrote:
> On Sat, 26 Nov 2011 18:20:27 +0530
> Nirbheek Chauhan <nirbheek@gentoo.org> wrote:
>> Actually, reading the code it seems that it's about the file merge
>> order of a single package. My participation in this entire discussion
>> is m00t. Never mind. :p
>
> ...in which case it's often an awful lot faster to sort by inode, not by
> filename. Try it when installing a kernel sources package.

I can believe it.  Btrfs added inode-order directory indexes precisely
for this reason.  I'd have to look up the details, but I think it was
designed to return directory entries in this order to function calls so
that anything that iterates through the tree would get this
optimization by default.  Of course, if you then re-sort the list first
you lose that.  (It also has the ext3 dir_index-style indexes for
named file lookups.)

Oh, on the topic of btrfs, if any emerge operations do file copies,
adding --reflink=auto to the cp command will GREATLY improve
performance.  That does a copy-on-write copy: it behaves like a
hard link as far as time to create goes, but like a full copy as far
as modifications not being shared goes.  It also uses almost no
additional disk space until the content starts to diverge between the
copies.  Using --reflink=auto should be safe on non-COW filesystems, as
it falls back to a normal copy if the operation isn't supported.  It
is available in stable coreutils.

Some speculate that this option could increase fragmentation (both
copies will share extents from the original file, and have some
extents of their own), but btrfs doesn't overwrite anything in place,
so fragmentation is a potential issue with any file modification.
Change one byte in the middle of a file and you get a new record
somewhere with one byte in it and a bunch of pointers in the metadata
saying "stick this byte here".  For one byte, though, I'm guessing it
would end up in the metadata tree, much as ext3 stores small files in
their inodes, so the one byte would be in RAM when the pointer to it
is loaded.
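
For code that copies files itself rather than calling cp, roughly the
same behaviour can be sketched in Python. This is only an illustration
under a few assumptions: it is Linux-only, `reflink_or_copy` is a
made-up name, and the FICLONE request number is taken from
<linux/fs.h> (the same value btrfs exposes as BTRFS_IOC_CLONE).

```python
import fcntl
import shutil

# FICLONE from <linux/fs.h>: _IOW(0x94, 9, int).
FICLONE = 0x40049409


def reflink_or_copy(src, dst):
    """Clone src to dst by sharing extents if possible, else copy the data.

    Returns True when the reflink succeeded and False when a plain copy
    was used instead.
    """
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        try:
            # Ask the filesystem to make dst share src's data blocks.
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
            return True
        except OSError:
            # Not supported here (non-COW filesystem, cross-device, ...).
            pass
    # Fall back to an ordinary byte-for-byte copy.
    shutil.copyfile(src, dst)
    return False
```

Like --reflink=auto, the fallback keeps the call safe on filesystems
that cannot share extents; unlike cp, this sketch does not copy
permissions or timestamps.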

Rich




* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 13:44           ` Rich Freeman
@ 2011-11-26 15:09             ` Michał Górny
  2011-11-26 15:25               ` Rich Freeman
  2011-11-26 15:58               ` Nirbheek Chauhan
  2011-11-26 15:50             ` Nirbheek Chauhan
  1 sibling, 2 replies; 20+ messages in thread
From: Michał Górny @ 2011-11-26 15:09 UTC
  To: gentoo-dev; +Cc: rich0

[-- Attachment #1: Type: text/plain, Size: 857 bytes --]

On Sat, 26 Nov 2011 08:44:28 -0500
Rich Freeman <rich0@gentoo.org> wrote:

> Oh, on the topic of btrfs, if any emerge operations do file copies,
> adding --reflink=auto to the cp command will GREATLY improve
> performance.  That does a copy-on-write copy - it behaves like a
> hard-link as far as time to create goes, but it behaves like a full
> copy as far as modifications not being shared goes. [...]

We don't rely on external tools to do the copying.  AFAIR Portage uses
Python's shutil module, which is rather poor.  I'm slowly working on
an atomic-install tool to do this merging more optimally [1].

But in this particular case, I don't think COW is particularly useful.
If it works only within filesystem boundaries, we could move the file
directly anyway.

[1]: https://github.com/mgorny/atomic-install
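
To illustrate the "move the file directly" case, here is a minimal
sketch of a move that uses rename() when source and destination are on
the same filesystem and only copies when the kernel reports a
cross-device move. The name `move_file` and the temporary suffix are
made up for illustration; this is neither Portage's nor
atomic-install's code.

```python
import errno
import os
import shutil


def move_file(src, dst):
    """Move src to dst, copying only when rename() cannot be used."""
    try:
        # Same filesystem: a rename is cheap and touches no file data.
        os.rename(src, dst)
    except OSError as e:
        if e.errno != errno.EXDEV:
            raise
        # Cross-device: copy next to the destination first, then rename
        # into place so dst never appears half-written.
        tmp = dst + ".tmp-move"  # hypothetical temporary name
        shutil.copy2(src, tmp)
        os.rename(tmp, dst)
        os.unlink(src)
```

shutil.move() in the standard library follows the same rename-first
pattern, but its fallback copies straight to the destination.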

-- 
Best regards,
Michał Górny

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 316 bytes --]


* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 15:09             ` Michał Górny
@ 2011-11-26 15:25               ` Rich Freeman
  2011-11-26 16:00                 ` Michał Górny
  2011-11-26 15:58               ` Nirbheek Chauhan
  1 sibling, 1 reply; 20+ messages in thread
From: Rich Freeman @ 2011-11-26 15:25 UTC
  To: Michał Górny; +Cc: gentoo-dev

On Sat, Nov 26, 2011 at 10:09 AM, Michał Górny <mgorny@gentoo.org> wrote:
> But in this particular case, I don't think COW is particularly useful.
> If it works only on filesystem bounds, we could move the file directly
> anyway.

Yup - I would only use it if you really are doing a copy and not a
move (neglecting the fact that the implementation of a
cross-filesystem move does a copy first).  I imagine many ebuilds do
copy operations internally, but probably not to an extent where it
would make much difference.  I'm not sure how doins/dobin/etc are
implemented - I think they're copies and so allowing for the fact that
not everybody uses a tmpfs it might make sense to fix those.

Rich




* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 13:44           ` Rich Freeman
  2011-11-26 15:09             ` Michał Górny
@ 2011-11-26 15:50             ` Nirbheek Chauhan
  1 sibling, 0 replies; 20+ messages in thread
From: Nirbheek Chauhan @ 2011-11-26 15:50 UTC
  To: gentoo-dev

On Sat, Nov 26, 2011 at 7:14 PM, Rich Freeman <rich0@gentoo.org> wrote:
> isn't supported.  It is available in stable coreutils.  Some speculate
> that this option could increase fragmentation (both copies will share
> extents from the original file, and have some extents of their own),
> but btrfs doesn't overwrite anything in-place so fragmentation is a
> potential issue with any file modification (change one byte in the

Adding to your comments on this:

To mitigate such issues, newer versions of the btrfs driver have
automatic online defragmentation as well.  It works quite well for
moderate fragmentation.

A particularly ghastly case, where fragmentation becomes pathological,
is files that are fsync()ed very frequently.  A typical example is the
*.sqlite files in ~/.mozilla, which easily get hundreds or even
thousands of fragments after a few hours' worth of Firefox usage (this
can be verified with filefrag).

To fix such things, those specific files can be defragmented online
at regular intervals using `btrfs fi defrag <file>`.

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team




* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 15:09             ` Michał Górny
  2011-11-26 15:25               ` Rich Freeman
@ 2011-11-26 15:58               ` Nirbheek Chauhan
  2011-11-26 16:08                 ` Michał Górny
  1 sibling, 1 reply; 20+ messages in thread
From: Nirbheek Chauhan @ 2011-11-26 15:58 UTC
  To: gentoo-dev; +Cc: rich0

On Sat, Nov 26, 2011 at 8:39 PM, Michał Górny <mgorny@gentoo.org> wrote:
> But in this particular case, I don't think COW is particularly useful.
> If it works only on filesystem bounds, we could move the file directly
> anyway.
>

There are still a few specific cases in which CoW would indeed be
useful. IIRC, reflinking of files works across btrfs *subvolumes*, and
such a copy would normally be detected as a cross-device move. Another
use would be a patch-merge which makes use of *ranged reflinks* to
CoW-copy only those parts of the file that were changed[1]. rsync has
support for this, but only while appending to files (--append-verify
--no-whole-file).


1. Somewhat like rope data structures, with the caveat that ranges
must be block-size aligned.

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team




* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 15:25               ` Rich Freeman
@ 2011-11-26 16:00                 ` Michał Górny
  0 siblings, 0 replies; 20+ messages in thread
From: Michał Górny @ 2011-11-26 16:00 UTC
  To: gentoo-dev; +Cc: rich0

[-- Attachment #1: Type: text/plain, Size: 981 bytes --]

On Sat, 26 Nov 2011 10:25:15 -0500
Rich Freeman <rich0@gentoo.org> wrote:

> On Sat, Nov 26, 2011 at 10:09 AM, Michał Górny <mgorny@gentoo.org>
> wrote:
> > But in this particular case, I don't think COW is particularly
> > useful. If it works only on filesystem bounds, we could move the
> > file directly anyway.
> 
> Yup - I would only use it if you really are doing a copy and not a
> move (neglecting the fact that the implementation of a
> cross-filesystem move does a copy first).  I imagine many ebuilds do
> copy operations internally, but probably not to an extent where it
> would make much difference.  I'm not sure how doins/dobin/etc are
> implemented - I think they're copies and so allowing for the fact that
> not everybody uses a tmpfs it might make sense to fix those.

AFAICS doins uses 'install' mostly, and sometimes 'cp' (with symlinks).
I don't see any variant of '--reflink' option for 'install'.

-- 
Best regards,
Michał Górny

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 316 bytes --]


* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 15:58               ` Nirbheek Chauhan
@ 2011-11-26 16:08                 ` Michał Górny
  2011-11-26 16:31                   ` Nirbheek Chauhan
  0 siblings, 1 reply; 20+ messages in thread
From: Michał Górny @ 2011-11-26 16:08 UTC
  To: gentoo-dev; +Cc: nirbheek, rich0

[-- Attachment #1: Type: text/plain, Size: 1022 bytes --]

On Sat, 26 Nov 2011 21:28:51 +0530
Nirbheek Chauhan <nirbheek@gentoo.org> wrote:

> On Sat, Nov 26, 2011 at 8:39 PM, Michał Górny <mgorny@gentoo.org>
> wrote:
> > But in this particular case, I don't think COW is particularly
> > useful. If it works only on filesystem bounds, we could move the
> > file directly anyway.
> >
> 
> There are still a few specific cases in which CoW would indeed be
> useful. IIRC, reflinking of files works across btrfs *subvolumes*, and
> such a copy would normally be detected as a cross-device move.

For such a thing, shouldn't rename() work neat anyway?

> Another use would be an patch-merge which makes use of *ranged
> reflinks* to only CoW copy those parts of the file that were
> changed[1]. rsync has support for this, but only while appending to
> files (--append-verify --no-whole-file).

So, it'd be like:
1) CoW-dup old file,
2) patch-merge into the duped old file,
3) replace.

Am I understanding correctly?

-- 
Best regards,
Michał Górny

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 316 bytes --]


* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 12:50       ` Nirbheek Chauhan
  2011-11-26 12:59         ` Ciaran McCreesh
@ 2011-11-26 16:19         ` Mike Frysinger
  2011-11-26 16:34           ` Nirbheek Chauhan
  2011-11-27 21:44           ` [gentoo-portage-dev] " Zac Medico
  1 sibling, 2 replies; 20+ messages in thread
From: Mike Frysinger @ 2011-11-26 16:19 UTC
  To: gentoo-dev; +Cc: Nirbheek Chauhan, Fabian Groffen, gentoo-portage-dev

[-- Attachment #1: Type: Text/Plain, Size: 963 bytes --]

On Saturday 26 November 2011 07:50:27 Nirbheek Chauhan wrote:
> On Sat, Nov 26, 2011 at 5:08 PM, Fabian Groffen wrote:
> > On 26-11-2011 16:56:41 +0530, Nirbheek Chauhan wrote:
> >> [...] Besides, sorting even 30,000
> >> entries (if you're merging every ebuild in portage) should not take
> >> more than a few secs.
> > 
> > A linux kernel has around that much of files, and I really wonder if
> > it's worth waiting a couple of seconds (probably more on sparc and arm
> > systems) just because then the files are in sorted order.
> 
> I'm not sure the two are really comparable. However, looking at a
> simple string sort on 30,000 strings, I don't see it taking a
> significant amount of time at all:

sure, it's probably not significantly higher, but i also can't see any point in 
sorting the entries.  we've been doing fine so far in the 10+ years of it being 
unsorted.  so unless Arfrever has a compelling reason, time to revert.
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]


* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 16:08                 ` Michał Górny
@ 2011-11-26 16:31                   ` Nirbheek Chauhan
  0 siblings, 0 replies; 20+ messages in thread
From: Nirbheek Chauhan @ 2011-11-26 16:31 UTC
  To: Michał Górny; +Cc: gentoo-dev, rich0

On Sat, Nov 26, 2011 at 9:38 PM, Michał Górny <mgorny@gentoo.org> wrote:
> On Sat, 26 Nov 2011 21:28:51 +0530
> Nirbheek Chauhan <nirbheek@gentoo.org> wrote:
>> There are still a few specific cases in which CoW would indeed be
>> useful. IIRC, reflinking of files works across btrfs *subvolumes*, and
>> such a copy would normally be detected as a cross-device move.
>
> For such a thing, shouldn't rename() work neat anyway?
>

No, because reflink is an ioctl that works directly on the FS level by
sharing data blocks, and should theoretically not bother about the
file hierarchy. On the other hand, rename() is a userland API and must
behave itself.

>> Another use would be an patch-merge which makes use of *ranged
>> reflinks* to only CoW copy those parts of the file that were
>> changed[1]. rsync has support for this, but only while appending to
>> files (--append-verify --no-whole-file).
>
> So, it'd be like:
> 1) CoW-dup old file,
> 2) patch-merge into the duped old file,
> 3) replace.
>
> Am I understanding correctly?
>

You can do that, or perhaps you can just do the patch-merge in-place.
Not sure about the crash guarantees in the latter case. The former
(rename) is documented here:
https://btrfs.wiki.kernel.org/articles/f/a/q/FAQ_1fe9.html#What_are_the_crash_guarantees_of_overwrite-by-rename.3F

But in all this, the hard part is really the "patch-merge" for
anything except appends. ;)
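
The three steps (duplicate, patch, replace) can be sketched in Python
as follows. This is an illustration only: `replace_with_patched` and
the `.new` suffix are made-up names, the duplicate is a plain copy
rather than a reflink, and the fsync before the rename is a common
precaution rather than something the thread prescribes.

```python
import os
import shutil


def replace_with_patched(path, patch):
    """Patch a copy of path, then swap it in with an atomic rename.

    patch is a callable that modifies the file at the path it is given.
    """
    # Same directory as the target, so the rename stays on one filesystem.
    tmp = path + ".new"  # hypothetical temporary name
    shutil.copy2(path, tmp)      # 1) duplicate the old file
    patch(tmp)                   # 2) patch-merge into the duplicate
    with open(tmp, "rb+") as f:
        os.fsync(f.fileno())     # push the new contents to disk first
    os.replace(tmp, path)        # 3) replace the old file in one step
```

Readers of `path` see either the old or the new contents, never a
partially patched file.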

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team




* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 16:19         ` Mike Frysinger
@ 2011-11-26 16:34           ` Nirbheek Chauhan
  2011-11-27 21:44           ` [gentoo-portage-dev] " Zac Medico
  1 sibling, 0 replies; 20+ messages in thread
From: Nirbheek Chauhan @ 2011-11-26 16:34 UTC
  To: Mike Frysinger; +Cc: gentoo-dev, Fabian Groffen, gentoo-portage-dev

On Sat, Nov 26, 2011 at 9:49 PM, Mike Frysinger <vapier@gentoo.org> wrote:
> On Saturday 26 November 2011 07:50:27 Nirbheek Chauhan wrote:
>> I'm not sure the two are really comparable. However, looking at a
>> simple string sort on 30,000 strings, I don't see it taking a
>> significant amount of time at all:
>
> sure, it's probably not significantly higher, but i also can't see any point in
> sorting the entries.  we've been doing fine so far in the 10+ years of it being
> unsorted.  so unless Arfrever has a compelling reason, time to revert.
> -mike
>

I agree. My argument was that if sorting has some benefit, the cost is
negligible, and it should be done properly. If the benefits of the
sorting are non-existent, or if it causes problems as Ciaran pointed
out, then sorting should not be done.

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team




* Re: [gentoo-portage-dev] Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 16:19         ` Mike Frysinger
  2011-11-26 16:34           ` Nirbheek Chauhan
@ 2011-11-27 21:44           ` Zac Medico
  1 sibling, 0 replies; 20+ messages in thread
From: Zac Medico @ 2011-11-27 21:44 UTC
  To: gentoo-portage-dev
  Cc: Mike Frysinger, gentoo-dev, Nirbheek Chauhan, Fabian Groffen

On 11/26/2011 08:19 AM, Mike Frysinger wrote:
> On Saturday 26 November 2011 07:50:27 Nirbheek Chauhan wrote:
>> On Sat, Nov 26, 2011 at 5:08 PM, Fabian Groffen wrote:
>>> On 26-11-2011 16:56:41 +0530, Nirbheek Chauhan wrote:
>>>> [...] Besides, sorting even 30,000
>>>> entries (if you're merging every ebuild in portage) should not take
>>>> more than a few secs.
>>>
>>> A linux kernel has around that much of files, and I really wonder if
>>> it's worth waiting a couple of seconds (probably more on sparc and arm
>>> systems) just because then the files are in sorted order.
>>
>> I'm not sure the two are really comparable. However, looking at a
>> simple string sort on 30,000 strings, I don't see it taking a
>> significant amount of time at all:
> 
> sure, it's probably not significantly higher, but i also can't see any point in 
> sorting the entries.  we've been doing fine so far in the 10+ years of it being 
> unsorted.  so unless Arfrever has a compelling reason, time to revert.
> -mike

Okay, reverted:

http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=7c5b170d47ab054bc3f8a7778dd3f8139c1239c6

-- 
Thanks,
Zac




* [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-26 10:58 ` [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/ Fabian Groffen
  2011-11-26 11:26   ` Nirbheek Chauhan
@ 2011-11-27 22:28   ` Arfrever Frehtes Taifersar Arahesis
  2011-11-28  8:06     ` Michał Górny
  2011-11-30 20:31     ` Mike Frysinger
  1 sibling, 2 replies; 20+ messages in thread
From: Arfrever Frehtes Taifersar Arahesis @ 2011-11-27 22:28 UTC
  To: Gentoo Development

[-- Attachment #1: Type: Text/Plain, Size: 897 bytes --]

2011-11-26 11:58:22 Fabian Groffen napisał(a):
> On 26-11-2011 01:54:35 +0000, Arfrever Frehtes Taifersar Arahesis wrote:
> > commit:     1d4ac47c28706094230cb2c4e6ee1c1c71629aa0
> > Author:     Arfrever Frehtes Taifersar Arahesis <Arfrever <AT> Gentoo <DOT> Org>
> > AuthorDate: Sat Nov 26 01:52:49 2011 +0000
> > Commit:     Arfrever Frehtes Taifersar Arahesis <arfrever <AT> gentoo <DOT> org>
> > CommitDate: Sat Nov 26 01:52:49 2011 +0000
> > URL:        http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1d4ac47c
> > 
> > dblink.mergeme(): Merge files in alphabetic order.
> 
> What's the advantage of this?

The advantage is that this makes it easier to review the output of `emerge` and to verify that
the correct files have been installed.

> I don't really like to pay for sorting a potentially huge list just for some eye-candy. 

Sorting the file list of the Linux sources takes less than 0.05 s.
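
A figure like this can be checked on a given machine with a short
script along the following lines. It is only a sketch: the function
name `time_path_sort` is made up, and it times nothing but the in-memory
sort of the relative paths, not the directory walk or any disk access
during the merge.

```python
import os
import time


def time_path_sort(root):
    """Return (number of files under root, seconds spent sorting their paths)."""
    paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.relpath(os.path.join(dirpath, name), root))
    start = time.perf_counter()
    ordered = sorted(paths)  # the only step being timed
    elapsed = time.perf_counter() - start
    return len(ordered), elapsed
```

For example, `time_path_sort("/usr/src/linux")` would report the count
and sort time for a kernel tree unpacked at that (assumed) location.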

-- 
Arfrever Frehtes Taifersar Arahesis

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]


* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-27 22:28   ` Arfrever Frehtes Taifersar Arahesis
@ 2011-11-28  8:06     ` Michał Górny
  2011-11-28 12:57       ` Rich Freeman
  2011-11-30 20:31     ` Mike Frysinger
  1 sibling, 1 reply; 20+ messages in thread
From: Michał Górny @ 2011-11-28  8:06 UTC
  To: gentoo-dev; +Cc: arfrever

[-- Attachment #1: Type: text/plain, Size: 1116 bytes --]

On Sun, 27 Nov 2011 23:28:12 +0100
Arfrever Frehtes Taifersar Arahesis <arfrever@gentoo.org> wrote:

> 2011-11-26 11:58:22 Fabian Groffen napisał(a):
> > On 26-11-2011 01:54:35 +0000, Arfrever Frehtes Taifersar Arahesis
> > wrote:
> > > commit:     1d4ac47c28706094230cb2c4e6ee1c1c71629aa0
> > > Author:     Arfrever Frehtes Taifersar Arahesis <Arfrever <AT> Gentoo <DOT> Org>
> > > AuthorDate: Sat Nov 26 01:52:49 2011 +0000
> > > Commit:     Arfrever Frehtes Taifersar Arahesis <arfrever <AT>
> > > gentoo <DOT> org> CommitDate: Sat Nov 26 01:52:49 2011 +0000
> > > URL:
> > > http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1d4ac47c
> > > 
> > > dblink.mergeme(): Merge files in alphabetic order.
> > 
> > What's the advantage of this?
> 
> The advantage is that this allows to easier review output of `emerge`
> to verify if correct files have been installed.
> 
> > I don't really like to pay for sorting a potentially huge list just
> > for some eye-candy. 
> 
> Time of sorting in case of Linux sources is less than 0.05 s.

Have you considered the time overhead of moving files in an unnatural order?

-- 
Best regards,
Michał Górny

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 316 bytes --]


* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-28  8:06     ` Michał Górny
@ 2011-11-28 12:57       ` Rich Freeman
  0 siblings, 0 replies; 20+ messages in thread
From: Rich Freeman @ 2011-11-28 12:57 UTC
  To: gentoo-dev; +Cc: arfrever

On Mon, Nov 28, 2011 at 3:06 AM, Michał Górny <mgorny@gentoo.org> wrote:
> Have you considered time overhead of moving files in unnatural order?

Rather than re-discuss this point, it would probably be better for
everybody to just read through the entire thread again, particularly
Ciaran's post and its follow-ups.  My understanding is that the patch
has already been reverted.

Now, if somebody has a suggestion for how to sort the files in such a
way as to improve the performance (assuming that it doesn't already
happen in inode order) that would certainly take things forward.

Sorting lists in RAM is cheap, disk seeks are expensive.

Of course, if you're using tmpfs it is either all in RAM or in swap in
the first place, and I'm not sure if swap brings in additional
considerations here.

Rich




* Re: [gentoo-dev] Re: proj/portage:master commit in: pym/portage/dbapi/
  2011-11-27 22:28   ` Arfrever Frehtes Taifersar Arahesis
  2011-11-28  8:06     ` Michał Górny
@ 2011-11-30 20:31     ` Mike Frysinger
  1 sibling, 0 replies; 20+ messages in thread
From: Mike Frysinger @ 2011-11-30 20:31 UTC
  To: gentoo-dev; +Cc: Arfrever Frehtes Taifersar Arahesis

[-- Attachment #1: Type: Text/Plain, Size: 1299 bytes --]

On Sunday 27 November 2011 17:28:12 Arfrever Frehtes Taifersar Arahesis wrote:
> 2011-11-26 11:58:22 Fabian Groffen napisał(a):
> > On 26-11-2011 01:54:35 +0000, Arfrever Frehtes Taifersar Arahesis wrote:
> > > commit:     1d4ac47c28706094230cb2c4e6ee1c1c71629aa0
> > > Author:     Arfrever Frehtes Taifersar Arahesis <Arfrever <AT> Gentoo <DOT> Org>
> > > AuthorDate: Sat Nov 26 01:52:49 2011 +0000
> > > Commit:     Arfrever Frehtes Taifersar Arahesis <arfrever <AT> gentoo
> > > <DOT> org> CommitDate: Sat Nov 26 01:52:49 2011 +0000
> > > URL:       
> > > http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1
> > > d4ac47c
> > > 
> > > dblink.mergeme(): Merge files in alphabetic order.
> > 
> > What's the advantage of this?
> 
> The advantage is that this allows to easier review output of `emerge` to
> verify if correct files have been installed.

then you should be sorting the output and not the db
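
A minimal sketch of that separation: merge in whatever order the list
arrives in, and sort only what is shown to the user. The names
`merge_files` and `merge_one` are made up for illustration and do not
correspond to functions in Portage.

```python
def merge_files(mergelist, merge_one):
    """Merge files in their given order; return the names sorted for display."""
    merged = []
    for name in mergelist:
        merge_one(name)      # the on-disk work keeps the original order
        merged.append(name)
    # Only the report is sorted, so the merge order is left untouched.
    return sorted(merged)
```

The trade-off is that the report can only be printed once the whole
list has been merged, rather than line by line as each file lands.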

along these lines, if you're going to be making commits to important repos, 
please use changelogs that are actually useful.  the changelog here:
	dblink.mergeme(): Merge files in alphabetic order.
is utterly and completely useless.

you need to document *why* you're changing things first and foremost, and only 
then can you delve into details as to why the implementation you're committing 
makes sense.
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
