From: Alastair Tse <liquidx@gentoo.org>
To: gentoo-dev@gentoo.org
Subject: Re: [gentoo-dev] python-2.3.2 testing required
Date: Thu, 13 Nov 2003 09:51:28 +0000
Message-ID: <1068717088.25166.47.camel@huggins.eng.cam.ac.uk>
In-Reply-To: <200311130910.16348.tdickenson@devmail.geminidataloggers.co.uk>

On Thu, 2003-11-13 at 09:10, Toby Dickenson wrote:
> I've not used ucs4 python yet, but it is one of the things I was looking
> forward to in version 2.3. It would be much nicer to leave ucs2 behind.

I would like to move away from UCS2 as well, but I'd like some arguments
for why this is a good thing beyond "it's more compatible".

> If ucs4 strings were the only cause of that difference, supybot would need to
> be storing 2.5 million unicode characters. I guess that isn't likely.
> Excluding bugs, I don't see any reason why a program that doesn't use any
> unicode objects would use more memory when running on a ucs4 python
> interpreter.

All unicode string objects would have been stored in UCS4 instead of
UCS2. Things like XML parsers use unicode string objects to store
their representations because UTF-8 is the default encoding for XML.
Those sorts of applications may see a more significant growth in
memory footprint.
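
As a rough sketch of the arithmetic behind this: assuming fixed-width
storage of 2 bytes per character under UCS2 and 4 bytes under UCS4, and
ignoring per-object overhead, the extra payload is 2 bytes per stored
character. The helper name below is hypothetical and only illustrates
the calculation; it is not a measurement.

```python
# Assumption: 2 bytes per character on a UCS2 build, 4 bytes on a UCS4
# build, with per-object overhead ignored.
UCS2_BYTES_PER_CHAR = 2
UCS4_BYTES_PER_CHAR = 4


def extra_bytes_under_ucs4(n_chars):
    """Hypothetical helper: extra payload bytes for n_chars characters
    when unicode strings are stored as UCS4 instead of UCS2."""
    return n_chars * (UCS4_BYTES_PER_CHAR - UCS2_BYTES_PER_CHAR)


if __name__ == "__main__":
    # The 2.5 million characters mentioned in the quoted text.
    print(extra_bytes_under_ucs4(2500000))
```

Under these assumptions, 2.5 million unicode characters would account
for about 5 million extra bytes; a program holding few unicode objects
would see correspondingly little growth from this cause.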

> > But note that this example is not scientific
> > because the machines were different in kernel version, compiler and
> > compiler optimisations.
> 
> Those reasons sound much more plausible to me. Does anyone have a more
> scientific comparison of the effect of the ucs4 option on python?

I'd like to do that some time. Otherwise, someone with a faster machine
than mine may want to try it. It would be interesting to see what the
real impact is. If the memory footprint doesn't grow as much as I claimed
it does, then that is a powerful argument for moving to UCS4 as the default.
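
Any such comparison first needs to record which build is being
measured. A minimal sketch, assuming sys.maxunicode distinguishes the
two builds as it does on Python 2.x (65535 on a UCS2 build, 1114111 on
a UCS4 build). The function name is hypothetical. Note that Python 3.3
and later always report 1114111 regardless of internal storage, so the
label is only meaningful on the older interpreters.

```python
import sys


def unicode_build(maxunicode=None):
    """Hypothetical helper: label the interpreter's unicode build from
    sys.maxunicode (or from an explicit value, for illustration)."""
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # A narrow (UCS2) build cannot represent code points above 0xFFFF.
    if maxunicode <= 0xFFFF:
        return "UCS2"
    return "UCS4"


if __name__ == "__main__":
    print(unicode_build())
```

Recording this label next to each memory figure would keep results from
the two kinds of interpreter from being mixed up.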

The reason UCS2 is still the default in the masked python-2.3.2 is
that (a) not many people currently use anything that requires
characters beyond UCS2 and (b) UCS4 does take up more memory than
UCS2. How much more, I'm not certain.

For instance, how much more memory would portage take if it doesn't use
unicode strings at all?

Cheers,
-- 
Alastair 'liquidx' Tse
 >> Gentoo Developer
 >> http://www.liquidx.net/ | http://dev.gentoo.org/~liquidx/

