Begin forwarded message:

Date: Fri, 8 Oct 2004 13:56:17 +0100
From: Ciaran McCreesh
To: Marius Mauch
Cc: gentoo-portage-dev@lists.gentoo.org
Subject: Re: [gentoo-portage-dev] changelog encoding

[ not sure if I can post to gentoo-portage-dev, please forward on if
not... ]

On Fri, 8 Oct 2004 14:31:52 +0200 Marius Mauch wrote:
| On 10/07/04 Brian wrote:
| > What is the official encoding method(s) for the changelogs. It has
| > been reported that porthole often fails getting the changelogs due
| > to the encoding. Currently it is assuming ascii. Many are
| > reported to be iso-8859-1.
|
| I don't think we have an official encoding, but I think ciaranm knows
| a bit more about that issue.

Yup. We *need* to have an official encoding. Reason being, at least one
developer has a non-ASCII character (that is, one outside code points
0..127) in their name. Said encoding should also apply to ebuilds, but
not to files/ entries (I could give the lengthy explanation if anyone
really wants to know, but basically certain things would break).

I've been whinging about this on and off for about a year now, and every
time it's been dismissed as irrelevant :)

If we're going to standardise on an encoding, it's got to be UTF-8.
iso-8859-1 is not sufficient to represent every developer's (and
potential patch contributor's) name correctly. UTF-16 and plain old
four-byte Unicode aren't compatible with our existing files (in UTF-8,
characters 0 to 127 are encoded exactly as in regular ASCII). Yes, UTF-8
kinda sucks in terms of space when encoding Japanese or Russian
characters, but since these will be a rare occurrence it's not really a
problem.

-- 
Ciaran McCreesh : Gentoo Developer (Sparc, MIPS, Vim, Fluxbox)
Mail            : ciaranm at gentoo.org
Web             : http://dev.gentoo.org/~ciaranm
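
A minimal Python sketch of the compatibility argument above. The sample
strings, the example name and the decode_changelog helper are illustrative
assumptions; none of this is code that Portage or porthole actually ships.

```python
# Illustrative only: the sample text, the name and the helper are assumptions.
ascii_text = "net-misc/foo-1.0: version bump"

# Pure ASCII text has identical bytes in ASCII and UTF-8, so existing
# ASCII-only ChangeLogs are already valid UTF-8.
same_in_utf8 = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# UTF-16 uses two bytes even for ASCII characters, so it is not
# byte-compatible with those existing files.
same_in_utf16 = ascii_text.encode("utf-16-le") == ascii_text.encode("ascii")

# A hypothetical contributor name containing a character outside ISO-8859-1.
name = "Łukasz"
try:
    name.encode("iso-8859-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

# The space cost mentioned above: three Japanese characters take more
# bytes in UTF-8 than in UTF-16.
utf8_len = len("日本語".encode("utf-8"))
utf16_len = len("日本語".encode("utf-16-le"))


def decode_changelog(raw: bytes) -> str:
    """Try UTF-8 first, then fall back to ISO-8859-1 for legacy files."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte value, so this fallback cannot fail,
        # although it may misread text written in some other encoding.
        return raw.decode("iso-8859-1")
```

A reader such as porthole could apply the same "UTF-8 first, ISO-8859-1 as
fallback" order while older files are still being converted; that ordering
is a suggestion here, not an agreed policy.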