public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] UTF-8 locale by default
@ 2012-07-19 21:39 Sascha Cunz
  2012-07-19 22:23 ` Chí-Thanh Christopher Nguyễn
  2012-07-19 22:28 ` Ulrich Mueller
  0 siblings, 2 replies; 49+ messages in thread
From: Sascha Cunz @ 2012-07-19 21:39 UTC (permalink / raw
  To: gentoo-dev

I recently discovered that, for some reason, I had overlooked the warning
about setting the locale to UTF-8 in the Gentoo handbook, apparently for
several years; thus I was still running all my systems in a POSIX locale,
since I never cared much about it.

Since noticing this, I have talked to several people about it, and all of
them gave the same first response: "Not shipping with a UTF-8 locale turned
on by default nowadays is probably a bug in your distro".

Thinking about this, and recognizing that recent distributions do indeed
ship with some UTF-8 locale by default, I tend to agree with that statement.

Although Google brings up a lot of good documentation on how to change the
locale, I couldn't find anything that explains why stage3 is still delivered
with the POSIX locale set.

Is there a reason for not using at least en_US.UTF-8 as a "sane" default 
value?

BR,
SaCu



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-19 21:39 [gentoo-dev] UTF-8 locale by default Sascha Cunz
@ 2012-07-19 22:23 ` Chí-Thanh Christopher Nguyễn
  2012-07-19 22:28 ` Ulrich Mueller
  1 sibling, 0 replies; 49+ messages in thread
From: Chí-Thanh Christopher Nguyễn @ 2012-07-19 22:23 UTC (permalink / raw
  To: gentoo-dev

Sascha Cunz wrote:
> Is there a reason for not using at least en_US.UTF-8 as a "sane" default 
> value?

This was already discussed some time ago. Setting LANG="en_US.UTF-8"
would mess with collation rules, measurement and paper units, etc., which
has the potential to make users outside the USA unhappy.

It might make sense to set LC_CTYPE="en_US.UTF-8", but even so,
transliteration may give you unexpected results.

To illustrate this, try running

echo äå | LC_CTYPE=en_US.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8
echo äå | LC_CTYPE=da_DK.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8
echo äå | LC_CTYPE=de_DE.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8

and compare the output.
For the previous discussion, see this thread:
http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml


Best regards,
Chí-Thanh Christopher Nguyễn
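
The three iconv commands above can be wrapped in a small script for easier
comparison. This is only a sketch: the helper names translit and have_locale
are made up for illustration, and it assumes a glibc-style iconv and
`locale -a`. A locale that is not installed is silently replaced by the C
locale, which would hide the differences, so the script checks availability
first.

```shell
#!/bin/sh
# translit LOCALE TEXT (hypothetical helper)
# Transliterate UTF-8 TEXT to ASCII with LC_CTYPE set to LOCALE.
translit() {
    # LC_ALL is emptied so that it cannot override LC_CTYPE.
    printf '%s\n' "$2" | LC_ALL= LC_CTYPE="$1" iconv -f UTF-8 -t ASCII//TRANSLIT
}

# have_locale LOCALE (hypothetical helper)
# Succeed only if LOCALE is listed by `locale -a`.
have_locale() {
    # `locale -a` may spell en_US.UTF-8 as en_US.utf8, so compare a
    # lowercased form with the hyphens removed.
    want=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')
    locale -a 2>/dev/null | tr '[:upper:]' '[:lower:]' | sed 's/-//g' |
        grep -qxF "$want"
}

for loc in en_US.UTF-8 da_DK.UTF-8 de_DE.UTF-8; do
    if have_locale "$loc"; then
        printf '%s: %s\n' "$loc" "$(translit "$loc" 'äå')"
    else
        printf '%s: not installed, skipped\n' "$loc"
    fi
done
```

Any differences between the printed lines come from the LC_CTYPE setting
alone, since the input and the iconv arguments are identical in each run.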




* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-19 21:39 [gentoo-dev] UTF-8 locale by default Sascha Cunz
  2012-07-19 22:23 ` Chí-Thanh Christopher Nguyễn
@ 2012-07-19 22:28 ` Ulrich Mueller
  2012-07-27  6:42   ` Ben de Groot
  1 sibling, 1 reply; 49+ messages in thread
From: Ulrich Mueller @ 2012-07-19 22:28 UTC (permalink / raw
  To: gentoo-dev

>>>>> On Thu, 19 Jul 2012, Sascha Cunz wrote:

> Is there a reason for not using at least en_US.UTF-8 as a "sane"
> default value?

Because there's no one-size-fits-all locale, but it is specific to
every system so the user must configure it?

The matter was recently discussed in this mailing list [1] and also in
the March 2012 council meeting [2], and as a result the docs team has
amended the respective section [3] of the handbook.

Ulrich

[1] <http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml>
[2] <http://www.gentoo.org/proj/en/council/meeting-logs/20120313.txt>
[3] <http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=1&chap=8>




* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-19 22:28 ` Ulrich Mueller
@ 2012-07-27  6:42   ` Ben de Groot
  2012-07-27  7:08     ` Ulrich Mueller
  0 siblings, 1 reply; 49+ messages in thread
From: Ben de Groot @ 2012-07-27  6:42 UTC (permalink / raw
  To: gentoo-dev

On 20 July 2012 06:28, Ulrich Mueller <ulm@gentoo.org> wrote:
>>>>>> On Thu, 19 Jul 2012, Sascha Cunz wrote:
>
>> Is there a reason for not using at least en_US.UTF-8 as a "sane"
>> default value?
>
> Because there's no one-size-fits-all locale, but it is specific to
> every system so the user must configure it?

While this is understandable, the fact remains that not having a
UTF-8 locale by default in our stage3 environment is sub-optimal.

I understand why the council rejected Debian's C.UTF-8 option,
but is there really no better default that we can use?

Without any default locale set, in practically all cases that means
that the user is presented with English, and mostly the American
variant. So, in practice, we are defaulting to en_US, just not in a
unicode environment. Correct me if I'm wrong.

Also, in most other places (such as our website, GLEPs, ebuilds)
we default to en_US.UTF-8.

So let's upgrade to en_US.UTF-8, which is for most users more
desirable than the current situation. Of course we will still advise
them to set their desired locales in /etc/locale.gen. But at least
they will start with a unicode environment, as expected anno 2012.


> The matter was recently discussed in this mailing list [1] and also in
> the March 2012 council meeting [2], and as a result the docs team has
> amended the respective section [3] of the handbook.
>
> Ulrich
>
> [1] <http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml>
> [2] <http://www.gentoo.org/proj/en/council/meeting-logs/20120313.txt>
> [3] <http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=1&chap=8>
>

-- 
Cheers,

Ben | yngwin
Gentoo developer
Gentoo Qt project lead, Gentoo Wiki admin



* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27  6:42   ` Ben de Groot
@ 2012-07-27  7:08     ` Ulrich Mueller
  2012-07-27  7:19       ` Rick "Zero_Chaos" Farina
                         ` (3 more replies)
  0 siblings, 4 replies; 49+ messages in thread
From: Ulrich Mueller @ 2012-07-27  7:08 UTC (permalink / raw
  To: gentoo-dev

>>>>> On Fri, 27 Jul 2012, Ben de Groot wrote:

> I understand why the council rejected Debian's C.UTF-8 option,
> but is there really no better default that we can use?

> Without any default locale set, in practically all cases that means
> that the user is presented with English, and mostly the American
> variant. So, in practice, we are defaulting to en_US, just not in a
> unicode environment. Correct me if I'm wrong.

See below. We're not defaulting to en_US for things like the number
format.

> Also, in most other places (such as our website, GLEPs, ebuilds)
> we default to en_US.UTF-8.

> So let's upgrade to en_US.UTF-8, which is for most users more
> desirable than the current situation. Of course we will still advise
> them to set their desired locales in /etc/locale.gen. But at least
> they will start with a unicode environment, as expected anno 2012.

As I had pointed out before [1], changing from POSIX to an en_US
locale will have undesirable side effects, like commas as thousands
separators in numbers (because of LC_NUMERIC). Also the defaults of
en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.

So if we change the default (but I still don't see the need), we
should go for a less intrusive setting like:

   LANG="POSIX"
   LC_CTYPE="en_US.utf8"

Ulrich

[1] <http://archives.gentoo.org/gentoo-dev/msg_56a438adde8efebd467ada5f858048ba.xml>
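
The two-line setting above affects only character handling because POSIX
resolves each locale category from LC_ALL first, then from the category's own
variable, then from LANG. A minimal sketch of that lookup in plain sh;
effective_locale is a hypothetical helper written for illustration, not an
existing tool (the `locale` command reports the same information on a real
system).

```shell
#!/bin/sh
# effective_locale CATEGORY (hypothetical helper)
# Print the locale name that applies to CATEGORY (e.g. LC_NUMERIC),
# following the precedence LC_ALL > LC_<category> > LANG.
effective_locale() {
    # Accept only names made of capital letters and underscores, since the
    # name is passed to eval below.
    case "$1" in
        ''|*[!A-Z_]*) return 2 ;;
    esac
    eval "cat_value=\${$1:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$cat_value" ]; then
        printf '%s\n' "$cat_value"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Nothing set: implementation default, typically the C/POSIX locale.
        printf 'POSIX\n'
    fi
}

# Apply the proposed setting in a subshell so the caller is not affected.
(
    unset LC_ALL LC_NUMERIC LC_COLLATE
    LANG=POSIX
    LC_CTYPE=en_US.utf8
    for category in LC_CTYPE LC_NUMERIC LC_COLLATE; do
        printf '%s=%s\n' "$category" "$(effective_locale "$category")"
    done
)
# Prints:
#   LC_CTYPE=en_US.utf8
#   LC_NUMERIC=POSIX
#   LC_COLLATE=POSIX
```

Only LC_CTYPE resolves to the UTF-8 locale; number formatting and collation
fall through to LANG and stay POSIX.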



* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27  7:08     ` Ulrich Mueller
@ 2012-07-27  7:19       ` Rick "Zero_Chaos" Farina
  2012-07-27  8:06       ` Dan Douglas
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 49+ messages in thread
From: Rick "Zero_Chaos" Farina @ 2012-07-27  7:19 UTC (permalink / raw
  To: gentoo-dev

On 07/27/2012 03:08 AM, Ulrich Mueller wrote:
> 
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> 
> So if we change the default (but I still don't see the need), we
> should go for a less intrusive setting like:
> 
>    LANG="POSIX"
>    LC_CTYPE="en_US.utf8"

I would love to see a utf8 default, if the above is agreeable then I say +1

-Zero

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27  7:08     ` Ulrich Mueller
  2012-07-27  7:19       ` Rick "Zero_Chaos" Farina
@ 2012-07-27  8:06       ` Dan Douglas
  2012-07-27  8:34         ` Ben de Groot
  2012-07-27  8:38       ` Cyprien Nicolas
  2012-07-27 12:13       ` Chí-Thanh Christopher Nguyễn
  3 siblings, 1 reply; 49+ messages in thread
From: Dan Douglas @ 2012-07-27  8:06 UTC (permalink / raw
  To: gentoo-dev


On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
> >>>>> On Fri, 27 Jul 2012, Ben de Groot wrote:
> 
> > I understand why the council rejected Debian's C.UTF-8 option,
> > but is there really no better default that we can use?
> 
> > Without any default locale set, in practically all cases that means
> > that the user is presented with English, and mostly the American
> > variant. So, in practice, we are defaulting to en_US, just not in a
> > unicode environment. Correct me if I'm wrong.
> 
> See below. We're not defaulting to en_US for things like the number
> format.
> 
> > Also, in most other places (such as our website, GLEPs, ebuilds)
> > we default to en_US.UTF-8.
> 
> > So let's upgrade to en_US.UTF-8, which is for most users more
> > desirable than the current situation. Of course we will still advise
> > them to set their desired locales in /etc/locale.gen. But at least
> > they will start with a unicode environment, as expected anno 2012.
> 
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> 
> So if we change the default (but I still don't see the need), we
> should go for a less intrusive setting like:
> 
>    LANG="POSIX"
>    LC_CTYPE="en_US.utf8"
> 
> Ulrich
> 

You're concerned about the commas breaking things? Given that you usually need 
to specifically ask for them (i.e., printf ' flag), and that kind of output is 
usually going to be for human consumption only that seems unlikely. If 
anything does rely upon the format, can't tolerate different locales, and fails 
to specify LC_NUMERIC then it's broken anyway.

LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more annoying 
defaults for some people. What do users of other distros think? Is this really 
a serious problem for anyone?

LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is getting utf8 
by default. I can live with LANG=POSIX.
-- 
Dan Douglas
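
The point about the ' flag can be checked directly. A rough sketch, assuming
GNU coreutils printf (invoked through env so that a shell builtin that may
not know the flag is bypassed) and assuming en_US.UTF-8 has been generated;
group_digits and plain_digits are hypothetical helper names.

```shell
#!/bin/sh
# group_digits LOCALE NUMBER (hypothetical helper)
# Print NUMBER with the ' flag, which asks printf to apply the thousands
# grouping defined by LC_NUMERIC.
group_digits() {
    LC_ALL="$1" env printf "%'d\n" "$2"
}

# plain_digits LOCALE NUMBER (hypothetical helper)
# Print NUMBER with a plain %d, i.e. without asking for grouping.
plain_digits() {
    LC_ALL="$1" env printf '%d\n' "$2"
}

for loc in C en_US.UTF-8; do
    printf '%-12s plain=%s grouped=%s\n' "$loc" \
        "$(plain_digits "$loc" 1234567)" "$(group_digits "$loc" 1234567)"
done
```

The C locale defines no thousands separator, so both columns should show
1234567 there. Under en_US.UTF-8 only the grouped column is expected to gain
commas; if that locale is not installed, printf falls back to the C
behaviour.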


* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27  8:06       ` Dan Douglas
@ 2012-07-27  8:34         ` Ben de Groot
  2012-07-27  8:49           ` Michał Górny
  0 siblings, 1 reply; 49+ messages in thread
From: Ben de Groot @ 2012-07-27  8:34 UTC (permalink / raw
  To: gentoo-dev

On 27 July 2012 16:06, Dan Douglas <ormaaj@gmail.com> wrote:
> On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
>> >>>>> On Fri, 27 Jul 2012, Ben de Groot wrote:
>>
>> > I understand why the council rejected Debian's C.UTF-8 option,
>> > but is there really no better default that we can use?
>>
>> > Without any default locale set, in practically all cases that means
>> > that the user is presented with English, and mostly the American
>> > variant. So, in practice, we are defaulting to en_US, just not in a
>> > unicode environment. Correct me if I'm wrong.
>>
>> See below. We're not defaulting to en_US for things like the number
>> format.
>>
>> > Also, in most other places (such as our website, GLEPs, ebuilds)
>> > we default to en_US.UTF-8.
>>
>> > So let's upgrade to en_US.UTF-8, which is for most users more
>> > desirable than the current situation. Of course we will still advise
>> > them to set their desired locales in /etc/locale.gen. But at least
>> > they will start with a unicode environment, as expected anno 2012.
>>
>> As I had pointed out before [1], changing from POSIX to an en_US
>> locale will have undesirable side effects, like commas as thousands
>> separators in numbers (because of LC_NUMERIC). Also the defaults of
>> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
>>
>> So if we change the default (but I still don't see the need), we
>> should go for a less intrusive setting like:
>>
>>    LANG="POSIX"
>>    LC_CTYPE="en_US.utf8"
>>
>> Ulrich
>>
>
> You're concerned about the commas breaking things? Given that you usually need
> to specifically ask for them (i.e., printf ' flag), and that kind of output is
> usually going to be for human consumption only that seems unlikely. If
> anything does rely upon the format, can't tolerate different locales, and fails
> to specify LC_NUMERIC then it's broken anyway.
>
> LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more annoying
> defaults for some people. What do users of other distros think? Is this really
> a serious problem for anyone?
>
> LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is getting utf8
> by default. I can live with LANG=POSIX.
> --
> Dan Douglas

How about the below?

LANG=en_GB.utf8
LC_COLLATE=C
LC_CTYPE=en_GB.utf8

That will give us A4 paper size and the metric system. If LC_NUMERIC is
really a problem, we can set it to something more desirable.
-- 
Cheers,

Ben | yngwin
Gentoo developer
Gentoo Qt project lead, Gentoo Wiki admin



* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27  7:08     ` Ulrich Mueller
  2012-07-27  7:19       ` Rick "Zero_Chaos" Farina
  2012-07-27  8:06       ` Dan Douglas
@ 2012-07-27  8:38       ` Cyprien Nicolas
  2012-07-27  8:47         ` Michał Górny
  2012-07-27 12:13       ` Chí-Thanh Christopher Nguyễn
  3 siblings, 1 reply; 49+ messages in thread
From: Cyprien Nicolas @ 2012-07-27  8:38 UTC (permalink / raw
  To: gentoo-dev

Ulrich Mueller wrote:
>> On Fri, 27 Jul 2012, Ben de Groot wrote:
>>
>> So let's upgrade to en_US.UTF-8, which is for most users more
>> desirable than the current situation. Of course we will still advise
>> them to set their desired locales in /etc/locale.gen. But at least
>> they will start with a unicode environment, as expected anno 2012.
> 
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.

For this very reason my system locale is en_IE.UTF-8. Still English but
using Euro Monetary, Metric units, A4 paper, etc.

It might suit needs for most European installs, but not for everyone.

-- 
Cyprien / Fulax
Gentoo Lisp Project contributor




* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27  8:38       ` Cyprien Nicolas
@ 2012-07-27  8:47         ` Michał Górny
  0 siblings, 0 replies; 49+ messages in thread
From: Michał Górny @ 2012-07-27  8:47 UTC (permalink / raw
  To: gentoo-dev; +Cc: c.nicolas


On Fri, 27 Jul 2012 10:38:30 +0200
Cyprien Nicolas <c.nicolas@gmail.com> wrote:

> Ulrich Mueller wrote:
> >> On Fri, 27 Jul 2012, Ben de Groot wrote:
> >>
> >> So let's upgrade to en_US.UTF-8, which is for most users more
> >> desirable than the current situation. Of course we will still
> >> advise them to set their desired locales in /etc/locale.gen. But
> >> at least they will start with a unicode environment, as expected
> >> anno 2012.
> > 
> > As I had pointed out before [1], changing from POSIX to an en_US
> > locale will have undesirable side effects, like commas as thousands
> > separators in numbers (because of LC_NUMERIC). Also the defaults of
> > en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> 
> For this very reason my system locale is en_IE.UTF-8. Still English
> but using Euro Monetary, Metric units, A4 paper, etc.
> 
> It might suit needs for most European installs, but not for everyone.

Still uses ',' for thousands sep.

-- 
Best regards,
Michał Górny


* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27  8:34         ` Ben de Groot
@ 2012-07-27  8:49           ` Michał Górny
  0 siblings, 0 replies; 49+ messages in thread
From: Michał Górny @ 2012-07-27  8:49 UTC (permalink / raw
  To: gentoo-dev; +Cc: yngwin


On Fri, 27 Jul 2012 16:34:01 +0800
Ben de Groot <yngwin@gentoo.org> wrote:

> On 27 July 2012 16:06, Dan Douglas <ormaaj@gmail.com> wrote:
> > On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
> >> >>>>> On Fri, 27 Jul 2012, Ben de Groot wrote:
> >>
> >> > I understand why the council rejected Debian's C.UTF-8 option,
> >> > but is there really no better default that we can use?
> >>
> >> > Without any default locale set, in practically all cases that
> >> > means that the user is presented with English, and mostly the
> >> > American variant. So, in practice, we are defaulting to en_US,
> >> > just not in a unicode environment. Correct me if I'm wrong.
> >>
> >> See below. We're not defaulting to en_US for things like the number
> >> format.
> >>
> >> > Also, in most other places (such as our website, GLEPs, ebuilds)
> >> > we default to en_US.UTF-8.
> >>
> >> > So let's upgrade to en_US.UTF-8, which is for most users more
> >> > desirable than the current situation. Of course we will still
> >> > advise them to set their desired locales in /etc/locale.gen. But
> >> > at least they will start with a unicode environment, as expected
> >> > anno 2012.
> >>
> >> As I had pointed out before [1], changing from POSIX to an en_US
> >> locale will have undesirable side effects, like commas as thousands
> >> separators in numbers (because of LC_NUMERIC). Also the defaults of
> >> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> >>
> >> So if we change the default (but I still don't see the need), we
> >> should go for a less intrusive setting like:
> >>
> >>    LANG="POSIX"
> >>    LC_CTYPE="en_US.utf8"
> >>
> >> Ulrich
> >>
> >
> > You're concerned about the commas breaking things? Given that you
> > usually need to specifically ask for them (i.e., printf ' flag),
> > and that kind of output is usually going to be for human
> > consumption only that seems unlikely. If anything does rely upon
> > the format, can't tolerate different locales, and fails to specify
> > LC_NUMERIC then it's broken anyway.
> >
> > LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more
> > annoying defaults for some people. What do users of other distros
> > think? Is this really a serious problem for anyone?
> >
> > LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is
> > getting utf8 by default. I can live with LANG=POSIX.
> > --
> > Dan Douglas
> 
> How about the below?
> 
> LANG=en_GB.utf8
> LC_COLLATE=C
> LC_CTYPE=en_GB.utf8
> 
> That will give us A4 paper size and the metric system. If LC_NUMERIC
> is really a problem, we can set it to something more desirable.

LC_NUMERIC=pl_PL.utf8

-- 
Best regards,
Michał Górny


* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27  7:08     ` Ulrich Mueller
                         ` (2 preceding siblings ...)
  2012-07-27  8:38       ` Cyprien Nicolas
@ 2012-07-27 12:13       ` Chí-Thanh Christopher Nguyễn
  2012-07-27 17:24         ` Mike Frysinger
  3 siblings, 1 reply; 49+ messages in thread
From: Chí-Thanh Christopher Nguyễn @ 2012-07-27 12:13 UTC (permalink / raw
  To: gentoo-dev

Ulrich Mueller wrote:
> As I had pointed out before [1], changing from POSIX to an en_US
> locale will have undesirable side effects, like commas as thousands
> separators in numbers (because of LC_NUMERIC). Also the defaults of
> en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
>
> So if we change the default (but I still don't see the need), we
> should go for a less intrusive setting like:
>
>    LANG="POSIX"
>    LC_CTYPE="en_US.utf8"

This would be better than LANG="en_US.utf8" but I would still prefer not
to have any country/region attached to the locale. The C.UTF-8 locale
which Debian uses for this purpose (a UTF-8 locale without side effects)
appears more suitable to me.


Best regards,
Chí-Thanh Christopher Nguyễn




* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27 12:13       ` Chí-Thanh Christopher Nguyễn
@ 2012-07-27 17:24         ` Mike Frysinger
  2012-07-27 18:29           ` Pacho Ramos
  2012-08-03  5:16           ` Luca Barbato
  0 siblings, 2 replies; 49+ messages in thread
From: Mike Frysinger @ 2012-07-27 17:24 UTC (permalink / raw
  To: gentoo-dev


On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn wrote:
> Ulrich Mueller wrote:
> > As I had pointed out before [1], changing from POSIX to an en_US
> > locale will have undesirable side effects, like commas as thousands
> > separators in numbers (because of LC_NUMERIC). Also the defaults of
> > en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> > 
> > So if we change the default (but I still don't see the need), we
> > should go for a less intrusive setting like:
> > 
> >    LANG="POSIX"
> >    LC_CTYPE="en_US.utf8"
> 
> This would be better than LANG="en_US.utf8" but I would still prefer not
> to have any country/region attached to the locale. The C.UTF-8 locale
> which Debian uses for this purpose (a UTF-8 locale without side effects)
> appears more suitable to me.

yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the only 
real option in my mind for making unicode the default.  any other 
amalgamations of various locales is ugly as sin.
-mike


* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27 17:24         ` Mike Frysinger
@ 2012-07-27 18:29           ` Pacho Ramos
  2012-07-27 20:16             ` Aaron W. Swenson
  2012-08-03  5:16           ` Luca Barbato
  1 sibling, 1 reply; 49+ messages in thread
From: Pacho Ramos @ 2012-07-27 18:29 UTC (permalink / raw
  To: gentoo-dev


On Fri, 27-07-2012 at 13:24 -0400, Mike Frysinger wrote:
> On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn wrote:
> > Ulrich Mueller wrote:
> > > As I had pointed out before [1], changing from POSIX to an en_US
> > > locale will have undesirable side effects, like commas as thousands
> > > separators in numbers (because of LC_NUMERIC). Also the defaults of
> > > en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
> > > 
> > > So if we change the default (but I still don't see the need), we
> > > should go for a less intrusive setting like:
> > > 
> > >    LANG="POSIX"
> > >    LC_CTYPE="en_US.utf8"
> > 
> > This would be better than LANG="en_US.utf8" but I would still prefer not
> > to have any country/region attached to the locale. The C.UTF-8 locale
> > which Debian uses for this purpose (a UTF-8 locale without side effects)
> > appears more suitable to me.
> 
> yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the only 
> real option in my mind for making unicode the default.  any other 
> amalgamations of various locales is ugly as sin.
> -mike

Do you have any idea how much time that formalization could take?
If it will take a long time, maybe we could go with one of those
amalgamations :-/


* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27 18:29           ` Pacho Ramos
@ 2012-07-27 20:16             ` Aaron W. Swenson
  2012-07-27 20:55               ` Diego Elio Pettenò
  2012-07-30 14:35               ` Michael Orlitzky
  0 siblings, 2 replies; 49+ messages in thread
From: Aaron W. Swenson @ 2012-07-27 20:16 UTC (permalink / raw
  To: gentoo-dev

On 07/27/2012 02:29 PM, Pacho Ramos wrote:
> On Fri, 27-07-2012 at 13:24 -0400, Mike Frysinger wrote:
>> On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn
>> wrote:
>>> Ulrich Mueller wrote:
>>>> As I had pointed out before [1], changing from POSIX to an
>>>> en_US locale will have undesirable side effects, like commas
>>>> as thousands separators in numbers (because of LC_NUMERIC).
>>>> Also the defaults of en_US for LC_MEASUREMENT and LC_PAPER
>>>> are only useful in the U.S.
>>>> 
>>>> So if we change the default (but I still don't see the need),
>>>> we should go for a less intrusive setting like:
>>>> 
>>>>    LANG="POSIX"
>>>>    LC_CTYPE="en_US.utf8"
>>> 
>>> This would be better than LANG="en_US.utf8" but I would still
>>> prefer not to have any country/region attached to the locale.
>>> The C.UTF-8 locale which Debian uses for this purpose (a UTF-8
>>> locale without side effects) appears more suitable to me.
>> 
>> yes, and i'm waiting on the POSIX group to formalize C.UTF-8.
>> that's the only real option in my mind for making unicode the
>> default.  any other amalgamations of various locales is ugly as
>> sin. -mike
> 
> Do you have any idea how much time that formalization could take?
> If it will take a long time, maybe we could go with one of those
> amalgamations :-/
> 

Really, how much of an inconvenience is it that we don't use UTF-8 as
a default?

In my mind, it is sufficient that we instruct users how to set the
locale in the handbook.

No user will be happy with whatever we decide to use as a default. I
will be especially upset if we use the metric system instead of the
*STANDARD* system. It has 'standard' in the name for a reason, people.
(^_^)

-- 
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email    : titanofold@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27 20:16             ` Aaron W. Swenson
@ 2012-07-27 20:55               ` Diego Elio Pettenò
  2012-07-30 14:35               ` Michael Orlitzky
  1 sibling, 0 replies; 49+ messages in thread
From: Diego Elio Pettenò @ 2012-07-27 20:55 UTC (permalink / raw
  To: gentoo-dev


On 27/07/2012 13:16, Aaron W. Swenson wrote:
> Really, how much of an inconvenience is it that we don't use UTF-8 as
> a default?

Given that there are a ton and a half of Python packages that do not
work with a non-utf8 locale, I'd say it's quite a thing.

So either we go with a UTF-8 default or somebody has to fix the
packages that do not work without it...

-- 
Diego Elio Pettenò — Flameeyes
flameeyes@flameeyes.eu — http://blog.flameeyes.eu/



* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27 20:16             ` Aaron W. Swenson
  2012-07-27 20:55               ` Diego Elio Pettenò
@ 2012-07-30 14:35               ` Michael Orlitzky
  2012-07-30 14:41                 ` Michał Górny
  2012-07-30 14:42                 ` Michael Mol
  1 sibling, 2 replies; 49+ messages in thread
From: Michael Orlitzky @ 2012-07-30 14:35 UTC (permalink / raw
  To: gentoo-dev

On 07/27/12 16:16, Aaron W. Swenson wrote:
> 
> No user will be happy with whatever we decide to use as a default.

The defaults should be what's best for the most people, with a bias
towards safety. Why don't we just take a survey and choose the most
common utf8 response?



* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 14:35               ` Michael Orlitzky
@ 2012-07-30 14:41                 ` Michał Górny
  2012-07-30 14:50                   ` Michael Orlitzky
  2012-07-30 15:04                   ` Michael Mol
  2012-07-30 14:42                 ` Michael Mol
  1 sibling, 2 replies; 49+ messages in thread
From: Michał Górny @ 2012-07-30 14:41 UTC (permalink / raw
  To: gentoo-dev; +Cc: michael


On Mon, 30 Jul 2012 10:35:36 -0400
Michael Orlitzky <michael@orlitzky.com> wrote:

> On 07/27/12 16:16, Aaron W. Swenson wrote:
> > 
> > No user will be happy with whatever we decide to use as a default.
> 
> The defaults should be what's best for the most people, with a bias
> towards safety. Why don't we just take a survey and choose the most
> common utf8 response?

How can you take a survey like that? How will you ensure it actually
hits the majority? How will you define the majority?

-- 
Best regards,
Michał Górny


* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 14:35               ` Michael Orlitzky
  2012-07-30 14:41                 ` Michał Górny
@ 2012-07-30 14:42                 ` Michael Mol
  2012-07-30 15:29                   ` Rich Freeman
  1 sibling, 1 reply; 49+ messages in thread
From: Michael Mol @ 2012-07-30 14:42 UTC (permalink / raw
  To: gentoo-dev

On Mon, Jul 30, 2012 at 10:35 AM, Michael Orlitzky <michael@orlitzky.com> wrote:
> On 07/27/12 16:16, Aaron W. Swenson wrote:
>>
>> No user will be happy with whatever we decide to use as a default.
>
> The defaults should be what's best for the most people, with a bias
> towards safety. Why don't we just take a survey and choose the most
> common utf8 response?

You'd really want to do a "which do you prefer, which can you use"
survey, then. You don't really want to choose the result preferred by
the most people; rather, you want the result which is usable by the
most people.

-- 
:wq


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 14:41                 ` Michał Górny
@ 2012-07-30 14:50                   ` Michael Orlitzky
  2012-07-30 16:28                     ` Michał Górny
  2012-07-30 15:04                   ` Michael Mol
  1 sibling, 1 reply; 49+ messages in thread
From: Michael Orlitzky @ 2012-07-30 14:50 UTC (permalink / raw
  To: gentoo-dev

On 07/30/12 10:41, Michał Górny wrote:
> On Mon, 30 Jul 2012 10:35:36 -0400
> Michael Orlitzky <michael@orlitzky.com> wrote:
> 
>> On 07/27/12 16:16, Aaron W. Swenson wrote:
>>>
>>> No user will be happy with whatever we decide to use as a default.
>>
>> The defaults should be what's best for the most people, with a bias
>> towards safety. Why don't we just take a survey and choose the most
>> common utf8 response?
> 
> How can you take a survey like that? How will you ensure it actually
> hits the majority? How will you define the majority?
> 

Considering that the alternative is to force everyone to change it
manually, you can do it however you want and it'll be an improvement.

  1) Create a webpage with a bunch of options, count the results

  2) Ask the g.o mailing lists, count responses manually

  3) Use google docs like the website survey that went out a few days
     ago

It won't hit everyone, but no survey ever does. As long as you get a
large enough unbiased sample, it doesn't matter. And anything would be
an improvement, so it doesn't matter anyway.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 14:41                 ` Michał Górny
  2012-07-30 14:50                   ` Michael Orlitzky
@ 2012-07-30 15:04                   ` Michael Mol
  2012-07-30 15:51                     ` Aaron W. Swenson
  1 sibling, 1 reply; 49+ messages in thread
From: Michael Mol @ 2012-07-30 15:04 UTC (permalink / raw
  To: gentoo-dev

On Mon, Jul 30, 2012 at 10:41 AM, Michał Górny <mgorny@gentoo.org> wrote:
> On Mon, 30 Jul 2012 10:35:36 -0400
> Michael Orlitzky <michael@orlitzky.com> wrote:
>
>> On 07/27/12 16:16, Aaron W. Swenson wrote:
>> >
>> > No user will be happy with whatever we decide to use as a default.
>>
>> The defaults should be what's best for the most people, with a bias
>> towards safety. Why don't we just take a survey and choose the most
>> common utf8 response?
>
> How can you take a survey like that? How will you ensure it actually
> hits the majority? How will you define the majority?

Serverside script on gentoo.org. Push out a news item with the URL and
a last-call date. Tabulate the results, using browser fingerprints to
weed out the bulk of duplicates.

-- 
:wq


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 14:42                 ` Michael Mol
@ 2012-07-30 15:29                   ` Rich Freeman
  0 siblings, 0 replies; 49+ messages in thread
From: Rich Freeman @ 2012-07-30 15:29 UTC (permalink / raw
  To: gentoo-dev

On Mon, Jul 30, 2012 at 10:42 AM, Michael Mol <mikemol@gmail.com> wrote:
>
> You'd really want to a "which do you prefer, which can you use"
> survey, then; You don't really want to choose the result preferred by
> the most people, rather you want the result which is usable by the
> most people.

I tend to agree.  Donnie said something in his manifesto which I think
applies here: any of the proposed solutions is probably better than
doing nothing.

If I forget to tweak my locale and I end up with a comma as a decimal
mark it isn't the end of the world, and neither is some output in
metric units.  I've ended up working on many a global system where
times get reported in GMT and people put up with the inconvenience
because they realize that any standard is better than no standard.

What is the real end-user impact of any of this stuff anyway?  During
the install the thing that matters is being able to partition disks
and compile kernels and such.  I doubt that too many users will be
dependent on installer locale settings for displaying weather reports
or such.  If they don't set locale, then it is like not setting
localtime - you just get to live with some default.  I would imagine
that at least by having a UTF-8 locale users would be able to do
things like set full names of users using unicode, etc.

Rich


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 15:04                   ` Michael Mol
@ 2012-07-30 15:51                     ` Aaron W. Swenson
  2012-08-01 20:18                       ` Andreas K. Huettel
  0 siblings, 1 reply; 49+ messages in thread
From: Aaron W. Swenson @ 2012-07-30 15:51 UTC (permalink / raw
  To: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 07/30/2012 11:04 AM, Michael Mol wrote:
> On Mon, Jul 30, 2012 at 10:41 AM, Michał Górny <mgorny@gentoo.org>
> wrote:
>> On Mon, 30 Jul 2012 10:35:36 -0400 Michael Orlitzky
>> <michael@orlitzky.com> wrote:
>> 
>>> On 07/27/12 16:16, Aaron W. Swenson wrote:
>>>> 
>>>> No user will be happy with whatever we decide to use as a
>>>> default.
>>> 
>>> The defaults should be what's best for the most people, with a
>>> bias towards safety. Why don't we just take a survey and choose
>>> the most common utf8 response?
>> 
>> How can you take a survey like that? How will you ensure it
>> actually hits the majority? How will you define the majority?
> 
> Serverside script on gentoo.org. Push out a news item with the URL
> and a last-call date. Tabulate the results, using browser
> fingerprints to weed out the bulk of duplicates.
> 

I still advocate continuing how we have been.

However, the survey should be one question: What is the output of
`locale' on your workstation/desktop/laptop?

The less painful we make the survey, the more respondents we'll get,
and the less biased the results will be. Additionally, it makes the
responses easy to parse with a script.
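
As a rough illustration of how such responses could be tallied, here is
a minimal shell sketch. It assumes each response is the pasted output of
`locale' saved in its own file (a hypothetical layout), looks only at
the LANG line, and counts an empty or missing LANG as POSIX. The name
tally_locales is made up for this example.

```shell
# tally_locales FILE...: count the LANG values found in survey responses.
# Assumption: each FILE holds one respondent's `locale` output.
tally_locales() {
    for f in "$@"; do
        # First LANG= line only; strip quotes; empty or missing means POSIX.
        lang=$(sed -n 's/^LANG=//p' "$f" | head -n 1 | tr -d '"')
        printf '%s\n' "${lang:-POSIX}"
    done | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

Calling `tally_locales responses/*' would print one "count locale" pair
per line, most common first. A real survey would also have to decide
what to do with LC_ALL and the individual LC_* overrides, which this
sketch ignores.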

Servers are excluded because special things take place there that may
not actually line up with what the user prefers.

If it turns out that C or POSIX is the most common response, we should
then default the locale to en_US.UTF-8 if we really want to default to
a UTF-8 setting. The reason being it makes sense to have the default
locale set to the country of origin, which in our case is the United
States.

Yes, it may irk those whose native locale is not en_US.UTF-8, but like
I said, no one will be happy. Except for those whose native locale
happens to be the default.

Start at a default, doesn't really matter which as long as the default
is the lingua franca of international business, and instruct the user,
as we already do, how to change it during the setup.

- -- 
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email    : titanofold@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iF4EAREIAAYFAlAWrXAACgkQVxOqA9G7/aCmowD6A8+9giw1BhhxvAag7Cmeom7o
mHVW49AfEDSo6ReknZkBAIa09FZ62SU66BCCi6m3Qisk5SW7P3YDLNbkMDS38/CZ
=lFc0
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 14:50                   ` Michael Orlitzky
@ 2012-07-30 16:28                     ` Michał Górny
  2012-07-30 16:57                       ` Michael Mol
  2012-07-30 17:33                       ` Michael Orlitzky
  0 siblings, 2 replies; 49+ messages in thread
From: Michał Górny @ 2012-07-30 16:28 UTC (permalink / raw
  To: gentoo-dev; +Cc: michael

[-- Attachment #1: Type: text/plain, Size: 2010 bytes --]

On Mon, 30 Jul 2012 10:50:29 -0400
Michael Orlitzky <michael@orlitzky.com> wrote:

> On 07/30/12 10:41, Michał Górny wrote:
> > On Mon, 30 Jul 2012 10:35:36 -0400
> > Michael Orlitzky <michael@orlitzky.com> wrote:
> > 
> >> On 07/27/12 16:16, Aaron W. Swenson wrote:
> >>>
> >>> No user will be happy with whatever we decide to use as a default.
> >>
> >> The defaults should be what's best for the most people, with a bias
> >> towards safety. Why don't we just take a survey and choose the most
> >> common utf8 response?
> > 
> > How can you take a survey like that? How will you ensure it actually
> > hits the majority? How will you define the majority?
> > 
> 
> Considering that the alternative is to force everyone to change it
> manually, you can do it however you want and it'll be an improvement.

My point here is that you want the thing to change. So you first try to
convince people here to change. We practically did a small survey here,
and the result was that we didn't agree on making the change.

So you're saying we should do another survey on another group, hoping
that this time the result will be on your side.

>   1) Create a webpage with a bunch of options, count the results
> 
>   2) Ask the g.o mailing lists, count responses manually
> 
>   3) Use google docs like the website survey that went out a few days
>      ago
> 
> It won't hit everyone, but no survey ever does. As long as you get a
> large enough unbiased sample, it doesn't matter. And anything would be
> an improvement, so it doesn't matter anyway.

It depends on who the 'unbiased sample' is. Are you interested only in
opinion of Gentoo users who visit the website? Who sync once a day?
Once a week? Who follow Gentoo Planet? Who participate in the forums?

We can create the survey and announce it everywhere. But it still won't
catch many old-time Gentoo users who can actually have something
opposite to say. It won't be unbiased.

-- 
Best regards,
Michał Górny

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 16:28                     ` Michał Górny
@ 2012-07-30 16:57                       ` Michael Mol
  2012-07-30 17:33                       ` Michael Orlitzky
  1 sibling, 0 replies; 49+ messages in thread
From: Michael Mol @ 2012-07-30 16:57 UTC (permalink / raw
  To: gentoo-dev

On Mon, Jul 30, 2012 at 12:28 PM, Michał Górny <mgorny@gentoo.org> wrote:
> On Mon, 30 Jul 2012 10:50:29 -0400
> Michael Orlitzky <michael@orlitzky.com> wrote:
>
>> On 07/30/12 10:41, Michał Górny wrote:
>> > On Mon, 30 Jul 2012 10:35:36 -0400
>> > Michael Orlitzky <michael@orlitzky.com> wrote:
>> >
>> >> On 07/27/12 16:16, Aaron W. Swenson wrote:
>> >>>
>> >>> No user will be happy with whatever we decide to use as a default.
>> >>
>> >> The defaults should be what's best for the most people, with a bias
>> >> towards safety. Why don't we just take a survey and choose the most
>> >> common utf8 response?
>> >
>> > How can you take a survey like that? How will you ensure it actually
>> > hits the majority? How will you define the majority?
>> >
>>
>> Considering that the alternative is to force everyone to change it
>> manually, you can do it however you want and it'll be an improvement.
>
> My point here is that you want the thing to change. So you first try to
> convince people here to change. We practically did a small survey here
> and in the result we didn't agree on doing the change.
>
> So you're saying we should do another survey on another group, hoping
> that this time the result will be on your side.
>
>>   1) Create a webpage with a bunch of options, count the results
>>
>>   2) Ask the g.o mailing lists, count responses manually
>>
>>   3) Use google docs like the website survey that went out a few days
>>      ago
>>
>> It won't hit everyone, but no survey ever does. As long as you get a
>> large enough unbiased sample, it doesn't matter. And anything would be
>> an improvement, so it doesn't matter anyway.
>
> It depends on who the 'unbiased sample' is. Are you interested only in
> opinion of Gentoo users who visit the website? Who sync once a day?
> Once a week? Who follow Gentoo Planet? Who participate in the forums?
>
> We can create the survey and announce it everywhere. But it still won't
> catch many old-time Gentoo users who can actually have something
> opposite to say. It won't be unbiased.

I was thinking about this, and I suspect that a survey period of 1-2
months is likely fine. It should also be enough to scoop up people who
run servers and monitor those servers for security updates.

-- 
:wq


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 16:28                     ` Michał Górny
  2012-07-30 16:57                       ` Michael Mol
@ 2012-07-30 17:33                       ` Michael Orlitzky
  2012-07-30 19:02                         ` Walter Dnes
  2012-08-02 21:38                         ` Kent Fredric
  1 sibling, 2 replies; 49+ messages in thread
From: Michael Orlitzky @ 2012-07-30 17:33 UTC (permalink / raw
  To: gentoo-dev

On 07/30/12 12:28, Michał Górny wrote:
> 
> My point here is that you want the thing to change. So you first try to
> convince people here to change. We practically did a small survey here
> and in the result we didn't agree on doing the change.
> 
> So you're saying we should do another survey on another group, hoping
> that this time the result will be on your side.

We didn't do a survey, we asked,

  "Is there a reason for not using at least en_US.UTF-8 as a "sane"
   default value?"

Unsurprisingly, the responses contained reasons for not using
en_US.UTF-8 as the default.

Don't take my original reply out of context, I don't actually care what
we have as the default.


> 
> It depends on who the 'unbiased sample' is. Are you interested only in
> opinion of Gentoo users who visit the website? Who sync once a day?
> Once a week? Who follow Gentoo Planet? Who participate in the forums?
> 
> We can create the survey and announce it everywhere. But it still won't
> catch many old-time Gentoo users who can actually have something
> opposite to say. It won't be unbiased.

The technical objection to C.UTF-8 is that it's non-standard; OK. What
are the technical objections to LC_CTYPE=en_US.UTF-8? If the
alternatives are all improvements, the statistics are irrelevant.
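
To make the LC_CTYPE-only idea concrete, here is a small sketch that
runs `locale' in an otherwise empty environment with just LC_CTYPE set.
The function name ctype_only is hypothetical, and en_US.UTF-8 is assumed
to have been generated on the system; the exact formatting of the output
may differ between libc implementations.

```shell
# ctype_only: show selected locale categories as seen by a process whose
# only locale-related variable is LC_CTYPE (function name is hypothetical).
ctype_only() {
    # env -i starts from an empty environment, so LANG and LC_ALL are unset.
    env -i PATH="$PATH" LC_CTYPE=en_US.UTF-8 locale 2>/dev/null |
        grep -E '^LC_(CTYPE|COLLATE|NUMERIC|PAPER)='
}
```

Running `ctype_only' should report en_US.UTF-8 for LC_CTYPE only, with
collation, number formatting and paper size left at the POSIX default.
That is the difference from setting LANG, which would pull all of those
categories along with it.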


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 17:33                       ` Michael Orlitzky
@ 2012-07-30 19:02                         ` Walter Dnes
  2012-07-31 15:16                           ` Michael Orlitzky
  2012-08-02 21:38                         ` Kent Fredric
  1 sibling, 1 reply; 49+ messages in thread
From: Walter Dnes @ 2012-07-30 19:02 UTC (permalink / raw
  To: gentoo-dev

On Mon, Jul 30, 2012 at 01:33:48PM -0400, Michael Orlitzky wrote

> The technical objection to C.UTF-8 is that it's non-standard, Ok.
> What are the technical objections to LC_CTYPE=en_US.UTF-8? If the
> alternatives are all improvements, the statistics are irrelevant.

  I ran into a problem several months ago with xfreecell not running.
Turned out the ISO8859-1 fonts were not being generated, just UTF-8.
xfreecell needs ISO8859-1 fonts.  And it's not the only package.  I
modified xorg-2.eclass so that font packages would build ISO8859-1.  See
http://article.gmane.org/gmane.linux.gentoo.user/252316/ for the gory
details.  Would forcing UTF-8 cause problems for packages that expect
specific ISO encodings in X fonts?

  The important part of the eclass mod was to manually enable iso8859-1
and disable all other encodings...

# If configure knows about "disable-all-encodings", ISO8859-1 has to be
# enabled explicitly; in both cases every other encoding is disabled.
if grep -q -s "disable-all-encodings" "${ECONF_SOURCE:-.}/configure"; then
	FONT_OPTIONS+="
		--enable-iso8859-1
		--disable-iso10646
		--disable-iso10646-1
		--disable-iso8859-2
		--disable-iso8859-3
		--disable-iso8859-4
		--disable-iso8859-5
		--disable-iso8859-6
		--disable-iso8859-7
		--disable-iso8859-8
		--disable-iso8859-9
		--disable-iso8859-10
		--disable-iso8859-11
		--disable-iso8859-12
		--disable-iso8859-13
		--disable-iso8859-14
		--disable-iso8859-15
		--disable-iso8859-16
		--disable-jisx0201
		--disable-koi8-r"
else
	FONT_OPTIONS+="
		--disable-iso10646
		--disable-iso10646-1
		--disable-iso8859-2
		--disable-iso8859-3
		--disable-iso8859-4
		--disable-iso8859-5
		--disable-iso8859-6
		--disable-iso8859-7
		--disable-iso8859-8
		--disable-iso8859-9
		--disable-iso8859-10
		--disable-iso8859-11
		--disable-iso8859-12
		--disable-iso8859-13
		--disable-iso8859-14
		--disable-iso8859-15
		--disable-iso8859-16
		--disable-jisx0201
		--disable-koi8-r"
fi
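
The two branches of that snippet differ only in --enable-iso8859-1, so
the same flag list could be generated instead of spelled out twice. The
following is only a sketch of that idea, not what xorg-2.eclass actually
contains; the function name font_encoding_options is hypothetical, and
the flags come out in a different order, which should not matter to
configure.

```shell
# font_encoding_options CONFIGURE: print the configure flags that keep
# ISO8859-1 and disable the other encodings (function name is hypothetical).
font_encoding_options() {
    opts=""
    # Only a configure script that mentions "disable-all-encodings"
    # gets the explicit enable, as in the snippet above.
    if grep -q -s "disable-all-encodings" "$1"; then
        opts="--enable-iso8859-1"
    fi
    for enc in iso10646 iso10646-1 jisx0201 koi8-r; do
        opts="$opts --disable-$enc"
    done
    # ISO8859-2 through ISO8859-16; ISO8859-1 is deliberately skipped.
    n=2
    while [ "$n" -le 16 ]; do
        opts="$opts --disable-iso8859-$n"
        n=$((n + 1))
    done
    # Drop the leading space left over when no enable flag was added.
    printf '%s\n' "${opts# }"
}
```

In the eclass this might be used along the lines of
FONT_OPTIONS+=" $(font_encoding_options "${ECONF_SOURCE:-.}/configure")",
assuming FONT_OPTIONS is later expanded unquoted as in the original.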

-- 
Walter Dnes <waltdnes@waltdnes.org>


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 19:02                         ` Walter Dnes
@ 2012-07-31 15:16                           ` Michael Orlitzky
  0 siblings, 0 replies; 49+ messages in thread
From: Michael Orlitzky @ 2012-07-31 15:16 UTC (permalink / raw
  To: gentoo-dev

On 07/30/12 15:02, Walter Dnes wrote:
> Would forcing UTF-8 cause problems for packages that expect
> specific ISO encodings in X fonts?

Not that I know of (and setting a default wouldn't force anything).

xfreecell's readme states "Make sure there is a font named 7x14", and
another thread mentions that this font is provided by
media-fonts/font-misc-misc, so that sounds like a bug in the ebuild to me.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 15:51                     ` Aaron W. Swenson
@ 2012-08-01 20:18                       ` Andreas K. Huettel
  2012-08-01 20:29                         ` Michael Orlitzky
  0 siblings, 1 reply; 49+ messages in thread
From: Andreas K. Huettel @ 2012-08-01 20:18 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: Text/Plain, Size: 706 bytes --]


> 
> If it turns out that C or POSIX is the most common response, we should
> then default the locale to en_US.UTF-8 if we really want to default to
> a UTF-8 setting. The reason being it makes sense to have the default
> locale set to the country of origin, which in our case is the United
> States.
> 

Given the number of Gentoo devs (especially on the desktop side where this 
matters most) from other parts of the world, that's not really a valid 
argument, particularly in cases such as the paper size setting, where basically 
US stubbornness stands against the rest of the planet.

-- 

Andreas K. Huettel
Gentoo Linux developer 
dilfridge@gentoo.org
http://www.akhuettel.de/


[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-01 20:18                       ` Andreas K. Huettel
@ 2012-08-01 20:29                         ` Michael Orlitzky
  2012-08-02  0:20                           ` Walter Dnes
  0 siblings, 1 reply; 49+ messages in thread
From: Michael Orlitzky @ 2012-08-01 20:29 UTC (permalink / raw
  To: gentoo-dev

On 08/01/12 16:18, Andreas K. Huettel wrote:
> 
>>
>> If it turns out that C or POSIX is the most common response, we should
>> then default the locale to en_US.UTF-8 if we really want to default to
>> a UTF-8 setting. The reason being it makes sense to have the default
>> locale set to the country of origin, which in our case is the United
>> States.
>>
> 
> Given the number of Gentoo devs (especially on the desktop side where this 
> matters most) from other parts of the world, that's not really a valid 
> argument. In particular in cases as e.g. "Paper size setting", where basically 
> US stubbornness stands against the rest of the planet.
> 

Every locale is wrong for somebody; the idea was that by taking a
survey, you could make it wrong for the fewest people (by default).

If the majority of users use a stupid paper size, the best default is
still whatever they use regardless of any personal preferences.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-01 20:29                         ` Michael Orlitzky
@ 2012-08-02  0:20                           ` Walter Dnes
  2012-08-02  1:00                             ` Mike Gilbert
                                               ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Walter Dnes @ 2012-08-02  0:20 UTC (permalink / raw
  To: gentoo-dev

On Wed, Aug 01, 2012 at 04:29:42PM -0400, Michael Orlitzky wrote

> Every locale is wrong for somebody; the idea was that by taking
> a survey, you could make it wrong for the least amount of people
> (by default).

  Question... has anybody ever considered that maybe a POSIX locale
is wrong for the least amount of people???  There's also a very damning
statement in the post that started this thread...

On Thu, Jul 19, 2012 at 11:39:59PM +0200, Sascha Cunz wrote
> I recently discovered that I for some reason haven't noticed the
> warning about setting the locale to utf-8 in the gentoo handbook for
> obviously several years; thus i was still running all my systems in
> a POSIX locale since i never cared much about it.
> 
> However, since I noticed, I talked to several people about it; all
> of them stating as first response: "Not shipping with a utf-8 locale
> turned on by default nowadays probably is a bug in your distro"

  That's right... the poster was running a POSIX locale for several
years ***AND DID NOT HAVE ANY PROBLEMS RELATED TO IT***.  Then "several
people said" "Not shipping with a utf-8 locale turned on by default
nowadays probably is a bug in your distro".  And suddenly it's a
problem.  What's next?  Despite running with no problems for many years
with a separate /usr and no initramfs, will we have "several people"
come along and tell us that it's a bug in our distro?  Oh... wait...

  The fact that "other distros do it" does not constitute justification
for us to do it.  If I wanted to run Redhat or Ubuntu, I'd run Redhat or
Ubuntu.  We're ignoring a very basic question here... what problems does
shipping with a POSIX locale cause that would be fixed by setting a UTF8
default locale???  I want a real answer.  Not something along the lines
of "But daddy, all the other kids are doing it".

-- 
Walter Dnes <waltdnes@waltdnes.org>


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-02  0:20                           ` Walter Dnes
@ 2012-08-02  1:00                             ` Mike Gilbert
  2012-08-02  6:42                               ` Fabian Groffen
  2012-08-02  5:36                             ` Peter Stuge
  2012-08-02  5:43                             ` [gentoo-dev] " Sergey Popov
  2 siblings, 1 reply; 49+ messages in thread
From: Mike Gilbert @ 2012-08-02  1:00 UTC (permalink / raw
  To: gentoo-dev

On Wed, Aug 1, 2012 at 8:20 PM, Walter Dnes <waltdnes@waltdnes.org> wrote:
> We're ignoring a very basic question here... what problems does
> shipping with a POSIX locale cause that would be fixed by setting a UTF8
> default locale???  I want a real answer.  Not something along the lines
> of "But daddy, all the other kids are doing it".
>

Try reading the rest of the thread before posting a rant.

Diego mentioned the python issue. As well, there are many test suites
that malfunction without a UTF-8 or en_US.UTF-8 locale. If you hunt
through Bugzilla, you can probably dig up other issues.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-02  0:20                           ` Walter Dnes
  2012-08-02  1:00                             ` Mike Gilbert
@ 2012-08-02  5:36                             ` Peter Stuge
  2012-08-02 18:22                               ` [gentoo-dev] " Duncan
  2012-08-02  5:43                             ` [gentoo-dev] " Sergey Popov
  2 siblings, 1 reply; 49+ messages in thread
From: Peter Stuge @ 2012-08-02  5:36 UTC (permalink / raw
  To: gentoo-dev

Walter Dnes wrote:
> The fact that "other distros do it" does not constitute
> justification for us to do it.

Unfortunately that exact reason, along with "Fedora is doing it", was
cited by a very active developer as reason to reject technical points
which I tried to make a few times.

But that is off-topic. Let's leave it for later. All I'm saying is
don't underestimate pack mentality.


//Peter


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-02  0:20                           ` Walter Dnes
  2012-08-02  1:00                             ` Mike Gilbert
  2012-08-02  5:36                             ` Peter Stuge
@ 2012-08-02  5:43                             ` Sergey Popov
  2 siblings, 0 replies; 49+ messages in thread
From: Sergey Popov @ 2012-08-02  5:43 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 868 bytes --]

02.08.2012 04:20, Walter Dnes wrote:
>   That's right... the poster was running a POSIX locale for several
> years ***AND DID NOT HAVE ANY PROBLEMS RELATED TO IT***.  
This discussion is very similar to one that I saw in the Russian
Linux community some years ago about migrating from ru_RU.KOI8-R to
ru_RU.UTF-8. The arguments from the "KOI8-R guys" were the same: "Why should
we change something if it works?" They also did not notice the
fundamental problems with some vitally important packages, which could not
be replaced or needed to be heavily patched to work properly. The arguments
from the "UTF-8 guys" were not ideal, but the locale change broke only old or
unsupported packages, so they won.

P.S. I do not think the comparison with the 'initramfs and separate /usr
problem' is correct in this case. A default locale change is evolution,
not revolution...


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 554 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-02  1:00                             ` Mike Gilbert
@ 2012-08-02  6:42                               ` Fabian Groffen
  2012-08-02  9:14                                 ` Stelian Ionescu
  2012-08-02 18:21                                 ` Diego Elio Pettenò
  0 siblings, 2 replies; 49+ messages in thread
From: Fabian Groffen @ 2012-08-02  6:42 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 493 bytes --]

On 01-08-2012 21:00:23 -0400, Mike Gilbert wrote:
> Diego mentioned the python issue.

Honestly, if some asian person has whatever charset that I often find in
spam messages, but is not UTF-8, are you then going to tell that person
to switch to UTF-8 to get those python packages emerged?  I hope not.

There is a difference between "there is a UTF-8 locale available on the
system" and "en_US.UTF-8 locale is in effect".
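
The distinction can be checked from a shell. This is a minimal sketch
using the standard `locale -a' and `locale charmap' invocations; the two
function names are hypothetical, and the pattern assumes UTF-8 locales
are listed with a name containing "utf8" or "UTF-8".

```shell
# utf8_available: succeed if at least one UTF-8 locale is installed.
utf8_available() {
    locale -a 2>/dev/null | grep -Eqi 'utf-?8'
}

# utf8_in_effect: succeed if the current environment selects a UTF-8
# character set for LC_CTYPE.
utf8_in_effect() {
    [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]
}
```

On a system with a generated UTF-8 locale but no locale variables set,
utf8_available would be expected to succeed while utf8_in_effect fails,
and a package that only needs some UTF-8 locale to exist could select
one for its own build or test run.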

Fabian

-- 
Fabian Groffen
Gentoo on a different level

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-02  6:42                               ` Fabian Groffen
@ 2012-08-02  9:14                                 ` Stelian Ionescu
  2012-08-02 18:21                                 ` Diego Elio Pettenò
  1 sibling, 0 replies; 49+ messages in thread
From: Stelian Ionescu @ 2012-08-02  9:14 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 510 bytes --]

On Thu, 2012-08-02 at 08:42 +0200, Fabian Groffen wrote:
> On 01-08-2012 21:00:23 -0400, Mike Gilbert wrote:
> > Diego mentioned the python issue.
> 
> Honestly, if some asian person has whatever charset that I often find in
> spam messages, but is not UTF-8, are you then going to tell that person
> to switch to UTF-8 to get those python packages emerged?  I hope not.

Yes.

-- 
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-02  6:42                               ` Fabian Groffen
  2012-08-02  9:14                                 ` Stelian Ionescu
@ 2012-08-02 18:21                                 ` Diego Elio Pettenò
  2012-08-02 18:32                                   ` Mike Gilbert
  2012-08-02 18:45                                   ` Alexis Ballier
  1 sibling, 2 replies; 49+ messages in thread
From: Diego Elio Pettenò @ 2012-08-02 18:21 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 778 bytes --]

On 01/08/2012 23:42, Fabian Groffen wrote:
> Honestly, if some asian person has whatever charset that I often find in
> spam messages, but is not UTF-8, are you then going to tell that person
> to switch to UTF-8 to get those python packages emerged?  I hope not.

Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
available, but doesn't set one by default -> Python stuff fails to build
or test -> not going to be fixed with "change your locale" reasoning.

Is it mental? Yes.
Would I like that to change? Yes.
Do I care whether that's through the use of cluebyfour on the Python
team or by setting an utf-8 locale by default? Not in the least.

-- 
Diego Elio Pettenò — Flameeyes
flameeyes@flameeyes.eu — http://blog.flameeyes.eu/


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 554 bytes --]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [gentoo-dev] Re: UTF-8 locale by default
  2012-08-02  5:36                             ` Peter Stuge
@ 2012-08-02 18:22                               ` Duncan
  0 siblings, 0 replies; 49+ messages in thread
From: Duncan @ 2012-08-02 18:22 UTC (permalink / raw
  To: gentoo-dev

Peter Stuge posted on Thu, 02 Aug 2012 07:36:07 +0200 as excerpted:

> Walter Dnes wrote:
>> The fact that "other distros do it" does not constitute justification
>> for us to do it.
> 
> Unfortunately that exact reason, along with "Fedora is doing it", was
> cited by a very active developer as reason to reject technical points
> which I tried to make a few times.
> 
> But that is off-topic. Let's leave it for later. All I'm saying is don't
> underestimate pack mentality.

I don't know if it applies here, but there's a difference between "other 
distros do it", and "that's the way upstream created it".  If upstream is 
effectively another distro, the first might be true as well, but gentoo 
has always had a policy of trying to do it upstream's way unless there's 
a good reason not to, and that's quite different than just following the 
distro pack.  (You likely know this already, but other readers may not, 
so I'm emphasizing it.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-02 18:21                                 ` Diego Elio Pettenò
@ 2012-08-02 18:32                                   ` Mike Gilbert
  2012-08-03 14:59                                     ` Matthew Summers
  2012-08-02 18:45                                   ` Alexis Ballier
  1 sibling, 1 reply; 49+ messages in thread
From: Mike Gilbert @ 2012-08-02 18:32 UTC (permalink / raw
  To: gentoo-dev

On Thu, Aug 2, 2012 at 2:21 PM, Diego Elio Pettenò
<flameeyes@flameeyes.eu> wrote:
> On 01/08/2012 23:42, Fabian Groffen wrote:
>> Honestly, if some asian person has whatever charset that I often find in
>> spam messages, but is not UTF-8, are you then going to tell that person
>> to switch to UTF-8 to get those python packages emerged?  I hope not.
>
> Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
> available, but doesn't set it by default -> Python stuff fails to build
> or test -> not going to be fixed with "change your locale" reasoning.
>
> Is it mental? Yes.
> Would I like that to change? Yes.
> Do I care whether that's through the use of cluebyfour on the Python
> team or by setting an utf-8 locale by default? Not in the least.
>

Please apply the cluebyfour to the upstream developers of python and
python modules. :-)

I do try to fix unicode problems if I run into them. However,
sometimes it just isn't worth the effort.



* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-02 18:21                                 ` Diego Elio Pettenò
  2012-08-02 18:32                                   ` Mike Gilbert
@ 2012-08-02 18:45                                   ` Alexis Ballier
  1 sibling, 0 replies; 49+ messages in thread
From: Alexis Ballier @ 2012-08-02 18:45 UTC (permalink / raw
  To: gentoo-dev

On Thu, 02 Aug 2012 11:21:40 -0700
Diego Elio Pettenò <flameeyes@flameeyes.eu> wrote:

> On 01/08/2012 23:42, Fabian Groffen wrote:
> > Honestly, if some asian person has whatever charset that I often
> > find in spam messages, but is not UTF-8, are you then going to tell
> > that person to switch to UTF-8 to get those python packages
> > emerged?  I hope not.
> 
> Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
> available, but doesn't set it by default -> Python stuff fails to
> build or test -> not going to be fixed with "change your locale"
> reasoning.

Not that it is hard to set LC_ALL=something before running the failing
command, or to make the package manager do it. We already fix regexp
bugs with other locales (or work around them by setting LC_ALL=C); this
falls under the same category.
You just need to teach people, and maybe mandate that a UTF-8 locale be
present. The same way they do not consider Estonian alphabet ordering
'broken', they would not consider not having a UTF-8 locale 'broken',
especially when UTF-8 is far from optimal in terms of size for Asian
languages.
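
A minimal sketch of the per-command override described above. It only
shows the mechanism: the locale is changed for one command and the
caller's environment is left alone. run_in_locale is a hypothetical
helper name, not something portage or any eclass provides.

```shell
# Override the locale for a single pipeline stage only.
# Under LC_ALL=C, sort compares by byte value, so "A" sorts before "a".
sorted=$(printf 'b\nA\na\n' | LC_ALL=C sort)
printf '%s\n' "$sorted"

# run_in_locale (hypothetical helper): run a command with LC_ALL forced
# to the first argument, without touching the calling shell's settings.
run_in_locale() {
    wanted=$1
    shift
    env LC_ALL="$wanted" "$@"
}

# The child process sees LC_ALL=C regardless of the user's locale.
run_in_locale C sh -c 'printf "%s\n" "$LC_ALL"'
```

Using env keeps the override scoped to the child process, which is the
same effect as prefixing the assignment to the command by hand.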

A.



* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-30 17:33                       ` Michael Orlitzky
  2012-07-30 19:02                         ` Walter Dnes
@ 2012-08-02 21:38                         ` Kent Fredric
  1 sibling, 0 replies; 49+ messages in thread
From: Kent Fredric @ 2012-08-02 21:38 UTC (permalink / raw
  To: gentoo-dev

On 31 July 2012 05:33, Michael Orlitzky <michael@orlitzky.com> wrote:
> On 07/30/12 12:28, Michał Górny wrote:
>>
>> My point here is that you want the thing to change. So you first try to
>> convince people here to change. We practically did a small survey here
>> and in the result we didn't agree on doing the change.
>>
>> So you're saying we should do another survey on another group, hoping
>> that this time the result will be on your side.
>
> We didn't do a survey, we asked,
>
>   "Is there a reason for not using at least en_US.UTF-8 as a "sane"
>    default value?"
>
> Unsurprisingly, the responses contained reasons for not using
> en_US.UTF-8 as the default.
>

I think it's a shame that:

1. the current handbook way to change timezone is manually editing a file.
2. the handbook doesn't mention `eselect locale`
3. `eselect locale list` is useless if you have *all* locales available to you.
4. `eselect locale` can only set the LANG variable.
5. that eselect doesn't have an interactive mode yet.

Why? Because this problem could be made simpler by providing a way to
use a recommended locale for your timezone, which is likely to yield a
saner default for that timezone.

It would also make it easier to validate the value the user chooses
for their Timezone value.

Consider:

eselect timezone list
 # all level 1 timezones + groups , ie: like ls /usr/share/zoneinfo
eselect timezone list  America/
# contents of /usr/share/zoneinfo/America
eselect timezone set America/Chicago
# /etc/timezone is updated to  'America/Chicago'
# /etc/localtime is replaced with /usr/share/zoneinfo/America/Chicago
eselect locale set --all auto
# LANG and LC_* are set using the values defined as "default" for
# America/Chicago
eselect locale set --ctype auto
# Only LC_CTYPE is autopopulated.
eselect locale list
# 600 items because you have a vanilla locale.gen
eselect locale list --timezone
# shows a list of LOCALE values for the current TZ, with the one that
# would be used as default first/marked up differently
eselect locale list en
# shows english locale options
eselect locale set --ctype en_US.utf8


The benefits of setting these locales this way are obvious, to me at
least: you can set locales to a sensible value automatically.
You can also validate a user's choice of locale and provide feedback.
For example, you can list non-installed locales, and then tell the user,
if they try to use a locale that isn't installed yet, that they need to
update locale.gen.
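
A rough sketch of the lookup-and-validate idea, under stated
assumptions: recommended_locale and locale_installed are hypothetical
helper names, and the timezone-to-locale table is illustrative only
(only the America/New_York to en_US.utf8 pairing appears in this
thread). No such eselect module exists; this is plain shell.

```shell
# recommended_locale (hypothetical): map a timezone name to a default
# locale. The table entries are assumptions made for illustration.
recommended_locale() {
    case $1 in
        America/New_York|America/Chicago) echo en_US.utf8 ;;
        Europe/Berlin)                    echo de_DE.utf8 ;;
        *)                                return 1 ;;
    esac
}

# locale_installed (hypothetical): succeed if the locale named in $1
# appears in the list on stdin, one name per line as `locale -a` prints.
locale_installed() {
    grep -q -x -F -- "$1"
}

tz=America/Chicago
if loc=$(recommended_locale "$tz"); then
    if locale -a 2>/dev/null | locale_installed "$loc"; then
        echo "would set LC_CTYPE=$loc"
    else
        echo "$loc is recommended for $tz but is not installed"
    fi
else
    echo "no recommendation for $tz"
fi
```

Separating the lookup from the installed-check means the tool can still
show the recommendation when the locale has not been generated yet.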

The only way I can suggest something better would be an interactive
locale setter, something like 'tzselect', except that it sets timezone
*and* locale information, with the ability to automatically update
locale.gen, add new locale definitions and regenerate the locale
database.

This way, you could have a selection process more like this:

https://gist.github.com/3240866

#? 1

The following information has been given:

United States
Eastern Time

Therefore TZ='America/New_York' will be used.
Local time is now: Thu Aug 2 17:33:17 EDT 2012.
Universal Time is now: Thu Aug 2 21:33:17 UTC 2012.
Is the above information OK?
1) Yes
2) No
#? 1
Your Current locale settings are:

    LANG="POSIX"

The recommended settings for your locale are :
    LANG="en_US.utf8"
    LC_CTYPE="en_US.utf8"

Do you wish to change your locale settings at this time?
1) No
2) Yes - Use recommended settings
3) Yes - Configure locale interactively.

At least this way, the effort required to configure your system into a
very good logical UTF8 default is trivial.

-- 
Kent

perl -e  "print substr( \"edrgmaM  SPA NOcomil.ic\\@tfrken\", \$_ * 3,
3 ) for ( 9,8,0,7,1,6,5,4,3,2 );"

http://kent-fredric.fox.geek.nz



* Re: [gentoo-dev] UTF-8 locale by default
  2012-07-27 17:24         ` Mike Frysinger
  2012-07-27 18:29           ` Pacho Ramos
@ 2012-08-03  5:16           ` Luca Barbato
  2012-08-08  1:58             ` Dan Douglas
  1 sibling, 1 reply; 49+ messages in thread
From: Luca Barbato @ 2012-08-03  5:16 UTC (permalink / raw
  To: gentoo-dev

On 07/27/2012 07:24 PM, Mike Frysinger wrote:
> yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the only 
> real option in my mind for making unicode the default.  any other 
> amalgamations of various locales is ugly as sin.

When do they meet? I'd be fine with a pre-release =P

lu




* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-02 18:32                                   ` Mike Gilbert
@ 2012-08-03 14:59                                     ` Matthew Summers
  2012-08-03 15:47                                       ` Michał Górny
  0 siblings, 1 reply; 49+ messages in thread
From: Matthew Summers @ 2012-08-03 14:59 UTC (permalink / raw
  To: gentoo-dev

On Thu, Aug 2, 2012 at 1:32 PM, Mike Gilbert <floppym@gentoo.org> wrote:
> On Thu, Aug 2, 2012 at 2:21 PM, Diego Elio Pettenò
> <flameeyes@flameeyes.eu> wrote:
>> On 01/08/2012 23:42, Fabian Groffen wrote:
>>> Honestly, if some asian person has whatever charset that I often find in
>>> spam messages, but is not UTF-8, are you then going to tell that person
>>> to switch to UTF-8 to get those python packages emerged?  I hope not.
>>
>> Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
>> available, but doesn't set it by default -> Python stuff fails to build
>> or test -> not going to be fixed with "change your locale" reasoning.
>>
>> Is it mental? Yes.
>> Would I like that to change? Yes.
>> Do I care whether that's through the use of cluebyfour on the Python
>> team or by setting an utf-8 locale by default? Not in the least.
>>
>
> Please apply the cluebyfour to the upstream developers of python and
> python modules. :-)
>
> I do try to fix unicode problems if I run into them. However,
> sometimes it just isn't worth the effort.
>

Python upstream is doing what they think is best in using unicode.

That said, what if we just temporarily set a locale in the ebuild for
running tests and elsewhere? Is this unreasonable or impossible? It
might not be a great solution, this method, since users' stuff will
still break.

Further, I support the use of C.UTF-8 when it is ready. It seems like
the lowest common denominator to me.


-- 
Matthew W. Summers
Gentoo Foundation Inc.



* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-03 14:59                                     ` Matthew Summers
@ 2012-08-03 15:47                                       ` Michał Górny
  2012-08-03 16:54                                         ` Alexis Ballier
  0 siblings, 1 reply; 49+ messages in thread
From: Michał Górny @ 2012-08-03 15:47 UTC (permalink / raw
  To: gentoo-dev; +Cc: quantumsummers


On Fri, 3 Aug 2012 09:59:42 -0500
Matthew Summers <quantumsummers@gentoo.org> wrote:

> On Thu, Aug 2, 2012 at 1:32 PM, Mike Gilbert <floppym@gentoo.org>
> wrote:
> > On Thu, Aug 2, 2012 at 2:21 PM, Diego Elio Pettenò
> > <flameeyes@flameeyes.eu> wrote:
> >> On 01/08/2012 23:42, Fabian Groffen wrote:
> >>> Honestly, if some asian person has whatever charset that I often
> >>> find in spam messages, but is not UTF-8, are you then going to
> >>> tell that person to switch to UTF-8 to get those python packages
> >>> emerged?  I hope not.
> >>
> >> Tell that to the Python team I guess. My tinderbox _has_ utf8
> >> locales available, but doesn't set it by default -> Python stuff
> >> fails to build or test -> not going to be fixed with "change your
> >> locale" reasoning.
> >>
> >> Is it mental? Yes.
> >> Would I like that to change? Yes.
> >> Do I care whether that's through the use of cluebyfour on the
> >> Python team or by setting an utf-8 locale by default? Not in the
> >> least.
> >>
> >
> > Please apply the cluebyfour to the upstream developers of python and
> > python modules. :-)
> >
> > I do try to fix unicode problems if I run into them. However,
> > sometimes it just isn't worth the effort.
> >
> 
> Python upstream is doing what they think is best in using unicode.
> 
> That said, what if we just temporarily set a locale in the ebuild for
> running tests and elsewhere? Is this unreasonable or impossible? It
> might not be a great solution, this method, since users' stuff will
> still break.

It is impossible because you can't know which locale a particular
system has available. AFAIK there's no 'it-will-always-work' choice;
unless we're going to enforce generating some common locale, or do very
ugly things.

> 
> Further, I support the use of C.UTF-8 when it is ready. It seems like
> the lowest common denominator to me.

-- 
Best regards,
Michał Górny


* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-03 15:47                                       ` Michał Górny
@ 2012-08-03 16:54                                         ` Alexis Ballier
  2012-08-03 17:09                                           ` Diego Elio Pettenò
  0 siblings, 1 reply; 49+ messages in thread
From: Alexis Ballier @ 2012-08-03 16:54 UTC (permalink / raw
  To: gentoo-dev

On Fri, 3 Aug 2012 17:47:24 +0200
Michał Górny <mgorny@gentoo.org> wrote:
> > Python upstream is doing what they think is best in using unicode.
> > 
> > That said, what if we just temporarily set a locale in the ebuild
> > for running tests and elsewhere? Is this unreasonable or
> > impossible? It might not be a great solution, this method, since
> > users' stuff will still break.
> 
> It is impossible because you can't know which locale a particular
> system has available. AFAIK there's no 'it-will-always-work' choice;
> unless we're going to enforce generating some common locale, or do
> very ugly things.

I don't think anyone will object to enforcing a given locale to be
present, even en_US.UTF-8; people will object if they have to use that
locale.

Maybe locale-gen can even generate it on-the-fly in $T, I don't know.

A.



* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-03 16:54                                         ` Alexis Ballier
@ 2012-08-03 17:09                                           ` Diego Elio Pettenò
  0 siblings, 0 replies; 49+ messages in thread
From: Diego Elio Pettenò @ 2012-08-03 17:09 UTC (permalink / raw
  To: gentoo-dev

On 03/08/2012 09:54, Alexis Ballier wrote:
> I don't think anyone will object to enforcing a given locale to be
> present, even en_US.UTF-8; people will object if they have to use that
> locale.
> 
> Maybe locale-gen can even generate it on-the-fly in $T, I don't know.

Agreed. And there _is_ a way to tell which locales are available:
`locale -a`.
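
A small sketch of how that check could look in shell, assuming the
common spellings of the codeset suffix (utf8 or UTF-8, optionally
followed by an @modifier). first_utf8_locale is a hypothetical helper
name, not an existing tool.

```shell
# first_utf8_locale (hypothetical): read locale names on stdin, one per
# line as `locale -a` prints them, and print the first UTF-8 one.
first_utf8_locale() {
    grep -i -E '\.utf-?8(@.*)?$' | head -n 1
}

# Query the running system; the result is empty if nothing matches or
# if the locale command is unavailable.
loc=$(locale -a 2>/dev/null | first_utf8_locale)
if [ -n "$loc" ]; then
    echo "UTF-8 locale available: $loc"
else
    echo "no UTF-8 locale installed"
fi
```

An ebuild-like caller could then export LC_ALL="$loc" for a test run
only when the variable is non-empty, and skip or fail cleanly otherwise.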

-- 
Diego Elio Pettenò — Flameeyes
flameeyes@flameeyes.eu — http://blog.flameeyes.eu/



* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-03  5:16           ` Luca Barbato
@ 2012-08-08  1:58             ` Dan Douglas
  2012-12-31 17:14               ` Maxim Kammerer
  0 siblings, 1 reply; 49+ messages in thread
From: Dan Douglas @ 2012-08-08  1:58 UTC (permalink / raw
  To: gentoo-dev


On Friday, August 03, 2012 07:16:45 AM Luca Barbato wrote:
> On 07/27/2012 07:24 PM, Mike Frysinger wrote:
> > yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the only
> > real option in my mind for making unicode the default.  any other
> > amalgamations of various locales is ugly as sin.
> 
> When they meet? I'd be fine with a pre-release =P
> 
> lu
> 

2008 TC1 is just finishing up balloting as we speak. If this isn't already in
there you may be in for a long wait. Feel free to subscribe to the
austin-group lists -- it's open to anyone. A calendar with the teleconference
schedule is available.
--
Dan Douglas


* Re: [gentoo-dev] UTF-8 locale by default
  2012-08-08  1:58             ` Dan Douglas
@ 2012-12-31 17:14               ` Maxim Kammerer
  2012-12-31 21:44                 ` Zac Medico
  0 siblings, 1 reply; 49+ messages in thread
From: Maxim Kammerer @ 2012-12-31 17:14 UTC (permalink / raw
  To: gentoo-dev

Hi,

stage3 now includes non-ASCII paths, via app-misc/ca-certificates -- e.g.:
/usr/share/ca-certificates/mozilla/TÜBİTAK_UEKAE_Kök_Sertifika_Hizmet_Sağlayıcısı_-_Sürüm_3.crt

Working with those (e.g., backup) probably requires a UTF-8 locale. Is
this considered acceptable? Did anyone notice?

-- 
Maxim Kammerer
Liberté Linux: http://dee.su/liberte



* Re: [gentoo-dev] UTF-8 locale by default
  2012-12-31 17:14               ` Maxim Kammerer
@ 2012-12-31 21:44                 ` Zac Medico
  0 siblings, 0 replies; 49+ messages in thread
From: Zac Medico @ 2012-12-31 21:44 UTC (permalink / raw
  To: gentoo-dev

On 12/31/2012 09:14 AM, Maxim Kammerer wrote:
> Hi,
> 
> stage3 now includes non-ASCII paths, via app-misc/ca-certificates -- e.g.:
> /usr/share/ca-certificates/mozilla/TÜBİTAK_UEKAE_Kök_Sertifika_Hizmet_Sağlayıcısı_-_Sürüm_3.crt
> 
> Working with those (e.g., backup) probably requires a UTF-8 locale. Is
> this considered acceptable? Did anyone notice?

It's been that way for a very long time (over a year). Since bug #382199
[1], portage uses a constant UTF-8 encoding for all installed files
regardless of the locale, so at least you can count on portage handling
those UTF-8 names even if you don't have a UTF-8 locale configured.

[1] https://bugs.gentoo.org/show_bug.cgi?id=382199
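
As a side illustration (not portage's implementation), the sketch below
shows that plain file tools can create and find a UTF-8 named file even
with the locale forced to C, assuming a filesystem that stores names as
raw bytes, as Linux filesystems generally do. The file name is made up
for the example.

```shell
# Build a name containing the UTF-8 bytes for "o with diaeresis"
# (0xC3 0xB6) via octal escapes, so the example does not depend on the
# encoding of this script itself.
name=$(printf 'K\303\266k.crt')
dir=$(mktemp -d)

# Create and look up the file with the locale forced to C.
LC_ALL=C touch "$dir/$name"
found=$(LC_ALL=C find "$dir" -type f -name '*.crt')

roundtrip=no
if [ "$found" = "$dir/$name" ]; then
    roundtrip=yes
    echo "name round-trips byte-for-byte under LC_ALL=C"
fi

# Count the bytes in the name: 8 bytes for 7 characters.
name_bytes=$(printf '%s' "$name" | LC_ALL=C wc -c | tr -d ' ')
echo "bytes in name: $name_bytes"

rm -r "$dir"
```

What a non-UTF-8 locale changes is mostly how such names are displayed
or decoded by programs that treat them as text, not whether the bytes
can be stored and copied.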
-- 
Thanks,
Zac


