From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <500888FA.6060307@gentoo.org>
Date: Fri, 20 Jul 2012 00:23:54 +0200
From: Chí-Thanh Christopher Nguyễn
Reply-to: gentoo-dev@lists.gentoo.org
MIME-Version: 1.0
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] UTF-8 locale by default
References: <3146937.NEprMvEFLe@mephista>
In-Reply-To: <3146937.NEprMvEFLe@mephista>
Content-Type: text/plain; charset=UTF-8

Sascha Cunz wrote:
> Is there a reason for not using at least en_US.UTF-8 as a "sane" default
> value?

It has been discussed some time ago already.

Setting LANG="en_US.UTF-8" would mess with collation rules, measurement and
paper units, etc.,
which has the potential to make users outside the USA unhappy.

It might make sense to set LC_CTYPE="en_US.UTF-8", but even so,
transliteration may give you unexpected results. To illustrate this, try
running

echo äå | LC_CTYPE=en_US.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8
echo äå | LC_CTYPE=da_DK.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8
echo äå | LC_CTYPE=de_DE.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8

and compare the output.

For the previous discussion, see this thread:
http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml


Best regards,
Chí-Thanh Christopher Nguyễn