Message-ID: <5202407F.3020107@fastmail.co.uk>
Date: Wed, 07 Aug 2013 13:41:35 +0100
From: Kerin Millar
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] export LC_CTYPE=en_US.UTF-8
References: <5200F440.7040002@fastmail.co.uk> <17B75555-DFBB-4FDB-A0F5-A236E7825738@stellar.eclipse.co.uk>
In-Reply-To: <17B75555-DFBB-4FDB-A0F5-A236E7825738@stellar.eclipse.co.uk>

On 06/08/2013 23:42, Stroller wrote:
>
> On 6 August 2013, at 14:04, Kerin Millar wrote:
>> ...
>> If undefined, the value of LC_COLLATE is inherited from LANG. I'm not sure that overriding it is particularly useful nowadays but it doesn't hurt.
>
> It's been a couple of years since I looked into this, but I'm given to believe that LANG should set all LC_ variables correctly, and that overriding them is frowned upon.

As has been mentioned, there are valid reasons to want to override the collation. Here is a concrete example:

https://lists.gnu.org/archive/html/bug-gnu-utils/2003-08/msg00537.html

Strictly speaking, grep is correct to behave that way but it can be confounding. In an ideal world, everyone would be using named classes instead of ranges in their regular expressions but it's not an ideal world.

These days, grep no longer exhibits this characteristic in Gentoo. Nevertheless, it serves as a valid example of how collations for UTF-8 locales can be a liability.

Of the other distros, Arch Linux also defined LC_COLLATE=C although I understand that they have just recently stopped doing that.
On a production system, I would still be inclined to use it for reasons of safety. For that matter, some people refuse to use UTF-8 at all on the grounds of security; the handling of variable-width encodings continues to be an effective bug inducer.

> I had to do this myself because, due to a bug, the en_GB time formatting failed to display am or pm. I believe this should be fixed now.

Presumably:

a) LANG was defined inappropriately
b) LANG was defined appropriately but LC_TIME was defined otherwise
c) LC_ALL was defined, trumping all

I would definitely not advise doing any of these things.

--Kerin
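
For anyone following along, the precedence behind cases (a) to (c) can be sketched in a few lines of Python. This is only an illustration of the lookup order as I understand POSIX to describe it (LC_ALL first, then the specific LC_* variable, then LANG); the function name effective_locale is made up for this example, and the final fallback to "C" is an assumption, since the default is formally implementation-defined.

```python
def effective_locale(category, env):
    """Return the locale name that would apply to `category` (e.g. "LC_TIME").

    `env` is a plain dict standing in for the process environment.
    Hypothetical helper for illustration only; it does not call setlocale().
    """
    # LC_ALL trumps everything, then the category's own variable, then LANG.
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name, "")
        if value:  # an empty value is treated the same as an unset one
            return value
    # Assumption: fall back to "C" when nothing is set.
    return "C"


if __name__ == "__main__":
    # LC_COLLATE undefined: inherited from LANG.
    print(effective_locale("LC_COLLATE", {"LANG": "en_GB.UTF-8"}))
    # LC_COLLATE=C overrides LANG for collation only.
    print(effective_locale("LC_COLLATE", {"LANG": "en_GB.UTF-8", "LC_COLLATE": "C"}))
    # Case (c): LC_ALL defined, trumping all.
    print(effective_locale("LC_TIME", {"LANG": "en_GB.UTF-8",
                                       "LC_TIME": "en_US.UTF-8",
                                       "LC_ALL": "C"}))
```

Feeding it the output of `env` from an affected shell should show which variable is actually deciding a given category.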