From: Brian Harring
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] Locale check in python_pkg_setup()
Date: Fri, 30 Jul 2010 11:45:18 -0700
Message-ID: <20100730184518.GA32513@hrair>
In-Reply-To: <4C530291.2010100@gentoo.org>
References: <201007300116.43653.Arfrever@gentoo.org> <4C5243C7.70709@gentoo.org> <20100730034827.GC15031@hrair> <4C530291.2010100@gentoo.org>

On Fri, Jul 30, 2010 at 09:49:21AM -0700, "Paweł Hajdan, Jr." wrote:
> On 7/29/10 8:48 PM, Brian Harring wrote:
> > It's basically annoying people into changing to partially
> > sidestep a couple of bugs, instead of fixing the issue- and that's the
> > wrong course of action.
>
> I think that with python earlier than python-3 unicode handling is quite
> complicated, and I'm not surprised there are problems with that.

Encoding handling wasn't that bad under py2k. Py3k just enforces the
boundaries between bytes and text, meaning you can't just skid by.

> Arfrever, does python-3 have the same problem with non-UTF8 locales?

ascii is a subset of utf-8 and ascii is a subset of latin-1; latin-1
and utf-8 aren't compatible in encoded form, however.

What this means is that the same set of bugs I ran down will still go
boom if you have a utf-8 locale and the code in question is dealing
with a latin-1 encoded file.

> Another thing we can consider is making UTF8 the default setup in
> Gentoo. I think most people (including me) don't care whether it's C or
> UTF8 as long as it works.

"As long as it works" in this case means "fix the code" as I've laid
out. Forcing locales to sidestep it leaves the latin-1/utf8
incompatibility to go 'boom'.

Basically, forcing utf8 doesn't "make it work". It reduces the cases
where breakage will show up while leaving those issues still there.
Frankly this is worse: you can't fix those screwups without them
breaking (for better or worse, and preferably breaking in a testcase).
We've got 4 bugs, and only one of them is a semi-complex fix (docutils
needs to require that the html it's fed is utf8 compatible- a valid
enough requirement anyway, since html shouldn't be latin-1, it should
be ascii or utf8).

So.. get fixing, instead of dodging the work imo. ;)

~brian
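
To make the ascii/latin-1/utf-8 point above concrete, here is a small
Python 3 sketch. It is an illustration only, not code taken from any of
the bugs under discussion; the sample string and the helper names
(decodes_as_utf8, read_text, demo) are made up for the example.

```python
import os
import tempfile

# Hypothetical sample text containing one non-ascii character (e-acute).
SAMPLE = "caf\u00e9"

# Pure ascii text encodes to identical bytes under all three encodings.
assert "cafe".encode("ascii") == "cafe".encode("latin-1") == "cafe".encode("utf-8")

# Non-ascii text does not: latin-1 uses one byte for e-acute, utf-8 uses two.
latin1_bytes = SAMPLE.encode("latin-1")  # b'caf\xe9'
utf8_bytes = SAMPLE.encode("utf-8")      # b'caf\xc3\xa9'


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is valid utf-8, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def read_text(path: str, encoding: str) -> str:
    """Hypothetical helper: read a file with an explicitly named encoding
    instead of relying on whatever the current locale implies."""
    with open(path, encoding=encoding) as handle:
        return handle.read()


def demo() -> dict:
    """Write a latin-1 encoded file, then read it back two ways."""
    fd, path = tempfile.mkstemp()
    results = {}
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(latin1_bytes)
        # Reading as utf-8 is what effectively happens when the code
        # trusts a utf-8 locale: this is the 'boom'.
        try:
            read_text(path, "utf-8")
            results["utf8_read_failed"] = False
        except UnicodeDecodeError:
            results["utf8_read_failed"] = True
        # Naming the real encoding works regardless of locale.
        results["latin1_read"] = read_text(path, "latin-1")
    finally:
        os.unlink(path)
    return results


if __name__ == "__main__":
    print(demo())
```

Going the other way fails more quietly: utf8_bytes.decode("latin-1")
raises nothing and just yields mojibake. Either way the code has to
state which encoding the data is in; the locale can't know that.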