From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by finch.gentoo.org (Postfix) with ESMTPS id 76D68139084
    for ; Sat, 25 Nov 2017 20:49:49 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
    by pigeon.gentoo.org (Postfix) with SMTP id AB91CE0E90;
    Sat, 25 Nov 2017 20:49:40 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183])
    (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
    (No client certificate requested)
    by pigeon.gentoo.org (Postfix) with ESMTPS id 867A3E0E8C
    for ; Sat, 25 Nov 2017 20:49:40 +0000 (UTC)
Received: from oystercatcher.gentoo.org (unknown [IPv6:2a01:4f8:202:4333:225:90ff:fed9:fc84])
    (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
    (No client certificate requested)
    by smtp.gentoo.org (Postfix) with ESMTPS id BB93D33FE2A
    for ; Sat, 25 Nov 2017 20:49:39 +0000 (UTC)
Received: from localhost.localdomain (localhost [IPv6:::1])
    by oystercatcher.gentoo.org (Postfix) with ESMTP id 44A4CA78C
    for ; Sat, 25 Nov 2017 20:49:36 +0000 (UTC)
From: "Michał Górny"
To: gentoo-commits@lists.gentoo.org
Content-Transfer-Encoding: 8bit
Content-type: text/plain; charset=UTF-8
Reply-To: gentoo-dev@lists.gentoo.org, "Michał Górny"
Message-ID: <1511642957.1b4daf535fc27f6ca28219ca9b71a9b9ab5d775b.mgorny@gentoo>
Subject: [gentoo-commits] data/glep:master commit in: /
X-VCS-Repository: data/glep
X-VCS-Files: glep-0074.rst
X-VCS-Directories: /
X-VCS-Committer: mgorny
X-VCS-Committer-Name: Michał Górny
X-VCS-Revision: 1b4daf535fc27f6ca28219ca9b71a9b9ab5d775b
X-VCS-Branch: master
Date: Sat, 25 Nov 2017 20:49:36 +0000 (UTC)
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-commits@lists.gentoo.org
X-Archives-Salt: 7458c986-eb4c-423b-9474-7bb47a771714
X-Archives-Hash: abccfbd482db9b1349ee773082cd03ed

commit:     1b4daf535fc27f6ca28219ca9b71a9b9ab5d775b
Author:     Michał Górny gentoo org>
AuthorDate: Thu Nov 23 18:44:54 2017 +0000
Commit:     Michał Górny gentoo org>
CommitDate: Sat Nov 25 20:49:17 2017 +0000
URL:        https://gitweb.gentoo.org/data/glep.git/commit/?id=1b4daf53

glep-0074: Make extended filename encoding optional

 glep-0074.rst | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/glep-0074.rst b/glep-0074.rst
index 6db6caa..5270b7a 100644
--- a/glep-0074.rst
+++ b/glep-0074.rst
@@ -142,8 +142,15 @@ corresponding to valid UTF-8 code points excluding the backwards slash
 (``\``) and characters classified as control characters and whitespace
 in the current version of the Unicode standard [#UNICODE]_.
 
-Any of the excluded characters that are present in path must be encoded
-using one of the following escape sequences:
+The implementation can optionally support extended filename encoding
+to support those paths. If the encoding is not supported,
+the implementation must reject directories containing any files using
+non-compliant names, as well as Manifest files whose filename field
+contains such filenames.
+
+If the encoding is supported, then all of the excluded characters that
+are present in path must be encoded using one of the following escape
+sequences:
 
 - characters in the ``U+0000`` to ``U+007F`` range can be encoded
   as ``\xHH`` where ``HH`` specifies the zero-padded, hexadecimal
@@ -615,6 +622,13 @@ by attempting to locate the size field and take everything before it
 as filename. This was terribly fragile and even if it worked, it would
 solve the problem only partially.
 
+To preserve compatibility with the current implementations and given
+that all of the listed characters are not allowed for the foreseeable
+Gentoo uses, the extended encoding support is optional. If such support
+is not provided, the implementation must unconditionally reject any
+such files. Ignoring them implicitly would be confusing, and it is
+not possible to use them in explicit ``IGNORE`` entries.
+
 The character encoding method provides means to overcome the character
 restrictions to extend the tool usability beyond immediate Gentoo uses.
 The backslash escape form based on Python unicode strings is used
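
As an illustration of the two behaviours the patch describes, here is a minimal Python sketch, not taken from any existing implementation. All function names are hypothetical. The test for "control characters and whitespace" is approximated with Unicode general categories plus ``str.isspace()``, which may not match the Unicode standard's definition exactly. Only the ``\xHH`` form shown in the hunk is handled, and the use of uppercase hex digits is an assumption.

```python
import unicodedata

# Assumption: these general categories approximate "control characters
# and whitespace"; the GLEP refers to the Unicode standard itself.
_EXCLUDED_CATEGORIES = {"Cc", "Zs", "Zl", "Zp"}


def is_excluded(ch: str) -> bool:
    """Return True for a backslash, control character, or whitespace."""
    return (
        ch == "\\"
        or ch.isspace()
        or unicodedata.category(ch) in _EXCLUDED_CATEGORIES
    )


def check_plain_filename(path: str) -> None:
    """Behaviour without extended encoding: reject non-compliant names."""
    bad = [ch for ch in path if is_excluded(ch)]
    if bad:
        raise ValueError(f"non-compliant filename {path!r}: {bad!r}")


def encode_ascii_escapes(path: str) -> str:
    """Behaviour with extended encoding, limited to the U+0000..U+007F form.

    Excluded characters in that range become a backslash, 'x', and two
    hex digits. Other escape forms are outside this sketch, so excluded
    characters above U+007F raise an error here.
    """
    out = []
    for ch in path:
        if not is_excluded(ch):
            out.append(ch)
        elif ord(ch) <= 0x7F:
            out.append("\\x{:02X}".format(ord(ch)))
        else:
            raise ValueError(f"no escape form in this sketch for {ch!r}")
    return "".join(out)
```

An implementation without extended encoding support would call something like ``check_plain_filename`` on every directory entry and every Manifest filename field and abort on the first error, rather than skipping the offending file.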