public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] [PATCH 0/2] glep-0068: Stricten the XML format
@ 2022-10-08  6:40 Michał Górny
  2022-10-08  6:40 ` [gentoo-dev] [PATCH 1/2] glep-0068: Clarify and restrict XML data format Michał Górny
  2022-10-08  6:40 ` [gentoo-dev] [PATCH 2/2] glep-0068: Indicate that unknown elements should be ignored Michał Górny
  0 siblings, 2 replies; 3+ messages in thread
From: Michał Górny @ 2022-10-08  6:40 UTC
  To: gentoo-dev; +Cc: Michał Górny

Hi,

The spec is a bit lax about which XML features are allowed.  However, we
don't really expect people to use fancy features like custom entities,
XInclude, etc.  Let's formally tighten the spec to disallow anything
remote or potentially dangerous, so that implementations are at least
protected from the most common XML security problems.

While at it, let's make it clear that while we don't permit elements
outside the spec in metadata.xml files, we may add new elements or
attributes in future versions.

I'm not sure whether we should be increasing the version number here.
On one hand, the change roughly matches the original intent (i.e. no
metadata.xml files should be broken by it, and implementations should not
have been processing external DTDs or anything like that anyway).
On the other, technically speaking the new version is more restrictive
than the old one, so a major version bump would be correct.

WDYT?


Michał Górny (2):
  glep-0068: Clarify and restrict XML data format
  glep-0068: Indicate that unknown elements should be ignored

 glep-0068.rst | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

-- 
2.38.0




* [gentoo-dev] [PATCH 1/2] glep-0068: Clarify and restrict XML data format
  2022-10-08  6:40 [gentoo-dev] [PATCH 0/2] glep-0068: Stricten the XML format Michał Górny
@ 2022-10-08  6:40 ` Michał Górny
  2022-10-08  6:40 ` [gentoo-dev] [PATCH 2/2] glep-0068: Indicate that unknown elements should be ignored Michał Górny
  1 sibling, 0 replies; 3+ messages in thread
From: Michał Górny @ 2022-10-08  6:40 UTC
  To: gentoo-dev; +Cc: Michał Górny

Explicitly specify XML 1.0 and link to the specification.  Forbid
"external markup declarations" and DTD processing to guard against
common XML attacks.
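
For illustration only (not part of the patch), here is a minimal sketch
of how a consumer might follow this rule using Python's standard
library.  It assumes the stock expat bindings, which do not fetch an
external DTD unless an ExternalEntityRefHandler is installed.  Rejecting
entity declarations outright is a conservative choice made for this
sketch, not something the GLEP wording requires.  The names
parse_metadata and ForbiddenXMLFeature are hypothetical.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


class ForbiddenXMLFeature(ValueError):
    """Raised when a file declares entities (hypothetical name)."""


def parse_metadata(data: bytes) -> ET.Element:
    """Parse metadata.xml content without fetching or using a DTD.

    A DOCTYPE that merely references a DTD is tolerated and ignored.
    No ExternalEntityRefHandler is installed, so expat never loads
    external markup.  Entity declarations abort the parse; this is an
    assumption of the sketch, stricter than simply ignoring them.
    """
    builder = ET.TreeBuilder()
    parser = xml.parsers.expat.ParserCreate()

    def reject_entity_decl(*_args):
        raise ForbiddenXMLFeature("entity declarations are not accepted")

    parser.EntityDeclHandler = reject_entity_decl
    # Feed element and text events straight into an ElementTree builder.
    parser.StartElementHandler = builder.start
    parser.EndElementHandler = builder.end
    parser.CharacterDataHandler = builder.data
    parser.Parse(data, True)
    return builder.close()
```

Calling parse_metadata() on a file whose DOCTYPE only points at a DTD
returns the root element as usual, while a file that declares its own
entities raises ForbiddenXMLFeature before any expansion takes place.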

Signed-off-by: Michał Górny <mgorny@gentoo.org>
---
 glep-0068.rst | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/glep-0068.rst b/glep-0068.rst
index 78ac7ea..d3e3611 100644
--- a/glep-0068.rst
+++ b/glep-0068.rst
@@ -6,8 +6,8 @@ Type: Standards Track
 Status: Final
 Version: 1.2
 Created: 2016-03-14
-Last-Modified: 2022-05-22
-Post-History: 2016-03-16, 2018-02-20, 2022-05-22
+Last-Modified: 2022-10-07
+Post-History: 2016-03-16, 2018-02-20, 2022-05-22, 2022-10-07
 Content-Type: text/x-rst
 Requires: 67
 Replaces: 34, 46, 56
@@ -59,10 +59,14 @@ Metadata files
 --------------
 
 This specification provides two kinds of metadata files: category metadata
-files and package metadata files. Both kinds of files use XML file format
-with structure defined in this GLEP. The XML structure does not use
-a namespace and must not contain any elements outside the scope of this
-specification.
+files and package metadata files. Both kinds of files use the XML 1.0 file
+format [#XML10]_. They must not use external markup declarations, as defined
+in the XML specification. While they may reference or include a DTD, the parser
+must not fetch or process it.
+
+The data structure of metadata files is defined in this GLEP. The elements
+and attributes do not use namespaces. Conforming files must not contain
+any elements or attributes that are not defined in this specification.
 
 Category metadata files are named ``metadata.xml`` and located inside category
 directories in an ebuild repository. Their structure is described
@@ -516,6 +520,9 @@ References
 .. [#METADATA-DTD] The original metadata.dtd file
    https://gitweb.gentoo.org/data/dtd.git/tree/metadata.dtd?id=a908a93b5afe295359e0a01814c9bef8b5268bcd
 
+.. [#XML10] Extensible Markup Language (XML) 1.0 (Fifth Edition)
+   https://www.w3.org/TR/xml/
+
 .. [#BCP-47] BCP 47: "Tags for identifying languages",
    https://tools.ietf.org/rfc/bcp/bcp47.txt
 
-- 
2.38.0




* [gentoo-dev] [PATCH 2/2] glep-0068: Indicate that unknown elements should be ignored
  2022-10-08  6:40 [gentoo-dev] [PATCH 0/2] glep-0068: Stricten the XML format Michał Górny
  2022-10-08  6:40 ` [gentoo-dev] [PATCH 1/2] glep-0068: Clarify and restrict XML data format Michał Górny
@ 2022-10-08  6:40 ` Michał Górny
  1 sibling, 0 replies; 3+ messages in thread
From: Michał Górny @ 2022-10-08  6:40 UTC
  To: gentoo-dev; +Cc: Michał Górny

As originally stated, the GLEP did not permit extending the format.
Let's limit that requirement to conforming files and indicate that
parsers should ignore unknown (i.e. future) elements.
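
For illustration only (not part of the patch), a tolerant reader could
be sketched as below.  KNOWN_PKG_ELEMENTS is a hypothetical, deliberately
incomplete subset of the element names used in package metadata; a real
implementation would take the full list from the GLEP.  The helper name
known_children is likewise an assumption of this sketch.

```python
import xml.etree.ElementTree as ET
from typing import List

# Illustrative subset only; not the complete list from the GLEP.
KNOWN_PKG_ELEMENTS = frozenset(
    {"maintainer", "longdescription", "upstream", "use"}
)


def known_children(root: ET.Element) -> List[ET.Element]:
    """Return the children this reader understands, in document order.

    Elements with unrecognized names are skipped silently rather than
    treated as errors, so files written for a later version of the
    format can still be read.
    """
    return [child for child in root if child.tag in KNOWN_PKG_ELEMENTS]


SAMPLE = """\
<pkgmetadata>
  <longdescription>Example package</longdescription>
  <some-future-element>ignored by this reader</some-future-element>
</pkgmetadata>
"""

if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    for child in known_children(root):
        print(child.tag, child.text)
```

The same filtering idea would apply to attributes: read the ones the
implementation knows by name and leave any others untouched.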

Signed-off-by: Michał Górny <mgorny@gentoo.org>
---
 glep-0068.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/glep-0068.rst b/glep-0068.rst
index d3e3611..45ca30f 100644
--- a/glep-0068.rst
+++ b/glep-0068.rst
@@ -67,6 +67,8 @@ must not fetch or process it.
 The data structure of metadata files is defined in this GLEP. The elements
 and attributes do not use namespaces. Conforming files must not contain
 any elements or attributes that are not defined in this specification.
+However, parsers should ignore any unknown elements or attributes in order
+to permit future extension.
 
 Category metadata files are named ``metadata.xml`` and located inside category
 directories in an ebuild repository. Their structure is described
-- 
2.38.0



