public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
@ 2019-08-06 21:29 Michał Górny
  2019-08-10 20:49 ` William Hubbs
  2019-08-17  7:06 ` [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org (v2) Michał Górny
  0 siblings, 2 replies; 22+ messages in thread
From: Michał Górny @ 2019-08-06 21:29 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 3618 bytes --]

Hi,

Some time ago William asked me to move UID/GID assignments from wiki [1]
to something more accessible.  I've finally gotten around to drafting
something, and I'd like to hear your comments about it.  The idea is to
keep a file in a whitespace-separated record format in the api.gentoo.org repo.

This is mostly inspired by Fedora's format, specifically the ability to
specify both UID and GID for matching user/group on the same line.
I've also decided not to attempt to specify disjoint Linux and FreeBSD
users/groups on a single line.  Instead, they are specified separately
and defined by providers (as suggested by Ulrich).

A sample, along with a big comment explaining the file format, is below.

[1] https://wiki.gentoo.org/wiki/Project:Quality_Assurance/UID_GID_Assignment

=== PROPOSED FORMAT ===

# vim:se ts=8 sts=8 tw=0 :
#
# This file provides a registry of active, requested and historical UID
# and GID assignments in the Gentoo repository.  Each line represents
# a single record, with the first line providing field headers.  Empty
# lines and comments are ignored.  Fields are separated by whitespace,
# with the last (note) field extending to end of line.  Note that
# additional fields may be added in the future.
#
# A single record can represent a user, a group or a combination of both
# using the same name.  The same user/group/UID/GID can be repeated
# in multiple records to reflect reality.  If you can't express something
# with a single record, split it.
#
# NAME: user/group name
# UID: UID assigned to the user, or '-' if only group is defined
# GID: GID assigned to the group, or '-' if only user is defined
# PROVIDER: keyword identifying provider of user/group, see below
# NOTES: human-readable notes about the assignment, optional
#
# Valid values for PROVIDER are:
# - acct: acct-user/${NAME} and/or acct-group/${NAME} packages
# - baselayout: user/group defined in passwd/group files from baselayout
# - baselayout-fbsd: same as baselayout but on FreeBSD system
# - historical: historical UID/GID assignment, no longer used nowadays
# - requested: assignment requested on the ml, not yet pushed
# - reserved: special reserved range
# - user.eclass: user/group created via user.eclass directly in package
#
# Use PROVIDERs that apply best at the moment.  For example, if user
# was migrated from user.eclass to acct-user/ with the same UID, just
# list the latter.
#
# Please keep the list sorted by UID, GID, PROVIDER, NAME (in order).

NAME		UID	GID	PROVIDER	NOTES
root		0	0	baselayout
root		0	-	baselayout-fbsd
toor		0	-	baselayout-fbsd	FreeBSD lists both names in passwd
wheel		-	0	baselayout-fbsd
bin		1	1	baselayout
daemon		1	1	baselayout-fbsd
daemon		2	2	baselayout
operator	2	5	baselayout-fbsd
adm		3	4	baselayout
bin		3	7	baselayout-fbsd
sys		-	3	baselayout
sys		-	3	baselayout-fbsd
lp		4	7	baselayout
tty		4	4	baselayout-fbsd
sync		5	-	baselayout
kmem		5	2	baselayout-fbsd
tty		-	5	baselayout
shutdown	6	-	baselayout
mail		6	6	baselayout-fbsd
disk		-	6	baselayout
halt		7	-	baselayout
games		7	-	baselayout-fbsd
lp		-	7	acct
#...
ftp		21	21	acct		removed from baselayout svn in r1475
sshd		22	22	user.eclass	removed from baselayout svn in r1476
at		25	25	user.eclass	removed from baselayout svn in r1479
smtpd		25	25	user.eclass
#...
rpc		111	111	historical	used by net-nds/portmap, removed in 36e60cce
#...
portage		250	250	baselayout
#...
(reserved)	500-999		500-999		reserved
(reserved)	1000-60000	1000-60000	reserved	UID_MIN-UID_MAX/GID_MIN-GID_MAX
#...
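
For illustration only (this is not part of the proposal): a minimal Python sketch of how a consumer might read the format described in the comment above. The names `Record` and `parse_registry` are invented for this example, and IDs are kept as strings on the assumption that ranges such as 500-999 must survive unchanged.

```python
from typing import Iterator, NamedTuple, Optional


class Record(NamedTuple):
    name: str
    uid: Optional[str]  # None when the file has '-' (group only)
    gid: Optional[str]  # None when the file has '-' (user only)
    provider: str
    notes: str


def parse_registry(text: str) -> Iterator[Record]:
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # empty lines and comments are ignored
        if header is None:
            header = line.split()  # first record line provides field names
            continue
        # Split on runs of whitespace, but stop early so that the last
        # (notes) field keeps its internal spaces.
        fields = line.split(None, len(header) - 1)
        row = dict(zip(header, fields))
        yield Record(
            name=row["NAME"],
            uid=None if row["UID"] == "-" else row["UID"],
            gid=None if row["GID"] == "-" else row["GID"],
            provider=row["PROVIDER"],
            notes=row.get("NOTES", ""),
        )


SAMPLE = (
    "NAME\t\tUID\tGID\tPROVIDER\tNOTES\n"
    "root\t\t0\t0\tbaselayout\n"
    "toor\t\t0\t-\tbaselayout-fbsd\tFreeBSD lists both names in passwd\n"
    "lp\t\t-\t7\tacct\n"
)
records = list(parse_registry(SAMPLE))
```

Because fields are looked up through the header line, this sketch does not depend on the column order.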

-- 
Best regards,
Michał Górny


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 618 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-06 21:29 [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org Michał Górny
@ 2019-08-10 20:49 ` William Hubbs
  2019-08-10 20:51   ` Michał Górny
                     ` (2 more replies)
  2019-08-17  7:06 ` [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org (v2) Michał Górny
  1 sibling, 3 replies; 22+ messages in thread
From: William Hubbs @ 2019-08-10 20:49 UTC (permalink / raw
  To: gentoo-dev; +Cc: mgorny

[-- Attachment #1: Type: text/plain, Size: 1155 bytes --]

On Tue, Aug 06, 2019 at 11:29:50PM +0200, Michał Górny wrote:
> Hi,
> 
> Some time ago William asked me to move UID/GID assignments from wiki [1]
> to something more accessible.  I've finally gotten around to draft
> something, and I'd like to hear your comments about it.  The idea is to
> keep a whitespace-separated record format file in api.gentoo.org repo.
> 
> This is mostly inspired by Fedora's format, specifically the ability to
> specify both UID and GID for matching user/group on the same line.
> I've also decided not to attempt to specify disjoint Linux and FreeBSD
> users/groups on a single line.  Instead, they are specified separately
> and defined by providers (as suggested by Ulrich).
> 
> Sample, along with big comment explaining the file format, below.

I'm not sure how I feel about whitespace delimiters for this. With no
further specification, it would make these two lines completely valid:

root 0 0 baselayout
bin 1				1				baselayout					some notes here
 
 I propose using a colon like we do in /etc/passwd:

 root:0:0:baselayout:
 bin:1:1:baselayout:some notes go here
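
 A rough sketch of what reading the colon-separated variant could look like in Python. The function name `parse_colon_line` is made up for this example, and letting the notes field contain further colons is an assumption, not something stated above.

```python
def parse_colon_line(line: str) -> dict:
    # Split at most four times so that a colon inside the notes
    # (an assumed possibility) does not create extra fields.
    name, uid, gid, provider, notes = line.rstrip("\n").split(":", 4)
    return {
        "name": name,
        "uid": uid,
        "gid": gid,
        "provider": provider,
        "notes": notes,
    }


root = parse_colon_line("root:0:0:baselayout:")
bin_ = parse_colon_line("bin:1:1:baselayout:some notes go here")
```

 Note that the trailing colon is required even when the notes are empty, otherwise the unpacking above fails.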

 Thoughts?

 William

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-10 20:49 ` William Hubbs
@ 2019-08-10 20:51   ` Michał Górny
  2019-08-10 22:44     ` William Hubbs
  2019-08-10 22:40   ` Ulrich Mueller
  2019-08-11  0:41   ` Matt Turner
  2 siblings, 1 reply; 22+ messages in thread
From: Michał Górny @ 2019-08-10 20:51 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1395 bytes --]

On Sat, 2019-08-10 at 15:49 -0500, William Hubbs wrote:
> On Tue, Aug 06, 2019 at 11:29:50PM +0200, Michał Górny wrote:
> > Hi,
> > 
> > Some time ago William asked me to move UID/GID assignments from wiki [1]
> > to something more accessible.  I've finally gotten around to draft
> > something, and I'd like to hear your comments about it.  The idea is to
> > keep a whitespace-separated record format file in api.gentoo.org repo.
> > 
> > This is mostly inspired by Fedora's format, specifically the ability to
> > specify both UID and GID for matching user/group on the same line.
> > I've also decided not to attempt to specify disjoint Linux and FreeBSD
> > users/groups on a single line.  Instead, they are specified separately
> > and defined by providers (as suggested by Ulrich).
> > 
> > Sample, along with big comment explaining the file format, below.
> 
> I'm not sure how I feel about whitespace delimiters for this. With no
> further specification, it would make these two lines completely valid:
> 
> root 0 0 baselayout
> bin 1				1				baselayout					some notes here
>  
>  I propose using a colon like we do in /etc/passwd:
> 
>  root:0:0:baselayout:
>  bin:1:1:baselayout:some notes go here
> 
>  Thoughts?
> 

Worksforme.  Do we allow extra whitespace in fields where it doesn't
matter for alignment?

-- 
Best regards,
Michał Górny


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 618 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-10 20:49 ` William Hubbs
  2019-08-10 20:51   ` Michał Górny
@ 2019-08-10 22:40   ` Ulrich Mueller
  2019-08-10 22:53     ` William Hubbs
  2019-08-11  0:41   ` Matt Turner
  2 siblings, 1 reply; 22+ messages in thread
From: Ulrich Mueller @ 2019-08-10 22:40 UTC (permalink / raw
  To: gentoo-dev; +Cc: mgorny

[-- Attachment #1: Type: text/plain, Size: 498 bytes --]

>>>>> On Sat, 10 Aug 2019, William Hubbs wrote:
 
>  I propose using a colon like we do in /etc/passwd:

>  root:0:0:baselayout:
>  bin:1:1:baselayout:some notes go here

Please no. I thought the purpose of this exercise was to have a format
that is human readable, and not something with strange separators like
":" or "|" (as we have in the wiki)?

There cannot be any whitespace in name, IDs, or provider.
So what would be the problem with using whitespace as separators?

Ulrich

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 487 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-10 20:51   ` Michał Górny
@ 2019-08-10 22:44     ` William Hubbs
  0 siblings, 0 replies; 22+ messages in thread
From: William Hubbs @ 2019-08-10 22:44 UTC (permalink / raw
  To: gentoo-dev; +Cc: mgorny

[-- Attachment #1: Type: text/plain, Size: 1895 bytes --]

On Sat, Aug 10, 2019 at 10:51:52PM +0200, Michał Górny wrote:
> On Sat, 2019-08-10 at 15:49 -0500, William Hubbs wrote:
> > On Tue, Aug 06, 2019 at 11:29:50PM +0200, Michał Górny wrote:
> > > Hi,
> > > 
> > > Some time ago William asked me to move UID/GID assignments from wiki [1]
> > > to something more accessible.  I've finally gotten around to draft
> > > something, and I'd like to hear your comments about it.  The idea is to
> > > keep a whitespace-separated record format file in api.gentoo.org repo.
> > > 
> > > This is mostly inspired by Fedora's format, specifically the ability to
> > > specify both UID and GID for matching user/group on the same line.
> > > I've also decided not to attempt to specify disjoint Linux and FreeBSD
> > > users/groups on a single line.  Instead, they are specified separately
> > > and defined by providers (as suggested by Ulrich).
> > > 
> > > Sample, along with big comment explaining the file format, below.
> > 
> > > I'm not sure how I feel about whitespace delimiters for this. With no
> > further specification, it would make these two lines completely valid:
> > 
> > root 0 0 baselayout
> > bin 1				1				baselayout					some notes here
> >  
> >  I propose using a colon like we do in /etc/passwd:
> > 
> >  root:0:0:baselayout:
> >  bin:1:1:baselayout:some notes go here
> > 
> >  Thoughts?
> > 
> 
> Worksforme.  Do we allow extra whitespace in fields where it doesn't
> matter for alignment?

I'm not sure I follow your question.

I think alignment doesn't matter at all since things on api.gentoo.org
are meant to be "data files for automated processing and external
services". 
So, whether fields have spaces in them is more about whether it makes
sense for the field to have spaces. Thinking about it that way, the only
field that should have spaces in it is the notes field.

William


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-10 22:40   ` Ulrich Mueller
@ 2019-08-10 22:53     ` William Hubbs
  2019-08-11  6:38       ` Ulrich Mueller
  0 siblings, 1 reply; 22+ messages in thread
From: William Hubbs @ 2019-08-10 22:53 UTC (permalink / raw
  To: gentoo-dev; +Cc: ulm, mgorny

[-- Attachment #1: Type: text/plain, Size: 982 bytes --]

On Sun, Aug 11, 2019 at 12:40:04AM +0200, Ulrich Mueller wrote:
> >>>>> On Sat, 10 Aug 2019, William Hubbs wrote:
>  
> >  I propose using a colon like we do in /etc/passwd:
> 
> >  root:0:0:baselayout:
> >  bin:1:1:baselayout:some notes go here
> 
> Please no. I thought the purpose of this exercise was to have a format
> that is human readable, and not something with strange separators like
> ":" or "|" (as we have in the wiki)?
> 
> There cannot be any whitespace in name, IDs, or provider.
> So what would be the problem with using whitespace as separators?

Well, I suppose it wouldn't be a problem if we don't care about
alignment, but that is not really readable either, right?

For example, (completely making up uid/gid numbers), this is totally
valid.

root 0 0 baselayout this is from baselayout
bin 1 1 baselayout this is from baselayout
postgresql 150 150 app this is from postgresql
cockroach 250 250 app this is from cockroach

William


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-10 20:49 ` William Hubbs
  2019-08-10 20:51   ` Michał Górny
  2019-08-10 22:40   ` Ulrich Mueller
@ 2019-08-11  0:41   ` Matt Turner
  2019-08-11 19:48     ` William Hubbs
  2 siblings, 1 reply; 22+ messages in thread
From: Matt Turner @ 2019-08-11  0:41 UTC (permalink / raw
  To: gentoo development, Michał Górny

On Sat, Aug 10, 2019 at 1:49 PM William Hubbs <williamh@gentoo.org> wrote:
>
> On Tue, Aug 06, 2019 at 11:29:50PM +0200, Michał Górny wrote:
> > Hi,
> >
> > Some time ago William asked me to move UID/GID assignments from wiki [1]
> > to something more accessible.  I've finally gotten around to draft
> > something, and I'd like to hear your comments about it.  The idea is to
> > keep a whitespace-separated record format file in api.gentoo.org repo.
> >
> > This is mostly inspired by Fedora's format, specifically the ability to
> > specify both UID and GID for matching user/group on the same line.
> > I've also decided not to attempt to specify disjoint Linux and FreeBSD
> > users/groups on a single line.  Instead, they are specified separately
> > and defined by providers (as suggested by Ulrich).
> >
> > Sample, along with big comment explaining the file format, below.
>
> I'm not sure how I feel about whitespace delimiters for this. With no
> further specification, it would make these two lines completely valid:
>
> root 0 0 baselayout
> bin 1                           1                               baselayout                                      some notes here

That's true, but I think with editor settings embedded in the file and
visual pattern of vertical alignment we shouldn't expect a bunch of
problems.

Michał's original proposal is very readable. Just because it's a "data
file" I don't think we should go out of our way to limit readability.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-10 22:53     ` William Hubbs
@ 2019-08-11  6:38       ` Ulrich Mueller
  0 siblings, 0 replies; 22+ messages in thread
From: Ulrich Mueller @ 2019-08-11  6:38 UTC (permalink / raw
  To: gentoo-dev; +Cc: ulm, mgorny

>>>>> On Sun, 11 Aug 2019, William Hubbs wrote:

> Well, I suppose it wouldn't be a problem if we don't care about
> alignment, but that is not readable either really right?

> For example, (completely making up uid/gid numbers), this is totally
> valid.

> root 0 0 baselayout this is from baselayout
> bin 1 1 baselayout this is from baselayout
> postgresql 150 150 app this is from postgresql
> cockroach 250 250 app this is from cockroach

Of course the columns would be aligned, as in Michał's original example.
What is the problem you are trying to solve?

Ulrich


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-11  0:41   ` Matt Turner
@ 2019-08-11 19:48     ` William Hubbs
  2019-08-11 22:04       ` Matt Turner
  0 siblings, 1 reply; 22+ messages in thread
From: William Hubbs @ 2019-08-11 19:48 UTC (permalink / raw
  To: gentoo-dev; +Cc: Michał Górny

[-- Attachment #1: Type: text/plain, Size: 2070 bytes --]

On Sat, Aug 10, 2019 at 05:41:56PM -0700, Matt Turner wrote:
> On Sat, Aug 10, 2019 at 1:49 PM William Hubbs <williamh@gentoo.org> wrote:
> >
> > On Tue, Aug 06, 2019 at 11:29:50PM +0200, Michał Górny wrote:
> > > Hi,
> > >
> > > Some time ago William asked me to move UID/GID assignments from wiki [1]
> > > to something more accessible.  I've finally gotten around to draft
> > > something, and I'd like to hear your comments about it.  The idea is to
> > > keep a whitespace-separated record format file in api.gentoo.org repo.
> > >
> > > This is mostly inspired by Fedora's format, specifically the ability to
> > > specify both UID and GID for matching user/group on the same line.
> > > I've also decided not to attempt to specify disjoint Linux and FreeBSD
> > > users/groups on a single line.  Instead, they are specified separately
> > > and defined by providers (as suggested by Ulrich).
> > >
> > > Sample, along with big comment explaining the file format, below.
> >
> > > I'm not sure how I feel about whitespace delimiters for this. With no
> > further specification, it would make these two lines completely valid:
> >
> > root 0 0 baselayout
> > bin 1                           1                               baselayout                                      some notes here
> 
> That's true, but I think with editor settings embedded in the file and
> visual pattern of vertical alignment we shouldn't expect a bunch of
> problems.

We don't honor editor settings in files by default, so this will not
work out of the box (see this in /etc/vim/vimrc).

--- cut here ---
" {{{ Modeline settings
" We don't allow modelines by default. See bug #14088 and bug #73715.
" If you're not concerned about these, you can enable them on a per-user
" basis by adding "set modeline" to your ~/.vimrc file.
set nomodeline
" }}}
--- cut here ---

If we are going to require a modeline, shouldn't we consider allowing
them by default so we can work out of the box instead of having to tweak
our editor settings?

William


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-11 19:48     ` William Hubbs
@ 2019-08-11 22:04       ` Matt Turner
  2019-08-11 22:53         ` William Hubbs
  0 siblings, 1 reply; 22+ messages in thread
From: Matt Turner @ 2019-08-11 22:04 UTC (permalink / raw
  To: gentoo development; +Cc: Michał Górny

On Sun, Aug 11, 2019 at 12:48 PM William Hubbs <williamh@gentoo.org> wrote:
>
> On Sat, Aug 10, 2019 at 05:41:56PM -0700, Matt Turner wrote:
> > On Sat, Aug 10, 2019 at 1:49 PM William Hubbs <williamh@gentoo.org> wrote:
> > >
> > > On Tue, Aug 06, 2019 at 11:29:50PM +0200, Michał Górny wrote:
> > > > Hi,
> > > >
> > > > Some time ago William asked me to move UID/GID assignments from wiki [1]
> > > > to something more accessible.  I've finally gotten around to draft
> > > > something, and I'd like to hear your comments about it.  The idea is to
> > > > keep a whitespace-separated record format file in api.gentoo.org repo.
> > > >
> > > > This is mostly inspired by Fedora's format, specifically the ability to
> > > > specify both UID and GID for matching user/group on the same line.
> > > > I've also decided not to attempt to specify disjoint Linux and FreeBSD
> > > > users/groups on a single line.  Instead, they are specified separately
> > > > and defined by providers (as suggested by Ulrich).
> > > >
> > > > Sample, along with big comment explaining the file format, below.
> > >
> > > > I'm not sure how I feel about whitespace delimiters for this. With no
> > > further specification, it would make these two lines completely valid:
> > >
> > > root 0 0 baselayout
> > > bin 1                           1                               baselayout                                      some notes here
> >
> > That's true, but I think with editor settings embedded in the file and
> > visual pattern of vertical alignment we shouldn't expect a bunch of
> > problems.
>
> We don't honor editor settings in files by default, so this will not
> work out of the box (see this in /etc/vim/vimrc).
>
> --- cut here ---
> " {{{ Modeline settings
> " We don't allow modelines by default. See bug #14088 and bug #73715.
> " If you're not concerned about these, you can enable them on a per-user
> " basis by adding "set modeline" to your ~/.vimrc file.
> set nomodeline
> " }}}
> --- cut here ---
>
> If we are going to require a modeline, shouldn't we consider allowing
> them by default so we can work out of the box instead of having to tweak
> our editor settings?

Perhaps we should. Is this really an important point for you?

I don't think any of that needs to hold up Michał's proposal though.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-11 22:04       ` Matt Turner
@ 2019-08-11 22:53         ` William Hubbs
  2019-08-12  0:11           ` Matt Turner
                             ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: William Hubbs @ 2019-08-11 22:53 UTC (permalink / raw
  To: gentoo-dev; +Cc: Michał Górny

[-- Attachment #1: Type: text/plain, Size: 1039 bytes --]

On Sun, Aug 11, 2019 at 03:04:02PM -0700, Matt Turner wrote:
> On Sun, Aug 11, 2019 at 12:48 PM William Hubbs <williamh@gentoo.org> wrote:
> > If we are going to require a modeline, shouldn't we consider allowing
> > them by default so we can work out of the box instead of having to tweak
> > our editor settings?
> 
> Perhaps we should. Is this really an important point for you?
> 
> I don't think any of that needs to hold up Michał's proposal though.

Let's move away from talking about alignment for a minute.

The reason I responded originally was that his proposal is a bit
vague. It says that whitespace separates fields; that could be spaces
or tabs. Whitespace can also be a completely valid character in
the notes field. So, since he says we can add more fields in the future,
you either have to escape whitespace in the notes field or quote the
field.

Also, he may have used one whitespace character to separate fields in his
example, but the spec allows any number of whitespace characters.

William

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-11 22:53         ` William Hubbs
@ 2019-08-12  0:11           ` Matt Turner
  2019-08-12  2:50           ` Michał Górny
  2019-08-12 17:42           ` Kent Fredric
  2 siblings, 0 replies; 22+ messages in thread
From: Matt Turner @ 2019-08-12  0:11 UTC (permalink / raw
  To: gentoo development; +Cc: Michał Górny

On Sun, Aug 11, 2019 at 3:54 PM William Hubbs <williamh@gentoo.org> wrote:
>
> On Sun, Aug 11, 2019 at 03:04:02PM -0700, Matt Turner wrote:
> > On Sun, Aug 11, 2019 at 12:48 PM William Hubbs <williamh@gentoo.org> wrote:
> > > If we are going to require a modeline, shouldn't we consider allowing
> > > them by default so we can work out of the box instead of having to tweak
> > > our editor settings?
> >
> > Perhaps we should. Is this really an important point for you?
> >
> > I don't think any of that needs to hold up Michał's proposal though.
>
> Let's move away from talking about alignment for a minute.
>
> The reason I responded originally was that his proposal is a bit
> vague. It says that whitespace separates fields; that could be spaces
> or tabs. whitespace can also be a completely valid character in
> the notes field. So, since he says we can add more fields in the future,
> you either have to escape whitespace in the notes field or quote the
> field.
>
> Also, he may have used one whitespace character to separate fields in his
> example, but the spec allows any number of whitespace characters.

Sigh.

The intention is to make it easily machine parseable and easily
readable by humans.

One or more whitespace characters is easy to recognize by a machine,
be they space or tabs. Aligned columns are easily readable by humans.
I think humans are able to follow existing convention when adding new
lines to the file and don't require specific rules like this.

There's nothing unclear or vague in his explanation. Can we please proceed?


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-11 22:53         ` William Hubbs
  2019-08-12  0:11           ` Matt Turner
@ 2019-08-12  2:50           ` Michał Górny
  2019-08-12 16:52             ` Alec Warner
  2019-08-12 17:42           ` Kent Fredric
  2 siblings, 1 reply; 22+ messages in thread
From: Michał Górny @ 2019-08-12  2:50 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1298 bytes --]

On Sun, 2019-08-11 at 17:53 -0500, William Hubbs wrote:
> On Sun, Aug 11, 2019 at 03:04:02PM -0700, Matt Turner wrote:
> > On Sun, Aug 11, 2019 at 12:48 PM William Hubbs <williamh@gentoo.org> wrote:
> > > If we are going to require a modeline, shouldn't we consider allowing
> > > them by default so we can work out of the box instead of having to tweak
> > > our editor settings?
> > 
> > Perhaps we should. Is this really an important point for you?
> > 
> > I don't think any of that needs to hold up Michał's proposal though.
> 
> Let's move away from talking about alignment for a minute.
> 
> The reason I responded originally was that his proposal is a bit
> vague. It says that whitespace separates fields; that could be spaces
> or tabs. whitespace can also be a completely valid character in
> the notes field. So, since he says we can add more fields in the future,
> you either have to escape whitespace in the notes field or quote the
> field.

We can add new fields *in the middle*.  That's why there's an explicit
header line.

> Also, he may have used one whitespace character to separate fields in his
> example, but the spec allows any number of whitespace characters.
> 

awk does not have a problem with that.
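
To illustrate the point in another language: awk's default field splitting treats any run of spaces and tabs as a single separator, and Python's `str.split()` without an explicit separator behaves comparably. The sketch below applies it to the two lines from William's example; the helper name `fields` and the limit of four splits (to keep the notes in one piece) are assumptions made for this illustration.

```python
line_a = "root 0 0 baselayout"
line_b = "bin 1\t\t\t\t1\t\t\t\tbaselayout\t\t\t\t\tsome notes here"


def fields(line: str) -> list:
    # With no separator given, runs of spaces/tabs count as one delimiter.
    # maxsplit=4 leaves the fifth (notes) field untouched.
    return line.split(None, 4)


split_a = fields(line_a)
split_b = fields(line_b)
```

Both lines yield the same first four columns regardless of how many whitespace characters sit between them.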


-- 
Best regards,
Michał Górny


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 618 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-12  2:50           ` Michał Górny
@ 2019-08-12 16:52             ` Alec Warner
  2019-08-12 17:58               ` Kent Fredric
  0 siblings, 1 reply; 22+ messages in thread
From: Alec Warner @ 2019-08-12 16:52 UTC (permalink / raw
  To: Gentoo Dev

[-- Attachment #1: Type: text/plain, Size: 1625 bytes --]

On Sun, Aug 11, 2019 at 7:51 PM Michał Górny <mgorny@gentoo.org> wrote:

> On Sun, 2019-08-11 at 17:53 -0500, William Hubbs wrote:
> > On Sun, Aug 11, 2019 at 03:04:02PM -0700, Matt Turner wrote:
> > > On Sun, Aug 11, 2019 at 12:48 PM William Hubbs <williamh@gentoo.org> wrote:
> > > > If we are going to require a modeline, shouldn't we consider allowing
> > > > them by default so we can work out of the box instead of having to tweak
> > > > our editor settings?
> > >
> > > Perhaps we should. Is this really an important point for you?
> > >
> > > I don't think any of that needs to hold up Michał's proposal though.
> >
> > Let's move away from talking about alignment for a minute.
> >
> > The reason I responded originally was that his proposal is a bit
> > vague. It says that whitespace separates fields; that could be spaces
> > or tabs. whitespace can also be a completely valid character in
> > the notes field. So, since he says we can add more fields in the future,
> > you either have to escape whitespace in the notes field or quote the
> > field.
>
> We can add new fields *in the middle*.  That's why there's explicit
> header line.
>
> > Also, he may have used one whitespace character to separate fields in his
> > example, but the spec allows any number of whitespace characters.
> >
>
> awk does not have a problem with that.
>

I would advise against making new "specs"; use an existing spec instead.
CSV, JSON and YAML are all popular machine-and-people readable
specifications with broad support.

-A


>
>
> --
> Best regards,
> Michał Górny
>
>

[-- Attachment #2: Type: text/html, Size: 2340 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-11 22:53         ` William Hubbs
  2019-08-12  0:11           ` Matt Turner
  2019-08-12  2:50           ` Michał Górny
@ 2019-08-12 17:42           ` Kent Fredric
  2 siblings, 0 replies; 22+ messages in thread
From: Kent Fredric @ 2019-08-12 17:42 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 777 bytes --]

On Sun, 11 Aug 2019 17:53:44 -0500
William Hubbs <williamh@gentoo.org> wrote:

> , since he says we can add more fields in the future,
> you either have to escape whitespace in the notes field or quote the
> field.

This was originally what I was going to comment on when the proposal
first hit the list.

But then I realised all you have to do is define all the non-notes
fields, and any additional fields will be appended to that list, and
notes always come afterwards.

It *does* mean when the new field is updated, the entire file must be
updated immediately for this new field.

But it also means existing decoders that don't know about the new field
will treat the new field as part of the notes.

All in all, I think that turns out to be a feature.
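
A small sketch of that behaviour, under stated assumptions: the `SHELL` column is purely hypothetical (nobody proposed it), and `decode` is an invented helper that reads fields by position, with everything after the known fields treated as notes.

```python
# Fields an older decoder knows about; anything after them is notes.
OLD_FIELDS = ["NAME", "UID", "GID", "PROVIDER"]
# Hypothetical newer layout with an invented SHELL column before the notes.
NEW_FIELDS = OLD_FIELDS + ["SHELL"]


def decode(line: str, fields: list) -> dict:
    # Split once per known field; the remainder (if any) is the notes.
    parts = line.split(None, len(fields))
    record = dict(zip(fields, parts))
    record["NOTES"] = parts[len(fields)] if len(parts) > len(fields) else ""
    return record


line = "ftp\t21\t21\tacct\t/sbin/nologin\tremoved from baselayout svn in r1475"
old_view = decode(line, OLD_FIELDS)
new_view = decode(line, NEW_FIELDS)
```

The old decoder still gets correct values for the fields it knows, and simply sees the new column at the start of its notes. As noted above, this only works if every line in the file gains the new field at the same time.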

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-12 16:52             ` Alec Warner
@ 2019-08-12 17:58               ` Kent Fredric
  2019-08-12 18:20                 ` Alec Warner
  0 siblings, 1 reply; 22+ messages in thread
From: Kent Fredric @ 2019-08-12 17:58 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1927 bytes --]

On Mon, 12 Aug 2019 09:52:40 -0700
Alec Warner <antarus@gentoo.org> wrote:

> CSV, JSON and YAML are both popular machine-and-people readable
> specifications with broad support.

No, not CSV. There isn't really "a spec" for that. Even though there is
a "proposed spec", "CSV editors" and things that emit CSV just make up
their own rules.

The more I know about CSV, the less I want anything to do with it.

In essence, to make CSV viable (or any other delimiter-separated format
viable), you have to locally redefine what that means to a limited
subset of the spec.

For instance, forbid the feature where the first line is the string
"Sep=,"[1], which tools like Excel (may) generate, but which isn't spec
compliant and leads to ... interesting things.

But pretty much you can take it for granted that a '.csv' extension
will make *somebody* make assumptions about the format that aren't true.

Like for instance, is leading/trailing whitespace around delimiters
significant? Spec says yes[2], but implementations may desire it to be
no (and Gentoo probably prefers it not to be significant, for
alignment reasons).
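
As one concrete data point, here is how Python's standard `csv` module treats such whitespace. This is only a sketch of one implementation's behaviour, and the sample row is made up.

```python
import csv
import io

data = "bin, 1, 1, baselayout\n"

# Default dialect: whitespace after the delimiter is part of the field.
strict = next(csv.reader(io.StringIO(data)))

# skipinitialspace=True drops whitespace that follows a delimiter.
lenient = next(csv.reader(io.StringIO(data), skipinitialspace=True))
```

So the same file gives different field values depending on a reader option, which is the kind of ambiguity described above.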

As for JSON/YAML, ... eh... that may be the case for like, 4 line files.

But once you have hundreds of entries, that becomes less true.

And both of those can have "Fun" merge conflict issues due to the
requirements around record delimiters and syntax,

eg: You're using JSON, does your JSON formatter emit every record on
its own line? No? That's going to create annoying merge conflicts.

Does your formatter/decoder support trailing "," ?
No? That's going to introduce problems.

That's why I'd rather a more narrow, less general, domain specific
format, instead of throwing these general tools at the problem.

1: https://en.wikipedia.org/wiki/Comma-separated_values#General_functionality
2: https://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules


* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-12 17:58               ` Kent Fredric
@ 2019-08-12 18:20                 ` Alec Warner
  2019-08-12 18:33                   ` Kent Fredric
  2019-08-12 19:01                   ` Kent Fredric
  0 siblings, 2 replies; 22+ messages in thread
From: Alec Warner @ 2019-08-12 18:20 UTC (permalink / raw
  To: Gentoo Dev

[-- Attachment #1: Type: text/plain, Size: 2790 bytes --]

On Mon, Aug 12, 2019 at 10:58 AM Kent Fredric <kentnl@gentoo.org> wrote:

> On Mon, 12 Aug 2019 09:52:40 -0700
> Alec Warner <antarus@gentoo.org> wrote:
>
> > CSV, JSON and YAML are all popular machine-and-people readable
> > specifications with broad support.
>
> No, not CSV. There isn't really "a spec" for that. Even though there is
> a "proposed spec", "CSV editors" and things that emit CSV just make up
> their own rules.
>
> The more I know about CSV, the less I want anything to do with it.
>
> In essence, to make CSV viable (or any other delimiter-separated format
> viable), you have to locally redefine what that means to a limited
> subset of the spec.
>
> For instance, forbid the feature where the first line is the string
> "Sep=,"[1], which tools like Excel (may) generate, but which isn't spec
> compliant, and leads to ... interesting things.
>
> But pretty much you can take it for granted that a '.csv' extension
> will make *somebody* make assumptions about the format that aren't true.
>
> Like for instance, is leading/trailing whitespace around delimiters
> significant? Spec says yes[2], but implementations may desire it to be
> no ( And gentoo probably prefers them not to be significant for
> alignment reasons )
>

I mean part of my joke is that instead of using one of the existing ones,
you will make your own and it will have its own set of mistakes and
assumptions :)


>
> As for JSON/YAML, ... eh... that may be the case for like, 4 line files.
>
> But once you have hundreds of entries, that becomes less true.
>

What becomes less true?


>
> And both of those can have "Fun" merge conflict issues due to the
> requirements around record delimiters and syntax,
>

And this means what? That I might check something in that is broken? How is
this not true for any syntax we define?


>
> eg: You're using JSON, does your JSON formatter emit every record on
> its own line? No? That's going to create annoying merge conflicts.
>

On the editing side I expect humans to be editing, so the format is not
particularly relevant.
On the consuming side (e.g. a thing that downloads stuff from
api.gentoo.org) is where the standards are helpful, because I don't have
to write some bespoke parser and I can just dump the json stream and get
objects back.


>
> Does your formatter/decoder support trailing "," ?
> No? That's going to introduce problems.
>

The JSON specification doesn't support trailing commas, iirc. I would
expect people to follow the specification.


>
> That's why I'd rather a more narrow, less general, domain specific
> format, instead of throwing these general tools at the problem.
>
> 1:
> https://en.wikipedia.org/wiki/Comma-separated_values#General_functionality
> 2: https://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules
>


* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-12 18:20                 ` Alec Warner
@ 2019-08-12 18:33                   ` Kent Fredric
  2019-08-12 19:01                   ` Kent Fredric
  1 sibling, 0 replies; 22+ messages in thread
From: Kent Fredric @ 2019-08-12 18:33 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 704 bytes --]

On Mon, 12 Aug 2019 11:20:01 -0700
Alec Warner <antarus@gentoo.org> wrote:

> >
> > As for JSON/YAML, ... eh... that may be the case for like, 4 line files.
> >
> > But once you have hundreds of entries, that becomes less true.
> >  
> 
> What becomes less true?

In that it ceases to be a human-editable format.

It's editable in the sense of "a text editor can edit it", but that's
like calling XML human-editable.

At scale, large JSON files can become something one cannot safely edit
without risking breaking the format.

( The JSON syntax rules require that a document-global parse completes,
as opposed to a line-based system, where an invalid entry only affects
that record )
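
The difference can be illustrated with a minimal sketch. It assumes
Python's standard `json` module as the JSON parser and a made-up
four-field line format; the sample data and the `parse_lines` helper are
hypothetical and only serve to show where the damage stops.

```python
import json

# Hypothetical sample data: the second record is deliberately malformed
# in both representations.
LINES = "adm 3 4 baselayout\nbin three 7 baselayout-fbsd\nlp 4 7 baselayout\n"
JSON_DOC = ('[{"name": "adm", "uid": 3}, '
            '{"name": "bin", "uid": }, '
            '{"name": "lp", "uid": 4}]')


def parse_lines(text):
    """Parse each line on its own; collect line numbers of bad records."""
    good, bad = [], []
    for lineno, line in enumerate(text.splitlines(), 1):
        fields = line.split()
        try:
            good.append((fields[0], int(fields[1]), int(fields[2]), fields[3]))
        except (IndexError, ValueError):
            bad.append(lineno)  # only this record is lost
    return good, bad


good, bad = parse_lines(LINES)

# A JSON document either parses as a whole or not at all.
try:
    json.loads(JSON_DOC)
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

print(len(good), bad, json_ok)  # 2 [2] False
```

The line-based reader still returns the two valid records and reports
line 2, while the JSON decoder rejects the whole document.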



* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org
  2019-08-12 18:20                 ` Alec Warner
  2019-08-12 18:33                   ` Kent Fredric
@ 2019-08-12 19:01                   ` Kent Fredric
  1 sibling, 0 replies; 22+ messages in thread
From: Kent Fredric @ 2019-08-12 19:01 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 1442 bytes --]

On Mon, 12 Aug 2019 11:20:01 -0700
Alec Warner <antarus@gentoo.org> wrote:

> > And both of those can have "Fun" merge conflict issues due to the
> > requirements around record delimiters and syntax,
> >  
> 
> And this means what? That I might check something in that is broken?
> How is this not true for any syntax we define?

It leads git to the conclusion 2 different people edited the same line,
which they did, technically.

But semantically, when the file is viewed as a series of atomic units,
each editor changed an element that was unrelated to the other's. (On
par with getting a conflict because line 1 and line 100 of the same file
were edited by different people, that is, pretty much killing the point
of automatic conflict resolution)

This leads to conflicts that git can't solve on its own, when it could
by simple format choice. ( And this leads to a place where somebody
could fuck up the merge, like, having their merge tool fail and then
managing to commit the conflict-markers. Been there. )

But the short version of it, is this should not generate conflicts:

- User A edits last record of file
- User B appends record to file

Whereas with JSON at least, the format itself forces the conflict by
prohibiting the trailing ',', so:

[ 
  a,
  b,
]

Is illegal.

You can kinda get around it by using left-side commas:

[
  b
  ,c
]

But do you really want to standardize *that*?
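
Both layouts can be checked against a strict decoder. A minimal sketch,
assuming Python's standard `json` module stands in for "a JSON parser"
(the thread does not name one); the `parses` helper is hypothetical:

```python
import json


def parses(text):
    """Return True if `text` is accepted by Python's strict JSON decoder."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Trailing comma after the last element: rejected.
trailing = '[\n  "a",\n  "b",\n]'
# Left-side commas: valid JSON, and appending a record touches one line only.
leading = '[\n  "b"\n  ,"c"\n]'

print(parses(trailing))  # False
print(parses(leading))   # True
```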




* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org (v2)
  2019-08-06 21:29 [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org Michał Górny
  2019-08-10 20:49 ` William Hubbs
@ 2019-08-17  7:06 ` Michał Górny
  2019-08-17  7:38   ` Michał Górny
  2019-08-18  6:23   ` Ulrich Mueller
  1 sibling, 2 replies; 22+ messages in thread
From: Michał Górny @ 2019-08-17  7:06 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 3177 bytes --]

Hi,

Here's v2 of my proposed format.  Two changes:

1. I've changed the 'header line' to start with '#!', so comment
stripping normally takes care of it without having to do extra work.

2. I've allowed Markdown syntax in note field, so that we can link
commits similarly to how we do it on wiki.

== UPDATED FORMAT ==

# vim:se ts=8 sts=8 tw=0 :
#
# This file provides a registry of active, requested and historical UID
# and GID assignments in the Gentoo repository.  Each line represents
# a single record, with the '#!' line providing field headers.  Empty
# lines and comments are ignored.  Fields are separated by whitespace,
# with the last (note) field extending to end of line.  Note that
# additional fields may be added in the future.
#
# A single record can represent a user, a group or a combination of both
# using the same name.  Same user/group/UID/GID can be repeated
# in multiple records to reflect reality.  If you can't express something
# with a single record, split it.
#
# NAME: user/group name
# UID: UID assigned to the user, or '-' if only group is defined
# GID: GID assigned to the group, or '-' if only user is defined
# PROVIDER: keyword identifying provider of user/group, see below
# NOTES: human-readable notes about the assignment, optional, can use
#        Markdown syntax
#
# Valid values for PROVIDER are:
# - acct: acct-user/${NAME} and/or acct-group/${NAME} packages
# - baselayout: user/group defined in passwd/group files from baselayout
# - baselayout-fbsd: same as baselayout but on FreeBSD system
# - historical: historical UID/GID assignment, no longer used nowadays
# - requested: assignment requested on the ml, not yet pushed
# - reserved: special reserved range
# - user.eclass: user/group created via user.eclass directly in package
#
# Use PROVIDERs that apply best at the moment.  For example, if user
# was migrated from user.eclass to acct-user/ with the same UID, just
# list the latter.
#
# Please keep the list sorted by UID, GID, PROVIDER, NAME (in order).

#!NAME		UID	GID	PROVIDER	NOTES
root		0	0	baselayout
root		0	-	baselayout-fbsd
toor		0	-	baselayout-fbsd	FreeBSD lists both names in passwd
wheel		-	0	baselayout-fbsd
bin		1	1	baselayout
daemon		1	1	baselayout-fbsd
daemon		2	2	baselayout
operator	2	5	baselayout-fbsd
adm		3	4	baselayout
bin		3	7	baselayout-fbsd
sys		-	3	baselayout
sys		-	3	baselayout-fbsd
lp		4	7	baselayout
tty		4	4	baselayout-fbsd
sync		5	-	baselayout
kmem		5	2	baselayout-fbsd
tty		-	5	baselayout
shutdown	6	-	baselayout
mail		6	6	baselayout-fbsd
disk		-	6	baselayout
halt		7	-	baselayout
games		7	-	baselayout-fbsd
lp		-	7	acct
#...
ftp		21	21	acct		removed from baselayout svn in r1475
sshd		22	22	user.eclass	removed from baselayout svn in r1476
at		25	25	user.eclass	removed from baselayout svn in r1479
smtpd		25	25	user.eclass
#...
rpc		111	111	historical	used by net-nds/portmap, removed in 36e60cce
#...
portage		250	250	baselayout
#...
(reserved)	500-999		500-999		reserved
(reserved)	1000-60000	1000-60000	reserved	UID_MIN-UID_MAX/GID_MIN-GID_MAX
#...
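
A reader for this format can stay small. The following is only a sketch
of one possible consumer, not part of the proposal: the function name
`parse_registry` and the choice to map '-' to None are assumptions, and
UID/GID values are kept as strings because the reserved entries use
ranges such as 500-999.

```python
def parse_registry(text):
    """Parse the whitespace-separated UID/GID registry into dicts."""
    header = None
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#!"):
            # The '#!' line carries the field names.
            header = line[2:].split()
            continue
        if not line or line.startswith("#"):
            continue  # empty lines and comments are ignored
        if header is None:
            raise ValueError("record found before the '#!' header line")
        # The last field (NOTES) extends to end of line, so split at
        # most len(header) - 1 times and pad a missing note with "".
        fields = line.split(None, len(header) - 1)
        fields += [""] * (len(header) - len(fields))
        record = dict(zip(header, fields))
        for key in ("UID", "GID"):
            if record[key] == "-":
                record[key] = None  # assumption: '-' means "not defined"
        records.append(record)
    return records


SAMPLE = """\
#!NAME\t\tUID\tGID\tPROVIDER\tNOTES
root\t\t0\t0\tbaselayout
toor\t\t0\t-\tbaselayout-fbsd\tFreeBSD lists both names in passwd
# a comment
(reserved)\t500-999\t\t500-999\t\treserved
"""

records = parse_registry(SAMPLE)
for record in records:
    print(record["NAME"], record["UID"], record["GID"], record["PROVIDER"])
```

Taking the field count from the '#!' line rather than hard-coding five
columns is one way to tolerate the "additional fields may be added in
the future" clause, as long as NOTES stays last.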


-- 
Best regards,
Michał Górny



* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org (v2)
  2019-08-17  7:06 ` [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org (v2) Michał Górny
@ 2019-08-17  7:38   ` Michał Górny
  2019-08-18  6:23   ` Ulrich Mueller
  1 sibling, 0 replies; 22+ messages in thread
From: Michał Górny @ 2019-08-17  7:38 UTC (permalink / raw
  To: gentoo-dev

[-- Attachment #1: Type: text/plain, Size: 874 bytes --]

On Sat, 2019-08-17 at 09:06 +0200, Michał Górny wrote:
> Hi,
> 
> Here's v2 of my proposed format.

Also example usage:

get user/group named 'adm'

  $ awk '$1 == "adm"' uid-gid.txt 
  adm		3	4	baselayout

get records for UID 3:

  $ awk '$2 == 3' uid-gid.txt 
  adm		3	4	baselayout
  bin		3	7	baselayout-fbsd

get records for GID 4:

  $ awk '$3 == 4' uid-gid.txt 
  adm		3	4	baselayout
  tty		4	4	baselayout-fbsd

get all users, sorted by UID:

  $ awk '$0 ~ /^[^#]/ && $2 != "-"' uid-gid.txt  | sort -k2 -n
  root		0	0	baselayout
  root		0	-	baselayout-fbsd
  toor		0	-	baselayout-fbsd	FreeBSD lists both names in passwd

get all groups, sorted by GID:

  $ awk '$0 ~ /^[^#]/ && $3 != "-"' uid-gid.txt  | sort -k3 -n
  root		0	0	baselayout
  wheel		-	0	baselayout-fbsd
  bin		1	1	baselayout

-- 
Best regards,
Michał Górny



* Re: [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org (v2)
  2019-08-17  7:06 ` [gentoo-dev] [RFC] Moving UID/GID assignments to api.gentoo.org (v2) Michał Górny
  2019-08-17  7:38   ` Michał Górny
@ 2019-08-18  6:23   ` Ulrich Mueller
  1 sibling, 0 replies; 22+ messages in thread
From: Ulrich Mueller @ 2019-08-18  6:23 UTC (permalink / raw
  To: Michał Górny; +Cc: gentoo-dev

>>>>> On Sat, 17 Aug 2019, Michał Górny wrote:

> Here's v2 of my proposed format.  Two changes:

> 1. I've changed the 'header line' to start with '#!', so comment
> stripping normally takes care of it without having to do extra work.

> 2. I've allowed Markdown syntax in note field, so that we can link
> commits similarly to how we do it on wiki.

Please add a standard copyright/license notice. The tables may not be
copyrightable, but the comments are.

Besides, GLEP 76 requires it. :-)

Ulrich


