On Mon, Aug 12, 2019 at 10:58 AM Kent Fredric <kentnl@gentoo.org> wrote:
On Mon, 12 Aug 2019 09:52:40 -0700
Alec Warner <antarus@gentoo.org> wrote:

> CSV, JSON and YAML are all popular machine-and-people readable
> specifications with broad support.

No, not CSV. There isn't really "a spec" for that. Even though there is
a "proposed spec", "CSV editors" and things that emit CSV just make up
their own rules.

The more I know about CSV, the less I want anything to do with it.

In essence, to make CSV (or any other delimiter-separated format)
viable, you have to locally redefine what that means to a limited
subset of the spec.

For instance, forbid the feature where the first line is the string
"Sep=,"[1], which tools like Excel may generate, but which isn't spec
compliant and leads to ... interesting things.
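
As a rough illustration of the "interesting things", here is a minimal
sketch assuming Python's standard csv module as the consumer. The file
content is made up for the example. A parser that knows nothing about
the hint simply hands it back as the first data row:

```python
import csv
import io

# Hypothetical file content: an Excel-style separator hint, then real data.
raw = "Sep=,\nname,version\nportage,2.3\n"

rows = list(csv.reader(io.StringIO(raw)))

# The hint is not special to this parser; it is returned as an ordinary row.
print(rows[0])  # ['Sep=', '']
print(rows[1])  # ['name', 'version']
```

Any consumer that treats the first row as a header would pick up the
hint instead of the real column names.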

But pretty much you can take it for granted that a '.csv' extension
will make *somebody* make assumptions about the format that aren't true.

For instance, is leading/trailing whitespace around delimiters
significant? The spec says yes[2], but implementations may want it not
to be (and Gentoo would probably prefer it not to be significant, for
alignment reasons).
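
To make the whitespace question concrete, here is a small sketch, again
assuming Python's standard csv module and a made-up, hand-aligned
record. By default the padding is kept as part of each field, the
skipinitialspace option drops only the space after a delimiter, and
anything stricter has to be done by the caller:

```python
import csv
import io

line = "foo, bar ,baz\n"  # hypothetical hand-aligned record

# Default: whitespace around the delimiter is part of the field.
default_row = next(csv.reader(io.StringIO(line)))

# skipinitialspace=True ignores whitespace right after a delimiter only.
lenient_row = next(csv.reader(io.StringIO(line), skipinitialspace=True))

# Stripping both sides is a local rule applied on top of the parser.
stripped_row = [field.strip() for field in default_row]

print(default_row)   # ['foo', ' bar ', 'baz']
print(lenient_row)   # ['foo', 'bar ', 'baz']
print(stripped_row)  # ['foo', 'bar', 'baz']
```

Three readers of the same line can come away with three different
values for the middle field.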

I mean, part of my joke is that instead of using one of the existing formats, you will make your own, and it will have its own set of mistakes and assumptions :)
 

As for JSON/YAML, ... eh... that may be the case for like, 4 line files.

But once you have hundreds of entries, that becomes less true.

What becomes less true?
 

And both of those can have "Fun" merge conflict issues due to the
requirements around record delimiters and syntax,

And this means what? That I might check something in that is broken? How is this not true for any syntax we define?
 

eg: You're using JSON, does your JSON formatter emit every record on
its own line? No? That's going to create annoying merge conflicts.
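
A minimal sketch of the difference, assuming Python's standard json
module and two made-up records. The default output puts everything on
one line, so any change conflicts with any other change. Emitting one
record per line takes a little manual work:

```python
import json

# Hypothetical records, for illustration only.
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# Default output: the whole list on a single line.
compact = json.dumps(records)

# One record per line, with stable key order, so a diff touches one record.
per_line = (
    "[\n"
    + ",\n".join("  " + json.dumps(r, sort_keys=True) for r in records)
    + "\n]\n"
)

print(compact)
print(per_line)
```

Even in the one-record-per-line form, appending a record changes the
previous last line, because that line gains a comma.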

On the editing side I expect humans to be editing, so the format is not particularly relevant.
The consuming side (e.g. a thing that downloads stuff from apis.gentoo.org) is where the standards are helpful, because I don't have to write some bespoke parser; I can just feed the JSON stream to a standard parser and get objects back.
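
A sketch of what that looks like for a consumer, assuming Python's
standard json module. The payload is a made-up stand-in for a
downloaded response, not real data from any service:

```python
import json

# Stand-in for a downloaded response body (hypothetical content).
payload = '[{"name": "example", "id": 1}]'

# One call turns the text into ordinary lists and dicts.
objects = json.loads(payload)
names = [obj["name"] for obj in objects]

print(names)  # ['example']
```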
 

Does your formatter/decoder support trailing "," ?
No? That's going to introduce problems.

The JSON specification doesn't support trailing commas, iirc. I would expect people to follow the specification.
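
For what it's worth, a strict parser agrees. A quick check, assuming
Python's standard json module (the parses helper is just for this
example):

```python
import json


def parses(text):
    """Return True if text is accepted by the strict JSON parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses('["a", "b"]'))   # True
print(parses('["a", "b",]'))  # False
```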
 

That's why I'd rather have a narrower, less general, domain-specific
format, instead of throwing these general tools at the problem.

1: https://en.wikipedia.org/wiki/Comma-separated_values#General_functionality
2: https://en.wikipedia.org/wiki/Comma-separated_values#Basic_rules