On Tue, Feb 27, 2024 at 15:45:17 +0100, Michał Górny wrote:
> Hello,
> 
> Given the recent spread of the "AI" bubble, I think we really need to
> look into formally addressing the related concerns.  In my opinion,
> at this point the only reasonable course of action would be to safely
> ban "AI"-backed contribution entirely.  In other words, explicitly
> forbid people from using ChatGPT, Bard, GitHub Copilot, and so on, to
> create ebuilds, code, documentation, messages, bug reports and so on for
> use in Gentoo.
> 
> Just to be clear, I'm talking about our "original" content.  We can't do
> much about upstream projects using it.
> 

I agree.

But for the sake of discussion:

What about cases where someone, say, doesn't have an excellent grasp of
English and decides to use, for example, ChatGPT to aid in writing
documentation/comments (not code) and puts a note somewhere explicitly
mentioning what was AI-generated so that someone else can take a closer
look?

I'd personally not be the biggest fan of this if it wasn't in something
like a PR or ml post where it could be reviewed before being made final.
But the most impportant part IMO would be being up-front about it.

> 
> Rationale:
> 
> 1. Copyright concerns.  At this point, the copyright situation around
> generated content is still unclear.  What's pretty clear is that pretty
> much all LLMs are trained on huge corpora of copyrighted material, and
> all fancy "AI" companies don't give shit about copyright violations.
> In particular, there's a good risk that these tools would yield stuff we
> can't legally use.
> 

I really dislike the lack of audit trail for where the bits and pieces
come from. Not to mention the examples from early on where Copilot was
filling in incorrect attribution.

- Oskari