public inbox for gentoo-dev@lists.gentoo.org
From: Zoltan Puskas <zoltan@sinustrom.info>
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] RFC: banning "AI"-backed (LLM/GPT/whatever) contributions to Gentoo
Date: Thu, 29 Feb 2024 22:33:53 -0800
Message-ID: <k2web6yllnwxx4m6rpv5rwmlaf5qmrtn7sz2ncug7pwehtl74c@pcafmr7nkuak>
In-Reply-To: <a2b8c68b1649213cf237f40e41f9a460a5667c34.camel@gentoo.org>

Hi,

> Compare with the shitstorm at:
> https://github.com/pkgxdev/pantry/issues/5358

Thank you for this, it made my day.

Though I'm just a proxy maintainer for now, I also support this initiative,
there should be some guard rails set up around LLM usage.

> 1. Copyright concerns.  At this point, the copyright situation around
> generated content is still unclear.  What's pretty clear is that pretty
> much all LLMs are trained on huge corpora of copyrighted material, and
> all fancy "AI" companies don't give shit about copyright violations.
> In particular, there's a good risk that these tools would yield stuff we
> can't legally use.

IANAL, but IMHO if we stop respecting copyright law, even if indirectly via
LLMs, why should we expect others to respect our licenses? It could be prudent
to wait and see where this will land.

> 2. Quality concerns.  LLMs are really great at generating plausibly
> looking bullshit.  I suppose they can provide good assistance if you are
> careful enough, but we can't really rely on all our contributors being
> aware of the risks.

From my personal experience of using GitHub Copilot fine-tuned on a large
private code base, it functions mostly okay as a smarter autocomplete on a
single line of code, but when it comes to multiple lines of code, even just
filling out boilerplate code, it's at best a 'meh'. The problem is
that while the output looks okay-ish, it will often have subtle mistakes or
hallucinate some random additional stuff not relevant to the source file in
question, so one ends up having to read and analyze the entire output of the LLM
to fix problems with the code. I found that the mental and time overhead rarely
makes it worth it, especially when a template can do a better job (e.g. this
would be the case for ebuilds).

Since during reviews we are supposed to be reading the entire contribution, I'm
not sure how much difference this makes, but I can see that a developer trusting
an LLM too much might end up outsourcing the checking of the code to the
reviewers, which means we would need to be extra vigilant, and that could lead
to reduced trust in contributions.

> 3. Ethical concerns.  As pointed out above, the "AI" corporations don't
> give shit about copyright, and don't give shit about people.  The AI
> bubble is causing huge energy waste.  It is giving a great excuse for
> layoffs and increasing exploitation of IT workers.  It is driving
> enshittification of the Internet, it is empowering all kinds of spam
> and scam.

I agree. I'm already tired of AI-generated blog spam and so forth, such a waste
of time and quite annoying. I'd rather not have that on our wiki pages too. The
purpose of documenting things is to explain an area to someone new to it or to
write down unique quirks of a setup or a system. Since LLMs cannot write new
original things, just rehash information they have seen, I'm not sure how they
could be helpful for this at all, to be honest.

Overall my time is too valuable to sift through AI-generated BS when I'm trying
to solve a problem; I'd prefer we keep well-curated, high-quality documentation
where possible.

Zoltan
