public inbox for gentoo-dev@lists.gentoo.org
From: Arthur Zamarin <arthurzam@gentoo.org>
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] RFC: banning "AI"-backed (LLM/GPT/whatever) contributions to Gentoo
Date: Wed, 28 Feb 2024 20:50:16 +0200
Message-ID: <f8e7fc9a-398a-4967-90e6-55437d1fe34a@gentoo.org>
In-Reply-To: <a2b8c68b1649213cf237f40e41f9a460a5667c34.camel@gentoo.org>


On 27/02/2024 16.45, Michał Górny wrote:
> Hello,
> 
> Given the recent spread of the "AI" bubble, I think we really need to
> look into formally addressing the related concerns.  In my opinion,
> at this point the only reasonable course of action would be to safely
> ban "AI"-backed contribution entirely.  In other words, explicitly
> forbid people from using ChatGPT, Bard, GitHub Copilot, and so on, to
> create ebuilds, code, documentation, messages, bug reports and so on for
> use in Gentoo.
> 
> Just to be clear, I'm talking about our "original" content.  We can't do
> much about upstream projects using it.

I support this motion.

> 
> Rationale:
> 
> 1. Copyright concerns.  At this point, the copyright situation around
> generated content is still unclear.  What's pretty clear is that pretty
> much all LLMs are trained on huge corpora of copyrighted material, and
> all fancy "AI" companies don't give shit about copyright violations.
> In particular, there's a good risk that these tools would yield stuff we
> can't legally use.

I know that GitHub Copilot can be limited to certain licenses, and even
to just the current repository. Even so, I'm not sure that the copyright
can be attributed to "me" rather than to the "AI", so it is still a gray
area.

> 2. Quality concerns.  LLMs are really great at generating plausibly
> looking bullshit.  I suppose they can provide good assistance if you are
> careful enough, but we can't really rely on all our contributors being
> aware of the risks.

Let me tell a story. I was curious whether I could teach an LLM the
ebuild format, as a possible helper tool for devs and non-devs. My prompt
grew huge as I taught it everything about ebuilds, pasted in the source
code (eclasses), and so on. At one point, it even managed to output a
close enough Python distutils-r1 ebuild, at the same level as what
`vim dev-python/${PN}/${PN}-${PV}.ebuild` creates from the Gentoo
template. Yes, all that work resulted in no gain.

For every other ebuild type (cmake, meson, go, rust) I always got a
garbage ebuild. Yes, it generated a good DESCRIPTION and HOMEPAGE
(simple stuff to copy from upstream) and even reached about 60% accuracy
for LICENSE. But did you know we have an "intel80386" arch for KEYWORDS?
That we can RESTRICT="install"? That we can use "^cat-pkg/pkg-1" syntax
in deps? PATCHES with HTTP URLs inside? And the list goes on. Sometimes
it was even funny.

So until a good prompt can be created for Gentoo, at which point we
*might* reopen the discussion, I strongly support banning AI-generated
ebuilds. Currently, good per-category templates, copying another ebuild
as a starting point, or even just skel.ebuild all give much better
results and waste less developer time.

> 3. Ethical concerns.  As pointed out above, the "AI" corporations don't
> give shit about copyright, and don't give shit about people.  The AI
> bubble is causing huge energy waste.  It is giving a great excuse for
> layoffs and increasing exploitation of IT workers.  It is driving
> enshittification of the Internet, it is empowering all kinds of spam
> and scam.
> 

Many companies that use AI as a reason for layoffs are just inventing a
justification, out of bad will or ignorance. The company I work at uses
AI tools to boost productivity, but at all levels of management they know
that AI can't replace a person; at best it boosts them by 5-10%. The real
reason for the current layoffs is budget tightening across the industry
(just a normal cycle, it will get better soon), and management prefers to
lay off others rather than themselves. So yeah, sad world.

> 
> Gentoo has always stood out as something different, something that
> worked for people for whom mainstream distros were lacking.  I think
> adding "made by real people" to the list of our advantages would be
> a good thing — but we need to have policies in place, to make sure shit
> doesn't flow in.
> 
> Compare with the shitstorm at:
> https://github.com/pkgxdev/pantry/issues/5358
> 

Great read, with plenty of WTF. This whole repo is just a cluster of AIs
competing against each other.

-- 
Arthur Zamarin
arthurzam@gentoo.org
Gentoo Linux developer (Python, pkgcore stack, Arch Teams, GURU)

