Date: Wed, 06 Mar 2024 13:53:41 +0000
To: gentoo-dev@lists.gentoo.org
From: martin-kokos
Subject: Re: [gentoo-dev] RFC: banning "AI"-backed (LLM/GPT/whatever) contributions to Gentoo

On Tuesday, February 27th, 2024 at 3:45 PM, Michał Górny wrote:

> Hello,
>
> Given the recent spread of the "AI" bubble, I think we really need to
> look into formally addressing the related concerns. In my opinion,
> at this point the only reasonable course of action would be to safely
> ban "AI"-backed contribution entirely. In other words, explicitly
> forbid people from using ChatGPT, Bard, GitHub Copilot, and so on, to
> create ebuilds, code, documentation, messages, bug reports and so on for
> use in Gentoo.
>
> Just to be clear, I'm talking about our "original" content. We can't do
> much about upstream projects using it.
>
>
> Rationale:
>
> 1. Copyright concerns. At this point, the copyright situation around
> generated content is still unclear. What's pretty clear is that pretty
> much all LLMs are trained on huge corpora of copyrighted material, and
> all fancy "AI" companies don't give shit about copyright violations.
> In particular, there's a good risk that these tools would yield stuff we
> can't legally use.
>
> 2. Quality concerns. LLMs are really great at generating plausibly
> looking bullshit. I suppose they can provide good assistance if you are
> careful enough, but we can't really rely on all our contributors being
> aware of the risks.
>
> 3. Ethical concerns. As pointed out above, the "AI" corporations don't
> give shit about copyright, and don't give shit about people. The AI
> bubble is causing huge energy waste. It is giving a great excuse for
> layoffs and increasing exploitation of IT workers. It is driving
> enshittification of the Internet, it is empowering all kinds of spam
> and scam.
>
>
> Gentoo has always stood out as something different, something that
> worked for people for whom mainstream distros were lacking. I think
> adding "made by real people" to the list of our advantages would be
> a good thing - but we need to have policies in place, to make sure shit
> doesn't flow in.
>
> Compare with the shitstorm at:
> https://github.com/pkgxdev/pantry/issues/5358
>
> --
> Best regards,
> Michał Górny

While I understand the concerns that may have prompted the need for a rule like this, as a machine learning (AI) engineer I feel I need to add my brief opinion.

The pkgxdev incident is very artificial, and if there is a threat to quality/integrity, it will not manifest itself as obviously. Which brings me to my main point:

A rule like this is just not enforceable.

The contributor who signs off on a contribution is responsible for its quality, whether it was written with a plain editor, with a dev environment with smart plugins (LSP), or by their dog.

Other organizations have had to deal with automated contributions, which can sometimes go wrong for *all different* kinds of reasons, for much longer, and their approach may be an inspiration:

[0] OpenStreetMap: automated edits - https://wiki.openstreetmap.org/wiki/Automated_Edits_code_of_conduct
[1] Wikipedia: bot policy - https://en.wikipedia.org/wiki/Wikipedia:Bot_policy

The AI that we are dealing with right now is just another means of automation after all.

As a machine learning engineer, I have been contemplating creating an instance of a generative model for my own use, trained on my own data, in which case the copyright and ethical points would not apply at all.

Also, there are language model projects that are fine on both ethics and copyright, such as Project Bergamot [2], vetted by universities and the EU and also used by Mozilla [3] (one of the prominent proponents of ethical AI).
Banning all tools just because some might not be up to moral standards puts the ones that are at a disadvantage in the world as a whole.

[2] Project Bergamot - https://browser.mt/
[3] Mozilla blog: training translation models - https://hacks.mozilla.org/2022/06/training-efficient-neural-network-models-for-firefox-translations/

- Martin