Date: Thu, 29 Feb 2024 22:33:53 -0800
From: Zoltan Puskas
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] RFC: banning "AI"-backed (LLM/GPT/whatever)
 contributions to Gentoo
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512;
 protocol="application/pgp-signature"; boundary="v4pbochf4b5xbqlf"
Content-Disposition: inline

--v4pbochf4b5xbqlf
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi,

> Compare with the shitstorm at:
> https://github.com/pkgxdev/pantry/issues/5358

Thank you for this, it made my day.

Though I'm just a proxy maintainer for now, I also support this
initiative; there should be some guard rails set up around LLM usage.

> 1. Copyright concerns. At this point, the copyright situation around
> generated content is still unclear. What's pretty clear is that pretty
> much all LLMs are trained on huge corpora of copyrighted material, and
> all fancy "AI" companies don't give shit about copyright violations.
> In particular, there's a good risk that these tools would yield stuff we
> can't legally use.

IANAL, but IMHO if we stop respecting copyright law, even if indirectly
via LLMs, why should we expect others to respect our licenses? It could be
prudent to wait and see where this will land.

> 2. Quality concerns.
> LLMs are really great at generating plausibly
> looking bullshit. I suppose they can provide good assistance if you are
> careful enough, but we can't really rely on all our contributors being
> aware of the risks.

From my personal experience of using GitHub Copilot fine-tuned on a large
private code base, it functions mostly okay as a smarter autocomplete on a
single line of code, but when it comes to multiple lines of code, even
just filling out boilerplate, it's at best a 'meh'. The problem is that
while the output looks okay-ish, it will often have subtle mistakes or
hallucinate some random additional stuff not relevant to the source file
in question, so one ends up having to read and analyze the entire output
of the LLM to fix problems with the code. I found that the mental and time
overhead rarely makes it worth it, especially when a template can do a
better job (e.g. this would be the case for ebuilds).

Since during reviews we are supposed to be reading the entire contribution,
I'm not sure how much difference this makes, but I can see that a developer
trusting LLMs too much might end up outsourcing the checking of the code
to the reviewers, which means we need to be extra vigilant and could lead
to reduced trust in contributions.

> 3. Ethical concerns. As pointed out above, the "AI" corporations don't
> give shit about copyright, and don't give shit about people. The AI
> bubble is causing huge energy waste. It is giving a great excuse for
> layoffs and increasing exploitation of IT workers. It is driving
> enshittification of the Internet, it is empowering all kinds of spam
> and scam.

I agree. I'm already tired of AI-generated blog spam and so forth; it's
such a waste of time and quite annoying. I'd rather not have that on our
wiki pages too. The purpose of documenting things is to explain an area to
someone new to it, or to write down the unique quirks of a setup or a
system. Since LLMs cannot write
Since LLMs cannot write = new original things, just rehash information it has seen I'm not sure how could= it be helpful for this at all to be honest. Overall my time is too valuable to shift through AI generated BS when I'm t= rying to solve a problem, I'd prefer we keep a well curated high quality document= ation where possible. Zoltan --v4pbochf4b5xbqlf Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEyzsUa/Bn/zts2f9oJoL7EYNVSs4FAmXhdsoACgkQJoL7EYNV Ss5vHA//Tc15qYkGvmWjVfcM4oxycnfNweU1wkH3Z+h16FPHkVry93G2TeobaaMp HobVJv9io6+b9rtGuHX9GBBaDhH0Yym3AdXiDpt63ssNU3aKL/lphBKg+epM55JR 8tVQVQp2kjg36YM9JbIsrG4C9BXU/GBv7d1yTp97v0BiytvNb3YMDoxFYMsg/jSz msHto4WdB7Uu5A+n/mDCQs8Kf3bLote5Cr0jei+dKDiIKYFp/kwReScNWZvoNYEo MUnQurqWoDXnbSVgdyiknBNhpzcpoSeEa+I4UtjHhZJvFHiOBHKpGYM1eiClCb2q Gxo3G3QJeuMO9yUaxC9IhkJVqqdcUxa08Pwl0VcKREouyWrpkBGmKSI5GcsvMQHz jjsTgchNkG5McfIji08/M47/ls8uEMDo3dkGQz0JYqsT30nQqZSR92CEASKTg7m/ XipXkAMEWYd2zOif9CLzrQePHpK+FJumbKfYmUzjcRt1yODqtk/u29oAUxy33Xg5 0SacT0EFg/uFURuljsw0ZzUlf+6c2/ewyaxDadJOHt30mk9uPHK8SQl4/Liy36/e GZ12nwDPYuUNn/9MttgnHT7UNt9sWfLnooH9c6vKKtDJeJNwDcZ3gLmOGgUqsP58 2MONIX/dX71lnEp8YXAS+abPyjU34zQCmxV2USlhyqKV+aHuLxs= =erhu -----END PGP SIGNATURE----- --v4pbochf4b5xbqlf--