Message-ID: <22231.11642.809779.509501@a1i15.kph.uni-mainz.de>
Date: Wed, 2 Mar 2016 19:14:18 +0100
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] Re: [gentoo-project] Portage repo usage survey and change evaluation
In-Reply-To: <210da6ab-068c-b4a0-d02d-520e239dc7e1@gentoo.org>
References: <22227.64207.210350.425232@a1i15.kph.uni-mainz.de> <22230.43334.255937.387943@a1i15.kph.uni-mainz.de> <210da6ab-068c-b4a0-d02d-520e239dc7e1@gentoo.org>
From: Ulrich Mueller

>>>>> On Wed, 2 Mar 2016, Ian Stakenvicius wrote:

> On 02/03/16 03:50 AM, Ulrich Mueller wrote:
>> How is it possible that we have 52 MiB of ChangeLog entries
>> generated in the 0.5 years since the git conversion, whereas we had
>> only a total of 103 MiB in the 13.5 years since ChangeLogs were
>> introduced in 2002? Certainly our commit rate hasn't increased by
>> more than an order of magnitude in the last half year?

> The content of a changelog entry from git is a lot bigger than it
> was just from echangelog, isn't it?

Not by a factor of ten. I've investigated a bit, and the main problem
seems to be that for git commits that extend over several directories,
the commit message is duplicated into many ChangeLog entries.

For example, the message of the initial commit 56bd759 appears in some
18000 files, which accounts for 25 MiB. Then there is commit eaaface
and its revert 1bfb585, again appearing in almost all ChangeLog files
in the tree. These account for another 9 MiB. Last example, commit
8849b09, another 2 MiB.

So about 70% of the size is caused by these 4 tree-wide commits alone.
However, there are many more examples of duplication on a smaller
scale.
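
For anyone who wants to look for this kind of duplication themselves,
here is a rough sketch. It is not necessarily how the figures above
were obtained. It assumes that the ChangeLog files are plain text with
paragraphs separated by blank lines, so that a commit message copied
into many files shows up as identical paragraphs. The function name
duplicated_paragraphs and the min_files threshold are made up for
illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def duplicated_paragraphs(root, min_files=100):
    """List paragraphs that are repeated across many ChangeLog files.

    Returns (total_bytes, file_count, first_line) tuples, largest first.
    Assumption: entries are plain text whose paragraphs are separated by
    blank lines, so a duplicated commit message yields identical paragraphs.
    """
    files_seen = defaultdict(set)   # digest -> files containing the paragraph
    total_bytes = defaultdict(int)  # digest -> bytes summed over all copies
    first_line = {}                 # digest -> first line, for reporting

    for path in Path(root).rglob("ChangeLog*"):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for para in text.split("\n\n"):
            para = para.strip()
            if not para:
                continue
            raw = para.encode("utf-8")
            digest = hashlib.sha256(raw).hexdigest()
            files_seen[digest].add(path)
            total_bytes[digest] += len(raw)
            first_line.setdefault(digest, para.splitlines()[0])

    rows = [
        (total_bytes[d], len(files_seen[d]), first_line[d])
        for d in files_seen
        if len(files_seen[d]) >= min_files
    ]
    return sorted(rows, reverse=True)
```

Calling duplicated_paragraphs() on the top directory of a tree that
contains the generated ChangeLogs, with min_files set to a few
thousand, should bring tree-wide messages to the top of the list. As a
cross-check of the percentage quoted above: 25 + 9 + 2 = 36 MiB out of
52 MiB is about 69%.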

Ulrich