From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
	by finch.gentoo.org (Postfix) with ESMTP id 6A9C01388C0
	for ; Sat, 27 Feb 2016 22:51:00 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id C3A8C21C037;
	Sat, 27 Feb 2016 22:50:48 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pigeon.gentoo.org (Postfix) with ESMTPS id E136C21C01C
	for ; Sat, 27 Feb 2016 22:50:47 +0000 (UTC)
Received: from grubbs.orbis-terrarum.net (localhost [127.0.0.1])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by smtp.gentoo.org (Postfix) with ESMTPS id B0D19340928
	for ; Sat, 27 Feb 2016 22:50:46 +0000 (UTC)
Received: (qmail 14765 invoked by uid 10000); 27 Feb 2016 22:50:46 -0000
Date: Sat, 27 Feb 2016 22:50:46 +0000
From: "Robin H. Johnson"
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] Re: Bug #565566: Why is it still not fixed?
Message-ID:
References: <56CC937C.3030805@gentoo.org> <56CCD4DC.3040509@gentoo.org> <2fd3bbae-0fa0-d65e-dae4-874db95c1688@gentoo.org>
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <2fd3bbae-0fa0-d65e-dae4-874db95c1688@gentoo.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Archives-Salt: a4cb704e-b082-4c51-bea6-b745f75e01d8
X-Archives-Hash: fc38f45aaab0c198383fdb178927ef50

On Sat, Feb 27, 2016 at 02:14:12PM +0100, Luca Barbato wrote:
> On 24/02/16 01:33, Duncan wrote:
> > That option is there, and indeed, a patch providing it was specifically
> > added to portage for infra to use, because appending entries to existing
> > files is vastly easier and more performant than trying to prepend entries
> > and having to rewrite the entire file as a result.
> This sounds wrong in many different ways. The changelog files are tiny
> and makes next to no difference truncate+write or append.
Prior to separating ChangeLog files into years, this was way worse: a
kernel bump present in any of gentoo-sources, hardened-sources,
vanilla-sources meant another 100k of data to send.

It's not a lot overall, but here are some quick stats from one of our
rsync servers, on bytes sent.

Stats for Feb 25, from one of the 3 primary rsync.g.o servers, on the
'bytes sent' output from rsyncd.

rsyncd example output:
Feb 25 00:03:17 quetzal rsyncd[27280]: sent 4930260 bytes received 32215 bytes total size 408174052

3909 entries.
Min RAW size: 4833709 bytes [1]
Median RAW size: 22436094 bytes.
Mean RAW size: 45652781 bytes.
Sum of RAW size: 178456721459 bytes = ~166GiB (per day!)

The minimum possible transfer size comes from forcing an rsync with no
changes; it just sends the metadata about the files (path, mtime, size,
etc).
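
As a rough illustration of how figures like the ones above could be
derived from the rsyncd log, here is a minimal Python sketch. It is not
the script actually used for these numbers; the names SENT_RE,
sent_bytes and summarize are hypothetical, and it assumes every summary
line has the same "sent N bytes" shape as the example line.

```python
import re
from statistics import mean, median

# Assumption: summary lines look like the example above,
# i.e. "... rsyncd[PID]: sent N bytes received M bytes total size T".
SENT_RE = re.compile(r"rsyncd\[\d+\]:\s+sent\s+(\d+)\s+bytes")


def sent_bytes(lines):
    """Return the 'bytes sent' value of every matching rsyncd summary line."""
    return [int(m.group(1)) for line in lines if (m := SENT_RE.search(line))]


def summarize(sizes):
    """Compute entry count, min, median, mean and sum (bytes and GiB)."""
    total = sum(sizes)
    return {
        "entries": len(sizes),
        "min": min(sizes),
        "median": median(sizes),
        "mean": mean(sizes),
        "sum": total,
        "sum_gib": total / 2**30,  # 1 GiB = 2**30 bytes
    }


if __name__ == "__main__":
    example = (
        "Feb 25 00:03:17 quetzal rsyncd[27280]: sent 4930260 bytes "
        "received 32215 bytes total size 408174052"
    )
    print(sent_bytes([example]))  # [4930260]
```

Feeding the whole day's log through sent_bytes() and then summarize()
would give the entry count and the RAW min/median/mean/sum in one pass.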

Let's subtract that from all the rest of the entries, to get stats about
the data transfer.
Median data size: 17602385 bytes
Mean data size: 40819072 bytes

So, now the question:
If we use appending changelogs, the large changelogs only differ by a
few hundred bytes. If we instead have to rewrite them, it's 50k+ per
changelog.

For each rewritten 50k changelog, the median transfer would get roughly
0.25% larger.

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead, Foundation Trustee
E-Mail     : robbat2@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85