Date: Sun, 12 Jun 2011 03:31:36 -0300
Subject: [gentoo-soc] Distfile patching support - Weekly report #3
From: Rafael Martins
To: gentoo-soc@lists.gentoo.org
Cc: Denis Dupeyron, robbat2@gentoo.org

Hi all,

Quick summary: This project aims to improve the performance of the Gentoo Linux mirrors by reducing the overall bandwidth load, allowing people to fetch binary patches from the mirrors, instead of the full source tarballs, when updating a package. It is partially based on GLEP 25.

The project is a bit ahead of schedule.

== Progress ==

- Finished the integration of DeltaDB with existing code. A sample DeltaDB for sys-kernel/gentoo-sources is available [1].
- Finished the implementation of the delta reconstructor script.
  The script (distpatcher.py) takes a CPV, builds a list of all the distfiles needed by that version, checks which ones are already available in $DISTDIR, walks the delta DB in reverse order, and reconstructs the needed distfiles, if possible. It also handles the checksums and can tell whether the checksum of the reconstructed distfile, once compressed, matches that of the original compressed distfile. When needed, it saves the reconstructed distfile in a different place, to be handled separately by Portage (to be implemented in a later task).
- Generated a simple stats class. distdiffer.py produced 45 deltas for sys-kernel/gentoo-sources, and I created a plot [2] showing the percentage of savings for each delta, i.e. how much download size is saved if people fetch the delta instead of the full destination distfile. The deltas are sorted by percentage of savings, to make the plot easier to read. This is just a visualization bonus; it does not show which distfiles save the most download size. Check the DeltaDB file [1] if you want that info.

== Next steps ==

- Create the script to generate deltas for the full gentoo-x86 tree.
- Improve error handling for existing scripts.
- Rewrite some bits of existing code.
- Try a full tree run, if possible.

The project homepage now lives on the Gentoo infrastructure project page:
http://www.gentoo.org/proj/en/infrastructure/distpatch/

[1] http://paste.pocoo.org/raw/404807/
[2] http://i.imgur.com/nKqpM.png

That's all for now.

Thanks!

--
Rafael Goncalves Martins
http://rafaelmartins.eng.br/
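
For reference, the "percentage of savings" shown in the plot [2] can be sketched roughly as below. This is only an illustration, not the actual stats class: the function name, the record layout and the sizes are all hypothetical, and it assumes the savings are measured against the size of the full destination distfile.

```python
def savings_percent(delta_size, full_size):
    """Share of the download saved by fetching the delta instead of
    the full destination distfile, as a percentage.

    Hypothetical helper, not part of the real stats class.
    """
    if full_size <= 0:
        raise ValueError("full_size must be positive")
    return 100.0 * (full_size - delta_size) / full_size


# Made-up (delta name, delta size, destination distfile size) records, in bytes.
records = [
    ("a-1.0-1.1.delta", 250, 1000),
    ("a-1.1-1.2.delta", 100, 1000),
    ("a-1.2-1.3.delta", 900, 1000),
]

# Sort by percentage of savings, smallest first, as done for the plot.
ranked = sorted(
    ((name, savings_percent(delta, full)) for name, delta, full in records),
    key=lambda item: item[1],
)

for name, pct in ranked:
    print(f"{name}: {pct:.1f}% saved")
```

A delta larger than its destination distfile would give a negative percentage here, which may be worth keeping visible rather than clamping to zero, since such a delta is not worth publishing.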