From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from pigeon.gentoo.org ([208.92.234.80] helo=lists.gentoo.org)
    by finch.gentoo.org with esmtp (Exim 4.60)
    (envelope-from )
    id 1NbWfx-0000ff-By
    for garchives@archives.gentoo.org; Sun, 31 Jan 2010 10:05:05 +0000
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
    by pigeon.gentoo.org (Postfix) with SMTP id BD354E099D;
    Sun, 31 Jan 2010 10:04:47 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183])
    by pigeon.gentoo.org (Postfix) with ESMTP id 72BB1E099D
    for ; Sun, 31 Jan 2010 10:04:47 +0000 (UTC)
Received: from mail.isohunt.com (b01.ext.isohunt.com [208.71.112.51])
    by smtp.gentoo.org (Postfix) with ESMTP id DC9981B4013
    for ; Sun, 31 Jan 2010 10:04:46 +0000 (UTC)
Received: (qmail 11238 invoked from network); 31 Jan 2010 10:04:43 -0000
Received: from tsi-static.orbis-terrarum.net (HELO grubbs.orbis-terrarum.net) (76.10.188.108)
    by mail.isohunt.com (qpsmtpd/0.33-dev on beta01)
    with (CAMELLIA256-SHA encrypted) ESMTPS; Sun, 31 Jan 2010 10:04:42 +0000
Received: (qmail 8046 invoked by uid 10000); 31 Jan 2010 10:04:40 -0000
Date: Sun, 31 Jan 2010 10:04:40 +0000
From: "Robin H. Johnson"
To: gentoo-dev@lists.gentoo.org
Subject: [gentoo-dev] GLEP61 - Manifest2 compression
Message-ID:
References:
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
    protocol="application/pgp-signature"; boundary="6lCXDTVICvIQMz0h"
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.20 (2009-06-14)
X-Archives-Salt: 3af32d2f-c9dc-4551-9a99-4de33b63a1b0
X-Archives-Hash: a1002646c75eb0dac8145223d185303e

--6lCXDTVICvIQMz0h
Content-Type: multipart/mixed; boundary="PNpeiK4tTqhYOExY"
Content-Disposition: inline

--PNpeiK4tTqhYOExY
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Changes:
- This GLEP can stand independently of GLEP58.
- Add XZ to compression types list.
- Move cutoff to 32KiB. Provide size example w/ 32KiB+gzip.
- Split specification into generation and validation.

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail     : robbat2@gentoo.org
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85

--PNpeiK4tTqhYOExY
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline; filename="glep-0061.txt"

GLEP: 61
Title: Manifest2 compression
Version: $Revision: 1.6 $
Last-Modified: $Date: 2010/01/31 09:55:43 $
Author: Robin Hugh Johnson
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Requires: 44
Created: July 2008
Updated: October 2008, January 2010
Updates: 44
Post-History: December 2009, January 2010

Abstract
========
Deals with compression of large Manifest2 files.

Motivation
==========
With the introduction of MetaManifest, and full-tree Manifest coverage,
we are faced with the possibility of having very large Manifests.
Preliminary experiments with MetaManifest show that with just the
existing per-package Manifests, the full MetaManifest (top-level only,
no first-level sub directories), for a tree including metadata/,
exceeds 8MiB in size. Applying common compression can achieve a 50-60%
reduction in this size. Additionally, some of the larger
already-existing Manifests in the tree can also be reduced.

This GLEP is not mandatory for the tree-signing specification, but
instead helps to cut down the size impact of large Manifest2 files,
some of which are already present in the tree. As such, it is also able
to stand on its own.

Specification
=============

Creation of compressed Manifests:
---------------------------------
32KiB is suggested as an arbitrary cut-off point to start generating
compressed Manifest2 files.

The compression must only be applied during the creation of a tree
intended for end users. No Manifests stored in a VCS should be
compressed in the VCS. For the main gentoo-portage tree, this means
that the compressed Manifests should be generated using the CVS to
Rsync process. The Manifest compression process must ensure that
inconsistent compressed versions do not exist.

Validation of Manifests:
------------------------
When searching for a Manifest2 file, if the basename form does not
exist, the package manager should search in the same location using
common compressed suffixes, and use the compressed file in place of the
Manifest2. gzip, bzip2, lzma, xz should all be supported if available
on the given platform.

In the case that multiple versions exist, the package manager should
simply pick one; they should be identical, differing only in
compression.
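
As an illustration only, the generation and validation rules above could
be sketched in Python roughly as follows. The suffix names (.gz, .bz2,
.lzma, .xz), the function names, the choice to compress at or above the
cut-off, and the use of gzip on the generation side are all assumptions
made for this sketch; the GLEP text itself only speaks of "common
compressed suffixes" and does not describe how any package manager
implements the lookup.

```python
import bz2
import gzip
import lzma
import os

# Assumed suffix-to-opener mapping. The GLEP names the algorithms
# (gzip, bzip2, lzma, xz) but not the exact file suffixes.
OPENERS = {
    ".gz": gzip.open,
    ".bz2": bz2.open,
    ".lzma": lzma.open,  # lzma.open auto-detects .lzma and .xz on read
    ".xz": lzma.open,
}

CUTOFF = 32 * 1024  # the suggested 32KiB cut-off


def find_manifest(directory):
    """Return the Manifest path to use in `directory`, or None.

    The plain basename wins; otherwise the first compressed variant
    found is used, since all variants are expected to be identical
    apart from compression.
    """
    plain = os.path.join(directory, "Manifest")
    if os.path.exists(plain):
        return plain
    for suffix in OPENERS:
        candidate = plain + suffix
        if os.path.exists(candidate):
            return candidate
    return None


def read_manifest(path):
    """Read a plain or compressed Manifest and return its text."""
    suffix = os.path.splitext(path)[1]
    opener = OPENERS.get(suffix, open)
    with opener(path, "rt", encoding="utf-8") as handle:
        return handle.read()


def compress_if_large(directory, cutoff=CUTOFF):
    """Write Manifest.gz next to Manifest when it reaches the cut-off.

    Returns the path of the compressed file, or None if the Manifest
    is below the cut-off. The uncompressed file is left untouched.
    """
    plain = os.path.join(directory, "Manifest")
    if os.path.getsize(plain) < cutoff:
        return None
    target = plain + ".gz"
    with open(plain, "rb") as src, gzip.open(target, "wb") as dst:
        dst.write(src.read())
    return target
```

In this sketch compress_if_large() would be run by the tree generation
step, never inside the VCS checkout, and it keeps the uncompressed file
so that both forms can coexist. A consumer would call find_manifest()
and pass the result to read_manifest().
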

Example Results with a 32KiB cut-off, gzip algorithm
====================================================
As of 2010/01/30, the suggested cut-off would impact the following 21
existing Manifests, for a saving of nearly 900KiB::

     Size Path
    65788 app-doc/linux-gazette/Manifest
    75739 app-office/openoffice-bin/Manifest
    40534 app-text/texlive-core/Manifest
    41710 dev-texlive/texlive-bibtexextra/Manifest
    38197 dev-texlive/texlive-documentation-english/Manifest
   129610 dev-texlive/texlive-fontsextra/Manifest
    36022 dev-texlive/texlive-humanities/Manifest
   686118 dev-texlive/texlive-latexextra/Manifest
    43392 dev-texlive/texlive-latexrecommended/Manifest
    33375 dev-texlive/texlive-mathextra/Manifest
    39781 dev-texlive/texlive-pictures/Manifest
    69567 dev-texlive/texlive-pstricks/Manifest
    75460 dev-texlive/texlive-publishers/Manifest
    50879 dev-texlive/texlive-science/Manifest
    36711 kde-base/kde-l10n/Manifest
    36539 media-gfx/bootsplash-themes/Manifest
    33058 net-fs/autofs/Manifest
    39781 www-client/firefox-bin/Manifest
    48983 www-client/icecat/Manifest
    60213 www-client/mozilla-firefox/Manifest
    39065 x11-themes/gkrellm-themes/Manifest

Additionally, with the MetaManifest proposal, the following new
manifests would also be compressed, for a saving of nearly 4MiB::

     Size Path
    33442 app-admin/Manifest
    71073 app-dicts/Manifest
    35923 app-emacs/Manifest
    45808 app-misc/Manifest
    50169 app-text/Manifest
   112786 dev-java/Manifest
    65581 dev-libs/Manifest
    42619 dev-lisp/Manifest
   182163 dev-perl/Manifest
    96198 dev-python/Manifest
    58963 dev-ruby/Manifest
    59736 dev-util/Manifest
    58338 eclass/Manifest
    55749 kde-base/Manifest
   110064 licenses/Manifest
    35262 media-gfx/Manifest
    53995 media-libs/Manifest
    55607 media-plugins/Manifest
    71911 media-sound/Manifest
    34835 media-video/Manifest
  5747849 metadata/Manifest
    47452 net-analyzer/Manifest
    65989 net-misc/Manifest
   316787 profiles/Manifest
    67784 sys-apps/Manifest
    48971 x11-misc/Manifest
    41475 x11-plugins/Manifest

Backwards Compatibility
=======================
The package Manifests should also be maintained as ONLY uncompressed in
CVS.

For processing of all existing per-package Manifests, if compression is
used, it should be done in parallel to the existing Manifests, to
provide for a changeover period. Newer versions of Portage may later
choose to exclude all non-compressed Manifests during emerge --sync if
compressed versions are guaranteed to exist on the servers.

MetaManifests may come into existence as compressed from the start, as
they do not have any backwards compatibility issues.

As a side note, this breaks all manual interaction with Manifests such
as grep, and so should only be applied to large Manifest2 files, such
as the MetaManifest.

References
==========
.. [#GLEP44] Mauch, M. (2005) GLEP44 - Manifest2 format.
   http://www.gentoo.org/proj/en/glep/glep-0044.html

Copyright
=========
Copyright (c) 2008-2010 by Robin Hugh Johnson.

This material may be distributed only subject to the terms and
conditions set forth in the Open Publication License, v1.0.

vim: tw=72 ts=2 expandtab:

--PNpeiK4tTqhYOExY--

--6lCXDTVICvIQMz0h
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
Comment: Robbat2 @ Orbis-Terrarum Networks - The text below is a digital signature. If it doesn't make any sense to you, ignore it.

iEYEARECAAYFAktlVbgACgkQPpIsIjIzwizQeQCffekhJkDghNzrXxKpgqe5VDgO
PR0AoMHypTePxEZnnmN4NUeRYL0L3aK8
=S7iS
-----END PGP SIGNATURE-----

--6lCXDTVICvIQMz0h--