public inbox for gentoo-dev@lists.gentoo.org
 help / color / mirror / Atom feed
From: "Robin H. Johnson" <robbat2@gentoo.org>
To: gentoo-dev@lists.gentoo.org
Subject: [gentoo-dev] GLEP61 - Manifest2 compression
Date: Sun, 31 Jan 2010 10:04:40 +0000	[thread overview]
Message-ID: <robbat2-20100131T100201-562205149Z@orbis-terrarum.net> (raw)
In-Reply-To: <robbat2-20100131T091044-262651070Z@orbis-terrarum.net>


[-- Attachment #1.1: Type: text/plain, Size: 392 bytes --]

Changes:
- This GLEP can stand independently of GLEP58.
- Add XZ to compression types list.
- Move cutoff to 32KiB. Provide size example w/ 32KiB+gzip.
- Split specification into generation and validation.

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail     : robbat2@gentoo.org
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85

[-- Attachment #1.2: glep-0061.txt --]
[-- Type: text/plain, Size: 5653 bytes --]

GLEP: 61
Title: Manifest2 compression
Version: $Revision: 1.6 $
Last-Modified: $Date: 2010/01/31 09:55:43 $
Author: Robin Hugh Johnson <robbat2@gentoo.org> 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Requires: 44
Created: July 2008
Updated: October 2008, January 2010
Updates: 44
Post-History: December 2009, January 2010

Abstract
========
Deals with compression of large Manifest2 files.

Motivation
==========
With the introduction of MetaManifest, and full-tree Manifest coverage,
we are faced with the possibility of having very large Manifests.

Preliminary experiments with MetaManifest, show that with just the
existing per-package Manifests, the full MetaManifest (top-level only,
no first-level sub directories), for a tree including metadata/, exceeds
8MiB in size. Applying common compression can achieve a 50-60% reduction
in this size.

Additionally, some of the larger already-existing Manifests in the tree
can also be reduced.

This GLEP is not mandatory for the tree-signing specification, but
instead helps to cut down the size impact of large Manifest2 files, some
of which are already present in the tree. As such, it is also able to
stand on it's own.

Specification
=============
Creation of compressed Manifests:
---------------------------------
32KiB is suggested as a arbitrary cut-off point to start generating
compressed Manifest2 files.

The compression must only applied during the creation of a tree intended
for end users. No Manifests stored in a VCS should be compressed in the
VCS. For the main gentoo-portage tree, this means that the compressed
Manifests should be generated using the CVS to Rsync process.

The Manifest compression process is required to ensure that inconsistent
compressed versions do not exist.

Validation of Manifests:
------------------------
When searching for a Manifest2 file, if the basename form does not
exist, the package manager should search in the same location using
common compressed suffixes, and use the compressed file in place of the
Manifest2.

gzip, bzip2, lzma, xz should all be supported if available on the given
platform. In the case that multiple versions exist, the package manager
should simply pick one - they should be identical, differing only in
compression.

Example Results with a 32KiB cut-off, gzip algorithm
====================================================
As of 2010/01/30, the suggested cut-off would impact the following 21 existing
Manifests, for a saving of nearly 900KiB::

  Size   Path
   65788 app-doc/linux-gazette/Manifest
   75739 app-office/openoffice-bin/Manifest
   40534 app-text/texlive-core/Manifest
   41710 dev-texlive/texlive-bibtexextra/Manifest
   38197 dev-texlive/texlive-documentation-english/Manifest
  129610 dev-texlive/texlive-fontsextra/Manifest
   36022 dev-texlive/texlive-humanities/Manifest
  686118 dev-texlive/texlive-latexextra/Manifest
   43392 dev-texlive/texlive-latexrecommended/Manifest
   33375 dev-texlive/texlive-mathextra/Manifest
   39781 dev-texlive/texlive-pictures/Manifest
   69567 dev-texlive/texlive-pstricks/Manifest
   75460 dev-texlive/texlive-publishers/Manifest
   50879 dev-texlive/texlive-science/Manifest
   36711 kde-base/kde-l10n/Manifest
   36539 media-gfx/bootsplash-themes/Manifest
   33058 net-fs/autofs/Manifest
   39781 www-client/firefox-bin/Manifest
   48983 www-client/icecat/Manifest
   60213 www-client/mozilla-firefox/Manifest
   39065 x11-themes/gkrellm-themes/Manifest
  

Additionally, with the MetaManifest proposal, the following new manifests would
also be compressed, for a saving of nearly 4MiB::

  Size    Path
    33442 app-admin/Manifest
    71073 app-dicts/Manifest
    35923 app-emacs/Manifest
    45808 app-misc/Manifest
    50169 app-text/Manifest
   112786 dev-java/Manifest
    65581 dev-libs/Manifest
    42619 dev-lisp/Manifest
   182163 dev-perl/Manifest
    96198 dev-python/Manifest
    58963 dev-ruby/Manifest
    59736 dev-util/Manifest
    58338 eclass/Manifest
    55749 kde-base/Manifest
   110064 licenses/Manifest
    35262 media-gfx/Manifest
    53995 media-libs/Manifest
    55607 media-plugins/Manifest
    71911 media-sound/Manifest
    34835 media-video/Manifest
  5747849 metadata/Manifest
    47452 net-analyzer/Manifest
    65989 net-misc/Manifest
   316787 profiles/Manifest
    67784 sys-apps/Manifest
    48971 x11-misc/Manifest
    41475 x11-plugins/Manifest
  

Backwards Compatibility
=======================
The package Manifests should also be maintained as ONLY uncompressed in
CVS.

For processing of all existing per-package Manifests, if compression is
used, it should be done in parallel to the existing Manifests, to
provide for a changeover period. Newer versions of Portage may later
choose to exclude all non-compressed Manifests during emerge --sync if
compressed versions are guaranteed to exist on the servers.

MetaManifests may come into existence as compressed from the start, as
do not have an backwards compatibility issues.

As a side note, this breaks all manual interaction with Manifests
such as grep, and so should only be applied to large Manifest2 files,
such as the MetaManifest. 

References
==========
.. [#GLEP44] Mauch, M. (2005) GLEP44 - Manifest2 format.
  http://www.gentoo.org/proj/en/glep/glep-0044.html	

Copyright
=========
Copyright (c) 2008-2010 by Robin Hugh Johnson. This material may be
distributed only subject to the terms and conditions set forth in the
Open Publication License, v1.0.

vim: tw=72 ts=2 expandtab:

[-- Attachment #2: Type: application/pgp-signature, Size: 330 bytes --]

  parent reply	other threads:[~2010-01-31 10:05 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-31  9:20 [gentoo-dev] Tree-signing GLEPs update Robin H. Johnson
2010-01-31  9:48 ` [gentoo-dev] GLEP58 - MetaManifest Robin H. Johnson
2010-02-02  6:27   ` Denis Dupeyron
2010-02-02  7:35     ` Robin H. Johnson
2010-01-31  9:57 ` [gentoo-dev] GLEP59 - Manifest2 hashes Robin H. Johnson
2010-02-01  5:05   ` Robin H. Johnson
2010-02-01  8:23   ` Doug Goldstein
2010-02-02  6:06     ` Denis Dupeyron
2010-02-04  2:57       ` Robin H. Johnson
2010-02-02  6:09     ` Robin H. Johnson
2010-01-31 10:01 ` [gentoo-dev] GLEP60 - Manifest2 filetypes Robin H. Johnson
2010-01-31 10:04 ` Robin H. Johnson [this message]
2010-02-08  1:02   ` [gentoo-dev] GLEP61 - Manifest2 compression Brian Harring
2010-02-08  5:23     ` Robin H. Johnson
2010-02-08  5:50       ` Brian Harring
2010-01-31 10:11 ` [gentoo-dev] Tree-signing GLEPS review notes Robin H. Johnson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=robbat2-20100131T100201-562205149Z@orbis-terrarum.net \
    --to=robbat2@gentoo.org \
    --cc=gentoo-dev@lists.gentoo.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox