From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 20606 invoked from network); 3 Nov 2004 20:31:09 +0000
Received: from smtp.gentoo.org (156.56.111.197) by lists.gentoo.org with AES256-SHA encrypted SMTP; 3 Nov 2004 20:31:09 +0000
Received: from lists.gentoo.org ([156.56.111.196] helo=parrot.gentoo.org) by smtp.gentoo.org with esmtp (Exim 4.41) id 1CPRmf-0005sn-Bc for arch-gentoo-portage-dev@lists.gentoo.org; Wed, 03 Nov 2004 20:31:09 +0000
Received: (qmail 19425 invoked by uid 89); 3 Nov 2004 20:31:07 +0000
Mailing-List: contact gentoo-portage-dev-help@gentoo.org; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
Reply-To: gentoo-portage-dev@lists.gentoo.org
X-BeenThere: gentoo-portage-dev@gentoo.org
Received: (qmail 13516 invoked from network); 3 Nov 2004 20:31:07 +0000
From: Ned Ludd
Reply-To: solar@gentoo.org
To: gentoo-portage-dev@lists.gentoo.org
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-BdD/zP+xbZtceJd3high"
Organization: Gentoo (hardened,security,infrastructure,embedded) Developer
Message-Id: <1099513862.7041.13029.camel@simple>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6
Date: Wed, 03 Nov 2004 15:31:03 -0500
Subject: [gentoo-portage-dev] /var/db/pkg, package - sizes and formats
X-Archives-Salt: 8a87d97c-6f1c-422b-aaed-dfcc690ae236
X-Archives-Hash: e3d7194516523a59315ae6d982dab70b

--=-BdD/zP+xbZtceJd3high
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

Does anybody know the point of /var/db/pkg/*/*/{[A-Z]*,*.ebuild} when /var/db/pkg/*/*/env*.bz2 contains all the same info minus the (CBUILD,CONTENTS,COUNTER,DEBUGBUILD)?

DEBUGBUILD is 0 bytes most of the time. Does it really need to exist in production environments?

These sparse files add up in size really quickly and I'd like to see portage shift into a direction where these things are planned out vs just added.
I don't see any reason why portage could not simply extract the variables it needs from the saved environment.bz2 when needed, or even preload them.

I work on embedded images and I have the option to dump 100% of /var/db/, thus having no way of knowing what's installed. But I'd like to keep track of what's installed and not be forced into using hacked-up methods which post-rm stuff from a target ROOT= in order to yield the desired result of keeping track of data without wasting valuable resources.

Using patch2 from http://bugs.gentoo.org/show_bug.cgi?id=67190
and a mini subdistro meta ebuild http://oc12.net/~solar/gensoekris-0.0.1.ebuild

INSTALL_MASK='*.a *.h /usr/include /usr/lib/*.o *.sample' ROOT=/tmp/root emerge gensoekris -K
du -hs .
12M .
cd var/db/pkg/
du -hs .
3.8M .
# About 1/3rd of this file system's entire size is extra pkg data.
# Everything else seems to be in the env, so let's kill some of the temp fluff and do some testing.
find . -name '[A-Z]*' -o -name '*.ebuild' | egrep -v 'CBUILD|CONTENTS|COUNTER' | xargs rm
du -hs .
1.1M .
# So it seems we could be down from 3.8M to 1.1M and still have all needed info for portage.

# Now let's take a look at the package and how this could work. I'll use iptables as an example.
# When all is said and done we want only an environment.bz2 left in var/db/pkg/net-firewall/iptables-1.2.11-r2/
ls -l $(portageq envvar PKGDIR)/All/iptables-1.2.11-r2.tbz2
-rw-r--r-- 1 root root 179724 Oct 10 00:53 /usr/portage//packages/i386-pc-linux-uclibc//All/iptables-1.2.11-r2.tbz2
# Extract the portage-made tarball.
mkdir root
tar jxf /usr/portage//packages/i386-pc-linux-uclibc//All/iptables-1.2.11-r2.tbz2 -C root/
du -hs root
660K root
# Repack the contents (new pkg format)
cd root
tar jcvf ../iptables-1.2.11-r2.tbz2 ./
cd ../net-firewall/iptables-1.2.11-r2/
bzip2 CONTENTS
bunzip2 environment.bz2
echo CBUILD=$(cat CBUILD) >> environment
echo COUNTER=$(cat COUNTER) >> environment
bzip2 environment
cat CONTENTS.bz2 >> environment.bz2
cd ../../
cat net-firewall/iptables-1.2.11-r2/environment.bz2 >> iptables-1.2.11-r2.tbz2
ls -l iptables-1.2.11-r2.tbz2
-rw-r--r-- 1 root root 174892 Nov 3 19:01 iptables-1.2.11-r2.tbz2
# As we can see, the package size has become smaller.
echo $((179724-174892))
4832
# Just for kicks, take that number, assume that about 100000 users have about 500 packages installed each, and see what we could save the users across the board:
export f=$(($(($((4832*500))*100000))/1024))
dd if=/dev/zero of=$f count=1 bs=$f ; ls -lh $f ; rm $f
-rw-r--r-- 1 root root 226M Nov 3 14:19 235937500
# (f is in KiB, so the 226M shown above stands for roughly 225 GiB in total.)
# Probably not really realistic, as 100k users don't have buildpkg enabled.

# Anyway, let's test the new package.tbz2.
mkdir root
tar jxf iptables-1.2.11-r2.tbz2 -C root ; echo $?
0
du -hs root
660K root
# Good, 660K matches the portage-built tarball that was extracted into root/ before.

# The du -b here could become some other tool which seeks to the second BZh9 header in the tarball.
# But to test that we could still get info out of the tarballs so portage could use/load it, I did:
tail -c $(du -b net-firewall/iptables-1.2.11-r2/environment.bz2 | cut -f1) iptables-1.2.11-r2.tbz2 | bzgrep ^ARCH=
# which returned
ARCH=x86
# Now let's see how big the directory is.
rm iptables-1.2.11-r2.tbz2
find . -name '[A-Z]*' | xargs rm
du -hs .
732K .
# Well, sweet: we just got a dir that was 3.8M down to 732K and don't appear to have lost anything.
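
For anyone who wants to try the append-and-read-back trick without touching a real package, here is a minimal sketch of it in a scratch directory. Everything in it is made up for illustration: the package name demo-1.0, its single file, and the three environment variables. It also records the metadata size with wc -c rather than du -b, which is my substitution and not part of the steps above. It assumes tar, bzip2 and mktemp are available.

```shell
#!/bin/sh
# Sketch only: append a bzip2-compressed metadata file to a .tbz2 and read it
# back from the tail. Names and contents are hypothetical stand-ins.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# 1. Build a tiny stand-in for a binary package.
mkdir -p image/usr/sbin
echo 'fake binary' > image/usr/sbin/demo
tar -cjf demo-1.0.tbz2 -C image .

# 2. Build a stand-in for the saved environment and compress it.
printf 'ARCH=x86\nCBUILD=i386-pc-linux-uclibc\nCOUNTER=42\n' > environment
bzip2 environment                       # leaves environment.bz2

# 3. Remember the size of the metadata stream, then append it to the package.
meta_size=$(($(wc -c < environment.bz2)))
cat environment.bz2 >> demo-1.0.tbz2

# 4. The package contents should still extract as before; tar stops at the
#    end-of-archive marker and ignores what follows.
mkdir out
tar -xjf demo-1.0.tbz2 -C out
extracted=$(cat out/usr/sbin/demo)

# 5. The metadata is the last $meta_size bytes of the file, itself a complete
#    bzip2 stream, so it can be decompressed on its own.
arch_line=$(tail -c "$meta_size" demo-1.0.tbz2 | bzip2 -dc | grep '^ARCH=')

echo "$extracted / $arch_line"
```

Note that the reader has to know the size of the trailing stream from somewhere; here it is simply kept in a shell variable. A real format would need to store that size, or locate the stream boundary some other way.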
# New total size
du -hs /tmp/root
8.2M /tmp/root
# which compresses down to about 3.6M nicely (not bad for 36 pkgs)

# Another box, which is my desktop, is
du -hs /var/db/pkg/ ; du -bhs /var/db/pkg/
83M /var/db/pkg/
29M /var/db/pkg/
# I bet most of yours are nearly as large or larger.
# Cost savings here could be a lot better for everybody if we moved in the direction I just demonstrated.

In short, this all could work if we simply compressed the file CONTENTS and appended it to the end of environment.bz2, then appended environment.bz2 to the end of the ${P}.tbz2.

Anybody see flaws/problems in the basic idea?
Would this structure be suitable as the next gen pkg format?

-- 
Ned Ludd
Gentoo (hardened,security,infrastructure,embedded) Developer

--=-BdD/zP+xbZtceJd3high
Content-Type: application/pgp-signature; name=signature.asc
Content-Description: This is a digitally signed message part

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQBBiUAG94CCfB4KcwwRAvAGAKC3l0CCOXzw0LdPbHiFD7JWCx68kwCgvRgk
RM0HwvM6TACKsipJKXxQaeE=
=BW4q
-----END PGP SIGNATURE-----

--=-BdD/zP+xbZtceJd3high--