From: Brian Harring
To: Gentoo-dev
Date: Wed, 11 Jun 2003 11:02:02 -0500
Organization: UW-Genetics
Message-id: <1055347321.22765.151.camel@tylendel.genetics.wisc.edu>
Subject: [gentoo-dev] proposed md5sum change

Hola all,

Straight to the point: I propose that instead of md5summing the compressed distfile, we md5sum the actual data, the tarball. There are a couple of reasons/benefits to this:

1) Users are currently tied to a specific compression on the tarball. Those who want to convert their distfiles to bzip2 rather than gzip (for space reasons) are a bit out of luck. Yes, they can attempt to update the md5sum digests or force the incorrect sums to be ignored, but that gets old *real* quick.

2) Say the tarball gets inflated for whatever reason. If the original tarball was compressed w/ say bzip2 0.90 and the user has bzip2 1.x, they're out of luck even if they recompress it: the bzip2 algorithm was tweaked for better compression after .90, resulting in a different md5sum than the original. Yet the distfile is still data-correct; it's just compressed slightly differently.

3) For anyone making a serious attempt at distfile diffs, the reconstruction process is seriously borked by the possibility that the result is data-correct but the compression has changed/been improved, resulting in a different md5sum.
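
To make the idea concrete, here is a rough sketch of checksumming the payload rather than the compressed file. It is only an illustration under assumptions: the function names (open_payload, payload_md5, file_md5) are made up for this example, the compression type is guessed from the file extension, and none of this is existing portage code.

```python
import bz2
import gzip
import hashlib

CHUNK = 64 * 1024  # read in blocks so large distfiles never sit fully in memory


def open_payload(path):
    """Open a distfile so that reads return the decompressed data.

    Assumption for this sketch: the compression is picked from the file
    extension; anything unrecognised is read as-is.
    """
    if path.endswith((".gz", ".tgz")):
        return gzip.open(path, "rb")
    if path.endswith(".bz2"):
        return bz2.open(path, "rb")
    return open(path, "rb")


def _md5_of_stream(stream):
    """Feed a binary stream through MD5 block by block."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(CHUNK), b""):
        digest.update(chunk)
    return digest.hexdigest()


def payload_md5(path):
    """MD5 of the decompressed data (the tarball itself)."""
    with open_payload(path) as stream:
        return _md5_of_stream(stream)


def file_md5(path):
    """MD5 of the file exactly as stored on disk, compression included."""
    with open(path, "rb") as stream:
        return _md5_of_stream(stream)
```

With something like this, foo.tar.gz and a recompressed foo.tar.bz2 would give the same payload_md5 even though their file_md5 values differ, so a digest recorded against the payload survives recompression.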

I do know JJW's deltup attempt ran smack dab into this recompression problem w/ the openoffice tarballs. I've also run into the problem, and I'd prefer not to use the deltup method of having both old bzip2 and current bzip2 installed.

In terms of performance of the md5summing, it would still likely be i/o limited; decompression would be done in memory, after all.

That said and done, I'm not after bludgeoning someone into implementing this. Assuming people don't have any major criticisms of it and it has more than a snowball's chance in hell of being used, I'm more than willing to code it myself.

Comments/Flames/Death Threats?
~Brian

--
gentoo-dev@gentoo.org mailing list