Date: Sat, 1 Nov 2014 18:59:34 +0100
From: meino.cramer@gmx.de
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Re: OT Best way to compress files with digits
Message-ID: <20141101175934.GB3860@solfire>
References: <20141031153659.GA13217@solfire>
List-Id: Gentoo Linux mail
Reply-to: gentoo-user@lists.gentoo.org
User-Agent: mutt-ng/devel-r804 (Linux)

James [14-11-01 18:16]:
> gmx.de> writes:
>
>
> > I have a lot of files with digits of PI. The digits
> > are the characters of 0-9. Currently they are ZIPped,
> > which I think is not the best way to do that.
>
> Hello Meino,
>
> It's a bit of effort, but the world's recognized authority
> on algorithms is Don Knuth. [1] He's old now, but his
> pioneering attempt at categorizing most algorithms:
> "The art of computer programming" and his MMIX algorithm
> implementations (kinda like assembler) are certainly
> part of many first-step research efforts on algorithms
> and their implementations.
>
> It's not a cookbook; more of a scholarly (high_brow) reference,
> just to supplement all the good postings by your peers on gentoo user.
>
> Alan may loan you his copy?
> (ha ha ha)?
>
>
>
> hth,
> James
>
> [1] http://www-cs-faculty.stanford.edu/~uno/
>

Hello James,

Don Knuth ... oh YES! :)
I have been using TeX for a long time and prefer it over anything
else (ok... for ASCII I use vim... ;). And besides his computing
wisdom I also like his kind of humor a lot... for example this one:
https://www.youtube.com/watch?v=eKaI78K_rgA&list=PLUu0XRts4lK6Ri7-xaCNYqTHx7We95Rk8&index=10

But my initial question was aimed more at "practical computing"
than at groundbreaking, fundamental research topics. More like
"what tool to pick?"...

I did some compression tests myself and currently I have this:

From http://piworld.calico.jp/ (http://piworld.calico.jp/estart.html)
I got zipped packages of 1000 million places of PI each (~57MB
for one ZIP).

I unpacked the first package and recompressed it with different
methods of 7zip, gzip and bzip2. For gzip and bzip2 I used the
highest compression mode (-9). For 7zip: when a file's name matches
/.*ultra.*/, I used the highest compression mode (-mx=9); otherwise
I only set the compression method and left the rest untouched
(defaults).
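
A comparison like this can be roughly repeated without the command-line tools. The following is only a sketch, under the assumption that Python's standard library modules are an acceptable stand-in: `lzma` there writes the .xz container rather than .7z, so the numbers will not match 7zip byte for byte, and PPMd is not covered at all. The names `compressed_sizes` and `report` are made up for this example. The sizes measured in the original test are listed after the sketch.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of `data` under several stdlib compressors."""
    return {
        "raw": len(data),
        # Same algorithm family as "gzip -9" (DEFLATE, highest level).
        "gzip -9": len(gzip.compress(data, compresslevel=9)),
        # Same algorithm family as "bzip2 -9" (900k block size).
        "bzip2 -9": len(bz2.compress(data, compresslevel=9)),
        # LZMA2 in an .xz container; NOT identical to 7zip's .7z output.
        "xz preset 9": len(lzma.compress(data, preset=9)),
    }


def report(path: str) -> None:
    """Print an ls-like size listing for the file at `path`.

    The whole file is read into memory, and preset 9 needs several
    hundred MB on top, so this is only meant for a quick experiment.
    """
    with open(path, "rb") as fh:
        sizes = compressed_sizes(fh.read())
    for name, size in sizes.items():
        print(f"{size:>12} {name}")


# Hypothetical usage:
#   report("pi-0001.txt")
```
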
119888896 2014-10-31 16:44 pi-0001.txt
 57105419 2014-10-31 16:47 pi-0001.txt.gz
 52632832 2014-10-31 16:48 pi-0001.txt.bz2
 52045827 2014-10-31 16:54 pi-0001.txt.ppmd.7z
 57110291 2014-10-31 17:23 pi-0001.zip
 51766683 2014-10-31 17:26 pi-0001.txt.lzma.7z
 51668838 2014-10-31 17:34 pi-0001.txt.lzma.ultra.7z
 52862115 2014-10-31 17:36 pi-0001.txt.ppmd.ultra.7z
 51668838 2014-10-31 17:39 pi-0001.txt.ultra.7z

7zip's lzma wins here; it is also 7zip's default method. The ultra
mode I set by hand.

From other sites which offer PI for download I know of methods
which store the ASCII digits in binary form and compress them
afterwards. It would be interesting to see whether this creates a
more "handy" input from 7zip's point of view...

Ah! By the way... I was astonished to read that the digits of PI
are called random on the one hand, while on the other hand there is
a formula [1] to calculate a certain (hexadecimal) digit of PI
without calculating the previous digits...
Calculated random? Are nature constants the purest form of PRNGs ??? ;)
(Quantum physics is everywhere... ;;))

[1]: http://en.wikipedia.org/wiki/Bailey%E2%80%93Borwein%E2%80%93Plouffe_formula

Best regards,
Meino
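
As a footnote to the "store the ASCII digits in binary" idea above: the following is a minimal sketch of one possible packing, not the format any of the download sites actually use. It assumes the input consists of the characters 0-9 only (no spaces or line breaks) and packs 19 digits into each 8-byte word, which works because 10**19 < 2**64. The names `pack_digits` and `unpack_digits` and the 8-byte length header are made up for this example.

```python
import struct

CHUNK = 19  # 10**19 < 2**64, so 19 decimal digits fit into one 8-byte word


def pack_digits(digits: str) -> bytes:
    """Pack a string of ASCII digits into 8-byte big-endian words."""
    if digits and not (digits.isascii() and digits.isdigit()):
        raise ValueError("input must contain only the characters 0-9")
    # Header: total digit count, needed to restore leading zeros at the end.
    out = bytearray(struct.pack(">Q", len(digits)))
    for i in range(0, len(digits), CHUNK):
        out += struct.pack(">Q", int(digits[i:i + CHUNK]))
    return bytes(out)


def unpack_digits(blob: bytes) -> str:
    """Inverse of pack_digits()."""
    (remaining,) = struct.unpack_from(">Q", blob, 0)
    offset = 8
    parts = []
    while remaining > 0:
        width = min(CHUNK, remaining)
        (value,) = struct.unpack_from(">Q", blob, offset)
        # zfill restores leading zeros that int() dropped while packing.
        parts.append(str(value).zfill(width))
        offset += 8
        remaining -= width
    return "".join(parts)
```

This packing spends 64/19 (about 3.37) bits per digit, against a theoretical minimum of log2(10) (about 3.32) bits per digit for uniformly random decimal digits, so the packed data is already about 42% of the ASCII size. The best result in the listing above, 51668838 of 119888896 bytes, is about 43%, although the two figures are not directly comparable if the text file also contains formatting characters. If the digits really behave like random data, a general-purpose compressor has little left to gain on top of such a packing.
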