public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] ship app-arch/pbzip2 instead of app-arch/bzip2
From: Michael Mol @ 2012-09-26 20:30 UTC
  To: gentoo-dev

A few months ago, I filed bug 423651 to ask that bzip2 on the install
media be replaced with pbzip2. It was closed a short while later with
the explanation that doing so would involve changing what's kept in
@system, and that this had to be discussed here rather than in a bug
report.

Here's a detailed description of how pbzip2 operates, as described by
a friend of mine:

> pbzip2's compression routine splits the input into blocks (900,000 bytes by default),
> which it then feeds into the standard bzip2 compression routine. The outputs of the
> various calls to the bzip2 compression routine are then concatenated. The end result
> is the same as if you had first used the "split" command on the input, run individual
> bzip2 commands on the split pieces, then recombined the individual bz2 files using cat.
>
> The downside is that you get multiple file headers, footers, and byte-alignment
> padding. On top of that, bzip2 runs an RLE stage to fill the buffer it feeds to the
> BWT, the main part of the compression routine. If you happen to have a section with
> 1 MiB of the same byte, the pbzip2 front-end will split that into two blocks (at the
> default settings) and feed them to separate bzip2 compressors. bzip2 will then
> compress the first block down to a buffer of about 17 KiB before passing it on to be
> compressed further; the rest of the data would have fit within that same block had
> pbzip2 not split the input.
>
> As for decompression, pbzip2 can only really do parallel decompression of files that it
> created, since it searches for the bz2 file header in order to split the work among
> different workers. One reason for this is that the bz2 block header is not byte aligned.
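
To make the description above concrete, here is a minimal sketch that
imitates the split / compress / concatenate behaviour with Python's
standard bz2 module. It is not pbzip2's code: the function names, the
sequential loop, and the 1 MiB test input are illustrative assumptions.
It checks that the concatenated streams decompress back to the original
data, compares the size against a single bzip2 stream, and looks for
byte-aligned stream starts the way a parallel decompressor might.

```python
import bz2

BLOCK_SIZE = 900_000  # default split size, per the description above


def compress_in_chunks(data, chunk_size=BLOCK_SIZE):
    """Compress each chunk as an independent bz2 stream, then concatenate.

    Roughly: split the input, bzip2 each piece, cat the results together.
    The chunks are handled one after another here; because they are
    independent, a parallel tool could give each one to its own worker.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return b"".join(bz2.compress(chunk, 9) for chunk in chunks)


def stream_start_offsets(blob):
    """Return byte offsets that look like the start of a level-9 bz2 stream.

    Each stream begins, byte aligned, with "BZh9" followed by the 48-bit
    block magic 0x314159265359. Blocks *inside* a stream are not byte
    aligned, so this simple search only finds stream boundaries. A match
    inside compressed data is possible in principle, so treat the
    results as candidates.
    """
    signature = b"BZh9" + bytes.fromhex("314159265359")
    offsets = []
    pos = blob.find(signature)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(signature, pos + 1)
    return offsets


if __name__ == "__main__":
    data = b"\x00" * (1024 * 1024)   # 1 MiB of the same byte
    whole = bz2.compress(data, 9)    # one ordinary bzip2 stream
    split = compress_in_chunks(data) # two concatenated streams

    # Concatenated streams still decompress to the original input.
    assert bz2.decompress(split) == data

    print("single stream bytes:", len(whole))
    print("split streams bytes:", len(split))
    print("stream starts at:   ", stream_start_offsets(split))
```

On this input the split output carries two sets of headers and footers
where the single stream needs one, which is the overhead described in
the quoted text. Note that bz2.decompress() accepts multi-stream input
only from Python 3.3 onwards.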

I really don't know how to carry this discussion any further than
this; I'll answer any questions I can.

-- 
:wq



