On 09/09/2010 07:24 PM, Matt Neimeyer wrote:
> My generic question is: When I'm using a pipe line series of commands
> do I use up more/less space than doing things in sequence?
>
> For example, I have a development Gentoo VM that has a hard drive that
> is too small... I wanted to move a database off of that onto another
> machine but when I tried the following I filled my partition and 'evil
> things' happened...
>
> mysqldump blah...
> gzip blah...
>
> In this specific case I added another virtual drive, mounted that and
> went on with life but I'm curious if I could have gotten away with the
> pipe line instead. Will doing something like this still use "twice"
> the space?
>
> mysqldump | gzip > file.sql.gz
>
> OR going back to my generic question if I pipe line like "type | sort
> | unique > output" does that only use 1x or 3x the disk space?
>
> Thanks in advance!
>
> Matt
>
> P.S. If the answer is "it depends" how do know what it depends on?
>

Everyone has already answered the disk space question. I just want to
add this:

It also saves you a lot of I/O bandwidth: only the compressed data gets
written to disk. Since I/O is the most common bottleneck, it is often
imperative to do as much as possible in a pipe.

If you're lucky, it can also mean that multiple programs run at the
same time, resulting in higher throughput. "Lucky" here means that the
producer and the consumer (left and right of the pipe) can work
simultaneously because the pipe buffer is big enough. You can see this
every time you (un)pack a tar.gz.

Bye,
Daniel

--
PGP key @ http://pgpkeys.pca.dfn.de/pks/lookup?search=0xBB9D4887&op=get
# gpg --recv-keys --keyserver hkp://subkeys.pgp.net 0xBB9D4887
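
P.S. A rough sketch of the I/O point, in case it helps. It assumes a POSIX shell with seq, gzip, wc and mktemp available. seq is only a stand-in for mysqldump, and the file names are made up for illustration. It compares how many bytes reach the disk when the dump is written first and compressed afterwards with how many the pipeline writes:

```shell
#!/bin/sh
# Hypothetical demo: seq stands in for mysqldump; all file names are invented.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT   # clean up the scratch directory on exit

# Sequential: the full uncompressed dump is written to disk first,
# then read back and compressed into a second file.
seq 1 200000 > "$workdir/dump.sql"
raw_bytes=$(wc -c < "$workdir/dump.sql")
gzip -c "$workdir/dump.sql" > "$workdir/sequential.sql.gz"

# Pipeline: the uncompressed data only passes through the pipe buffer,
# so the compressed stream is the only thing written to disk.
seq 1 200000 | gzip -c > "$workdir/piped.sql.gz"
piped_bytes=$(wc -c < "$workdir/piped.sql.gz")

echo "uncompressed intermediate file: $raw_bytes bytes"
echo "written by the pipeline:        $piped_bytes bytes"
```

The exact numbers depend on the data, but the second figure is what the pipeline costs in disk writes, and the first is the extra space and I/O the two-step version needs on top of that.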