Re: [gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?

public inbox for gentoo-user@lists.gentoo.org

From: "Boyd Stephen Smith Jr." <bss03@volumehost.net>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] how does a pipe work? Which process wait for which one, or they don't actually wait each other?
Date: Wed, 14 Jun 2006 11:38:53 -0500
Message-ID: <200606141138.57986.bss03@volumehost.net>
In-Reply-To: <1150290735.8524.12.camel@joe.realss>

On Wednesday 14 June 2006 08:12, 张韡武 <zhangweiwu@realss.com> wrote 
about '[gentoo-user] how does a pipe work? Which process wait for which 
one, or they don't actually wait each other?':
> How does a pipe actually work? I mean, when there is a pipe like this:
> $ appA | appB
> What happens if appA produces output while appB is still busy processing
> data and is not reading any input?
>
> possibility 1) as long as appA can get the resource (CPU?), it simply
> keeps outputting data, and this data is cached in memory, as long as
> there is enough memory, and is finally fed to appB when appB finishes
> its business and begins to accept more data;
>
> possibility 2) as soon as appB stops requesting data, appA is suspended
> and the resource goes to appB. appA is only given the resource (CPU?)
> when appB finishes its business and begins to accept more data;
>
> Which one is true? I know 1) and 2) usually make no difference to most
> users, which is why a detailed explanation of "1) or 2)" is so hard to
> find on Google.

Neither! Both!

The implementation of pipes varies from *NIX to *NIX.  The pipe itself is 
provided by the kernel (via pipe(2)); the shell only creates it and 
connects appA's stdout and appB's stdin to its two ends before starting 
both programs.  A shell could in principle 'decorate' that, say by 
copying the data through a process of its own, so behavior may even vary 
within the same *NIX.
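
For the curious, here is roughly what a shell does for "appA | appB" on a 
POSIX system: ask the kernel for a pipe, hand the write end to one child 
and the read end to the other, then close its own copies.  This is only a 
sketch in Python, not bash's actual code, and run_pipeline is a name I 
made up for illustration.

```python
import os
import subprocess
import sys


def run_pipeline(producer_argv, consumer_argv):
    """Run producer | consumer and return the consumer's stdout as bytes."""
    read_end, write_end = os.pipe()  # the kernel creates the FIFO here

    # The producer's stdout becomes the write end of the pipe.
    producer = subprocess.Popen(producer_argv, stdout=write_end)
    # The consumer's stdin becomes the read end of the pipe.
    consumer = subprocess.Popen(
        consumer_argv, stdin=read_end, stdout=subprocess.PIPE
    )

    # The parent must close its own copies, or the consumer never sees EOF.
    os.close(write_end)
    os.close(read_end)

    output, _ = consumer.communicate()
    producer.wait()
    return output


if __name__ == "__main__":
    upper = "import sys; sys.stdout.write(sys.stdin.read().upper())"
    result = run_pipeline(
        [sys.executable, "-c", "print('hello')"],
        [sys.executable, "-c", upper],
    )
    print(result)
```

Note that nothing in the parent copies data between the two children; 
once both are started, all buffering and blocking happens in the kernel.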

In any case, you can't depend on any particular behavior if you want your 
shell script to be portable.

In Linux/bash I believe it works like this: each pipe has a small FIFO 
buffer in kernel memory (one page, ~4 KiB, before Linux 2.6.11; 16 pages, 
typically 64 KiB, since).  Both processes are started and compete for CPU 
time in the standard way.  Either process may block when it performs 
standard, blocking I/O on the pipe: appA will block if the FIFO gets 
full; appB will block if the FIFO gets empty.
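
You can see the "FIFO gets full" case from userspace without reading any 
kernel source.  A minimal sketch, assuming a POSIX system and Python 3.5 
or later; pipe_capacity is just a name I picked.  It makes the write end 
non-blocking and writes until the kernel refuses, which is exactly the 
point where a blocking appA would be put to sleep.

```python
import os


def pipe_capacity():
    """Fill a fresh pipe that nobody reads; return how many bytes it took."""
    read_end, write_end = os.pipe()
    # With a blocking descriptor the loop below would hang once the pipe
    # is full; non-blocking mode turns that into BlockingIOError instead.
    os.set_blocking(write_end, False)
    chunk = b"x" * 1024
    total = 0
    try:
        while True:
            total += os.write(write_end, chunk)
    except BlockingIOError:
        pass  # FIFO is full: a blocking writer would sleep here
    finally:
        os.close(read_end)
        os.close(write_end)
    return total


if __name__ == "__main__":
    print("pipe accepted", pipe_capacity(), "bytes before filling up")
```

On a current Linux I would expect this to report 65536, and 4096 on old 
kernels, but the number is system-dependent and nothing portable should 
rely on it.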

If you really must know: Use the Source, Luke.

> In my case appA gets the data from another host that has a very short
> timeout setting, and appB is used to compress the data obtained from
> appA. The compression is very heavy, usually running at 30Kbps; the
> network is very fast, around 10Mbps. appB compresses the data chunk by
> chunk. If Linux actually works in mode 2), the network connection is
> dropped when the interval between two chunks of appB compressing data
> is longer than the network timeout setting. appA actually doesn't know
> how to restart the connection from where it was dropped, so
> understanding this difference matters to me.

This also depends a lot on the application. appA can use asynchronous I/O, 
provide a larger buffer (perhaps even a temporary file), and/or send 
keepalives through the network.  Also, the scheduler may preempt appB in 
the middle of compressing so that appA can write more data into the 
buffer.
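
As an illustration of the "provide a larger buffer" option, here is a 
sketch of how appA could keep reading from the network while its writes 
to the pipe are stalled.  Everything here (relay, read_chunk, 
write_chunk) is hypothetical naming on my part; in a real appA, 
read_chunk would wrap the socket and write_chunk would write to stdout.

```python
import queue
import threading


def relay(read_chunk, write_chunk, max_chunks=0):
    """Copy data from read_chunk() to write_chunk() via an in-memory queue.

    read_chunk() must return b"" at end of input.  max_chunks=0 means the
    queue is unbounded, so memory use grows if the writer falls behind.
    """
    buffer = queue.Queue(maxsize=max_chunks)

    def drain():
        while True:
            chunk = buffer.get()
            if chunk is None:  # sentinel: the reader has finished
                return
            write_chunk(chunk)  # may block on a full pipe; reader keeps going

    writer = threading.Thread(target=drain)
    writer.start()

    while True:
        chunk = read_chunk()
        if not chunk:
            break
        buffer.put(chunk)

    buffer.put(None)
    writer.join()


if __name__ == "__main__":
    source = iter([b"first ", b"second ", b"third", b""])
    received = []
    relay(lambda: next(source), received.append)
    print(b"".join(received))
```

The trade-off is the one from possibility 1) in your question: the data 
has to sit somewhere, so an unbounded queue swaps a dropped connection 
for memory use that grows for as long as appB lags behind.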

> I ran several experiments and my appA and appB both work fine, but I
> don't dare to share appA/B with others until I understand the
> mechanism.

With the speeds you mention, the timeout would have to be ~8s or less for 
the socket to be dropped.[1]  Once established, sockets are amazingly 
stable; the timeout for an established socket is usually more like 5-10 
minutes or even an hour.  Heck, I think OpenBSD 3.8 defaulted to a 1 DAY 
timeout before the OS reaped an established socket.

Also, you generally want to compress stuff before putting it on the wire, 
not after...

-- 
"If there's one thing we've established over the years,
it's that the vast majority of our users don't have the slightest
clue what's best for them in terms of package stability."
-- Gentoo Developer Ciaran McCreesh

[1] That's assuming a 15Kbps compression rate and the ability to send 
full-size 16KB IP packets (16KB = 128Kb, and 128Kb / 15Kbps ~= 8.5s).  
Most likely, ~1s would suffice, since packets are generally 1500B = 12Kb 
in size.
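
The arithmetic behind those estimates, as a small sketch.  The 15,000 
bit/s rate and both packet sizes are the assumptions stated in this 
footnote, not measurements, and seconds_per_packet is a name made up for 
the example.

```python
def seconds_per_packet(packet_bytes, compress_bits_per_sec):
    """Time the compressor needs to consume one packet's worth of data."""
    return packet_bytes * 8 / compress_bits_per_sec


if __name__ == "__main__":
    rate = 15_000  # assumed compression throughput in bits per second
    for size in (16 * 1024, 1500):  # assumed packet sizes in bytes
        print(size, "bytes:", round(seconds_per_packet(size, rate), 2), "s")
```

That works out to a bit under 9 seconds for a 16KB packet and 0.8 seconds 
for a 1500-byte one, which is where the ~8s and ~1s figures come from.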


