From: "Xavier Neys" <neysx@lark.gentoo.org>
To: gentoo-doc-cvs@lists.gentoo.org
Subject: [gentoo-doc-cvs] cvs commit: l-posix1.xml
Date: Wed,  3 Aug 2005 10:36:19 +0000
Message-ID: <200508031036.j73Aa0vD021343@robin.gentoo.org>


neysx       05/08/03 10:36:19

  Added:       xml/htdocs/doc/en/articles l-posix1.xml l-posix2.xml
                        l-posix3.xml
  Log:
  #100538 xmlified posix articles

Revision  Changes    Path
1.1                  xml/htdocs/doc/en/articles/l-posix1.xml

file : http://www.gentoo.org/cgi-bin/viewcvs.cgi/xml/htdocs/doc/en/articles/l-posix1.xml?rev=1.1&content-type=text/x-cvsweb-markup&cvsroot=gentoo
plain: http://www.gentoo.org/cgi-bin/viewcvs.cgi/xml/htdocs/doc/en/articles/l-posix1.xml?rev=1.1&content-type=text/plain&cvsroot=gentoo

Index: l-posix1.xml
===================================================================
<?xml version='1.0' encoding="UTF-8"?>
<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/articles/l-posix1.xml,v 1.1 2005/08/03 10:36:19 neysx Exp $ -->
<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">

<guide link="/doc/en/articles/l-posix1.xml">
<title>POSIX threads explained, part 1</title>

<author title="Author">
  <mail link="drobbins@gentoo.org">Daniel Robbins</mail>
</author>
<author title="Editor">
  <mail link="rane@gentoo.pl">Łukasz Damentko</mail>
</author>

<abstract>
POSIX (Portable Operating System Interface) threads are a great way to increase
the responsiveness and performance of your code. In this series, Daniel Robbins
shows you exactly how to use threads in your code. A lot of behind-the-scenes
details are covered, so by the end of this series you'll really be ready to
create your own multithreaded programs.
</abstract>

<!-- The original version of this article was published on IBM developerWorks,
and is property of Westtech Information Services. This document is an updated
version of the original article, and contains various improvements made by the
Gentoo Linux Documentation team -->

<version>1.0</version>
<date>2005-07-27</date>

<chapter>
<title>A simple and nimble tool for memory sharing</title>
<section>
<title>Threads are fun</title>
<body>

<note>
The original version of this article was published on IBM developerWorks, and is
property of Westtech Information Services. This document is an updated version
of the original article, and contains various improvements made by the Gentoo
Linux Documentation team.
</note>

<p>
Knowing how to properly use threads should be part of every good programmer's
repertoire. Threads are similar to processes. Threads, like processes, are
time-sliced by the kernel. On uniprocessor systems the kernel uses time slicing
to simulate simultaneous execution of threads in much the same way it uses time
slicing with processes. And, on multiprocessor systems, threads are actually
able to run simultaneously, just like two or more processes can.
</p>

<p>
So why is multithreading preferable to multiple independent processes for most
cooperative tasks? Well, threads share the same memory space. Independent
threads can access the same variables in memory, so all of your program's
threads can read and write any global variables you declare. If you've ever
programmed any non-trivial code that uses fork(), you'll recognize the
importance of this tool. Why? While fork() allows you to create multiple
processes, it also creates the following communication problem: how to get
multiple processes, each with its own independent memory space, to
communicate. There is no one simple answer to this problem. While there are many
different kinds of local IPC (inter-process communication), they all suffer from
two important drawbacks:
</p>

<ul>
  <li>
    They impose some form of additional kernel overhead, lowering performance.
  </li>
  <li>
    In almost all situations, IPC is not a "natural" extension of your code. It
    often dramatically increases the complexity of your program.
  </li>
</ul>

<p>
Double bummer: overhead and complication aren't good things. If you've ever had
to make massive modifications to one of your programs so that it supports IPC,
you'll really appreciate the simple memory-sharing approach that threads
provide. POSIX threads don't need to make expensive and complicated
long-distance calls because all our threads happen to live in the same house.
With a little synchronization, all your threads can read and modify your
program's existing data structures. You don't have to pump the data through a
file descriptor or squeeze it into a tight, shared memory space. For this reason
alone you should consider the one process/multithread model rather than the
multiprocess/single-thread model.
</p>

</body>
</section>
<section>
<title>Threads are nimble</title>
<body>

<p>
But there's more. Threads also happen to be extremely nimble. Compared to a
standard fork(), they carry a lot less overhead. The kernel does not need to
make a new independent copy of the process memory space, file descriptors, etc.
That saves a lot of CPU time, making thread creation ten to a hundred times
faster than new process creation. Because of this, you can use a whole bunch of
threads and not worry too much about the CPU and memory overhead incurred. You
don't have a big CPU hit the way you do with fork(). This means you can
generally create threads whenever it makes sense in your program.
</p>

<p>
Of course, just like processes, threads will take advantage of multiple CPUs.
This is a really great feature if your software is designed to be used on a
multiprocessor machine (if the software is open source, it will probably end up
running on quite a few of these). The performance of certain kinds of threaded
programs (CPU-intensive ones in particular) will scale almost linearly with the
number of processors in the system. If you're writing a program that is very
CPU-intensive, you'll definitely want to find ways to use multiple threads in
your code. Once you're adept at writing threaded code, you'll also be able to
approach coding challenges in new and creative ways without a lot of IPC red
tape and miscellaneous mumbo-jumbo. All these benefits work synergistically to
make multithreaded programming fun, fast, and flexible.
</p>

</body>
</section>
<section>
<title>I think I'm a clone now</title>
<body>

<p>
If you've been in the Linux programming world for a while, you may know about
the __clone() system call. __clone() is similar to fork(), but allows you to do
lots of things that threads can do. For example, with __clone() you can
selectively share parts of your parent's execution context (memory space, file
descriptors, etc.) with a new child process. That's a good thing. But there is
also a not-so-good thing about __clone(). As the __clone() man page states:
</p>

<pre caption="__clone() man page excerpt">
    "The __clone call is Linux-specific and should not be used in programs
    intended to be portable. For programming threaded applications (multiple
    threads of control in the same memory space), it is better to use a library
    implementing the POSIX 1003.1c thread API, such as the Linux-Threads
    library. See pthread_create(3thr)."
</pre>

<p>
So, while __clone() offers many of the benefits of threads, it is not portable.
That doesn't mean you shouldn't use it in your code. But you should weigh this
fact when you are considering using __clone() in your software. Fortunately, as
the __clone() man page states, there's a better alternative: POSIX threads. When
you want to write portable multithreaded code, code that works under Solaris,
FreeBSD, Linux, and more, POSIX threads are the way to go.
</p>

</body>
</section>
<section>
<title>Beginning threads</title>
<body>

<p>
Here's a simple example program that uses POSIX threads:
</p>

<pre caption="Sample program using POSIX threads">
#include &lt;pthread.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;unistd.h&gt;
#include &lt;stdio.h&gt;

void *thread_function(void *arg) {
  int i;
  for ( i=0; i&lt;20; i++ ) {
    printf("Thread says hi!\n");
    sleep(1);
  }
  return NULL;
}

int main(void) {

  pthread_t mythread;
  
  if ( pthread_create( &amp;mythread, NULL, thread_function, NULL) ) {
    printf("error creating thread.");
    abort();
  }

  if ( pthread_join ( mythread, NULL ) ) {
    printf("error joining thread.");
    abort();
  }

  exit(0);

}
</pre>

<p>
To compile this program, simply save it as thread1.c and type:
</p>
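<pre caption="Compiling and running thread1.c">
<comment>(A typical invocation; the exact compiler flags may vary by system)</comment>
$ <i>gcc thread1.c -o thread1 -lpthread</i>
$ <i>./thread1</i>
</pre>

<p>
The important part is linking against the POSIX threads library: without the
-lpthread switch (or the equivalent -pthread option), the pthread_* symbols
will not resolve at link time.
</p>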




1.1                  xml/htdocs/doc/en/articles/l-posix2.xml

file : http://www.gentoo.org/cgi-bin/viewcvs.cgi/xml/htdocs/doc/en/articles/l-posix2.xml?rev=1.1&content-type=text/x-cvsweb-markup&cvsroot=gentoo
plain: http://www.gentoo.org/cgi-bin/viewcvs.cgi/xml/htdocs/doc/en/articles/l-posix2.xml?rev=1.1&content-type=text/plain&cvsroot=gentoo

Index: l-posix2.xml
===================================================================
<?xml version='1.0' encoding="UTF-8"?>
<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/articles/l-posix2.xml,v 1.1 2005/08/03 10:36:19 neysx Exp $ -->
<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">

<guide link="/doc/en/articles/l-posix2.xml">
<title>POSIX threads explained, part 2</title>

<author title="Author">
  <mail link="drobbins@gentoo.org">Daniel Robbins</mail>
</author>
<author title="Editor">
  <mail link="rane@gentoo.pl">Łukasz Damentko</mail>
</author>

<abstract>
POSIX threads are a great way to increase the responsiveness and performance of
your code. In this second article of a three-part series, Daniel Robbins shows
you how to protect the integrity of shared data structures in your threaded code
by using nifty little things called mutexes.
</abstract>

<!-- The original version of this article was published on IBM developerWorks,
and is property of Westtech Information Services. This document is an updated
version of the original article, and contains various improvements made by the
Gentoo Linux Documentation team -->

<version>1.0</version>
<date>2005-07-27</date>

<chapter>
<title>The little things called mutexes</title>
<section id="thread3c">
<title>Mutex me!</title>
<body>

<note>
The original version of this article was published on IBM developerWorks, and is
property of Westtech Information Services. This document is an updated version
of the original article, and contains various improvements made by the Gentoo
Linux Documentation team.
</note>

<p>
In my <uri link="/doc/en/articles/l-posix1.xml">previous article</uri>, I talked
about threaded code that did unusual and unexpected things. Two threads each
incremented a global variable twenty times.  The variable was supposed to end up
with a value of 40, but ended up with a value of 21 instead. What happened? The
problem occurred because one thread repeatedly "cancelled out" the increment
performed by the other thread. Let's take a look at some corrected code that
uses a <b>mutex</b> to solve the problem:
</p>

<pre caption="thread3.c">
#include &lt;pthread.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;unistd.h&gt;
#include &lt;stdio.h&gt;

int myglobal;
pthread_mutex_t mymutex=PTHREAD_MUTEX_INITIALIZER;

void *thread_function(void *arg) {
  int i,j;
  for ( i=0; i&lt;20; i++ ) {
    pthread_mutex_lock(&amp;mymutex);
    j=myglobal;
    j=j+1;
    printf(".");
    fflush(stdout);
    sleep(1);
    myglobal=j;
    pthread_mutex_unlock(&amp;mymutex);
  }
  return NULL;
}

int main(void) {

  pthread_t mythread;
  int i;

  if ( pthread_create( &amp;mythread, NULL, thread_function, NULL) ) {
    printf("error creating thread.");
    abort();
  }

  for ( i=0; i&lt;20; i++) {
    pthread_mutex_lock(&amp;mymutex);
    myglobal=myglobal+1;
    pthread_mutex_unlock(&amp;mymutex);
    printf("o");
    fflush(stdout);
    sleep(1);
  }

  if ( pthread_join ( mythread, NULL ) ) {
    printf("error joining thread.");
    abort();
  }

  printf("\nmyglobal equals %d\n",myglobal);

  exit(0);

}
</pre>

</body>
</section>
<section>
<title>Comprehension time</title>
<body>

<p>
If you compare this code to the version in my <uri
link="/doc/en/articles/l-posix1.xml">previous article</uri>, you'll notice the
addition of the calls pthread_mutex_lock() and pthread_mutex_unlock(). These
calls perform a much-needed function in threaded programs. They provide a means
of mutual exclusion (hence the name). No two threads can have the same mutex
locked at the same time.
</p>

<p>
This is how mutexes work. If thread "a" tries to lock a mutex while thread "b"
has the same mutex locked, thread "a" goes to sleep. As soon as thread "b"
releases the mutex (via a pthread_mutex_unlock() call), thread "a" will be able
to lock the mutex (in other words, it will return from the pthread_mutex_lock()
call with the mutex locked). Likewise, if thread "c" tries to lock the mutex
while thread "a" is holding it, thread "c" will also be put to sleep
temporarily. All threads that go to sleep from calling pthread_mutex_lock() on
an already-locked mutex will "queue up" for access to that mutex.
</p>

<p>
pthread_mutex_lock() and pthread_mutex_unlock() are normally used to protect
data structures. That is, you make sure that only one thread at a time can
access a certain data structure by locking and unlocking it. As you may have
guessed, if a thread tries to lock a mutex that isn't already locked, the POSIX
threads library will grant the lock immediately, without putting the thread to
sleep at all.
</p>

<figure link="/images/docs/l-posix-mutex.gif" caption="For your enjoyment, four
znurts re-enact a scene from recent pthread_mutex_lock() calls"/>

<p>
The thread in this image that has the mutex locked gets to access the complex
data structure without worrying about having other threads mess with it at the
same time. The data structure is in effect "frozen" until the mutex is unlocked.
It's as if the pthread_mutex_lock() and pthread_mutex_unlock() calls are "under
construction" signs that surround a particular piece of shared data that's being
modified or read. The calls act as a warning to other threads to go to sleep and
wait their turn for the mutex lock. Of course, this is only true if you surround
every read and write to a particular data structure with calls to
pthread_mutex_lock() and pthread_mutex_unlock().
</p>
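<p>
As a concrete illustration (this is a generic sketch, not part of the example
program, and the account structure is purely made up), protecting a small
shared structure looks like this:
</p>

<pre caption="Guarding a shared structure with one mutex (sketch)">
#include &lt;pthread.h&gt;

/* Two fields that must always stay consistent with each other */
struct account {
  double balance;
  int transactions;
};

struct account shared_account = { 0.0, 0 };
pthread_mutex_t account_mutex = PTHREAD_MUTEX_INITIALIZER;

void deposit(double amount) {
  pthread_mutex_lock(&amp;account_mutex);
  /* While we hold the mutex, no other thread can observe (or modify) a
  ** half-updated account */
  shared_account.balance = shared_account.balance + amount;
  shared_account.transactions = shared_account.transactions + 1;
  pthread_mutex_unlock(&amp;account_mutex);
}
</pre>

<p>
Every other read and write of shared_account, anywhere in the program, has to
take account_mutex in the same way; a single unguarded access is enough to
reintroduce the original problem.
</p>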

</body>
</section>
<section>
<title>Why mutex at all?</title>
<body>

<p>
Sounds interesting, but why exactly do we want to put our threads to sleep?
After all, isn't the main advantage of threads their ability to work
independently and in many cases simultaneously? Yes, that's completely true.
However, every non-trivial threads program will require at least some use of
mutexes. Let's refer to our example program again to understand why.
</p>

<p>
If you take a look at thread_function(), you'll notice that the mutex is locked
at the beginning of the loop and released at the very end. In this example
program, mymutex is used to protect the value of myglobal. If you look carefully
at thread_function() you'll notice that the increment code copies myglobal to a
local variable, increments the local variable, sleeps for one second, and only
then copies the local value back to myglobal. Without the mutex,
thread_function() will overwrite the incremented value when it wakes up if our
main thread increments myglobal during thread_function()'s one-second nap. Using
a mutex ensures that this doesn't happen. (In case you're wondering, I added the
one-second delay to trigger a flawed result. There is no real reason for
thread_function() to go to sleep for one second before writing the local value
back to myglobal.) Our new program using a mutex produces the desired result:
</p>

<pre caption="Output of program using mutex">
$ <i>./thread3</i>
o..o..o.o..o..o.o.o.o.o..o..o..o.ooooooo
myglobal equals 40
</pre>

<p>
To further explore this extremely important concept, let's take a look at the
increment code from our program:
</p>

<pre caption="Incremented code">
thread_function() increment code: 
   j=myglobal;



1.1                  xml/htdocs/doc/en/articles/l-posix3.xml

file : http://www.gentoo.org/cgi-bin/viewcvs.cgi/xml/htdocs/doc/en/articles/l-posix3.xml?rev=1.1&content-type=text/x-cvsweb-markup&cvsroot=gentoo
plain: http://www.gentoo.org/cgi-bin/viewcvs.cgi/xml/htdocs/doc/en/articles/l-posix3.xml?rev=1.1&content-type=text/plain&cvsroot=gentoo

Index: l-posix3.xml
===================================================================
<?xml version='1.0' encoding="UTF-8"?>
<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/doc/en/articles/l-posix3.xml,v 1.1 2005/08/03 10:36:19 neysx Exp $ -->
<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">

<guide link="/doc/en/articles/l-posix3.xml">
<title>POSIX threads explained, part 3</title>

<author title="Author">
  <mail link="drobbins@gentoo.org">Daniel Robbins</mail>
</author>
<author title="Editor">
  <mail link="rane@gentoo.pl">Łukasz Damentko</mail>
</author>

<abstract>
In this article, the last of a three-part series on POSIX threads, Daniel takes
a good look at how to use condition variables. Condition variables are POSIX
thread structures that allow you to "wake up" threads when certain conditions
are met. You can think of them as a thread-safe form of signalling. Daniel wraps
up the article by using all that you've learned so far to implement a
multi-threaded work crew application.
</abstract>

<!-- The original version of this article was published on IBM developerWorks,
and is property of Westtech Information Services. This document is an updated
version of the original article, and contains various improvements made by the
Gentoo Linux Documentation team -->

<version>1.0</version>
<date>2005-07-28</date>

<chapter>
<title>Improve efficiency with condition variables</title>
<section>
<title>Condition variables explained</title>
<body>

<note>
The original version of this article was published on IBM developerWorks, and is
property of Westtech Information Services. This document is an updated version
of the original article, and contains various improvements made by the Gentoo
Linux Documentation team.
</note>

<p>
I ended my <uri link="/doc/en/articles/l-posix2.xml">previous article</uri> by
describing a particular dilemma: how does a thread deal with a situation where
it is waiting for a specific condition to become true? It could repeatedly lock
and unlock a mutex, each time checking a shared data structure for a certain
value. But this is a waste of time and resources, and this form of busy polling
is extremely inefficient. The best way to do this is to use the
pthread_cond_wait() call to wait on a particular condition to become true.
</p>
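<p>
To make the cost concrete, the busy-polling approach described above would look
something like this. It is only a sketch: mymutex and mylist are placeholders,
and queue_get() is the queue routine shown later in this article:
</p>

<pre caption="Busy polling on a condition (what not to do)">
node *busy_wait_for_node(void) {
  node *mynode;
  for (;;) {
    pthread_mutex_lock(&amp;mymutex);
    mynode=queue_get(&amp;mylist);
    pthread_mutex_unlock(&amp;mymutex);
    if (mynode!=NULL)
      return mynode;
    /* Nothing there yet; back off briefly and poll again.  Every pass
    ** through this loop locks and unlocks the mutex for no good reason. */
    usleep(1000);
  }
}
</pre>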

<p>
It's important to understand what pthread_cond_wait() does -- it's the heart of
the POSIX threads signalling system, and also the hardest part to understand.
</p>

<p>
First, let's consider a scenario where a thread has locked a mutex, in order to
take a look at a linked list, and the list happens to be empty. This particular
thread can't do anything -- it's designed to remove a node from the list, and
there are no nodes available. So, this is what it does.
</p>

<p>
While still holding the mutex lock, our thread will call
pthread_cond_wait(&amp;mycond,&amp;mymutex). The pthread_cond_wait() call is
rather complex, so we'll step through each of its operations one at a time.
</p>

<p>
The first thing pthread_cond_wait() does is simultaneously unlock the mutex
mymutex (so that other threads can modify the linked list) and wait on the
condition mycond (so that pthread_cond_wait() will wake up when it is
"signalled" by another thread). Now that the mutex is unlocked, other threads
can access and modify the linked list, possibly adding items. 
</p>

<p>
At this point, the pthread_cond_wait() call has not yet returned. Unlocking the
mutex happens immediately, but waiting on the condition mycond is normally a
blocking operation, meaning that our thread will go to sleep, consuming no CPU
cycles until it is woken up. This is exactly what we want to happen. Our thread
is sleeping, waiting for a particular condition to become true, without
performing any kind of busy polling that would waste CPU time. From our thread's
perspective, it's simply waiting for the pthread_cond_wait() call to return. 
</p>

<p>
Now, to continue the explanation, let's say that another thread (call it "thread
2") locks mymutex and adds an item to our linked list. Immediately after
unlocking the mutex, thread 2 calls the function
pthread_cond_broadcast(&amp;mycond). By doing so, thread 2 will cause all
threads waiting on the mycond condition variable to immediately wake up. This
means that our first thread (which is in the middle of a pthread_cond_wait()
call) will now wake up.
</p>

<p>
Now, let's take a look at what happens to our first thread. After thread 2
called pthread_cond_broadcast(&amp;mycond), you might think that thread 1's
pthread_cond_wait() will immediately return. Not so! Instead,
pthread_cond_wait() will perform one last operation: relock mymutex. Once
pthread_cond_wait() has the lock, it will then return and allow thread 1 to
continue execution. At that point, it can immediately check the list for any
interesting changes.
</p>
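<p>
Putting these steps together, the waiting side of this arrangement normally
takes the following shape. Again, this is a generic sketch rather than part of
the example program; mycond, mymutex and mylist stand in for your own condition
variable, mutex and shared structure. Note that the condition is re-checked in
a loop: by the time pthread_cond_wait() has reacquired the mutex, another
thread may already have emptied the list again.
</p>

<pre caption="Canonical condition variable wait (sketch)">
pthread_mutex_lock(&amp;mymutex);
while ( mylist.head == NULL ) {
  /* Atomically releases mymutex and sleeps until another thread calls
  ** pthread_cond_broadcast() or pthread_cond_signal() on mycond.
  ** When it returns, mymutex is locked again. */
  pthread_cond_wait(&amp;mycond, &amp;mymutex);
}
/* We hold the mutex and the list is non-empty: safe to take a node */
mynode = queue_get(&amp;mylist);
pthread_mutex_unlock(&amp;mymutex);
</pre>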

</body>
</section>
<section>
<title>Stop and review!</title>
<body>

<!-- These bits do not make any sense to me, commented out

<pre caption="queue.h">
    pthread_cond_t mycond;
</pre>

<pre caption="control.h">
    pthread_cond_t mycond;

    pthread_cond_init(&amp;mycond,NULL);

    pthread_cond_destroy(&amp;mycond);

    pthread_cond_wait(&amp;mycond, &amp;mymutex);

    pthread_cond_broadcast(&amp;mycond);

    pthread_cond_signal(&amp;mycond);
</pre>
-->
<pre caption="queue.h">
/* queue.h
<comment>** Copyright 2000 Daniel Robbins, Gentoo Technologies, Inc.
** Author: Daniel Robbins
** Date: 16 Jun 2000</comment>
*/
typedef struct node {
  struct node *next;
} node;
typedef struct queue {
  node *head, *tail; 
} queue;
void queue_init(queue *myroot);
void queue_put(queue *myroot, node *mynode);
node *queue_get(queue *myroot);
</pre>

<pre caption="queue.c">
/* queue.c
<comment>** Copyright 2000 Daniel Robbins, Gentoo Technologies, Inc.
** Author: Daniel Robbins
** Date: 16 Jun 2000
**
** This set of queue functions was originally thread-aware.  I
** redesigned the code to make this set of queue routines
** thread-ignorant (just a generic, boring yet very fast set of queue
** routines).  Why the change?  Because it makes more sense to have
** the thread support as an optional add-on.  Consider a situation
** where you want to add 5 nodes to the queue.  With the
** thread-enabled version, each call to queue_put() would
** automatically lock and unlock the queue mutex 5 times -- that's a
** lot of unnecessary overhead.  However, by moving the thread stuff
** out of the queue routines, the caller can lock the mutex once at
** the beginning, then insert 5 items, and then unlock at the end.
** Moving the lock/unlock code out of the queue functions allows for
** optimizations that aren't possible otherwise.  It also makes this
** code useful for non-threaded applications.
**
** We can easily thread-enable this data structure by using the
** data_control type defined in control.c and control.h.</comment> */
#include &lt;stdio.h&gt;
#include "queue.h"
void queue_init(queue *myroot) {
  myroot->head=NULL;
  myroot->tail=NULL;
}
void queue_put(queue *myroot,node *mynode) {
  mynode->next=NULL;
  if (myroot->tail!=NULL)
    myroot->tail->next=mynode;
  myroot->tail=mynode;
  if (myroot->head==NULL)
    myroot->head=mynode;
}
node *queue_get(queue *myroot) {
  //get from root
  node *mynode;
  mynode=myroot->head;
  if (myroot->head!=NULL) {
    myroot->head=myroot->head->next;
    //if the queue is now empty, also clear the stale tail pointer so a
    //later queue_put() never dereferences a node that has been removed
    if (myroot->head==NULL)
      myroot->tail=NULL;
  }
  return mynode;
}
</pre>
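<p>
To see how these routines fit together, here is a small single-threaded usage
sketch. It is not part of the original listing, and the mynode wrapper type is
purely illustrative; as the comment above explains, in a threaded program the
caller would also lock a mutex around the queue operations.
</p>

<pre caption="Using the queue routines (sketch)">
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include "queue.h"

/* Embed the queue's node as the first member so that a pointer to a
** mynode can be passed to queue_put() and recovered from queue_get() */
typedef struct mynode {
  node header;
  int value;
} mynode;

int main(void) {
  queue myqueue;
  mynode *item;
  int i;

  queue_init(&amp;myqueue);

  for ( i=0; i&lt;5; i++ ) {
    if ( (item=malloc(sizeof(mynode))) == NULL )
      abort();
    item->value=i;
    queue_put(&amp;myqueue,(node *)item);
  }

  while ( (item=(mynode *)queue_get(&amp;myqueue)) != NULL ) {
    printf("got %d\n",item->value);
    free(item);
  }

  return 0;
}
</pre>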



-- 
gentoo-doc-cvs@gentoo.org mailing list


