For some short period of time, T3 continues to run. When the signal arrives from the kernel, T3 is
interrupted and forced to run the signal handler. That, in turn, calls the scheduler, which context
switches T3 out and T2 in. And that's it! At time 3, T1 and T2 are both active, T3 is runnable, and
T2 holds the lock.
There are a couple of things to notice here. There's no guarantee that T2 will get the lock. It's
possible that T1 could have reclaimed it; it's even possible that T3 could have snatched it away
just before the signal arrived. If either of these events had occurred, the net result would have
been a bit of wasted time, but the program would still have worked perfectly. This scenario works as
described, irrespective of the number of CPUs. If this runs on a multiprocessor, it will work exactly
the same way as it does on a uniprocessor, only faster.
In this example we have described two context switches. The first one was voluntary--T2 wanted
to go to sleep. The second was involuntary (preemptive)--T3 was perfectly happy and only
context switched because it was forced to.
Preemption is the process of rudely kicking a thread off its LWP (or an LWP off its CPU) so that
some other thread can run instead. (This is what happened at time 3.) For SCS threads, preemption
is handled in the kernel by the kernel scheduler. For PCS threads, it is done by the thread library.
Preemption is accomplished by sending the LWP in question a signal specifically invented for that
purpose. The LWP then runs the handler, which in turn realizes that it must context switch its
current thread and does so. (You will notice that one LWP is able to direct a signal to another
specific LWP in the case in which they are both in the same process. You should never do this
yourself. You may send signals to threads but never to LWPs.)
In Solaris 2.5 and below, the preemption signal was SIGLWP. This is a kernel-defined signal that requires a system
call to implement. Digital UNIX uses a slightly different mechanism, but the results are the same.
Preemption requires a system call, so the kernel has to send the signal, which takes time. Finally,
the LWP, to which the signal is directed, must receive it and run the signal handler. Context
switching by preemption is involuntary and is more expensive than context switching by
"voluntary" means. (You will never have to think about this while programming.)
The discussion of context switching and preemption above is accurate for all the various libraries.
It is accurate for threads on LWPs and for LWPs (or traditional processes) on CPUs, substituting
the word interrupt for signal.
How Many LWPs?
The UNIX98 threads library has a call, pthread_setconcurrency(), which tells the library
how many LWPs you'd like to have available for PCS threads. If you set the number to ten and
you have nine threads, then when you create a tenth thread, you'll get a tenth LWP. When you
create an eleventh thread, you won't get another LWP. Now the caveat: this is only a hint to the
library as to what you'd like. You may not get what you ask for! You might even get more. Your program
must run correctly without all the LWPs you want, although it may run faster if it gets them. In
practice, this becomes an issue only when your program needs a lot of LWPs.
You've got the power, but how do you use it wisely? The answer is totally application-dependent,
but we do have some generalities. (N.B.: Generalities. If you need a highly tuned application,
you've got to do the analysis and experimentation yourself.) We assume a dedicated machine.
If your program is completely CPU bound, one LWP per CPU will give you maximum
processing power. Presumably, you'll have the same number of threads.