system, a hyperthreaded Core i7 chip looks like a dual processor in which both
CPUs happen to share a common cache and main memory. The operating system
schedules the threads independently. If two applications are running, the operating
system can run one on each virtual CPU simultaneously. For example, if a
mail daemon is sending or receiving email in the background while a user is
interacting with some program in the foreground, the daemon and the user program
can run in parallel, as though there were two CPUs available.
Application software that has been designed to run as multiple threads can use
both virtual CPUs. For example, video editing programs usually allow users to
specify certain filters to apply to each frame in some range. These filters can
modify the brightness, contrast, color balance, or other properties of each frame. The
program can then assign one CPU to process the even-numbered frames and the
other to process the odd-numbered frames. The two can then run in parallel.
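The even/odd division of labor described above can be sketched in a few lines of Python. The `brighten` filter here is a hypothetical stand-in for a real video filter, and note that in CPython the global interpreter lock prevents pure-Python compute from truly running in parallel, so a real editor would do the per-frame work in native code; the point of the sketch is only how the frames are split between two threads:

```python
from threading import Thread

def brighten(frame, amount):
    """Hypothetical filter: raise the brightness of one frame (a list of pixels)."""
    return [min(255, p + amount) for p in frame]

def process(frames, indices, amount, out):
    """Apply the filter to the frames at the given indices, storing results in out."""
    for i in indices:
        out[i] = brighten(frames[i], amount)

# Ten tiny "frames" of four pixels each.
frames = [[10 * i] * 4 for i in range(10)]
out = [None] * len(frames)

# One thread takes the even-numbered frames, the other the odd-numbered ones.
t_even = Thread(target=process, args=(frames, range(0, len(frames), 2), 20, out))
t_odd = Thread(target=process, args=(frames, range(1, len(frames), 2), 20, out))
t_even.start(); t_odd.start()
t_even.join(); t_odd.join()
```

Because no frame is touched by both threads, the two workers need no locking and can proceed entirely independently.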
Since the two threads share all the hardware resources, a strategy is needed to
manage the sharing. Intel identified four useful strategies for resource sharing in
conjunction with hyperthreading: resource duplication, partitioned resources,
threshold sharing, and full sharing. We will now touch on each of these in turn.
To start with, some resources are duplicated just for threading. For example,
since each thread has its own flow of control, a second program counter had to be
added. The table that maps the architectural registers (EAX, EBX, etc.) onto the
physical registers also had to be duplicated, as did the interrupt controller, since the
threads can be independently interrupted.
Next we have partitioned resource sharing, in which the hardware resources
are rigidly divided between the threads. For example, if the CPU has a queue
between two functional pipeline stages, half the slots could be dedicated to thread 1
and the other half to thread 2. Partitioning is easy to accomplish, has no overhead,
and keeps the threads out of each other's hair. If all the resources are partitioned,
we effectively have two separate CPUs. On the down side, it can easily happen
that at some point one thread is not using some of its resources that the other one
wants but is forbidden from accessing. As a consequence, resources that could
have been used productively lie idle.
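The drawback of rigid partitioning can be illustrated with a small sketch (the class name, queue size, and two-thread split are invented for illustration): each thread may claim slots only from its own half of a queue, so one thread can be refused a slot while the other thread's slots sit idle.

```python
class PartitionedQueue:
    """A queue whose slots are rigidly split between two hardware threads."""

    def __init__(self, total_slots):
        self.quota = total_slots // 2   # fixed share of slots per thread
        self.used = [0, 0]              # slots currently held by each thread

    def try_enqueue(self, thread_id):
        # A thread may only claim slots from its own partition, even if
        # the other thread's partition is sitting completely idle.
        if self.used[thread_id] < self.quota:
            self.used[thread_id] += 1
            return True
        return False

q = PartitionedQueue(8)            # 4 slots for each thread
while q.try_enqueue(1):            # thread 1 fills its own partition...
    pass
refused = q.try_enqueue(1)         # ...and is then refused,
idle_slots = q.quota - q.used[0]   # even though thread 0's 4 slots lie idle
```

A fully shared queue would have granted thread 1 those idle slots, which is exactly the trade-off the next scheme makes.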
The opposite of partitioned sharing is full resource sharing. When this
scheme is used, either thread can acquire any resources it needs, first come, first
served. However, imagine a fast thread consisting primarily of additions and
subtractions and a slow thread consisting primarily of multiplications and
divisions. If instructions are fetched from memory faster than multiplications and
divisions can be carried out, the backlog of instructions fetched for the slow
thread and queued but not yet fed into the pipeline will grow in time.
Eventually, this backlog will occupy the entire instruction queue, bringing the
fast thread to a halt for lack of space in the instruction queue. Full resource
sharing solves the problem of a resource lying idle while another thread wants it, but
creates a new problem of one thread potentially hogging so many resources that it
slows the other one down or stops it altogether.
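The backlog effect can be imitated with a tiny cycle-by-cycle simulation. All the numbers here are invented for illustration (an 8-slot queue, a fetch unit that alternates between threads, and a slow thread that retires one instruction every four cycles); real front ends are far more elaborate, but the qualitative behavior is the same: the shared queue fills up with the slow thread's instructions, and the fast thread's progress drops.

```python
from collections import deque

QUEUE_SIZE = 8
CYCLES = 200
queue = deque()      # shared instruction queue, filled first come, first served
retired = [0, 0]     # instructions completed by each thread

for cycle in range(CYCLES):
    # Execute stage: the fast thread (0) retires one of its queued instructions
    # every cycle; the slow thread (1) retires one only every fourth cycle.
    if 0 in queue:
        queue.remove(0)
        retired[0] += 1
    if cycle % 4 == 0 and 1 in queue:
        queue.remove(1)
        retired[1] += 1
    # Fetch stage: the front end alternates between threads, inserting one
    # instruction per cycle as long as the shared queue has a free slot.
    tid = cycle % 2
    if len(queue) < QUEUE_SIZE:
        queue.append(tid)

print("final queue contents:", list(queue))   # dominated by slow-thread instructions
print("retired (fast, slow):", retired)
```

After the queue fills, the fast thread can make progress only when the slow thread retires something and frees a slot, so its throughput collapses to roughly the slow thread's rate, which is the hogging problem described above.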