of release consistency (Amza, 1996). Potentially writable pages may be present at
multiple nodes at the same time, but before doing a write, a process must first do
an acquire operation to signal its intention. At that point, all copies but the most
recent one are invalidated. No other copies may be made until the corresponding
release is done, at which time the page can be shared again.
A second optimization done in Treadmarks is to initially map each writable
page in read-only mode. When the page is first written to, a protection fault occurs
and the system makes a copy of the page, called the twin. Then the original page
is mapped in as read-write and subsequent writes can go at full speed. When a
remote page fault occurs later and the page has to be shipped to the faulting node,
a word-by-word comparison is done between the current page and the twin. Only those
words that have been changed are sent, reducing the size of the messages.
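The twin-and-diff step can be sketched in C as follows. This is a minimal illustration, not Treadmarks code: the page size (PAGE_WORDS), the diff_entry record, and the function names make_twin, make_diff, and apply_diff are all assumptions introduced here, and the protection-fault handling that would trigger make_twin is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_WORDS 1024  /* assumed: a 4-KB page of 32-bit words */

/* One changed word: its index within the page and its new value. */
typedef struct {
    uint32_t index;
    uint32_t value;
} diff_entry;

/* On the first write fault, save an unmodified copy of the page (the twin). */
void make_twin(uint32_t *twin, const uint32_t *page)
{
    memcpy(twin, page, PAGE_WORDS * sizeof(uint32_t));
}

/* Compare the current page with its twin word by word.
 * Writes one entry per changed word into out (capacity PAGE_WORDS)
 * and returns the number of entries written. */
size_t make_diff(const uint32_t *page, const uint32_t *twin, diff_entry *out)
{
    size_t n = 0;
    for (uint32_t i = 0; i < PAGE_WORDS; i++) {
        if (page[i] != twin[i]) {
            out[n].index = i;
            out[n].value = page[i];
            n++;
        }
    }
    return n;
}

/* On the receiving node, patch a stale copy of the page with the diff. */
void apply_diff(uint32_t *page, const diff_entry *diff, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        assert(diff[k].index < PAGE_WORDS);
        page[diff[k].index] = diff[k].value;
    }
}
```

If only two words of the page were written, make_diff yields two entries, so the message carries a few bytes of payload rather than the whole page.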
When a page fault occurs, the missing page has to be located. Various solutions
are possible, including those used in NUMA and COMA machines, such as
(home-based) directories. In fact, many of the solutions used in DSM are also
applicable to NUMA and COMA because DSM is really just a software implementation
of NUMA or COMA with each page being treated like a cache line.
DSM is a hot area of research. Interesting systems include CASHMERE
(Kontothanassis et al., 1997 and Stets et al., 1997), CRL (Johnson et al., 1995),
Shasta (Scales et al., 1996), and Treadmarks (Amza, 1996 and Lu et al., 1997).
Linda
Page-based DSM systems like IVY and Treadmarks use the MMU hardware to
trap accesses to missing pages. While making and sending differences instead of
whole pages helps, the fact remains that pages are an unnatural unit for sharing, so
other approaches have been tried.
One such approach is Linda, which provides processes on multiple machines
with a highly structured distributed shared memory (Carriero and Gelernter, 1989).
This memory is accessed through a small set of primitive operations that can be
added to existing languages, such as C and FORTRAN, to form parallel languages,
in this case, C-Linda and FORTRAN-Linda.
The unifying concept behind Linda is that of an abstract tuple space, which is
global to the entire system and accessible to all processes in it. Tuple space is like
a global shared memory, only with a certain built-in structure. The tuple space
contains some number of tuples, each consisting of one or more fields. For C-Linda,
field types include integers, long integers, and floating-point numbers, as
well as composite types such as arrays (including strings) and structures (but not
other tuples). Figure 8-47 shows three tuples as examples.
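One way to picture a tuple in plain C is as a short array of tagged fields. The sketch below is only an illustration of that structure and is not how C-Linda is implemented: the names field_type, field, tuple, f_int, f_str, and tuple_add, and the MAX_FIELDS limit, are all hypothetical, and only integer and string fields get constructor helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 8  /* assumed upper bound on fields per tuple */

/* Tag saying which member of the union is valid. */
typedef enum { F_INT, F_LONG, F_DOUBLE, F_STRING } field_type;

typedef struct {
    field_type type;
    union {
        int i;
        long l;
        double d;
        const char *s;
    } u;
} field;

/* A tuple is an ordered list of typed fields; tuples do not nest. */
typedef struct {
    size_t nfields;
    field fields[MAX_FIELDS];
} tuple;

/* Build an integer field. */
static field f_int(int v)
{
    field f;
    f.type = F_INT;
    f.u.i = v;
    return f;
}

/* Build a string field; the string is borrowed, not copied. */
static field f_str(const char *v)
{
    field f;
    f.type = F_STRING;
    f.u.s = v;
    return f;
}

/* Append a field to a tuple; returns 0 on success, -1 if the tuple is full. */
int tuple_add(tuple *t, field f)
{
    assert(t != NULL);
    if (t->nfields >= MAX_FIELDS)
        return -1;
    t->fields[t->nfields++] = f;
    return 0;
}
```

Under these assumptions, a tuple with a string field followed by two integer fields is built with three tuple_add calls on a tuple whose nfields starts at 0.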
Four operations are provided on tuples. The first one, out, puts a tuple into the
tuple space. For example,
out("abc", 2, 5);