terabytes on clusters with thousands of nodes. The Phoenix implementation is based
on the same principles but targets shared-memory systems such as multi-core chips
and symmetric multiprocessors.
Phoenix uses threads to spawn parallel Map or Reduce tasks. It also uses shared-
memory buffers to facilitate communication without excessive data copying. The
runtime schedules tasks dynamically across the available processors in order to
achieve load balance and maximize task throughput. Locality is managed by adjusting
the granularity and assignment of parallel tasks.
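The scheduling idea described above can be sketched in a few lines. This is only an illustration, not the Phoenix runtime (which is written in C on top of threads): the names `run_map_tasks`, `sum_range`, and `chunk_size` are hypothetical, and a Python thread pool stands in for the runtime's dynamic scheduler. Worker threads read index ranges of one shared list instead of receiving copies, and `chunk_size` plays the role of the task granularity.

```python
from concurrent.futures import ThreadPoolExecutor
import os


def run_map_tasks(data, map_fn, chunk_size=4, workers=None):
    """Run map_fn over index ranges of a shared list using a thread pool."""
    workers = workers or os.cpu_count() or 1
    # Each task is just a (start, end) pair; the data itself is not copied.
    ranges = [
        (start, min(start + chunk_size, len(data)))
        for start in range(0, len(data), chunk_size)
    ]

    def task(bounds):
        lo, hi = bounds
        return map_fn(data, lo, hi)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Idle threads pick up the next pending range, so faster threads
        # end up processing more tasks (dynamic load balancing).
        return list(pool.map(task, ranges))


def sum_range(data, lo, hi):
    """Hypothetical Map function: sum the elements in data[lo:hi]."""
    return sum(data[i] for i in range(lo, hi))


partial_sums = run_map_tasks(list(range(10)), sum_range, chunk_size=3)
```

A smaller `chunk_size` gives the scheduler more tasks to balance across threads, while a larger one keeps each thread working on a contiguous region of memory for longer.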
In this paper we evaluate five applications (four of them commonly used in cloud
applications and one general-purpose application) that have been implemented using the Phoenix
MapReduce framework [12]:
Word Count: This application is commonly used in search engines to
index web pages by the words they contain. It counts the frequency
of occurrence of each word in a set of files. The Map tasks process
different sections of the input files and return intermediate data that
consist of a word (key) and a value of 1 to indicate that the word was
found. The Reduce tasks add up the values for each word (key).
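As a minimal sketch of the Word Count description above (not the Phoenix code itself), the Map and Reduce steps might look as follows. The function names are hypothetical, and splitting on whitespace and lower-casing the words are assumptions made for the example.

```python
from collections import defaultdict


def map_word_count(section):
    """Map: emit a (word, 1) pair for every word found in one input section."""
    # Assumption: words are separated by whitespace and compared case-insensitively.
    return [(word, 1) for word in section.lower().split()]


def reduce_word_count(pairs):
    """Reduce: add up the values emitted for each word (key)."""
    totals = defaultdict(int)
    for word, value in pairs:
        totals[word] += value
    return dict(totals)


sections = ["the cat sat", "the dog sat down"]
intermediate = [pair for s in sections for pair in map_word_count(s)]
counts = reduce_word_count(intermediate)
```

Each section can be mapped independently, which is what allows the Map tasks to run in parallel.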
String Match: It processes two files: the “encrypt” file contains a set of
encrypted words, and the “keys” file contains a list of non-encrypted words.
The goal is to encrypt the words in the “keys” file to determine which
words were originally encrypted to generate the “encrypt” file.
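The matching step can be illustrated with a small sketch. The text does not say which cipher the benchmark uses, so `toy_encrypt` below is a purely hypothetical stand-in (a Caesar shift on lowercase letters), and `map_string_match` is likewise an illustrative name rather than a Phoenix function.

```python
def toy_encrypt(word, shift=3):
    """Hypothetical stand-in cipher; NOT the encryption used by the benchmark."""
    return "".join(
        chr((ord(c) - ord("a") + shift) % 26 + ord("a")) if "a" <= c <= "z" else c
        for c in word
    )


def map_string_match(key_words, encrypted_set):
    """Map: encrypt each candidate word and keep those found in the encrypt set."""
    return [w for w in key_words if toy_encrypt(w) in encrypted_set]


# Contents of the "encrypt" file, built here from two known words.
encrypted = {toy_encrypt("hello"), toy_encrypt("world")}
# Contents of the "keys" file, split into two portions for two Map tasks.
keys = ["apple", "hello", "world", "zebra"]
matches = map_string_match(keys[:2], encrypted) + map_string_match(keys[2:], encrypted)
```

Because each portion of the “keys” file is checked independently, concatenating the per-task results is all that remains after the Map stage in this sketch.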
Histogram: It analyzes a given bitmap image to compute the frequency of
occurrence of each value in the 0-255 range for each of the RGB components
of the pixels. It can be used in image indexing and image search engines.
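A minimal sketch of the Histogram computation, assuming the bitmap has already been decoded into a list of (R, G, B) tuples (file parsing is omitted, and the function names are hypothetical):

```python
def map_histogram(pixels):
    """Map: count how often each 0-255 value occurs per channel in one portion."""
    hist = {"R": [0] * 256, "G": [0] * 256, "B": [0] * 256}
    for r, g, b in pixels:
        hist["R"][r] += 1
        hist["G"][g] += 1
        hist["B"][b] += 1
    return hist


def reduce_histograms(partials):
    """Reduce: add the per-portion counts element-wise into one histogram."""
    total = {channel: [0] * 256 for channel in "RGB"}
    for part in partials:
        for channel in "RGB":
            for value in range(256):
                total[channel][value] += part[channel][value]
    return total


pixels = [(255, 0, 0), (255, 128, 0), (0, 128, 64)]
histogram = reduce_histograms([map_histogram(pixels[:2]), map_histogram(pixels[2:])])
```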
Linear Regression: It computes the line that best fits a given set of
coordinates in an input file. The algorithm assigns different portions of the
file to different map tasks, which compute certain summary statistics like
the sum of squares.
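To make the role of the summary statistics concrete, here is a sketch that assumes an ordinary least-squares fit of y = intercept + slope * x. Each Map task reduces its portion of the points to a handful of sums, and the sums from all portions are added before the line is computed. The function names and the exact set of statistics are assumptions for illustration, not taken from the Phoenix source.

```python
def map_regression(points):
    """Map: summarize one portion of the points as (n, sum x, sum y, sum x^2, sum xy)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return (n, sx, sy, sxx, sxy)


def reduce_regression(partials):
    """Reduce: add the partial sums, then solve for the best-fit line."""
    n, sx, sy, sxx, sxy = (sum(values) for values in zip(*partials))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


points = [(0, 1), (1, 3), (2, 5), (3, 7)]  # points on y = 2x + 1
slope, intercept = reduce_regression(
    [map_regression(points[:2]), map_regression(points[2:])]
)
```

The design works because every statistic is a plain sum, so partial results from different portions of the file can be combined in any order.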
Matrix Multiply: Each Map task computes the results for a set of rows of
the output matrix and returns the (x,y) location of each element as the key
and the result of the computation as the value. This application is mainly
compute-intensive and has been included to show the differences between
typical mathematical benchmarks and the applications that are used in
cloud computing.
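A sketch of the Matrix Multiply Map task as described above, assuming the key (x, y) means (row, column) of the output matrix and that the matrices are plain nested lists. The name `map_matmul` is hypothetical.

```python
def map_matmul(a, b, row_start, row_end):
    """Map: compute output rows [row_start, row_end) and emit ((x, y), value) pairs."""
    inner, cols = len(b), len(b[0])
    pairs = []
    for x in range(row_start, row_end):
        for y in range(cols):
            value = sum(a[x][k] * b[k][y] for k in range(inner))
            pairs.append(((x, y), value))
    return pairs


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
# Two Map tasks, one output row each; every key is unique, so collecting
# the pairs into a dictionary is enough to assemble the result.
product = dict(map_matmul(a, b, 0, 1) + map_matmul(a, b, 1, 2))
```

Since every output element has its own key, there is nothing to combine per key, and almost all of the work is arithmetic inside the Map tasks.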
[Figure: a Map stage and a Reduce stage, each made up of several parallel Map and Reduce tasks]
Fig. 2. The Phoenix MapReduce framework