MBSPDiscover: An Automatic Benchmark
for MultiBSP Performance Analysis
Marcelo Alaniz 1, Sergio Nesmachnow 2, Brice Goglin 3, Santiago Iturriaga 2,
Veronica Gil Costa 1, and Marcela Printista 1
1 Universidad Nacional de San Luis, Argentina
2 Universidad de la Republica, Uruguay
3 Inria Bordeaux-Sud-Ouest, University of Bordeaux, France
Abstract. Multi-Bulk Synchronous Parallel (MultiBSP) is a recently
proposed parallel programming model for multicore machines that
extends the classic BSP model. MultiBSP is useful for designing
algorithms and estimating their running time, both of which are difficult
in High Performance Computing applications. To estimate the running
time correctly, the main parameters of the MultiBSP model must be
determined for each multicore architecture. This article presents a
benchmark proposal for measuring the parameters that characterize the
communication and synchronization costs of the model. Our approach
automatically discovers the hierarchical structure of the multicore
architecture by using a specific tool (hwloc) that provides runtime
information about the machine. We describe the design and implementation
of the benchmark and the results of benchmarking two multicore machines.
Furthermore, we report the validation of the proposed method by using a
real MultiBSP implementation of the vector inner product algorithm and
comparing the predicted execution time against the real execution time.
1 Introduction
Performance prediction is an important tool for performance analysis of parallel
applications [5]. This technique involves modeling program performance as a
function of the hardware and software characteristics of a system. By changing
these characteristics in the model, the execution time of standard programs can
be accurately predicted for a variety of platforms and configurations.
The Bulk Synchronous Parallel (BSP) model [7] is one of the most popular
of the analytical models that have been proposed. The model assumes a BSP
abstract machine with identical processors. Each processor has access to its
own local memory, and the processors communicate with each other through an
all-to-all network that provides uniform point-to-point access time and
bandwidth capacity.
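For reference, the cost that is conventionally associated with this abstract
machine can be sketched as follows. The symbols w, h, g, and l are the usual
BSP notation; they are not defined in this excerpt and are introduced here
only as assumptions for illustration.

```latex
% Cost of one BSP superstep (conventional notation, assumed here):
%   w = maximum local computation performed by any processor
%   h = maximum number of words sent or received by any processor
%   g = communication cost per word (inverse bandwidth)
%   l = cost of the barrier synchronization
T_{\text{superstep}} = w + h \cdot g + l

% A program made of S supersteps then costs the sum over its supersteps:
T = \sum_{s=1}^{S} \left( w_s + h_s \cdot g + l \right)
```

Because g and l are uniform across processors in this model, a single pair of
measured values characterizes the communication and synchronization behavior
of the whole machine.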
The BSP model was introduced for distributed computers, assuming only
one core per computing node. Although the model was used very successfully in
the 1990s, it gradually fell out of use with the emergence of new multicore
architectures in the last decade. As the evaluation of computers gained renewed
importance, the BSP model was extended to MultiBSP by Valiant [8]. MultiBSP