Efficient Symmetric Band Matrix-Matrix
Multiplication on GPUs
Ernesto Dufrechou 2, Pablo Ezzatti 2, Enrique S. Quintana-Ortí 3,
and Alfredo Remón 1
1 Max Planck Institute for Dynamics of Complex Technical Systems,
Magdeburg, Germany
remon@mpi-magdeburg.mpg.de
2 Instituto de Computación, Universidad de la República,
11.300-Montevideo, Uruguay
{edufrechou,pezzatti}@fing.edu.uy
3 Dep. de Ingeniería y Ciencia de la Computación, Universidad Jaime I,
12.071-Castellón, Spain
quintana@icc.uji.es
Abstract. Matrix-matrix multiplication is an important linear algebra
operation with a myriad of applications in scientific and engineering
computing. Due to the relevance and inherent parallelism of this operation,
there exist many high-performance implementations for a variety of hardware
platforms. Exploiting the structure of the matrices involved in the operation
generally yields significant time and memory savings. This is the case,
e.g., when one of the matrices is a symmetric band matrix. This work
presents two efficient specialized implementations of the operation when
a symmetric band matrix is involved and the target architecture contains
a graphics processor (GPU). In particular, both implementations
exploit the structure of the matrices to leverage the vast parallelism of
the underlying hardware. The experimental results show remarkable
reductions in the computation time over the tuned implementations of the
same operation provided by MKL and CUBLAS.
1 Introduction
The matrix product

$$C := \alpha A B + \beta C, \qquad (1)$$

where $C \in \mathbb{R}^{m \times n}$, $A \in \mathbb{R}^{m \times k}$, $B \in \mathbb{R}^{k \times n}$, and both $\alpha$, $\beta$ are scalars, is a common
and well-known kernel in numerical linear algebra [6]. This operation exhibits a
high level of concurrency and there exist highly tuned implementations available
for most high performance computing (HPC) hardware architectures.
In this work we address the special case of the matrix-matrix product (1)
when matrix $A$ presents a symmetric band structure (and, therefore, $m = k$),
meaning that all the nonzero elements of $A$ are placed in a small set of super-
and sub-diagonals adjacent to the main diagonal. Exploiting the structure of
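
To make the symmetric band case of operation (1) concrete, the following is a minimal sequential sketch in plain Python. It is not the GPU implementation presented in this work. The compact layout is an assumption borrowed from the usual BLAS/LAPACK lower band storage: `ab[d][j]` holds the entry of A in row `j + d` and column `j`, for `d = 0..kd`, where `kd` is the number of sub-diagonals. The names `sbmm_lower` and `dense_from_band` are hypothetical helpers introduced only for illustration.

```python
def sbmm_lower(alpha, ab, B, beta, C):
    """Return alpha*A*B + beta*C for a symmetric band matrix A.

    Assumed layout (not taken from the text): ab[d][j] == A[j + d][j],
    so row d of `ab` is the d-th sub-diagonal, d = 0..kd.
    Entries ab[d][j] with j + d >= m are padding and are never read.
    """
    kd = len(ab) - 1   # number of sub-diagonals (bandwidth)
    m = len(ab[0])     # A is m x m, hence m = k in operation (1)
    n = len(B[0])      # B and C are m x n
    out = [[beta * C[i][c] for c in range(n)] for i in range(m)]
    for j in range(m):
        # The diagonal entry A[j][j] contributes once.
        for c in range(n):
            out[j][c] += alpha * ab[0][j] * B[j][c]
        # A stored sub-diagonal entry A[i][j] also stands for A[j][i].
        for d in range(1, min(kd, m - 1 - j) + 1):
            i = j + d
            a = alpha * ab[d][j]
            for c in range(n):
                out[i][c] += a * B[j][c]  # use as lower-triangle entry
                out[j][c] += a * B[i][c]  # use as mirrored upper entry
    return out


def dense_from_band(ab):
    """Expand the compact storage into a full symmetric matrix (for checks)."""
    kd, m = len(ab) - 1, len(ab[0])
    A = [[0.0] * m for _ in range(m)]
    for d in range(kd + 1):
        for j in range(m - d):
            A[j + d][j] = A[j][j + d] = ab[d][j]
    return A


# Tridiagonal example: m = 4, kd = 1, n = 2.
ab = [[4.0, 5.0, 6.0, 7.0],   # main diagonal
      [1.0, 2.0, 3.0, 0.0]]   # first sub-diagonal (last entry is padding)
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
C = [[1.0, 1.0] for _ in range(4)]

result = sbmm_lower(2.0, ab, B, 0.5, C)
# result == [[8.5, 2.5], [6.5, 14.5], [24.5, 10.5], [34.5, -7.5]]
```

Under this layout only `(kd + 1) * m` entries of A are stored, and the loops perform on the order of `m * kd * n` multiply-adds instead of the `m * m * n` required when A is treated as a dense matrix.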