Skeletonizing Digital Images with Cellular Automata - Cellular Automata in Image Processing and Geometry

Image Processing Reference

In-Depth Information

Fig. 3.5 Scheme of threads for the CUDA implementation

It shows some important features that make this new hardware architecture suitable

for further implementations of CA algorithms.

CUDA TM is a general purpose parallel computing architecture that allows the par-

allel NVIDIA Graphics Processors Units (GPUs) to solve many complex computa-

tional problems in a more efficient way than on a Central Processing Unit (CPU).

GPUs constitute nowadays a solid alternative for high performance computing, and

the advent of CUDA allows programmers a friendly model to accelerate a broad

range of applications. The way GPUs exploit parallelism differs from multi-core

CPUs, which raises new challenges to take advantage of its computing power. GPU

is especially well-suited to address problems that can be expressed as data-parallel

computations.

3.5.1

Examples

In [35], a parallel implementation of the Guo and Hall algorithm for CA based on

the principles described above was presented. It has been implemented by using

Microsoft Visual Studio 2008 Professional Edition (C++) with the plug-in Parallel

Nsight (CUDA TM ) under Microsoft Windows 7 Professional with 32 bits. CUDA TM

C, an extension of C for implementations of executable kernels in parallel with

graphical cards NVIDIA has been used to implement the CA. It has been necessary

to use the nvcc compiler of CUDA TM

Toolkit and some libraries from openCV for

image input and output.

The experiments have been performed on a computer with a CPU AMD Athlon II

x4 645, which allows to work with four cores of 64 bits to 3.1 GHz. The computer

has four blocks of 512KB of L2 cache memory and 4 GB DDR3 to 1600 MHz

of main memory. The used graphical card (GPU) is an NVIDIA Geforce GT240

Search WWH ::

Custom Search

Home