A New Approach of GPU Accelerated Visual Tracking - Advanced Concepts for Intelligent Vision Systems

Information Technology Reference

In-Depth Information

Template

t=0.00s

t=1.12s

t=5.22s

t=5.50s

Fig. 7. 3D object tracking based on 2 planar regions tracking. Two adjacent tracking regions are

described in red and blue boxes separately. The color boxes in first raw show the tracked regions

in current images. The warped images are shown in the second row. SIFT tracking results are

shown in the third row. The tracked region by SIFT is also shown with color boxes. Occlusion

happens at t = 5 . 22 s and disappears at t = 5 . 50 s .

homography solution. The tracked regions at t

5 . 50 s show that our

system can still track the 3D region by using SIFT results even when occlusion happens.

=

5 . 22 s and t

=

5

Optimization

Though CUDA uses C language with several extensions which makes it easier than

other GPU languages, to make GPU code highly proficient, carefully optimization must

be exploited and several important factors must be considered. In this section, we de-

scribe our optimization experience in our GPU applications.

5.1

Memory Hierarchy

CUDA provides a hierarchy of memory resources including on-chip memory (register,

shared memory) and off-chip memory (local memory, global memory const and tex-

ture memory) . In our GPU applications, we intensively utilize the fast on-chip shared

memory instead of the long-latency global memory. For example, in kernel “Jesm”, the

computation of J esm matrix needs several intermediate results based on the image gra-

dient. So we first load the image gradient data into shared memory and then continue

other computation from shared memory. By using this “cache” like strategy, we have

greatly reduced the kernel's running time.

Search WWH ::

Custom Search

Home