A GPU-Accelerated Real-Time NLMeans Algorithm for Denoising Color Video Sequences - Advanced Concepts for Intelligent Vision Systems


We remark that not all passes need to process all input images, i.e., it is completely legal that $U^{(i+1)} = U^{(i)}$. In this case, we express this formally by saying that the function $f^{(i)}\left(U^{(i)}\right)(p)$ is constant in $\ldots, U^{(i)}_{K}$

2.2 Straightforward GPU Implementation of the NLMeans Filter

First, we will show that a straightforward (naive) implementation of the traditional NLMeans filter from [13, 16] leads to a very high number of passes, and hence to an algorithm that is inefficient even on the GPU. Next, we will explain how our own algorithmic accelerations can be converted into a program for the GPU as in equation (4). We will do this for a broad range of weighting functions that are a function of the Euclidean distance measure between two patches:

$$
w(p, p+q) = g\left( \sqrt{ \sum_{(\Delta x,\Delta y) \in [-B,\ldots,B]^2} \left( r_{p,q}^{(\Delta x,\Delta y)} \right)^2 } \right) \qquad (5)
$$

with

$$
r_{p,q}^{(\Delta x,\Delta y)} = Y(p_x + q_x + \Delta x,\; p_y + q_y + \Delta y,\; p_t + q_t) - Y(p_x + \Delta x,\; p_y + \Delta y,\; p_t),
$$

with $(2B+1) \times (2B+1)$ the patch size and where the function $g(r)$ has the property that $g(0) = 1$ (such that the weight $w = 1$ if the Euclidean distance between two patches is zero, i.e., for similar patches) and $\lim_{r \to \infty} g(r) = 0$ (the weight $w = 0$ for dissimilar patches). In particular, we consider the Bisquare robust weighting function, for which $g(r)$ is defined as follows:

$$
g(r) = \begin{cases} \left( 1 - (r/h)^2 \right)^2 & r \le h \\ 0 & r > h \end{cases}
$$

with h a constant parameter that is fixed in advance (for more details, see [4]).
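As an illustration of equation (5) with the Bisquare function, the following minimal Python sketch computes the weight between two patches of a single grayscale frame. The names `bisquare` and `patch_weight`, the omission of the time index, and the clamping of coordinates at the image border are assumptions made for this example and are not taken from the text.

```python
import math


def bisquare(r, h):
    """Bisquare weighting function: (1 - (r/h)^2)^2 for r <= h, else 0."""
    if r > h:
        return 0.0
    return (1.0 - (r / h) ** 2) ** 2


def patch_weight(Y, p, q, B, h):
    """Weight w(p, p+q) of equation (5) on one frame Y (a list of rows).

    p and q are (x, y) tuples. The time index is dropped in this sketch and
    coordinates are clamped at the border (both are assumptions).
    """
    height, width = len(Y), len(Y[0])

    def pixel(x, y):
        # Hypothetical border handling: repeat the nearest edge pixel.
        x = min(max(x, 0), width - 1)
        y = min(max(y, 0), height - 1)
        return Y[y][x]

    sq_sum = 0.0
    for dy in range(-B, B + 1):
        for dx in range(-B, B + 1):
            # Pixel difference r between the shifted patch and the reference patch.
            r = pixel(p[0] + q[0] + dx, p[1] + q[1] + dy) - pixel(p[0] + dx, p[1] + dy)
            sq_sum += r * r
    # Euclidean patch distance passed through the weighting function g.
    return bisquare(math.sqrt(sq_sum), h)
```

For a zero displacement `q = (0, 0)` every difference vanishes, so the sketch returns a weight of 1, in line with the property g(0) = 1.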

Substituting (5) into (2) gives:

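$$
X(p) = \frac{ \sum_{q \in \delta} g\left( \sqrt{ \sum_{(\Delta x,\Delta y) \in [-B,\ldots,B]^2} \left( r_{p,q}^{(\Delta x,\Delta y)} \right)^2 } \right) Y(p+q) }{ \sum_{q \in \delta} g\left( \sqrt{ \sum_{(\Delta x,\Delta y) \in [-B,\ldots,B]^2} \left( r_{p,q}^{(\Delta x,\Delta y)} \right)^2 } \right) } \qquad (6)
$$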
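Equation (6) can be evaluated directly, one output pixel at a time. The sketch below does this on the CPU in plain Python for a single grayscale frame, purely to make the loop structure visible; it is not the GPU program discussed in the text. The function name `nlmeans_naive`, the square search window of half-width `search`, the border clamping, and the dropped time index are assumptions of this example.

```python
import math


def nlmeans_naive(Y, B, search, h):
    """Direct evaluation of equation (6) on one frame Y (a list of rows).

    The search window delta is assumed to be a square of half-width `search`.
    Returns the filtered frame and the number of pixel differences that are
    computed per output pixel.
    """
    height, width = len(Y), len(Y[0])

    def pixel(x, y):
        # Hypothetical border handling: repeat the nearest edge pixel.
        return Y[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    def g(r):
        # Bisquare weighting function with parameter h.
        return (1.0 - (r / h) ** 2) ** 2 if r <= h else 0.0

    offsets = [(qx, qy) for qy in range(-search, search + 1)
               for qx in range(-search, search + 1)]
    patch = [(dx, dy) for dy in range(-B, B + 1) for dx in range(-B, B + 1)]

    X = [[0.0] * width for _ in range(height)]
    for py in range(height):
        for px in range(width):
            num = 0.0
            den = 0.0
            for qx, qy in offsets:
                sq_sum = 0.0
                for dx, dy in patch:
                    r = pixel(px + qx + dx, py + qy + dy) - pixel(px + dx, py + dy)
                    sq_sum += r * r
                w = g(math.sqrt(sq_sum))
                num += w * pixel(px + qx, py + qy)
                den += w
            # den >= 1 because the zero displacement always contributes w = 1.
            X[py][px] = num / den
    return X, len(offsets) * len(patch)
```

Each output pixel is a weighted average of input pixels, so the result stays within the range of the input values. The second return value makes the per-pixel cost of the nested loops explicit.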

Comparing (6) to (3) immediately leads to the kernel function:

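$$
f^{(1)}\left(U^{(1)}\right)(p) = \frac{ \sum_{q \in \delta} g\left( \sqrt{ \sum_{(\Delta x,\Delta y) \in [-B,\ldots,B]^2} \left( r_{p,q}^{(\Delta x,\Delta y)} \right)^2 } \right) U^{(1)}(p+q) }{ \sum_{q \in \delta} g\left( \sqrt{ \sum_{(\Delta x,\Delta y) \in [-B,\ldots,B]^2} \left( r_{p,q}^{(\Delta x,\Delta y)} \right)^2 } \right) } \qquad (7)
$$

with $U^{(1)}(p) = Y(p)$. We see that the number of operations performed by the kernel function is linear in $|\delta| (2B+1)^2$, with $|\delta|$ the cardinality of $\delta$. Although this approach seems feasible, some GPU hardware (especially less recent GPU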
