CHAPTER 36

Distances and kernels based on cumulative distribution functions

Hongjun Su; Hong Zhang
Department of Computer Science and Information Technology, Armstrong State University, Savannah, GA, USA

Abstract

Similarity and dissimilarity measures such as kernels and distances are key components of classification and clustering algorithms. We propose a novel technique to construct distances and kernel functions between probability distributions based on cumulative distribution functions. The proposed distance measures incorporate global discriminating information and can be computed efficiently.

Keywords

Cumulative distribution function

Distance

Kernel

Similarity

1 Introduction

A kernel is a similarity measure that is a key component of support vector machines [1] and other machine learning techniques. More generally, a distance (a metric) is a function that represents the dissimilarity between objects.

In many pattern classification and clustering applications, it is useful to measure the similarity between probability distributions. Even if the data in an application are not in the form of a probability distribution, they can often be reformulated into a distribution through a simple normalization.
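As a minimal sketch of such a normalization, assuming the data are non-negative counts (for example, a gray-level histogram), each value can be divided by the total so that the result sums to 1. The function name `normalize` and the sample counts are hypothetical and chosen only for illustration.

```python
def normalize(values):
    """Rescale non-negative values so that they sum to 1.

    Assumption: the input is a sequence of non-negative numbers
    (e.g. histogram counts) with a strictly positive total.
    """
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return [v / total for v in values]


# Hypothetical histogram counts, used only as sample data.
histogram = [2, 6, 10, 2]
p = normalize(histogram)  # a discrete probability distribution
```

Other normalizations may suit other kinds of data; dividing by the sum is simply the most direct choice for counts.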

A large number of divergence and affinity measures on distributions have already been defined in traditional statistics. These measures are typically based on probability density functions and are not effective in detecting global changes.
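The limitation can be illustrated with a small sketch. It is not the construction proposed in this chapter; it only contrasts a bin-by-bin L1 distance on discrete densities with an L1 distance on their cumulative sums. The function names and the three sample distributions are assumptions made for the example.

```python
from itertools import accumulate


def l1_density(p, q):
    """Bin-by-bin L1 distance between two discrete densities."""
    return sum(abs(a - b) for a, b in zip(p, q))


def l1_cdf(p, q):
    """L1 distance between the cumulative sums of two discrete densities."""
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Hypothetical distributions over five ordered bins: all mass in one bin.
base = [1.0, 0.0, 0.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0, 0.0, 0.0]  # mass moved by one bin
far = [0.0, 0.0, 0.0, 0.0, 1.0]   # mass moved by four bins

# The density-based distance cannot tell the small shift from the large one.
d_near = l1_density(base, near)  # 2.0
d_far = l1_density(base, far)    # 2.0

# The CDF-based distance grows with the size of the shift.
c_near = l1_cdf(base, near)      # 1.0
c_far = l1_cdf(base, far)        # 4.0
```

Because the density-based comparison looks at each bin in isolation, any two non-overlapping distributions are equally far apart, whereas the cumulative sums retain information about where the mass lies along the axis.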
