Information Technology Reference
In-Depth Information
Definition 3.4
For ʔ , ʘ
P ( X ), we define the metric L as follows:
inf m ( x , y ) ( x , y )
M ( ʔ , ʘ ) .
=
|
L ( ʔ , ʘ )
ʓ
Lemma 3.1
If ( X , m ) is a separable metric space then K and L are metrics on
P ( X ) .
The famous Kantorovich-Rubinstein duality theorem gives a dual representation
of K in terms of L .
Theorem 3.2 ( Kantorovich-Rubinstein [ 38 ] ) If ( X , m ) is a separable metric space
then for any two distributions ʔ , ʘ
In view of the above theorem, many papers in the literature directly take Def-
inition 3.4 as the definition of the Kantorovich metric. Here we keep the original
definition, but it is helpful to understand K by using L . Intuitively, a probability
measure ʓ M ( ʔ , ʘ ) can be understood as a transportation from one unit mass
distribution ʔ to another unit mass distribution ʘ . If the distance m ( x , y ) represents
the cost of moving one unit of mass from location x to location y then the Kan-
torovich distance gives the optimal total cost of transporting the mass of ʔ to ʘ .We
refer the reader to [ 39 ] for an excellent exposition on the Kantorovich metric and the
duality theorem.
Many problems in computer science only involve finite state spaces, so discrete
distributions with finite supports are sometimes more interesting than continuous
distributions. For two discrete distributions ʔ and ʘ with finite supports
=
P ( X ) we have K ( ʔ , ʘ )
L ( ʔ , ʘ ) .
{
x 1 , ... , x n }
and
, respectively, minimising the total cost of a discretised version of
the transportation problem reduces to the following linear programming problem:
Minimise i = 1 l j = 1 m ( x i , y j ) ʓ ( x i , y j )
subject to
{
y 1 , ... , y l }
n : l j = 1 ʓ ( x i , y j )
•∀
1
i
=
ʔ ( x i )
(3.6)
l : i = 1 ʓ ( x i , y j )
•∀
1
j
=
ʘ ( y j )
•∀
1
i
n ,1
j
l : ʓ ( x i , y j )
0 .
Since ( 3.6 ) is a special case of the discrete mass transportation problem, some
well-known polynomial time algorithms like [ 34 ] can be employed to solve it, which
is an attractive feature for computer scientists.
Recall that a pseudometric is a function that yields a nonnegative real number for
each pair of elements and satisfies the following: m ( s , s )
=
0, m ( s , t )
= m ( t , s ), and
m ( s , t )
m ( s , u )
+ m ( u , t ), for any s , t S . We say a pseudometric m is 1-bounded
if m ( s , t )
1 for any s and t . Let ʔ and ʘ be distributions over a finite set S of states.
In [ 40 ] a 1-bounded pseudometric m on S is lifted to be a 1-bounded pseudometric
ˆ
m on
D
( S ) by setting the distance
m ( ʔ , ʘ ) to be the value of the following linear
ˆ
Search WWH ::




Custom Search