4.1 The Data Splitter Setting
The (univariate) discrete data splitter corresponds to a classifier function

$$z = z(x) = \begin{cases} \bar{z}, & x \le \bar{x}, \\ -\bar{z}, & x > \bar{x}, \end{cases} \qquad (4.3)$$

where $\bar{x}$ is a data split point (or threshold) and $\bar{z} \in \{-1, 1\}$ is a class label.
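As an illustrative sketch (not from the text), the splitter of equation (4.3) can be written as a small function, where `x_bar` and `z_bar` stand for the threshold $\bar{x}$ and the left-side label $\bar{z}$:

```python
def data_splitter(x, x_bar, z_bar):
    """Univariate data splitter of Eq. (4.3): assign label z_bar to
    points at or below the threshold x_bar, and -z_bar above it."""
    return z_bar if x <= x_bar else -z_bar
```

For instance, with $\bar{x} = 0.5$ and $\bar{z} = -1$, `data_splitter(0.3, 0.5, -1)` returns `-1`, while `data_splitter(0.7, 0.5, -1)` returns `1`.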
The theoretic optimal classification (decision) rule corresponds to a split point $x^*$ and class label $z^*$ such that:

$$(x^*, z^*) = \arg\min P(z(X) \neq t(X)), \qquad (4.4)$$
with

$$\min P_e = \inf_x \min\Big\{ \underbrace{p\,F_{X|1}(x) + q\,(1 - F_{X|-1}(x))}_{\bar{z}=-1},\; \underbrace{p\,(1 - F_{X|1}(x)) + q\,F_{X|-1}(x)}_{\bar{z}=1} \Big\}, \qquad (4.5)$$
where $F_{X|t}$ is the distribution function of class $\omega_t$ for $t \in \{-1, 1\}$, and $p$ and $q$ are the class priors. The first term inside braces in equation (4.5) corresponds to the situation where $\min P_e$ is reached when $\bar{z} = -1$ is at the left of $\bar{x}$; the second term corresponds to swapping the class labels. A split given by $(x^*, z^*)$ is called a theoretical Stoller split [223]. The data-based version, the empirical Stoller split, essentially chooses the solution $(\bar{x}, \bar{z})$ such that the empirical error is minimal [223], that is,
$$(\bar{x}, \bar{z}) = \arg\min_{(x,z) \in \mathbb{R} \times \{-1,1\}} \frac{1}{n} \sum_{i=1}^{n} \left( I_{\{X_i \le x,\; T_i \neq z\}} + I_{\{X_i > x,\; T_i \neq -z\}} \right). \qquad (4.6)$$
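A brute-force sketch of the empirical Stoller split (an illustrative helper, not code from the text) scans every candidate threshold and both label orientations, keeping the pair with the smallest empirical error:

```python
import numpy as np

def empirical_stoller_split(X, T):
    """Brute-force empirical Stoller split in the sense of Eq. (4.6).
    X: 1-D array of feature values; T: array of labels in {-1, +1}.
    Returns (split point, left-side label, empirical error)."""
    # Candidate thresholds: the sample values themselves (any threshold
    # between two consecutive sorted samples yields the same partition).
    candidates = np.unique(X)
    n = len(X)
    best = (None, None, np.inf)
    for x in candidates:
        left = X <= x
        for z in (-1, 1):
            # Label z is assigned on the left of x, -z on the right;
            # count the samples that disagree with that assignment.
            err = (np.sum(T[left] != z) + np.sum(T[~left] != -z)) / n
            if err < best[2]:
                best = (x, z, err)
    return best
```

This naive scan costs $O(n^2)$; sorting the samples once and sweeping the threshold while updating the error counts incrementally would bring it down to $O(n \log n)$.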
The probability of error of the empirical Stoller split converges to the Bayes error for $n \to \infty$ [52]. We assume from now on that $\omega_{-1}$ is at the left of $\omega_1$, that is, $\bar{z} = -1$, and our data splitter is given by

$$z = z(x) = \begin{cases} -1, & x \le \bar{x}, \\ 1, & x > \bar{x}. \end{cases} \qquad (4.7)$$
An important result on candidate optimal split points is given by the following theorem [216]:

Theorem 4.1. For continuous univariate class-conditional PDFs $f_{X|-1}$ and $f_{X|1}$ the Stoller split occurs either at an intersection of $q f_{X|-1}$ with $p f_{X|1}$ or at $+\infty$ or $-\infty$.
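As a numerical illustration of Theorem 4.1 (assuming Gaussian class-conditionals with made-up priors and parameters, not an example from the text), a grid minimiser of the $\bar{z} = -1$ error term of equation (4.5) lands where $q f_{X|-1}$ crosses $p f_{X|1}$:

```python
import numpy as np
from math import erf, exp, pi, sqrt

# Hypothetical setting: equal priors, unit-variance Gaussians with
# omega_{-1} to the left of omega_1 (means -1 and +1).
p, q = 0.5, 0.5
mu_neg, mu_pos, sigma = -1.0, 1.0, 1.0

def pdf(x, mu):
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def cdf(x, mu):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def perror(x):
    # P_e(x) for the orientation z_bar = -1 (class -1 assigned left of x),
    # i.e. the first term inside braces in Eq. (4.5).
    return p * cdf(x, mu_pos) + q * (1 - cdf(x, mu_neg))

grid = np.linspace(-5, 5, 10001)
x_star = grid[np.argmin([perror(x) for x in grid])]
# At the minimiser the weighted densities intersect:
# q * f_{X|-1}(x_star) should (numerically) equal p * f_{X|1}(x_star).
print(x_star, q * pdf(x_star, mu_neg) - p * pdf(x_star, mu_pos))
```

With these symmetric, equal-prior Gaussians the intersection, and hence the Stoller split, sits at $x = 0$; skewing the priors moves it toward the less probable class.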
Proof. The probability of error for a given split point $x$ is given by