3. Deciding on a better segmentation, based on the average of $1 - P_{value}(CD)$ over all segments of a data stream. A higher average denotes a better segmentation.
4. Deciding on a better segmentation, based on the percentage of segments where $1 - P_{value}(CD)$ exceeds the minimal 0.5% confidence level out of all the segments of a data stream. A higher percentage denotes a better segmentation.
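Criteria 3 and 4 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `summarize_confidence`, the sample confidence values, and the threshold passed in the usage example are all hypothetical, and the input is assumed to be one $1 - P_{value}(CD)$ value per segment, expressed as a fraction between 0 and 1.

```python
from statistics import mean


def summarize_confidence(confidences, threshold):
    """Summarise per-segment 1 - P_value(CD) values for one segmentation trial.

    confidences: one confidence value per segment (fractions in [0, 1]).
    threshold:   minimal confidence level a segment must exceed (assumed input).
    """
    if not confidences:
        raise ValueError("at least one segment is required")
    above = sum(1 for c in confidences if c > threshold)
    return {
        # Criterion 3: a higher average denotes a better segmentation.
        "average": mean(confidences),
        # Criterion 4: a higher share of segments above the threshold is better.
        "pct_above": above / len(confidences),
    }


# Hypothetical trial with four segments and an illustrative threshold.
trial = summarize_confidence([0.99, 0.40, 0.97, 0.80], threshold=0.95)
print(trial)
```

Two trials can then be compared parameter by parameter; the threshold is left as an argument because its exact value is a choice of the analyst.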
[Figure: bar chart of Range, Stdev, Average, and Percentage 0.5% significant for trials 1, 2, and 3]
Fig. 4.6. Analysis of $1 - P_{value}(CD)$ in three trials of segmentation in the “stock” data set.
In this example, a relevancy rank from 1 to 3 is assigned to each statistical
parameter in each trial; a higher rank denotes a better score for that parameter.
The overall score is calculated as a weighted average of the ranks, and the outcome
is shown in Table 4.6. The weighting scheme for the parameters should be chosen by
the user of the segmentation model.
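Table 4.6. Evaluating segmentation in the “stock” data set.

| Trial  | Range | Standard deviation | Average | Percentage 0.5% significant | Score |
|--------|-------|--------------------|---------|-----------------------------|-------|
| Weight | 25%   | 25%                | 25%     | 25%                         | 100%  |
| 1      | 1     | 1                  | 3       | 1                           | 1.5   |
| 2      | 2     | 2                  | 2       | 2                           | 2     |
| 3      | 3     | 3                  | 1       | 3                           | 2.5   |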
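The scores in the table can be reproduced with a short weighted-average calculation. The sketch below is illustrative only: the helper name `weighted_score` is hypothetical, and the ranks are listed per trial in the table's column order with the equal 25% weights used in this example.

```python
def weighted_score(ranks, weights):
    """Return the weighted average of per-parameter ranks for one trial."""
    if len(ranks) != len(weights):
        raise ValueError("ranks and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(r * w for r, w in zip(ranks, weights)) / total_weight


# Equal weights for the four statistical parameters, as in the table.
weights = [0.25, 0.25, 0.25, 0.25]

# Ranks per trial, copied from the table rows (higher rank = better).
ranks_by_trial = {
    1: [1, 1, 3, 1],
    2: [2, 2, 2, 2],
    3: [3, 3, 1, 3],
}

scores = {trial: weighted_score(r, weights) for trial, r in ranks_by_trial.items()}
best_trial = max(scores, key=scores.get)

print(scores)      # {1: 1.5, 2: 2.0, 3: 2.5}
print(best_trial)  # 3
```

Changing the entries of `weights` shows how a different weighting scheme, which the text leaves to the user, can change which trial comes out on top.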
In this case the third trial, with the highest overall score (2.5), describes the
best segmentation of the three trials based on the change-detection methodology.
But the question of whether this is the best possible segmentation within a range
of various types and lengths of segmentation in the specific “stock” data cannot be answered