Laube et al. (P12. 2011a) illustrated a typical dilemma for the assessment of internal
validity. A major difficulty lies in finding suitable data sets that (a) express exactly
the patterns to be detected, and (b) feature sufficient semantic information documenting
those patterns. For example, the fine-grained cow tracking data used in Laube
et al. (P12. 2011a) did not feature information about the spatially and temporally
varying composition and arrangement of the tracked group of cows. Consequently,
assumptions had to be made for the validation process that are certainly open to debate.
This shortcoming can be overcome by generating synthetic data in which the number
and distribution of patterns can be precisely controlled for experiments (as has
been illustrated above). This, however, can undermine the credibility or the generic
character of a proposed technique, since one could argue that the simulation was
inappropriately fitted to suit the data mining technique. As a code of conduct, this text
suggests aiming for a combination of both simulated and real data for both validation
and verification purposes.
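
The idea of synthetic data with a precisely controlled pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the simulation used in the cited studies: the function name `make_synthetic_tracks`, its parameters, the random-walk flock centre and the uniformly scattered background entities are all hypothetical choices made for the example.

```python
import math
import random


def make_synthetic_tracks(n_entities, n_steps, flock_size, flock_radius,
                          extent=100.0, seed=0):
    """Generate tracks with one planted flock and return the ground truth.

    Hypothetical set-up: entities 0..flock_size-1 stay within flock_radius
    of a common, randomly walking centre; all other entities are placed
    uniformly at random at every step.
    """
    rng = random.Random(seed)  # seeded, so experiments are repeatable
    flock_ids = set(range(flock_size))
    cx, cy = extent / 2.0, extent / 2.0
    centres = []
    tracks = {i: [] for i in range(n_entities)}
    for _ in range(n_steps):
        # The flock centre performs a small random walk.
        cx += rng.uniform(-1.0, 1.0)
        cy += rng.uniform(-1.0, 1.0)
        centres.append((cx, cy))
        for i in range(n_entities):
            if i in flock_ids:
                # Uniform sample inside the disc around the centre.
                r = flock_radius * math.sqrt(rng.random())
                a = rng.uniform(0.0, 2.0 * math.pi)
                tracks[i].append((cx + r * math.cos(a), cy + r * math.sin(a)))
            else:
                # Background entity; it may fall inside the disc by chance.
                tracks[i].append((rng.uniform(0.0, extent),
                                  rng.uniform(0.0, extent)))
    return tracks, flock_ids, centres
```

Because the planted members are returned alongside the tracks, a detection method can be scored against a known ground truth. The caveat raised above still applies: a generator this simple may be fitted too closely to the method under test.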
3.3.1.4 Sensitivity Analysis
Movement mining methods may require the setting of parameters, and consequently
can express variable sensitivity with respect to these parameters. Sensitivity analysis
investigates which parameters cause significant changes in the methods' outcomes.
The core argument of Laube and Purves (P13. 2011) is based on a sensitivity analysis.
The paper investigates the sensitivity of methods for computing movement descrip-
tors to the selected analysis scale and associated data uncertainties. In this case the
validation procedure highlighted crucial sensitivities that are often neglected.
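
As a minimal sketch of scale sensitivity, the following Python fragment computes one movement descriptor (mean speed) from the same track at several temporal sampling steps. It is an illustration only and does not reproduce the procedure of the cited paper; the function names, the `(t, x, y)` fix layout and the zigzag test track are assumptions made for the example.

```python
import math


def mean_speed(track, step):
    """Mean speed of a track of (t, x, y) fixes, using every step-th fix."""
    sampled = track[::step]
    if len(sampled) < 2:
        raise ValueError("need at least two fixes at this sampling step")
    # Path length along the subsampled fixes.
    length = sum(math.dist((x0, y0), (x1, y1))
                 for (_, x0, y0), (_, x1, y1) in zip(sampled, sampled[1:]))
    duration = sampled[-1][0] - sampled[0][0]
    return length / duration


def scale_sensitivity(track, steps):
    """Evaluate the descriptor at each sampling step (analysis scale)."""
    return {step: mean_speed(track, step) for step in steps}


# Hypothetical zigzag track: the fine scale sees the zigzag, coarse scales do not.
zigzag = [(float(t), float(t), float(t % 2)) for t in range(9)]
result = scale_sensitivity(zigzag, [1, 2, 4])
# result[1] is about 1.41 (sqrt(2)); result[2] and result[4] are 1.0
```

The same object thus yields different speed values depending on the analysis scale, which is the kind of sensitivity that is easily overlooked when a single sampling rate is taken for granted.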
Another form of sensitivity analysis is performed in the movement mining
approaches presented in Laube et al. (P6. 2008b) and Laube et al. (P12. 2011a).
Both studies required some form of algorithm parameterization, balancing the error of
omission (eoo) against the error of commission (eoc) in the constrained decentralized
computing environment (see Sect. 4.1). Here, the varied constraint is the size of the
communication range c relative to the pattern radius p: the larger the communication
range, the smaller eoo and the larger eoc (Fig. 3.10). The figure also illustrates that
the error of omission in particular can be reduced when the rigor of the task is relaxed
from finding complete flock patterns (n = 10 individuals) to “partial” flocks built of
fewer individuals.
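
The trade-off between the two error types can be illustrated with a toy model in Python. This is a sketch under strong assumptions and not the decentralized algorithm of the cited studies: the true pattern is taken to be all entities within radius p of a centre, the "detected" set is taken to be all entities within communication range c of the same centre, and the error definitions (missed share of the true set, spurious share of the detected set) are assumed for the example.

```python
import math


def error_rates(detected, truth):
    """Return (eoo, eoc) for two sets of entity ids.

    Assumed definitions: eoo is the share of true members that were missed,
    eoc is the share of detected members that are not true members.
    """
    eoo = len(truth - detected) / len(truth) if truth else 0.0
    eoc = len(detected - truth) / len(detected) if detected else 0.0
    return eoo, eoc


def sweep_range_ratio(points, centre, p, ratios):
    """Vary the ratio c / p and record (eoo, eoc) for each ratio."""
    truth = {i for i, xy in points.items() if math.dist(xy, centre) <= p}
    results = {}
    for ratio in ratios:
        c = ratio * p  # communication range for this run
        detected = {i for i, xy in points.items()
                    if math.dist(xy, centre) <= c}
        results[ratio] = error_rates(detected, truth)
    return results
```

In this toy model, a small c misses true members (high eoo, no eoc), while a large c captures every member but also admits outsiders (no eoo, higher eoc), matching the direction of the relationship described above.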
3.3.1.5 Comparison to Other Methods
When new methods extend existing methods, a direct comparison of their outcomes
is a suitable means of validation. Dodge et al. (P14. 2012) propose a new variant of
an edit distance for assessing trajectory similarity that is clearly positioned in a
succession of related methods. The comparative study then revealed similarities
and differences between the compared methods, allowing a validation of the newly
proposed method.
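
To make the notion of an edit distance on trajectories concrete, the following Python sketch implements a simple threshold-based variant in the spirit of the well-known EDR measure (Edit Distance on Real sequence). It is explicitly not the variant proposed by Dodge et al.; the function name, the matching threshold `eps` and the unit costs are assumptions chosen for illustration.

```python
def trajectory_edit_distance(a, b, eps):
    """Threshold-based edit distance between two lists of (x, y) points.

    Two points match at zero cost if they differ by at most eps in both
    coordinates; substitution, insertion and deletion each cost 1.
    """
    n, m = len(a), len(b)
    prev = list(range(m + 1))  # distance from an empty prefix of a
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            match = (abs(a[i - 1][0] - b[j - 1][0]) <= eps and
                     abs(a[i - 1][1] - b[j - 1][1]) <= eps)
            substitute = prev[j - 1] + (0 if match else 1)
            delete = prev[j] + 1       # drop a point of a
            insert = curr[j - 1] + 1   # add a point of b
            curr[j] = min(substitute, delete, insert)
        prev = curr
    return prev[m]
```

A comparison study of the kind described above would run several such measures on the same pairs of trajectories and examine where their rankings agree and where they diverge.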