Fig. 4. Probabilistic expert system.
7 Comparison of the Methods
We have considered two classes of statistical models for web clickstream data.
It is quite difficult to choose between them. Here the situation is complicated by the
fact that we have to compare local models (such as sequence rules) with global
models (such as probabilistic expert systems).
For global models, such as probabilistic expert systems, statistical evaluation can
proceed in terms of classical scoring methods, such as likelihood ratio scoring, AIC or
BIC, or alternatively by means of computationally intensive predictive evaluation
based on cross-validation and/or bootstrapping. The real problem is how to
compare such models with sequence rules.
A simple and natural scoring function for a sequence rule is its support, which gives
the proportion of the population to which the rule applies. Another measure of the
interestingness of a rule, relative to a situation of irrelevance, is the lift of the rule
itself. The lift is the ratio between the confidence of the rule A → B
and the support of B. Recalling the definition of the confidence index (the support of
A → B divided by the support of A), the lift compares the observed frequency of the
rule with the frequency expected under independence between A and B, so that a lift
of 1 corresponds to independence.
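
The support, confidence and lift of a rule can be sketched in a few lines. The example
below is a minimal sketch under stated assumptions: each session is taken to be an
ordered list of page names, the rule A → B is taken to hold in a session when B
appears anywhere after the first visit to A, and the sessions themselves are made up
for illustration. The function names are hypothetical.

```python
def follows(session, a, b):
    """True if page b appears somewhere after the first visit to page a."""
    if a not in session:
        return False
    return b in session[session.index(a) + 1:]

def rule_scores(sessions, a, b):
    """Support, confidence and lift of the sequence rule a -> b."""
    n = len(sessions)
    support_a = sum(a in s for s in sessions) / n
    support_b = sum(b in s for s in sessions) / n
    support_ab = sum(follows(s, a, b) for s in sessions) / n
    confidence = support_ab / support_a if support_a else 0.0
    lift = confidence / support_b if support_b else 0.0
    return {"support": support_ab, "confidence": confidence, "lift": lift}

# Hypothetical sessions, not taken from the case-study data.
sessions = [
    ["start_session", "home", "catalog", "end_session"],
    ["start_session", "home", "end_session"],
    ["start_session", "catalog", "basket", "end_session"],
    ["start_session", "home", "catalog", "basket", "end_session"],
]

print(rule_scores(sessions, "start_session", "end_session"))
print(rule_scores(sessions, "home", "catalog"))
```

In this toy data the rule start_session → end_session has support and confidence
equal to 1 but also a lift of exactly 1: it holds in every session and therefore
carries no information beyond what independence would already predict.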
Ultimately, though, an association pattern has to be judged by
its utility for the objectives of the analysis at hand. In the present case study, for
instance, the informative value of the start_session → end_session rule, which in
Table 1 has the largest support and confidence (100%), is null. On the
other hand, the informative value of the rules that go from start_session to other
pages, and from other pages to end_session, can be extremely important for the design
of the website.