8 Conclusions
In this paper we have presented two methodologies for the analysis of web
clickstream data. The first one is based on classical association and sequence rules.
We have proposed a method that calculates direct sequence rules explicitly. By means
of association rules one is able to understand local associations between visited pages.
Association rules are relatively easy to extract and interpret. However, because they
are local, it is difficult to obtain from them a global picture of how pages are associated.
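As a reminder of what a direct sequence rule measures, the following is a minimal sketch rather than the method proposed in the paper. It assumes that each session is an ordered list of page names, that the support of a rule a -> b is the fraction of sessions in which b is visited immediately after a, and that its confidence is that count divided by the number of sessions containing a. The function name, the sample sessions and these definitions are illustrative assumptions.

```python
from collections import Counter


def direct_sequence_rules(sessions, min_support=0.0):
    """Return {(a, b): (support, confidence)} for direct rules a -> b.

    Assumed definitions (illustrative, not taken from the paper):
      support    = sessions where b directly follows a / all sessions
      confidence = sessions where b directly follows a / sessions containing a
    """
    n = len(sessions)
    page_count = Counter()  # number of sessions containing each page
    pair_count = Counter()  # number of sessions where b directly follows a
    for session in sessions:
        # Use sets so each page or pair is counted once per session.
        page_count.update(set(session))
        pair_count.update(set(zip(session, session[1:])))

    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        if support >= min_support:
            rules[(a, b)] = (support, count / page_count[a])
    return rules


# Hypothetical clickstream sessions, used only for illustration.
sessions = [
    ["home", "catalog", "product"],
    ["home", "catalog", "cart"],
    ["home", "product"],
    ["catalog", "product"],
]
rules = direct_sequence_rules(sessions)
for (a, b), (support, confidence) in sorted(rules.items()):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

Each rule describes a single pair of pages, which is why such rules are easy to read but give only a local view of the navigation.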
The second methodology is based on a more sophisticated statistical model, a
probabilistic expert system. We have proposed a simple implementation of such
models. Probabilistic expert systems are global in nature, and they thus allow an
overall interpretation of the associations. However, the specification and
interpretation of such models may be quite difficult.
We thus conclude that both methodologies should be considered in practical
applications, with the choice between them made on the basis of the objectives of
the analysis.