Efficient Intrusion Detection with KNN Classification and DS Theory - Proceedings of All India Seminar on Biomedical Engineering 2012 (AISOBE-2012)

Statistical rules are commonly used in the classification of textual information, which covers several tasks in Information Retrieval. These include not only the determination of good documents in terms of relevance to user needs, but also the classification of documents into categories (topics) according to predefined classes [3]. In the following, we include studies found in the literature about both the retrieval and the categorization tasks.

The use of rules for categorization comes from a process of classifying documents into different categories according to their topics in order to optimize a subsequent retrieval process. One of the most relevant works on categorization using rules is [4]. The general idea of this work is the automatic discovery of classification patterns for document categorization. The aim of the induction process is to produce sets of decision rules that distinguish among the categories to which documents belong. The attributes of the rules can be a word or a pair of words; these form a dictionary from which the less frequent words are eliminated. Finally, association rules have also been used for categorization [5], where the authors propose a solution for text categorization based on applying the best generated association rules to build a classifier.
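The dictionary-building step described for [4] can be illustrated with a minimal sketch. This is not the procedure of [4] itself: the function name `build_dictionary`, the whitespace tokenization, the use of adjacent word pairs, and the `min_count` threshold are all assumptions made for illustration only.

```python
from collections import Counter


def build_dictionary(documents, min_count=2):
    """Collect candidate rule attributes (words and word pairs).

    Assumptions for this sketch (not taken from [4]):
    - tokens are lowercased and split on whitespace;
    - a "pair of words" means two adjacent tokens;
    - an attribute is eliminated if it occurs fewer than
      `min_count` times over all documents.
    """
    counts = Counter()
    for text in documents:
        tokens = text.lower().split()
        counts.update(tokens)                   # single words
        counts.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    # Elimination step: drop the less frequent attributes.
    return {term for term, n in counts.items() if n >= min_count}


# Hypothetical toy corpus, used only to exercise the function.
docs = ["stock market falls", "stock market rises", "rain falls"]
vocabulary = build_dictionary(docs, min_count=2)
```

With this toy corpus, the retained attributes are the words and the one word pair that occur at least twice. A rule learner would then use the presence or absence of each retained attribute in a document as the conditions of its decision rules.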

The Dempster-Shafer Theory

The Dempster-Shafer theory (DST) of evidence originated in the work of [6, 7] on the theory of probabilities with upper and lower bounds. It has since been extended by numerous authors and popularized, but only to a degree, in the literature on Artificial Intelligence (AI) and expert systems, as a technique for modeling reasoning under uncertainty. In this respect it can be seen to offer numerous advantages over the more "traditional" methods of Statistics and Bayesian decision theory. Hajek [8] remarked that real, practical applications of DST methods have been rare, but since then there has been a marked increase in applications that use DST. Although DST is not in widespread use, it has been applied with some success to such topics as face recognition [9], statistical classification [10], and target identification [11]. Additional applications have centered on multisource information, including medical diagnosis [12] and plan recognition [13]. An exception is the paper by Cortes-Rello and Golshani [14], which, although written for a computing science and AI readership, does deal with the "knowledge domain" of forecasting and Marketing Planning. For those with even limited knowledge of these domains the paper appears rather naive, referring for example to rather venerable old editions of standard texts such as [15]. The aim of this paper is to suggest that there is a good deal of potential in the DST approach, which is as yet largely unexploited. The origins of the mathematical theory of probability date back at least to the work of the eighteenth-century scholar, the Reverend Thomas Bayes [16], whose work was published posthumously in 1763. It provides the foundations for the theory of statistical inference (involving both estimation and testing of hypotheses) and for techniques of decision making under
