Biomedical Engineering Reference
In-Depth Information
To further understand the Bayesian approach, especially with regard to the
representation of ignorance, consider the following example, similar to that in
[ 5 ]. Let a be the proposition that "I live in Kings Road, Cardiff".
How could one construct P(a), a Bayesian belief in a? First, we must choose a
frame of discernment, denoted by H, and a subset A of H representing the
proposition a; we would then need to use the Principle of Insufficient Reason to
arrive at a Bayesian belief. The problem is that there are a number of possible
frames of discernment H that we could choose, depending effectively on how many
Cardiff roads can be enumerated. If only two such roads are identifiable, then
H = {x 1 , x 2 } and A = {x 1 }. The Principle of Insufficient Reason then gives
P(A) = 0.5, by allocating subjective probability evenly over the frame of
discernment. If it is estimated that there are about 1,000 roads in Cardiff,
then H = {x 1 , x 2 , ..., x 1000 }, with again A = {x 1 } and the other x i 's
representing the other roads. In this case the Principle of Insufficient Reason
gives P(A) = 0.001. Either of these frames may be reasonable, but the
probability assigned to A is crucially dependent upon the frame chosen. Hence,
one's Bayesian belief is a function not only of the information given and one's
background knowledge, but also of the sometimes arbitrary choice of frame of
discernment. To put the point another way, we need to distinguish between
uncertainty and ignorance. Similar arguments hold where we are discussing not
probabilities per se but weights that measure subjective assessments of relative
importance. This issue arises in decision support models such as the Analytic
Hierarchy Process (AHP), which requires the weights on a given level of the
decision tree to sum to unity, see [ 22 ].
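The dependence of the Bayesian belief on the chosen frame can be checked with a
few lines of Python. This is only a minimal sketch of the example above: the
function name insufficient_reason and the labels "x1", "x2", ... are
illustrative assumptions, with "x1" standing for Kings Road.

```python
from fractions import Fraction


def insufficient_reason(frame, subset):
    """Probability of `subset` when belief is spread evenly over `frame`."""
    frame = set(frame)
    subset = set(subset)
    if not frame:
        raise ValueError("frame of discernment must be non-empty")
    if not subset <= frame:
        raise ValueError("subset must lie inside the frame")
    # Every element of the frame receives the same share, 1 / |frame|.
    return Fraction(len(subset), len(frame))


# Hypothetical labels: "x1" is Kings Road, the rest are other Cardiff roads.
small_frame = [f"x{i}" for i in range(1, 3)]      # two identifiable roads
large_frame = [f"x{i}" for i in range(1, 1001)]   # about 1,000 roads

p_small = insufficient_reason(small_frame, {"x1"})
p_large = insufficient_reason(large_frame, {"x1"})

print(float(p_small))  # 0.5
print(float(p_large))  # 0.001
```

The proposition and the evidence are identical in both calls; only the size of
the frame differs, and the resulting belief changes by a factor of 500.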
KDD Data Set 99
In 1998, DARPA, together with Lincoln Laboratory at MIT, released the DARPA
1998 data set for evaluating intrusion detection systems (IDS) [ 23 ]. The DARPA
1998 data set contains 7 weeks of training data and 2 weeks of testing data. In
total, there are 38 attacks across the training and testing data. The refined
version of the DARPA data set, which contains only network data (i.e. tcpdump
data), is termed the KDD data set. The Third International Knowledge Discovery
and Data Mining Tools Competition was held in conjunction with KDD-99, the
Fifth International Conference on Knowledge Discovery and Data Mining, and the
KDD data set is the data set that was used for this competition. The KDD
training data set consists of approximately 4,900,000 single connection
vectors, each of which consists of 41 features and is marked as either normal
or an attack, with exactly one particular attack type [ 23 ]. The features take
both continuous and symbolic forms, with widely varying ranges, and fall into
four categories:
• The first category consists of the intrinsic features of a connection, which
comprise the fundamental features of each individual TCP connection.