| Category | Size | Most typical | Most discriminatively typical | Most atypical |
|---|---|---|---|---|
| Mammal | 40 | Boar, Cheetah, Leopard, Lion, Lynx, Mongoose, Polecat, Puma, Raccoon, Wolf (T = 0.16) | Boar, Cheetah, Leopard, Lion, Lynx, Mongoose, Polecat, Puma, Raccoon, Wolf (DT = 0.08) | Platypus (T = 0.01) |
| Bird | 20 | Lark, Pheasant, Sparrow, Wren (T = 0.15) | Lark, Pheasant, Sparrow, Wren (DT = 0.04) | Penguin (T = 0.04) |
| Fish | 14 | Bass, Catfish, Chub, Herring, Piranha (T = 0.15) | Bass, Catfish, Herring, Chub, Piranha (DT = 0.03) | Carp (T = 0.03) |
| Invertebrate | 10 | Crayfish, Lobster (T = 0.16) | Crayfish, Lobster (DT = 0.01) | Scorpion (T = 0.08) |
| Insect | 8 | Moth, Housefly (T = 0.13) | Gnat (DT = 0.02) | Honeybee (T = 0.06) |
| Reptile | 5 | Slowworm (T = 0.17) | Pitviper (DT = 0.007) | Seasnake (T = 0.08) |
| Amphibian | 3 | Frog (T = 0.2) | Frog (DT = 0.008) | Newt, Toad (T = 0.16) |

Table 4.2 The most typical, the most discriminatively typical, and the most atypical animals (T = simple typicality value, DT = discriminative typicality value).
4.5.1 Typicality Queries on Real Data Sets
In this section, we use two real data sets to illustrate the effectiveness of typicality
queries in real applications.
4.5.1.1 Typicality Queries on the Zoo Data Set
We use the Zoo Database from the UCI Machine Learning Database Repository,¹
which contains 100 tuples on 15 Boolean attributes and 2 numerical attributes, such
as hair (Boolean), feathers (Boolean), and number of legs (numeric). All tuples are
classified into 7 categories (mammals, birds, reptiles, fish, amphibians, insects, and
invertebrates).
We treat each category as an uncertain object and each tuple as an
instance. Euclidean distances between instances are computed by treating
Boolean values as binary (0/1) values. We apply the simple typicality, discriminative
typicality, and representative typicality queries to the Zoo Database. The results of
all three queries agree with the common-sense notion of typicality.
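The distance computation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the three-attribute tuples are hypothetical toy values and are not taken from the Zoo Database. The scoring function assumes that simple typicality is estimated with a Gaussian kernel density estimator, and the bandwidth `h = 1.0` is an arbitrary choice for the example.

```python
import math

# Hypothetical toy tuples: (hair, feathers, legs). Booleans are stored as 0/1.
# These values are illustrative only and are NOT taken from the Zoo Database.
instances = {
    "a": (1, 0, 4),
    "b": (1, 0, 4),  # identical to "a" on every attribute
    "c": (1, 0, 2),
    "d": (0, 0, 4),
}


def euclidean(u, v):
    """Euclidean distance, with Boolean attributes treated as 0/1 numbers."""
    return math.sqrt(sum((float(x) - float(y)) ** 2 for x, y in zip(u, v)))


def simple_typicality(name, instances, h=1.0):
    """Gaussian-kernel density estimate at one instance.

    Assumption: this kernel form and the bandwidth h are chosen for
    illustration; the text above does not specify either.
    """
    n = len(instances)
    o = instances[name]
    total = sum(
        math.exp(-euclidean(o, other) ** 2 / (2 * h * h))
        for other in instances.values()
    )
    return total / (n * h * math.sqrt(2 * math.pi))


# Score every instance within its (single, toy) category and rank them.
scores = {name: simple_typicality(name, instances) for name in instances}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions, instances with identical attribute values receive identical scores, and the instance farthest from the rest of its category ranks last.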
We compute the simple typicality of each animal in the data set. Table 4.2 shows
the most typical and the most atypical animals in each category. Some tuples,
such as the 10 most typical animals in the mammal category, have the same values
on all attributes and therefore the same typicality value. The most typical animals
returned in each category can serve as good exemplars of that category. For example,
¹ http://www.ics.uci.edu/~mlearn/MLRepository.html
 