where the indicator $\delta_{ij}^{(f)} = 0$ if either (1) $x_{if}$ or $x_{jf}$ is missing (i.e., there is no
measurement of attribute $f$ for object $i$ or object $j$), or (2) $x_{if} = x_{jf} = 0$ and attribute
$f$ is asymmetric binary; otherwise, $\delta_{ij}^{(f)} = 1$. The contribution of attribute $f$ to the
dissimilarity between $i$ and $j$ (i.e., $d_{ij}^{(f)}$) is computed dependent on its type:
If $f$ is numeric: $d_{ij}^{(f)} = \frac{|x_{if} - x_{jf}|}{\max_h x_{hf} - \min_h x_{hf}}$, where $h$ runs over all nonmissing objects for
attribute $f$.
If $f$ is nominal or binary: $d_{ij}^{(f)} = 0$ if $x_{if} = x_{jf}$; otherwise, $d_{ij}^{(f)} = 1$.
If $f$ is ordinal: compute the ranks $r_{if}$ and $z_{if} = \frac{r_{if} - 1}{M_f - 1}$, and treat $z_{if}$
as numeric.
These steps are identical to what we have already seen for each of the individual
attribute types. The only difference is for numeric attributes, where we normalize so
that the values map to the interval [0.0, 1.0]. Thus, the dissimilarity between objects
can be computed even when the attributes describing the objects are of different
types.
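The rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `mixed_dissimilarity`, the `specs` dictionaries, and the convention that `None` marks a missing value are all assumptions made for the example. Ordinal values are assumed to be given as ranks 1..M_f, and the normalized ranks are assumed to span the full interval [0, 1], so no further range division is applied to them. The sample values in the usage lines are illustrative only.

```python
def mixed_dissimilarity(xi, xj, specs):
    """Average the per-attribute contributions d_f over attributes with indicator 1.

    xi, xj : sequences of attribute values (None = missing; an assumption).
    specs  : one dict per attribute, with a "kind" key (hypothetical layout):
             "numeric" (needs "min", "max"), "ordinal" (needs "states"),
             "nominal", "binary", or "asymmetric_binary".
    """
    total = 0.0   # sum of indicator * contribution
    count = 0     # sum of indicators
    for a, b, spec in zip(xi, xj, specs):
        kind = spec["kind"]
        # Indicator is 0 when either value is missing ...
        if a is None or b is None:
            continue
        # ... or when both values are 0 on an asymmetric binary attribute.
        if kind == "asymmetric_binary" and a == 0 and b == 0:
            continue
        if kind == "numeric":
            # Normalize by the observed range (assumes max > min).
            d = abs(a - b) / (spec["max"] - spec["min"])
        elif kind == "ordinal":
            # Map ranks 1..M onto [0, 1], then treat as numeric.
            m = spec["states"]
            d = abs((a - 1) / (m - 1) - (b - 1) / (m - 1))
        else:
            # Nominal or binary: simple match / mismatch.
            d = 0.0 if a == b else 1.0
        total += d
        count += 1
    if count == 0:
        raise ValueError("no attribute is comparable for this pair")
    return total / count


# Illustrative pair: one nominal, one ordinal (3 states), one numeric attribute.
specs = [
    {"kind": "nominal"},
    {"kind": "ordinal", "states": 3},
    {"kind": "numeric", "min": 22, "max": 64},
]
print(round(mixed_dissimilarity(("C", 2, 64), ("A", 3, 45), specs), 2))
```

Skipping an attribute whose indicator is 0 removes it from both the numerator and the denominator, so a missing value shrinks the average rather than counting as a match or a mismatch.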
Example 2.22 Dissimilarity between attributes of mixed type. Let's compute a dissimilarity matrix
for the objects in Table 2.2. Now we will consider all of the attributes, which are of
different types. In Examples 2.17 and 2.21, we worked out the dissimilarity matrices
for each of the individual attributes. The procedures we followed for test-1 (which is
nominal) and test-2 (which is ordinal) are the same as outlined earlier for processing
attributes of mixed types. Therefore, we can use the dissimilarity matrices obtained for
test-1 and test-2 later when we compute Eq. (2.22). First, however, we need to compute
the dissimilarity matrix for the third attribute, test-3 (which is numeric). That is, we
must compute $d_{ij}^{(3)}$. Following the case for numeric attributes, we let $\max_h x_h = 64$ and
$\min_h x_h = 22$. The difference between the two is used in Eq. (2.22) to normalize the
values of the dissimilarity matrix. The resulting dissimilarity matrix for test-3 is
$$
\begin{bmatrix}
0 & & & \\
0.55 & 0 & & \\
0.45 & 1.00 & 0 & \\
0.40 & 0.14 & 0.86 & 0
\end{bmatrix}.
$$
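As a quick check, the test-3 matrix can be reproduced with a short Python sketch. Table 2.2 is not reproduced in this section, so the four test-3 values below (45, 22, 64, 28) are an assumption; they are consistent with the stated maximum of 64, the minimum of 22, and the entries shown. The function name `numeric_dissimilarity_matrix` is hypothetical.

```python
def numeric_dissimilarity_matrix(values):
    """Lower-triangular matrix of |x_i - x_j| / (max - min), rounded to 2 places."""
    span = max(values) - min(values)  # 64 - 22 = 42 for the assumed values
    return [
        [round(abs(values[i] - values[j]) / span, 2) for j in range(i + 1)]
        for i in range(len(values))
    ]


# Assumed test-3 values for objects 1 to 4 (not listed in this section).
test3 = [45, 22, 64, 28]
for row in numeric_dissimilarity_matrix(test3):
    print(row)
```

Only the lower triangle is built because the matrix is symmetric with a zero diagonal.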
We can now use the dissimilarity matrices for the three attributes in our computation of
Eq. (2.22). The indicator $\delta_{ij}^{(f)} = 1$ for each of the three attributes, $f$. We get, for example,
$d(3,1) = \frac{1(1) + 1(0.50) + 1(0.45)}{3} = 0.65$. The resulting dissimilarity matrix obtained for the
 