Surprisingly, restricting the list of terms when applying LSA provided only minimal
improvement over traditional LSA, i.e. LSA applied to the full list of terms. Only
when the explicit relations between terms and latent concepts were modeled was
there a substantial improvement in the agreement with the hand-coded classification
in the initial study.
Tables 6.5, 6.6 and 6.7 depict confusion matrices for the three semi-automated
approaches (using the best possible value approach when mapping the nice clusters
into five overall categories).
Table 6.5 Confusion Matrix for traditional LSA
       Stim.   Learn.   LT Usab.   Usef.   Social
C1.        0        0          0       0        0
C2.       24       51         23       1        1
C3.        1       23         79      22       20
C4.        1        1          1      78        2
C5.        0        1          2       3        4
Table 6.6 Confusion Matrix for LSA on restricted terms
       Stim.   Learn.   LT Usab.   Usef.   Social
C1.        0        0          0       0        0
C2.       22       53          8      10        7
C3.        2       13         95      19       15
C4.        3        9          0      72        1
C5.        0        0          0       0        0
Table 6.7 Confusion Matrix for Concept Analysis (proposed procedure)
       Stim.   Learn.   LT Usab.   Usef.   Social
C1.       24        0          0       0        0
C2.       41       67          3       6        4
C3.        0        0         89       0        0
C4.        2        8         11      90        0
C5.        0        0          0       5       19
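One rough way to compare the three matrices is the share of narratives that fall on the main diagonal. The sketch below computes that share from the counts in Tables 6.5 to 6.7. It assumes that rows C1 to C5 correspond, in order, to the columns Stim., Learn., LT Usab., Usef. and Social, which the tables suggest but do not state. The function name overall_agreement is illustrative, and this diagonal share is not necessarily the agreement measure used in the study.

```python
# Counts transcribed from Tables 6.5-6.7.
# Rows: C1..C5. Columns: Stim., Learn., LT Usab., Usef., Social.
# Assumption: row Ci is the cluster mapped to the i-th column category.
TABLES = {
    "traditional LSA": [
        [0, 0, 0, 0, 0],
        [24, 51, 23, 1, 1],
        [1, 23, 79, 22, 20],
        [1, 1, 1, 78, 2],
        [0, 1, 2, 3, 4],
    ],
    "LSA on restricted terms": [
        [0, 0, 0, 0, 0],
        [22, 53, 8, 10, 7],
        [2, 13, 95, 19, 15],
        [3, 9, 0, 72, 1],
        [0, 0, 0, 0, 0],
    ],
    "Concept Analysis": [
        [24, 0, 0, 0, 0],
        [41, 67, 3, 6, 4],
        [0, 0, 89, 0, 0],
        [2, 8, 11, 90, 0],
        [0, 0, 0, 5, 19],
    ],
}


def overall_agreement(matrix):
    """Return the fraction of all counts that lie on the main diagonal."""
    total = sum(sum(row) for row in matrix)
    on_diagonal = sum(matrix[i][i] for i in range(len(matrix)))
    return on_diagonal / total


for name, matrix in TABLES.items():
    print(f"{name}: {overall_agreement(matrix):.3f}")
```

Under that row-to-column assumption, the diagonal shares come out at roughly 0.63, 0.67 and 0.78, which matches the pattern described above: a small gain from restricting the terms and a larger one from the proposed procedure.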
6.5 Discussion
Overall, the proposed approach was shown to display a substantially closer fit to the
results of manual clustering of narratives than Latent Semantic Analysis.
Interestingly, however, this was mainly rooted in the explicit modeling