A Semi-Automated Approach to the Content Analysis of Experience Narratives - Modeling Users' Experiences with Interactive Systems

6.4.4 Traditional Latent-Semantic Analysis

The third approach was a traditional Latent-Semantic Analysis as described in this chapter. It involved two pre-processing steps: a) removing a list of stop-words, and b) stemming terms to their root form. This resulted in a total of 1873 unique terms that were used to characterize the 329 narratives. The resulting 1873 × 329 term-by-narrative matrix was submitted to a Singular-Value Decomposition and the 26 dominant latent dimensions were extracted.
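The two pre-processing steps and the construction of the term-by-narrative matrix can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the stop-word list, the suffix-stripping `crude_stem` function (a stand-in for a proper stemmer such as Porter's) and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "it", "i", "my", "in"}


def crude_stem(word):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lower-case the text, drop stop-words, and stem the remaining words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]


def term_document_matrix(narratives):
    """Return (terms, matrix), where matrix[t][d] counts term t in narrative d."""
    counts = [Counter(tokenize(n)) for n in narratives]
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for c in counts] for t in terms]
    return terms, matrix
```

With 329 narratives, the returned matrix would have one row per unique stemmed term and 329 columns. It could then be passed to a Singular-Value Decomposition routine (for instance `numpy.linalg.svd`), keeping only the leading 26 dimensions.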

6.4.5 Cluster Analysis on Dissimilarity Matrices

All three procedures resulted in a 329 × 329 matrix depicting the dissimilarity between the narratives. The three dissimilarity matrices were then submitted to hierarchical cluster analysis using a minimum variance criterion, and the first nine clusters were extracted.
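The clustering step can be sketched as an agglomerative procedure that repeatedly merges the two closest clusters until the requested number remains. The sketch below assumes that the minimum variance criterion refers to Ward's method and implements it through the Lance-Williams update on squared dissimilarities. The function name `ward_clusters` is hypothetical, and the update is strictly justified only for Euclidean distances.

```python
def ward_clusters(dissimilarity, n_clusters):
    """Agglomerative clustering with Ward's minimum variance criterion.

    `dissimilarity` is a symmetric n x n list of lists.
    Returns one cluster label per item.
    """
    n = len(dissimilarity)
    # The Lance-Williams update for Ward operates on squared dissimilarities.
    d2 = {(i, j): dissimilarity[i][j] ** 2
          for i in range(n) for j in range(i + 1, n)}
    members = {i: [i] for i in range(n)}
    next_id = n
    while len(members) > n_clusters:
        # Merge the pair of clusters with the smallest current squared distance.
        a, b = min(d2, key=d2.get)
        na, nb = len(members[a]), len(members[b])
        new = next_id
        next_id += 1
        for k in members:
            if k in (a, b):
                continue
            nk = len(members[k])
            dak = d2[(min(a, k), max(a, k))]
            dbk = d2[(min(b, k), max(b, k))]
            # Squared distance from cluster k to the merged cluster.
            d2[(k, new)] = ((na + nk) * dak + (nb + nk) * dbk
                            - nk * d2[(a, b)]) / (na + nb + nk)
        members[new] = members.pop(a) + members.pop(b)
        # Drop every distance that involves one of the two merged clusters.
        d2 = {pair: v for pair, v in d2.items()
              if a not in pair and b not in pair}
    labels = [0] * n
    for label, items in enumerate(members.values()):
        for item in items:
            labels[item] = label
    return labels
```

Calling `ward_clusters(matrix, 9)` on one of the 329 × 329 dissimilarity matrices would yield a cluster label between 0 and 8 for each narrative. Established libraries offer faster implementations of the same idea.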

The performance of the three approaches is compared by contrasting the output of each method with the output of the hand-coded classification in the original study (chapter 4). The original hand-coded classification resulted in the identification of five overall categories: stimulation, learnability, long-term usability, usefulness and social experiences. Traditional content analysis, as applied in the original study, is treated as the optimal classification and is used as the reference for the three automated procedures.

To compare the output of the three approaches with the output of the content analysis of the initial study, a mapping needs to be created between the nine clusters generated by each of the three approaches and the five categories of the traditional content analysis. Once all nine clusters are assigned to one of the five overall categories, interrater agreement indices such as the Kappa statistic (Fleiss et al., 2003), or the overall percentage of correctly classified narratives, may be computed to assess the agreement between each of the three automated approaches and the traditional content analysis.
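Both agreement measures can be computed from two parallel lists of category labels, one produced by an automated approach (after mapping clusters to categories) and one by the hand-coded classification. The sketch below assumes the two-rater form of the Kappa statistic (Cohen's kappa); the function names are hypothetical.

```python
from collections import Counter


def percent_agreement(predicted, reference):
    """Proportion of narratives assigned to the same category by both sources."""
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def cohens_kappa(predicted, reference):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when chance agreement p_e equals 1,
    i.e. when both sources place every narrative in the same single category.
    """
    n = len(reference)
    observed = percent_agreement(predicted, reference)
    pred_counts = Counter(predicted)
    ref_counts = Counter(reference)
    # Chance agreement: product of the two marginal proportions, summed over categories.
    expected = sum(pred_counts[c] * ref_counts[c] for c in ref_counts) / (n * n)
    return (observed - expected) / (1 - expected)
```

Kappa is lower than the raw percentage whenever some agreement would be expected by chance alone, which is why the two indices are usually reported together.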

We employ two approaches for assigning each of the nine clusters to one of the five identified categories. First, this may be performed based on the distribution of narratives within a cluster over the five categories. The distribution for all nine clusters may be visualized in a 9 × 5 matrix where each cell m_{i,j} depicts the number of narratives that are assigned to cluster i (out of the nine clusters that resulted from the automated analysis procedure) and to category j (out of the five categories that resulted from the manual coding procedure in the initial study). According to this criterion, each cluster is assigned to the category that contains the highest number of its narratives. This approach minimizes the error induced by the mapping process, and results in the best possible value for the agreement between the automated methods and traditional content analysis.
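The distribution-based mapping can be sketched as follows: count the narratives for every cluster and category pair, then assign each cluster to the category with the largest count. The function name and the tie-breaking rule (the first category in the supplied order wins) are assumptions made for illustration.

```python
def map_clusters_to_categories(cluster_labels, category_labels, categories):
    """Build the cluster-by-category count matrix and the majority mapping.

    `cluster_labels[n]` and `category_labels[n]` describe the same narrative n.
    Returns (m, mapping): m[i][j] is the number of narratives in cluster i
    and category j; mapping[i] is the category assigned to cluster i.
    """
    clusters = sorted(set(cluster_labels))
    m = {i: {j: 0 for j in categories} for i in clusters}
    for i, j in zip(cluster_labels, category_labels):
        m[i][j] += 1
    # Each cluster goes to the category holding most of its narratives;
    # ties are resolved in favour of the category listed first.
    mapping = {i: max(categories, key=lambda j: m[i][j]) for i in clusters}
    return m, mapping
```

Replacing every cluster label with `mapping[label]` gives a predicted category per narrative, which can then be compared against the hand-coded categories.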

However, this best possible value may not be obtained in real settings where human interpretation is required to further classify the narratives. Thus, a second approach involves human raters. Each cluster, as proposed earlier in this chapter, can be characterized by the three most dominant terms in the experience narratives
