5.1 Descriptive Statistics
From seventeen students in two runs, we collected 34 questionnaires with 510
statements in total. The video analysis is based on 6.74 hours of video material.
One t.BPM session recording failed, which results in N=16 for all hypotheses
that rely on video analysis. Videos taken during t.BPM sessions lasted twenty
minutes (19.52) on average, ranging from ten (10.25) to almost forty minutes
(38.98). Interviews, in contrast, took about five minutes (5.42) on average,
ranging from three and a half (3.53) to ten minutes (9.68) at most.
5.2 Data Set Preparation
The data was tested for normality with the Kolmogorov-Smirnov and Shapiro-Wilk
tests; both indicate a normal distribution. The original experiment evaluation
involved two additional video codings and three additional questionnaire items.
The related hypotheses did not hold, and this data was therefore dropped from
the discussion in this paper due to limited space. Apart from that, no collected
data was excluded from the set.
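The two normality checks mentioned above can be sketched as follows. This is a minimal illustration on synthetic data (the study's raw data is not available); the sample values are purely hypothetical.

```python
# Sketch of the normality checks described above, on synthetic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# hypothetical sample of 34 session durations (minutes)
sample = rng.normal(loc=19.52, scale=5.0, size=34)

# Shapiro-Wilk test: H0 = the sample comes from a normal distribution
sw_stat, sw_p = stats.shapiro(sample)

# Kolmogorov-Smirnov test against a normal distribution fitted to the sample
ks_stat, ks_p = stats.kstest(
    sample, "norm", args=(sample.mean(), sample.std(ddof=1))
)

# p > .05 means normality cannot be rejected at the 5% level
print(f"Shapiro-Wilk p = {sw_p:.3f}, K-S p = {ks_p:.3f}")
```

Note that fitting the normal parameters from the same sample makes the K-S p-value optimistic; Shapiro-Wilk is usually preferred for small samples like this.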
5.3 Measurement Reliability and Validity
According to Kirk and Miller, reliability is the extent to which “a measurement
procedure yields the same answer however and whenever carried out” ([13],
p. 19), while validity is the “extent to which it gives the correct answer”.
We assess two aspects of measurement reliability. First, we check the inter-rater
agreement for the video codings using Cohen's kappa coefficient (κ), comparing
both video codings before the negotiation process. The inter-rater agreement
over all videos and all coding schemes is κ = .463, where 0.41 < κ < 0.60
denotes a moderate agreement level [14]. Thus, we consider our coding
instructions reasonably reliable and the results moderately reproducible.
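Cohen's kappa corrects the raw agreement rate for the agreement two raters would reach by chance. A minimal sketch, with hypothetical codings of ten video segments:

```python
# Cohen's kappa for two raters coding the same items (nominal categories).
import numpy as np

def cohens_kappa(rater_a, rater_b):
    """kappa = (observed agreement - chance agreement) / (1 - chance agreement)"""
    a, b = np.asarray(rater_a), np.asarray(rater_b)
    categories = np.union1d(a, b)
    # observed proportion of items on which both raters agree
    p_o = np.mean(a == b)
    # expected chance agreement from each rater's marginal category frequencies
    p_e = sum(np.mean(a == c) * np.mean(b == c) for c in categories)
    return (p_o - p_e) / (1.0 - p_e)

# hypothetical codings of 10 video segments by two raters
r1 = [1, 1, 2, 2, 3, 1, 2, 3, 3, 1]
r2 = [1, 2, 2, 2, 3, 1, 1, 3, 3, 1]
print(round(cohens_kappa(r1, r2), 3))  # → 0.697
```

Here the raw agreement is 80%, but after correcting for chance the kappa drops to about .70, which illustrates why kappa is the stricter measure.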
Furthermore, the reliability of the questionnaire is measured using Cronbach's
alpha (α). It determines the degree to which the items related to one hypothesis
coincide, i.e., whether they actually measure the same underlying concept,
e.g. fun. In the literature [6], α > .8 is suggested as a good value for
questionnaires, while α > .7 is still acceptable. All our variables had α > .8,
except for α(motivation_Q4) = .702 and α(clarity_Q8) = .687. We keep these
exceptions in mind, but overall the questionnaire shows a high degree of
reliability.
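Cronbach's alpha relates the variance of the individual items to the variance of their sum: if the items move together, the total score varies much more than the items individually, and alpha approaches 1. A minimal sketch on a hypothetical response matrix:

```python
# Cronbach's alpha: rows = respondents, columns = items of one scale.
import numpy as np

def cronbach_alpha(X):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)"""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]                         # number of items on the scale
    item_variances = X.var(axis=0, ddof=1)
    total_variance = X.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# hypothetical 5-point Likert answers: 5 respondents, 3 items of one scale
answers = [[4, 5, 4],
           [2, 2, 3],
           [5, 5, 5],
           [3, 3, 2],
           [4, 4, 5]]
print(round(cronbach_alpha(answers), 3))
```

Perfectly redundant items yield α = 1, while uncorrelated items push α toward 0, which is why values above .7 or .8 are read as the scale measuring one concept.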
Validating whether our variables correctly describe “effective elicitation” is
not directly possible. We use effective elicitation as an umbrella term for the
aspects of engagement and result validation, and from there derive variables to
measure these aspects. In [15] we conducted a principal component analysis for
validation. This technique determines sets of strongly correlating variables
that are approximated by one factor, the principal component [6]. Ideally, the
variables would form two factors: those that reflect the measures for engagement
and those measuring result validation.
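The procedure can be sketched as follows: compute PCA loadings from the correlation matrix, then apply an orthogonal varimax rotation so that each variable loads strongly on as few factors as possible. The data below is synthetic (two planted latent factors), not the study's; the number of items and factors is illustrative.

```python
# Sketch of PCA + varimax rotation on synthetic questionnaire data.
import numpy as np

rng = np.random.default_rng(1)
# 34 respondents, 6 hypothetical items driven by two latent factors
factors = rng.normal(size=(34, 2))
true_loadings = np.array([[1, 0], [1, 0], [1, 0],
                          [0, 1], [0, 1], [0, 1]], dtype=float)
X = factors @ true_loadings.T + 0.3 * rng.normal(size=(34, 6))

# PCA loadings: eigenvectors of the correlation matrix, scaled by sqrt(eigenvalue)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
top = np.argsort(eigval)[::-1][:2]        # keep the two strongest components
L = eigvec[:, top] * np.sqrt(eigval[top])

def varimax(Phi, gamma=1.0, iters=50, tol=1e-6):
    """Orthogonal varimax rotation of a loading matrix (items x factors)."""
    p, k = Phi.shape
    R, d = np.eye(k), 0.0
    for _ in range(iters):
        Lam = Phi @ R
        u, s, vt = np.linalg.svd(
            Phi.T @ (Lam**3 - (gamma / p) * Lam @ np.diag((Lam**2).sum(axis=0))))
        R = u @ vt
        if s.sum() < d * (1 + tol):       # converged: criterion stopped improving
            break
        d = s.sum()
    return Phi @ R

L_rot = varimax(L)
```

Because varimax is an orthogonal rotation, it redistributes the loadings across factors without changing the total explained variance; only the interpretability of the factor structure improves.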
Using orthogonal (varimax) rotation, our nine dependent variables split into
three factors that do not match our hypothesis decomposition. Interestingly, all