5.1 Descriptive Statistics
From seventeen students in two runs, we collected 34 questionnaires with 510
statements in total. The video analysis is based on 6.74 hours of video material.
One t.BPM session recording failed, which results in N=16 for all hypotheses
that rely on video analysis. Videos taken during t.BPM sessions lasted twenty
minutes (19.52) on average, ranging from ten (10.25) to almost forty minutes
(38.98). Interviews, in contrast, took about five minutes (5.42) on average,
ranging from three and a half (3.53) to ten minutes (9.68) at most.
5.2 Data Set Preparation
The data was tested for normality with the Kolmogorov-Smirnov and Shapiro-Wilk
tests; both indicate a normal distribution. The original experiment evaluation
involved two additional video codings and three additional questionnaire items.
The related hypotheses did not hold, and this data was therefore dropped from
the discussion in this paper due to limited space. Apart from that, no collected
data was excluded from the set.
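The two normality checks mentioned above can be sketched as follows. This is a minimal illustration on synthetic data (the study's raw data is not available); the sample values are purely hypothetical.

```python
# Sketch of the normality checks described above, on synthetic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# hypothetical sample of 34 session durations (minutes)
sample = rng.normal(loc=19.52, scale=5.0, size=34)

# Shapiro-Wilk test: H0 = the sample comes from a normal distribution
sw_stat, sw_p = stats.shapiro(sample)

# Kolmogorov-Smirnov test against a normal distribution fitted to the sample
ks_stat, ks_p = stats.kstest(
    sample, "norm", args=(sample.mean(), sample.std(ddof=1))
)

# p > .05 means normality cannot be rejected at the 5% level
print(f"Shapiro-Wilk p = {sw_p:.3f}, K-S p = {ks_p:.3f}")
```

Note that fitting the normal parameters from the same sample makes the K-S p-value optimistic; Shapiro-Wilk is usually preferred for small samples like this.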
5.3 Measurement Reliability and Validity
According to Kirk and Miller, reliability is the extent to which “a measurement
procedure yields the same answer however and whenever carried out” ([13],
p. 19), while validity is the “extent to which it gives the correct answer”.
We assess two aspects of measurement reliability. First, we check the inter-rater
agreement for the video codings using Cohen's kappa coefficient (κ), comparing
both video codings before the negotiation process. The inter-rater agreement
over all videos and all coding schemes is κ = .463, where 0.41 < κ < 0.60
denotes a moderate agreement level [14]. Thus, we consider our coding
instructions reasonably reliable and the results moderately reproducible.
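Cohen's kappa corrects the raw agreement rate for the agreement two raters would reach by chance. A minimal sketch, with hypothetical codings of ten video segments:

```python
# Cohen's kappa for two raters coding the same items (nominal categories).
import numpy as np

def cohens_kappa(rater_a, rater_b):
    """kappa = (observed agreement - chance agreement) / (1 - chance agreement)"""
    a, b = np.asarray(rater_a), np.asarray(rater_b)
    categories = np.union1d(a, b)
    # observed proportion of items on which both raters agree
    p_o = np.mean(a == b)
    # expected chance agreement from each rater's marginal category frequencies
    p_e = sum(np.mean(a == c) * np.mean(b == c) for c in categories)
    return (p_o - p_e) / (1.0 - p_e)

# hypothetical codings of 10 video segments by two raters
r1 = [1, 1, 2, 2, 3, 1, 2, 3, 3, 1]
r2 = [1, 2, 2, 2, 3, 1, 1, 3, 3, 1]
print(round(cohens_kappa(r1, r2), 3))  # → 0.697
```

Here the raw agreement is 80%, but after correcting for chance the kappa drops to about .70, which illustrates why kappa is the stricter measure.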
Furthermore, the reliability of the questionnaire is measured using Cronbach's
alpha (α). It determines the degree to which the items related to one hypothesis
coincide, i.e., whether they actually measure the same underlying concept,
e.g. fun. In the literature [6], α > .8 is suggested as a good value for
questionnaires, while α > .7 is still acceptable. All our variables had α > .8,
except for α(motivation_Q4) = .702 and α(clarity_Q8) = .687. We keep these
exceptions in mind, but overall the questionnaire shows a high degree of
reliability.
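Cronbach's alpha relates the variance of the individual items to the variance of their sum: if the items move together, the total score varies much more than the items individually, and alpha approaches 1. A minimal sketch on a hypothetical response matrix:

```python
# Cronbach's alpha: rows = respondents, columns = items of one scale.
import numpy as np

def cronbach_alpha(X):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)"""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]                         # number of items on the scale
    item_variances = X.var(axis=0, ddof=1)
    total_variance = X.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variances.sum() / total_variance)

# hypothetical 5-point Likert answers: 5 respondents, 3 items of one scale
answers = [[4, 5, 4],
           [2, 2, 3],
           [5, 5, 5],
           [3, 3, 2],
           [4, 4, 5]]
print(round(cronbach_alpha(answers), 3))
```

Perfectly redundant items yield α = 1, while uncorrelated items push α toward 0, which is why values above .7 or .8 are read as the scale measuring one concept.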
Validating whether our variables correctly describe “effective elicitation” is
not directly possible. We use effective elicitation as an umbrella term for the
aspects of engagement and result validation, and from there derive variables to
measure these aspects. In [15] we conducted a principal component analysis for
validation. This technique determines sets of strongly correlating variables
that are approximated by one factor, the principal component [6]. Ideally, the
variables would form two factors: those that reflect the measures for engagement
and those measuring result validation.
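The procedure can be sketched as follows: compute PCA loadings from the correlation matrix, then apply an orthogonal varimax rotation so that each variable loads strongly on as few factors as possible. The data below is synthetic (two planted latent factors), not the study's; the number of items and factors is illustrative.

```python
# Sketch of PCA + varimax rotation on synthetic questionnaire data.
import numpy as np

rng = np.random.default_rng(1)
# 34 respondents, 6 hypothetical items driven by two latent factors
factors = rng.normal(size=(34, 2))
true_loadings = np.array([[1, 0], [1, 0], [1, 0],
                          [0, 1], [0, 1], [0, 1]], dtype=float)
X = factors @ true_loadings.T + 0.3 * rng.normal(size=(34, 6))

# PCA loadings: eigenvectors of the correlation matrix, scaled by sqrt(eigenvalue)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
top = np.argsort(eigval)[::-1][:2]        # keep the two strongest components
L = eigvec[:, top] * np.sqrt(eigval[top])

def varimax(Phi, gamma=1.0, iters=50, tol=1e-6):
    """Orthogonal varimax rotation of a loading matrix (items x factors)."""
    p, k = Phi.shape
    R, d = np.eye(k), 0.0
    for _ in range(iters):
        Lam = Phi @ R
        u, s, vt = np.linalg.svd(
            Phi.T @ (Lam**3 - (gamma / p) * Lam @ np.diag((Lam**2).sum(axis=0))))
        R = u @ vt
        if s.sum() < d * (1 + tol):       # converged: criterion stopped improving
            break
        d = s.sum()
    return Phi @ R

L_rot = varimax(L)
```

Because varimax is an orthogonal rotation, it redistributes the loadings across factors without changing the total explained variance; only the interpretability of the factor structure improves.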
Using orthogonal (varimax) rotation, our nine dependent variables split into
three factors that do not match our hypothesis decomposition. Interestingly, all