The answer lies in confidence intervals, which take the form $[x_u, x_o]$, with lower
bound $x_u$ and upper bound $x_o$, for the desired indicator and which apply at a
specified confidence level. For example, a 95 % confidence interval of [1.5 %, 3.5 %]
for the sales increase means, colloquially, that for the period in question the true
value can be expected with 95 % probability to lie between 1.5 % and 3.5 %. Strictly
speaking, this simpler formulation is not mathematically correct: the true value is
fixed, and it is the interval, computed from a random sample, that covers the true
value in 95 % of cases. Colloquially, however, it describes the meaning of the
confidence interval well.
So rather than simply stating a value of 1.0 % increased sales, its 95 % confidence
interval [1.5 %, 3.5 %] is given too. As the sample size increases, the confidence
interval narrows and closes in on the indicator.
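The narrowing can be illustrated with a minimal sketch. It uses the standard normal-approximation interval for a mean (mean ± 1.96 standard errors), which is not the method of [SV10]; the function name `mean_confidence_interval` and the simulated data are assumptions made purely for illustration.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Normal-approximation confidence interval for the mean of `sample`.

    z = 1.96 corresponds to a 95 % confidence level.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    half_width = z * statistics.stdev(sample) / math.sqrt(n)
    return mean - half_width, mean + half_width


rng = random.Random(0)  # fixed seed so the run is reproducible
widths = {}
for n in (100, 10_000):
    # Hypothetical data: simulated values, not real measurements.
    sample = [rng.expovariate(1.0) for _ in range(n)]
    lower, upper = mean_confidence_interval(sample)
    widths[n] = upper - lower
    print(f"n = {n:>6}: interval width = {widths[n]:.4f}")
```

Since the half-width scales with 1/sqrt(n), a hundredfold larger sample gives an interval that is roughly ten times narrower.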
Determining the confidence interval for the increased sales due to a recommendation
engine is by no means straightforward. The method was developed by
the mathematicians Holm Sieber and Toni Volkmer in [SV10], and we will present
it briefly.
Without loss of generality, we suppose the session quotient $q = 1$. Furthermore, let
$\bar{X}_A$ be the average revenue per session of group A and $\bar{X}_B$ the average
revenue per session of group B. Then the increase in revenue of group B is calculated as

$$ d = \frac{\bar{X}_B}{\bar{X}_A} - 1. \qquad (11.1) $$

This value typically has a very high variation, so stating just the mean value is
not sufficient to draw reliable conclusions. Thus, we will present a way to
calculate the confidence interval for d .
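As a point of reference, the increase d defined above can be computed directly from per-session revenues. The following is a minimal sketch under the text's assumption q = 1; the function name `relative_revenue_increase` and the revenue lists are hypothetical and serve only as an illustration.

```python
from statistics import fmean


def relative_revenue_increase(revenues_a, revenues_b):
    """Return d = mean(B) / mean(A) - 1 for per-session revenues."""
    mean_a = fmean(revenues_a)
    mean_b = fmean(revenues_b)
    if mean_a == 0:
        raise ValueError("average revenue of group A is zero; d is undefined")
    return mean_b / mean_a - 1.0


# Hypothetical per-session revenues; sessions without an order contribute 0.
revenues_a = [0.0, 0.0, 40.0, 0.0, 0.0, 0.0, 0.0, 60.0]  # average 12.5
revenues_b = [0.0, 50.0, 0.0, 0.0, 0.0, 0.0, 70.0, 0.0]  # average 15.0

d = relative_revenue_increase(revenues_a, revenues_b)
print(f"d = {d:.3f}")  # 15.0 / 12.5 - 1, i.e. an increase of about 20 %
```

With so few sessions, a single additional order would change d considerably, which is why the point estimate alone says little.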
The revenue increase d is a random variable. This follows from the random
character of the revenue of one session.
Let $X_A$ be the revenue of a session in group A and $X_B$ the revenue of a session
in group B, and let $n_A$ and $n_B$ be the numbers of the corresponding sessions. In
sessions without an order, the revenue is simply 0. We can assume that the revenue
follows an unknown but stationary distribution. The expected value and variance are
unknown but can be estimated from a sample.
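By applying the central limit theorem, we can first describe the distributions of
the averages $\bar{X}_A$ and $\bar{X}_B$. Both are approximately normally distributed:

$$ \bar{X}_A \sim N\!\left(E X_A, \frac{D^2 X_A}{n_A}\right), \qquad \bar{X}_B \sim N\!\left(E X_B, \frac{D^2 X_B}{n_B}\right). $$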
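The parameters of these approximate normal distributions can be estimated from a sample, as noted above. The sketch below replaces the unknown expected value by the sample mean and the unknown variance by the unbiased sample variance; the function name `sample_mean_distribution` and the data are hypothetical, and the snippet covers only this estimation step, not the remainder of the method in [SV10].

```python
from statistics import fmean, variance


def sample_mean_distribution(revenues):
    """Estimate (mean, variance) of the approximate normal law of the average.

    The expected value is estimated by the sample mean, and the variance of
    a single session's revenue by the unbiased sample variance; dividing the
    latter by n gives the variance of the average revenue per session.
    """
    n = len(revenues)
    return fmean(revenues), variance(revenues) / n


# Hypothetical per-session revenues for one group (0 = session without order).
revenues_a = [0.0, 0.0, 40.0, 0.0, 0.0, 0.0, 0.0, 60.0]

mean_a, var_of_mean_a = sample_mean_distribution(revenues_a)
print(f"estimated mean: {mean_a:.2f}, variance of the average: {var_of_mean_a:.2f}")
```

The normal approximation is only reliable for large session counts; eight sessions, as in this toy data, are far too few in practice.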