Table 11.1 Simpson's paradox based on the example of a 2-day A/B test
           Recommendation group        Control group
Period     Sessions   Sales volume     Sessions   Sales volume     Sales increase
Day 1      10         500              20         2,000            -50 %
Day 2      20         3,900            10         2,000            -2.5 %
Total      30         4,400            30         4,000            +10 %
Example 11.1 We can illustrate Simpson's paradox using a simple example of a
2-day A/B test in a web shop (Table 11.1):
Although in percentage terms the results for the recommendation group are worse
than those for the control group on both days, the first group appears to emerge at the
end as the clear winner, with +10 %. The reason for this is that the session quotient
q, the ratio of recommendation-group sessions to control-group sessions, differed on
each day: on day 1 it was q = 10:20 = 0.5, but on day 2 it was q = 20:10 = 2.0.
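The arithmetic behind Table 11.1 can be checked with a short sketch. It assumes that "sales increase" means the relative change in sales volume per session between the two groups; the names data, lift and total are purely illustrative and do not come from the text.

```python
# Table 11.1 data: (sessions, sales volume) per day and group.
data = {
    "Day 1": {"recommendation": (10, 500), "control": (20, 2000)},
    "Day 2": {"recommendation": (20, 3900), "control": (10, 2000)},
}


def lift(rec, ctl):
    """Relative change in sales per session, recommendation vs. control.

    Assumption: this is how the 'sales increase' column is defined.
    """
    rec_sessions, rec_sales = rec
    ctl_sessions, ctl_sales = ctl
    return (rec_sales / rec_sessions) / (ctl_sales / ctl_sessions) - 1.0


def total(group):
    """Sum sessions and sales volume of one group over all days."""
    sessions = sum(day[group][0] for day in data.values())
    sales = sum(day[group][1] for day in data.values())
    return sessions, sales


# Per-day comparison: the recommendation group is worse on both days.
daily_lift = {
    day: lift(groups["recommendation"], groups["control"])
    for day, groups in data.items()
}

# Pooled comparison: the sign flips because the days are weighted differently.
total_lift = lift(total("recommendation"), total("control"))

for day, value in daily_lift.items():
    print(f"{day}: {value:+.1%}")
print(f"Total: {total_lift:+.1%}")
```

Under this reading, both daily values are negative while the pooled value is positive, which is exactly the reversal described above.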
The superficial reason for the paradox is the fact that the individual results are
weighted differently in the overall result. In essence, the paradox usually indicates
that certain influencing factors have not been taken into consideration. In our case, it
is due to the different session quotients, and the solution is to keep them constant. This
underlines once again the need to maintain maximum constancy of all environmental
conditions, as we mentioned in point 1.
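To see what a constant session quotient would change, the following sketch re-weights the control group after the fact so that it has the same number of sessions as the recommendation group on each day. This is a standard direct standardization, used here only as an illustration. It is not a procedure proposed in the text, which recommends keeping the quotient constant during the test itself. The names days and standardized_lift are hypothetical.

```python
# Rows of Table 11.1:
# (rec_sessions, rec_sales, ctl_sessions, ctl_sales) per day.
days = [
    (10, 500, 20, 2000),   # day 1
    (20, 3900, 10, 2000),  # day 2
]


def standardized_lift(rows):
    """Compare groups after scaling control sales to the recommendation
    group's session counts on each day (assumed re-weighting scheme)."""
    rec_sales_total = sum(rec_sales for _, rec_sales, _, _ in rows)
    # Sales the control group would be expected to produce if it had
    # received as many sessions as the recommendation group each day.
    expected_ctl_sales = sum(
        rec_sessions * ctl_sales / ctl_sessions
        for rec_sessions, _, ctl_sessions, ctl_sales in rows
    )
    return rec_sales_total / expected_ctl_sales - 1.0


adjusted = standardized_lift(days)
print(f"Standardized sales increase: {adjusted:+.1%}")
```

With equal per-day weights the overall result is negative, in line with the two daily results, so the apparent +10 % disappears once the session quotient no longer varies.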
11.5 Summary
The robust measurement of the efficiency of recommendation algorithms is an
extremely important factor in the development of REs. We have provided some methodological
remarks on this topic in this chapter, even though it is not directly connected to the
problem of adaptive learning. We have further proposed a straightforward algorithm
to calculate confidence intervals for REs.