11.1 Same Environments in Both Groups
If the test is to be meaningful, it must not be influenced by any factors other than the
recommendation algorithm itself. If a different recommendation algorithm is used
in the control group, this requirement is usually (but not always) satisfied. It is
usually harder to satisfy if no recommendations are displayed in the control group,
in other words if we are testing against an empty set.
In a web shop, the free space in the control group is sometimes used to display
additional information or services. This influences the outcome of the test, because
we are then no longer measuring the benefit of the recommendation algorithm but
rather testing the recommendation algorithm against the additional information or
service, which is not the intention. Similarly, if the recommendations are displayed
below the product view but before the detailed product description, they may reduce
the usability of the shop. Our test is then assessing the recommendation algorithm
against detailed product information.
In a test against an empty control group, displaying recommendations will
inevitably change the appearance of the product detail view. This change should be
kept to a minimum. At the same time, however, the recommendations must be
displayed prominently; otherwise, they could be ignored. For instance,
recommendations could simply be displayed underneath the existing product detail
view. Then the appearance of the page would scarcely change, but the
recommendations could easily be overlooked. The two requirements clearly conflict,
so an effort should be made to find a reasonable compromise. One common solution is
to display recommendations on the far right of the page, away from the product
information. In this way, the page appearance is virtually unchanged, but the
recommendations are still well positioned.
The struggle to achieve maximum constancy of environmental conditions is
one of the key factors differentiating science from scholasticism. It is frequently
underestimated (and in A/B tests often complicated), for which reason we would
like to spend a little more time on it. The following passage comes from the Soviet
winner of the Nobel Prize for chemistry, Nikolay Semyonov, who reflected on the
difficulty of biological evaluations [Sem81]:
It is sometimes said that in biology, because of the complexity, state and individuality of an
organism, experimental conditions cannot be set with the same degree of precision as in
physics or chemistry, and that as a consequence the results obtained may vary.
Such differences do of course arise in experiments on living creatures, and in particular
on human beings. For example, a drug can help some people and harm others suffering
from the same illness.
However, the statistical result over a large number of people will show the same
distribution.
The causes of this distribution help us to identify the precise physiological
characteristics of a certain type of person which determine whether a drug is
beneficial or harmful.
The claim that consistent experimental results cannot be obtained objectively in biology
is wrong. Otherwise medicine or agronomy would be impossible.
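The statistical point made in the quotation can be illustrated with a small simulation. The following is only a sketch under invented assumptions: the response model (70% of individuals helped, 30% harmed), the group sizes, and the names simulate_group and summarize are hypothetical and do not come from [Sem81] or from any real trial. It shows that two small groups can give quite different results, whereas two large groups drawn under the same conditions give nearly the same distribution.

```python
import random
import statistics


def simulate_group(n, seed, p_helped=0.7):
    """Simulate n individual responses to the same treatment.

    Hypothetical model (an assumption for illustration only): each
    individual is helped (+1) with probability p_helped, otherwise
    harmed (-1).
    """
    rng = random.Random(seed)  # separate generator per group, reproducible
    return [1 if rng.random() < p_helped else -1 for _ in range(n)]


def summarize(outcomes):
    """Return (share of individuals helped, mean outcome) for one group."""
    share_helped = sum(1 for o in outcomes if o > 0) / len(outcomes)
    return share_helped, statistics.mean(outcomes)


# Two small groups: individual variation dominates the result.
small_a = summarize(simulate_group(20, seed=1))
small_b = summarize(simulate_group(20, seed=2))

# Two large groups under identical conditions: the distributions agree closely.
large_a = summarize(simulate_group(200_000, seed=1))
large_b = summarize(simulate_group(200_000, seed=2))

print("small groups:", small_a, small_b)
print("large groups:", large_a, large_b)
```

The same reasoning applies to the two groups of an A/B test: only if both groups see the same environment can a remaining difference between their large-sample statistics be attributed to the recommendation algorithm.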
With a large enough data set, the differences between different organisms of the
same species can be seen in the statistical distribution, the mean of which is the same