values of both parameters (m1 and m) are unknown prior to the
discovery experiment. To calculate the sample size of the discovery
set, however, it is necessary to make an assumption about their
ratio (p1 = m1/m), i.e., to make an educated guess of the percentage
of all proteins whose expression might be affected by the treatment
or the disease. The p1 value is usually in the range from 0.01 to
0.10 (12), depending on the sample material and on the biological
question of the study (see Note 3).
Make an Assumption on the Minimal Effect Size
The effect size (q) is the ratio of the difference between the group
means (D) of the expression levels of a protein in the two
experimental groups to the common standard deviation of this protein
(s). For simplicity, the standard deviation is assumed to be equal in
both groups. The effect size can either be estimated from the results
of a pilot DIGE experiment (see Note 4) or has to be assumed. In the
latter case, it may be set to the minimal clinically relevant effect
size that one wants to detect in the experiment. An effect size of
more than four would indicate that there is almost no overlap in the
expression level of the respective protein between the two groups
(see Note 5). Such an extraordinarily high effect size is rare in
clinical proteomics, and a study is very unlikely to find such a
biomarker. Most biomarkers have a much lower effect size, with
distributions that overlap considerably with that of the control
group. However, the larger the overlap, the lower the clinical
relevance of the biomarker may be. In the proteomic setting, a minimal
effect size smaller than 1.0 often does not seem worthwhile.
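The relationship between effect size and overlap can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the expression values are taken to be normally distributed with equal standard deviation in both groups, the pilot values are hypothetical numbers invented for the example, and the function names `effect_size` and `overlap` are not taken from the protocol.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def effect_size(group_a, group_b):
    """Absolute difference of group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return abs(mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def overlap(q):
    """Shared area of two normal densities with equal SD whose means differ by q SDs."""
    return 2 * NormalDist().cdf(-abs(q) / 2)


# Hypothetical log abundances of one protein spot in a small pilot experiment.
control = [0.1, -0.4, 0.3, 0.0, -0.2, 0.2]
treated = [1.3, 0.9, 1.6, 1.1, 1.4, 0.9]
print("estimated effect size:", round(effect_size(control, treated), 2))

for q in (1.0, 2.0, 4.0):
    print("effect size", q, "-> overlap", round(overlap(q), 3))
```

Under these assumptions the two distributions share about 62% of their area at an effect size of 1.0, about 32% at 2.0, and less than 5% at 4.0, which matches the statement that an effect size above four leaves almost no overlap.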
Calculate the Adequate Sample Size
The sample size per group can be calculated on the basis of the
parameters defined above. The diagrams in Fig. 1 show the relationship
between the sample size per group and the statistical power for
different effect sizes q and different proportions of effective
proteins p1 (0.01 in a, 0.05 in b, or 0.10 in c). The curves were
calculated using Storey's approach (see ref. (8)) with an FDR of 0.05
to correct for multiple testing. The effect size (q) was assumed to
range from 1.0 to 3.0. The informative value of the graphs can be
explained with the help of an example: A study in which we assume that
1% of the investigated proteins are related to the clinical outcome
(p1 = 0.01; graph a) with a common effect size of q = 1.5 (dashed
line) would require 21 samples/group to reach a statistical power of
0.80. A power of 0.80 (corresponding to a type II error rate of 0.20)
indicates that, in the long run, 80% of all effective proteins are
identified. In contrast, assuming p1 = 0.10, 15 samples/group would be
sufficient (graph c) to reach the same power. Table 2 shows the
required sample sizes (for different q and p1) to achieve a
statistical power of 0.80 or 0.90. Note again that the sample size
calculations are based on two-sided two-sample t tests and an FDR of
0.05.
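A rough Python sketch can show how the parameters interact. It is not the calculation behind Fig. 1 or Table 2: it assumes the common approximation FDR = (1 - p1) * alpha / ((1 - p1) * alpha + p1 * power) to derive a per-test significance level, and it replaces the t test by a normal approximation. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def per_test_alpha(fdr, p1, power):
    """Per-test significance level at which the approximate FDR equals `fdr`.

    Obtained by solving
    fdr = (1 - p1) * alpha / ((1 - p1) * alpha + p1 * power) for alpha.
    """
    return fdr * p1 * power / ((1 - fdr) * (1 - p1))


def samples_per_group(q, p1, power=0.80, fdr=0.05):
    """Normal-approximation sample size per group for a two-sided two-sample test."""
    z = NormalDist()
    alpha = per_test_alpha(fdr, p1, power)
    n = 2 * (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2 / q ** 2
    return ceil(n)


# Effect size 1.5 with 1% and with 10% effective proteins.
for p1 in (0.01, 0.10):
    print("p1 =", p1, "->", samples_per_group(1.5, p1), "samples/group")
```

Because the normal approximation ignores the heavier tails of the t distribution at small sample sizes, the sketch returns somewhat smaller numbers than the 21 and 15 samples/group quoted above. It does reproduce the qualitative pattern: fewer samples are needed when the effect size or the proportion of effective proteins is larger.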
3.1.2. Perform the Discovery Experiment
Collect the required number of protein samples in accordance with
the calculations shown above. Then split each sample into two