Table 1. List of statistical terms

Term                            Symbol  Definition

In the single-hypothesis setting
Type 1 error                    α       Probability that a protein is identified which, in truth, is not related to the clinical outcome
Type 2 error                    β       Probability that a protein is not identified which, in truth, is related to the clinical outcome
Power                           1 − β   Probability that a protein is identified which, in truth, is related to the clinical outcome

In the multiple-hypotheses setting
Family-wise type 1 error rate   FWER    Probability that at least one of the proteins that are, in truth, not related to the clinical outcome is identified
False discovery rate            FDR     Expected proportion of proteins that are, in truth, not related to the clinical outcome among all identified proteins
Power                           1 − β   Expected proportion of identified proteins among all proteins that are, in truth, related to the clinical outcome

Sample size                     n       Sample size per group
Standard deviation              σ       Common standard deviation (assumed to be equal in both groups)
Effect size                     θ       Difference in group means divided by the common standard deviation
                                m       Number of tested hypotheses (proteins)
                                m1      Number of effective proteins, i.e., proteins that are, in truth, related to the clinical outcome
                                p1      Proportion of proteins that are, in truth, related to the clinical outcome among all investigated proteins
                                Δ       Difference of group means of the expression values
proteins. When 20 proteins are tested simultaneously, each at the
0.05 level, we already expect 1 false-positive result (20 × 0.05 = 1)
if, in truth, none of the 20 proteins is related to the clinical
outcome. If about 3,000 proteins are tested and none of them is, in
truth, related to the clinical outcome, testing each at the 0.05
level would lead to an expected 150 false-positive decisions
(3,000 × 0.05 = 150).
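The arithmetic above can be checked with a minimal Python sketch. The function names are hypothetical and chosen only for illustration. The second function additionally assumes that the m tests are independent, which the text does not state; under that assumption it gives the probability of at least one false positive when no adjustment is made.

```python
def expected_false_positives(m: int, alpha: float = 0.05) -> float:
    """Expected number of false positives when all m tested proteins
    are, in truth, unrelated to the clinical outcome."""
    return m * alpha


def prob_at_least_one_false_positive(m: int, alpha: float = 0.05) -> float:
    """Unadjusted family-wise error rate for m tests.

    Assumption (not from the text): the m tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    for m in (20, 3000):
        print(
            f"m = {m}: expected false positives = "
            f"{expected_false_positives(m):.1f}, "
            f"P(at least one) = {prob_at_least_one_false_positive(m):.4f}"
        )
```

Even for 20 independent tests, the chance of at least one false positive is already well above one half, which is why unadjusted testing is problematic.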
To avoid these pitfalls, a number of multiplicity adjustment
procedures are available (see refs. (4, 5)). These adjustment
procedures generally yield so-called adjusted p values (instead of
the standard p values), which are then compared to the significance
level 0.05. The literature mostly discusses these methods for gene
expression studies; however, they can generally be applied to
proteomic studies as well. Two approaches are widely used to adjust
for multiplicity:
1. Control of the family-wise error rate (FWER), i.e., the
probability of identifying at least one protein that is not related
to the clinical outcome.
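As an illustration of FWER control through adjusted p values, the following minimal sketch uses the Bonferroni procedure, one well-known FWER-controlling method. The text does not say which procedure it has in mind, so this choice is an assumption, and the raw p values are made up for the example.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p values: multiply each raw p value by the
    number of tests m and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


# Hypothetical raw p values for four proteins (illustrative only).
raw = [0.0001, 0.004, 0.03, 0.2]
adjusted = bonferroni_adjust(raw)

# A protein is identified if its adjusted p value does not exceed 0.05.
identified = [i for i, p in enumerate(adjusted) if p <= 0.05]

print("adjusted p values:", adjusted)
print("identified protein indices:", identified)
```

Comparing the adjusted p values to 0.05 is equivalent to comparing each raw p value to 0.05/m. In this example the third protein has a raw p value below 0.05 but is no longer identified after adjustment.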