Information Technology Reference
3.3 Estimating Significance in FC-GSEA
To estimate the statistical significance of an observed ES for a gene-set, it needs to be
compared with the set of scores ES NULL computed with randomly permutated samples
with respect to sample labels. In original GSEA, a null distribution of ES is like Fig. 8.
In Fig. 8, the left graph shows the case of locating an observed ES in a positive region
while the right graph shows the case of locating an observed ES in a negative region.
Also the bold-lined vertical bar indicates the location of an observed ES and the x axis is
the permutation ES.
Fig. 8. An example of a null distribution of ES in original GSEA
In above cases, the estimation of significance for a gene-set is made with an ob-
served ES whether it is in a positive region or in a negative region. If the observed ES
is in a positive region, the positive portion of the distribution greater than the ob-
served ES is used to estimate a nominal p-value for a gene-set. If the observed ES is
in a negative region, the negative portion of the distribution less than the observed ES
is used to estimate a nominal p-value.
On the other hand, in FC-GSEA, an observed ES is always located in a positive re-
gion. Thus, a null distribution of ES in FC-GSEA is like Fig. 9. Here the left (the
right) graph corresponds to the case of having an observed ES in a positive (negative)
region in original GSEA. As shown in Fig. 9, in either case, since an observed ES is
always in a positive region for FC-GSEA, the positive portion of the distribution
greater than the observed ES is used to estimate a nominal p-value for a gene-set.
Thus, in order to identify significant gene-sets, one tailed test is performed in FC-
GSEA while two-tailed test is performed in original GSEA.
Fig. 9. An example of a null distribution of ES in FC-GSEA