Biomedical Engineering Reference
In-Depth Information
Table 9.7 Comparison of the number of the most significant associated SNPs after correction with
GC and correction using 3 PCs in EIGENSTRAT

          # top significant SNPs      # top significant SNPs    # common
          after GC correction on      after EIGENSTRAT          significantly
Sample    allelic chi-squared         correction                associated SNPs
          statistic
UCI       13                          22                        8
CATIE     6 (102)                     57 (61)                   4 (3)

In the columns of EIGENSTRAT, the values refer to the results after outlier removal, whereas the
values in parentheses refer to the results without removal of outliers.
Table 9.8 Genomic inflation factor (λ) values, calculated on PLINK and EIGENSTRAT chi-squared
association statistics in the UCI (a) and CATIE-NIMH (b) samples, both before and after using 3 PCs
to correct

                     (a) UCI                  (b) CATIE-NIMH
                     PLINK    EIGENSTRAT      PLINK    EIGENSTRAT
Before correction    1.136    1.075           1.737    1.757
After correction     1.012    1.008           1.639    1.046

both methods are able to correct for PS because the λ value is close to 1. On the
contrary, in the CATIE-NIMH sample, the chi-squared statistic of PLINK remains inflated
by substructure even after correction (λ = 1.639). This finding can explain the previous
results (larger number of true positives and false negatives in Table 9.3) and shows
that the CMH method in PLINK may not be powerful enough to correct for PS.
This may be due to the dependence of the CMH test on the user-defined number of
clusters, which may fail to capture some hidden stratification. Indeed, if we
set the number of clusters to 3 (European, African and other ethnicities), we are not
able to detect the substructure within the group “other or more than one ethnicity.”
Considering the inflation factor λ, an open issue is how to choose an appropriate
threshold of λ above which a sample is considered substructured and the resulting
associations inflated by stratification. To investigate this, we calculated the λ value
in EIGENSTRAT for the UCI and CATIE-NIMH samples (Table 9.3) on the chi-squared
statistics after correction using 1-10 PCs (Table 9.9).
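
As a rough illustration of the quantity discussed here, the sketch below computes a genomic inflation factor with the commonly used median-based definition: the median of the observed chi-squared statistics divided by the median of the 1-degree-of-freedom chi-squared distribution (about 0.455). That PLINK and EIGENSTRAT use exactly this estimator in the analysis above is an assumption, as is the convention of treating values below 1 as 1 when applying genomic control. The function names and the example statistics are hypothetical and are not data from the study.

```python
from statistics import median

# Median of the chi-squared distribution with 1 degree of freedom,
# i.e. the expected median of the test statistics under the null hypothesis.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(chi2_stats):
    """Return lambda = median(observed chi-squared) / expected null median."""
    if not chi2_stats:
        raise ValueError("need at least one chi-squared statistic")
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def gc_correct(chi2_stats, lam=None):
    """Genomic control: divide every statistic by lambda.

    Assumption: a lambda below 1 is treated as 1, so statistics are
    never inflated by the correction.
    """
    if lam is None:
        lam = genomic_inflation_factor(chi2_stats)
    lam = max(lam, 1.0)
    return [x / lam for x in chi2_stats]


# Hypothetical 1-df association statistics, for illustration only.
example = [0.10, 0.35, 0.52, 0.91, 1.80, 4.20, 7.50]
lam = genomic_inflation_factor(example)
corrected = gc_correct(example)
```

Because the correction rescales every statistic by the same factor, the inflation factor recomputed on `corrected` is 1 by construction. This is why a value such as 1.639 after a cluster-based correction points to residual stratification rather than to a rescaling problem.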
In the UCI sample, the λ value calculated using the three major components as
covariates (see ANOVA statistics in Detection stratification) is 1.008,
but it decreases to 1.000 only when nine PCs are used as covariates. On the other hand, in
the CATIE-NIMH sample, λ does not reach the limit of 1 even using 10 PCs (1.033),
whether outlier individuals are kept or removed. If we compare the most significantly
associated SNPs in EIGENSTRAT, we observe that the results change depending on whether
we correct using 3 or 9 PCs (Table 9.10), while the results using 9 PCs and 10 PCs
(where λ is 1.000) are the same.
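
A comparison of this kind, between the top-ranked SNPs of two corrections, can be sketched as a set intersection over the n smallest p-values. This is only an illustrative sketch: the function names, the SNP identifiers and the p-values are hypothetical, and the ranking criterion actually used for the tables in this chapter is not stated in the text.

```python
def top_snps(p_values, n):
    """Return the set of the n SNP identifiers with the smallest p-values."""
    ranked = sorted(p_values, key=p_values.get)
    return set(ranked[:n])


def common_top_snps(p_a, p_b, n):
    """Return the SNPs that are among the top n in both result sets."""
    return top_snps(p_a, n) & top_snps(p_b, n)


# Hypothetical p-values for the same SNPs under two corrections,
# for illustration only (not results from the study).
pvals_3pc = {"rs1": 1e-6, "rs2": 3e-5, "rs3": 2e-4, "rs4": 0.04}
pvals_9pc = {"rs1": 2e-6, "rs2": 5e-3, "rs3": 1e-4, "rs4": 1e-5}

shared = common_top_snps(pvals_3pc, pvals_9pc, n=2)
```

With these made-up values only one SNP is shared between the two top-2 lists, which mirrors the situation where changing the number of PCs changes the list of top associations.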
The problem, then, is how much λ can deviate from 1 before stratification is considered
present, and also whether we can accept the results corrected using 3 PCs as not inflated
by stratification, because we can observe that a little fluctuation of the λ value can
 