Biomedical Engineering Reference
In-Depth Information
In short, together with the genotype relative risks (GRRs), the disease models specify the probability of being affected given the genotype at the causal locus:

$$\mathrm{GRR} = \frac{P(\text{affected} \mid Aa)}{P(\text{affected} \mid aa)},$$

where A is the disease allele. Choosing the disease model among add, dom, mul and rec adjusts the probability of being affected when carrying two disease alleles (AA) relative to the probability of being affected when carrying one (Aa or aA). Various effect sizes may thus be simulated (see Table 3).
Table 3. The genotype relative risks for four standard disease models. The value 1 stands for the effect when no disease allele (A) is present at the causal locus (aa). The effect sizes for the carriers of one disease allele (Aa or aA) and two disease alleles (AA) are indicated for all four disease models.

Genotype Relative Risk
Disease model    Major homozygous (aa)   Heterozygous (Aa or aA)   Minor homozygous (AA)
additive         1                       1 + α/2                   1 + α
dominant         1                       1 + α                     1 + α
multiplicative   1                       1 + α                     (1 + α)^2
recessive        1                       1                         1 + α
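The entries of Table 3 can be written out as a small lookup, which may help when checking effect sizes by hand. This is a minimal sketch, not the code used in the study: the function names and the `baseline` parameter (the probability of being affected for genotype aa) are assumptions introduced for illustration, and the penetrance is obtained simply by multiplying the baseline by the relative risk.

```python
def genotype_relative_risks(model: str, alpha: float) -> dict:
    """Relative risks for genotypes aa, Aa and AA under one model of Table 3.

    `model` is one of "add", "dom", "mul", "rec"; `alpha` is the effect size.
    """
    # (heterozygous risk, minor homozygous risk), relative to aa = 1
    het_hom = {
        "add": (1 + alpha / 2, 1 + alpha),
        "dom": (1 + alpha, 1 + alpha),
        "mul": (1 + alpha, (1 + alpha) ** 2),
        "rec": (1.0, 1 + alpha),
    }
    if model not in het_hom:
        raise ValueError(f"unknown disease model: {model!r}")
    het, hom = het_hom[model]
    return {"aa": 1.0, "Aa": het, "AA": hom}


def penetrances(model: str, alpha: float, baseline: float) -> dict:
    """P(affected | genotype), assuming a hypothetical baseline risk for aa."""
    risks = genotype_relative_risks(model, alpha)
    return {genotype: baseline * rr for genotype, rr in risks.items()}


if __name__ == "__main__":
    for name in ("add", "dom", "mul", "rec"):
        print(name, genotype_relative_risks(name, alpha=0.5))
```

Under this reading, the four models share the same risk for aa and differ only in how the effect size α is distributed between one and two copies of the disease allele.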
HAPGEN was run on the widely used HapMap phase II reference haplotypes from U.S. residents of northern and western European ancestry (CEU) (http://hapmap.ncbi.nlm.nih.gov/). The disease prevalence (proportion of cases observed in a population) specified to HAPGEN was set to 0.01, a standard value used for disease locus simulation. The simulated data were generated for 1000 unaffected subjects and 1000 affected subjects and consist of unphased genotypes for a 1.5 Mb region containing around 100 SNPs. Combining all previous conditions leads to testing 36 scenarios (3 × 3 × 4). To derive significant trends, each scenario was replicated 100 times. Together with the objective of a comprehensive study, the need for replication explains the choice of the number of variables (100 SNPs). Standard quality control for genotypic data was carried out: SNPs with a minor allele frequency (MAF) below 0.05 and SNPs deviating from Hardy-Weinberg equilibrium (not detailed) with a p-value below 0.001 were removed.
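The quality-control step can be illustrated with a short sketch. The text does not detail which Hardy-Weinberg test was used, so the version below assumes a Pearson chi-square test with one degree of freedom on the three genotype counts; the function names and the count-based interface are likewise assumptions made for illustration. Only the two thresholds (0.05 and 0.001) come from the text.

```python
import math


def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency from the three genotype counts of a SNP."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1 - p)


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of a 1-df chi-square test of Hardy-Weinberg equilibrium.

    Assumption: the source does not say which HWE test was applied.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: no deviation can be measured
    stat = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    # Survival function of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(stat / 2))


def passes_qc(counts, maf_min: float = 0.05, hwe_p_min: float = 0.001) -> bool:
    """Keep a SNP only if it passes both the MAF and the HWE thresholds."""
    return (minor_allele_frequency(*counts) >= maf_min
            and hwe_chi2_pvalue(*counts) >= hwe_p_min)


if __name__ == "__main__":
    print(passes_qc((250, 500, 250)))  # balanced SNP in exact HWE proportions
    print(passes_qc((990, 10, 0)))     # rare variant
    print(passes_qc((500, 0, 500)))    # no heterozygotes at all
```

For small counts an exact HWE test is often preferred over the chi-square approximation; the approximation is used here only to keep the sketch dependency-free.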
6.2 Choice of the Association Test
The standard $G^2$ test of independence was preferred over the well-known $\chi^2$ test. For relatively small sample sizes (below 300 subjects), as in the real dataset analyzed in Subsection 7.2, $G^2$ is recommended:

$$G^2 = 2 \sum_{ij} o_{ij} \cdot \ln(o_{ij} / e_{ij}),$$

where $o_{ij}$ and $e_{ij}$ are the observed and expected frequencies (in the absence of genotype-phenotype association) in the cells of the genotypes × phenotypes table. Various p-values were obtained through successive tests of the phenotype Y against, respectively, the causal SNP, the causal SNP ancestor nodes (A nodes) and other nodes (abbreviated as Os) in the FLTM's graph. The phenotype Y is the affected/unaffected status.
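As an illustration of the formula, the statistic can be computed directly from a genotypes × phenotypes contingency table. The sketch below is not the authors' implementation: the function names are hypothetical, expected counts are taken as row total × column total / grand total, cells with a zero observed count are treated as contributing nothing to the sum, and the closed-form p-value covers only the case of two degrees of freedom (three genotypes by two phenotypes).

```python
import math


def g2_statistic(table):
    """G^2 statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts, e.g. one row per genotype
    and one column per phenotype (unaffected, affected).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # a zero cell adds nothing (o * ln o -> 0)
                expected = row_totals[i] * col_totals[j] / n
                total += observed * math.log(observed / expected)
    # Degrees of freedom assume no row or column is entirely empty
    df = (len(table) - 1) * (len(table[0]) - 1)
    return 2 * total, df


def p_value_df2(g2: float) -> float:
    """Chi-square survival function for exactly 2 degrees of freedom."""
    return math.exp(-g2 / 2)


if __name__ == "__main__":
    # Hypothetical counts: rows aa, Aa, AA; columns unaffected, affected
    counts = [[30, 10], [20, 20], [10, 30]]
    g2, df = g2_statistic(counts)
    print(g2, df, p_value_df2(g2))
```

For other table shapes, a general chi-square survival function from a statistics library would be needed in place of `p_value_df2`.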
6.3 Adapted Correction for Multiple Testing
To measure the significance of associations, it is necessary to adapt a permutation procedure dedicated to the computation of the per-test error rate α (type I error), in order