the variance for known main and epistatic effects in order to understand the contribution
of each, with no search method. Zeng et al. [18] and [12] use [6] to partition the variance
when epistatic effects with multiple alleles are present; however, no search method is
presented in these works. Hanlon and Lorenz [8] use an optimization approach to find
combinations of epistatic effects that best represent the trait of interest based on squared
error distance.
To avoid the issue of model selection, Broman and Speed [2] use Markov Chain
Monte Carlo Model Composition (MC³) to search for the main effects (additive models)
that contribute to the trait. This procedure is a variant of the reversible jump Markov
chain Monte Carlo of [7]. Boone et al. [4] extend this to restricted model spaces to allow
for situations where a genome contains more loci than plant lines. Yi et al. [14],
[15], [16], [17] use the MC³ framework with various restrictions on the model space to
search for main and epistatic effects. However, [14], [15], [16], [17] and the R/qtlbim
software of [13] do not require that the main effect terms corresponding to the epistatic
effects be present in the model. Furthermore, [2], [14], [15], [16], [17] and [13] employ
information criteria such as AIC or BIC as the basis for the MC³ search. Boone et al.
[3] show that while BIC is an asymptotically correct approximation for posterior model
probabilities, it performs poorly at low to moderate sample sizes.
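To make the role of BIC concrete, the following is a minimal sketch of how BIC values are commonly turned into approximate posterior model probabilities. It is not taken from the cited works: it assumes a Gaussian linear model, equal prior probabilities across models, and made-up residual sums of squares; the names bic and bic_model_weights are hypothetical.

```python
import math


def bic(rss, n, k):
    """BIC of a Gaussian linear model, up to an additive constant.

    rss: residual sum of squares, n: number of lines, k: number of terms.
    """
    return n * math.log(rss / n) + k * math.log(n)


def bic_model_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probability for every model (an assumption of
    this sketch). Subtracting the minimum BIC avoids underflow.
    """
    best = min(bics)
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical fits to n = 50 lines: (residual sum of squares, number of terms).
n = 50
candidates = {
    "main effects only": (120.0, 2),
    "main effects + one epistatic term": (100.0, 3),
    "many extra terms": (99.0, 6),
}
bics = [bic(rss, n, k) for rss, k in candidates.values()]
weights = bic_model_weights(bics)

for name, w in zip(candidates, weights):
    print(f"{name}: {w:.3f}")
```

Because the approximation is only asymptotically justified, weights computed this way can be unreliable when n is small, which is the concern raised by [3].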
This work uses activation probabilities, defined in Section 2.2, for each of the main
and epistatic effects to determine the marginal posterior probability of each effect
regardless of which model is chosen. Figure 1 shows an example heatmap of the activation
probabilities that may occur when epistasis is present. Activation probabilities along the
diagonal correspond to the main effects of the loci. The off-diagonal activation
probabilities correspond to epistatic effects. Looking along the diagonal, the
main effects appear to be at loci 12, 26 and 35, as the (12, 12), (26, 26)
and (35, 35) regions have high probability. Furthermore, looking at the off-diagonal
regions, loci 12 and 26 appear to have an epistatic effect, indicated by
high probability in the (12, 26) region. However, loci 12 and 35 and loci 26 and 35 do
not appear to have an epistatic effect, since the probability is low in the
(12, 35) and (26, 35) regions of the heatmap.
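The matrix behind such a heatmap can be sketched as follows. The formal definition is given in Section 2.2; this sketch assumes only what is stated above, namely that an activation probability is the marginal posterior probability of an effect, computed here as the total posterior weight of the models that contain it. The model encoding, the weights, and the function name activation_probabilities are hypothetical.

```python
def activation_probabilities(models, weights, p):
    """Return a symmetric p x p matrix of activation probabilities.

    models:  list of sets of (j, k) pairs with j <= k; (j, j) is the main
             effect of locus j and (j, k) the epistatic effect of j and k.
    weights: posterior model weights (normalized here if they are not).
    Entry [j][k] sums the weight of every model containing effect (j, k).
    """
    total = sum(weights)
    act = [[0.0] * p for _ in range(p)]
    for terms, w in zip(models, weights):
        for j, k in terms:
            act[j][k] += w / total
            if j != k:
                act[k][j] += w / total  # mirror so the heatmap is symmetric
    return act


# Hypothetical posterior over three visited models on p = 4 loci (0-indexed).
models = [
    {(0, 0), (2, 2)},
    {(0, 0), (2, 2), (0, 2)},
    {(0, 0)},
]
weights = [0.2, 0.7, 0.1]
act = activation_probabilities(models, weights, p=4)

for row in act:
    print(" ".join(f"{v:.2f}" for v in row))
```

In this toy example the diagonal entries for loci 0 and 2 are high, and so is the off-diagonal entry for the pair (0, 2), which is the pattern described for loci 12 and 26 in Figure 1.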
Section 2 defines the model, basic search strategy, activation probabilities and
conditional activation probabilities. Section 2.3 explains the neighborhood definition and
search strategy under restricted model spaces. Section 3 gives a simulation study showing
the efficacy of the method for detecting both main effects and two-way interaction
effects. Section 4 considers Arabidopsis thaliana as an example. The dataset for
this model organism has 158 recombinant inbred lines (RIL) and 38 markers (loci), and
cotyledon opening angle is the quantitative trait of interest.
2 Bayesian Model Search
2.1 Model Definition
Let y_i be the quantitative trait value for the ith observation. For each of the p loci
l_1, l_2, ..., l_p the parentage of the allele is recorded as A if the allele came from parent
A and B if the allele came from parent B. However, in some instances the allele is not
 