A Restricted Model Space Approach for the Detection of Epistasis in Quantitative Trait Loci Using Markov Chain Monte Carlo Model Composition - Agents and Artificial Intelligence

of interaction terms, the size grows considerably more. In a dataset with 30 loci, a full

model with all first-order terms and two-way interaction terms will have 465 terms (30 main effects plus 435 pairwise interactions).

This can be prohibitively large for most datasets and algorithms. If the model space

is restricted to r<p predictors and the corresponding epistasis terms, then any model

considered will not have nearly as many terms. If r is chosen wisely, then the researcher

can ensure that each model under consideration has sufficient degrees of freedom for

parameter estimation.
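The term count above can be checked with a short sketch. The function name `n_terms` and the restricted size of 5 predictors are illustrative assumptions, not values from the text; the count excludes the intercept.

```python
from math import comb


def n_terms(p: int) -> int:
    """Main effects plus all two-way interactions among p predictors (no intercept)."""
    return p + comb(p, 2)


full = n_terms(30)       # full model over all 30 loci
restricted = n_terms(5)  # hypothetical restriction to r = 5 predictors
print(full, restricted)
```

Under this count, a model restricted to r predictors and their pairwise interactions has at most r + r(r-1)/2 terms, which is what keeps enough degrees of freedom available for estimation.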

Furthermore, estimation can be complicated when linear dependencies exist among the predictors. One approach to this issue is to assign $P(M_c) = 0$ to every model in which such dependencies exist, thereby removing all multicollinear models from consideration. Whenever multicollinear terms are present, an index must be created to keep track of the aliased terms. This aliasing can cause problems when the aliased terms have a large effect size.
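One way to realise a zero prior for multicollinear models is to test whether the design matrix of a model has full column rank. The sketch below is only an illustration under stated assumptions: the helpers `rank` and `model_prior`, the `base_prior` argument, and the use of exact rational arithmetic are all choices made here, not part of the original method. With real-valued data a numerical rank with a tolerance would be more appropriate.

```python
from fractions import Fraction


def rank(columns):
    """Rank of a design matrix given as a list of columns (exact Gaussian elimination)."""
    rows = [[Fraction(v) for v in row] for row in zip(*columns)]
    rk = 0
    for c in range(len(columns)):
        # Find a row at or below position rk with a nonzero entry in column c.
        pivot = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][c] / rows[rk][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def model_prior(columns, base_prior=1.0):
    """Return base_prior for a full-rank model and 0 for a linearly dependent one.

    base_prior is a hypothetical unnormalised prior weight.
    """
    return base_prior if rank(columns) == len(columns) else 0.0


x1 = [0, 1, 2, 1]
x2 = [1, 0, 1, 2]
x3 = [a + b for a, b in zip(x1, x2)]  # exact linear combination of x1 and x2
print(model_prior([x1, x2]), model_prior([x1, x2, x3]))
```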

The use of restricted model spaces allows all candidate variables to be assessed, but it limits the number of candidate variables that may be considered simultaneously in a single model. [14], [15], [16], [17] and [13] use two restrictions: one on the number of main effect terms and one on the number of epistatic terms allowed in the model simultaneously. They also give a simple guideline for determining the size of each restriction: choose $r = m + 2\sqrt{m}$, where $m$ is the a priori expected number of main effects. The same formula can be applied with $m$ taken as the expected number of epistatic effects. While this guideline is easy to apply, the anecdotal evidence in Section 4.1 suggests that the restriction size does not have a great impact on the inferences resulting from the proposed method. However, if the restriction is set very small, the stochastic search will have difficulty moving around the model space, and the algorithm will therefore take a long time to converge.
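As a small worked example of the guideline, the sketch below evaluates m + 2√m for a given prior expectation m. Rounding up to the next integer is an assumption made here so that the restriction is a whole number of terms; the text does not specify a rounding rule, and the name `restriction_size` is illustrative.

```python
from math import ceil, sqrt


def restriction_size(m: float) -> int:
    """Guideline r = m + 2*sqrt(m), rounded up (the rounding is an assumption)."""
    return ceil(m + 2 * sqrt(m))


# Hypothetical prior expectations: 4 main effects, 3 epistatic effects.
r_main = restriction_size(4)
r_epistatic = restriction_size(3)
print(r_main, r_epistatic)
```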

To search through the restricted model space, MC$^3$ can be employed using equation (7). Note that $q(M_t \mid M_c)$ must be determined to move through the sample space. Let $\mathrm{nbd}(M_c)$ be the set of all models with one more main effect term, one more valid interaction term, one fewer main effect term or one fewer interaction term than model $M_c$. Denote adding a main effect term as AMT, adding an interaction effect term as AIT, dropping a main effect term as DMT and dropping an interaction effect term as DIT. The probability of each of these actions depends on the attributes of the current model $M_c$.

Let $\gamma_c$ and $\phi_c$ be the number of main effect terms and the number of interaction terms in $M_c$, respectively. In order to ensure that all models in $\mathrm{nbd}(M_c)$ are equally likely, the probability of each action, AMT, AIT, DMT and DIT, needs to be determined. Let $\Omega = \{\mathrm{AMT}, \mathrm{AIT}, \mathrm{DMT}, \mathrm{DIT}\}$ be the action space. Once these probabilities have been calculated, the following procedure allows each of the models in $\mathrm{nbd}(M_c)$ to be a candidate model. First determine $P(\mathrm{AMT})$, $P(\mathrm{AIT})$, $P(\mathrm{DMT})$ and $P(\mathrm{DIT})$, and choose an action with the corresponding probability. Then select, with equal probability, a model that is in $\mathrm{nbd}(M_c)$ and corresponds to the chosen action. This procedure ensures that all models in $\mathrm{nbd}(M_c)$ have equal probability. Having all models in $\mathrm{nbd}(M_c)$ equally likely will be necessary in computing $q(M_c \mid M_t)$.
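The two-stage proposal can be sketched as follows. Several details are assumptions made only for illustration and are not stated in the text: an interaction is treated as "valid" only when both of its parent main effects are in the model, a main effect can be dropped only when no interaction in the model uses it, and each action probability is taken as the share of the neighbourhood reachable by that action. With that last choice, picking an action and then a model uniformly within it gives every neighbour probability 1/|nbd(M_c)|. All function and parameter names are hypothetical.

```python
import random
from itertools import combinations


def neighborhood(main, inter, p, r_main, r_inter):
    """Group nbd(M_c) by action.

    main: frozenset of locus indices; inter: frozenset of sorted index pairs.
    """
    used = {j for pair in inter for j in pair}
    nbd = {"AMT": [], "AIT": [], "DMT": [], "DIT": []}
    if len(main) < r_main:  # restriction on the number of main effects
        nbd["AMT"] = [(main | {j}, inter) for j in range(p) if j not in main]
    if len(inter) < r_inter:  # restriction on the number of interactions
        nbd["AIT"] = [(main, inter | {pair})
                      for pair in combinations(sorted(main), 2)
                      if pair not in inter]
    # Assumed rule: keep parents of any interaction that is still in the model.
    nbd["DMT"] = [(main - {j}, inter) for j in sorted(main) if j not in used]
    nbd["DIT"] = [(main, inter - {pair}) for pair in sorted(inter)]
    return nbd


def action_probs(nbd):
    """P(action) proportional to the number of neighbours that action can reach."""
    total = sum(len(models) for models in nbd.values())
    return {action: len(models) / total for action, models in nbd.items()}


def propose(main, inter, p, r_main, r_inter, rng=random):
    """Draw M_t uniformly from nbd(M_c); return M_t, q(M_t|M_c) and q(M_c|M_t)."""
    nbd = neighborhood(main, inter, p, r_main, r_inter)
    probs = action_probs(nbd)
    actions = list(probs)
    action = rng.choices(actions, weights=[probs[a] for a in actions])[0]
    candidate = rng.choice(nbd[action])
    q_forward = 1 / sum(len(v) for v in nbd.values())
    back = neighborhood(candidate[0], candidate[1], p, r_main, r_inter)
    q_reverse = 1 / sum(len(v) for v in back.values())
    return candidate, q_forward, q_reverse


# Hypothetical current model: loci 0, 1, 2 with the interaction between 0 and 1.
M_c = (frozenset({0, 1, 2}), frozenset({(0, 1)}))
print(action_probs(neighborhood(*M_c, p=5, r_main=4, r_inter=2)))
print(propose(*M_c, p=5, r_main=4, r_inter=2, rng=random.Random(0)))
```

Under these assumptions the neighbourhood relation is symmetric, so the reverse probability is simply one over the size of the candidate's own neighbourhood; the two values are what a Metropolis-Hastings acceptance ratio would need from the proposal.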
