Biomedical Engineering Reference
In-Depth Information
The scalable FLTM learning algorithm reaches orders of magnitude consistent with
GWAS demands (10^5 variables, 2000 individuals). In addition to scalability, data
dimension reduction argues for the use of FLTM-based modeling in GWASs: the issue of
multiple hypothesis testing would be addressed by testing a small number of latent
variables instead of a large number of observed variables. In the methodical
investigation presented in this chapter, a permutation procedure was necessary to
correct for multiple testing. In the context of a GWAS, only the trees rooted in
latent variables shown to be significantly associated with the phenotype would be
explored. Thus the permutation procedure remains necessary to compute the
significance threshold specific to each layer. Finally, before envisaging an
FLTM-based GWAS, an inescapable prerequisite was to test whether, despite the
bottom-up fading of information through the forest, association detection would
remain reliable. Equally unavoidable was the close examination of the proportions of
latent variables erroneously associated with the disease.
In an exhaustive analysis of latent variables such as the one described above, the
high concentration of (false) associations in a tree pinpointed the causal tree.
However, in a GWAS implementation, a mere best-first search in the FLTM would not
allow this high concentration to be identified. Therefore, designing an optimized
procedure remains an open question, one in which some variant of the best-first FLTM
traversal strategy, dimension reduction and conditional dependence testing have a
role to play.
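
The layer-specific significance threshold mentioned above could be obtained along the following lines. This is only a minimal sketch under stated assumptions, not the procedure used in the chapter: it assumes discrete variables, a Pearson chi-square statistic as the association measure, and a max-statistic permutation scheme that is only meaningful when the variables of a layer share the same cardinality. The function names (chi2_statistic, layer_threshold, significant_variables) and the data layout (one list of values per variable) are hypothetical.

```python
import random
from collections import Counter


def chi2_statistic(values, phenotype):
    """Pearson chi-square statistic of the table (variable value x phenotype)."""
    n = len(values)
    joint = Counter(zip(values, phenotype))
    row = Counter(values)
    col = Counter(phenotype)
    stat = 0.0
    for v in row:
        for p in col:
            expected = row[v] * col[p] / n  # > 0: both margins are non-empty
            observed = joint.get((v, p), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def layer_threshold(layer_vars, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Permutation threshold for one layer (assumed max-statistic scheme).

    The phenotype is shuffled n_perm times; for each shuffle the largest
    statistic over all variables of the layer is recorded. The threshold is
    the empirical (1 - alpha) quantile of these maxima.
    """
    rng = random.Random(seed)
    shuffled = list(phenotype)
    max_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_stats.append(max(chi2_statistic(v, shuffled) for v in layer_vars))
    max_stats.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return max_stats[index]


def significant_variables(layers, phenotype, **kwargs):
    """For each layer, indices of variables exceeding that layer's threshold."""
    result = {}
    for name, layer_vars in layers.items():
        threshold = layer_threshold(layer_vars, phenotype, **kwargs)
        result[name] = [
            i for i, v in enumerate(layer_vars)
            if chi2_statistic(v, phenotype) > threshold
        ]
    return result
```

Because each layer gets its own null distribution of maxima, a layer with few latent variables is corrected less severely than the layer of observed variables, which is the practical benefit of testing at the latent level. In a GWAS setting, only the trees rooted in the variables returned for a layer would then be explored further.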