Biomedical Engineering Reference
Exome Sequencing as a Discovery and Diagnostic Tool
The human genome is estimated at 2,872 Mbp and consists of genes and noncoding
DNA sequences. Approximately 1.5% of the human genome is known to code for
proteins, and this portion is the exome. The coding portion has been shown to be
more evolutionarily conserved and thus more sensitive to change (Birney et al.
2007). The decreasing cost of sequencing, due to emerging next-generation
sequencing (NGS) technologies, provides an opportunity to screen the exome at an
affordable cost for gene discovery and diagnostic purposes. The large amount of
information generated by human genome sequencing, the 1000 Genomes Project,
HapMap, and whole exome sequencing (WES) projects has allowed us to interpret
sequence changes with a higher level of confidence (Abecasis et al. 2012, 2010;
Tennessen et al. 2012). To deal with the large sequencing datasets, a variety of
bioinformatics tools have been developed to automate the process of annotation and
prediction of sequence changes (Wang et al. 2010b). Because NGS is massively
parallel, its research and clinical applications include the sequencing of many
genes at once, as targeted panels, exomes, and even genomes. The growing number
of published findings has allowed polymorphisms and disease-associated mutations
to be cataloged in various databases, including the database of single nucleotide
polymorphisms (dbSNP), the human gene mutation database (HGMD), ENSEMBL, the
1000 Genomes Project database (http://www.1000genomes.org/), and the exome
sequencing project database (http://evs.gs.washington.edu/EVS/), to mention a few.
The scale of these data is evident in dbSNP, which has close to 53 million records
and a number of new submissions that has been increasing exponentially (Wheeler
et al. 2007).
For the past 5 years, WES has been successfully applied as a diagnostic tool in
the clinical area and as a discovery tool to find new disease genes (Bolze et al.
2010; Coelho et al. 2012; Dibbens et al. 2013; Yu et al. 2013). In these studies,
family-based analysis designs provide a simple means of cross-sample hypothesis
testing (Ku et al. 2011). About 180,000 exons are targeted by array- or
solution-based capture methods, followed by NGS (Okou et al. 2007; Sulonen et al. 2011).
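
As a rough check on the scale these figures imply, the arithmetic can be sketched in Python. The sketch uses only the approximate values quoted above (a 2,872 Mbp genome, a 1.5% coding fraction, and about 180,000 targeted exons). The constant and function names are hypothetical, chosen for illustration, and the per-exon figure is a crude average that assumes the targeted exons together span the coding portion.

```python
# Approximate figures quoted in the text; none of them are exact values.
GENOME_SIZE_MBP = 2872      # estimated human genome size, in megabase pairs
CODING_FRACTION = 0.015     # about 1.5% of the genome codes for proteins
TARGETED_EXONS = 180_000    # about 180,000 exons targeted for capture


def exome_size_mbp(genome_mbp, coding_fraction):
    """Return the approximate size of the coding portion in Mbp."""
    return genome_mbp * coding_fraction


def mean_target_length_bp(exome_mbp, n_exons):
    """Return a crude average length per targeted exon in base pairs.

    Assumption (not from the text): the targeted exons together cover
    the whole coding portion and nothing else.
    """
    return exome_mbp * 1_000_000 / n_exons


exome_mbp = exome_size_mbp(GENOME_SIZE_MBP, CODING_FRACTION)
mean_bp = mean_target_length_bp(exome_mbp, TARGETED_EXONS)

print(f"Approximate exome size: {exome_mbp:.1f} Mbp")
print(f"Approximate mean length per targeted exon: {mean_bp:.0f} bp")
```

Under these assumptions the exome comes to roughly 43 Mbp, with an average on the order of 240 bp per targeted exon. This illustrates why screening the exome requires far less sequencing than screening the whole genome.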