Biomedical Engineering Reference
Each NGS platform has its own specific sequencing biases, which can affect the
types and rates of errors made during data generation. These include signal
intensity decay over the length of the read and erroneous insertions and
deletions in homopolymeric stretches (Ledergerber and Dessimoz 2011). It is
critical to use a base-calling package that is designed to reduce
platform-specific errors. A Phred-like score associated with each base call (or
another quality metric) is a useful measure of the quality of platform-specific
base calling.
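As an illustration of what a Phred-like score expresses, the sketch below converts between a quality score Q and the estimated probability P that a base call is wrong, using the standard relation Q = -10 log10(P). It assumes the platform reports scores on this standard Phred scale and that quality strings use the common ASCII offset of 33; the function names are hypothetical and are not taken from any particular base-calling package.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to the estimated error probability P."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an estimated error probability P to a Phred quality score Q."""
    return -10 * math.log10(p)


def decode_quality_string(quality_string: str, offset: int = 33) -> list:
    """Decode an ASCII-encoded quality string into integer Phred scores.

    Assumption: the string uses the offset-33 encoding; other offsets
    can be passed explicitly.
    """
    return [ord(ch) - offset for ch in quality_string]


if __name__ == "__main__":
    # Q20 corresponds to an estimated 1 error in 100 base calls,
    # Q30 to an estimated 1 error in 1,000.
    for q in (10, 20, 30, 40):
        print(q, phred_to_error_prob(q))
    print(decode_quality_string("I5!"))
```

Under these assumptions, a base with Q30 has an estimated 0.1% chance of being miscalled, which is why per-base scores are commonly used to filter or down-weight low-quality calls.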
Several commercial and open-source tools for read alignment are available (such
as GATK and NextGENe). They use a variety of alignment algorithms, and some may
be more efficient for certain types of data than for others (Li and Homer 2010).
They also differ in accuracy and processing speed. Depending on the types of
variation expected, it is critical to choose one or more read alignment tools to
apply to the data. In addition, proper alignment can be challenging in regions
of high sequence homology, but it can be improved by longer or paired-end reads.
The accuracy of variant calling depends on the depth of sequence coverage and on
whether reads are obtained in both directions. Most variant calling algorithms
are capable of detecting single or multiple base variations, but algorithms
differ in their sensitivity for detecting insertions and deletions (indels),
large copy number variants (CNVs), and structural chromosomal rearrangements
(translocations and inversions). Large deletions and duplications can be
detected either by comparing the actual read depth of a region to the expected
read depth or through paired-end read mapping. Paired-end and mate-pair mapping
can also be used to identify translocations and other structural rearrangements.
However, it is recommended to confirm these results with microarray comparative
genomic hybridization (aCGH), microarray-based CNV analysis, or other testing.
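The read-depth comparison described above can be sketched as a log2 ratio of observed to expected depth per region. This is a minimal illustration and not the method of any specific CNV caller: the function names, the pseudocount, and the loss and gain thresholds are hypothetical values chosen for the example, and real pipelines normalize depth (for instance for GC content and capture efficiency) before making calls.

```python
import math


def log2_depth_ratio(observed: float, expected: float,
                     pseudocount: float = 0.5) -> float:
    """Return log2(observed / expected) read depth for one region.

    The pseudocount (an assumption of this sketch) avoids taking the
    logarithm of zero when a region has no reads.
    """
    return math.log2((observed + pseudocount) / (expected + pseudocount))


def classify_regions(regions, loss_threshold: float = -0.6,
                     gain_threshold: float = 0.4):
    """Label each (name, observed_depth, expected_depth) region.

    Thresholds are illustrative only: in a diploid genome a heterozygous
    deletion gives a ratio near log2(1/2) = -1 and a single-copy
    duplication gives a ratio near log2(3/2), about 0.58.
    """
    calls = []
    for name, observed, expected in regions:
        ratio = log2_depth_ratio(observed, expected)
        if ratio <= loss_threshold:
            label = "possible deletion"
        elif ratio >= gain_threshold:
            label = "possible duplication"
        else:
            label = "no change detected"
        calls.append((name, ratio, label))
    return calls


if __name__ == "__main__":
    # Hypothetical depths: expected depth might come from matched controls.
    example = [("region_A", 100, 100), ("region_B", 50, 100),
               ("region_C", 150, 100)]
    for name, ratio, label in classify_regions(example):
        print(f"{name}\t{ratio:.2f}\t{label}")
```

Calls produced this way are candidates only, which is consistent with the recommendation to confirm them by an orthogonal method such as aCGH.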
Variant Annotation and Filtering
Given the massive coverage of WES and WGS, WES identifies tens of thousands of
variants while WGS identifies several million. It is impossible to manually assess