Biomedical Engineering Reference
2.2.2.7 Variant detection
Once reads are assembled on a reference sequence, variants can be detected by pairwise
comparisons between the reads and the reference sequence. If an assembly was generated
using the phredPhrap script (see Protocol 2.5), then Polyphred [19] can be used to detect
variants (see Protocol 2.6). This program, also developed at the University of Washington,
is one of the most widely used tools for mutation detection. It detects heterozygous and
homozygous SNPs as well as indels, scores each variant, and provides relevant genotypes
from the read basecalls. Both the phredPhrap suite and Polyphred, while intuitive, require
basic knowledge of Linux/UNIX. Alternatively, the NovoSNP tool [21] (Figure 2.1) can perform
variant discovery and visualization in a more user-friendly interface (see Protocol 2.7).
While less scalable than Polyphred, NovoSNP runs on Windows, Mac or Linux computers
and requires only a reference sequence (FASTA file) and sequence traces (in binary format).
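The idea of a pairwise comparison between a read and the reference can be illustrated with a toy sketch. This is not how Polyphred or NovoSNP work internally (both also use trace and quality information); it only lists positions where an already aligned read differs from the reference. The function name compare_to_reference and the example sequences are hypothetical.

```shell
# Toy comparison of one aligned read against the reference.
# Assumes both strings are already aligned and of equal length.
compare_to_reference() {
    awk -v ref="$1" -v rd="$2" 'BEGIN {
        n = length(ref)
        for (i = 1; i <= n; i++) {
            r = substr(ref, i, 1)
            q = substr(rd, i, 1)
            # Report 1-based position, reference base and read base (tab-separated).
            if (r != q) printf "%d\t%s\t%s\n", i, r, q
        }
    }'
}

# Example call with made-up sequences that differ at one position.
compare_to_reference "ACGTACGT" "ACGTTCGT"
```

A real variant caller additionally weighs base quality and peak shape before reporting a difference as a variant, which is why the tools described above are preferred over a plain string comparison.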
PROTOCOL 2.6 Variant detection in Phrap assemblies with Polyphred
Equipment and reagents
Directory structures, Phred output and assemblies for sequence traces and reference
sequence from running phredPhrap (see Protocol 2.5) u
The Polyphred program. v
Method
1 Run the phredPhrap script from the edit_dir subdirectory if you have not already done
so (see Protocol 2.5).
2 Run the Polyphred program w from the edit_dir subdirectory:
(a) cd edit_dir/
(b) polyphred -ace [ace_file] -refcomp [refseq_id] [options] x
3 Review the Polyphred output. y
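Before running Polyphred, it can help to confirm that phredPhrap left behind the directory layout described in note u of this protocol. The following is a minimal sketch of such a check. The function name check_phredphrap_layout is hypothetical, and the pattern '*.ace*' assumes that the assembly file name contains ".ace", possibly followed by a suffix.

```shell
# Hypothetical helper: check the layout that phredPhrap is expected to produce.
# Usage: check_phredphrap_layout /path/to/project
check_phredphrap_layout() {
    project=$1
    status=0

    # The four subdirectories named in the protocol notes.
    for sub in chromat_dir edit_dir phd_dir poly_dir; do
        if [ ! -d "$project/$sub" ]; then
            echo "missing directory: $sub"
            status=1
        fi
    done
    [ "$status" -eq 0 ] || return 1

    # One file per trace is expected in chromat_dir, phd_dir and poly_dir.
    n_traces=$(find "$project/chromat_dir" -type f | wc -l | tr -d ' ')
    for sub in phd_dir poly_dir; do
        n=$(find "$project/$sub" -type f | wc -l | tr -d ' ')
        if [ "$n" -ne "$n_traces" ]; then
            echo "$sub holds $n files, expected $n_traces"
            status=1
        fi
    done

    # At least one assembly (.ace) file should be present in edit_dir.
    if ! find "$project/edit_dir" -type f -name '*.ace*' | grep -q .; then
        echo "no assembly (.ace) file found in edit_dir"
        status=1
    fi

    return "$status"
}
```

The function prints one line per problem and returns a non-zero status if anything is missing, so it can be used as `check_phredphrap_layout . && cd edit_dir/` from the project directory.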
Notes
u If the phredPhrap script ran successfully, there should be four subdirectories: chromat_dir,
edit_dir, phd_dir and poly_dir. There should be one file per trace in the chromat_dir, phd_dir
and poly_dir subdirectories. There should also be an assembly (.ace) file in the edit_dir folder.
v Available from http://droog.gs.washington.edu/polyphred [19].
w Recommended options for SNP detection include:
-t genotype: specifies output with Consed-compatible tags and SNP genotypes.
-quality 25: specifies the minimum quality a base must have to be used for variant calling.
-score 25: specifies the score threshold for variant calling.
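Putting the command from step 2 together with the options recommended in note w gives a single invocation. The sketch below only prints that command so it can be inspected before it is run from edit_dir. The option spellings are copied from this protocol and have not been checked against a particular Polyphred version. The function name build_polyphred_cmd and the example file and sequence names are hypothetical.

```shell
# Sketch: assemble the Polyphred call from step 2 with the recommended options.
# Arguments: the assembly (.ace) file and the reference sequence identifier.
build_polyphred_cmd() {
    ace_file=$1
    refseq_id=$2
    # Option values (genotype, 25, 25) are those recommended in the protocol notes.
    printf 'polyphred -ace %s -refcomp %s -t genotype -quality 25 -score 25\n' \
        "$ace_file" "$refseq_id"
}

# Print the command for inspection; the arguments here are placeholders.
build_polyphred_cmd "my_assembly.ace" "my_reference"
```

Printing the command instead of executing it keeps the sketch usable on machines where Polyphred is not installed; once the line looks right, it can be pasted into a shell opened in edit_dir.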