▪ A single-nucleotide polymorphism (SNP), pronounced “snip,” is a single-character change in the source code (e.g., from ACTGACTG to ACTTACTG).
▪ An indel is short for insert-delete and represents an insertion or deletion relative to the reference genome. For example, if the reference has CCTGACTG and your sample has four characters inserted (say, CCTGCCTAACTG), then it is an indel.
▪ Only 0.5% of the source gets translated into the proteins that sustain your life. That portion of the source is called your exome. A human exome requires a few gigabytes to store in compressed binary files.
▪ The other 99.5% of the source is commented out and serves as word padding (in- […]). A whole genome requires a few hundred gigabytes to store in compressed binary files.
▪ […] commented out by epigenetic factors like DNA methylation and histone modification, not unlike an #ifdef statement for each cell type (e.g., #ifdef RETINA or #ifdef LIVER). These factors are responsible for making cells in your retina operate differently from cells in your liver.
▪ The process of variant calling is similar to running diff between two different DNA sources.
These analogies aren't meant to be taken too literally, but hopefully they helped familiarize you with some genomics terminology.
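To make the diff analogy a little more concrete, here is a minimal Python sketch that compares a reference string with a sample string and labels the difference as a SNP, an insertion, or a deletion. The function name classify_variant and its return format are made up for this illustration, and the sketch assumes the two strings differ at a single site. Real variant callers work from many aligned sequencing reads and statistical models rather than two clean strings, so treat this strictly as a toy.

```python
def classify_variant(reference, sample):
    """Toy 'diff' between two DNA strings (hypothetical helper, not a real tool).

    Assumes at most one variant site. Returns None if the strings are
    identical, otherwise a tuple (kind, position, ref_bases, sample_bases),
    where position is the 0-based offset into the reference.
    """
    # Skip the prefix that both strings share.
    start = 0
    limit = min(len(reference), len(sample))
    while start < limit and reference[start] == sample[start]:
        start += 1

    # Skip the shared suffix, without running back into the shared prefix.
    end_ref, end_alt = len(reference), len(sample)
    while (end_ref > start and end_alt > start
           and reference[end_ref - 1] == sample[end_alt - 1]):
        end_ref -= 1
        end_alt -= 1

    ref_bases = reference[start:end_ref]
    alt_bases = sample[start:end_alt]

    if not ref_bases and not alt_bases:
        return None  # no difference at all
    if len(ref_bases) == 1 and len(alt_bases) == 1:
        kind = "SNP"  # one character swapped for another
    elif not ref_bases:
        kind = "insertion"  # sample has extra characters (an indel)
    elif not alt_bases:
        kind = "deletion"  # sample is missing characters (an indel)
    else:
        kind = "complex"  # more than this toy is meant to handle
    return (kind, start, ref_bases, alt_bases)


# The two examples from the list above:
print(classify_variant("ACTGACTG", "ACTTACTG"))      # ('SNP', 3, 'G', 'T')
print(classify_variant("CCTGACTG", "CCTGCCTAACTG"))  # ('insertion', 4, '', 'CCTA')
```

Trimming the shared prefix and suffix is a deliberate simplification. A general-purpose text diff can pick a misleading alignment on repetitive strings such as ACTGACTG, which is one reason the analogy should not be pushed too far.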