possible states. Each nucleotide can contain either a purine (state $+1$) or a pyrimidine (state $-1$) base. The genome of each haploid individual is then
represented by a sequence of L sites. Each individual i in a population
consisting of J individuals is represented as $(S^i_1, S^i_2, \ldots, S^i_L)$, where $S^i_u$ is the uth site in the genome of individual i. The genetic similarity between individual i and individual j can be defined as
$$
q^{ij} = \frac{1}{L} \sum_{u=1}^{L} S^i_u S^j_u, \tag{1}
$$
with $q^{ij} \in [-1, 1]$, where 1 means the two individuals are genetically identical.
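The genetic similarity in Eq. (1) can be written in terms of the fraction of identical sites ($f^{ij}$), since each identical site contributes $+1$ to the sum and each differing site contributes $-1$: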
$$
q^{ij} = \frac{1}{L}\left[L f^{ij} - L\left(1 - f^{ij}\right)\right] = 2 f^{ij} - 1, \tag{2}
$$
and $f^{ij}$ is:
$$
f^{ij} = \frac{1 + q^{ij}}{2}. \tag{3}
$$
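As a quick numerical check of Eqs. (1)-(3), the following Python sketch computes the similarity and the fraction of identical sites for two short genomes coded as +1/-1. The function names `similarity` and `identical_fraction` and the two example sequences are illustrative assumptions, not part of the original model description.

```python
from typing import Sequence


def similarity(s_i: Sequence[int], s_j: Sequence[int]) -> float:
    """Genetic similarity q^{ij} of Eq. (1) for two genomes coded as +1/-1."""
    if len(s_i) != len(s_j):
        raise ValueError("genomes must have the same number of sites L")
    return sum(a * b for a, b in zip(s_i, s_j)) / len(s_i)


def identical_fraction(s_i: Sequence[int], s_j: Sequence[int]) -> float:
    """Fraction of identical sites f^{ij}, counted directly site by site."""
    if len(s_i) != len(s_j):
        raise ValueError("genomes must have the same number of sites L")
    return sum(a == b for a, b in zip(s_i, s_j)) / len(s_i)


# Hypothetical genomes with L = 8 sites (purine = +1, pyrimidine = -1).
genome_i = [1, -1, 1, 1, -1, -1, 1, -1]
genome_j = [1, -1, -1, 1, -1, 1, 1, -1]

q_ij = similarity(genome_i, genome_j)          # Eq. (1)
f_ij = identical_fraction(genome_i, genome_j)  # direct count

# Eq. (2) and Eq. (3) should hold for any pair of genomes.
print(q_ij, 2 * f_ij - 1)
print(f_ij, (1 + q_ij) / 2)
```

The two genomes differ at two of eight sites, so the direct count and the conversion through Eq. (3) can be compared by hand.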
Each nucleotide in the offspring is inherited at random, thus ignoring linkage between neighbouring nucleotides, but with a small probability of error determined by the mutation rate. Say that the individual k inherited the nucleotide in site u from its parent G(k): what is the probability that k will have exactly the same nucleotide (i.e. $+1$ or $-1$) as G(k)? The probabilities of no mutation and of mutation in site u are, respectively:
$$
\begin{cases}
P\left(S^{G(k)}_u = S^k_u\right) = 1 - \mu_u, \\
P\left(S^{G(k)}_u = -S^k_u\right) = \mu_u.
\end{cases} \tag{4}
$$
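Eq. (4) can be illustrated by simulating inheritance at independent sites: each site is copied from the parent and flipped with probability equal to the mutation rate. This is a minimal sketch assuming the same mutation rate at every site; the function name `inherit`, the genome length, the mutation rate, and the random seed are hypothetical choices made only for illustration.

```python
import random


def inherit(parent, mu, rng):
    """Copy each +1/-1 site from the parent, flipping it with probability mu (Eq. 4)."""
    return [-s if rng.random() < mu else s for s in parent]


rng = random.Random(42)   # fixed seed so the sketch is reproducible
L = 100_000               # number of sites (illustrative value)
mu = 0.05                 # per-site mutation rate (illustrative value)

# Random parental genome, then one offspring copy with mutation.
parent = [rng.choice((-1, 1)) for _ in range(L)]
offspring = inherit(parent, mu, rng)

# The fraction of sites identical to the parent should be close to 1 - mu.
f_parent_offspring = sum(a == b for a, b in zip(parent, offspring)) / L
print(f_parent_offspring, 1 - mu)
```

With many sites the observed fraction of unchanged nucleotides approaches the no-mutation probability in Eq. (4), while a mutation rate of zero returns an exact copy of the parent.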
In order to track divergence in the initial population with J individuals, we have to calculate at each interaction event the similarity values between the parents of the offspring k (i.e. $G_1(k)$ and $G_2(k)$) and each individual j in the population. What is the expected fraction of nucleotides in the offspring k shared with each individual j in the population ($E[f^{kj}]$)? If we assume the same mutation rate among nucleotides, $\mu^i_1 = \mu^i_2 = \ldots = \mu^i_L = \mu$, then from (4) this expected fraction is:
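$$
\begin{aligned}
E\left[f^{kj}\right] ={}& \frac{1}{2}\left[ P\left(S^{G_1(k)}_u = S^k_u\right) f^{G_1(k)j} + P\left(S^{G_1(k)}_u = -S^k_u\right)\left(1 - f^{G_1(k)j}\right) \right] \\
&+ \frac{1}{2}\left[ P\left(S^{G_2(k)}_u = S^k_u\right) f^{G_2(k)j} + P\left(S^{G_2(k)}_u = -S^k_u\right)\left(1 - f^{G_2(k)j}\right) \right]
\end{aligned} \tag{5}
$$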
Substituting (4) into (5) gives: