replicate experiments, the estimated value of the true expression level,
$\theta_0$, and the size of the measurement error, $\delta\theta$, can be
defined as $\theta_0 \equiv (\theta_1 + \theta_2)/2$ and
$\delta\theta \equiv (\theta_1 - \theta_2)/\sqrt{2}$. The value of $\theta_0$
is discretized with a relatively small bin size so as to maintain a good
resolution while having sufficient data points per bin. The results are
insensitive to the exact choice of the bin size. For a given $\theta_0$, the
average of $\delta\theta$ between two experiments should be zero:
$\langle \delta\theta | \theta_0 \rangle = 0$. Any significantly nonzero value
of $\langle \delta\theta | \theta_0 \rangle$ is caused by systematic
experimental errors whose source is beyond the scope of the current
discussion. This error typically appears as a departure from the diagonal of
scatter plots such as those of figure 4.2. A hint of it can be seen at the
higher values of figure 4.2b. Even though this was not a significant problem
for the data sets presented here, such errors have been compensated for
whenever they occurred by subtracting any nonzero
$\langle \delta\theta | \theta_0 \rangle$ from $\delta\theta$ for each of the
replicate experiment pairs in all of the subsequent analysis.
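As an illustration of the procedure just described, the following Python sketch computes $\theta_0$ and $\delta\theta$ for one pair of replicates and subtracts the per-bin average $\langle \delta\theta | \theta_0 \rangle$. It assumes $\theta_0 = (\theta_1 + \theta_2)/2$ and $\delta\theta = (\theta_1 - \theta_2)/\sqrt{2}$. The function names, the use of NumPy, and the default bin width of 0.1 are assumptions made for the example and are not taken from the original analysis.

```python
import numpy as np

def pair_statistics(theta1, theta2):
    """Return (theta0, dtheta) for one pair of replicate experiments.

    theta1, theta2: expression values of the same genes in the two replicates.
    """
    theta1 = np.asarray(theta1, dtype=float)
    theta2 = np.asarray(theta2, dtype=float)
    theta0 = (theta1 + theta2) / 2.0            # estimated true expression level
    dtheta = (theta1 - theta2) / np.sqrt(2.0)   # measurement error (assumed normalization)
    return theta0, dtheta

def remove_systematic_bias(theta0, dtheta, bin_width=0.1):
    """Subtract the per-bin mean <dtheta | theta0> from dtheta.

    bin_width is an illustrative choice; the text only asks for a bin that is
    small but still holds enough data points.
    Returns (corrected_dtheta, bin_index), with bin_index starting at 0.
    """
    theta0 = np.asarray(theta0, dtype=float)
    dtheta = np.asarray(dtheta, dtype=float)

    # Discretize theta0 into equal-width bins.
    bins = np.floor(theta0 / bin_width).astype(int)
    bins = bins - bins.min()

    # Mean of dtheta in each bin; empty bins get a mean of zero.
    counts = np.bincount(bins)
    sums = np.bincount(bins, weights=dtheta)
    means = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)

    return dtheta - means[bins], bins
```

After the correction, the average of the returned $\delta\theta$ values vanishes in every occupied bin, which is the condition $\langle \delta\theta | \theta_0 \rangle = 0$ stated in the text.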
Within each group $G_k$ ($k = 1, 2$), the distribution of $\delta\theta$ for a given $\theta_0$
can be obtained from each pair of replicate experiments; these distributions
are found to be highly consistent with each other. Better statistics are
obtained by using the gene expression values from all the pairs of replicate
experiments in $G_k$ to construct the noise distribution:
$P_k(\delta\theta | \theta_0) = \mathrm{Prob}(\delta\theta | \theta = \theta_0)$.
In figure 4.3a, the noise distribution functions for different values of
$\theta_0$ are shown. One may use the second-order moment to quantify the
strength of the noise and its dependence on the value of the expected
expression level $\theta_0$:

$$\sigma_k^2(\theta_0) = \int \delta\theta^2 \, P_k(\delta\theta | \theta_0) \, d\delta\theta \qquad (1)$$

In figure 4.3c, the dependence of $\sigma_k$ on $\theta_0$ is shown. For
reference, $\sigma_3$, the difference in gene expression between pairs of
experiments in $G_3$, is calculated in the same way as were $\sigma_{1,2}$,
and is plotted in figure 4.3c as well. It is interesting that $\sigma_3$ is
consistently larger than $\sigma_2$ for $\theta_0 > 1$, indicating the
existence of signal beyond noise even for the small differences between the
same cell line from different cultures.
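The second moment in equation (1) can be estimated directly from the pooled replicate pairs. The Python sketch below bins the pooled $\theta_0$ values and takes the mean of $\delta\theta^2$ in each bin as an estimate of $\sigma_k^2(\theta_0)$. The function name `noise_strength`, the bin width, and the `min_count` cutoff for sparsely populated bins are hypothetical choices made for illustration, not part of the original analysis.

```python
import numpy as np

def noise_strength(theta0, dtheta, bin_width=0.1, min_count=20):
    """Estimate sigma(theta0) from the second moment of dtheta in each theta0 bin.

    theta0, dtheta: 1-D arrays pooled over all replicate pairs of one group.
    bin_width, min_count: illustrative defaults, not values from the text.
    Returns (bin_centers, sigma, counts) for bins holding at least min_count points.
    """
    theta0 = np.asarray(theta0, dtype=float)
    dtheta = np.asarray(dtheta, dtype=float)

    # Discretize theta0 into equal-width bins, remembering the first bin's index.
    idx = np.floor(theta0 / bin_width).astype(int)
    offset = idx.min()
    idx = idx - offset

    # Per-bin point count and per-bin sum of dtheta squared.
    counts = np.bincount(idx)
    sum_sq = np.bincount(idx, weights=dtheta ** 2)

    # Drop bins with too few points to give a stable estimate.
    keep = counts >= min_count
    centers = (np.arange(counts.size) + offset + 0.5) * bin_width
    sigma = np.sqrt(sum_sq[keep] / counts[keep])
    return centers[keep], sigma, counts[keep]
```

Pooling the pairs of a group amounts to concatenating their per-pair arrays (for example with `np.concatenate`) before the call. Running the function on two groups with the same bin width allows a bin-by-bin comparison of the kind made between $\sigma_3$ and $\sigma_2$.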
For a given $\theta_0$ one can define the rescaled noise
$\delta\theta' \equiv \delta\theta / \sigma_k(\theta_0)$ and obtain the
distribution function for $\delta\theta'$: $Q_k(\delta\theta' | \theta_0)$.
In doing so, one finds that, except for very small values of $\theta_0$, the
$Q_k(\delta\theta' | \theta_0)$ collapse onto a single curve
$\Phi(\delta\theta')$ independent of $\theta_0$ and $k$, as shown in figure
4.3b (for $k = 2$ only). Equivalently, this means that the distribution of
$\delta\theta$ can be well approximated by:

$$P_k(\delta\theta | \theta_0) = \frac{1}{\sigma_k(\theta_0)} \, \Phi\!\left(\frac{\delta\theta}{\sigma_k(\theta_0)}\right) \qquad (2)$$

for $\theta_0 > 1$, which includes more than 90% of the data. The rescaled
distribution function is found to have an exponentially decaying tail, in