Depending on the position of the signature relative to its asso-
ciated gene, the signature is given a category [13] indicative of
the quality of the association.
RESULT OF AN MPSS RUN AND NOMENCLATURE
The net result of an MPSS run is a list of 17-mer signatures and the
count of beads having that signature. MPSS sequencing is typically
done in replicate. For a given biological sample, loaded beads are taken
in fixed aliquots and independently sequenced k times with the TS and
with the FS protocol (k between 2 and 4). These are called MPSS or sequencing
replicates. All of these sequencing replicates correspond to the same
biological sample.
From the several replicate measurements, a transcripts per million
(tpm) measure for each signature is computed. First, for each signature i,
either the TS or FS data is chosen by selecting the stepper that counted
the most beads for that signature across all available experiments. Since a
signature may resist sequencing by one or the other stepper protocol,
the stepper with the largest count is most likely to be better suited for
measuring that signature. Once the stepper is chosen for each signature,
the values of the k independent sequencing replicates are combined
to give an aggregate tpm value,

$$ t_i \equiv \frac{n_{i1} + \cdots + n_{ik}}{N_1 + \cdots + N_k} \times 10^6 , $$

where the $n_{ij}$'s and the $N_j$'s are the bead counts for the given
signature i and the total number of sequenced beads in each MPSS run,
respectively. If, for a given signature, $n_{ij} = 0$, then the MPSS replicate j
is excluded from both the numerator and the denominator. The justifi-
cation for this will be given later in the discussion of the statistics of
zero measurements. In addition to the aggregate tpm, one may also
define the tpm value obtained from a single replicate measurement as
$t_{ij} \equiv (n_{ij}/N_j) \times 10^6$. As is the case with DNA microarray data,
experimentally observed tpm values can span several orders of magnitude and,
thus, it is again useful to define $q_{ij} \equiv \log_{10} t_{ij}$ and
$q_i \equiv \log_{10} t_i$.
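The tpm computation described in this section can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the tie-breaking rule when both steppers count the same number of beads, and the value returned when every replicate has a zero count are assumptions made only for illustration, and the bead counts in the usage example are invented.

```python
import math

def choose_stepper(ts_counts, fs_counts):
    """Pick the stepper that counted the most beads for one signature,
    summed over all available runs. Ties go to TS (an assumption; the
    text does not say how ties are resolved)."""
    return "TS" if sum(ts_counts) >= sum(fs_counts) else "FS"

def replicate_tpm(n_ij, N_j):
    """Single-replicate tpm: t_ij = (n_ij / N_j) * 10^6."""
    return 1e6 * n_ij / N_j

def aggregate_tpm(counts, totals):
    """Aggregate tpm over k replicates of the chosen stepper.
    Replicates with n_ij == 0 are dropped from both the numerator
    and the denominator, as described in the text."""
    kept = [(n, N) for n, N in zip(counts, totals) if n > 0]
    if not kept:
        # Assumption: report 0.0 when no replicate saw the signature.
        return 0.0
    return 1e6 * sum(n for n, _ in kept) / sum(N for _, N in kept)

def log10_tpm(t):
    """q = log10(t); undefined for t == 0, so None is returned (assumption)."""
    return math.log10(t) if t > 0 else None

# Invented bead counts for one signature in three replicates per stepper.
ts_counts, ts_totals = [12, 0, 15], [1_000_000, 1_200_000, 1_500_000]
fs_counts, fs_totals = [3, 4, 0], [1_100_000, 900_000, 1_000_000]

stepper = choose_stepper(ts_counts, fs_counts)
counts, totals = (ts_counts, ts_totals) if stepper == "TS" else (fs_counts, fs_totals)
t_i = aggregate_tpm(counts, totals)
q_i = log10_tpm(t_i)
```

In this example the TS data are chosen (27 beads against 7), and the second TS replicate is excluded because its count is zero, so the aggregate is 27 beads out of 2,500,000, or about 10.8 tpm.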
DATA SETS USED IN THIS STUDY
Human breast cancer cells. Estrogen receptor-negative BT-20 cell lines
[16] were grown. Two distinct poly(A)+ mRNA samples (A and B) were
collected from plated cells and used to generate two signature/tag
libraries. One of these two libraries was split in two parts and was used
to generate two sets (A1 and A2) of loaded microbeads. The other
library was used to generate one set of loaded microbeads (B). After
loading, each set of beads was independently processed in multiple
MPSS runs (see figure 4.7).
Macrophage samples and data. Plastic adherent monocytes were
isolated from peripheral blood mononuclear cells collected from buffy