Biomedical Engineering Reference
where $m$ is the median of $\{x_i;\ i = 1 \ldots N\}$ and $s$ is the median
absolute deviation, defined as the median of $\{|x_i - m|;\ i = 1 \ldots N\}$.
It can be shown that if $x_i$ is normally distributed, the NZ score is the same as
the commonly used Z score, defined as $Z_i = (x_i - m)/s$, where $m$ is the mean
and $s$ is the standard deviation. The NZ score
was used as an alternative to the Z score as it is resistant to outliers which can
occur frequently in high throughput screening; screen hits are by definition such
outliers.
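
The outlier resistance described above can be illustrated with a minimal sketch. The NZ formula itself is not reproduced in this section, so the code assumes the usual robust form, (x - median) / (1.4826 * MAD), where 1.4826 is the standard constant that makes the scaled MAD estimate the standard deviation for normally distributed data. The well values, the function names, and the constant name are hypothetical and chosen only for illustration.

```python
import statistics

# Assumption: scaling constant that makes the MAD comparable to the
# standard deviation when the data are normally distributed.
MAD_TO_SD = 1.4826


def z_scores(values):
    """Classical Z score: (x - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]


def nz_scores(values, scale=MAD_TO_SD):
    """Robust score (assumed form): (x - median) / (scale * MAD)."""
    m = statistics.median(values)
    mad = statistics.median([abs(x - m) for x in values])
    if mad == 0:
        raise ValueError("MAD is zero; the robust score is undefined")
    return [(x - m) / (scale * mad) for x in values]


# Hypothetical S-phase percentages for eight wells; the last well is a hit.
s_phase_pct = [20.1, 19.5, 20.8, 21.0, 19.9, 20.4, 20.6, 45.0]

z = z_scores(s_phase_pct)
nz = nz_scores(s_phase_pct)

for pct, z_i, nz_i in zip(s_phase_pct, z, nz):
    print(f"{pct:5.1f}  Z = {z_i:6.2f}  NZ = {nz_i:6.2f}")
```

In this made-up plate the hit inflates both the mean and the standard deviation, so its Z score stays below 3, while its NZ score is far larger because the median and the MAD are barely affected by a single extreme well.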
As an example, a positive S-phase NZ score indicates that the percentage of S-
phase cells is higher than usual for the particular well and that treatment has likely
resulted in a delay in S-phase progression. A negative S-phase NZ score indicates
that the percentage of S-phase cells is low, and that treatment has likely resulted
in a block at S-phase entry. Together, the four NZ-score numbers (one for each of
G1, S, G2, M) give the cell cycle profile of a particular treatment.
9.8 Factor Analysis
A typical HCS experiment might generate gigabytes of numbers extracted from the
images describing the amount and location of biomolecules on a cell-to-cell basis.
Most of these numbers have no obvious biological meaning; for example, while
the amount of DNA per nucleus has obvious significance, the significance of other
nuclear measures, such as DNA texture or nuclear ellipticity, is much less clear. This leads
biologists to ignore the nonobvious measurements, even though they may report
usefully on compound activities. A standard method in other fields for analyzing
large, multidimensional datasets is factor analysis. It allows a large data-reduction
but retains most of the information content, and quantifies phenotype using data-
derived factors that are biologically interpretable in many cases. For this reason,
factor analysis is highly appropriate to high content imaging, as it seeks to identify
these underlying processes [27].
HCS data are contained in an $n \times m$ matrix $X$, consisting of a set of $n$
image-based features measured on $m$ cells. In mathematical terms, the so-called
Common Factor Model posits that a set of measured random variables $X$ is a linear
function of common factors, $F$, and unique factors, $e$:

$$X = LF + e$$
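
To make the model concrete, the following sketch simulates it for a toy case with three features and a single common factor. It is an illustration under stated assumptions, not the analysis used in the text: the factor is taken to have zero mean and unit variance, the unique factors are taken to be Gaussian and independent of the factor and of each other, and all loadings, standard deviations, and names are hypothetical.

```python
import random

random.seed(0)

# Hypothetical loadings of n = 3 features on k = 1 common factor,
# and hypothetical standard deviations of the unique factors e.
loadings = [0.9, 0.7, 0.5]
unique_sd = [0.3, 0.5, 0.8]
num_cells = 20000

# Simulate X = L F + e, one observation (cell) at a time.
cells = []
for _ in range(num_cells):
    f = random.gauss(0.0, 1.0)  # common factor score for this cell
    cells.append(
        [l * f + random.gauss(0.0, sd) for l, sd in zip(loadings, unique_sd)]
    )


def covariance(a, b):
    """Sample covariance of two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


# Rows are features and columns are cells, matching the layout of X.
features = list(zip(*cells))

cov_01 = covariance(features[0], features[1])
var_0 = covariance(features[0], features[0])

print(f"cov(x0, x1) = {cov_01:.3f}  (loadings predict {0.9 * 0.7:.3f})")
print(f"var(x0)     = {var_0:.3f}  (model predicts {0.9**2 + 0.3**2:.3f})")
```

Under these assumptions the covariance between two features comes only from the shared factor (the product of their loadings), while each feature's variance is the sum of a common part and a unique part.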
In HCS the common factors in F reflect the set of major phenotypic attributes
measured in the assay. The loading matrix L relates the measured variables in
X to F. The matrix e holds the unique factors and comprises the reliable effects
and the random error that is specific to a given variable. Rooted in this model is
the concept that the total variance of X is partitioned into common and specific
components. Therefore, after fitting the factor model and performing the rotations,
we estimate the common attribute F on each of the k factors for each observation
(i.e., cell) using a regression equation derived from the factor model (Figure 9.8).
This is accomplished using the score procedure in SAS [28]. The complete factor
structure and underlying phenotypic traits are outlined in Figure 9.8(d). As we
can see in this case, the top six common factors have significant value and each
contains a specific interpretable attribute.
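
The variance partition and the regression-based score estimate mentioned above can be written out as follows. This is a sketch under the standard assumptions of the common factor model, which the text does not state explicitly: the factors have zero mean and unit variance, the unique factors are uncorrelated with the common factors and with each other, and $\Psi$ denotes their diagonal covariance matrix. The symbols $\Sigma$, $\Psi$, $\Phi$, $\mu$, $\ell_{jq}$, and $\psi_j$ are introduced here for illustration, and the estimator shown is one common form of the regression method, not necessarily the exact computation performed by the software cited in the text.

```latex
% Covariance of the measured variables, orthogonal factors:
\Sigma = \operatorname{Cov}(X) = L L^{\mathsf{T}} + \Psi

% Variance of feature j splits into a common part (communality)
% and a specific part (uniqueness):
\operatorname{Var}(x_j) = \sum_{q=1}^{k} \ell_{jq}^{2} + \psi_j

% Regression estimate of the factor scores for one observation x:
\hat{F} = L^{\mathsf{T}} \Sigma^{-1} (x - \mu)

% If an oblique rotation leaves the factors correlated with
% correlation matrix \Phi, the corresponding forms are:
\Sigma = L \Phi L^{\mathsf{T}} + \Psi, \qquad
\hat{F} = \Phi L^{\mathsf{T}} \Sigma^{-1} (x - \mu)
```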
 