Biomedical Engineering Reference
In-Depth Information
interpret flow cytometry data (flowCore) [152]. flowCore consists of tools or packages
that provide a workflow for rigorous and robust analysis. Critical steps include data
handling, import, and normalization (flowCore); visualization built on data structures
that support applying the same operations to a collection of samples
(flowViz) [153]; and identification of cell populations, including outlier
identification and data transformation (flowClust). flowClust also offers various
tools to summarize and visualize clustering results [154].
Importantly, the Bioconductor environment provides a developer framework for
incorporating tools and models that are also open source; hence, the community is
able to both contribute to and benefit from the evolving packages and to keep
pace with ongoing technological advances.
The current emphasis in data analysis is on the use of computational models to
identify subpopulations and enable unsupervised analysis. Many groups are addressing
the development of automated, high-dimensional analytical methods. However,
flow cytometry data are noisy, asymmetric, and often contain outliers, and these issues
pose the primary hurdles to truly automated analysis. Recent studies have explored
Gaussian mixture modeling to select an “optimal” number of components in the
model and partition data sets [155]. A direct multivariate finite mixture modeling
approach has been described, using skew and heavy-tailed distributions, to address
the complexities of flow cytometric analysis. The approach permits the handling of
high-dimensional cytometric data with the capacity to detect rare populations. The
group at the Broad Institute of MIT and Harvard has demonstrated that the approach
can model outliers and skew and can perform the critical task of matching
cell populations across samples to facilitate downstream analysis [156]. These
advances in modeling flow cytometry data, operating within a solid data processing
pipeline and supported by fast processing speeds, provide the means of achieving
a true discovery environment for high-throughput flow cytometry data.
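The idea of choosing an "optimal" number of mixture components can be illustrated with a minimal sketch. It assumes a single fluorescence channel, plain Gaussian components fitted by expectation-maximization, and the Bayesian information criterion (BIC) as the selection rule. It is not the method of [155] or the skew, heavy-tailed model of [156]; the function names (fit_gmm_1d, bic, select_n_components) and the synthetic data are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    # Floor on the variances so a component cannot collapse onto one event.
    var_floor = max(1e-3 * grand_var, 1e-12)
    # Deterministic start: means at evenly spaced quantiles of the data.
    ordered = sorted(data)
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(grand_var, var_floor)] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each event.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)

    log_lik = 0.0
    for x in data:
        mix = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        log_lik += math.log(max(mix, 1e-300))
    return weights, means, variances, log_lik


def bic(log_lik, k, n):
    """BIC for a 1-D mixture: k means + k variances + (k - 1) free weights."""
    return -2.0 * log_lik + (3 * k - 1) * math.log(n)


def select_n_components(data, max_k=4):
    """Fit mixtures with 1..max_k components; return (best_k, {k: BIC})."""
    scores = {}
    for k in range(1, max_k + 1):
        _, _, _, log_lik = fit_gmm_1d(data, k)
        scores[k] = bic(log_lik, k, len(data))
    best_k = min(scores, key=scores.get)
    return best_k, scores


if __name__ == "__main__":
    # Synthetic stand-in for one channel with two cell populations.
    rng = random.Random(1)
    events = [rng.gauss(0.0, 1.0) for _ in range(200)]
    events += [rng.gauss(8.0, 1.0) for _ in range(200)]
    best, table = select_n_components(events, max_k=3)
    print("components chosen by BIC:", best)
```

The BIC penalizes each extra component by its number of free parameters, so a larger model is chosen only when it improves the fit substantially. Real cytometry data are multivariate, skewed, and contaminated by outliers, which is why the studies cited above move beyond plain Gaussian components.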
2.6.3 Introducing the Need for Data Standards and Formats
The value of data is only as good as its annotation and accessibility: it must be properly
curated and archived in a software- or machine-readable format (see editorial in
Nature Cell Biology [157]). The long-standing and enduring goal of the Flow
Cytometry Data File Standard (FCS) has been to standardize the data output from any
flow cytometry instrument and hence facilitate the development of software for reading
and writing cytometry data in a common format. Listmode files consist of a complete
listing of all events corresponding to all the parameters collected, as specified by the
acquisition settings. This file follows a format specified by the FCS 3.0 standard [158].
Raw listmode data files can be opened or replayed using any program designed for
analysis of flow cytometry data. The Data File Standard for Flow Cytometry, version
FCS 3.1, is scheduled for release in 2010 and includes a mechanism for handling time
series parameters.
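To make the listmode layout concrete, the following is a minimal sketch of reading an FCS file held in memory. It assumes the commonly described FCS 3.0 layout: a 58-byte ASCII header carrying the version string and right-justified byte offsets for the TEXT, DATA, and ANALYSIS segments, followed by a TEXT segment of delimiter-separated keyword/value pairs. The function names are hypothetical, the demonstration file is synthetic, and the sketch handles only 32-bit floating-point listmode data; it ignores escaped delimiters and files whose offsets are stored in the TEXT segment instead of the header.

```python
import struct


def parse_fcs_header(raw):
    """Read the version string and segment offsets from the fixed header."""
    version = raw[0:6].decode("ascii")
    if not version.startswith("FCS"):
        raise ValueError("not an FCS file")

    def offset(start):
        # Each offset is an 8-byte, space-padded ASCII integer.
        field = raw[start:start + 8].decode("ascii").strip()
        return int(field) if field else 0

    return {
        "version": version,
        "text": (offset(10), offset(18)),
        "data": (offset(26), offset(34)),
        "analysis": (offset(42), offset(50)),
    }


def parse_text_segment(raw, start, end):
    """Split the TEXT segment into a keyword -> value dictionary."""
    segment = raw[start:end + 1].decode("ascii")  # end offset is inclusive
    delimiter = segment[0]  # the first character defines the delimiter
    fields = segment[1:].split(delimiter)
    if fields and fields[-1] == "":
        fields.pop()  # drop the empty field after the trailing delimiter
    return {fields[i].upper(): fields[i + 1] for i in range(0, len(fields) - 1, 2)}


def read_float_events(raw, header, keywords):
    """Decode listmode events stored as 32-bit floats ($DATATYPE = F)."""
    if keywords.get("$DATATYPE") != "F" or keywords.get("$MODE", "L") != "L":
        raise ValueError("sketch supports float listmode data only")
    n_par = int(keywords["$PAR"])     # parameters per event
    n_events = int(keywords["$TOT"])  # total number of events
    endian = "<" if keywords["$BYTEORD"].strip() == "1,2,3,4" else ">"
    start, _ = header["data"]
    values = struct.unpack_from(f"{endian}{n_par * n_events}f", raw, start)
    # One tuple per event, one value per parameter.
    return [values[i * n_par:(i + 1) * n_par] for i in range(n_events)]


def make_demo_fcs():
    """Build a tiny synthetic file (3 events, 2 parameters) for illustration."""
    text = ("/$PAR/2/$TOT/3/$DATATYPE/F/$MODE/L/$BYTEORD/1,2,3,4"
            "/$P1N/FSC-A/$P2N/SSC-A/").encode("ascii")
    data = struct.pack("<6f", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
    text_start = 58
    text_end = text_start + len(text) - 1
    data_start = text_end + 1
    data_end = data_start + len(data) - 1
    offsets = (text_start, text_end, data_start, data_end, 0, 0)
    header = b"FCS3.0" + b"    " + b"".join(str(v).rjust(8).encode("ascii") for v in offsets)
    return header + text + data


if __name__ == "__main__":
    raw = make_demo_fcs()
    header = parse_fcs_header(raw)
    keywords = parse_text_segment(raw, *header["text"])
    for event in read_float_events(raw, header, keywords):
        print(dict(zip((keywords["$P1N"], keywords["$P2N"]), event)))
```

Because every event stores one value per parameter in a fixed order, the DATA segment can be decoded from just a few keywords in the TEXT segment, which is what allows any compliant program to open or replay a listmode file.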
However, in recent years, data handling approaches developed for genomic and
proteomic studies have been applied to flow cytometry, supported by data
standards and bioinformatics tools that enable robust management and mining of