Fig. 6.6 Basic microarray data analysis workflow
6.2.3 Variable Microarray Data Analysis Pipeline
Figure 6.7 shows a variable workflow model that gives an impression of how
the SIBs in the library can be combined into different microarray analysis
pipelines. Similar to the variable multiple sequence alignment workflow
discussed in Section 3.2, the SIBs in the model are pre-configured to be readily
executable, and the currently intended analysis steps can be included simply
by redirecting branches. The boxes in the figure represent principal steps of
microarray data analysis workflows, and contain different (combinations of)
SIBs that realize the corresponding tasks:
1. In the (naturally mandatory) input data loading step, the microarray raw
data and the corresponding meta-data are loaded. The user can select
whether one of the benchmark data sets that are readily available on the
jETI server is used, or input data from the local file system.
2. Preprocessing is also mandatory. Here, either AffyExpressPreprocess can
be used to create an ExpressionSet object from the input data, or one of
RMA, GCRMA, Threestep and Express, followed by CreateExpressionSet.
3. It is recommended to apply one or more filtering steps to the expression
values before applying further analyses.
4. Optionally, the expression values can be visualized in an HTML table
(created by Annaffy aafTableInt and then stored to the local file system).
5. Statistical analysis, for instance in order to identify the top differentially
expressed genes, is then again considered mandatory. Optionally, a
textual representation of the results can be written and stored to the
local file system.
6. Finally, it is useful to retrieve further information about the top
differentially expressed genes, for instance probe annotations (via Annaffy
aafTableAnn) or related PubMed articles (via GetPubMedAbstracts), and
store the resulting (HTML) files.
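The mandatory and optional steps listed above can be sketched as a small
data model. This is only an illustration in Python, not part of the jABC or
Bio-jETI tooling: the Step class, the STEPS table and build_pipeline are
hypothetical names, and only the step order, the mandatory/optional status
and the SIB names are taken from the enumeration. For steps where the text
names no specific SIBs (filtering, statistical analysis), any label is
accepted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One box of the variable workflow model (hypothetical structure)."""
    name: str
    mandatory: bool
    options: tuple  # choices named in the text; empty means "not enumerated"


# Transcribed from the numbered list; the representation itself is an assumption.
STEPS = (
    Step("load", True, ("benchmark data set", "local file system")),
    Step("preprocess", True,
         ("AffyExpressPreprocess", "RMA", "GCRMA", "Threestep", "Express")),
    Step("filter", False, ()),        # recommended, but not mandatory
    Step("visualize", False, ("Annaffy aafTableInt",)),
    Step("statistics", True, ()),
    Step("annotate", False, ("Annaffy aafTableAnn", "GetPubMedAbstracts")),
)


def build_pipeline(choices):
    """Return the selected steps in workflow order as (step, option) pairs.

    Raises ValueError if a mandatory step is missing, if a step name is
    unknown, or if an option is not among the enumerated ones.
    """
    unknown = set(choices) - {step.name for step in STEPS}
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    pipeline = []
    for step in STEPS:
        if step.name not in choices:
            if step.mandatory:
                raise ValueError(f"mandatory step missing: {step.name}")
            continue  # optional step skipped, like a redirected branch
        option = choices[step.name]
        if step.options and option not in step.options:
            raise ValueError(f"{option!r} is not an option for {step.name}")
        pipeline.append((step.name, option))
    return pipeline


minimal = build_pipeline({
    "load": "benchmark data set",
    "preprocess": "RMA",
    "statistics": "differential expression",
})
print(minimal)
```

Leaving out an optional entry corresponds to redirecting a branch past that
box, whereas leaving out a mandatory one is rejected.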
The SIBs in the data loading, preprocessing, filtering, statistical analysis
and annotation boxes that are highlighted by the light-gray box in the figure
correspond to the basic analysis workflow described above. The workflow that
is defined by the branches as shown in Figure 6.7, in contrast, reads the
input data from the local file system, uses Threestep for the preprocessing
and GenefilterKOverA for expression value filtering. Then it creates and
stores an HTML table from the expression values, before the differential
expression analysis is carried out. Finally, an annotation table is created
and stored.
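To make the filtering step more concrete, the following is a minimal sketch
of a k-over-A filter in Python. It assumes that GenefilterKOverA applies the
criterion of the kOverA filter from the Bioconductor genefilter package, that
is, a gene is kept if at least k of its expression values are larger than A.
The function names and the example values are purely illustrative and not
taken from the SIB library.

```python
def k_over_a(k, a):
    """Build a predicate that is true if at least k values are larger than a."""
    def keep(values):
        return sum(1 for value in values if value > a) >= k
    return keep


def filter_genes(expression, keep):
    """Keep only the probes whose expression values satisfy the predicate."""
    return {probe: values for probe, values in expression.items() if keep(values)}


# Made-up expression values for three probes across three samples.
expression = {
    "probe_1": [120.0, 80.0, 150.0],
    "probe_2": [20.0, 30.0, 110.0],
    "probe_3": [5.0, 8.0, 9.0],
}

# Keep probes with at least 2 samples above an expression value of 100.
kept = filter_genes(expression, k_over_a(2, 100.0))
print(sorted(kept))  # ['probe_1']
```

Building the predicate separately from applying it mirrors the filter
function style of genefilter, so that other criteria could be substituted
without changing filter_genes.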
 