Transcriptional Profiling by Oligonucleotide Array
Transcriptional profiling by Affymetrix microarray was performed on coa and
wild-type pistils. Particular emphasis was given to the low-level analysis of the
microarray data, because the low fold-change cut-off used for the embryo sac
dataset could introduce a large number of false positives. We chose to use
three independent statistical packages (dCHIP, gcRMA and GeneSpring), with
dCHIP being the most stringent and GeneSpring the least stringent.
For dCHIP analysis, only those genes called 'present' within replicate arrays, with
a variation of 0 < median (standard deviation/mean) < 0.5, were retained for
downstream analysis. With P < 0.1 and a fold-change cut-off of 1.28 for
differential expression, the predicted median FDR ranged from 1% (spl
dataset) to 3% (coa dataset) in the dCHIP analysis. The dilution of gametophytic
cells in an excess of sporophytic tissues was higher in coa samples than in spl
samples (discussed in Results, above), which may explain the higher FDR.
In such cases, the standard errors of the signal averages can guide
manual omission of false positives. In the analysis using gcRMA,
pre-processed signal values were statistically analyzed using an empirical Bayesian
approach, and the FDR was calculated for each gene using the options
implemented in the Bioconductor software version 2.3.0 [79]. Only those genes with an
FDR below 0.05 were considered to be differentially expressed. Manual omission
of false-positive findings is possible in this type of analysis if the standard error
estimates of the mean RMA values (signal) and the absolute FDR values are
used as indicators of false discovery. The sporophytic datasets did not pose
such problems because the fold-change cut-off was set to twofold as a stringent
baseline, in addition to the analysis using three statistical methods.
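The dCHIP-style filtering criteria described above can be illustrated with a minimal sketch. It is not the dCHIP implementation: the function names, the two-group layout (wild type versus mutant) and the symmetric fold-change definition are assumptions made for illustration, and the P value is taken as a precomputed input because the test used to derive it is not specified here.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(signals):
    """Standard deviation divided by mean for one group of replicate signals.

    Assumes strictly positive signal values and at least two replicates.
    """
    return stdev(signals) / mean(signals)


def is_candidate(wild_type, mutant, p_value,
                 max_cv=0.5, max_p=0.1, min_fold=1.28):
    """Apply the three thresholds quoted in the text to a single gene.

    wild_type, mutant: replicate signal values for each genotype.
    p_value: hypothetical precomputed P value for the comparison.
    """
    # Replicate variation: 0 < median(standard deviation / mean) < 0.5,
    # with the median taken over the replicate groups (an assumption).
    cv_median = median([coefficient_of_variation(wild_type),
                        coefficient_of_variation(mutant)])
    if not 0 < cv_median < max_cv:
        return False

    # Significance threshold: P < 0.1.
    if p_value >= max_p:
        return False

    # Fold change in either direction (up- or down-regulation).
    wt_mean, mutant_mean = mean(wild_type), mean(mutant)
    fold = max(wt_mean, mutant_mean) / min(wt_mean, mutant_mean)
    return fold >= min_fold


# Example call with made-up signal values for one gene.
print(is_candidate([100, 110, 105], [70, 75, 72], p_value=0.05))
```

Treating the fold change symmetrically means that a 1.28-fold decrease and a 1.28-fold increase are handled the same way; a directional analysis would compare the ratio against the cut-off and its reciprocal separately.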
Bioinformatics Analyses
The candidate genes were functionally classified according to Gene Ontology
data from TAIR or published evidence where appropriate. Annotations,
mainly those of transcription factors, were refined using the Arabidopsis Gene
Regulatory Information Server [80]. Secreted proteins were selected by
protein sequence analysis using TargetP, retaining only predictions with the
top two of five reliability scores [81]. A total of 32,349 maize and wheat EST sequences
extracted from libraries specific for the embryo sac, egg, central cell, and early
endosperm were obtained from various sources. The pools of EST sequences
were converted to local BLASTable databases using NCBI software [82]. A
Perl script was written to map the A. thaliana female gametophyte
transcriptome data to the EST datasets. An EST sequence is considered
similar to an Arabidopsis protein if it matches at an e-value cutoff threshold