7.2.1. Prototype-based coexpression module analysis
To identify coexpression modules associated with specific biological
processes, we have devised a supervised approach where a small number
of “prototype” genes, each representing an important disease process,
are selected based on biological knowledge about breast cancer [53] and
previous results of expression studies. Each prototype forms the core
of a coexpression module. Expression values of these prototype genes
are then used simultaneously as explanatory variables in a regression
model to group other genes according to their coexpression with the
respective prototype. Each module is then built by adding the genes whose
expression is most strongly associated with that of its prototype.
For breast cancer, we have identified five key processes: estrogen
receptor signaling, ERBB2 amplification, proliferation, invasion, and
immune response. We represent these processes with the prototype genes
ESR1, ERBB2, AURKA (aurora-related kinase 1; also known as STK6 or
STK15), PLAU (urokinase-type plasminogen activator; uPA), and STAT1
(signal transducer and activator of transcription 1), respectively. Other
choices of well-known genes for the prototypes do not affect the overall
conclusions.
To identify genes associated with each prototype, we use the following
meta-analysis scheme:
(1) within a dataset and for each gene separately, fit a multiple
regression which models expression as a function of prototype expression;
(2) carry out a t-test for each coefficient, yielding an approximate
z-score;
(3) combine z-scores across studies as above using the inverse normal
method; and
(4) select for each prototype the genes most strongly associated.
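Steps (1) to (4) can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `top_k` cutoff, and the default equal study weights are hypothetical, the t statistic of each coefficient is taken directly as the approximate z-score (reasonable only when each dataset has many samples), and "most strongly associated" is read as the largest positive combined z-score.

```python
import numpy as np


def prototype_z_scores(expr, proto_idx):
    """Steps (1)-(2): regress every gene on the prototype genes in one dataset.

    expr      : (n_samples, n_genes) array of log2 expression values
    proto_idx : column indices of the prototype genes
    Returns an (n_genes, n_prototypes) array of coefficient t statistics,
    used here as approximate z-scores (assumes a large sample size).
    """
    n = expr.shape[0]
    # Design matrix: intercept plus all prototypes, fitted simultaneously.
    X = np.column_stack([np.ones(n), expr[:, proto_idx]])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ expr                    # (p, n_genes)
    resid = expr - X @ beta
    dof = n - X.shape[1]
    sigma2 = (resid ** 2).sum(axis=0) / dof        # residual variance per gene
    se = np.sqrt(np.outer(np.diag(xtx_inv), sigma2))
    with np.errstate(divide="ignore", invalid="ignore"):
        t = beta / se
    z = t[1:].T                                    # drop the intercept row
    z[proto_idx, :] = np.nan                       # prototypes are not candidates
    return z


def combine_inverse_normal(z_list, weights=None):
    """Step (3): weighted inverse normal (Stouffer) combination across studies.

    Equal weights are an assumption; the weighting used in the text is
    defined elsewhere and is not reproduced here.
    """
    z = np.stack(z_list)                           # (n_studies, n_genes, n_protos)
    w = np.ones(len(z_list)) if weights is None else np.asarray(weights, float)
    return np.tensordot(w, z, axes=1) / np.sqrt((w ** 2).sum())


def select_modules(z_comb, gene_names, proto_names, top_k=50):
    """Step (4): for each prototype, keep the top_k genes by combined z-score."""
    modules = {}
    for j, proto in enumerate(proto_names):
        order = np.argsort(-z_comb[:, j])          # NaN entries sort last
        modules[proto] = [gene_names[i] for i in order[:top_k]]
    return modules
```

Because all prototypes enter the same regression, each coefficient measures association with one prototype after adjusting for the others, so a gene correlated with several prototypes is credited mainly to the one that explains it best.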
7.2.2. Model for identifying coexpression modules
The expression levels of the prototype genes on the log2 scale are used as
explanatory variables in a multiple regression with Gaussian error, using