Attribute      Type          Description
NAME           Alphanumeric  Unique identifier. Denomination of the microarray
                             sample.
#Age           Numeric       Age of the patient.
#Sex           Enumeration   Possible values are: male, female.
#FAB/WHO       Alphanumeric  The FAB classification is a morphological
                             characterization of leukemia. The WHO classification
                             incorporates the results of chromosome and genetic
                             research developed during the past 25 years after the
                             FAB classification.
#FISH studies  Alphanumeric  FISH studies are used to delineate complex chromosome
                             rearrangements, diagnose microdeletion syndromes, or
                             demonstrate the presence of molecular rearrangements
                             characteristic of certain hematologic malignancies.
Gene name 1    Alphanumeric  Human gene name or identifier.
Gene value 1   Numeric       Microarray gene expression value for the associated
                             gene identifier.
…              …             The number of gene name-value pairs depends on the
                             microarray type (Affymetrix HG-U133A/B/Plus, etc.).
Gene name n    Alphanumeric  Human gene name or identifier.
Gene value n   Numeric       Microarray gene expression value for the associated
                             gene identifier.
Class          Alphanumeric  Type of disease.
Table 2. Internal representation of a microarray sample in the geneCBR system (the symbol
'#' represents an optional feature)
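As a rough illustration of the record layout in Table 2, the following Python sketch models one sample with the optional ('#') attributes defaulting to None. The class name, field names, and example values are hypothetical and are not taken from the geneCBR implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MicroarraySample:
    """One case, mirroring the rows of Table 2 (all names are illustrative)."""
    name: str                            # NAME: unique identifier of the sample
    genes: Dict[str, float]              # gene name -> expression value (n pairs)
    disease_class: str                   # Class: type of disease
    age: Optional[float] = None          # #Age (optional)
    sex: Optional[str] = None            # #Sex (optional): "male" or "female"
    fab_who: Optional[str] = None        # #FAB/WHO (optional)
    fish_studies: Optional[str] = None   # #FISH studies (optional)

    def __post_init__(self) -> None:
        # Enforce the enumeration given in the table for #Sex.
        if self.sex is not None and self.sex not in ("male", "female"):
            raise ValueError(f"sex must be 'male' or 'female', got {self.sex!r}")


# Made-up identifier, gene names, and values, for illustration only.
sample = MicroarraySample(
    name="sample-001",
    genes={"GENE_A": 812.4, "GENE_B": 95.0},
    disease_class="class-1",
)
```

The number of entries in genes would follow the microarray type, as noted in the table.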
During the retrieval stage, the original case vectors are transformed into fuzzy microarray
descriptors (FMDs). Each FMD is a comprehensible descriptor of the sample in terms of a
linguistic label for each gene expression level (central part of Figure 5). This transformation
is carried out by means of an accurate fuzzy discretization process (Díaz et al. 2006). Based
on the FMD representation created from the case base, a set of fuzzy patterns (FPs) is
constructed that represents the main characteristics of the a priori known classes (top-left
square in Figure 5). Each class in the system is then represented by an FP that holds the fuzzy
codification of gene expression levels for those genes that were flagged as relevant for this
class. Several FPs are generated from the data in a supervised way, each one representing a
group of FMDs for a pathological (or generic) species.
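The transformation from expression values to FMDs, and from FMDs to a class-level FP, can be sketched as follows. This is only a minimal illustration under stated assumptions: three linguistic labels (LOW, MEDIUM, HIGH), piecewise-linear membership functions anchored at three per-gene points, and a gene being kept in a pattern when one label covers at least a given share of the class's FMDs. The function names, the anchors, and the min_share threshold are hypothetical; the discretization described by Díaz et al. (2006) may differ in its details.

```python
from collections import Counter
from typing import Dict, List, Tuple

LABELS = ("LOW", "MEDIUM", "HIGH")


def memberships(x: float, lo: float, mid: float, hi: float) -> Dict[str, float]:
    """Piecewise-linear membership degrees of x (assumes lo < mid < hi)."""
    if x <= lo:
        return {"LOW": 1.0, "MEDIUM": 0.0, "HIGH": 0.0}
    if x >= hi:
        return {"LOW": 0.0, "MEDIUM": 0.0, "HIGH": 1.0}
    if x <= mid:
        m = (x - lo) / (mid - lo)
        return {"LOW": 1.0 - m, "MEDIUM": m, "HIGH": 0.0}
    h = (x - mid) / (hi - mid)
    return {"LOW": 0.0, "MEDIUM": 1.0 - h, "HIGH": h}


def to_fmd(sample: Dict[str, float],
           anchors: Dict[str, Tuple[float, float, float]]) -> Dict[str, str]:
    """Map each gene to its label of highest membership (ties go to the earlier label)."""
    fmd = {}
    for gene, value in sample.items():
        degrees = memberships(value, *anchors[gene])
        fmd[gene] = max(LABELS, key=lambda lab: degrees[lab])
    return fmd


def fuzzy_pattern(fmds: List[Dict[str, str]], min_share: float = 0.75) -> Dict[str, str]:
    """Keep genes whose most common label covers at least min_share of the FMDs."""
    pattern = {}
    for gene in fmds[0]:
        label, count = Counter(f[gene] for f in fmds).most_common(1)[0]
        if count / len(fmds) >= min_share:
            pattern[gene] = label
    return pattern


# Made-up anchors and expression values for a single class.
anchors = {"g1": (0.0, 50.0, 100.0), "g2": (0.0, 50.0, 100.0)}
class_a = [{"g1": 5.0, "g2": 90.0}, {"g1": 10.0, "g2": 40.0}, {"g1": 20.0, "g2": 95.0}]
fmds_a = [to_fmd(s, anchors) for s in class_a]
fp_a = fuzzy_pattern(fmds_a)
```

In this toy data, g1 is LOW in every sample and therefore enters the pattern, while g2 is HIGH in only two of three samples and is left out at the 0.75 threshold.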
The retrieval stage in the geneCBR system uses the FPs in order to select the most
representative genes given a new patient. This phase can be thought of as a gene selection
step, in which the aim is to retrieve the list of genes that might be most informative given a
new sample to classify. Since it is highly unlikely that all the genes have significant
information related to cancer classification and the dimensionality would be too great if all
the genes were used, it is necessary to explore an efficient way to obtain the most suitable
group of genes. To make this selection, the geneCBR system selects those fuzzy
patterns from its case base that are nearest to the new case. Then, for each
of the selected FPs, the geneCBR system computes its associated discriminant fuzzy
pattern (DFP), a pattern that includes only the genes needed to discriminate the new
instance from the other classes. Finally, the genes selected for the new case are
obtained by taking the union of the genes belonging to the DFPs considered.
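The three retrieval steps described above (pick the nearest FPs, reduce each to its DFP, join the genes) can be sketched as below. This is an illustrative sketch, not the geneCBR code: the similarity measure (share of matching labels), the choice of the k nearest patterns, and the rule that a gene is discriminant when at least one other pattern assigns it a different label are all assumptions, as are the function names and the toy data.

```python
from typing import Dict, List

Pattern = Dict[str, str]  # gene name -> linguistic label


def similarity(fmd: Pattern, fp: Pattern) -> float:
    """Share of the pattern's genes whose label matches the new case's FMD."""
    if not fp:
        return 0.0
    return sum(fmd.get(g) == lab for g, lab in fp.items()) / len(fp)


def nearest_patterns(fmd: Pattern, fps: Dict[str, Pattern], k: int = 2) -> List[str]:
    """Names of the k class patterns most similar to the new case."""
    return sorted(fps, key=lambda c: similarity(fmd, fps[c]), reverse=True)[:k]


def discriminant_pattern(cls: str, fps: Dict[str, Pattern]) -> Pattern:
    """Genes of fps[cls] to which at least one other pattern assigns a different label."""
    others = [fp for name, fp in fps.items() if name != cls]
    return {
        g: lab
        for g, lab in fps[cls].items()
        if any(g in other and other[g] != lab for other in others)
    }


def select_genes(fmd: Pattern, fps: Dict[str, Pattern], k: int = 2) -> List[str]:
    """Union of the genes in the DFPs of the k nearest patterns."""
    selected = set()
    for cls in nearest_patterns(fmd, fps, k):
        selected.update(discriminant_pattern(cls, fps))
    return sorted(selected)


# Made-up class patterns and a made-up new case.
fps = {
    "A": {"g1": "LOW", "g2": "HIGH", "g3": "MEDIUM"},
    "B": {"g1": "HIGH", "g2": "HIGH", "g4": "LOW"},
    "C": {"g1": "LOW", "g3": "HIGH", "g4": "HIGH"},
}
new_case = {"g1": "LOW", "g2": "HIGH", "g3": "MEDIUM", "g4": "HIGH"}
genes = select_genes(new_case, fps)
```

Here g2 is dropped because no other pattern assigns it a label different from HIGH, so under this assumed rule it does not help to separate the classes.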