goals and practices of this project parallel those of the previous two MAQC projects (i.e.,
MAQC-I and -II) with the overall aim of establishing a set of recommendations for applying
NGS technologies in the areas of clinical research, patient care and safety evaluations includ-
ing toxicogenomics.
6.6 PUBLICLY AVAILABLE TOXICOGENOMIC DATABASES
With the wealth of genomic data generated by microarray experiments, investigators
quickly realized that databases and analytical tools were essential to manage the data
effectively and condense it into a more usable form. Beyond the challenges of analyzing
and interpreting large databases, there is consensus in the scientific community on the
need for a predictive toxicogenomic database [213-215]. Building
on the momentum gained from leveraging databases and computational algorithms for
genome sequencing efforts, engineers, statisticians, mathematicians, and computer scientists
have developed analytical tools and shared resources for microarray gene expression data.
These data warehouses provide a means for the scientific community to publish and share
data from large-scale experiments in order to advance understanding of biological systems.
The data repositories would also serve as a resource for data mining and discovery of expres-
sion patterns common to certain experimental conditions, phenotypes and diseases. Such
repositories could also serve the regulatory community as a body of knowledge that could be
compared with toxicogenomics data submitted as part of the compound registration process.
Importantly, some public repositories also promote international standards in
data organization and nomenclature.
Although several reports have described software for managing genomics/transcript
profiling data at the local or laboratory level, there are compelling reasons for the establishment
of public databases that house not only such transcript profiling data but also the corre-
sponding classical toxicological endpoints. The utility of such a public toxicogenomics data
repository largely depends on the proper functional structuring of the data in a relational
schema that allows efficient extraction of relevant information. Data should undergo
rigorous curation, and provenance and experimental metadata should be duly incorporated.
Ideally, tools will be integrated for quality assurance (QA), annotation, flexible query, graph-
ical display, and a broad array of statistical, pattern recognition, and machine learning ana-
lytics for interpretation and model construction. Generally, the value of the repository will
expand enormously as the types and numbers of drugs and chemicals as well as the number
of associated data types and endpoints increase. More valuable still will be gene expression
data collected alongside corresponding conventional toxicological endpoints such as organ
weight, clinical chemistry, hematology and histopathology. A high quality database and
robust software with appropriate algorithms for the comparison of complex gene expression
fingerprints are vital for the interpretation and utilization of toxicogenomic data. When
expression signatures are combined with conventional toxicology phenotypes and validated,
they can be used for predictive and mechanistic toxicity studies.
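As a rough illustration of the relational structuring discussed above, the following minimal sketch uses Python's built-in sqlite3 module to link transcript profiling data to classical toxicological endpoints through a shared sample record. All table names, column names, the gene symbol GENE_X, the compound name, and the numeric values are hypothetical toy entries chosen for illustration; they are not taken from any of the databases reviewed in this chapter.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical toxicogenomics schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE compound (
            compound_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL UNIQUE
        );
        -- One row per treated sample: carries the experimental metadata.
        CREATE TABLE sample (
            sample_id      INTEGER PRIMARY KEY,
            compound_id    INTEGER NOT NULL REFERENCES compound(compound_id),
            dose_mg_per_kg REAL NOT NULL,
            time_h         REAL NOT NULL,
            tissue         TEXT NOT NULL
        );
        -- Transcript profiling measurements for each sample.
        CREATE TABLE expression (
            sample_id   INTEGER NOT NULL REFERENCES sample(sample_id),
            gene_symbol TEXT NOT NULL,
            log2_ratio  REAL NOT NULL,
            PRIMARY KEY (sample_id, gene_symbol)
        );
        -- Conventional endpoints (e.g., clinical chemistry, organ weight).
        CREATE TABLE endpoint (
            sample_id     INTEGER NOT NULL REFERENCES sample(sample_id),
            endpoint_name TEXT NOT NULL,
            value         REAL NOT NULL,
            unit          TEXT NOT NULL,
            PRIMARY KEY (sample_id, endpoint_name)
        );
    """)
    # Toy rows, invented purely to make the query below return something.
    conn.execute("INSERT INTO compound VALUES (1, 'compound_A')")
    conn.executemany(
        "INSERT INTO sample VALUES (?, ?, ?, ?, ?)",
        [(1, 1, 0.0, 24.0, "liver"), (2, 1, 100.0, 24.0, "liver")],
    )
    conn.executemany(
        "INSERT INTO expression VALUES (?, ?, ?)",
        [(1, "GENE_X", 0.1), (2, "GENE_X", 2.3)],
    )
    conn.executemany(
        "INSERT INTO endpoint VALUES (?, ?, ?, ?)",
        [(1, "ALT", 40.0, "U/L"), (2, "ALT", 310.0, "U/L")],
    )
    conn.commit()
    return conn


def expression_with_endpoint(conn, gene_symbol, endpoint_name):
    """Return one gene's expression alongside one endpoint, ordered by dose."""
    return conn.execute(
        """
        SELECT c.name, s.dose_mg_per_kg, e.log2_ratio, p.value, p.unit
        FROM sample s
        JOIN compound   c ON c.compound_id = s.compound_id
        JOIN expression e ON e.sample_id   = s.sample_id
        JOIN endpoint   p ON p.sample_id   = s.sample_id
        WHERE e.gene_symbol = ? AND p.endpoint_name = ?
        ORDER BY s.dose_mg_per_kg
        """,
        (gene_symbol, endpoint_name),
    ).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in expression_with_endpoint(demo, "GENE_X", "ALT"):
        print(row)
```

Keying both the expression and endpoint tables on the same sample record is one possible design choice; it lets a single join retrieve a transcript measurement together with the conventional endpoint recorded for the same sample, dose, and time point.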
Several toxicogenomic databases are currently being built [216-218] and will be briefly
reviewed in this chapter. Summaries of the corresponding experimental designs are given
in Table 6.2.
 