Biomedical Engineering Reference
• What is the best indication for a given compound?
• How should a researcher design a trial based on the experience from
previous internal and public trials?
• How should a researcher stratify a disease based on clinical data?
• Is there support for a target of interest based on clinical data?
Collecting use cases early in the project, and periodically revalidating and
refining them with the stakeholders, is important for a project with longer
timelines.
After collecting the use cases, we had the first prototype deployed with some
basic data in three months. After the successful first demonstration, we used
the agile software development methodology to build iterations and demonstrate
them to business partners for feedback and for defining the next iteration. The
typical cycle time was about 3-5 weeks. The first full deployment of the system
was 12 months after the first demonstration of the prototype. By this time we
had developed a detailed data governance model in collaboration with data
owners, developed a publication strategy and training materials, and loaded 10
trials; basic data mining and analysis workflows were available for biologists
and physicians.
16.5 CONTENT
At the time of writing, the system is one year old and holds more than 30
internal trials, with access to deidentified clinical, laboratory chemistry,
genomics, protein profiling, metabolomics, proteomics, flow cytometry, protein
assay, and single-nucleotide polymorphism (SNP) data at the subject level, and 34
public studies with phenotype and genomics data aligned. Furthermore, subset
analyses (A versus B comparisons or contrasts) of gene expression or protein
profiling data are available for 10 internal studies and more than 9000 public
sets. We have curated more than 100,000 biomarker assertions and we also
loaded almost 100 studies from the Dana-Farber Cancer Institute curated
collection.
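To make the idea of a subset analysis concrete, here is a minimal sketch of an A versus B contrast in Python. It is an illustration only, not the warehouse's actual implementation: the function name, the gene and subject identifiers, and the choice of a log2 ratio of group means as the contrast statistic are all assumptions made for the example.

```python
from math import log2
from statistics import mean


def contrast_log2_fold_change(expression, group_a, group_b):
    """Return, per gene, log2(mean expression in A / mean expression in B).

    `expression` maps gene -> {subject_id -> expression value}.
    The log2 ratio of means is an assumed contrast statistic for this sketch.
    """
    result = {}
    for gene, by_subject in expression.items():
        mean_a = mean(by_subject[s] for s in group_a)
        mean_b = mean(by_subject[s] for s in group_b)
        result[gene] = log2(mean_a / mean_b)
    return result


# Hypothetical toy data: two genes measured in four subjects.
expression = {
    "GENE1": {"S1": 8.0, "S2": 8.0, "S3": 2.0, "S4": 2.0},
    "GENE2": {"S1": 3.0, "S2": 5.0, "S3": 4.0, "S4": 4.0},
}
subset_a = ["S1", "S2"]  # e.g. treated subjects (assumed grouping)
subset_b = ["S3", "S4"]  # e.g. control subjects (assumed grouping)

fold_changes = contrast_log2_fold_change(expression, subset_a, subset_b)
```

In this toy data, GENE1 is four times higher on average in subset A than in subset B, while GENE2 has the same mean in both subsets. A real analysis would normally add a statistical test and multiple-testing correction on top of the fold change.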
The data warehouse also provides integrated access to internally developed
tools, such as an integrative pathway and gene set enrichment analysis tool
called Pictor and a gene index and gene information integration resource
called Hydra, as well as several third-party tools, such as GeneGo's MetaCore
and Ariadne Genomics Pathway Studio.
A set of standard dictionaries, ontologies, and curated metadata provides
the master data backbone of the data warehouse, including gene and protein
names and synonyms from Entrez, gene name mapping vocabulary for
Affymetrix, Illumina, and Agilent gene expression probe set IDs, SNP identifiers,
pathways from the Gene Ontology Consortium (GO) [11], Kyoto Encyclopedia
of Genes and Genomes (KEGG) [12], GeneGo, Ingenuity, Ariadne, and
MSigDB [13], diseases from MeSH [14] and the International Classification