Biomedical Engineering Reference
In-Depth Information
expand in their capabilities and performance, the integration of chemistry
and biology databases is likely to offer even greater opportunities to benefit
the process of drug discovery. Efforts to expand the existing structure-centric
communities for biomedical researchers with key precompetitive information
relevant to drug discovery will improve the accessibility and
discoverability of data. It will be very important to distinguish which
precompetitive data can be of most value so that users are not swamped with
data. We also need discovery tools to filter the data, because an obvious
consequence of making more data available is a potential filtering
problem. New discovery mechanisms and tools will be needed both to
identify the right data and to critique their quality and relevance to a
specific problem. While a natural response is to attempt to reduce or filter the data
that get published, the long-term future must lie in applying the lessons from
the wider Web in building effective search and discovery tools. The transition,
however, is likely to be difficult and manual, and semi-automated data
curation will play a big role in easing that transition. A major limitation of
approaches to capturing public information is that most data will be in
publications and, until the publishers make these data semantically accessible,
they will not be easily mined other than by manual extraction. While there have
been, and continue to be, many efforts to improve the underlying mechanisms
of scientific publishing to make data extraction easier, this is likely to be a
slow process and large quantities of information will remain in the legacy
literature. There is, however, already a considerable amount of data for drugs
on the market that could be extracted from various online databases and that
could be valuable, for example, for developing computational models.
Text-mining tools have already been developed that can be partially successful
in aggregating these data, but it would be preferable if, instead of these data
being harvested from publications and patents, drug companies, researchers, and
health authorities could supply them in a homogeneous standardized
format and in a coordinated fashion. International funding agencies are
presently tendering for the development of systems that could facilitate these
kinds of data sharing, as pharmaceutical companies acknowledge that the cost
burden they must assume to aggregate these data is too high and, since the data
are precompetitive in nature, collaborative efforts across
the life sciences should facilitate data access.
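
As a rough illustration of what supplying data "in a homogeneous standardized
format" could involve, the following minimal Python sketch validates
submissions from different contributors against one shared record layout. The
field names (compound_id, endpoint, value, units, source) and the functions
normalize and aggregate are hypothetical and chosen only for illustration; the
text does not define any particular schema.

```python
from dataclasses import dataclass

# Hypothetical required fields; the source text does not specify a schema.
REQUIRED_FIELDS = ("compound_id", "endpoint", "value", "units", "source")


@dataclass(frozen=True)
class DrugRecord:
    """One measurement in the assumed shared format."""
    compound_id: str
    endpoint: str
    value: float
    units: str
    source: str


def normalize(raw: dict) -> DrugRecord:
    """Validate one raw submission and convert it to the shared format."""
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return DrugRecord(
        compound_id=str(raw["compound_id"]).strip(),
        endpoint=str(raw["endpoint"]).strip().lower(),
        value=float(raw["value"]),  # raises if the value is not numeric
        units=str(raw["units"]).strip(),
        source=str(raw["source"]).strip(),
    )


def aggregate(submissions):
    """Split submissions into accepted records and (raw, reason) rejections."""
    accepted, rejected = [], []
    for raw in submissions:
        try:
            accepted.append(normalize(raw))
        except (ValueError, TypeError) as exc:
            rejected.append((raw, str(exc)))
    return accepted, rejected
```

Keeping rejected submissions together with the reason for rejection, rather
than silently dropping them, is one way to support the manual and
semi-automated curation discussed above.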
Much of the biological data used in drug discovery can be used to
generate computational models within each company, and the same is true of
other data generated at different stages of drug discovery and development.
Computational models reported in publications and in the public domain are
hardly accessible for testing against internal data sets.
Similarly, models are rarely shared between companies or even between
researchers, and little research or effort has been invested in facilitating
this [6]. The following sections describe areas that we think pose challenges
and that would likely also benefit from collaborative computational approaches.