Biomedical Engineering Reference
conferences, science policy and the promotion of chemistry to the public.
The information-handling requirements of the publishing division have
always consumed the largest proportion of the available software
development resources, which have traditionally been dedicated to building
robust, well-defined enterprise systems that deliver published content
to customers. Internal adoption of open source solutions was initiated
with the development of Project Prospect [1], and then extended with the
acquisition of ChemSpider [2]. ChemSpider delivered a platform
incorporating much open source software, staff expertise in
cheminformatics, and new and innovative functionality. The small
but agile in-house development team have combined commercial and
free/open source software tools to develop the platforms necessary to
deliver capabilities to the user community. This chapter will review
the systems that have been developed in-house, what they will deliver to
the community, the challenges encountered in utilizing these tools and
how they have been extended to make them fit for purpose.
3.2 Project Prospect and open ontologies
RSC began exploring the semantic markup of chemistry articles in 2002,
together with a number of other publishers, providing support for a
number of summer student projects at the Centre in Cambridge
University. This work led to an open source Experimental Data
Checker [3], which parsed the text of experimental data paragraphs
and performed validation checks on the extracted and formatted
results. This collaboration led to RSC's involvement, alongside
Nature Publishing Group [4] and the International Union of
Crystallography [5], in the SciBorg project [6]. OSCAR [7] (Open Source
Chemistry Analysis Routines), developed as a result, was then explored as
a means of marking up chemical text and linking concepts and chemicals
with other resources, and was ultimately used as the text mining service
underpinning the award-winning Project Prospect [1] (see Figure 3.1).
It was essential to develop a solution that was both flexible and cost-effective
during this project. Software development was started from scratch,
using standards where possible, but still facing numerous unknowns.
Licensing a commercial product for semantic markup would have been
difficult to justify and also risked both inflexibility and potential
limitations in terms of rapid development. As a result, it was decided to