Biomedical Engineering Reference
In-Depth Information
this hypothesis. SMW offered many of the capabilities we needed
'out of the box' - certainly enough to produce a working prototype.
In addition, the knowledge that this same software underlies
both Wikipedia and our internal corporate wiki suggested that,
should the prototype succeed, developing a production system would be
feasible.
Familiarity: as the majority of scientists within the company were
familiar with MediaWiki-based sites, and many of our specific target
customers had set up their own instances, we should not face too high
a barrier for adopting a new system.
Extensibility: although SMW had enough functionality to meet
early stage requirements, we anticipated that eventually we would
need to extend the system. The open codebase and modular design
were highly attractive here, allowing our developers to build new
components as required and enabling us to respond to our customers
quickly.
Semantic capabilities: a key element of functionality was the ability to
provide summarisation and taxonomy-based views across the proteins
(described in detail below). This is actually one of the most powerful
core capabilities of SMW and something not supported by many of the
alternatives. The feature is enabled by the 'ASK' query language [8],
which functions somewhat like SQL and can be embedded within wiki
pages to create dynamic and interactive result sets.
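
As an illustration of how such an embedded query might look, the following minimal sketch assembles an SMW inline `#ask` query as wikitext. The helper name `build_ask_query`, the category `Protein` and the property names (`Has species`, `Has gene symbol`, `Has indication`) are hypothetical and chosen only for this example; they are not taken from the Targetpedia schema.

```python
def build_ask_query(conditions, printouts, fmt="table", limit=20):
    """Assemble an SMW inline #ask query as a wikitext string.

    conditions: selection criteria, e.g. "Category:Protein" or "Has species::Human"
    printouts:  properties to display as result columns
    All category and property names passed in are hypothetical examples.
    """
    # Each condition is wrapped in [[...]]; consecutive conditions are ANDed.
    condition_text = "".join(f"[[{c}]]" for c in conditions)
    parts = [condition_text]
    # Printout statements are prefixed with '?'.
    parts += [f"?{p}" for p in printouts]
    # Output format and result limit are standard #ask parameters.
    parts += [f"format={fmt}", f"limit={limit}"]
    return "{{#ask: " + "\n |".join(parts) + "\n}}"


query = build_ask_query(
    conditions=["Category:Protein", "Has species::Human"],
    printouts=["Has gene symbol", "Has indication"],
)
print(query)
```

Pasting the resulting text into a wiki page would, under these assumptions, render a table of human proteins with the two requested properties as columns; the table is re-evaluated whenever the underlying annotations change.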
Data sourcing
Using a combination of user guidance and access statistics from
legacy systems, we identified the major content elements required for
the wiki. For version one of Targetpedia, the entities chosen were:
proteins and protein targets, species, indications, pathways, biological
function annotations, Pfizer people, departments, projects and research
units.
For each entity we then identified the types and sources of data the
system needed to hold. Table 17.1 provides an excerpt of this analysis for
the protein/target entity type. In particular, we made use of our existing
infrastructure for text-mining of the biomedical literature, Pharmamatrix
(PMx, [9]). PMx works by automated, massive-scale analysis of Medline
and other text sources to identify associations between thousands of
biomedical entities. The results of this mining provide a rich data source
to augment many of the areas of scientific interest.
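
The internals of PMx are not described here, so the following is only a toy sketch of the general idea of mining associations from text: it counts how often pairs of entity names are mentioned in the same document. Simple co-occurrence counting with substring matching is an assumption made for illustration, not the PMx method, and the function name, entity names and example sentences are all invented placeholders.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, entities):
    """Count the documents in which each pair of entities appears together.

    Uses naive case-insensitive substring matching; a real text-mining
    system would need proper entity recognition and synonym handling.
    """
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        # Entities mentioned in this document, sorted so pairs have a stable order.
        found = sorted(e for e in entities if e.lower() in lowered)
        for pair in combinations(found, 2):
            counts[pair] += 1
    return counts


# Invented placeholder sentences standing in for literature abstracts.
abstracts = [
    "Protein A is elevated in disease X.",
    "Protein A binds protein B.",
    "Disease X patients show altered protein A levels.",
]
entities = ["protein A", "protein B", "disease X"]

pairs = cooccurrence_counts(abstracts, entities)
for pair, n in pairs.most_common():
    print(pair, n)
```

Pairs that co-occur often become candidate associations, which could then be attached to the corresponding wiki pages as annotations.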