Evaluation of a Fuzzy Ontology-Based Medical Information System

Abstract

Evidence-based medicine (EBM) requires appropriate information to be available to clinicians at the point of care. Electronic sources of information may fulfill this need but require a high level of skill to use successfully. This paper describes the rationale and initial testing of a system to allow collaborative search and ontology construction for professional groups in the health sector. The approach is based around the use of a browser using a fuzzy ontology based on the National Library of Medicine (NLM) Unified Medical Language System (UMLS). This approach may provide high quality information for professionals in the future.

Introduction

Evidence-based medicine (EBM) (Sackett, Richardson, Rosenberg, & Haynes, 1997) has become increasingly important in the modern healthcare industry. Indeed, the concept of basing practice on evidence is even extending to the software engineering domain (Kitchenham, Dyba, & Jorgensen, 2004). Care that is not based on evidence has become increasingly indefensible from professional, safety, and economic points of view. Electronic access to high quality information can improve the professional knowledge of clinicians (Leung et al., 2003), and is very popular (Westbrook, Gosling, & Coiera, 2004). However, there are a number of difficulties associated with providing high-quality information to support EBM.

Assessing and finding appropriate information is difficult and can be time-consuming. This is partly due to the continuing difficulty users have in navigating the interfaces used by various systems, and partly due to the lack of training available. Indeed, if the concept of just-in-time information retrieval as an aid to clinical decision-making at the point of care is to be realised (Gardner, 1997), then complex, time-consuming strategies that only trained users can perform are not feasible. Recent work looking at the usage of the Clinical Information Access Programme (CIAP) in New South Wales (Gosling, Westbrook, & Coiera, 2003) has emphasised cultural barriers to using online sources of information in a clinical setting, including a perceived lack of skill in information retrieval among clinicians.

In assessing the usefulness of information sources, a framework to identify the aspects that are important needs to be established. Three dimensions have been identified, including information quality, clinical relevance, and clinical usefulness, based partly on the work of Sackett et al. (1997), and some of the limits used in PubMed and other information sources. The aspects of each dimension are outlined in Tables 1 to 3.

Diversity

Both the users and sources of information are characterised by diversity, and existing examples of information portals reflect this. The CIAP system, described by Moody and Shanks (1999), is particularly interesting as a “top-down” approach to providing evidence at the point of care; that is, the project was driven by the funding authority, the New South Wales health department, rather than by a “bottom-up” approach led by clinical units. Having multiple database systems with many different interfaces and means of searching can only increase the obstacles to effective use of these tools. Even the CIAP system has over 40 different searchable databases available, each with its own interface, not to mention the individual journals and tools such as Google.

Table 1. Information quality

Aspect	Comments
Peer-review	World Wide Web (WWW) sites as well as journals may now have peer review in place.
Randomised Controlled Trial (RCT)	This is the gold standard for clinical interventions, although many interventions have not been subjected to this process. There are also issues of the quality and power of a trial. In some cases meta-analysis can cause smaller trials to lose credibility.
High citation number	This is more of a rule of thumb than an absolute factor. If the source is frequently cited then it indicates that large numbers of authors have found it relevant. It is perfectly possible that a particularly bad study may have a high citation index, or that the index may be inflated for other reasons such as age of the reference. It is possible to infer that references cited in ‘good’ documents are more likely to be good themselves but this is dangerous to extend too far.
Recent	This depends on the rate of change of the field. Documents in very active research areas are likely to have a shorter useful life than those in inactive areas.
Significant result	A document containing information that a treatment or diagnostic method is effective, and that this effect is large, is likely to be more useful than one that does not. If there is a traditional treatment that is shown to be ineffective then this also is significant.
Authoritative Source	For electronic sources of information, the Health on the Net Code of Conduct can give some guidance; otherwise, inclusion by indexes or directories, e.g. MEDLINE or Cochrane, can lend authority. The author affiliation can be an important issue here. An automated system for “authoritativeness” is described by Farahat, Nunberg, Chen, and Heylighen (2002).
Usability	Traditional Web usability, for example Neilson’s heuristics (Neilson, 2000), and also technical issues such as plug-ins, media, etc.

Aside from the differences in professional education, which will influence the choice of preferred search terms, and the clinical usefulness indicators, users may also have fundamental differences in their understanding of the meaning of terms. The need to share understanding of the meaning of search terms has been a driver in the use of ontologies (Noy & McGuinness, 2001), and indeed Musen (2001) assigned ontology use and creation the central role in medical informatics. A general view of a system to support searching for useful medical information is illustrated in Figure 1. Key elements include the use of multiple information sources accessed through a single browser, an ontology-supported scheme for query expansion and refinement, and the identification of users as members of a professional group with expertise in particular domains and one of five levels of expertise based on the Dreyfus classification (Dreyfus, Dreyfus, & Athanasiou, 1986): novice, advanced beginner, competent, proficient, and expert.

The next section deals with the methods used to construct a system to see if such an approach is valuable. The third section describes the case study and prototype usability testing. The fourth section includes the results of the evaluation, and the final section discusses the significance of such an approach.

Table 2. Clinical relevance

Aspect	Notes
Human	Although animal studies or theoretical ones may be of great use, for example in the case of poisoning or electric shock, human studies are often essential.
Correct Sex	Included in this is whether the interventions are safe for pregnant women, and the variation in body sizes and compositions between the sexes, along with other issues related to gender.
Age group relevance	Various age bands are used, or bands that reflect characteristics of the individual rather than his or her age.
Speciality is appropriate	Information designed for one medical specialty may not be appropriate for others, for example between pathologists and other clinicians. Similarly, different clinical groups, e.g. physiotherapists and surgeons treating a patient with an artificial hip, may have different information requirements.
Appropriate language	Is this information in a suitable language for use by clinicians, or is it designed for lay people? The requirements for precision and readability will vary according to the intended audience.

Table 3. Clinical usefulness

Aspect	Comments
Appropriate to stage of encounter (e.g. therapy, diagnosis, etc.)	This also excludes information that is purely of a research nature, if better information for the clinical decision is available. However such information can be useful if it casts doubt on current clinical practice, or can help explain otherwise unexpected results.
Deals with available tools	This includes such aspects as whether the drugs or procedures involved are licensed or available in the location, and acceptable in terms of cultural factors and cost.
Suitable format	Can the user read the documents or information sources: are they in the correct language, and is a machine reader available? Concrete examples include different varieties of microfiche, or PDF files that may require large bandwidths for download.
Available in a timely fashion	Broadly, the information may be available immediately (read off the screen; a time period of seconds), quickly (within the library or searching area; a time period of minutes), after a short pause (if documents need to be retrieved from a nearby site; a time period of hours), or after a long time (if the document needs to be specially ordered or generated; a time period of days).
Useful for exclusion	That is, the information source confirms that a potential diagnosis or treatment is not correct.

Figure 1. The overall system

Methods

A prototype system was designed, and tested for usability and usefulness.

System Design

The components of the prototype are illustrated in Figure 2. The case study was performed using a prototype version of the system, built in Visual Basic .NET, using SQL Server 2000 (Microsoft) as the database. The Google application program interface (Google, 2004) was used along with the Entrez E-utilities of PubMed (National Library of Medicine, 2003) to provide two data sources. These are part of an increasing number of Web Services being made available via the Simple Object Access Protocol (SOAP), which allows XML-based communication and service control over the Internet.
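As a rough illustration of how one of the two data sources can be queried, the sketch below builds a search request for PubMed. It is not the prototype's code: the prototype was written in Visual Basic .NET and used SOAP, whereas this sketch assumes the URL-based form of the Entrez E-utilities ESearch service. The function name `build_pubmed_search_url` and the example query are hypothetical, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint (URL-based form,
# assumed here in place of the SOAP interface used by the prototype).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term: str, max_results: int = 20) -> str:
    """Return an ESearch URL asking PubMed for record IDs matching `term`."""
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": term,           # query string, e.g. a MeSH heading
        "retmax": max_results,  # maximum number of record IDs to return
        "retmode": "xml",       # request an XML response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical query restricted to the MeSH heading "Pain".
    print(build_pubmed_search_url("pain[MeSH Terms]", max_results=5))
```

The returned URL can be fetched with any HTTP client; the XML response lists matching PubMed identifiers, which a results browser could then resolve to citations.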

Fuzzy Ontology

Ontologies are extremely important in medical informatics (Musen, 2001), but multiple ontologies (Noy & Musen, 2004) can be difficult to maintain and combine. In addition, the same term often occurs in several places within an ontology, which makes the mapping between a query term and the intended location in the ontology difficult.

Figure 2. The system components

The concept of a “fuzzy ontology” was introduced in Parry (2004b). Effectively, this approach reuses a current ontology, in this case the MeSH hierarchy (U.S. National Library of Medicine, 2001), and assigns a membership value to each multiply occurring term at each of its locations.

Table 4 demonstrates how these issues arise in existing ontologies. “Pain” occurs in five locations in the MeSH ontology. Because the term is located in a number of different places, query expansion for this term is difficult: the number of “related” terms varies widely between locations. For this reason, the MeSH hierarchy was adapted to allow users to assign membership values to (term, location) pairs via a machine learning system described in Parry (2004a). The case study was designed partly to investigate different methods of learning these relations, but all of the searches were performed using the original MeSH hierarchy terms, without fuzzy ontology support.

Many issues arise from the use of multiple ontologies, including the difficulties associated with communicating between ontologies and the need for maintenance of large numbers of ontologies. The fuzzy ontology as described is partly suggested in order to allow a common framework, or base ontology, with different membership values associated with different users and groups. It should be noted that because of the learning methods involved, only “is-a” type relations are currently used, based on the currently existing MeSH hierarchy.

Table 4. Multiple occurrence examples — “Pain”

Term	Concept ID	Parent	Depth	Root term
Pain	G11.561.796.444	Sensation	4	Musculoskeletal, Neural, and Ocular Physiology
Pain	F02.830.816.444	Sensation	4	Psychological Phenomena and Processes
Pain	C23.888.646	Signs and Symptoms	3	Pathological Conditions, Signs and Symptoms
Pain	C23.888.592.612	Neurologic Manifestations	4	Pathological Conditions, Signs and Symptoms
Pain	C10.597.617	Neurologic Manifestations	3	Nervous System Diseases
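One possible way to represent the (term, location) pairs of Table 4 is sketched below. The concept IDs and parents are taken from the table; the membership values are invented purely for illustration, and the choice to normalise them so that they sum to 1 across a term's locations is an assumption of this sketch, not something stated above. The names `Location`, `normalise`, and `preferred_location` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """One place where a term appears in the MeSH hierarchy."""
    concept_id: str
    parent: str
    membership: float  # degree to which the term belongs here, in [0, 1]


# The five MeSH locations of "Pain" from Table 4.
# The membership values are invented for illustration only.
PAIN_LOCATIONS = [
    Location("G11.561.796.444", "Sensation", 0.2),
    Location("F02.830.816.444", "Sensation", 0.2),
    Location("C23.888.646", "Signs and Symptoms", 0.8),
    Location("C23.888.592.612", "Neurologic Manifestations", 0.6),
    Location("C10.597.617", "Neurologic Manifestations", 0.2),
]


def normalise(locations):
    """Rescale membership values so they sum to 1 across all locations."""
    total = sum(loc.membership for loc in locations)
    if total == 0:
        raise ValueError("at least one membership value must be non-zero")
    return [
        Location(loc.concept_id, loc.parent, loc.membership / total)
        for loc in locations
    ]


def preferred_location(locations):
    """Return the location with the highest membership value."""
    return max(locations, key=lambda loc: loc.membership)


if __name__ == "__main__":
    for loc in normalise(PAIN_LOCATIONS):
        print(f"{loc.concept_id:<18} {loc.parent:<28} {loc.membership:.2f}")
```

Different user groups would hold different membership values over the same five locations, which is what allows the base hierarchy to stay shared.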

Another advantage of this approach is completeness. Rather than impose an arbitrary standard of the importance of a particular location in the ontology, which is required in a crisp ontology to avoid too many examples of a term appearing in the ontology, the term or object can be located in all relevant locations.

Most importantly, for searching processes, the use of a fuzzy ontology for the mapping of search terms allows the relative weight of each term in the required output to be calculated. By allowing these weights to be calculated accurately, it removes the bias associated with multiple-located terms being used for searching. If a term is located in multiple locations in a crisp ontology, and is used for query expansion purposes, say by including offspring, then the danger is that the large number of relatively irrelevant expansion terms outweigh those which are useful.
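The weighting idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's algorithm: each offspring term simply inherits the membership value of the location that contributes it, and locations at or below a threshold are skipped. The function `expand_query`, the membership values, and the offspring lists are all hypothetical placeholders rather than a MeSH extract.

```python
def expand_query(term, locations, children, threshold=0.0):
    """Expand `term` with offspring terms, weighted by membership.

    locations: {concept_id: membership of `term` at that location}
    children:  {concept_id: [offspring terms under that location]}
    Returns {expansion term: weight}. An offspring term takes the largest
    membership among the locations that contribute it.
    """
    weights = {term: 1.0}  # the original term always has full weight
    for concept_id, membership in locations.items():
        if membership <= threshold:
            continue  # skip locations this user group considers irrelevant
        for child in children.get(concept_id, []):
            weights[child] = max(weights.get(child, 0.0), membership)
    return weights


# Membership of "Pain" at two of its Table 4 locations (values invented).
PAIN_MEMBERSHIP = {"C23.888.646": 0.6, "C10.597.617": 0.1}

# Offspring terms per location (illustrative placeholders only).
CHILDREN = {
    "C23.888.646": ["Abdominal Pain", "Back Pain"],
    "C10.597.617": ["Neuralgia"],
}

if __name__ == "__main__":
    expanded = expand_query("Pain", PAIN_MEMBERSHIP, CHILDREN, threshold=0.2)
    for name, weight in expanded.items():
        print(f"{weight:.1f}  {name}")
```

With a crisp ontology every location would contribute its offspring at full weight; here the low-membership location drops out once the threshold is raised, so its terms no longer dilute the query.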

In particular, the use of a fuzzy ontology approach allows the convenient representation of the relationships in a domain according to a particular view without sacrificing commonality with other views; the ontology framework is common, just the membership values are different.

Finally, this approach holds out the possibility that the representation of a potentially very large ontology can be compressed. If whole areas are not required, the relations to the core can be set to zero. Unwanted intermediate levels can also be removed, with lower-level terms only communicating directly with higher levels. This aspect removes the need to create artificial groupings to avoid orphaned terms. At the limit, a fuzzy ontology with all membership values equal to 0 or 1 will have each term or object located in one location only and will behave in exactly the same way as a crisp ontology. A scheme for visually describing the fuzzy ontology is shown in Figure 3.
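The compression and the crisp limiting case can be illustrated with a small sketch. It assumes, hypothetically, that the ontology is stored as a mapping from (term, concept ID) pairs to membership values; the helper names and the values are invented for illustration.

```python
from collections import Counter


def prune(relations, threshold=0.0):
    """Drop (term, location) relations whose membership is at or below threshold."""
    return {key: m for key, m in relations.items() if m > threshold}


def is_crisp(relations):
    """True if every membership value is exactly 0 or 1."""
    return all(m in (0.0, 1.0) for m in relations.values())


def locations_per_term(relations):
    """Count how many locations remain for each term."""
    return Counter(term for term, _concept_id in relations)


# Invented values: "Pain" is kept at one Table 4 location and zeroed at two others.
RELATIONS = {
    ("Pain", "C23.888.646"): 1.0,
    ("Pain", "G11.561.796.444"): 0.0,
    ("Pain", "F02.830.816.444"): 0.0,
}

if __name__ == "__main__":
    kept = prune(RELATIONS)
    print(is_crisp(RELATIONS), locations_per_term(kept))
```

After pruning, only the relations with non-zero membership need to be stored, and a term whose single surviving relation has membership 1 sits at exactly one location, as in a crisp ontology.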

Case Study

The setting was an academic department of Obstetrics and Gynecology, and only elements of the MeSH tree relevant to this domain were included. Ethical approval was obtained, and eight users were allocated an hour each to use the system. During user number 8's study, the database was corrupted, and the subject was unable to complete any testing; this subject was therefore not included in any analysis. This is a small number of users, but Neilson (2000) points out that usability testing can often be successful with small groups. In fact, the whole department comprised fewer than 10 faculty staff; as with many systems designed for professional use, the pool of users is actually quite small. Results of their searches were presented in the results browser shown in Figure 4.

Figure 3. The fuzzy ontology

Table 5. Comparison of crisp and fuzzy ontologies

Aspect	Fuzzy ontology	Crisp ontology
Multiply-located terms	Does not occur	Issue for disambiguation
Query expansion	Depends on membership value.	Depends on location only
Customisation	Simple, based on modification of membership values	Requires new ontology and/or ontology sharing.
Intermediate locations for grouping	Unnecessary	Needed for construction – may be useful
Storage required	Depends on the number of terms in the ontology and the membership values of the relations, can be smaller or larger than crisp.	Depends on number of terms in the ontology
Knowledge representation	Related to use	Related to structure.

The users were asked to perform the following tasks:

1. Log into the system and select the appropriate demographic and area of interest.

2. Perform a search using the “obstetric” keyword on the Google interface. This was done mostly to familiarise the users with the system, in particular, the appropriate use of the mouse and the use of the + anchor in lists to expand them, as in Windows Explorer.

3. They then performed another search using terms of their own choosing, again using the Google interface.

4. With the open browser windows, they were asked to rate a number of the pages shown in terms of usefulness via the slider. They were also asked to perform an analysis on pages that they rated highly. In most cases, this amounted to around five pages.

5. They were then asked to perform a similar task using the full MeSH tree.

Figure 4. The results browser

Results

The results of the study are based around the usability questionnaire responses and user comments. The questionnaire used in the study was adapted from one of those generated from the site provided by Perlman (2001). This questionnaire was originally reported in Davis (1989) and has subsequently been used in a number of studies. This questionnaire focuses on the use of a system for work-related tasks, and the scale runs from -2 to +2, to allow a 0 for neutrality. All of the questions are phrased so that a positive result implies satisfaction with the system. One of the most interesting aspects of this questionnaire is that it specifically links ease of use and usefulness. The original work suggested that increased perceived ease of use has a causal influence on perceived usefulness. However, more recent work (Segars & Grover, 1993) appears to suggest that this analysis is not complete. It is suggested that, in turn, an information system that is perceived as useful must be retrieving useful information.

Overall, the mean perceived usefulness was 1.16 (SD 1.21), and the mean perceived ease of use was 1.53 (SD 0.29). Ease of use could be expected to rate more highly as the situation for testing was somewhat artificial.

Comments about the system were recorded, and general satisfaction seemed quite high. Of particular interest was the ease of use of the analysis system, despite the fact that there were bugs in this version, which allowed duplicate words to occur in the pick list. There was certainly a preference towards identifying positive (very, somewhat relevant) rather than negative (irrelevant, unwanted) words. The users preferred to analyse those documents they found useful, and tended to ignore those they found useless.

One aspect of particular benefit was the presentation of the derived MeSH keywords, which allowed a user to reconsider his or her search before it began. General observations of users included the fact that they found dealing with large numbers of windows a little confusing. By attempting to improve visibility, the use of multiple windows tended to remove the obvious focus. Mouse movement became more uncertain when there were overlapping windows, and the users were often uncertain as to the difference between closing and minimising windows. In many cases, the users maximised the active window.

Table 6. User group details

Number	Job description	Professional group	Computer experience	Gender	Age Range
1	Senior Academic	Doctor, interest in MFM	Moderate	Male	50+
2	Senior Academic	Doctor, interest in MFM	Moderate	Female	50+
3	Junior Academic	Doctor, interest in REI	High	Female	30+
4	Research Midwife	Midwife background, clinical researcher	Moderate	Female	50+
5	New Consultant	Doctor, General Obstetrics and Gynaecology	Moderate	Female	30+
6	Senior Academic	Doctor, interest in Contraception	High	Female	40+
7	Junior Academic	Doctor, interest in Infertility	Moderate	Female	30+
8	New Consultant	Doctor, General Obstetrics and Gynaecology	Moderate	Female	30+

One of the recurring themes was the uncertainty of whether such a system was primarily for medical professionals or for patients. When browsing the documents recovered via Google, the users were sometimes surprised to find what they regarded as legitimate medical pages among the obviously patient-centred ones. This is an unexpected benefit of using multiple search engines — multiple search strategies are used simultaneously. Various meta-engines already use this approach, but they currently do not appear to use non-commercial data sources such as PubMed.

Discussion

Finding and applying appropriate information is one of the key tasks of the knowledge worker (Kidd, 1994). There exists a vast body of knowledge in electronic form for workers and patients in the health sector. However, finding appropriate knowledge is difficult and time-consuming. Fears of inappropriate information being provided abound (Eysenbach, 2002). In order to fully realize the potential benefits of electronic knowledge sources, the sources must be appropriate for their use and usable by the potential beneficiaries. Understanding the knowledge requirements of users in this domain and providing appropriate tools for such users remain great challenges for informatics professionals. This paper has attempted to set up a framework for future research in the area of appropriate knowledge sources based around a user perspective. The importance of delimiting different user groups within the health sector has also been identified. In addition, a prototype system for combining knowledge from different sources in an integrated way has been tested for usability and potential usefulness. The challenges of using diverse information sources from the Web have been raised in Allan et al. (2003), and this remains a particularly important area of information retrieval research. Other work has been done recently on the usability of medical information sources (Alexander, Hauser, Steely, Ford, & Demner-Fushman, 2004), and improvements are certainly possible. The results of this study suggest that an integrated knowledge discovery system for a medical professional is desirable and that the prototype represents a useful start in this direction. The results for ease of use compare favourably with similar scores obtained with the technology acceptance model, for example in Henderson and Divett (2003), dealing with electronic shopping, where ease of use was 1.25 and usefulness was 0.96 when converted to the same scale as used in this work.

It is hoped that further research in this area will continue, in particular in the following areas: the replacement of the executable form of the system with a browser-based client-server system that will allow much larger user groups to interact with it, and the provision of a substantial base for learning about group preferences. Mobile and wireless information retrieval may be more appropriately integrated into clinical workflow, especially by means of “information appliances” (Eustice, Lehman, Morales, Munson, Edlund, & Guillen, 1999); recent work in this area (Burdette, Herchline, & Richardson, 2004) suggests that these devices may be particularly suitable for hospital use. The integration of information sources which provide their data via Web Services is also rapidly becoming accepted in the world of digital libraries (Fu & Mostafa, 2004). Integrating information from diverse sources via ontologies is also becoming increasingly important, especially in the context of the “Semantic Web” (Berners-Lee, Hendler, & Lassila, 2001). The Web Ontology Language (OWL) (Smith, Welty, & McGuinness, 2004) could also be modified to support a fuzzy ontology, and it has already been recognised that storage of such ontologies on the Web can allow effective knowledge sharing (Haarslev, Lu, & Shiri, 2004). Finally, more research needs to be undertaken in the use and standardization of aspects of information reliability, usefulness, and relevance to improve research and classification in this area, especially from the perspective of the clinical worker.

Table 7. Initial group satisfaction

Question	User 1	User 2	User 3	User 4	User 5	User 6	User 7	Mean
Perceived Usefulness
1 (Quick)	0	2	1	1	1	2	2	1.26
2 (Performance)	0	1	0	0	1	2	2	0.86
3 (Productivity)	2	1	1	0	1	2	2	1.29
4 (Effectiveness)	1	1	0	0	1	2	2	1.00
5 (Easier)	1	1	0	1	1	2	2	1.14
6 (Useful)	1	1	2	1	1	2	2	1.43
Perceived Ease of Use
7 (Easy to Learn)	2	1	2	2	2	2	2	1.86
8 (Easy to Control)	-1	2	1	2	2	1	2	1.29
9 (Clear Interact)	2	2	1	2	1	1	2	1.57
10 (Flexible)	-1	2	1	2	1	1	2	1.14
11 (Skill)	2	2	2	2	1	2	2	1.86
12 (Easy to Use)	0	2	1	2	1	2	2	1.43
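The per-question means in Table 7 can be recomputed from the individual responses with a short script such as the sketch below. The scores are copied from the table; the helper names are hypothetical. How the overall figures quoted in the Results were aggregated (over question means or over individual responses) is not stated, so recomputed values may differ slightly from the printed ones in the second decimal place.

```python
from statistics import mean

# Responses from Table 7: one list per question, one entry per user (1 to 7),
# on the questionnaire's scale of -2 to +2.
USEFULNESS = {
    "Quick": [0, 2, 1, 1, 1, 2, 2],
    "Performance": [0, 1, 0, 0, 1, 2, 2],
    "Productivity": [2, 1, 1, 0, 1, 2, 2],
    "Effectiveness": [1, 1, 0, 0, 1, 2, 2],
    "Easier": [1, 1, 0, 1, 1, 2, 2],
    "Useful": [1, 1, 2, 1, 1, 2, 2],
}
EASE_OF_USE = {
    "Easy to Learn": [2, 1, 2, 2, 2, 2, 2],
    "Easy to Control": [-1, 2, 1, 2, 2, 1, 2],
    "Clear Interact": [2, 2, 1, 2, 1, 1, 2],
    "Flexible": [-1, 2, 1, 2, 1, 1, 2],
    "Skill": [2, 2, 2, 2, 1, 2, 2],
    "Easy to Use": [0, 2, 1, 2, 1, 2, 2],
}


def question_means(responses):
    """Mean score per question, rounded to two decimal places."""
    return {q: round(mean(scores), 2) for q, scores in responses.items()}


def scale_mean(responses):
    """Mean over every individual response on one scale."""
    return mean(s for scores in responses.values() for s in scores)


if __name__ == "__main__":
    for label, scale in (("Usefulness", USEFULNESS), ("Ease of use", EASE_OF_USE)):
        print(label, question_means(scale), round(scale_mean(scale), 2))
```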