Evaluation of Health Information Systems: Challenges and Approaches

ABSTRACT

This topic summarizes the problems and challenges that occur when health information systems are evaluated. The main problem areas presented are the complexity of the evaluation object, the complexity of an evaluation project, and the motivation for evaluation. Based on the analysis of those problem areas, the topic then presents recommendations on how to address them. In particular, it discusses in more detail what benefits can be obtained from applying triangulation in evaluation studies. Based on the example of the evaluation of a nursing documentation system, it shows how both the validation of results and the completeness of results can be supported by triangulation. The authors hope to contribute to a better understanding of the peculiarities of evaluation in healthcare, and to provide information on how to overcome them.

INTRODUCTION

It is hard to imagine healthcare without modern information and communication technology (ICT). It is evident that the use of modern information technology (IT) offers tremendous opportunities to reduce clinical errors, to support healthcare professionals, to increase the efficiency of care, and even to improve the quality of patient care (Institute of Medicine, 2001).

However, there are also hazards associated with ICT in healthcare: Modern information systems (ISs) are costly, their failures may cause negative effects on patients and staff, and, when inappropriately designed, they may result in healthcare professionals spending more time with the computer than with the patient. All of this could have a negative impact on the efficiency of patient care. Therefore, a rigorous evaluation of IT in healthcare is recommended (Rigby, 2001) and is of great importance for decision makers and users (Kaplan & Shaw, 2002). Evaluation can be defined as the decisive assessment of defined objects, based on a set of criteria, to solve a given problem (Ammenwerth et al., 2004).

The term ICT refers to technologies as such. Whether the use of these technologies is successful depends not only on the quality of the technological artifacts but also on the actors (i.e., the people involved in information processing and the organizational environment in which they are employed). ICT embedded in the environment, including the actors, is often referred to as an IS in a sociotechnical sense (Berg, Aarts, & van der Lei, 2003; Winter et al., 2001).

Many different questions can lead the evaluation of IT. Within evaluation research, two main (and often rather distinct) traditions can be found: The objectivist (positivistic) and the subjectivistic tradition (Friedman & Wyatt, 1997), which are related to the dominant use of either quantitative or qualitative methods.

Despite a large number of published evaluation studies (e.g., van der Loo (1995) found over 1,500 citations on evaluation of healthcare IT between 1967 and 1995, and Ammenwerth and de Keizer (2004) found 1,035 studies between 1982 and 2002), many authors report problems during evaluation. One of the main problems frequently discussed is the adequate choice of evaluation methods. While objectivistic researchers tend to concentrate on quantitative methods, subjectivistic researchers mainly rely on qualitative methods. Sometimes, a mixture of methods is applied. For example, qualitative methods are used to prepare quantitative studies, or quantitative measurements are used to support qualitative argumentation. However, one tradition usually still dominates a typical evaluation study, leading to a focus on either quantitative or qualitative methods.

Many researchers point out that this domination of one method or tradition may not be useful, and that a real integration of various methods from both traditions can be much more helpful in obtaining comprehensive answers to given research questions. The integration of complementary methods (and, beyond this, of data sources, theories, and investigators) is discussed under the term triangulation.

In this topic, we first review some of the underlying reasons that make evaluation of healthcare IT so difficult. We structure the problems into three main problem areas: the complexity of the object of evaluation, the complexity of the evaluation project, and the motivation to perform evaluation. For each area, we discuss ways to overcome the problems described.

As one more detailed example, we then discuss what benefits can be obtained from applying triangulation in an evaluation study. Based on the example of the evaluation of a nursing documentation system, we show how both the validation of results and the completeness of results can be supported by triangulation.

TYPICAL PROBLEMS IN EVALUATION OF IT IN HEALTHCARE

First Problem Area: Complexity of the Evaluation Object

When IT is understood as part of the IS of an organization, it is clear that evaluation requires not only an understanding of computer technology, but also of the social and behavioral processes that affect and are affected by the technology. This complexity of the evaluation object has some important consequences. First, the introduction of IT takes time. It is not enough to implement the technology and then to immediately measure the effects. Users and workflow need a lot of time to get used to new tools and to completely exploit the new possibilities (Palvia, Sharma, & Conrath, 2001). Thus, evaluation results can develop and change during this first period of use. Then, even after an introduction period, the evaluation object may steadily change (moving evaluation target; Moehr, 2002). For example, the use of IT may be affected by changes in work organization, or in staff. It is nearly impossible to reach a stable situation in a flexible healthcare environment, which makes evaluation results dependent on the point in time at which the evaluation took place. In addition, each IS in our definition is quite unique. While the IT may be similar in various departments, workflow, users, and used functionality may be different. In addition, the organization of its introduction as well as the overall user motivation may differ. Thus, even when the same IT is introduced, its effects may vary (Kaplan & Shaw, 2002). The influence of such factors on the results of an evaluation study is often hard to disentangle (Wyatt, 1994), posing the problem of external validity (Moehr, 2002): Many evaluation studies may be valid only for the particular institutions with their specific IS.

The complexity of the evaluation object is an inherent attribute of healthcare IT evaluation and cannot be reduced. However, there are some ways to handle this problem in evaluation studies. To address the problem of external validity, the IT to be evaluated and its environment should be defined in detail before the beginning of the study. Not only the software and hardware used should be described, but also the number of users and their experience and motivation, the way IT is introduced and used, the general technical infrastructure (e.g., networks), and any further aspects that may influence the usage of IT and its effects. The functionality and the way it is actually used are also important. Only this information allows interpretation of the study results and comparison of different locations. Then, to address the problem of the moving evaluation target, all changes in the IT and its interaction with the users should be carefully documented during the study. For example, changes in workflow, in staffing, or in hardware or software should be documented with reference to the ongoing evaluation. This permits the explanation of changes and differences in effects measured during the study period. Another approach to address the problem of the moving evaluation target may be to define smaller evaluation modules. This would allow the evaluation design or evaluation questions to be adapted to changes in the environment. Each module would answer a question related to a defined phase of the introduction of the IT. In addition, an evaluation must be planned with a long-term perspective in order to allow the users and the environment to integrate the new IT. Hence, enough resources for long-term evaluation (e.g., over several months or even years) should be available.

Second Problem Area: Complexity of the Evaluation Project

Evaluation of IT is performed in the real and complex healthcare environment, with its different professional groups, and its high dependency on external influences such as legislation, economic constraints, or patient clientele. This poses problems for the evaluation project, meaning the planning, executing, and analyzing of an IT evaluation study. For example, the different stakeholders often have different conceptions and views of successful IT (Palvia et al., 2001). The different stakeholder requirements can serve as a frame of reference for evaluation during the early phases of the IT life cycle, but can also guide evaluations during later phases. In each case, multiple-stakeholder views may lead to a multitude of (possibly conflicting) evaluation questions (Heathfield et al., 1999).

Depending on the point of view adopted, the evaluation will require different study designs and evaluation methods. The evaluation researcher must decide, for example, on the evaluation approach, on the adequate evaluation methods (e.g., quantitative vs. qualitative), and on the study design (e.g., RCT vs. observational study). Each has its own advantages and drawbacks (Frechtling, 1997; Moehr, 2002), making their selection a rather challenging endeavor. This multitude of possible evaluation questions and available evaluation methods makes the planning of an evaluation study quite complex.

The complexity of the evaluation project has several consequences. First, the overall success of IT is elusive to define (Palvia et al., 2001), and it is therefore often difficult to establish clear-cut evaluation criteria to be addressed in a study (Wyatt, 1994). Each stakeholder group may have individual questions, and a universal evaluation in terms of absolute or relative benefits is usually not feasible (or, from a more subjectivistic point of view, not even possible). It is also unrealistic to expect that the IT itself will have a direct and easy-to-measure effect on the outcome quality of patient care, as in a drug trial (Wyatt, 1994). Thus, indirect measures such as user satisfaction or changes in clinical processes are often used, which, however, do not give a complete picture of the benefits of IT. Often, changes in the evaluation questions may occur during the study (e.g., based on intermediate evaluation results, new insights, changes in stakeholders’ opinions, or changes of the IT [scope creep]; Dewan & Lorenzi, 2000). Changes in study questions, however, may be difficult to balance with study resources. Finally, the selection of adequate evaluation designs and evaluation methods is often regarded as a problem during evaluation studies. Evaluators may not be sufficiently aware of the breadth of available approaches, or may be too deeply embedded in either the qualitative or the quantitative paradigm, neglecting the possible contributions of the complementary approach. Thus, inadequate methods or study designs may be chosen that are not able to answer the original study questions.

The following suggestions may be useful in order to deal with the complexity of the evaluation project. First, it is recommended that the general intention of the evaluation and the starting point be agreed early on. In principle, evaluation should start before the new IT is implemented, in order to allow for early gathering of comparative data, and then continue during all phases of its life cycle (VATAM, 2000). Then, the areas of evaluation should be restricted to aspects which are of most importance to the involved stakeholders, and which can be measured with the available resources. A complete evaluation of all aspects of a system (such as economics, effectiveness, and acceptance) is usually not feasible. A balance between the resources of a study and the inclusion of the most relevant aspects has to be found. In addition, sufficient time should be invested in the definition of relevant study questions. All involved stakeholder groups should discuss and agree on the goals of evaluation (VATAM, 2000). The selected study questions should be relevant for decision-making with regard to introduction, operation, or justification of IT (Ammenwerth et al., 2004). Conflicting goals should be discussed and resolved, as they are problematic not only for an evaluation, but also for the overall management of new IT. Furthermore, when new evaluation questions emerge during the study, they should only be included in the study design when this is possible without creating problems. Otherwise, they should be tackled in consecutive studies. Each shift in evaluation questions must be thoroughly documented. For each study question, adequate methods must be chosen. A triangulation of methods may be useful to best answer the study questions (Heathfield, Pitty, & Hanka, 1998). For example, to address the effects of a nursing documentation system, both quantitative methods (time measurement, user acceptance scales, documentation quality measurement) and qualitative methods (focus group interviews) were used. We will discuss this example later on in more detail.

Third Problem Area: Motivation for Evaluation

An evaluation study can normally only be conducted when there is sufficient funding, and a sufficient number of participants (e.g., staff members, wards). Both these variables depend on the motivation of stakeholders (e.g., hospital management) to perform an evaluation. Sometimes, this motivation is not very high, for example, because of fear of a negative outcome, or fear of revealing deficiencies of already implemented technology (Rigby, Forsstrom, Roberts, & Wyatt, 2001). In addition, the introduction of IT in an organization is a deep intervention that may have large consequences. It is thus often very difficult to organize IT evaluation in the form of an experiment, and to simply remove the system again at the end of the study in case the evaluation turns out too negative.

Even with a motivated management, it may be difficult to find suitable participants. Participating in a study usually requires some effort from the involved staff. In addition, while the users have to make large efforts to learn and use a new, innovative system, the benefit of joining a pilot study is usually not obvious (the study is conducted in order to investigate possible effects), and participation may even include some risks for the involved staff such as disturbances in workflow. In summary, for these reasons, the hospital management as well as involved staff members are often reluctant to participate in IT evaluation studies.

The described problem has consequences for the study. First, without the support and motivation of the stakeholders to conduct an evaluation study, it will be difficult to get sufficient resources for an evaluation and sufficient participants willing to take part. Second, due to the given problems, the study organizer tends to recruit any participant who volunteers. However, those participants may be more motivated than the “normal” user. This leads to the well-known volunteer effect, where results are better when participants are motivated. In addition, evaluation results are not only important for the involved units, but also for the overall organization or for similar units in other organizations. To allow transfer of results, the pilot wards or pilot users must be sufficiently representative of other wards or users. But, as each IT within its environment is quite unique (see Problem Area 1), it is difficult to find comparable or representative participants.

To increase the number of participants, two approaches should be combined. First, the responsible management should be informed and motivated to support the study. The result of an evaluation study may be important to decide on new IT, and to support its continuous improvement. Then, the possible participants could be directly addressed. It should be made clear that the study provides the opportunity to influence not only the future development of IT in healthcare but also their own working environment. Feedback of study results to users may act as an important driving force for users to participate in the study. Offering financial compensation or additional staff for the study period may help to gain support from participants and from management. As in clinical trials, multicentric studies should be considered (Wyatt & Spiegelhalter, 1992). This would largely increase the number of available participants. This means, however, that study management requires much more effort. A multicentric study design is difficult when the environments are completely different. In addition, the variation between study participants will be bigger in multicentric trials than in single-center ones. This may render interpretation and comparison of results even more difficult (cp. discussion in Problem Area 1).

Summary of General Recommendations

The problems and approaches discussed above are summarized in the following list of 12 general recommendations for IT evaluation in healthcare:

1. Evaluation takes time; thus, take your time for thorough planning and execution.

2. Document all of your decisions and steps in a detailed study protocol. Adhere to this protocol; it is your main tool for a systematic evaluation.

3. Strive for management support, and try to organize long-term financial support.

4. Clarify the goals of the evaluation. Take into account the different stakeholder groups. Resolve conflicting goals.

5. Reduce your evaluation questions to an appropriate number of the most important questions that you can handle within the available time and budget. If new questions emerge during the study, which cannot easily be integrated, postpone them for a new evaluation study.

6. Clarify and thoroughly describe the IT object of your evaluation and the environment. Take note of any changes of the IT and its environment during the study that may affect results.

7. Select an adequate study design. Think of a stepwise study design.

8. Select adequate methods to answer your study questions. Neither objectivist nor subjectivist approaches can answer all questions. Take into account the available methods. Consider being multimethodic and multidisciplinary, and consider triangulation of methods, data sources, investigators, and theories. Seek methodological (e.g., biometric) advice.

9. Motivate a sufficient number of users to participate. Consider multicentric trials and financial or other compensation.

10. Use validated evaluation instruments wherever possible.

11. Be open to unwanted and unexpected effects.

12. Publish your results and what you learned to allow others to learn from your work.

One of the most discussed aspects is the selection of adequate methods and tools (Point 8) and, here especially, the adequate application of multimethodic and multidisciplinary approaches (Ammenwerth et al., 2004). The interdisciplinary nature of evaluation research in medical informatics implies that a broad choice of evaluation methods is available for various purposes. In Sections II and III of this topic, several distinct quantitative and qualitative evaluation methods have been presented and discussed in detail. All of them have their particular application area. However, in many situations, the evaluator may want to combine the methods to best answer the evaluation questions at hand. Especially in more formative (constructive) studies, a combination of methods may seem necessary to get a more complete picture of a situation. To support this, the method of triangulation has been developed and will now be presented in more detail.

THE THEORY OF TRIANGULATION

The term triangulation comes from navigation and denotes a technique to find the exact location of a ship based on the use of various reference points. Based on this idea, triangulation in evaluation means the multiple employment of data sources, observers, methods, or theories in investigations of the same phenomenon (Greene & McClintock, 1985). This approach has two main objectives: first, to support a finding with the help of the others (validation); second, to complement the data with new results, to find new information, and to get additional pieces of the overall puzzle (completeness; Knafl & Breitmayer, 1991).

Triangulation is, based on work by Denzin (1970), usually divided into the following four types, which can be applied at the same time:

• Data triangulation: Various data sources are used with regard to time, space, or persons. For example, nurses from different sites are interviewed, or questionnaires are applied at different times.

• Investigator triangulation: Various observers or interviewers with their own specific professional methodological background take part in the study, gathering and analyzing the data together. For example, a computer scientist and a social scientist analyze and interpret results from focus group interviews together.

• Theory triangulation: Data is analyzed based on various perspectives, hypotheses or theories. For example, organizational changes are analyzed using two different change theories.

• Methods triangulation: Various methods for data collection and analysis are applied. Here, two types are distinguished: within-method triangulation (combining approaches from the same research tradition), and between-method triangulation (combining approaches from both quantitative and qualitative research traditions, also called across-method triangulation). For example, two different quantitative questionnaires may be applied to assess user attitudes, or group interviews as well as questionnaires may be applied in parallel.

It should be noted that the term triangulation is only used when one phenomenon is investigated with regard to one research question.

The term triangulation is often seen as strongly related to the term multimethod evaluation, because methods triangulation is seen as the most frequently used triangulation approach. However, as we want to stress, it is not limited to the combination of methods, but also describes the combination of data sources, investigators, or theories.

Example: Triangulation during the Evaluation of a Nursing Documentation System

Background of the Study

Nursing documentation is an important part of clinical documentation. There have been some attempts and discussions on how to support the nursing documentation using computer-based documentation systems.

In 1997, Heidelberg University Medical Center started to introduce a computer-based nursing documentation system in order to systematically evaluate preconditions and consequences. Four different (psychiatric and somatic) wards were chosen for this study.

In the following paragraphs, we will concentrate on those parts of the study that are relevant for the triangulation aspects of the study. Please refer to other publications for more details on methods and results, such as (Ammenwerth, Mansmann, Iller, & Eichstadter, 2003; Ammenwerth et al., 2001).

Three of the four study wards had been selected by the nursing management for the study. On all three wards, the majority of nurses agreed to participate. Ward B volunteered to participate. The four study wards belonged to different departments. Wards A and B were psychiatric wards, with 21 and 28 beds, respectively; Ward C was a pediatric ward for children under two years of age, with 15 beds; Ward D was a dermatological ward, with 20 beds.

Our study wards were quite different with regard to nursing documentation. In Wards A and B, a complete nursing documentation based on the principles of the nursing process (for details on the nursing process, see, for example, Lindsey & Hartrick, 1996) had been established for several years. In contrast, in Wards C and D, only a reduced care plan was documented; documentation was mostly conducted in the ward office. Only in Ward C were major parts of documentation also conducted in the patients’ rooms. The youngest staff member could be found in Ward D; the staff least experienced in computer use was in Ward C.

Study Design

The software PIK (Pflegeinformations- und Kommunikationssystem, a German acronym for “nursing information and communication system”) was introduced on those four wards. The functionality covered the six phases of the nursing care process. The study period was between August 1998 and October 2001. Wards A and B started in 1998 with the introduction of the documentation system; Wards C and D joined in 2000.

The study consisted of two main parts: The objective of the more quantitative study was to analyze the changes in the nurses’ attitudes with regard to nursing process, computers in nursing, and nursing documentation system, after the introduction of the computer-based system. Standardized, validated questionnaires were applied based on Bowman, Thompson, and Sutton (1983), for nurses’ attitudes on the nursing process; on Nickell and Pinto (1986), for computer attitudes; on Lowry (1994), for nurses’ attitudes on computers in nursing; and on Chin (1988) and Ohmann, Boy, and Yang (1997), for nurses’ satisfaction with the computer-based nursing documentation system. We carefully translated those questionnaires into German and checked the understandability in a prestudy. We used a prospective intervention study with three time measurements: approximately three months before introduction (“before”); approximately three months after introduction (“during”); and approximately nine months after introduction (“after”).

The second part of the study was a more qualitative study. Here, the objective was to further analyze the reasons for the different attitudes on the wards. The quantitative study described these attitudes precisely, and the qualitative study was intended to further explain those quantitative results. The qualitative study was conducted in February 2002, after the analysis of the quantitative study was finished. In this qualitative study, open-ended focus group interviews were conducted with up to four staff members from each ward (most of whom had already taken part in the quantitative study), with the three project managers from each department, and with the four ward managers from the wards. Open-ended means that the interviews were not guided by predefined questions. We used two general questions to start the interviews (e.g., “How are you doing with PIK?” and “How was the introduction period?”). The rest of the interview was mostly guided by the participants themselves, with relatively little control exerted by the interviewers.

All interviews were conducted by a team of two researchers. They took about one hour each. The interviews were audio taped and analyzed using inductive, iterative content analysis based on Mayring (1993). This means that the transcripts were carefully and stepwise analyzed, using the software WinMaxProf98.

In the following paragraphs, only those results of the quantitative and qualitative study relevant for the triangulation aspects of the study will be presented. Please refer to the already mentioned study publications for more details.

Results of Quantitative Analysis of User Attitudes

All in all, 119 questionnaires were returned: 23 nurses answered all three questionnaires, 17 nurses answered two, and 16 nurses answered one questionnaire. The return rates were 82% for the first questionnaire, 86.5% for the second questionnaire, and 90.2% for the third questionnaire. A quantitative analysis of the individual items of the questionnaires revealed unfavorable attitudes, especially in Ward C. In both Wards C and D, the nurses stated that the documentation system does not “save time” and does not “lead to a better overview on the course of patient care.” In addition, in Ward C, the nurses stated that they “felt burdened in their work” by the computer-based system and that the documentation system does not “make documentation easier.” In Wards A and B, the opinions with regard to those items were more positive.
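As a quick consistency check on the response counts reported above, the number of returned questionnaires can be recomputed from the number of nurses who answered three, two, or one questionnaire. The following minimal sketch uses only the figures stated in the text; the variable names are illustrative and not part of the study.

```python
# Number of nurses, keyed by how many of the three questionnaires
# they answered (counts taken from the text; names are illustrative).
nurses_by_questionnaires_answered = {3: 23, 2: 17, 1: 16}

# Each nurse contributes as many questionnaires as he or she answered.
total_questionnaires = sum(
    answered * nurses
    for answered, nurses in nurses_by_questionnaires_answered.items()
)

# Number of distinct nurses who returned at least one questionnaire.
total_nurses = sum(nurses_by_questionnaires_answered.values())

print(total_questionnaires)  # 119
print(total_nurses)          # 56
```

The total of 119 matches the reported figure, and the counts imply that 56 individual nurses returned at least one questionnaire.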

The self-reported daily usage of the computer-based documentation system was quite similar among all wards: about 1 to 2 hours a day at the time of the second and third questionnaires, with the highest values in Ward B and the lowest values in Ward A. The self-confidence with the system, as stated by the nurses, was rather high on all wards at both the second and third questionnaire. The mean values were between 3 and 3.7 at the second questionnaire and between 3.4 and 3.8 at the third questionnaire (1=minimum, 4=maximum).

Statistical analysis revealed that the overall attitude on the documentation system during the third questionnaire was positively correlated to the initial attitude on the nursing process, to the attitude on computers in general and to the attitude on computers in nursing. Both computer attitude scores were in turn positively correlated to the years of computer experience. For details, see (Ammenwerth, Mansmann, et al., 2003).
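To illustrate what such a correlation analysis involves, the following minimal sketch computes a rank correlation between two attitude scores. It rests on assumptions: the text does not state which correlation coefficient was used, so Spearman's rank correlation is chosen here only because it is a common option for ordinal questionnaire scores; the score values are made up for illustration and are not study data; all function and variable names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scores, one pair per nurse (NOT data from the study).
initial_nursing_process_attitude = [2.1, 2.8, 3.0, 3.4, 3.9]
final_system_attitude = [1.9, 2.5, 3.2, 3.1, 3.8]

rho = spearman(initial_nursing_process_attitude, final_system_attitude)
print(round(rho, 2))  # 0.9 for these made-up values
```

A value of rho close to +1 would correspond to the kind of positive association described above; judging statistical significance additionally requires a test, which is omitted from this sketch.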

Overall, the results of quantitative analysis pointed to a positive attitude on the computer-based nursing documentation already shortly after its introduction, which increased significantly on three of the four wards later on. However, on Ward C, the quantitative results revealed negative reactions, showing a heavy decline in the attitude scores during the second questionnaire. On Ward C, the overall attitude toward the computer-based system remained rather negative, even during the third questionnaire. What could be the reasons? In order to answer this question, a subsequent qualitative study was conducted.

Results of Qualitative Analysis of User Attitudes

This part of the study was conducted as planned. Overall, about 100 pages of interview transcript were analyzed. Details of the interviews are published elsewhere (Ammenwerth, Iller, et al., 2003); we will summarize only the main points.

In Ward C, some distinct features came up in the interviews that seem to have led to low attitude scores at the beginning. For example, the nursing process had not been completely implemented before, so the documentation efforts were now much higher. Documentation of nursing tasks covered a 24-hour day, due to the very young patients and their high need for care. Thus, the overall amount of documentation on Ward C was higher. Patient fluctuation was also highest in Ward C. Nurses found it time-consuming to create a complete nursing anamnesis and nursing-care plan for each patient. The previous computer experience and the number and availability of motivated key users were seen as rather low in Ward C. Then, during the introduction of the nursing documentation system, the workload was rather high in Ward C due to staff shortage, which increased pressure on the nurses. Finally, and most important, nursing documentation had previously at least partly been carried out in the patients’ rooms. However, during our study, computers were installed only in the ward office. No mobile computers were available, which, according to the nurses, led to time-consuming and inefficient double documentation.

Interesting differences were found between the nurses and the project management. For example, the nurses stated in the interviews that they were not sufficiently informed about the new documentation system, while the project management stated that it had offered information that had not been used. Another example is that the nurses felt that training was insufficient. In the opinion of the project management, sufficient opportunities had been offered. We will later see how this divergent information helps to complete the overall picture.

In Ward D, the attitude toward the documentation system expressed in the interviews was positive. The nurses saw benefits, especially in more professional documentation, which would lead to a greater acknowledgment of nursing. Standardized care planning was seen to make care planning much easier, without reducing the individuality of the patient. Overall, the ward felt at ease while working with the new documentation system.

In Wards A and B, the attitudes were also positive. In the interviews, the nurses stressed the better legibility of nursing documentation. They said that the time needed for nursing care planning was lower, but that overall, the time needed for nursing documentation was much higher than before. The interviews showed that the introduction period had been filled with anxiety and fear about new requirements for the nurses. Now, after some time, the nurses felt self-confident with computers. An interesting discussion arose on the topic of standardization. Most nurses felt that standardized care plans reduced the individuality of the care plans, and that they did not really reflect what was going on with the patient. Finally, those wards, too, mentioned insufficient teaching and support in the first weeks.

These rather short summaries of the interviews highlight some distinct features of the wards, showing similarities (e.g., insufficient teaching and fears at the beginning) but also differences (e.g., on the question of standardized care plans or time effort).

Application of Triangulation in This Study

After analysis of the quantitative study and the qualitative study, we now want to see how the different results can be put together to get a broader picture of the effects and preconditions of a nursing documentation system. We thus applied all four types of triangulation as described by Denzin (1970):

• Data triangulation: Various data sources were used. Within the quantitative study, the questionnaires were submitted three times to the same users (data triangulation with regard to time). In addition, in the interviews, not only nurses but also project management and ward management were interviewed (data triangulation with regard to persons).

• Investigator triangulation: Within the qualitative study, the two interviewers had different backgrounds (one more quantitative, coming from medical informatics; the other more qualitative, coming from social science). Both acted together as interviewers, analyzed the transcripts together, and discussed and agreed on results and conclusions.

• Theory triangulation: We drew on various complementary theories to better understand the results of our studies. For example, to explain the implementation phases, we drew on the change theory of Lewin (1947; unfreezing, moving, and refreezing phases). With regard to user evaluation, we used the technology acceptance model (TAM) of Davis (1993) and the task-technology-fit model (TTF) of Goodhue (1995).

• Methods triangulation: We applied between-methods triangulation by using both quantitative questionnaires and qualitative focus-group interviews to investigate users’ attitudes.

As stated in the introduction, triangulation has two main objectives: to confirm results with data from other sources (validation of results) and to find new data to get a more complete picture (completeness of results). We will now briefly discuss whether triangulation helped to achieve those goals.

Validation of results

Validation of results is obtained when results from one part of the study are confirmed by congruent (not necessarily equal) results from other parts of the study. In our example, some parts of the study showed congruent results:

First, both the questionnaires and the interviews focused on attitude issues. In this area, both approaches led to congruent results, showing, for example, favorable attitudes in three wards. In addition, both the questionnaires and the interviews showed problems with regard to user satisfaction with the nursing documentation system in Ward C. As the interviews were conducted later, they could better show the long-term development in the wards; nevertheless, both data sources showed congruent results.

Second, within the standardized questionnaires, we found congruent results for the two scales "attitude on nursing process" and "attitude on the computer-based nursing documentation system." Although the two scales focus on different attitude items, both showed comparably low results in Ward C and higher results in the other wards, pointing to congruent measurements.

Those two selected examples show how results of some parts of the study could be validated by congruent results from other parts of the study.

Completeness of results

Besides validation, triangulation can increase completeness when one part of the study presents results which have not been found in other parts of the study. By this new information, the completeness of results is increased. The new information may be complementary to other results, or it may present divergent information.

In our study, questionnaires and interviews presented partly complementary results, which led to new insights. For example, the impact of the computer-based documentation system on documentation processes and communication processes had not been detected by the questionnaire (this aspect had not been included in the questions). However, the documentation system seems to have influenced the way different healthcare professionals exchanged patient-related information. This led to some discussion of this topic on all wards in the interviews and seems to have had an impact on the overall attitude. Those effects emerged only in the group interviews (and not in the questionnaires), enlarging the picture of the effects of the nursing documentation system and helping to better understand the reactions of the different wards.

Another example is the complementarity of the results of the interviews and questionnaires in Ward C. The interviews were done some time after the questionnaires, so changes may have occurred in the meantime. The change theory of Lewin (1947) states that organizational changes occur in three phases: unfreezing (old patterns must be released, combined with insecurity and problems), moving (new patterns are tested), and refreezing (new patterns are internalized and seen as normal). The low attitude scores in Ward C, even at the last measurement point, indicate that the ward was in the moving phase during this time. During the interviews, the stress articulated by the nurses seemed to be less severe. This can be interpreted as Ward C’s slowly changing from the moving into the refreezing phase.

Triangulation can thus help to get a more complete picture of the object under investigation. Often, especially when applying various methods during the investigation, the results will not be congruent, but they may be divergent (e.g., contradictory). This is an important aspect of triangulation, as divergent results can especially highlight some points, present new information, and lead to further investigation.

In our study, we found some divergent results. For example, during the group interviews, nurses from one ward stressed that they did not see a reduction in the effort needed for documentation with the computer-based system. However, in the questionnaires, this ward indicated strong time reductions. These differences raise the question of whether time effort is judged with regard to the situation without the nursing documentation system (where the amount of documentation was much lower, and so was the time effort), or with regard to the tasks that have to be performed (the same amount of documentation can be done much quicker with the computer-based system). This discussion can help one better understand the answers. Interesting differences in point of view could also be found between the staff and the project management of one ward in the group interviews. While the nurses of this ward claimed in the interviews that training was suboptimal, the project management stated that sufficient offers had been made. Those apparent contradictions may point to different perceptions of the need for training among the different stakeholders. Those insights may help to better organize the teaching on other wards.

As those (selected) examples show, triangulation helped us to obtain a better picture of the reactions of the four wards. The evaluation results also led to some decisions on how to improve the technical infrastructure as well as how to better organize the teaching and support in some wards. All wards are still working with the computer-based nursing documentation system.

Discussion

Medical informatics is an academic discipline and, thus, evaluation is an important part of any system development and implementation activity (Shahar, 2002; Talmon & Hasmann, 2002). However, many problems with regard to healthcare IT evaluation have been reported. Wyatt and Spiegelhalter (1992) as well as Gremy and Degoulet (1993) already discussed the complexity of the field, the motivation issue, and methodological barriers to evaluation. Meta-analyses of IT evaluation studies confirm those barriers (e.g., Brender, 2002; Johnston, Langton, Haynes, & Mathieu, 1994; Kaplan, 2001).

In this topic, we elaborated on a number of important problems and categorized them into three areas: the complexity of the evaluation object, the complexity of the evaluation project with its multitude of stakeholders, and the motivation for evaluation.

A framework to support evaluation studies of ISs may be useful to address the problem areas discussed in this topic. In fact, many authors have formulated the necessity for such a framework (e.g., Grant, Plante, & Leblanc, 2002; Shaw, 2002). One important part of such a framework is the call for multimethod evaluation. While triangulation has long been discussed and applied in research (one of the first examples being Campbell & Fiske, 1959), the idea of the possible advantages of multimethod approaches, or of triangulation in more general terms, is not really reflected in the medical informatics literature.

In general, both quantitative and qualitative methods have their areas and research questions where they can be successfully applied. By triangulating both approaches, their advantages can be combined. We found that both complementary and divergent results from the different sources gave important new information and stimulated further discussion.

In the past, there has been a more basic discussion about whether between-methods triangulation is possible at all. It has been argued that the epistemological underpinnings of the quantitative and qualitative research paradigms may be so different that a real combination may not be possible (Greene & McClintock, 1985; Sim & Sharp, 1998). However, this argument does not take into account that a tradition of research has formed beyond subjectivistic and objectivistic paradigms, in which evaluation methods are chosen according to the research question and the research topic. Thus, the question of which methods to apply and how to combine them can be answered only with respect to the research topic and the research question, and not on a general basis. As important as this discussion might be in the light of progress in research methods, evaluation researchers in medical informatics may therefore be advised to select and combine methods based on their distinctive research question. This gives evaluation researchers a broad range of possibilities to increase both completeness and validity of results, independent of their research tradition.

Conclusions

Evaluation studies in healthcare IT take a lot of time, resources, and know-how. Clearly defined methodological guidelines that take the difficulties of IS evaluation in healthcare into account may help to conduct better evaluation studies. This topic has classified some of the problems encountered in healthcare IT evaluation under the three main problem areas of a) complexity of the evaluation object, b) complexity of the evaluation project, and c) limited motivation for evaluation. We suggested a list of 12 essential recommendations to support the evaluation of ISs. A broadly accepted and more detailed framework for IT evaluation in healthcare seems desirable, supporting the evaluator during the planning and execution of an evaluation study.

Focusing on methodological aspects, we have presented some basics on triangulation and illustrated them in a case study. The correct application of triangulation requires, like other evaluation methods, training and methodological experience. Medical informatics evaluation research may profit from this well-established theory.