using some sort of quantifiable metric. This is necessary because CI-based agents can
often generate thousands of hypotheses when reasoning over even a hundred or
more initial and terminal variables. It is unlikely that human beings will take the
time and expense (or have the energy and focus) to review and test all possible
hypotheses. In response to this need, we often go back to the source data or,
alternatively, look at published literature and the knowledge that can be extracted
from that literature (for example, the statistical distributions or co-occurrence of
two variables of interest in the data or literature, respectively) to calculate a support
metric. Such support metrics tell us how common or uncommon those data or
concepts are, and can be used to judge the likelihood that a hypothesis is testable,
novel, or both. Then, depending on our use case, we can apply such metrics to
prioritize or rank hypotheses for exploration and testing (a sketch of one such
metric follows this list).
• Phase 6 - Evaluation: Finally (and perhaps most importantly), we must evaluate
the output of CI-based agents using a variety of verification and validation
methodologies. Such evaluations must incorporate multiple dimensions, including
the factual accuracy or validity of system output, its likelihood of informing
novel hypotheses, and its overall utility as judged by the targeted end users.
Further details on specific approaches to addressing this particular need are
provided in Sect. 8.4.
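To make the notion of a support metric concrete, the sketch below computes pointwise mutual information (PMI), one of many possible co-occurrence statistics, and uses it to rank hypothesized variable pairs. The concept names, counts, and corpus size are all illustrative assumptions, not values from the text.

```python
# A minimal sketch of a co-occurrence-based support metric, assuming we can
# obtain occurrence and co-occurrence counts for variable pairs from source
# data or literature. All names and counts below are hypothetical.
import math

def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information of concepts x and y. Higher values
    suggest the pairing occurs more often than chance would predict;
    values near zero suggest the pairing is commonplace."""
    if n_xy == 0:
        return float("-inf")
    return math.log2((n_xy / n_total) / ((n_x / n_total) * (n_y / n_total)))

# Hypothesized pairs mapped to (co-occurrence count, count of x, count of y).
hypotheses = {
    ("gene_A", "disease_X"): (4, 120, 300),
    ("gene_B", "disease_X"): (250, 400, 300),
}
n_total = 1_000_000  # total documents or records examined

# Rank hypotheses by their support metric for prioritized review.
ranked = sorted(
    hypotheses.items(),
    key=lambda kv: pmi(*kv[1], n_total),
    reverse=True,
)
for (x, y), counts in ranked:
    print(x, y, round(pmi(*counts, n_total), 2))
```

In practice the choice of metric (PMI, chi-square, relative frequency, and so on) depends on whether the goal is to surface well-supported (testable) or rare (potentially novel) pairings.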
8.4 Evaluating the Output of In Silico Hypothesis Generation Tools and Methods
The verification and validation of conceptual knowledge collections, and of the
results of intelligent agents that leverage such knowledge to reason over data sets,
is ideally approached as an iterative, multi-method evaluation process. First and
foremost, when designing and applying such evaluation plans, it is very important
to recognize and understand what types of process or outcome measures are being
targeted. Attaining such an understanding, in the context of intelligent agent design,
requires us to differentiate between verification and validation. To summarize the
definitions provided earlier, verification is the evaluation of whether an intelligent
agent meets the perceived requirements of end users, and validation is the evaluation
of whether that same agent meets the realized (i.e., "real-world") requirements
of those end users. The only difference between these techniques is the point of
comparison: during verification, results are compared to the initial design
requirements, whereas during validation, results are compared to the requirements
for the system as they are realized after its implementation.
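As a minimal illustration of this distinction, the sketch below checks an agent's discovered relationships against two requirement sets: one fixed at design time (verification) and one elicited from end users after deployment (validation). The relationship triples, requirement sets, and coverage measure are all hypothetical assumptions made for illustration.

```python
# A minimal sketch contrasting verification and validation of an
# intelligent agent's output. All names and data are hypothetical.

def requirement_coverage(agent_output, requirements):
    """Fraction of required relationships present in the agent's output."""
    found = sum(1 for req in requirements if req in agent_output)
    return found / len(requirements) if requirements else 1.0

# Relationships (subject, predicate, object) discovered by the agent.
agent_output = {
    ("gene_A", "upregulates", "gene_B"),
    ("gene_B", "is_associated_with", "phenotype_X"),
}

# Verification: compare output to the *initial design* requirements.
design_requirements = {("gene_A", "upregulates", "gene_B")}

# Validation: compare output to requirements *realized* by end users
# after the system was implemented and put to real-world use.
realized_requirements = {
    ("gene_B", "is_associated_with", "phenotype_X"),
    ("gene_A", "is_associated_with", "phenotype_X"),  # an emergent, unmet need
}

print(f"Verification coverage: {requirement_coverage(agent_output, design_requirements):.2f}")
print(f"Validation coverage:   {requirement_coverage(agent_output, realized_requirements):.2f}")
```

The same output can thus pass verification while falling short during validation, which is precisely why both assessments are needed.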
Examples of verification and validation criteria include the degree of
interrelatedness of the relationships discovered by the intelligent agent, the logical
consistency of those relationships, and multiple-source or expert agreement with
the results generated therein. Often, the degree of interrelatedness between
relationships generated by an intelligent agent for hypothesis discovery purposes
is used as a measure of its "quality", with such "quality" being defined by the degree to