look up entries for HTN mentions and BP values. These gaps are addressed in the
new HTNSystem by adding components and modules. For example, the previous
system failed to identify and infer HTN information in "Initial blood pressure systolic
218 was reduced to 190 range". After Ruta rules derived from error analysis were added,
the HTNSystem was able to identify these annotations. The previous system also identified
"blood pressure and improved with carotid sinus massage during exam on 01/96"
as a blood pressure value of 01/96, which is actually a date. These false positives
were filtered out using more robust context-based rules and filtering built into the
HTNSystem components. As a result, the performance of the current HTNSystem
is significantly improved, as is evident from the F-measure. It correctly identified
471 of 537 HTN mentions, achieving a recall of 0.8770 and a precision of
0.7863, with 128 false positives and 66 false negatives. The overall performance
metrics on the test set, based on the provided gold standard annotations, are shown
in Table 3. This improvement can be attributed to the improved BP value extractor
component and to the new context-based BP value abbreviation and post-processing
components developed in this work.
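The date-versus-reading confusion described above can be illustrated with a small sketch. This is not the authors' Ruta rule set; it is a Python approximation under stated assumptions. The function name `extract_bp_values`, the cue-word list, and the plausibility ranges are all hypothetical, and the sketch only handles the "number/number" pattern, not single-value mentions such as "systolic 218".

```python
import re

# Candidate "number/number" tokens, e.g. "220/140" or "01/96".
PAIR = re.compile(r"\b(\d{1,3})/(\d{1,3})\b")

# Hypothetical cue words that, when directly preceding the pair,
# suggest a date rather than a reading (illustrative list only).
DATE_CUE = re.compile(r"\b(on|in|since|dated)\s*$", re.IGNORECASE)


def extract_bp_values(text):
    """Return (systolic, diastolic) pairs that plausibly are BP readings."""
    readings = []
    for match in PAIR.finditer(text):
        raw_sys, raw_dia = match.group(1), match.group(2)
        # A leading zero (e.g. "01") suggests a month, not a pressure.
        if raw_sys.startswith("0") or raw_dia.startswith("0"):
            continue
        systolic, diastolic = int(raw_sys), int(raw_dia)
        # Assumed plausibility ranges in mmHg; not taken from the paper.
        if not (60 <= systolic <= 300 and 30 <= diastolic <= 200):
            continue
        # Systolic pressure should exceed diastolic pressure.
        if systolic <= diastolic:
            continue
        # Context check: a date cue immediately before the pair.
        if DATE_CUE.search(text[:match.start()]):
            continue
        readings.append((systolic, diastolic))
    return readings


date_sentence = ("blood pressure and improved with carotid sinus "
                 "massage during exam on 01/96")
bp_sentence = "blood pressure, and found it to be 220/140"
print(extract_bp_values(date_sentence))  # []
print(extract_bp_values(bp_sentence))    # [(220, 140)]
```

The sketch layers a token-shape check, a value-range check, and a left-context check, so that any one of them can reject a date such as 01/96 while a reading such as 220/140 passes all three.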
Table 3. TMUNSW & HTNSystem performance metrics for HTN mentions

System       TP    FP    FN    Precision   Recall   F-measure
TMUNSW       421   336   116   0.556       0.783    0.650
HTNSystem    471   128   66    0.7863      0.877    0.829
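The figures in Table 3 follow from the standard definitions of precision, recall and F-measure. As a minimal sketch (the function name `precision_recall_f1` is our own, and a balanced F1 is assumed since the text does not state the weighting), the HTNSystem row can be recomputed from its TP, FP and FN counts:

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and balanced F-measure from raw counts."""
    precision = tp / (tp + fp)   # share of predicted mentions that are correct
    recall = tp / (tp + fn)      # share of gold mentions that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts for HTNSystem as reported in Table 3.
p, r, f = precision_recall_f1(tp=471, fp=128, fn=66)
print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```

Note that TP + FN = 471 + 66 = 537, which matches the total number of gold HTN mentions quoted in the text.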
4 Discussion
In this study, we presented HTNSystem, a highly configurable and generic information
extraction system capable of extracting HTN information from records. HTNSystem
is a package of custom-built components and MetaMap. The HTNSystem performed
well overall on the corpus, but results may vary depending on the
corpus. The performance of the HTNSystem is very similar to that of
other systems. However, those systems extracted either HTN mentions or BP values
[5, 6, 12, 13]. A patient's BP can be elevated for various reasons, and the
HTNSystem did not consider HTN medications or treatment information to infer HTN.
We selected a few such records and tried to identify rules or context patterns, but
unfortunately did not find enough information to classify those mentions as HTN. The
performance of the system can be further improved by testing it on other
corpora [12, 13]. The system demonstrates the feasibility of its application in
identifying HTN as a risk factor for other diseases and identifying cohorts based on HTN
information.
The HTN annotations made by HTNSystem were manually reviewed, and it was
discovered that the system missed a few HTN mentions in sentences like "blood pressure,
and found it to be 220/140". In this case the system failed to identify the BP value be-