4.1 False Negative Error Cases
The test set contains some topmost section headings that are rarely used in EHRs.
These headings, such as “microbiology” and “habits”, appeared only a few times in
the training set. Because these section names are so sparse, it is difficult for the
machine learning-based section tagger to recognize them. We also observed
that some records used non-standard or idiosyncratic topmost section headings,
including abbreviations, which were likewise difficult to recognize. Non-standard
section headings and abbreviations found in the test set include “All” for
“allergy” and “ROS” for “Review of Systems”.
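The sparseness described above can be made concrete by counting how often each test-set heading occurs in the training data. The following is only a minimal sketch: the function name `rare_or_unseen`, the threshold `min_count`, and the heading lists are all hypothetical and are not taken from the original experiments.

```python
from collections import Counter

def rare_or_unseen(train_headings, test_headings, min_count=3):
    """Return test headings seen fewer than min_count times in training.

    The result maps each flagged heading (lower-cased) to its training count.
    """
    counts = Counter(h.strip().lower() for h in train_headings)
    flagged = {}
    for heading in test_headings:
        key = heading.strip().lower()
        if counts[key] < min_count:  # a Counter returns 0 for unseen keys
            flagged[key] = counts[key]
    return flagged

# Hypothetical heading lists, invented purely for illustration.
train = ["Allergies"] * 40 + ["Review of Systems"] * 25 + ["Microbiology"] * 2
test = ["Allergies", "Microbiology", "Habits", "ROS", "All"]

print(rare_or_unseen(train, test))
```

In this toy data, “ROS” and “All” have a training count of zero even though their expanded forms are frequent, which mirrors the abbreviation problem noted above.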
4.2 False Positive Error Cases
Occasionally, the trained CRF-based section tagger labels non-section text or
probable subsection headings of an EHR as topmost section headings; these are the
main source of FP cases.
For example, in the following snippet of a record: “The patient is a 75-year-old white
female with past medical history significant for throat cancer,” the tagger erroneously
recognized the narrative phrase “medical history” as a section heading. In
addition, some section headings, such as “laboratory”, are topmost section
headings in one EHR but not in others. This inconsistency may also
contribute to FP/FN cases.
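For reference, the FP and FN counts discussed in this section feed into precision, recall, and F-score in the usual way. The sketch below assumes the balanced F1 measure (the text does not state which F-score variant was used), and the function name `precision_recall_f1` is ours.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and balanced F1 from raw counts.

    tp: headings correctly recognized
    fp: spans wrongly tagged as headings (Section 4.2)
    fn: true headings the tagger missed (Section 4.1)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts, for illustration only.
p, r, f = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Each additional false positive lowers precision only, and each additional false negative lowers recall only; both lower the F-score.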
5 Conclusion
This work presented a CRF-based method, with a set of purpose-built features, for
recognizing section headings in EHRs. The experimental results showed that the proposed
CRF-based method clearly outperforms the dictionary-based approach in terms of
precision and F-score. The proposed layout features, which capture line break
information, model the original layout that medical doctors used to
increase readability. Adding the layout features to our method improved the
performance of section heading recognition, as the experimental results show.
Nevertheless, the variety of section heading hierarchies across EHRs and the
arbitrary naming of section headings, such as non-standard abbreviations, remain
challenging problems for section heading recognition.
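To illustrate what layout features based on line breaks might look like, the sketch below builds one feature dictionary per token. It is not the feature set used in this work: the function name `line_layout_features` and every feature name in it are assumptions chosen for illustration. Dictionaries of this kind could be passed to a CRF toolkit that accepts per-token features.

```python
def line_layout_features(text):
    """Build one feature dict per token, recording nearby line breaks.

    All feature names are hypothetical; they only show how line break
    information can be exposed to a sequence tagger.
    """
    features = []
    for line in text.split("\n"):
        tokens = line.split()
        for i, token in enumerate(tokens):
            features.append({
                "token": token.lower(),
                "first_on_line": i == 0,                 # follows a line break
                "last_on_line": i == len(tokens) - 1,    # precedes a line break
                "line_length": len(tokens),              # headings tend to be short
                "ends_with_colon": token.endswith(":"),
            })
    return features

# Hypothetical record fragment, for illustration only.
record = (
    "PAST MEDICAL HISTORY:\n"
    "The patient has a past medical history of throat cancer."
)
for feats in line_layout_features(record):
    print(feats)
```

In this fragment the token “medical” occurs twice. In the heading it sits on a three-token line that ends with a colon, whereas in the narrative sentence it sits in the middle of a ten-token line, so the layout features separate the two occurrences even though the word is identical.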