Section Heading Recognition in Electronic Health Records Using Conditional Random Fields - Technologies and Applications of Artificial Intelligence

name, hospital name, identity number, carbon copy, merged sections such as “assessment and plan”, and so on.

In our experiment, set1 was used as the training set for the CRF model, and set2 was used as the development set for designing features. Finally, the test set was used to evaluate the performance of the developed model trained on set1.

3.2 Evaluation Schemes

The standard precision, recall and F-measure metrics (PRF) were used to evaluate the performance of the developed CRF model and to compare it with a dictionary-based method.

$$\text{Precision} = \frac{\text{the number of correctly recognized Section Heading chunks}}{\text{the number of recognized Section Heading chunks}}$$

$$\text{Recall} = \frac{\text{the number of correctly recognized Section Heading chunks}}{\text{the number of true Section Heading chunks}}$$
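The F-measure formula is not reproduced in this excerpt. Assuming the usual balanced F-measure (the harmonic mean of precision and recall, an assumption rather than something stated here), it would read:

$$F = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$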

This work defines a correctly recognized section heading chunk (a true positive, TP) as one whose recognized text span exactly matches the span of the manually annotated heading. Consequently, any section heading generated by the system that does not exactly match an annotated span, including a partial overlap, counts as a false positive (FP).
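The exact-match scheme described above can be sketched in a few lines of Python. This is only an illustration: representing a heading as a `(start, end)` offset pair, the function name `exact_match_prf`, and the balanced F-measure are assumptions, not details given in this section.

```python
from typing import Iterable, Tuple

# Hypothetical representation: a heading chunk is a (start, end) offset pair.
Span = Tuple[int, int]


def exact_match_prf(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F) under exact span matching."""
    gold_set = set(gold)
    pred_set = set(predicted)
    # A true positive is a predicted span identical to an annotated span.
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    # Balanced F-measure (assumed); defined as 0 when both P and R are 0.
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Toy data: the span (40, 50) only partially overlaps the annotated (40, 52),
# so it counts as a false positive, as does the spurious span (130, 140).
gold = [(0, 15), (40, 52), (90, 109)]
predicted = [(0, 15), (40, 50), (90, 109), (130, 140)]
p, r, f = exact_match_prf(gold, predicted)
```

In this toy case two of the four predicted spans are exact matches, so precision is 2/4 and recall is 2/3. The partially overlapping span earns no credit, which is what makes the exact-match criterion strict.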

3.3 Experiment Configurations and Results

As a baseline for comparison with the CRF-based method, this work developed a dictionary-based method built on the maximum matching algorithm. Three dictionaries were used by the dictionary-based method: the SecTag section header terminology (the “SecTag” configuration), the section heading names collected from the training set¹ (the “Training” configuration), and the union of the two dictionaries (the “SecTag+Training” configuration). In addition, to study the effect of the proposed layout features, this work trained two CRF models: one with all proposed feature sets (CRF-based+Layout Features), and one that excluded the layout features (CRF-based).
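To make the baseline concrete, the following is a minimal sketch of forward maximum matching over tokens. The token-level, case-insensitive matching, the function name `max_match_headings`, and the toy dictionaries are all assumptions for illustration; the excerpt does not specify these details, and the toy entries are not taken from SecTag or the training data.

```python
from typing import Iterable, List, Tuple


def max_match_headings(tokens: List[str], dictionary: Iterable[str]) -> List[Tuple[int, int]]:
    """Greedy forward maximum matching: at each position, take the longest
    dictionary entry that starts there. Returns (start, end) token spans."""
    entries = {tuple(entry.lower().split()) for entry in dictionary}
    entries.discard(())  # ignore empty entries
    longest = max((len(e) for e in entries), default=0)
    lowered = [t.lower() for t in tokens]
    spans = []
    i = 0
    while i < len(lowered):
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(longest, len(lowered) - i), 0, -1):
            if tuple(lowered[i:i + n]) in entries:
                spans.append((i, i + n))
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return spans


# Hypothetical dictionaries standing in for the three configurations.
sectag_like = {"history of present illness", "assessment", "plan"}
training_like = {"assessment and plan", "hospital course"}
combined = sectag_like | training_like

tokens = "History of Present Illness : fever . Assessment and Plan : rest".split()
spans_sectag = max_match_headings(tokens, sectag_like)
spans_combined = max_match_headings(tokens, combined)
```

With the combined dictionary, the merged heading "Assessment and Plan" is matched as one chunk, whereas the first dictionary alone splits it into "Assessment" and "Plan". Under the exact-match evaluation, such a split would count as two false positives and one missed heading.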

The experimental results of the five configurations are shown in Table 2. The best recall on both datasets was achieved by the dictionary-based method that did not use the section names from SecTag (the “Training” configuration). However, the CRF-based method noticeably outperforms the dictionary-based methods in terms of precision and F-score.

On the test set, the best CRF configuration achieved a precision of 0.955, which is 0.496 higher than that of the best dictionary-based configuration. Adding the layout features improved the precision and recall of the CRF-based method by 0.112 and 0.174, respectively. The CRF-based method with layout features achieved the best performance in terms of PRF. Similar observations can be made on the development set.

¹ When testing on the test set, the section names from set2 were also added to the dictionary.
