Recognition of Meter Values - Hierarchical Neural Networks for Image Interpretation


are recognized with high confidence. Part (b) shows examples where recognition failed. Segmentation problems seem to be the most frequent reason for substitutions. If the digit is not tightly framed and some additional foreground structure is present in the normalized image, it deviates from the typical appearance and is therefore difficult to recognize. Missing digit parts also complicate recognition. Unusual context may also cause substitutions, as in the example in the second row of the figure. Here, the left digit has been correctly recognized as 4 by the block recognizer, but the right digit 0 occurs only very rarely next to a 4 in the dataset. The digits 6, 4, and 1 are much more common in this context. Consequently, the digit is recognized as 6 with medium confidence. While the use of context information does not seem to be beneficial in this particular example, in general it facilitates recognition, as can be concluded from the following control experiment.

The same network was trained with a context vector that was set to zero. Without access to the context information, the classification performance degrades. The best test performance for the left digit now corresponds to a substitution rate of 1.91%. The best right digit classifier substitutes as many as 7.25% of the test images when all examples are accepted. These figures show the importance of context for the recognition of isolated digits.

7.5.3 Combination with Block Recognition

Digit recognition is not done for all examples, but only if the block recognizer is not confident enough. If its classification confidence for one of the digits is below a threshold ρ, this digit is preprocessed and presented to the digit classifier. When ρ = 0.9 is chosen, 134 (12.2%) of the 1,099 test blocks are rejected. Among these, 82 (61%) left digits and 101 (75%) right digits are ambiguous.
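The gating step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block classifier output is assumed to split into one 10-element section per digit, the confidence is assumed to be the gap between the two most active outputs (the definition the text uses for the combined output), and the names `confidence` and `digits_to_recheck` as well as the example activities are hypothetical.

```python
RHO = 0.9  # rejection threshold rho used in the text


def confidence(outputs):
    """Gap between the most active and the second most active output."""
    top, second = sorted(outputs, reverse=True)[:2]
    return top - second


def digits_to_recheck(left_section, right_section, rho=RHO):
    """Return the names of the digits whose block-level confidence is below rho."""
    sections = {"left": left_section, "right": right_section}
    return [name for name, out in sections.items() if confidence(out) < rho]


# Hypothetical block output (assumed layout: 10 outputs per digit).
# The left digit is a clear 4; the right digit is torn between 0 and 6.
left = [0.0] * 10
left[4] = 0.98
right = [0.0] * 10
right[0], right[6] = 0.5, 0.4

print(digits_to_recheck(left, right))  # ['right']
```

Only the right digit falls below the threshold here, so only that digit would be preprocessed and passed on to the digit classifier.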

The outputs of the digit recognizer need to be combined with those of the block classifier. This is done by computing the average output v_c = (v_b + v_d) / 2, where v_d denotes the output vector of the digit classifier and v_b is the corresponding section of the block classifier output. The digit's confidence is again set to the difference between the most active and the second most active combined output. It does not exceed the higher of the two digit confidences.
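As a minimal sketch of this combination rule, the following Python snippet averages the two output vectors element-wise and derives the class and confidence from the result. The function names and the example activities are hypothetical and chosen only for illustration; the 10-class layout is an assumption.

```python
def confidence(outputs):
    """Gap between the most active and the second most active output."""
    top, second = sorted(outputs, reverse=True)[:2]
    return top - second


def combine(v_b, v_d):
    """Average the block-classifier section v_b and the digit-classifier output v_d.

    Returns (winning class index, confidence, combined vector v_c).
    """
    v_c = [(b + d) / 2 for b, d in zip(v_b, v_d)]
    label = max(range(len(v_c)), key=v_c.__getitem__)
    return label, confidence(v_c), v_c


def one_hot(index, activity, size=10):
    """Hypothetical helper: a vector with a single active output."""
    v = [0.0] * size
    v[index] = activity
    return v


# Both classifiers favour class 4: the combined output stays confident.
print(combine(one_hot(4, 1.0), one_hot(4, 0.5))[:2])   # (4, 0.75)

# The classifiers favour different classes: the confidence collapses.
print(combine(one_hot(4, 1.0), one_hot(6, 1.0))[:2])   # (4, 0.0)

# One classifier is inactive: the other's vote survives at half strength.
print(combine([0.0] * 10, one_hot(6, 1.0))[:2])        # (6, 0.5)
```

In the tie of the second example, `max` returns the first of the equally active classes, so the label itself carries no information; the zero confidence is what signals the disagreement.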

Figure 7.18 illustrates some typical cases of output combination. If both classifiers are confident and agree on the class, the combined output is confident. If the classifiers disagree on the class, the output is not confident. If one classifier is silent,

Fig. 7.18. Combination of outputs from block classifier and digit classifier: (a) both classifiers agree; (b) both classifiers disagree; (c) one classifier is inactive, while the other is confident; (d) one classifier is undecided between two classes, while the other is confident.
