As the number of training samples $N$ increases, the trained filter approaches the optimal filter for that window, i.e., $\psi_{n,N} \to \psi_n$.

The error between the optimum filter and the filter implemented within an n-point window and trained on N training samples consists of two components:

$$E[\psi_{n,N}, \psi_{\mathrm{opt}}] = E[\psi_n, \psi_{\mathrm{opt}}] + E[\psi_{n,N}, \psi_n] \qquad (4.2)$$

total error = constraint error + estimation error
The first component is known as the constraint error and is due to the filter being restricted to an n-point window. The second component is known as the estimation error and results from the fact that the number of training samples is finite. The constraint error is deterministic, i.e., it is fixed and repeatable for a given problem. The estimation error is stochastic. This means that it is a statistical quantity and will vary if the design process is repeated a number of times with different training data.
As was seen in earlier examples, the constraint error reduces with increasing n: the bigger the window, the more accurate the filter can be.
The estimation error reduces with increased training, as can be seen in Fig. 4.9(a). Notice that the estimation error for the smaller windows converges very rapidly. However, for some of the larger windows the convergence is very slow; even after 700,000 samples the 21-point window shows a larger estimation error than the smaller windows did at the start. This occurs because the filter is undertrained, i.e., the amount of training data is insufficient. The amount of data required to reduce the estimation error to a reasonable level may be impossibly large. When the estimation error is combined with the constraint error, the total error versus training data is as shown in Fig. 4.9(b). The filters implemented in the smaller windows converge very quickly. The filters implemented in larger windows eventually converge to a lower error, but this can take a long time. For any given amount of data, a different window size might give the lowest error. For example, after 100,000 samples the 9-point window gives the best filter, but by 200,000 samples it has been superseded by the 13-point window. Eventually the 21-point window will give the lowest error, but this point is still a long way off. In fact, even after 700,000 samples, the results of filtering with the 21-point window are still worse than the original noisy image.
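
This behaviour can be reproduced in miniature. The sketch below is a hypothetical one-dimensional analogue, not the experiment behind Fig. 4.9: a binary signal is corrupted by salt-and-pepper noise, an n-point window filter is trained by taking a majority vote of the clean centre value for each observed noisy pattern (one simple estimate of the MAE-optimal filter), and the estimation error is approximated as the gap between a filter trained on N samples and one trained on a very large sample, which stands in for the converged filter psi_n. All names and parameter values (make_data, train_filter, the noise rate, the sample counts) are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_data(length, p_noise=0.05):
        # Clean binary signal with long runs, plus salt-and-pepper noise
        # that flips each sample with probability p_noise.
        clean = (np.cumsum(rng.random(length) < 0.02) % 2).astype(np.uint8)
        noisy = clean ^ (rng.random(length) < p_noise).astype(np.uint8)
        return clean, noisy

    def patterns(signal, n):
        # Encode every n-point window of the signal as an integer pattern id.
        half = n // 2
        padded = np.pad(signal, half, mode="edge")
        idx = np.arange(len(signal))[:, None] + np.arange(n)[None, :]
        return padded[idx] @ (1 << np.arange(n))

    def train_filter(clean, noisy, n):
        # Majority vote of the clean centre value per observed noisy pattern;
        # patterns never seen in training default to output 0 (a simplification).
        pats = patterns(noisy, n)
        ones = np.bincount(pats, weights=clean, minlength=1 << n)
        seen = np.bincount(pats, minlength=1 << n)
        return (ones > seen / 2).astype(np.uint8)

    def mae(table, clean, noisy, n):
        # Mean absolute error of the filter over a test signal.
        return np.mean(table[patterns(noisy, n)] != clean)

    n = 5
    test_clean, test_noisy = make_data(200_000)

    # Heavily trained filter as a stand-in for the converged psi_n.
    big_clean, big_noisy = make_data(2_000_000)
    floor = mae(train_filter(big_clean, big_noisy, n),
                test_clean, test_noisy, n)

    for N in (1_000, 10_000, 100_000):
        tr_clean, tr_noisy = make_data(N)
        gap = mae(train_filter(tr_clean, tr_noisy, n),
                  test_clean, test_noisy, n) - floor
        print(f"N={N:>9}: estimation error ~ {gap:.4f}")

Re-running the sketch with a different seed changes the printed gaps (the stochastic estimation part) while the heavily trained filter's error stays essentially fixed (the deterministic constraint part), echoing the distinction drawn above.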
To illustrate this point, the results of Fig. 4.9 are presented differently in Fig. 4.10(a). The total error for any given filter is plotted against window size for fixed amounts of training data. For any size of training set, the error will fall to a minimum as the size of the window increases, after which it will rise very rapidly. Increasing the training set by an order of magnitude only serves to move the minimum to a slightly larger window size.
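
The shape of Fig. 4.10(a) can be mimicked with the same hypothetical helpers sketched above: for a fixed training-set size N, sweep the window size and keep the minimiser.

    # Total test error against window size for fixed training-set sizes,
    # in the spirit of Fig. 4.10(a); reuses the sketch helpers above.
    for N in (10_000, 100_000):
        tr_clean, tr_noisy = make_data(N)
        errs = {n: mae(train_filter(tr_clean, tr_noisy, n),
                       test_clean, test_noisy, n)
                for n in (3, 5, 7, 9, 11, 13)}
        best = min(errs, key=errs.get)
        print(f"N={N:>7}: best window n={best}, error={errs[best]:.4f}")

Under these assumptions, the error-minimising window size creeps upward as N grows by an order of magnitude, mirroring the behaviour described above.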
Depending on the problem, a smaller window might be sufficient. In the case of the graph in Fig. 4.10(b), the corrupting process was 5% salt-and-pepper noise. A small window size (5 points) was capable of removing much of the noise and