Image Processing Reference
generally empty surrounding areas. As better analysis of corner projections and Hough lines
is integrated into our algorithm, it will become possible to classify inputs as definitely
traditional or more irregular. If this classification works reliably, the method could switch to a
much slower, more general localization to produce better results for irregular layouts while still
quickly returning results for more common layouts.
FIGURE 16 Irregular NL layouts.
We have made several interesting observations during our experiments. The row and column
projections have two distinct patterns. The row projection tends to create evenly spaced short
spikes, one for each line of text within the NL, while the column projection tends to contain
one very large spike where the NL begins at the left due to the sudden influx of detected text.
We have not performed any in-depth analysis of these patterns. However, the projection data
were collected for each processed image. We plan to do further investigations of these pat-
terns, which will likely allow for run-time detection and corresponding correction of inputs of
various rotations. For example, the column projections could be used for greater accuracy in
determining the left and right bounds of the NL while row projections could be used by later
analysis steps such as row division. Certain projection profiles could eventually be used to se-
lect customized localization approaches at run time.
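
The row and column projections described above can be illustrated with a minimal sketch. It assumes the corner detector's output is available as a binary grid (a list of rows, nonzero meaning a detected corner). The function names, the toy grid, and the half-of-maximum cutoff used to find the large column spike are illustrative assumptions, not part of the method described here.

```python
def corner_projections(corner_map):
    """Return (row_projection, column_projection) for a binary corner map.

    corner_map is a list of equal-length rows; a nonzero entry marks a
    detected corner (assumed input format).
    """
    if not corner_map:
        return [], []
    width = len(corner_map[0])
    # Row projection: number of corners found in each image row.
    row_proj = [sum(1 for v in row if v) for row in corner_map]
    # Column projection: number of corners found in each image column.
    col_proj = [sum(1 for row in corner_map if row[x]) for x in range(width)]
    return row_proj, col_proj


def first_large_spike(projection, fraction=0.5):
    """Index of the first bin reaching `fraction` of the maximum, or None.

    The cutoff fraction is a hypothetical tuning parameter.
    """
    peak = max(projection, default=0)
    if peak == 0:
        return None
    cutoff = fraction * peak
    for i, v in enumerate(projection):
        if v >= cutoff:
            return i
    return None


# Toy example: three "text lines" that all start at column 2.
toy_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
]
toy_rows, toy_cols = corner_projections(toy_map)
toy_left = first_large_spike(toy_cols)
```

On this toy grid the row projection alternates between empty rows and short spikes, while the column projection jumps at the column where the text begins, which is the behavior the left-bound estimate would rely on.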
While experimenting with and iteratively developing this algorithm, we took note of
several possible improvements that could positively affect the algorithm's performance. First,
since input images are generally not square, the HT returns more results for lines in the longer
dimension, because they are more likely to pass the threshold. Consequently, specifying dif-
ferent thresholds for the two dimensions and combining them for various rotations may pro-
duce more consistent results.
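
One way to picture the first improvement is to scale each vote threshold by the longest line that can exist in that direction. The sketch below is only an assumption about how such thresholds might be chosen: the `fraction` parameter, the `'h'`/`'v'` orientation labels, and both function names are hypothetical, and the combination of thresholds for rotated inputs is not covered.

```python
def directional_thresholds(width, height, fraction=0.25):
    """Vote thresholds for (horizontal, vertical) lines.

    A horizontal line can collect at most `width` votes and a vertical
    line at most `height`, so each threshold is a fixed fraction of
    that maximum (hypothetical rule).
    """
    h_threshold = max(1, round(fraction * width))
    v_threshold = max(1, round(fraction * height))
    return h_threshold, v_threshold


def filter_lines(lines, width, height, fraction=0.25):
    """Keep lines whose votes meet the threshold for their orientation.

    `lines` is an iterable of (orientation, votes) pairs, where
    orientation is 'h' or 'v' (assumed representation).
    """
    h_threshold, v_threshold = directional_thresholds(width, height, fraction)
    kept = []
    for orientation, votes in lines:
        threshold = h_threshold if orientation == "h" else v_threshold
        if votes >= threshold:
            kept.append((orientation, votes))
    return kept


# A 640x480 image: a vertical line with 150 votes passes, while a
# horizontal line with the same vote count does not.
example_kept = filter_lines(
    [("h", 150), ("v", 150), ("h", 170), ("v", 100)], 640, 480
)
```

With a single shared threshold, the two lines with 150 votes would be treated identically even though the vertical one spans a much larger share of its dimension.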
Second, since only those Hough lines that are nearly vertical or horizontal are of use to this
method, improvements can be made by only allocating bins for those Θ and ρ combinations
that are considered important. Fewer bins mean less memory to track all of them and fewer
tests to determine which bins need to be incremented for a given input.
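
The restricted-bin idea can be sketched as follows, assuming the usual normal form ρ = x cos Θ + y sin Θ, in which Θ near 0° corresponds to vertical lines and Θ near 90° to horizontal lines. The tolerance, the angular step, and the function names are illustrative assumptions. This is a plain accumulator written for clarity, not the implementation used in the experiments.

```python
import math


def restricted_thetas(tolerance_deg=5, step_deg=1):
    """Angles in degrees within `tolerance_deg` of 0 or 90."""
    offsets = range(-tolerance_deg, tolerance_deg + 1, step_deg)
    return [d for d in offsets] + [90 + d for d in offsets]


def hough_restricted(points, width, height, tolerance_deg=5, step_deg=1):
    """Accumulate votes only for near-vertical and near-horizontal lines.

    Returns (thetas, accumulator, diag). The vote for angle thetas[i]
    and integer rho is stored at accumulator[i][rho + diag].
    """
    thetas = restricted_thetas(tolerance_deg, step_deg)
    # |rho| never exceeds the image diagonal, so rho lies in [-diag, diag].
    diag = math.ceil(math.hypot(width, height))
    acc = [[0] * (2 * diag + 1) for _ in thetas]
    # Precompute cos/sin once per allocated angle.
    trig = [(math.cos(math.radians(t)), math.sin(math.radians(t))) for t in thetas]
    for x, y in points:
        for i, (c, s) in enumerate(trig):
            rho = round(x * c + y * s)
            acc[i][rho + diag] += 1
    return thetas, acc, diag


# Ten points on the horizontal line y = 3 in a 10x10 image.
h_points = [(x, 3) for x in range(10)]
h_thetas, h_acc, h_diag = hough_restricted(h_points, 10, 10)
h_votes = h_acc[h_thetas.index(90)][3 + h_diag]

# Ten points on the vertical line x = 4 in the same image.
v_points = [(4, y) for y in range(10)]
v_thetas, v_acc, v_diag = hough_restricted(v_points, 10, 10)
v_votes = v_acc[v_thetas.index(0)][4 + v_diag]
```

With a 5° tolerance and a 1° step, the accumulator holds 22 angle rows instead of the 180 needed for a full sweep at the same resolution, and each input point updates 22 bins rather than 180.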
Third, both row and column corner projections tend to produce distinct patterns, which
could be used to determine better boundaries. After collecting a large number of typical
projections, further analysis can be performed to find generalizations resulting in a faster method
to improve boundary selection.
Fourth, in principle, a much more intensive HT method can be developed that would divide
the image into a grid of smaller segments and perform a separate HT within each segment.
One advantage of this approach is the ability to look for skewed, curved, or even zigzagging lines
between segments that could actually be connected into a longer line. While the performance