1 Introduction
Protein crystallography is one of the major research areas in the drug discovery industry because
it provides information about the 3D structure of a protein and its functionality [1]. Growing a
protein crystal is a complex process that comprises several sensitive stages. Every stage
requires close attention, since parameters such as pH, temperature, and the type and proportion
of the salt and the precipitant need to be set carefully. (Note that millions of different
solutions can be generated by combining different chemicals for the protein crystallization process.)
Therefore, growing a crystal usually requires many trials, and most trials do not yield
the desired protein crystal [2]. In non-automated systems, experts must manually check hundreds
of protein images to detect crystal formation, which is a time-consuming process [1]. For this
reason, an automated system for detecting and analyzing protein crystals is important for
saving experts time and effort. Several automated systems are in use, and most of them
typically use regional, geometrical, or texture features of the protein images to detect and
classify crystals. Extracting these features correctly requires accurate image segmentation.
Image binarization (thresholding) is one of the most widely used preprocessing tasks in these
systems. Thresholding is an operation that converts a grayscale image into a black-and-white
image using a threshold value τ. The value of τ can be selected using different techniques, and
thresholding methods fall into two main groups based on how τ is selected: global and local
thresholding. If an image is binarized using a single τ value, it is called “global
thresholding.” If the τ value varies with pixel position according to local features of the
image, it is called “local thresholding” [3]. In the literature, many studies focus on
different aspects of the binarization problem. Each study focuses on its own problem domain to
find the best approach for binarization [4], and there is no optimal solution that works for
all cases.
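
The difference between the two groups can be illustrated with a minimal Python sketch. It is
not taken from any of the cited systems: the function names, the list-of-lists image format,
and the choice of the neighbourhood mean as the local τ are assumptions made only for
illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of intensities (e.g. 0..255)


def global_threshold(img: Image, tau: float) -> Image:
    """Global thresholding: one tau is applied to every pixel."""
    return [[1 if p > tau else 0 for p in row] for row in img]


def local_mean_threshold(img: Image, radius: int = 1, offset: float = 0.0) -> Image:
    """Local thresholding (illustrative variant): tau for each pixel is the
    mean of its (2*radius+1) x (2*radius+1) neighbourhood, clipped at the
    image border, minus `offset`."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [img[j][i] for j in ys for i in xs]
            tau = sum(window) / len(window) - offset  # tau depends on (x, y)
            out[y][x] = 1 if img[y][x] > tau else 0
    return out


# Hypothetical image: a dark half and a bright half, each with one brighter spot.
img = [
    [10, 10, 10, 100, 100, 100],
    [10, 60, 10, 100, 160, 100],
    [10, 10, 10, 100, 100, 100],
]
global_result = global_threshold(img, tau=80)
local_result = local_mean_threshold(img, radius=1)
```

With this hypothetical image, the single threshold τ = 80 misses the spot of intensity 60 in
the dark half, while the local variant marks it as foreground because its τ follows the
neighbourhood. The local variant also marks pixels on the boundary between the two halves,
which shows that neither group is free of failure cases.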
All thresholding techniques have been developed based on some assumptions, and every method
has strengths and weaknesses. Therefore, any method may fail under some circumstances. For
example, Otsu's thresholding method, one of the most popular thresholding methods in the
literature, is affected by the size of the objects in the image. If the object is too large or
too small, this method will probably fail to segment the image, which may lead to the
extraction of incorrect features for the system [5]. As another example, most document
binarization methods assume that the document has a whitish background. If the document has a
dark background, those methods will generate improper binary images. Thus, no single
thresholding method can generate proper binary images for all images in datasets such as
medical images, biological images, and especially
protein crystallization images.
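
The sensitivity of Otsu's method to object size can be seen in a small sketch. The code below
is a plain-Python rendering of the standard formulation (choose the threshold that maximizes
the between-class variance of the histogram); the function name and the toy pixel values are
assumptions for illustration and are not data from this study.

```python
from typing import Sequence


def otsu_threshold(pixels: Sequence[int], levels: int = 256) -> int:
    """Return the threshold t that maximizes the between-class variance,
    where class 0 holds pixels <= t and class 1 holds pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of class 0
    sum0 = 0  # intensity sum of class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break     # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy data (hypothetical): two equally sized classes are separated cleanly.
balanced = [10] * 50 + [200] * 50
t_balanced = otsu_threshold(balanced)

# Toy data (hypothetical): a one-pixel bright object on a two-tone background.
tiny_object = [40] * 500 + [60] * 500 + [200]
t_tiny = otsu_threshold(tiny_object)
```

For the balanced data the threshold falls between the two intensity groups. For the second
data set the sketch returns a threshold inside the background range (40), so the single bright
pixel is merged with the brighter background tone instead of being isolated, which is the kind
of failure described above for very small objects.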
Crystal images are expected to have characteristic features such as high-intensity regions,
clear edges, and proper geometric shapes. These features are mostly used in the classification
process, but in some cases they may not be observed clearly due to focusing or reflection
problems in the image. Capturing clear images is an important step for extracting reliable
features as well as for binarizing images correctly. Most protein images have non-uniform
illumination, low contrast, and noise since proteins are grown in a liquid solution. This makes
thresholding a more complex task. In this study, we investigated several thresholding methods
for protein crystal imagery. In most cases, while one method generates proper binary images for
some of the protein images, another method generates better results for the others. Clearly,
a single thresholding technique is not enough to generate binary images that are useful for
classifying the protein images. Thus, it is very important to select the correct binariz-