Automatic classification of protein crystal images - Emerging Trends in Image Processing, Computer Vision, and Pattern Recognition

ation conditions are required for successful crystallization [ 1 ]. High throughput systems have

been developed in recent years trying to identify the best conditions to crystallize proteins [ 1 ].

Imaging techniques are used to monitor the progress of crystallization. The crystallization trials

are scanned periodically to determine the state change or the possibility of forming crystals.

With a large number of images being captured, it is necessary to have a reliable classification

system to distinguish the crystallization state each image belongs to. The main goal is

to discard the unsuccessful trials, identify the successful trials, and possibly identify the trials

which could be helpful. The main interest for crystallographers is the formation of large 3D

crystals suitable for X-ray diffraction. Other crystal structures are also important, as the

crystallization conditions can be optimized to get better crystals. Therefore, it is necessary to have

a reliable system that distinguishes between different types of crystals according to their shapes

and sizes.

Many studies have addressed distinguishing crystallization trial images according

to the presence or absence of crystals [ 2 - 6 ]. In our previous work [ 7 ], we presented classification

of crystallization trials into three categories (non-crystals, likely-leads, and crystals). Saitoh

et al. [ 8 ] proposed classification into five categories (clear drop, creamy precipitate, granulated

precipitate, amorphous state precipitate, and crystal) and Spraggon et al. [ 9 ] described

classification into six categories (experimental mistake, clear drop, homogeneous precipitant,

inhomogeneous precipitant, micro-crystals, and crystals). Likewise, Cumba et al. [ 10 ] classified

crystallization trials into six basic categories (phase separation, precipitate, skin effect,

crystal, junk, and unsure). In all these studies, the objective of the classification is to identify

the different phases of crystallization. In other words, classification of protein crystal images

according to the shapes and sizes of crystals has not been the main focus of previous

work.

For feature extraction, a variety of image processing techniques have been proposed. Cumba

et al. [ 2 ], Saitoh et al. [ 11 ], and Zhu et al. [ 12 ] used a combination of geometric and texture

features as the input to their classifier. Saitoh et al. [ 8 ] used global texture features as well

as features from local parts in the image and features from differential images. Cumba et al.

[ 10 ] extracted several features such as basic statistics, energy, Euler numbers, Radon-Laplacian

features, Sobel-edge features, micro-crystal features, and gray-level co-occurrence matrix (GLCM)

features to obtain a large feature vector. We presented classification using region features and edge features in Ref. [ 13 ].

Increasing the number of features may not necessarily improve the accuracy. Moreover, it may

slow down the classification process. Therefore, finding a minimal set of useful image features

for classification is important.

This work introduces our technique for automatic classification of trial images consisting of

crystals. Our focus is on classifying crystallization trial images according to the types of protein

crystals present in the images. Our previous study [ 13 ] attempted to classify trial images

according to the types of crystals (needle crystals, small crystals, large crystals, other crystals).

This work extends that earlier work [ 13 ] on crystallization trial classification. The main objective

is to improve the classification accuracy by adding new image features and using an

alternative classification technique. Our feature extraction includes extracting edge-related features

from the Canny edge image, extracting blob-related features from multiple binary images,

and extracting line features using the Hough transform. The images are classified into four

categories: needles, small crystals, large crystals, and other crystals. For the decision model, we

investigate a random forest classifier in addition to the decision tree classifier proposed in Ref. [ 13 ].
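
To make the blob and line features mentioned above more concrete, the following Python sketch shows one possible way to compute them from a small binary image. It is an illustration only and not the implementation used in this work: the function names blob_features and strongest_line, the list-of-rows input format, the 4-connectivity rule, and the one-degree angular resolution are all assumptions made for the example. The Canny edge detection step is omitted, and the edge map is assumed to be given.

```python
import math
from collections import deque


def blob_features(binary):
    """Summarize the 4-connected foreground blobs of a binary image.

    `binary` is a list of rows holding 0/1 values (an assumed input format).
    Returns (blob_count, largest_area, (height, width)), where the last item
    is the bounding-box size of the largest blob.
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []  # one (area, min_row, min_col, max_row, max_col) per blob
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            area, r0, c0, r1, c1 = 0, r, c, r, c
            while queue:
                y, x = queue.popleft()
                area += 1
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append((area, r0, c0, r1, c1))
    if not blobs:
        return 0, 0, (0, 0)
    area, r0, c0, r1, c1 = max(blobs)  # tuples compare by area first
    return len(blobs), area, (r1 - r0 + 1, c1 - c0 + 1)


def strongest_line(edges, n_theta=180):
    """Find the strongest straight line in an edge map by Hough voting.

    Every edge pixel (x, y) votes for all lines
    rho = x*cos(theta) + y*sin(theta) that pass through it.
    Returns (votes, rho, theta_in_degrees) of the most-voted line.
    """
    votes = {}
    for y, row in enumerate(edges):
        for x, value in enumerate(row):
            if not value:
                continue
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                votes[(rho, t)] = votes.get((rho, t), 0) + 1
    if not votes:
        return 0, 0, 0.0
    (rho, t), count = max(votes.items(), key=lambda item: item[1])
    return count, rho, t * 180.0 / n_theta
```

Under these assumptions, a needle-like crystal would be expected to yield a blob with an elongated bounding box and a Hough peak whose vote count is close to the blob area, whereas a compact crystal would yield a more square bounding box. For a 7 x 7 edge map containing a single vertical line in column 3, strongest_line returns 7 votes at rho = 3 and theta = 0 degrees.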

This chapter is arranged as follows. The following section describes the image categories for

the classification problem considered in this chapter. Section 3 provides the system overview.

Section 4 describes the image processing and feature extraction steps used in our research.
