11.2.2 User Evaluation and Popularity Prediction
The most obvious way of validating an AJS (at least one with learning capacities)
may be to employ a set of images pre-evaluated by humans. The task of the AJS
is to classify or "to assign an aesthetic value to a series of artworks which were
previously evaluated by humans" (Romero et al. 2003).
Several relevant papers published in the image processing and computer vision
literature are aimed at classifying images based on aesthetic evaluation. Most of
them employ datasets obtained from photography websites. Some of those datasets
are public, so they allow other AJSs to be tested on them. In this section we briefly
analyse some of the most prominent works of this type.
Ke et al. (2006) proposed the task of distinguishing between "high quality profes-
sional photos" and "low quality snapshots". These categories were created from
users' evaluations on a photography website, so, to some extent, this can be considered
a classification based on aesthetic preference. The website was the dpchallenge.com
photography portal, and the authors used the 10% highest- and lowest-rated images,
in terms of average score, from a set of 60,000. Each photo was rated by at least 100
users; images with intermediate scores were not considered.
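To make this selection rule concrete, the following is a minimal sketch of the dataset-construction step, assuming ratings are available as (photo_id, list of ratings) pairs; the function and its field layout are illustrative and not taken from Ke et al. (2006).

```python
def build_quality_dataset(photos, min_ratings=100, fraction=0.10):
    """photos: iterable of (photo_id, list_of_ratings) pairs.

    Keeps only photos with at least `min_ratings` ratings, labels the top
    `fraction` by average score as 1 ("professional photo") and the bottom
    `fraction` as 0 ("snapshot"), and drops the intermediate scores.
    """
    rated = [(pid, sum(r) / len(r)) for pid, r in photos if len(r) >= min_ratings]
    rated.sort(key=lambda item: item[1])          # ascending by average score
    k = int(len(rated) * fraction)
    low = [(pid, 0) for pid, _ in rated[:k]]      # lowest-rated 10%
    high = [(pid, 1) for pid, _ in rated[-k:]]    # highest-rated 10%
    return low + high
```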
The authors employed a set of high-level image features (such as spatial distri-
bution of edges, colour distribution, blur and hue count) and a support vector machine
classifier, obtaining a correct classification rate of 72%. Combining these metrics
with those published by Tong et al. (2004), Ke et al. (2006) achieved a success rate
of 76%.
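A minimal sketch of this pipeline, handcrafted per-image feature vectors fed to a support vector machine, is shown below using scikit-learn. The kernel and cross-validation settings are assumptions, not the configuration reported by Ke et al. (2006), and the feature extraction is assumed to have been done beforehand.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def evaluate_quality_classifier(X, y):
    """X: array of per-image feature vectors (edge distribution, colour
    distribution, blur, hue count, ...); y: 1 = professional, 0 = snapshot."""
    X, y = np.asarray(X), np.asarray(y)
    clf = SVC(kernel="rbf")                         # kernel choice is an assumption
    rate = cross_val_score(clf, X, y, cv=5).mean()  # mean correct-classification rate
    return clf.fit(X, y), rate
```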
Luo and Tang (2008) employed the same database. The 12,000 images of the
dataset are accessible online,2 allowing results to be compared. Unfortunately,
neither the statistical information about the images (number of evaluations, average
score, etc.) nor the images with intermediate ratings are available. The dataset is
divided into two sets (training and test) of 6,000 images each. The authors state
that these sets were created randomly. However, when one reverses the roles of the
test and training sets (i.e. training with the original "test" set and testing with the
original "training" set), the results differ significantly, which indicates that the two
sets are not well balanced.
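A simple way to expose such an imbalance is to train on one half and test on the other, then swap the roles of the two halves and compare the accuracies; the sketch below uses an SVM from scikit-learn purely as a stand-in classifier.

```python
from sklearn.svm import SVC

def swap_check(X_a, y_a, X_b, y_b):
    """Train on set A and test on set B, then do the reverse.
    A large gap between the two accuracies suggests the split is unbalanced."""
    acc_forward = SVC().fit(X_a, y_a).score(X_b, y_b)   # train A -> test B
    acc_reverse = SVC().fit(X_b, y_b).score(X_a, y_a)   # train B -> test A
    return acc_forward, acc_reverse
```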
Additionally, Luo and Tang (2008) used a blur filter to separate the subject of
each photo from its background. They then employed a set of features related to
clarity contrast (the difference in crispness between the subject region and the
background of the photo), lighting, simplicity, composition and colour harmony.
Using all features they obtained a 93% success rate, which clearly improved upon
previous results; the "clarity contrast" feature alone yields a success rate above
85%. The authors attributed the difference between their results and the ones
obtained by Ke et al. (2006) to the application of metrics to the image background
regions and to the greater adequacy of the metrics themselves.
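As a rough illustration of what a clarity-contrast-style feature measures, the sketch below compares the sharpness of the subject region with that of the background, approximating sharpness by the variance of the Laplacian (an assumption) and taking the subject mask as given; Luo and Tang's blur-based subject extraction is not reproduced here.

```python
import cv2
import numpy as np

def clarity_contrast(image_bgr, subject_mask):
    """image_bgr: colour image; subject_mask: boolean array, True inside the subject.
    Returns a large positive value for a sharp subject on a blurred background."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    lap = cv2.Laplacian(gray, cv2.CV_64F)            # edge response over the whole image
    subject_sharpness = lap[subject_mask].var()      # crispness of the subject region
    background_sharpness = lap[~subject_mask].var()  # crispness of the background
    return subject_sharpness - background_sharpness
```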
2 http://137.189.97.48/PhotoqualityEvaluation/download.html