Fig. 5 shows the variation of the quality scores of three synchronized camera recordings. Visualizations of frames at different points in time, indicated in Fig. 5, together with their corresponding quality scores, are presented in Fig. 6. The frames shown in Fig. 6(a), (b) and (c) are captured simultaneously; however, their views and quality scores differ considerably due to the different camera positions. Fig. 6(d), (e) and (f) show views from the same cameras as Fig. 6(a), (b) and (c), respectively, but captured later in time, when the views have changed.
We apply the described video quality analysis to automatic mashup generation. The mashups are generated from synchronized multiple-camera recordings by selecting segments of 3 to 7 seconds. The segment boundaries are determined based on changes in the audio-visual content. Consecutive segments in a mashup are selected from different recordings, so that they add diversity and high quality to the mashup content. The quality of a segment is computed as the mean of the quality scores of its frames. The quality of a mashup depends on the performance of our video quality analysis method: a poor analysis of video quality would lead to a poor-quality mashup. The mashup quality could be validated objectively if the best- and worst-quality mashups were available. However, no existing method defines and creates such mashups or provides an objective measure of mashup quality. Therefore, we measure the perceived quality of our mashups in a subjective test against two other kinds of mashups: one generated by a random selection of segments without considering quality, i.e. random mashups, and another generated manually by a professional video editor, i.e. manual mashups.
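As a sketch of the segment-selection step described above, the following Python fragment computes segment quality as the mean of frame-level scores and greedily picks, for each segment, the highest-quality recording other than the one used for the previous segment. The data layout, the greedy choice, and the names used here are illustrative assumptions, not the paper's exact algorithm.

```python
def segment_quality(frame_scores, start, end):
    """Segment quality = mean of the frame quality scores (as in the text)."""
    frames = frame_scores[start:end]
    return sum(frames) / len(frames)

def select_segments(recordings, boundaries):
    """Greedy illustration: for each segment (start, end), given in frame
    indices, pick the camera with the highest segment quality, but never the
    camera used for the previous segment, so consecutive segments come from
    different recordings (the diversity constraint from the text).
    Assumes at least two synchronized recordings."""
    mashup, prev_cam = [], None
    for start, end in boundaries:
        candidates = [(segment_quality(scores, start, end), cam)
                      for cam, scores in recordings.items()
                      if cam != prev_cam]
        best_quality, best_cam = max(candidates)
        mashup.append((best_cam, start, end, best_quality))
        prev_cam = best_cam
    return mashup
```

For example, with two cameras and 5-second segments at 25 fps (125 frames each), the selection alternates cameras even when one recording consistently scores higher, because of the diversity constraint.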
As a test set we use three multiple-camera recordings, captured during concerts by non-professionals and shared on YouTube. Each multiple-camera recording contains 4 to 5 camera recordings with both audio and video streams (in color). The duration of the recordings is between 2.4 and 5.6 minutes, their frame rate is 25 frames per second, and the video resolution is 320 × 240 pixels. The multiple-camera recordings and the mashups used in the test are made available on a website 1 , where the filenames C#, Naive and First-fit denote the concert number, the random mashup and the mashup generated by our method, respectively.
The random and our quality-based mashups contain at least one segment from each of the given synchronized recordings, and each segment is 3 to 7 seconds long. The manual mashups are created by a professional editor, who was asked to create mashups that are high in signal quality and pleasant to watch, without any special effects or temporal manipulations. It took approximately 16 hours to create the 3 mashups from the given test set using commercially available multi-camera editing software. The considerable time and effort required for creating manual mashups forced us to limit the size of the test set.
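The two constraints stated above for the random and quality-based mashups (every synchronized recording contributes at least one segment, and each segment is 3 to 7 seconds long) can be verified with a small helper. The segment representation, (camera, start frame, end frame), is a hypothetical layout chosen for illustration.

```python
def check_mashup(segments, cameras, fps=25):
    """Return True iff every given camera appears at least once and every
    segment length is between 3 and 7 seconds (inclusive) at the given
    frame rate. `segments` is a list of (camera, start_frame, end_frame)."""
    used = {cam for cam, _, _ in segments}
    if used != set(cameras):
        return False  # some recording contributed no segment
    return all(3 * fps <= end - start <= 7 * fps
               for _, start, end in segments)
```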
The subjective test involves 40 individuals aged between 20 and 30. The 9 mashups, generated using the 3 methods and the 3 concerts, are shown to the subjects
1 http://www.youtube.com/AutomaticMashup#p/u
 