3.2 3D Video Object Segmentation for Packet Prioritization
Shape information is necessary to separate foreground and background objects in
the scene. In the case of color plus depth video, joint segmentation of the color
and depth map can be performed to generate the shape information. The study car-
ried out in [28] describes the generation of shape information using joint segmen-
tation of color plus depth images in cluttered backgrounds or with low quality
video input. In the presented work, a simple segmentation algorithm is applied to
generate shape data based on the threshold levels of the depth map sequence.
Depth pixel values greater than the mean pixel value of that depth map frame, D_av,
are set to the maximum pixel value of 255; all other pixel values are set to 0.
The shape information generated by this method is noisy. Therefore, objects
that contain only a few pixels are removed and merged with the surrounding area.
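
The thresholding and clean-up steps described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the 4-connected region definition, and the `min_pixels` cut-off are assumptions, since the text does not say how small objects are detected or how small "few pixels" is. A depth frame is represented as a list of rows of 8-bit values.

```python
from collections import deque


def depth_threshold_mask(depth):
    """Binary shape mask: 255 where depth > frame mean D_av, else 0."""
    pixels = [p for row in depth for p in row]
    d_av = sum(pixels) / len(pixels)  # mean pixel value of this frame
    return [[255 if p > d_av else 0 for p in row] for row in depth]


def remove_small_regions(mask, min_pixels):
    """Merge small regions into their surroundings.

    Assumption: a 'small object' is a 4-connected region of equal
    mask value with fewer than min_pixels pixels. In a binary mask
    its neighbours all carry the opposite value, so merging amounts
    to flipping the region.
    """
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            value = mask[y][x]
            region = [(y, x)]
            seen[y][x] = True
            queue = deque(region)
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx]
                            and mask[ny][nx] == value):
                        seen[ny][nx] = True
                        region.append((ny, nx))
                        queue.append((ny, nx))
            if len(region) < min_pixels:
                for ry, rx in region:
                    out[ry][rx] = 255 - value
    return out
```

Regions are labelled on the unmodified input mask and the flips are written to a copy, so the result does not depend on the scan order.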
3.3 Quality-Driven Coding
A video object is made of rectangular or arbitrarily shaped sets of pixels of the
video frame, capable of representing both natural and synthetic content types, e.g.
a talking person without the background, or any graphics or text [29]. The source
video frame may therefore consist of a number of video objects, each of which
can be separately encoded and transmitted in separate sub-streams in an
object-based video coding scenario.
Particular optimization schemes, depending on the underlying network protocols,
can be applied to separate the objects' sub-streams for prioritized communication.
For instance, UEP could be used if sub-carriers with different priorities were
available for transmission of the sub-streams. The 802.11e protocol, for multimedia
transmission with different QoS over WLAN, supports traffic classes with different
priority levels, as described in section 2.3.
Video objects are prioritized on the basis of their expected distortion estimates
and their relative depth in the scene. Each of the segmented objects is parsed at the
packet level for estimating the distortion expected during its transmission. Let the
input video sequence consist of M video frames. Each video frame is separated
into L objects. Each object is coded separately using its own segmentation
mask and is divided into N video packets. If the expected distortion
due to the corruption of the n-th packet of the m-th frame of the l-th video object
is α(l, m, n), then the total expected distortion E(D_{l,m,n}) of the m-th frame becomes
$$E(D_{l,m,n}) = \sum_{l=0}^{L-1} \sum_{n=0}^{N-1} \alpha(l, m, n) \tag{2}$$
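
As a small worked illustration of equation (2), the sketch below sums the per-packet distortion estimates over all objects and packets of one frame. The nested-list layout `alpha[l][m][n]` and the function name are assumptions made for the example; how α(l, m, n) itself is estimated is not covered here.

```python
def frame_expected_distortion(alpha, m):
    """Total expected distortion of frame m.

    alpha[l][m][n] holds the expected distortion caused by losing
    packet n of frame m of object l (hypothetical layout). The
    result is the double sum over l = 0..L-1 and n = 0..N-1.
    """
    return sum(sum(obj[m]) for obj in alpha)


# Two objects (L = 2), two frames (M = 2), two packets each (N = 2).
alpha = [
    [[1.0, 2.0], [0.5, 0.5]],  # object 0: frame 0, frame 1
    [[3.0, 4.0], [1.0, 1.0]],  # object 1: frame 0, frame 1
]
print(frame_expected_distortion(alpha, 0))  # 1 + 2 + 3 + 4 = 10.0
```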
The optimization problem is to minimise E(D_{l,m,n}). This is achieved by reformatting
the bit-streams according to their importance. Based on the distortion estimates
coupled with depth information, each packet is assigned an importance level I_n,
which is used to reallocate packets with higher importance to channels with higher
protection and vice versa. The importance factor is calculated by multiplying the
expected distortion estimate for a packet by its cumulative depth factor, CD.
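
The importance calculation and reallocation can be sketched as below. The importance is the product of the expected distortion and the cumulative depth factor, as stated above. Everything else is an assumption for illustration: the `Packet` fields, the channel names, and the rule of splitting the ranked packets into equal-sized groups, since the text does not specify how many packets each channel carries.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    obj: int             # index l of the video object
    index: int           # index n of the packet within the object
    distortion: float    # expected distortion estimate for the packet
    depth_factor: float  # cumulative depth factor CD

    @property
    def importance(self) -> float:
        # I_n = expected distortion x cumulative depth factor
        return self.distortion * self.depth_factor


def allocate(packets, channels):
    """Assign packets to channels by descending importance.

    `channels` is ordered from most to least protected. Assumption:
    packets are split into equal-sized groups, one per channel.
    """
    ranked = sorted(packets, key=lambda p: p.importance, reverse=True)
    per_channel = -(-len(ranked) // len(channels))  # ceiling division
    allocation = {name: [] for name in channels}
    for rank, pkt in enumerate(ranked):
        allocation[channels[rank // per_channel]].append(pkt)
    return allocation


packets = [
    Packet(0, 0, 4.0, 1.0),
    Packet(0, 1, 1.0, 1.0),
    Packet(1, 0, 3.0, 2.0),
    Packet(1, 1, 0.25, 2.0),
]
plan = allocate(packets, ["high_protection", "low_protection"])
```

In this example the packet of object 1 with distortion 3.0 ranks first, because its depth factor doubles its importance to 6.0, ahead of the packet with the largest raw distortion (4.0).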