Quality-Driven Coding and Prioritization of 3D Video over Wireless Networks - High-Quality Visual Experience

3.2 3D Video Object Segmentation for Packet Prioritization

Shape information is necessary to separate foreground and background objects in the scene. In the case of color plus depth video, joint segmentation of the color and depth map can be performed to generate the shape information. The study carried out in [28] describes the generation of shape information using joint segmentation of color plus depth images in cluttered backgrounds or with low-quality video input. In the presented work, a simple segmentation algorithm is applied to generate shape data based on the threshold levels of the depth map sequence. Depth pixel values greater than the mean pixel value of that depth map frame, D_av, are set to the maximum pixel value of 255; all other pixel values are set to 0. The shape information generated by this method is noisy. Therefore, objects that contain only a few pixels are removed and merged with the surrounding area.
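The thresholding and clean-up steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `depth_to_shape_mask`, the 4-connected region labeling, and the `min_pixels` cut-off are assumptions, since the text only says that objects with "few pixels" are merged with the surrounding area.

```python
from collections import deque


def depth_to_shape_mask(depth, min_pixels=4):
    """Threshold a depth frame at its mean, then absorb small regions.

    depth: list of rows of 8-bit depth values (0..255).
    min_pixels: hypothetical size cut-off; the text does not give a value.
    """
    rows, cols = len(depth), len(depth[0])
    d_av = sum(map(sum, depth)) / (rows * cols)  # mean depth value of the frame

    # Pixels above the mean become 255, all others become 0.
    mask = [[255 if v > d_av else 0 for v in row] for row in depth]

    # Find 4-connected regions of equal value and flip those smaller than
    # min_pixels, which merges them with the surrounding area.
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            value, region, queue = mask[r][c], [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and mask[ny][nx] == value:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) < min_pixels:
                for y, x in region:
                    mask[y][x] = 255 - value
    return mask
```

With a frame that contains one 2x2 block of near pixels and a single isolated near pixel, the block survives in the mask while the isolated pixel is absorbed into the background.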

3.3 Quality-Driven Coding

A video object is made of rectangular or arbitrarily shaped sets of pixels of the video frame, capable of representing both natural and synthetic content types, e.g. a talking person without the background, or any graphics or text [29]. The source video frame may therefore consist of a number of video objects, each of which can be separately encoded and transmitted in its own sub-stream in an object-based video coding scenario.

Particular optimization schemes, depending on the underlying network protocols, can be applied to separate the objects' sub-streams for prioritized communication. For instance, UEP could be used if sub-carriers with different priorities were available for transmission of the sub-streams. The 802.11e protocol, for multimedia transmission with different QoS over WLAN, supports traffic classes with different priority levels, as described in section 2.3.

Video objects are prioritized on the basis of their expected distortion estimates and their relative depth in the scene. Each of the segmented objects is parsed at the packet level to estimate the distortion expected during its transmission. Let the input video sequence consist of M video frames. Each video frame is separated into L objects. Each object is coded separately using its own segmentation mask and is divided into N video packets. If the expected distortion due to the corruption of the n-th packet of the m-th frame of the l-th video object is α(l, m, n), then the total expected distortion E(D_{l,m,n}) of the m-th frame becomes

$$E(D_{l,m,n}) = \sum_{l=1}^{L} \sum_{n=1}^{N} \alpha(l, m, n) \qquad (2)$$
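As a small illustration, the total for one frame can be computed by adding up the per-packet estimates over all objects and packets. This sketch assumes a plain, unweighted sum, which is how the surrounding text reads; the function name and the nested-list layout `alpha_m[l][n]` are hypothetical.

```python
def expected_frame_distortion(alpha_m):
    """Total expected distortion of one frame m.

    alpha_m[l][n] holds alpha(l, m, n), the expected distortion caused by
    losing packet n of object l in this frame (hypothetical data layout).
    """
    return sum(sum(packets) for packets in alpha_m)
```

For two objects with three packets each, `expected_frame_distortion([[1.5, 2.0, 0.5], [0.5, 1.0, 3.0]])` adds the six estimates into a single figure for the frame.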

The optimization problem is to minimize E(D_{l,m,n}). This is achieved by reformatting the bit-streams according to their importance. Based on the distortion estimates coupled with depth information, each packet is assigned an importance level I_n, which is used to reallocate packets with higher importance to channels with higher protection, and vice versa. The importance factor is calculated by multiplying the expected distortion estimate for a packet by its cumulative depth factor, CD.
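The importance calculation and the reallocation step can be sketched as below. Only the product of the distortion estimate and CD comes from the text; the function name `assign_channels`, the tuple layout, the convention that channel 0 is the most protected, and the even split of ranked packets across channels are assumptions made for illustration. How CD itself is derived is not covered in this section, so it is taken as a given input.

```python
def assign_channels(packets, num_channels=2):
    """Rank packets by importance and map them to protection levels.

    packets: list of (packet_id, expected_distortion, cumulative_depth).
    Returns {packet_id: channel}; channel 0 is assumed to be the most
    protected. The even split across channels is an illustrative choice.
    """
    # Importance = expected distortion estimate x cumulative depth factor.
    scored = [(alpha * cd, pid) for pid, alpha, cd in packets]
    scored.sort(key=lambda item: item[0], reverse=True)  # most important first

    per_channel = -(-len(scored) // num_channels)  # ceiling division
    return {pid: rank // per_channel for rank, (_, pid) in enumerate(scored)}
```

For example, four packets with importance values 3.0, 2.0, 1.0 and 0.25 and two channels end up with the two most important packets on channel 0 and the remaining two on channel 1.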
