Fig. 6.5 Data flow and overview of our system architecture
Our software framework harnesses the powerful computational resources of the Graphics Processing Unit (GPU), achieving faster than real-time performance on Full HD resolution images. Furthermore, although depicted as uniform in Fig. 6.4, the distribution of the depth planes is not required to be so; we elaborate on this in Sect. 6.4, where it yields further savings in computational complexity.
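To make the non-uniform case concrete, the snippet below contrasts uniform depth spacing with spacing that is uniform in inverse depth (disparity), a common way to concentrate planes near the camera, where a fixed depth step causes the largest pixel shift. The function names and parameter values are illustrative and not taken from the chapter.

```python
import numpy as np

def depth_planes_uniform(z_near, z_far, n):
    """Depth planes spaced uniformly in metric depth."""
    return np.linspace(z_near, z_far, n)

def depth_planes_inverse(z_near, z_far, n):
    """Depth planes spaced uniformly in inverse depth (disparity).
    Planes cluster close to the camera, so fewer planes are needed
    for the same worst-case reprojection error."""
    inv = np.linspace(1.0 / z_near, 1.0 / z_far, n)
    return 1.0 / inv

# Hypothetical working range: 0.5 m to 5 m, 8 planes.
planes = depth_planes_inverse(0.5, 5.0, 8)
```

With the same plane budget, the inverse-depth spacing trades resolution far from the camera (where disparities barely change) for resolution near it.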
In comparison, competing solutions such as the system of [6] implement their framework on commodity CPUs, resulting in a very low frame rate when sufficient visual quality is required. Others optimize only parts of the application, such as multicamera video coding [5, 16] for efficient data communication or real-time view synthesis [9, 21, 29] on graphics hardware, but none of them integrates and optimizes the end-to-end performance for eye-gaze corrected video chat.
The core functionality of our system is visualized in Fig. 6.5 and consists of five consecutive processing modules that run entirely on the GPU. In an initial step (Sect. 6.3.1), the camera sensor Bayer patterns B_1, ..., B_N are captured from a total of N cameras C_1, ..., C_N that are mounted on a custom-built metal frame which closely surrounds the screen (see Fig. 6.1). The first module computes the RGB images I_1, ..., I_N, based on the method of [12, 19], and performs lens correction and image segmentation as a form of preprocessing. The preprocessing module is specifically designed to enhance both the quality and speed of the subsequent view interpolation, and to ensure a high arithmetic intensity in the overall pipeline.
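As a rough illustration of the preprocessing stage, the sketch below chains a deliberately simple half-resolution RGGB demosaic with a background-subtraction segmentation mask. Both are stand-ins: the chapter's demosaicing follows [12, 19] and operates at full resolution, its segmentation method is not detailed here, and lens correction is omitted entirely. All names and thresholds are hypothetical.

```python
import numpy as np

def demosaic_half(bayer):
    """Half-resolution demosaic of an RGGB Bayer mosaic: each 2x2
    block yields one RGB pixel (R, mean of the two G samples, B).
    A toy stand-in for the full-resolution method cited as [12, 19]."""
    r  = bayer[0::2, 0::2]
    g1 = bayer[0::2, 1::2]
    g2 = bayer[1::2, 0::2]
    b  = bayer[1::2, 1::2]
    return np.stack([r, (g1 + g2) / 2.0, b], axis=-1)

def segment_foreground(rgb, bg_rgb, thresh=0.1):
    """Assumed segmentation by background subtraction: flag pixels
    whose color distance to a reference background image exceeds
    a threshold. Returns a boolean foreground mask."""
    dist = np.linalg.norm(rgb - bg_rgb, axis=-1)
    return dist > thresh
```

Segmenting the foreground early is what lets the later modules skip background pixels, which is one way a preprocessing stage can raise the arithmetic intensity of the pipeline as a whole.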
The second module (Sect. 6.3.2) interpolates an image I_v as it would be seen by a virtual camera C_v positioned behind the screen. The image I_v is computed as if camera C_v captured the scene through a completely transparent screen. Furthermore, the view interpolation module produces a joint depth map Z_v, providing dense 3D information about the captured scene.
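The idea of jointly producing an interpolated view and a depth map can be illustrated with a plane-sweep loop. The sketch below is a strongly simplified CPU-side NumPy version under assumptions the text does not state: a rectified grayscale camera pair, a virtual camera exactly midway between them, integer disparities instead of the depth planes of Fig. 6.4, wrap-around borders via np.roll, and no occlusion handling. It is not the authors' GPU kernel; all names are hypothetical.

```python
import numpy as np

def plane_sweep_midpoint(left, right, disparities):
    """For each candidate disparity d, shift both images toward the
    midpoint by d/2 and measure how well they agree; each pixel keeps
    the disparity with the lowest photometric cost. Returns the
    blended virtual view (stand-in for I_v) and the per-pixel
    disparity map (stand-in for the joint depth map Z_v)."""
    h, w = left.shape
    best_cost = np.full((h, w), np.inf)
    i_v = np.zeros((h, w))
    d_v = np.zeros((h, w))
    for d in disparities:
        s = d // 2
        l = np.roll(left,  -s, axis=1).astype(float)  # left image shifted toward centre
        r = np.roll(right,  s, axis=1).astype(float)  # right image shifted toward centre
        cost = np.abs(l - r)          # photometric consistency at this plane
        hit = cost < best_cost        # pixels where this disparity fits better
        best_cost = np.where(hit, cost, best_cost)
        d_v = np.where(hit, d, d_v)
        i_v = np.where(hit, 0.5 * (l + r), i_v)
    return i_v, d_v
```

The per-pixel winner-takes-all choice is what makes the raw output noisy: a wrong plane can win locally, producing exactly the erroneous patches and speckles that the next module has to clean up.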
The synthesized image still contains a number of noticeable artifacts in the form of erroneous patches and speckle noise. The third module (Sect. 6.3.3) is therefore specifically designed to tackle these problems by detecting photometric outliers based on the generated depth map.
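One simple way to use a depth map for outlier detection is a local median test: a depth value that disagrees sharply with its neighbourhood is very likely a speckle rather than real structure. The sketch below shows only this depth-cleaning idea and is not the authors' photometric-outlier module; the tolerance, the 3x3 window, and the wrap-around borders (via np.roll) are all assumptions made for brevity.

```python
import numpy as np

def remove_speckles(depth, tol=0.2):
    """Flag pixels whose depth deviates from the 3x3 neighbourhood
    median by more than `tol`, and replace them by that median.
    Returns the cleaned depth map and the boolean outlier mask."""
    shifted = [np.roll(np.roll(depth, dy, axis=0), dx, axis=1)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    med = np.median(np.stack(shifted), axis=0)   # 3x3 median at every pixel
    outlier = np.abs(depth - med) > tol
    return np.where(outlier, med, depth), outlier
```

In a full pipeline the same mask could also drive a correction of the corresponding pixels in the synthesized image, since a depth speckle and a photometric artifact typically coincide.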