depicted in the figure. In fact, this module manages the video streams processed by
the pipeline and other data available inside the FPGA according to the configuration
commands issued by the host. Optionally, the hardware design could be equipped
with an Inertial Measurement Unit (IMU) comprising a gyroscope, an accelerometer,
and additional digital devices such as a GPS receiver or a digital compass. The
IMU can be useful in robotic applications to integrate the measurements
provided by a visual odometry module based on SLAM (Simultaneous Localization
And Mapping) techniques. Another optional component of the camera, managed by
the softcore, is the Motor controller unit. This module enables control of multiple
stepper motors according to software commands issued by the host and can be useful,
for instance, for handling pan and tilt.
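The command protocol between the host and the motor controller is not detailed here; purely as a hypothetical illustration of such software commands, the sketch below packs a pan/tilt step request into an assumed binary layout (command ID, motor index, signed step count), none of which is taken from the actual design.

```python
import struct

# Hypothetical sketch only: the real command format of the camera's motor
# controller is not specified here. We assume a packet made of a command ID,
# a motor index (0 = pan, 1 = tilt), and a signed step count, little-endian.
CMD_MOVE_STEPPER = 0x10  # assumed command identifier

def make_move_command(motor_id: int, steps: int) -> bytes:
    """Pack a hypothetical 'move stepper' command for the motor controller."""
    return struct.pack("<BBi", CMD_MOVE_STEPPER, motor_id, steps)

# Example: pan by 200 steps and tilt by 50 steps (sign convention assumed).
pan_cmd = make_move_command(0, 200)
tilt_cmd = make_move_command(1, 50)
```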
The upper side of Fig. 5.2 summarizes the main steps executed by the vision
processing pipeline. Once the raw images provided by the image sensors are sent to
the FPGA, they are rectified in order to compensate for lens distortions. Moreover,
the raw stereo pair is put in standard form (i.e., epipolar lines are aligned to image
scanlines). Both steps require a warping of each image of the stereo pair, which
can be accomplished by knowing the intrinsic and extrinsic parameters of the stereo
system [31]. These parameters can be inferred by means of an offline calibration
procedure.
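The warping is performed inside the FPGA pipeline; purely as a software reference for the same operation, the sketch below shows how the rectification maps could be derived from the calibration output, here assumed to be the intrinsic matrices K1, K2, the distortion vectors D1, D2, and the rotation R and translation T between the two cameras, using OpenCV.

```python
import cv2

def rectify_pair(img_l, img_r, K1, D1, K2, D2, R, T):
    """Software reference for putting a stereo pair in standard form:
    compensate lens distortion and align epipolar lines with scanlines."""
    h, w = img_l.shape[:2]
    # Rectifying rotations and projection matrices for the two cameras.
    R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(K1, D1, K2, D2, (w, h), R, T)
    # Per-pixel warping maps; the geometry is fixed, so they can be precomputed once.
    map1x, map1y = cv2.initUndistortRectifyMap(K1, D1, R1, P1, (w, h), cv2.CV_32FC1)
    map2x, map2y = cv2.initUndistortRectifyMap(K2, D2, R2, P2, (w, h), cv2.CV_32FC1)
    rect_l = cv2.remap(img_l, map1x, map1y, cv2.INTER_LINEAR)
    rect_r = cv2.remap(img_r, map2x, map2y, cv2.INTER_LINEAR)
    return rect_l, rect_r
```

Because the warping maps depend only on the calibration, they can be computed once offline and applied to every frame, which is what makes the operation well suited to a hardware implementation.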
Once the rectified images are in standard form, potential corresponding points are
identified by the stereo matching module, as will be discussed in the next sections.
Unfortunately, since not all of the correspondences found by the previous module
are reliable, a robust outlier detection strategy is crucial. This step typically consists
of multiple tests aimed at enforcing constraints on the inferred disparity maps
(e.g., left-right consistency check, uniqueness constraint, analysis of the cost curve,
etc.), on the input images, or on the matching costs computed by the previous
matching engine, according to the specific algorithmic strategy adopted.
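As an illustration of one such test, the left-right consistency check computes a second disparity map with the right image as reference and discards pixels whose two disparities disagree by more than a threshold. A minimal NumPy sketch, assuming dense integer disparity maps and an invalid marker of -1 (both assumptions, not taken from the actual design), is:

```python
import numpy as np

def left_right_check(disp_l, disp_r, max_diff=1, invalid=-1):
    """Invalidate disparities that fail the left-right consistency test.
    disp_l[y, x] is the left-reference disparity; its match in the right
    image lies at (y, x - disp_l[y, x]), whose disparity should agree."""
    h, w = disp_l.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xr = xs - disp_l                      # corresponding column in the right image
    valid = (disp_l != invalid) & (xr >= 0) & (xr < w)
    consistent = valid.copy()
    # Keep a disparity only if the right-reference map agrees within max_diff.
    consistent[valid] = np.abs(disp_l[valid] - disp_r[ys[valid], xr[valid]]) <= max_diff
    filtered = np.full_like(disp_l, invalid)
    filtered[consistent] = disp_l[consistent]
    return filtered
```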
The filtered disparity map is then sent to the Data/Video manager. This unit contains
a small FIFO synthesized in the FPGA logic and is mainly devoted to packaging the
selected video streams and other relevant data. These data are then sent to the
communication front end, implemented in the FPGA logic and directly connected to
the external communication controller, which in the current prototype is a USB 2.0
controller manufactured by Cypress.
The host computer, once it has received the disparity map, will compute depth
by triangulation according to the parameters inferred by the calibration procedure.
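For a rectified pair, depth follows from the standard relation Z = f * B / d, where f is the focal length in pixels, B the baseline, and d the disparity, with f and B provided by the calibration. A minimal host-side sketch (the function name and the invalid-disparity convention are assumptions) is:

```python
import numpy as np

def disparity_to_depth(disp, focal_px, baseline_m, invalid=-1):
    """Convert a disparity map to metric depth via Z = f * B / d.
    focal_px and baseline_m come from the offline calibration; pixels with
    invalid or zero disparity are assigned a depth of 0."""
    disp = disp.astype(np.float32)
    depth = np.zeros_like(disp)
    good = (disp != invalid) & (disp > 0)
    depth[good] = focal_px * baseline_m / disp[good]
    return depth
```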
Although in this paper we will focus our attention on the stereo matching module,
the overall goal is to map all the blocks depicted in Fig. 5.2 into a low-cost FPGA.
As previously pointed out, such a design would allow for low cost, small size and
weight, reduced power requirements, and reconfigurability. Moreover, upgrading the
whole project to newer FPGAs (typically cheaper and offering better speed and lower
power consumption than the previous generation) is almost straightforward. Finally,
we point out that, with the availability of integrated solutions combining
reconfigurable logic with embedded processors, such as the Xilinx Zynq [44], a
self-contained FPGA module would make it feasible to design a fully embedded 3D
camera with complete onboard processing and no additional external host computer.