Mapping an algorithm onto an embedded processor involves code tweaking for hardware
interaction, and otherwise tends to be a relatively painless process. Unfortunately,
these processors suffer from performance limitations, as they are ultimately still
generic CPUs with modest processing ability. For all but the least complex algorithms,
this platform is insufficient.
Another approach is to use a DSP, such as the Texas Instruments DaVinci
processor [ 12 ] . DSPs offer a middle ground between an embedded processor
and a PC node: they consume significantly less power than PC nodes
while providing optimized instructions that computer vision algorithms can exploit.
A simple example found on the majority of DSPs is an optimized
multiply-accumulate instruction, which is applicable to a number of preprocessing
and filtering steps in many vision algorithms. Commodity CPUs typically offer an
equivalent instruction, but the DSP delivers the same boost in performance at much
lower power. DSP toolchains are typically similar to those used for embedded
processors, with the additional requirement that the system designer incorporate
appropriate code annotations to take advantage of the DSP instructions.
For some algorithms and system requirements, still more computational potential
is necessary while maintaining low power consumption and a small footprint.
In these instances, a custom or reconfigurable platform is considered.
Typically, an FPGA is used either as a design and test platform for a
final custom build or as the target reconfigurable fabric itself.
Much like DSPs, FPGAs provide optimized processing ability, in this case through
specialized processing units within the FPGA fabric; for example, high-speed
integer multipliers and adders are found in many FPGA devices currently on the
market. Furthermore, many FPGAs incorporate a simple embedded processor for
algorithm flow control. The complexity in using this type of platform lies in design
overhead. An FPGA requires significantly more design consideration than the
previously mentioned platforms, owing to the inherent need for hardware/software
task analysis. Moreover, the tasks assigned to hardware typically cannot
be mapped directly to FPGA logic, and instead require algorithmic redesign along
with the design and analysis of a hardware timing scheme.
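As a small, hypothetical illustration of such algorithmic redesign, the fragment below recasts a 3x3 convolution step into a fixed-point form with compile-time loop bounds; integer arithmetic and static bounds map far more directly onto the FPGA's multipliers and adders than floating-point, data-dependent code. The kernel size, Q8 weight format, and function name are assumptions made for this example, not a prescribed design flow.

```c
#include <stdint.h>

#define KERNEL 3   /* static bound: lets a synthesis tool unroll and pipeline the loop */

/* Hypothetical 3x3 convolution step restructured for hardware mapping:
 * floating-point weights are replaced by Q8 fixed-point constants and the
 * loop bounds are compile-time constants, so the datapath can be built
 * from the FPGA's integer multipliers and adders. */
int16_t conv3x3(const uint8_t window[KERNEL][KERNEL],
                const int16_t weights_q8[KERNEL][KERNEL])
{
    int32_t acc = 0;
    for (int r = 0; r < KERNEL; ++r)
        for (int c = 0; c < KERNEL; ++c)
            acc += (int32_t)window[r][c] * weights_q8[r][c];
    return (int16_t)(acc >> 8);           /* drop the Q8 fractional bits */
}
```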
It should be mentioned that all of these platforms raise the need for bandwidth
analysis during architectural design and mapping, in contrast to the previously
mentioned assumptions. This is because these platforms have a smaller footprint
and, consequently, less available storage on the board, let alone on the chip itself.
As vision algorithms grow more complex, through larger pixel representation
schemes, greater temporal frame storage, and the accuracy requirements which
abound in real-time critical vision systems, the amount of data that must be
transferred on and off chip, as well as stored on-chip as intermediate results,
increases dramatically. Schlessman [ 25 ] addresses each of these
issues with the MEAD methodology for embedded architectural design, ultimately
providing vision designers with a clearer concept of which platform architecture is
best suited to the algorithm under consideration.
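As a rough, hypothetical illustration of the bandwidth analysis involved, the short program below estimates the raw input bandwidth and temporal frame storage for an assumed VGA stream; the resolution, bit depth, frame rate, and buffer depth are example values, not figures drawn from [ 25 ].

```c
#include <stdio.h>

int main(void)
{
    /* Hypothetical operating point -- example values only. */
    const double width       = 640.0;   /* pixels per row                      */
    const double height      = 480.0;   /* rows per frame                      */
    const double bits_px     = 16.0;    /* pixel representation size           */
    const double fps         = 30.0;    /* frame rate                          */
    const double temp_frames = 3.0;     /* frames kept for temporal processing */

    double frame_bits  = width * height * bits_px;
    double in_bw_mbps  = frame_bits * fps / 1e6;          /* raw input bandwidth   */
    double storage_mb  = frame_bits * temp_frames / 8e6;  /* temporal frame buffer */

    printf("Input bandwidth : %.1f Mbit/s\n", in_bw_mbps);
    printf("Frame storage   : %.2f MB\n", storage_mb);
    return 0;
}
```

Even at this modest operating point, the input stream approaches 150 Mbit/s and the temporal buffer alone nears 2 MB, quantities that can easily exceed the on-chip storage and I/O budget of smaller devices.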