The antagonistic requirements of increasing compute performance while keeping power dissipation at the same (or even lower) level can be met by off-loading some of the low- and mid-level vision processing to hardware accelerators. These can be broadly divided into two groups: hardwired/fixed and programmable.
Hardwired Hardware Accelerators—Selecting processing primitives to accelerate in fixed hardware is not a trivial task, as the industry lacks standards for embedded vision processing algorithms. To preserve flexibility across different systems, the best candidates for fixed hardware acceleration are low- and mid-level processing kernels. Examples of fixed hardware accelerators for vision include the Pipelined Vision Processor (PVP) on Analog Devices processors [15], the image processing accelerators on the Toshiba Visconti-3 (affine, filter, histogram, histogram of gradients, and matching) [16], and the PW and CE engines on the Mobileye EyeQ2 processor [17].
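As a point of reference for what such low- and mid-level kernels look like, the sketch below gives a plain C implementation of a 3x3 Sobel gradient-magnitude filter: per-pixel, fixed-precision, data-parallel work of exactly the kind that is typically mapped onto fixed hardware. The function and buffer names are illustrative only and do not correspond to any particular accelerator's programming interface.

#include <stdint.h>
#include <stdlib.h>

/* Illustrative low-level vision kernel: 3x3 Sobel gradient magnitude.
 * Border pixels are left untouched for brevity. */
void sobel3x3_u8(const uint8_t *src, uint8_t *dst, int width, int height)
{
    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            const uint8_t *p = &src[y * width + x];
            int gx = -p[-width - 1] + p[-width + 1]
                     - 2 * p[-1]    + 2 * p[1]
                     - p[ width - 1] + p[ width + 1];
            int gy = -p[-width - 1] - 2 * p[-width] - p[-width + 1]
                     + p[ width - 1] + 2 * p[ width] + p[ width + 1];
            int mag = abs(gx) + abs(gy);              /* L1 approximation */
            dst[y * width + x] = (uint8_t)(mag > 255 ? 255 : mag);
        }
    }
}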
Programmable Hardware Accelerators—Programmable hardware accelerators give more flexibility than fixed hardware acceleration. To address the challenges of low- and mid-level vision processing, most deeply embedded vision applications have used proprietary programmable hardware accelerator architectures with a strong DSP pedigree, such as the NEC IMAPCAR [18], the IMP2-X2 image processor on the Renesas SH7766 device [19, 20], the VMP2 on the Mobileye EyeQ2 [17], the Media Processor Engine (MPE) on the Toshiba Visconti-3 [16], GPUs [21], the CogniVue APEX [22], and the Texas Instruments EVE [14, 60], augmented by FPGAs for blocks with extreme compute requirements.
General Purpose Programmable Processors—General-purpose CPUs, such as the ARM Cortex-A8, A9, A15, or A53, and DSP architectures are the best fit for high-level vision processing. The internal processor architecture, the number and precision of computational units, the cache architecture, and the number and size of the internal and external data paths all play an instrumental role in how fast a task is carried out [23]. Hardware support for functional safety relieves the embedded processor from running periodic system and memory checks, leaving more programmable resources for analytics. Large internal memory helps reduce system latencies and lower power dissipation by minimizing the number of accesses to the external double data rate (DDR) memory.
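The benefit of large internal memory can be illustrated with a simple tiled-processing scheme: the frame is brought from DDR into on-chip SRAM one strip at a time, processed there, and written back, so each pixel crosses the external memory interface only once. The sketch below is a minimal C illustration under assumed parameters (a maximum line width of 1280 pixels, memcpy standing in for a DMA engine, and hypothetical buffer and function names); a real device would drive its DMA controller and place the scratch buffers in internal SRAM, and would overlap strips by the kernel's halo.

#include <stdint.h>
#include <string.h>

#define TILE_LINES 32   /* lines held in on-chip SRAM at a time (assumption) */
#define MAX_WIDTH  1280 /* assumed maximum line width */

/* Hypothetical on-chip scratch buffers; a real system would place these in
 * internal SRAM via a linker section and fill them with a DMA engine. */
static uint8_t tile_in[TILE_LINES * MAX_WIDTH];
static uint8_t tile_out[TILE_LINES * MAX_WIDTH];

/* Process an image strip by strip so that the working set stays in internal
 * memory and each pixel is read from / written to external DDR only once. */
void process_image_tiled(const uint8_t *ddr_in, uint8_t *ddr_out,
                         int width, int height,
                         void (*kernel)(const uint8_t *, uint8_t *, int, int))
{
    for (int y = 0; y < height; y += TILE_LINES) {
        int lines = (height - y < TILE_LINES) ? (height - y) : TILE_LINES;

        /* Stand-in for a DMA transfer, DDR -> internal SRAM. */
        memcpy(tile_in, &ddr_in[y * width], (size_t)lines * width);

        /* Run the low- or mid-level kernel entirely out of internal memory. */
        kernel(tile_in, tile_out, width, lines);

        /* Stand-in for a DMA transfer, internal SRAM -> DDR. */
        memcpy(&ddr_out[y * width], tile_out, (size_t)lines * width);
    }
}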
While each architecture has its own strengths and weaknesses, the TDA2x SoC is the only one in the industry that offers automotive vision developers both a state-of-the-art DSP and multiple instances of a fully programmable vision accelerator, providing an unparalleled level of programmable vision analytics performance.
The TDA2x SoC ADAS processor architecture is shown in Fig. 3.6.
The TDA2x SoC [24] incorporates a scalable architecture that includes a mix of TI's fixed- and floating-point TMS320C66x digital signal processor (DSP) generation cores, the Vision AccelerationPac (EVE), an ARM Cortex-A15, and dual Cortex-M4 processors. The integration of a video accelerator for decoding multiple compressed video streams received over an Ethernet AVB network, along with the graphics accelerators (SGX544) for rendering virtual views, enables a 3D viewing experience for surround view applications.