[Figure 3.76, image not reproduced. Labels shown: Video Processing Unit; stream-rate domain
(CPU #0; stream processing unit, m = 1: stream processor, CABAC accelerator); pixel-rate
domain (image processing units, n = 2: #0 and #1, each with symbol codec, CME, TRF, FME, DEB,
and PIPEs; shift-register-based bus (SBUS); L-MEM); global DMAC; media IPs; memory port;
off-chip memory holding the bit stream, intermediate stream, and image.]
PIPE: Programmable image processing element, L-MEM: Line memory,
CME: Coarse motion estimator, TRF: Transformer,
FME: Fine motion estimator/compensator, DEB: De-blocking filter
Fig. 3.76 Block diagram of video processing unit. The stream-rate domain and pixel-rate domain
can access the intermediate stream via the global DMAC
codec performs either encoding or decoding. In decoding mode, the stream processing
unit (SPU) reads bit streams from off-chip memory and outputs a transformed
intermediate stream. The image processing units (IPUs) read the intermediate streams
produced by the SPU and generate the final decoded image.
The space reserved for the intermediate streams in off-chip memory serves as a buffer
between the stream-rate domain and the pixel-rate domain. Variable-length coding
inherently has no fixed processing time, and CABAC processing time varies especially
widely. A macroblock can contain up to 384 transform-coefficient symbols, but
the maximum number of bits changes with the probability of each syntax
element in its context. If the stream processing unit needs more time to process
a frame than the frame period allows, the operating frequency must be raised.
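As a rough illustration of the frame-rate constraint described above, the following Python sketch checks whether a stage finishes one frame within one frame period and computes the lowest clock frequency that would. The function names and all numbers are hypothetical, chosen only for illustration. They are not measurements of the video processing unit.

```python
def required_frequency_hz(cycles_per_frame: int, frame_rate_hz: float) -> float:
    """Lowest clock frequency at which one frame finishes in one frame period."""
    return cycles_per_frame * frame_rate_hz


def meets_frame_budget(cycles_per_frame: int, clock_hz: float,
                       frame_rate_hz: float) -> bool:
    """True if the stage can process one frame within one frame period.

    Equivalent to cycles_per_frame / clock_hz <= 1 / frame_rate_hz,
    rearranged to avoid the divisions.
    """
    return cycles_per_frame * frame_rate_hz <= clock_hz


if __name__ == "__main__":
    # Hypothetical worst-case frame: 6 million cycles at 30 frames per second.
    cycles = 6_000_000
    fps = 30.0
    for clock in (150e6, 200e6):
        ok = meets_frame_budget(cycles, clock, fps)
        print(f"{clock / 1e6:.0f} MHz: within budget = {ok}")
    print(f"required: {required_frequency_hz(cycles, fps) / 1e6:.0f} MHz")
```

Under these assumed numbers, the worst-case frame alone dictates the clock frequency, even when most frames need far fewer cycles.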
Figure 3.77 shows an example of the decoding time and the number of bits for
each picture in a 40-Mbps H.264 video stream. Around picture #30, where the
pictures contain many bits, the stream-rate domain takes longer to decode than
the pixel-rate domain. Around picture #5, where few bits are assigned to the
pictures, the stream-rate domain decodes faster than the pixel-rate domain.
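The effect of buffering intermediate streams between the two domains can be explored with a small simulation. The sketch below is a simplified model, not the chip's actual scheduling. It assumes an unbounded intermediate buffer, a single pixel-rate stage, and made-up per-picture times in arbitrary units. The lockstep function is a hypothetical baseline in which both stages advance together, one picture per slot.

```python
def simulate_buffered(stream_times, pixel_times):
    """Two-stage picture-level pipeline with an unbounded intermediate buffer.

    Stage 1 (stream-rate domain) decodes pictures back to back. Stage 2
    (pixel-rate domain) starts picture i once its intermediate stream is
    ready and picture i-1 is finished. Returns (finish_time, peak_buffered).
    """
    # Completion time of each picture in stage 1.
    spu_done, t = [], 0.0
    for s in stream_times:
        t += s
        spu_done.append(t)

    # Start time of each picture in stage 2.
    ipu_start, ipu_free = [], 0.0
    for ready, p in zip(spu_done, pixel_times):
        start = max(ready, ipu_free)
        ipu_start.append(start)
        ipu_free = start + p

    # Peak count of intermediate streams written but not yet picked up,
    # sampled each time stage 1 finishes a picture.
    peak = 0
    for i, done in enumerate(spu_done):
        waiting = sum(1 for j in range(i + 1) if ipu_start[j] > done)
        peak = max(peak, waiting)
    return ipu_free, peak


def simulate_lockstep(stream_times, pixel_times):
    """Hypothetical baseline: each slot lasts as long as the slower stage."""
    return float(sum(max(s, p) for s, p in zip(stream_times, pixel_times)))


if __name__ == "__main__":
    # Made-up times: four pictures with few bits, then two with many bits.
    stream = [1, 1, 1, 1, 4, 4]
    pixel = [2, 2, 2, 2, 2, 2]
    print(simulate_buffered(stream, pixel))   # (14.0, 2)
    print(simulate_lockstep(stream, pixel))   # 16.0
```

In this example the stream stage runs ahead while the pictures are small, so part of the extra time needed by the large pictures is hidden behind pixel decoding of the pictures already buffered.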
The intermediate stream buffer absorbs the performance gap between the stream
processing unit and the image processing unit in the picture-level pipeline.
Figure 3.78 shows the time chart of stream and pixel decoding in the picture-level pipeline.
The time slot is defined as the decoding time of image processing in the pixel-rate