Fig. 10.20 Hierarchical memory deployment with VPB-Row level SRAM/DRAM and VPB-level
SRAM for neighboring pixels, and TU-level registers for reference pixels
3. VPB left neighbors: This buffer is implemented using one SRAM containing 128
pixels (64 Y + 64 UV). It is updated every TU with neighboring pixels for the
next TU. Because the TUs are processed in z-scan order, at the end of all TUs in
the current VPB, it automatically contains the left neighbors for the next VPB.
4. VPB top-left neighbors: The TU-based update scheme for the VPB top and left
neighbors can overwrite pixels that some later TUs still need as their top-left
neighbor. The VPB top-left neighbor buffer is introduced to solve this problem.
As shown in Fig. 10.20, pixels on the 4×4 grid are written to the
VPB top-left neighbor buffer (Table 10.12).
5. Reference pixels: At the start of every TU, neighbors are read from the VPB-level
SRAMs into registers. Padding and preparation operations are then performed on
the registers to obtain reference pixels. Using registers allows for these operations
and the final intra prediction to be performed at a high throughput. A total of
129 reference pixels (32 bottom-left, 32 left, one top-left, 32 top, 32 top-right)
are needed for all angular modes. However, since only one angular mode is used
at a time, horizontal modes can be treated as vertical modes by swapping the x
and y axes, which reduces the number of reference pixels to 99. Reference pixels are
read by both preparation and prediction, and a combined read-out circuit shared
between the two operations can reduce the number of multiplexers by exploiting
similarities in their access patterns.
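
The buffer behavior in items 3 and 5 can be illustrated with a small software model. The sketch below is not the hardware design described here. It assumes a 64×64 luma VPB split into uniform 8×8 TUs, and all function and variable names are hypothetical. The mode ranges (2 to 17 horizontal, 18 to 34 vertical) and the mirror mapping `36 - mode` are taken from the HEVC angular mode numbering, not from this text. The sketch checks that z-scan processing leaves the left-neighbor buffer holding the rightmost VPB column, and that the five reference segments sum to 129. It does not model the register-level reduction to 99 pixels.

```python
# Sketch only: the 8x8 TU size and all names are illustrative assumptions.
VPB = 64  # luma VPB width/height, matching the 64 Y entries of the left buffer
N = 32    # largest TU size, matching the 32-pixel reference segments


def z_scan(x, y, size, tu):
    """Yield TU origins inside a square block in z-scan order."""
    if size == tu:
        yield x, y
        return
    half = size // 2
    for dy in (0, half):        # top half first, then bottom half
        for dx in (0, half):    # left quadrant first, then right quadrant
            yield from z_scan(x + dx, y + dy, half, tu)


def left_buffer_after_vpb(tu=8):
    """Model the luma part of the VPB left-neighbor buffer.

    Each entry stores the x coordinate of the pixel column that last wrote it,
    so the result shows which column the buffer holds after the last TU.
    """
    buf = [None] * VPB
    for x, y in z_scan(0, 0, VPB, tu):
        for row in range(y, y + tu):
            buf[row] = x + tu - 1  # right edge of this TU is the next TU's left neighbor
    return buf


# Reference segments listed in item 5.
REFERENCE_SEGMENTS = {
    "bottom_left": N, "left": N, "top_left": 1, "top": N, "top_right": N,
}


def as_vertical(mode):
    """Map an HEVC angular mode (2..34) to a vertical mode.

    Returns (vertical_mode, swapped). When swapped is True, the caller would
    exchange the left and top reference arrays and transpose the predicted block.
    """
    if 2 <= mode <= 17:      # horizontal modes (assumed HEVC numbering)
        return 36 - mode, True
    if 18 <= mode <= 34:     # vertical modes
        return mode, False
    raise ValueError("not an angular mode")


if __name__ == "__main__":
    print(set(left_buffer_after_vpb()))      # which column the buffer ends up holding
    print(sum(REFERENCE_SEGMENTS.values()))  # total reference pixels
    print(as_vertical(10), as_vertical(26))  # pure horizontal and pure vertical
```

In this model every buffer row is last written by the TU in the rightmost column, which is why no separate copy step is needed between VPBs. The `swapped` flag stands in for the axis swap. In hardware the swap would be applied when the registers are loaded rather than as a separate transpose pass.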
10.7.2 Reference Preparation and Prediction
As mentioned in Sect. 10.7, due to the tight dependency loop in Intra processing,
it is hard to pipeline the three pixel processing operations of reference padding,