Table 11.2  Fast intra prediction results. All intra configuration is used

# of candidates    10      12      14      17
BD-Rate [%]        0.21    0.19    0.06    0.03

modes for each PU size. The PU size may range from 4×4 to 64×64 depending
on the encoder configuration (for a 64×64 PU, intra prediction is actually performed
at the 32×32 level, since the maximum supported transform size is 32×32). The
complexity is much higher and results in considerable hardware cost. In addition,
HEVC allows intra and inter CUs, and consequently PUs, inside the same CTU.
The prediction dependency among PUs is quite complex and limits hardware
parallelism. For hardware implementation of intra prediction, an architecture for
intra 4×4 prediction with flexible reference sample selection is proposed in [28].
A full intra architecture is also proposed in [33]. In high-throughput applications,
intra prediction performance still needs to be improved. Here we apply two methods.
First, we use a reduced mode search to lower the required computation at the algorithm
level. Second, hybrid open-closed loop intra prediction removes the dependency on
reconstructed pixels and enables PU-level parallelism.
11.4.1 Reduced Mode Search
Depending on the configuration, the intra PU size in HEVC may range from 4×4 to
64×64. For each size, the prediction direction can be chosen from at most 35
modes. Because of the high number of modes, the search cost is high. To reduce this
cost, we propose a fast intra prediction algorithm that cuts the number of modes
evaluated per PU size by two thirds. Only a limited number of modes are searched,
within a constrained resource and timing budget.
First, we assume the computation budget to be C; that is, we are to find the best
prediction mode while evaluating at most C modes. If the mode search cost
reaches this limit, the search stops and the best result so far is taken as the
predicted mode. To search efficiently, we use a two-step fast search algorithm. In
the first step, we perform a coarse search: we search the whole range of directions,
but with an angular step size of π/8, and obtain the cost of each mode. In the
second step, a refinement search is performed on the unsearched angular modes
neighboring the best mode found in the first step.
The best mode is updated after each search. After that, an exhaustive search is
performed over the remaining modes until the budget limit is reached. The
trade-off between quality and the number of searched modes is shown in
Table 11.2. With this algorithm, the number of searched modes is reduced to
one third of the full search with a negligible BD-rate increase.
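
The budgeted two-step search described above can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation. It uses the standard HEVC mode numbering (0 = planar, 1 = DC, 2 to 34 = angular). Everything else is an assumption made for the example: the function name reduced_mode_search, the cost callback, a coarse step of four mode indices, a refinement radius of two modes on either side of the coarse winner, and the placement of planar and DC in the final exhaustive pass.

```python
from typing import Callable, List, Tuple

PLANAR, DC = 0, 1                      # HEVC non-angular intra modes
ANGULAR_FIRST, ANGULAR_LAST = 2, 34    # HEVC angular intra modes
COARSE_STEP = 4      # assumption: coarse pass visits every 4th angular mode
REFINE_RADIUS = 2    # assumption: refine +/-2 modes around the coarse winner


def reduced_mode_search(
    cost: Callable[[int], float], budget: int
) -> Tuple[int, float, List[int]]:
    """Return (best_mode, best_cost, searched_modes) using at most `budget` evaluations."""
    searched: List[int] = []
    best_mode, best_cost = -1, float("inf")

    def evaluate(modes) -> bool:
        """Evaluate modes in order; return False once the budget is spent."""
        nonlocal best_mode, best_cost
        for mode in modes:
            if mode in searched:
                continue
            if len(searched) >= budget:
                return False
            c = cost(mode)
            searched.append(mode)
            if c < best_cost:              # best mode is updated after each search
                best_mode, best_cost = mode, c
        return True

    # Step 1: coarse search over the whole angular range.
    coarse = range(ANGULAR_FIRST, ANGULAR_LAST + 1, COARSE_STEP)
    if evaluate(coarse):
        # Step 2: refine around the best coarse mode, nearest neighbours first.
        centre = best_mode
        neighbours = [
            m
            for d in range(1, REFINE_RADIUS + 1)
            for m in (centre - d, centre + d)
            if ANGULAR_FIRST <= m <= ANGULAR_LAST
        ]
        if evaluate(neighbours):
            # Step 3: spend any remaining budget on the modes not yet visited.
            evaluate(range(PLANAR, ANGULAR_LAST + 1))
    return best_mode, best_cost, searched
```

As a usage example with a toy cost function whose minimum lies at mode 17, reduced_mode_search(lambda m: abs(m - 17), 12) evaluates the nine coarse modes, then modes 17, 19 and 16, and returns mode 17 after 12 evaluations. In a real encoder the cost callback would be a distortion or rate-distortion estimate for the PU, which is outside the scope of this sketch.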
 