variable processing times, and also attempt to improve processor utilization
through our MB selection method.
Finally, most of the techniques for parallel decoding do not consider cache
misses. Consider the working of the wavefront strategy shown in Figure 2. The
gray-shaded MBs are the ones that must be in the processor cache for processing the
yellow ones (since a worst-case 4-neighbour dependency is assumed). For example,
to process the MB marked 15A on core A, the processor needs to load the
MBs 9A, 11A, 13A and 13B into its local cache. Since the MB marked 13B is
being processed on core B, its data must be fetched from core B's L1 cache
through the shared L2 cache. An important way to reduce cache misses
is to avoid reloading the same MB into the cache. This can be done by giving
priority to MBs whose parents are in danger of being evicted by the cache
replacement policy. Our scheduling heuristic takes this into account.
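To make the dependency pattern concrete, here is a minimal Python sketch (the function name and grid indexing are our own illustrative choices, not the paper's) that enumerates the worst-case 4-neighbour parents that must be cache-resident before an MB can be decoded:

```python
def mb_parents(row, col, num_cols):
    """Worst-case 4-neighbour dependency: an MB may depend on its
    left, top-left, top, and top-right neighbours."""
    candidates = [
        (row, col - 1),      # left
        (row - 1, col - 1),  # top-left
        (row - 1, col),      # top
        (row - 1, col + 1),  # top-right
    ]
    # Keep only neighbours that actually exist inside the frame.
    return [(r, c) for r, c in candidates
            if r >= 0 and 0 <= c < num_cols]

# The MB at row 2, column 3 needs four earlier MBs in cache:
print(mb_parents(2, 3, num_cols=8))
# -> [(2, 2), (1, 2), (1, 3), (1, 4)]
```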
4 Cache-Aware Heuristic
In this section, we present the details of our MB scheduling and processor
allocation strategy in a multi-processor setting with a private L1 cache per
processor, a shared L2 cache and DRAM. We assume the following about the
processor model (a small simulation sketch of these assumptions follows the list).
- Every cache uses the Least Recently Used (LRU) replacement policy.
- All caches are assumed to follow a multi-level inclusion policy, i.e., if a data
item is present in an L1 cache, then it is also present in the L2 cache and in
DRAM. A similar policy holds for the L2 cache. However, the data present in
DRAM may be obsolete.
- The L1 caches are assumed to use a write-through policy; the L2 cache,
however, is assumed to use a write-back policy.
- Write-invalidate is used at the L2 cache to ensure memory consistency. In other
words, when data is written to the L1 cache of some processor, that data
is immediately invalidated in all other L1 caches, and the updated data
is brought in from L2.
- Bus arbitration for the L2 cache is done on a First Come First Served (FCFS) basis.
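The following toy Python model (all class and method names and the cache sizes are our own illustrative choices) pulls these assumptions together: per-processor LRU L1 caches writing through to a shared write-back LRU L2, write-invalidate on stores, and inclusion enforced by back-invalidating L1 copies whenever L2 evicts a line:

```python
from collections import OrderedDict

class LRUCache:
    """One cache level; the OrderedDict is kept in LRU-to-MRU order."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # addr -> dirty bit

    def lookup(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)  # refresh recency on a hit
            return True
        return False

    def fill(self, addr, dirty=False):
        """Insert a line, returning the evicted (addr, dirty) or None."""
        if addr in self.lines:
            self.lines[addr] = self.lines[addr] or dirty
            self.lines.move_to_end(addr)
            return None
        victim = None
        if len(self.lines) >= self.capacity:
            victim = self.lines.popitem(last=False)  # evict the LRU line
        self.lines[addr] = dirty
        return victim

class Hierarchy:
    def __init__(self, num_cores, l1_size, l2_size):
        self.l1 = [LRUCache(l1_size) for _ in range(num_cores)]
        self.l2 = LRUCache(l2_size)

    def _fill_l2(self, addr, dirty=False):
        victim = self.l2.fill(addr, dirty)
        if victim is not None:
            # Inclusion: a line evicted from L2 cannot survive in any L1.
            for l1 in self.l1:
                l1.lines.pop(victim[0], None)
            # A dirty victim would be written back to DRAM here.

    def read(self, core, addr):
        if self.l1[core].lookup(addr):
            return "L1 hit"
        if not self.l2.lookup(addr):
            self._fill_l2(addr)           # L2 miss: fetch from DRAM
        # L1 victims carry no dirty data, since L1 is write-through.
        self.l1[core].fill(addr)
        return "L1 miss"

    def write(self, core, addr):
        self.read(core, addr)             # allocate the line locally
        for i, l1 in enumerate(self.l1):
            if i != core:                 # write-invalidate: drop other copies
                l1.lines.pop(addr, None)
        self._fill_l2(addr, dirty=True)   # write-through L1 -> write-back L2
```

For instance, with `h = Hierarchy(2, l1_size=4, l2_size=16)`, a `h.write(0, 'x')` followed by `h.read(1, 'x')` yields an L1 miss on core 1 that is served from the shared L2, which is exactly the cross-core fetch described above for the MB marked 13B.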
We further assume that all frames are fully accommodated in main memory;
thus, there is no need for disk access at any point in time. We now define the
concept of cache flush time, a key element of our scheduling heuristic.
Definition 1. The cache flush time Λ of a node u is the number of cache misses
for which the node will be present in the cache.
Since we assume an LRU replacement policy, the cache flush times can be obtained
from the LRU counters in a modern architecture.
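Under one plausible reading of Definition 1 (the exact hardware interface is not specified here), the flush time of a resident line is simply its recency rank: each miss evicts exactly one LRU line, so a line survives as many further misses as there are lines less recently used than it. A minimal sketch, assuming the LRU-to-MRU order is available:

```python
def cache_flush_time(lru_order, addr):
    """Cache flush time of a resident line under pure LRU: the number
    of further cache misses the line survives, which equals the count
    of lines less recently used than it (each miss evicts exactly one
    LRU line).  lru_order lists lines from LRU to MRU."""
    for rank, line in enumerate(lru_order):
        if line == addr:
            return rank
    return 0  # not resident: no misses survived

# 'b' has one less-recently-used line ('a') shielding it:
print(cache_flush_time(['a', 'b', 'c'], 'b'))  # -> 1
```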
is modelled as a scheduling problem on a task graph, where the MBs form the
nodes and for each dependency relation between a pair of MBs (u, v), there is
a directed edge from u to v if the MB for v depends on u. We call such a task
graph that is used to model this problem as a frame task graph. A node (i.e. a
MB) of the frame task graph is labelled with three components.
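As a concrete illustration of the model (our own sketch, reusing the worst-case 4-neighbour dependency from the earlier wavefront example; the three node labels described next are omitted), the edge set of a frame task graph can be built as follows:

```python
def frame_task_graph(num_rows, num_cols):
    """Nodes are MBs (row, col); a directed edge u -> v means MB v
    depends on MB u under the worst-case 4-neighbour dependency."""
    edges = []
    for r in range(num_rows):
        for c in range(num_cols):
            for pr, pc in [(r, c - 1), (r - 1, c - 1),
                           (r - 1, c), (r - 1, c + 1)]:
                if 0 <= pr < num_rows and 0 <= pc < num_cols:
                    edges.append(((pr, pc), (r, c)))  # parent -> child
    return edges

# Even a 2x2 frame shows the wavefront structure:
for u, v in frame_task_graph(2, 2):
    print(u, "->", v)
```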