• The generation and analysis of a system's MPA model take a matter of seconds. Note that similar times have been reported for alternative performance analysis methods, such as trace-based simulation in the Artemis design flow. While further reducing this time is desirable, it is a reasonable time frame for performance analysis within a design space exploration loop.
• The one-time calibration to obtain the parameters for the MPA model takes several seconds, but it is completely automated; extracting these parameters manually would be a major effort.
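These runtimes matter because the analysis sits in the inner body of a design space exploration loop. The sketch below is purely illustrative: `calibrate`, `analyze`, and `explore`, along with all parameter values, are hypothetical stand-ins for the corresponding DOL/MPA tool invocations, not actual APIs.

```python
# Illustrative DSE loop: a one-time, automated calibration followed by a
# fast per-candidate MPA analysis. All names and numbers are invented.

def calibrate(application):
    # One-time step (seconds): measure task execution demands on the
    # target platform to parameterize the MPA model. Made-up cycle counts.
    return {"decode": 1.5e6, "dequantize": 0.4e6}

def analyze(mapping, params):
    # Fast per-candidate step (seconds): evaluate the MPA model and
    # return a conservative worst-case latency bound (made-up formula,
    # assuming a 100 MHz processor).
    return sum(params[task] for task in mapping) / 1.0e8

def explore(mappings, application):
    params = calibrate(application)           # pay the calibration cost once
    results = {m: analyze(m, params) for m in mappings}
    return min(results, key=results.get)      # lowest worst-case latency
```

The point of the sketch is the cost structure: calibration is paid once, while every candidate mapping in the exploration loop only incurs the cheap analysis step.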
To evaluate the accuracy of the MPA estimates, the performance bounds computed with MPA are compared to the actual (average-case) quantities observed during system simulation. The differences are in the range of 10-20%, which is typical for compositional performance analysis; differences in the same range have been observed for several systems in [52], for instance. There are two main reasons for these differences. First, several operators in the formal performance analysis do not yield tight bounds. Second, the simulation of a complex system in general cannot determine the actual worst-case and best-case behavior: the system-level simulations do not use exhaustive test patterns and do not cover all possible corner cases of interference on shared resources.
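The accuracy figure quoted above is the relative gap between the analytic bound and the extreme value actually observed in simulation. A minimal sketch of this comparison, with made-up latency numbers standing in for real measurement data:

```python
def relative_gap(analytic_bound, simulated_values):
    """Relative over-estimation of a worst-case bound with respect to the
    largest value actually observed during simulation."""
    observed_max = max(simulated_values)
    return (analytic_bound - observed_max) / observed_max

# Made-up example: an MPA worst-case latency bound of 46 ms against
# simulated end-to-end latencies peaking at 40 ms gives a 15% gap,
# inside the typical 10-20% range for compositional analysis.
gap = relative_gap(46.0, [31.2, 38.5, 40.0, 36.7])
```

Because simulation misses true corner cases, the observed maximum is a lower bound on the real worst case, so part of this gap reflects simulation optimism rather than analysis pessimism.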
Moreover, to illustrate the connection between the worst-case chip temperature and the worst-case latency, we present eight selected mapping configurations of the MJPEG decoder application together with their worst-case chip temperature and the worst-case latency calculated by MPA. Of interest here is the effect of the physical placement, which can no longer be ignored. Even when the mapping is already fixed, the system designer can still optimize the system (i.e., reduce the temperature) by selecting an appropriate physical placement. This is highlighted by solution pairs in which only the placement of the processing components has changed, yet temperature differences of 8 K still appear [45].
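This observation suggests a simple post-mapping optimization: keep the mapping fixed and search only over the physical placements of the processing components. A hedged sketch, in which `thermal_analysis` is a hypothetical stand-in for the thermal analysis plug-in and the candidate placements and temperatures are invented for illustration:

```python
# For a fixed mapping, choose the placement with the lowest predicted
# worst-case chip temperature. 'thermal_analysis' is a hypothetical
# stand-in; the placement names and temperatures are invented.

def thermal_analysis(placement):
    # Pretend per-floorplan worst-case temperature lookup (kelvin).
    temps = {"row": 358.0, "clustered": 366.0, "spread": 359.5}
    return temps[placement]

def best_placement(placements):
    return min(placements, key=thermal_analysis)

choice = best_placement(["row", "clustered", "spread"])
```

With differences of several kelvin between placements of the same mapping, this cheap final selection step can be worthwhile even after the mapping decision is closed.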
Finally, the DOL framework itself is evaluated in terms of the code size of the prototype implementation. The DOL design flow and the associated tools are implemented in Java (Fig. 14). To give an indication of the size of the implementation, Table 4 shows the code size of the different parts of the design flow (excluding the plug-ins for design space exploration and thermal analysis). One can see that, apart from the tool-internal representations of the system specification, the largest part is the MPA code generator for performance analysis. The software synthesizers and the monitoring code for MPA model calibration are comparatively small. Similar observations can be made for other design flows as well.
5 Concluding Remarks
The mapping of process networks onto multi-processor systems requires a systematic and automated design methodology. This chapter provides an overview of different existing methods and tools, all of which start from a general