2. Compute the values of the g_i and L_i parameters using the proposed benchmark.
3. Compute the runtime of the algorithm using the theoretical cost model of
the MultiBSP [8].
4. Run the vector inner product algorithm.
5. Compare the results with the theoretical prediction.
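The prediction in step 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes that each leaf performs one multiplication and one addition per element of its slice, and that every level i gathers the p_i partial sums of its children (p_i words at g_i per word), adds them, and pays one synchronization of cost L_i. The `Level` type, the function name, and all parameter values are hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import Sequence


@dataclass(frozen=True)
class Level:
    """One MultiBSP level (hypothetical container for the benchmark output)."""
    p: int    # number of sub-components at this level
    g: float  # communication cost per word, in basic-operation units
    L: float  # synchronization cost, in basic-operation units


def predicted_inner_product_cost(n: int, levels: Sequence[Level]) -> float:
    """Estimate the cost of an n-element inner product on a MultiBSP tree.

    Assumed model (not taken from the paper): leaves compute their slice,
    then every level gathers and sums the partial results of its children.
    `levels` is ordered from the innermost level to the outermost one.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    total_leaves = prod(level.p for level in levels)
    # Local work: one multiply and one add per element of the slice.
    cost = 2.0 * n / total_leaves
    for level in levels:
        # Gather p partial sums, add them (p - 1 additions), then synchronize.
        cost += level.p * level.g + (level.p - 1) + level.L
    return cost
```

The result is expressed in basic-operation units; multiplying it by the measured time of one operation on the target machine gives a prediction in seconds that can be compared with the measured runtime in step 5.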
Fig. 8. Comparison between the real execution time and the theoretical execution
time. (a) Instance #1: dell32. (b) Instance #2: jolly.
Fig. 8 compares the real execution time with the theoretical execution time
for both studied architectures.
The results show that for vectors with fewer than 2^8 elements, the real
execution time is larger than the theoretical time. This happens mainly
because, with so little data, the time spent spawning threads adds a significant
overhead compared with the time needed to compute a vector slice at level i.
For dell32, when computing vectors with more than 2^8 elements, both curves have
the same slope, so the two times grow at the same rate and the measurement has
stabilized. For jolly, the predicted and measured times behave differently:
there is one point where both are equal, but for vectors larger than
2^8 elements the execution time grows more slowly than the predicted time.
The good results in Fig. 8(a) validate the proposed approach, as the time
predicted from the g_i and L_i values is very close to the real time. On the
other hand, in Fig. 8(b) the predicted time is not as close to the real time as
expected. However, in that range the theoretical time is always greater than the
real time, so it is still useful as an upper bound for predictions.
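The comparison described above can be expressed as a small helper that reports, for each vector size, the relative error of the prediction and whether the prediction lies above the measurement. This is only a sketch: the function name and the dictionary layout are assumptions, and any values passed to it would be placeholders rather than measurements from this work.

```python
from typing import Dict, List, Tuple


def compare_times(
    measured: Dict[int, float], predicted: Dict[int, float]
) -> List[Tuple[int, float, bool]]:
    """Return (size, relative error, prediction_is_upper_bound) per vector size.

    Both dictionaries map a vector size to a time in the same unit.
    """
    rows = []
    for size in sorted(measured):
        real = measured[size]
        model = predicted[size]
        # Positive error: the model overestimates the real execution time.
        rel_error = (model - real) / real
        rows.append((size, rel_error, model >= real))
    return rows
```

A negative relative error for small sizes would correspond to the thread-spawning overhead discussed above, while a positive error that grows with the size would match the behavior reported for jolly.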
5 Conclusions and Future Work
This work presented MBSPDiscover 4 , an automatic tool for characterizing
multicore architectures based on the MultiBSP computational model. The proposed
4 Available from http://runtime.bordeaux.inria.fr/sehloc/