A Parallel Multilevel Data Decomposition Algorithm for Orientation Estimation of Unmanned Aerial Vehicles - High Performance Computing

of no more than 0.10 seconds, which is enough time to close the yaw angle control loop on board the UAV.

4.3 Performance Metrics

In this work, we apply two standard metrics to evaluate the performance of the proposed multilevel data decomposition: speedup and efficiency. Both are common metrics used by the research community to evaluate the performance of parallel algorithms [17].

The speedup evaluates how much faster a parallel algorithm is than its sequential version. The relative speedup (SPS) is defined as the ratio of the execution time of the sequential algorithm ($TS_1$) to that of the parallel version executed on $m$ computing elements (threads or processors) ($TP_m$) (7). We also evaluate the parallel capabilities/scalability of the proposed algorithm by comparing the execution times of the parallel algorithm executing on one ($TP_1$) and $m$ computing resources ($TP_m$), which we call Parallel Relative Speedup (PRS) (8). When applied to non-deterministic algorithms (i.e., due to non-deterministic situations in the computing environment or non-deterministic bifurcations in the algorithm itself), the speedup should compare the mean values of the sequential and parallel execution times, obtained over a reasonable number of independent executions. The ideal case for a parallel algorithm is to achieve linear speedup ($SPS_m = m$), but the common situation is to achieve sublinear speedup ($SPS_m < m$), due to the time required to communicate and synchronize the parallel processes.

The efficiency (9) is the speedup normalized by the number of computing elements used for execution. This metric allows comparing algorithms executed on non-identical computing platforms. Linear speedup corresponds to $e_m = 1$, and in usual situations $e_m < 1$.

$$SPS_m = \frac{TS_1}{TP_m} \qquad (7)$$

$$PRS_m = \frac{TP_1}{TP_m} \qquad (8)$$

$$e_m = \frac{SPS_m}{m} \qquad (9)$$
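As an illustration of how metrics (7)-(9) can be computed from measured execution times, the following minimal sketch averages the times of several independent executions before taking the ratios, as recommended above for non-deterministic algorithms. The function names and the timing values are hypothetical and are not taken from the experiments reported in this work.

```python
from statistics import mean


def speedup(seq_times, par_m_times):
    """SPS_m (7): mean sequential time over mean parallel time on m elements."""
    return mean(seq_times) / mean(par_m_times)


def parallel_relative_speedup(par_1_times, par_m_times):
    """PRS_m (8): mean parallel time on 1 element over mean time on m elements."""
    return mean(par_1_times) / mean(par_m_times)


def efficiency(sps_m, m):
    """e_m (9): speedup normalized by the number of computing elements."""
    return sps_m / m


# Hypothetical execution times in seconds (illustrative only, not measured data).
ts_1 = [0.40, 0.42, 0.38]   # sequential algorithm
tp_1 = [0.44, 0.46, 0.42]   # parallel algorithm on 1 computing element
tp_4 = [0.12, 0.13, 0.11]   # parallel algorithm on m = 4 computing elements
m = 4

sps = speedup(ts_1, tp_4)
prs = parallel_relative_speedup(tp_1, tp_4)
eff = efficiency(sps, m)

print(f"SPS_{m} = {sps:.2f}, PRS_{m} = {prs:.2f}, e_{m} = {eff:.2f}")
```

With these illustrative values the speedup is sublinear ($SPS_4 < 4$) and the efficiency is below 1, which is the usual situation described above.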

4.4 Results and Discussion

To validate the parallel implementation prior to the performance analysis, the yaw angle estimates of the sequential application and of several parallel configurations were compared using the sFly and QA3 data sets. The results are reported in Fig. 6 and Fig. 7, respectively.

Figure 6a shows that the different estimates are almost coincident and overlap most of the time. In Fig. 7a it is also possible to see the overlap of the parallel and sequential versions. In both cases, black lines show the actual orientation. These results validate the parallel implementations and allow any parallel configuration to be used as a reference baseline whenever time is not being measured, as in Fig. 6b and Fig. 7b, where the absolute error is shown.
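The comparison between two yaw estimates can be sketched as a per-sample absolute error. The snippet below is only an illustration under stated assumptions: angles are assumed to be in radians, the difference is wrapped to $[-\pi, \pi)$ so that estimates on opposite sides of the $\pm\pi$ discontinuity are not reported as a large error, and the function names and sample values are hypothetical. It is not necessarily the procedure used to produce Fig. 6b and Fig. 7b.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def absolute_yaw_error(estimates, reference):
    """Per-sample absolute angular difference between two yaw series (radians)."""
    return [abs(wrap_angle(e - r)) for e, r in zip(estimates, reference)]


# Hypothetical yaw series in radians (illustrative only, not data from sFly or QA3).
sequential_yaw = [0.10, 0.20, 3.10]
parallel_yaw = [0.10, 0.21, -3.12]

errors = absolute_yaw_error(parallel_yaw, sequential_yaw)
print(["%.4f" % e for e in errors])
```

In the last sample the two values lie on opposite sides of the discontinuity, so the wrapped error stays small even though the raw difference is close to $2\pi$.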
