FPGA Computing in Modern Bioinformatics - Parallel Computing for Bioinformatics and Computational Biology


As shown in the block diagram, up to four 3 × 3 filter calculations can be performed in parallel if the data input can provide four consecutive data words. Besides four implementations of the 3 × 3 filter calculation block, the register matrix and the delay memory must be extended and reconnected to provide the appropriate pixel data for each filter calculation block.
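The idea of several filter blocks reading from one extended register matrix can be illustrated with a small software model. This is only a sketch under stated assumptions, not the architecture's actual implementation: the function names are hypothetical, the shared window is assumed to be 3 rows by 6 columns for four blocks, only the valid image region is computed, the kernel is applied without flipping, and the delay memory is not modeled. In hardware the four blocks would run concurrently; here they are an ordinary loop.

```python
def filter3x3_reference(image, coeffs):
    """Sequential reference: one output pixel at a time, nine multiply-adds each."""
    h, w = len(image), len(image[0])
    out = [[0] * (w - 2) for _ in range(h - 2)]
    for y in range(h - 2):
        for x in range(w - 2):
            acc = 0
            for dy in range(3):
                for dx in range(3):
                    acc += coeffs[dy][dx] * image[y + dy][x + dx]
            out[y][x] = acc
    return out


def filter3x3_four_blocks(image, coeffs, blocks=4):
    """Model of `blocks` filter blocks sharing one 3 x (blocks + 2) window."""
    h, w = len(image), len(image[0])
    out = [[0] * (w - 2) for _ in range(h - 2)]
    for y in range(h - 2):
        for x0 in range(0, w - 2, blocks):
            # Assumed shared register window: 3 rows, blocks + 2 columns
            # (6 columns for four blocks), zero-padded past the right border.
            window = [
                [image[y + dy][x0 + c] if x0 + c < w else 0
                 for c in range(blocks + 2)]
                for dy in range(3)
            ]
            # Each block reads its own 3 x 3 slice of the shared window.
            for b in range(blocks):
                if x0 + b >= w - 2:
                    break  # no valid output column left for this block
                out[y][x0 + b] = sum(
                    coeffs[dy][dx] * window[dy][b + dx]
                    for dy in range(3)
                    for dx in range(3)
                )
    return out
```

Both functions should return the same result for any image of at least 3 × 3 pixels; the second one only regroups the work so that four neighbouring output pixels are produced from one set of input words.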

This extended parallel architecture executes four times faster than the solution with only one filter block, mainly because each filter block has its own arithmetic operators, which run completely in parallel with those of the other blocks. Compared with the initial approach, in which the pixel data were processed sequentially, the speedup is 36.
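The two figures are consistent with each other: a factor of 36 over the sequential approach and a factor of 4 over a single filter block imply a factor of 9 for one filter block alone. A minimal cycle-count model reproduces these numbers. It assumes (this is not stated in the text) that the sequential approach spends one cycle per multiply-accumulate, that is nine cycles per pixel for a 3 × 3 filter, and that each filter block delivers one output pixel per cycle in steady state. Pipeline fill latency and memory bandwidth are ignored, and the function names are hypothetical.

```python
def cycles_sequential(pixels, taps=9):
    # Assumption: one multiply-accumulate per cycle,
    # so each output pixel costs `taps` cycles.
    return pixels * taps


def cycles_parallel(pixels, blocks=1):
    # Assumption: every filter block produces one output pixel per cycle.
    return -(-pixels // blocks)  # ceiling division


def speedup(pixels, blocks):
    return cycles_sequential(pixels) / cycles_parallel(pixels, blocks)


if __name__ == "__main__":
    n = 1_000_000  # hypothetical number of output pixels
    print(speedup(n, 1), speedup(n, 4))
```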

28.4.3 Image Filter Example Conclusions

Starting from the question of whether the innermost loop of the FIR 3 × 3 image filter task can be executed in parallel, this section has shown a possible solution in the form of a parallel architecture. The basic principles of parallel execution and the optimization steps have been shown in detail.

In principle, the innermost loop of the FIR image filter is a fine-grained algorithm and is therefore not suited to acceleration on clusters. It has been shown that such fine-grained algorithms are particularly well suited to execution on a parallel architecture, because the operations are connected directly through dedicated buses and connections.

The described architecture for the 3 × 3 FIR filter can be implemented on an FPGA processor board that provides a connection to local memory. The parallel architecture is implemented within the FPGA processor, and the available local memory can be used for temporary storage of the image data. The parallel architecture is integrated into the software through the API.

In summary, the direct implementation of the architecture and the flexibility of FPGA processors make the FPGA coprocessor an ideal platform for this type of fine-grained algorithm. Furthermore, modified coefficients, a larger filter size, or larger data words can be integrated easily into the existing architecture, and the modified algorithm can be executed on the same FPGA processor.
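How such modifications might affect the size of the architecture can be estimated with a rough sizing sketch. The formulas below are assumptions based on a generic line-buffer structure for k × k filters, not figures from this chapter: one multiplier per coefficient and filter block, a register matrix widened by one column per additional block, and roughly one image line of delay memory per additional kernel row. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterResources:
    multipliers: int
    register_rows: int
    register_cols: int
    delay_lines: int
    delay_words: int


def estimate_resources(kernel_size, blocks, image_width):
    """Rough resource estimate for `blocks` parallel k x k filter blocks."""
    return FilterResources(
        # Assumed: one multiplier per coefficient in every block.
        multipliers=kernel_size * kernel_size * blocks,
        register_rows=kernel_size,
        # Assumed: neighbouring blocks share overlapping window columns.
        register_cols=kernel_size + blocks - 1,
        # Assumed: one buffered image line per kernel row above the current one.
        delay_lines=kernel_size - 1,
        delay_words=(kernel_size - 1) * image_width,
    )


if __name__ == "__main__":
    # Hypothetical image width of 512 pixels.
    print(estimate_resources(kernel_size=3, blocks=4, image_width=512))
    print(estimate_resources(kernel_size=5, blocks=4, image_width=512))
```

Under these assumptions, changing the coefficients costs no additional resources, while a larger kernel grows the multiplier count quadratically and the delay memory linearly.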


28.5 CASE STUDY: PROTEIN STRUCTURE PREDICTION

The described FPGA processor can execute several kinds of applications and can be regarded as a general-purpose computing processor. This section demonstrates that general usability with a concrete application from the field of bioinformatics. It describes the complete application for protein structure prediction, gives a view of the parallelized architecture, and covers its integration into an existing software environment. This one approach is presented in detail instead of a survey of numerous applications that are suitable for execution on an FPGA processor.
