Scalable Image Registration and 3D Reconstruction at Microscopic Resolution - High-Throughput Image Reconstruction and Analysis

Figure 8.4 Stacking phenomenon. (a) Mammary images contain sparse adipose tissue interspersed

with duct structures. (b) Nonrigid registration of mammary images by corresponding duct centroids

results in severe structural deformation as all components of duct trajectory within the sectioning

planes are eliminated, resulting in vertical columnar structures. (c) This can be corrected by rigidly

registering the sequence, tracking the duct centroid trajectories, smoothing these trajectories, and

nonrigidly registering the duct centroids to the smoothed trajectories.

8.4 High-Performance Implementation

The size of high-resolution microscope image datasets presents a computational

challenge for automatic registration. Scanning slides with a 20X objective lens

produces images with submicron pixel spacing, often at the gigapixel scale,

with tens of thousands of pixels in each dimension. At this scale, the amount of data

required for quantitative phenotyping studies can easily extend into the terabytes.

This motivates the development of a high-performance computing approach to

registration.
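
The scale described above can be checked with a quick back-of-the-envelope calculation. The sketch below is illustrative only: the image dimensions, the 3 bytes per pixel (uncompressed 8-bit RGB), and the number of slides are assumed values, not figures from this chapter.

```python
# Back-of-the-envelope storage estimate for a slide-scanning study.
# Every constant below is a hypothetical value chosen for illustration.
WIDTH_PX = 50_000        # assumed: "tens of thousands of pixels" per dimension
HEIGHT_PX = 50_000       # assumed
BYTES_PER_PIXEL = 3      # assumed: uncompressed 8-bit RGB
SLIDES_IN_STUDY = 500    # assumed study size

pixels_per_image = WIDTH_PX * HEIGHT_PX
bytes_per_image = pixels_per_image * BYTES_PER_PIXEL
bytes_per_study = bytes_per_image * SLIDES_IN_STUDY

print(f"{pixels_per_image / 1e9:.1f} gigapixels per image")
print(f"{bytes_per_image / 1e9:.1f} GB per uncompressed image")
print(f"{bytes_per_study / 1e12:.2f} TB for the whole study")
```

With these assumed values, a single image holds 2.5 gigapixels (7.5 GB uncompressed), and 500 such images reach 3.75 TB.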

This section discusses high-performance implementation of the two-stage

registration algorithm and introduces solutions at both the hardware and software

layers. At the hardware layer, two approaches are pursued: parallel systems based on

clusters of multisocket multicore CPUs, and GPUs for hardware acceleration. At

the software layer, performance libraries are used for computing the normalized

correlations by fast Fourier transform (FFT) on both CPU and GPU. Performance

results for the implementation variants discussed here are presented in further

detail in Section 8.6.
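
To make the idea of computing normalized correlation by FFT concrete, the following is a minimal sketch, not the chapter's implementation. It assumes two images of equal size, circular (wrap-around) shifts, and a single global normalization. NumPy's `numpy.fft` stands in for the CPU performance library, and the function names are hypothetical.

```python
import numpy as np

def normalized_circular_correlation(a, b):
    """Zero-mean normalized correlation of a and b at every circular shift."""
    a0 = a - a.mean()
    b0 = b - b.mean()
    # Correlation theorem: corr = IFFT(FFT(a0) * conj(FFT(b0)))
    spectrum = np.fft.fft2(a0) * np.conj(np.fft.fft2(b0))
    corr = np.fft.ifft2(spectrum).real
    # Circular shifts preserve mean and norm, so one global factor suffices.
    # Note: a constant image has zero norm and would divide by zero here.
    return corr / (np.linalg.norm(a0) * np.linalg.norm(b0))

def best_shift(a, b):
    """Return the (row, col) shift of a relative to b and its correlation score."""
    ncc = normalized_circular_correlation(a, b)
    peak = np.unravel_index(np.argmax(ncc), ncc.shape)
    # Indices past the midpoint of an axis correspond to negative shifts.
    shift = tuple(int(p) if p <= n // 2 else int(p) - n
                  for p, n in zip(peak, ncc.shape))
    return shift, float(ncc[peak])

# Synthetic check: shift a random image by a known amount and recover it.
rng = np.random.default_rng(0)
fixed = rng.random((64, 64))
moving = np.roll(fixed, shift=(5, -3), axis=(0, 1))
shift, score = best_shift(moving, fixed)
print(shift, round(score, 6))
```

The FFT route costs O(N log N) per image pair rather than the O(N^2) of evaluating every shift directly, which is what makes it attractive at this image scale. Matching a small template against a larger image would additionally need zero-padding and a per-window normalization, which this sketch omits.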

8.4.1 Hardware Arrangement

Both single-node and multiple-node implementations are described below. In the multiple-node

implementations, a simple head/workers organization is assumed. Communication

within single-node implementations is accomplished with Pthreads, and the Message

Passing Interface (MPI) is used for internode communication in multiple-node

implementations. The details of division of work between CPU and GPU at the level

of individual nodes are discussed further in Section 8.4.3.
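
The head/workers organization can be sketched on a single node as follows. This is an illustration under stated assumptions rather than the chapter's code: Python's thread pool stands in for Pthreads, the unit of work is assumed to be one pair of consecutive images, and `register_pair` and `run_head` are hypothetical names with a placeholder body.

```python
from concurrent.futures import ThreadPoolExecutor

def register_pair(pair):
    """Hypothetical worker task: register image j to image i (placeholder)."""
    i, j = pair
    # A real worker would load both images and estimate a transform here.
    return {"fixed": i, "moving": j, "transform": "identity"}

def run_head(num_images, num_workers=4):
    """Head: build the task list, hand tasks to workers, gather results."""
    # Assumed decomposition: one task per pair of consecutive images.
    tasks = [(i, i + 1) for i in range(num_images - 1)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() returns results in task order, whatever order workers finish in.
        return list(pool.map(register_pair, tasks))

results = run_head(num_images=6)
for r in results:
    print(r["fixed"], "->", r["moving"], r["transform"])
```

In a multiple-node setting the same pattern would apply, with the head sending tasks and receiving results as MPI messages instead of sharing memory with its workers.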

8.4.2 Workflow

The workflow for the two-stage registration algorithm is summarized in Figure

8.5. With the exception of computing nonrigid transformations, the CPU-bound
