Biomedical Engineering Reference
Fig. 5. Problem matrix after a move. Particles b8, b9, b10 and b11 have changed positions. Blocks
WG3, WG4, WG5 and WG8 will be recalculated.
Algorithm 2. Tile recalculation
Input: moved scattering particles
Output: updated intensity curve for the scattering momenta
/* Host program */
Transfer input data to the GPU global memory and queue the kernels involved in the profile recalculation
/* Kernels executed on the GPU */
Compute the Debye sum term for the changed tiles (Kernel 5)
Perform the vertical tile sum reduction for each page (Kernel 3)
Perform the horizontal margin sum reduction for each page to get the intensity curve (Kernel 4)
/* Host program */
Retrieve the results from the GPU global memory
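The recalculation steps above can be illustrated with a small CPU-side sketch. It is not the GPU implementation: it assumes unit scattering factors, stores the full (not symmetry-reduced) pair matrix, and treats a "page" as the set of tile sums for a single scattering momentum q. All function names (`make_blocks`, `tile_sum`, `build_page`, `recalc_page`, `intensity`) are hypothetical and chosen only for this illustration, and the tile layout does not reproduce the work-group numbering of Fig. 5.

```python
import math

def sinc(x):
    # sin(x)/x with the x -> 0 limit handled explicitly (self-pairs have r = 0).
    return 1.0 if x == 0.0 else math.sin(x) / x

def make_blocks(n, tile):
    # Split particle indices 0..n-1 into consecutive blocks of at most `tile` particles.
    return [range(s, min(s + tile, n)) for s in range(0, n, tile)]

def tile_sum(pos, q, rows, cols):
    # Partial Debye sum over one tile, i.e. one (row block, column block) pair.
    # Assumption: all scattering factors are 1.
    return sum(sinc(q * math.dist(pos[i], pos[j])) for i in rows for j in cols)

def build_page(pos, q, blocks):
    # One "page" holds all tile sums for a single scattering momentum q.
    nb = len(blocks)
    return {(a, b): tile_sum(pos, q, blocks[a], blocks[b])
            for a in range(nb) for b in range(nb)}

def recalc_page(page, pos, q, blocks, moved, tile):
    # Recompute only the tiles whose row or column block contains a moved particle.
    dirty = {i // tile for i in moved}
    changed = [key for key in page if key[0] in dirty or key[1] in dirty]
    for a, b in changed:
        page[(a, b)] = tile_sum(pos, q, blocks[a], blocks[b])
    return len(changed)

def intensity(page, nb):
    # Vertical reduction (sum each tile column), then horizontal reduction of the margin.
    margin = [sum(page[(a, b)] for a in range(nb)) for b in range(nb)]
    return sum(margin)

# Usage: 12 particles on a line, tiles of 4 particles, one q value.
pos = [(float(i), 0.0, 0.0) for i in range(12)]
tile, q = 4, 0.5
blocks = make_blocks(len(pos), tile)
page = build_page(pos, q, blocks)

pos[8] = (8.3, 0.2, 0.0)    # move two particles, both in the last block
pos[9] = (9.1, -0.4, 0.5)
n_changed = recalc_page(page, pos, q, blocks, moved=[8, 9], tile=tile)
updated = intensity(page, len(blocks))
```

In this sketch only the tiles in the row and column of the affected block are recomputed; the remaining tile sums are reused unchanged, which is the saving the tile recalculation aims for.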
point numbers introduce errors due to the finite precision available. These errors tend to
accumulate when a large number of operations is performed, as is the case with the
double sum of the Debye formula. However, the Page-Tile algorithm significantly reduces
this error growth, because its successive partitioning of the problem space results in an
execution pattern resembling pairwise summation [22]. The algorithm can be executed
in single precision (SP) or double precision (DP); DP carries a performance penalty of a
factor of 2 to 4.
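To illustrate why a pairwise-like execution pattern limits error growth, the following sketch compares a left-to-right sum with a recursive pairwise sum, both rounded to single precision after every addition. It is a generic demonstration, not the Page-Tile code: single precision is emulated on the CPU with the standard `struct` module, and the helper names (`to_f32`, `naive_sum32`, `pairwise_sum32`) and the test data are assumptions made for this example.

```python
import math
import struct

def to_f32(x):
    # Round a Python float (double precision) to the nearest single-precision value.
    return struct.unpack("f", struct.pack("f", x))[0]

def naive_sum32(values):
    # Left-to-right accumulation, rounding to single precision after every add.
    total = 0.0
    for v in values:
        total = to_f32(total + v)
    return total

def pairwise_sum32(values, lo=0, hi=None):
    # Split the range in half, sum each half recursively, then add the two halves.
    if hi is None:
        hi = len(values)
    if hi - lo == 0:
        return 0.0
    if hi - lo == 1:
        return values[lo]
    mid = (lo + hi) // 2
    return to_f32(pairwise_sum32(values, lo, mid) + pairwise_sum32(values, mid, hi))

# Many identical single-precision terms; math.fsum gives a double-precision reference.
values = [to_f32(0.1)] * 65536
exact = math.fsum(values)
err_naive = abs(naive_sum32(values) - exact)
err_pairwise = abs(pairwise_sum32(values) - exact)
```

In the naive loop a small term is repeatedly added to an ever larger running total, so rounding errors build up with the number of terms. In the pairwise scheme operands of similar magnitude are combined at each level, and the error grows roughly with the depth of the recursion rather than with the number of terms.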
 