VLSI Models of Computation - Models of Computation: Exploring the Power of Computing

Information Technology Reference

In-Depth Information

Problem

Dimensions

Area

Time

Shifting of n -vector

O ( n )

O ( √ n )

O ( n )

Summing n items

O ( n )

O ( √ n )

O ( n )

Broadcasting to n locations

O ( n )

O ( √ n )

O ( n )

n -point FFT

O ( n )

2D O ( n ) O ( √ n )

Figure 12.3 Area vs. time performance of VLSI algorithms for four problems.

n matrix-

matrix multiplication each require area A and time T satisfying AT 2 =Ω( n 2 ) .Consequently,

the 2D algorithms cited above for these problems are optimal to within a constant factor.

In the next section we now show that every normal algorithm can be implemented on

the cube-connected cycles (CCC) network in time T satisfying Ω(log n )

In Section 12.6 we show that shifting of an n -vector, the n -point FFT, and n

O ( √ n )

and that the CCC network can be embedded in the plane using area A = O ( n 2 /T 2 ) .In

Theorems 12.7.2 and 12.7.3 we show that these implementations are optimal up to constant

multiplicative factors with respect to area and time for the three problems mentioned above.

≤

12.5.3 Layout of the CCC Network

In Section 7.7.6 we describe the realization of a fully normal algorithm on the canonical CCC

network. The realization extends directly from the canonical CCC network to a general ( k , d ) -

CCC network in which there are 2 d cycles and 2 k vertices on each cycle. (See Fig. 12.4 .)

A fully normal algorithm is simulated on the CCC network by giving the processors on

the j th cycle, 0

2 d

1, the addresses i + j 2 k where 0

2 k

≤

−

≤

−

1. The cycles

are treated as 1D arrays and used to simulate a normal algorithm on the first k dimensions

exactlyasisdoneinSection 7.7.6 . These simulations are done in parallel after which the

swaps across the higher-order d dimensions are simulated by first rotating the leading element

on each cycle to the first of the inter-cycle edges. After executing one swap, each cycle is

advanced one step so that the second elements on each cycle are aligned with the first of the

high-order dimensions. At this point the first elements on each cycle are aligned with the edge

associated with the second of the high-order dimensions. Thus, while swaps are done between

the second elements on each cycle across the first of the high-order dimensions, swaps occur

between leading elements along the second of the high-order dimensions. This rotating and

swapping is done until all cycle elements have been swapped across all high-order dimensions.

This algorithm performs O ( 2 k ) steps on the cycles to perform swaps across low-order

dimensions and align the cycles for swaps at higher dimensions. An additional O ( d ) steps are

used to perform swaps on the d high-order dimensions. Thus, the number of steps used by

this algorithm, T , satisfies T = O ( 2 k + d ) . The number of processors used in ( k , d ) -CCC

network, n , satisfies n = 2 d + k .

Models of Computation: Exploring the Power of Computing

Search WWH ::

Custom Search

Home