Digital Signal Processing Reference
Thus, for $P = 4$, $(4 - 1)! = 6$ ROMs are needed. The number of ROMs increases
rapidly as a function of $P$, but the total ROM size grows more slowly than in the
case where a single ROM is used.
For the case where the PUCU index is 0000, the default ROM that provides
PE CU 3 and PE CU 4 is used, even though no connection will be made by the
Exchange Controller. For the case 1111, three ROMs are used, but not at the same
time. When one ROM is used, its size is $2^{4(3+2)}$ (1M entries). On
the other hand, when 6 ROMs are used, the size of each ROM is $2^{2(3+2)}$
(1K entries).
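The ROM arithmetic above can be checked with a short sketch. It assumes the ROM count follows the $(P - 1)!$ pattern that yields 6 for $P = 4$, and it reads the exponents as a number of index fields (4 or 2) times $(3 + 2)$ bits per field. The function names and that reading of the exponents are illustrative assumptions, not definitions from the text.

```python
from math import factorial


def rom_count(p: int) -> int:
    """Number of ROMs, assuming the (P - 1)! pattern that gives 6 for P = 4."""
    return factorial(p - 1)


def rom_entries(index_fields: int, bits_per_field: int) -> int:
    """Entries in a ROM addressed by index_fields fields of bits_per_field bits each."""
    return 2 ** (index_fields * bits_per_field)


BITS_PER_FIELD = 3 + 2  # the (3 + 2) factor that appears in both exponents

single_rom = rom_entries(4, BITS_PER_FIELD)   # one ROM: 2^(4(3+2))
each_of_six = rom_entries(2, BITS_PER_FIELD)  # each of the 6 ROMs: 2^(2(3+2))
total_of_six = rom_count(4) * each_of_six     # combined size of the 6 ROMs

print(single_rom, each_of_six, total_of_six)
```

Under these assumptions the six small ROMs together hold far fewer entries than the single large ROM, which is the trade-off the paragraph describes.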
4 Architecture Evaluation
4.1 Guaranteeing Complete Redistribution
As we have discussed in the previous section, as long as we can maintain the
condition on $N_{RB_i} + N_{TB_{i-1}} + N_{TPR_i}$ for all PE CU i , we can
guarantee that each PE will
get enough particles for the next iteration. During internal particle balancing, the
replication factor of any particle may be large. This replication factor
is modified whenever a particle is transferred (i.e., it is reduced by the splitting
operator) so that it becomes small enough to balance the particles among
the PE CU i s.
In addition to the requirement on the number of particles, deadlock during internal
particle balancing must be avoided. Deadlock occurs when all the PE CU i s become
sources (i.e., they all have particles to send out) or all become destinations (i.e.,
they all need particles). This deadlock is completely eliminated by a priority
decoder, which decides the appropriate source and destination pair during internal
particle balancing. Thus, as long as the resampling provides a total number of replicated
particles ($R_T + N_{RTB_{ui}}$) larger than $M$, which is guaranteed by the
quantization scheme, a perfect redistribution is guaranteed.
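The role of the priority decoder can be illustrated with a small software analogy. This is only a sketch under stated assumptions: each unit is summarized by a signed surplus (positive means it has particles to send, negative means it needs particles), and the decoder uses a fixed lowest-index-first priority. The names `pick_pair` and `balance`, the surplus representation, and the priority rule are hypothetical and are not taken from the architecture itself.

```python
from typing import List, Optional, Tuple


def pick_pair(surplus: List[int]) -> Optional[Tuple[int, int]]:
    """Pick one (source, destination) pair using a fixed lowest-index priority.

    Returns None when no unit has spare particles or no unit needs any,
    so the balancing loop always terminates instead of stalling.
    """
    src = next((i for i, s in enumerate(surplus) if s > 0), None)
    dst = next((i for i, s in enumerate(surplus) if s < 0), None)
    if src is None or dst is None:
        return None
    return src, dst


def balance(surplus: List[int]) -> Tuple[List[int], List[Tuple[int, int, int]]]:
    """Repeatedly move particles from the chosen source to the chosen destination."""
    state = list(surplus)
    transfers = []
    while (pair := pick_pair(state)) is not None:
        src, dst = pair
        moved = min(state[src], -state[dst])  # cannot send more than is spare or needed
        state[src] -= moved
        state[dst] += moved
        transfers.append((src, dst, moved))
    return state, transfers


final_state, log = balance([3, -1, -2, 0])
print(final_state, log)
```

In this sketch, whenever the surpluses sum to zero or more (the analogue of the total replicated particle count being at least $M$), no unit is left with a deficit when the loop stops.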
4.2 Scalability of the Architecture
Thus far, we have implicitly assumed that the number of PEs is a power of two. However,
the architecture is perfectly scalable in the sense that the execution time is $M/P$,
without overhead, for any number of PEs, as long as enough busses are available
between the PEs and the CU. The only condition that depends on the number
of PEs (i.e., $P$) is the number of split operations required at the PE and CU
interface. The number of split operators required is given by

$$S = \lceil \log_2 P \rceil \qquad (9)$$
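As a quick numeric illustration of Eq. (9) and the $M/P$ execution time, the sketch below tabulates both for a few PE counts. It assumes that the logarithm is rounded up to an integer when $P$ is not a power of two, and the particle count `M = 1024` is an arbitrary example value that does not come from the text. The function names are hypothetical.

```python
def split_operators(p: int) -> int:
    """Number of split operators, assuming S = ceil(log2(P)).

    (p - 1).bit_length() equals ceil(log2(p)) for p >= 1 and avoids
    floating-point rounding.
    """
    if p < 1:
        raise ValueError("the number of PEs must be at least 1")
    return (p - 1).bit_length()


def execution_time(m: int, p: int) -> float:
    """Execution time in particle-processing steps, M / P, with no overhead term."""
    return m / p


M = 1024  # example particle count (assumption for illustration only)
for p in (2, 3, 4, 6, 8):
    print(p, split_operators(p), execution_time(M, p))
```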
 
 