Fig. 10.3. Preprocessing: (a) original image with marked eye positions; (b) eye positions and
subsampled framed image in three resolutions.
while the blobs in the highest resolution do not overlap, blobs in the lowest resolution do.
10.3 Network Architecture
The preprocessed images are presented to a hierarchical neural network, structured
as a Neural Abstraction Pyramid. As shown in Figure 10.4, the network consists of
four layers. From Layer 0 (48 × 36) to Layer 2 (12 × 9), the resolution decreases by a
factor of 2 in both dimensions from each layer to the next, so Layer 1 is 24 × 18. Layer 3
has only a single hypercolumn.
Each layer has excitatory and inhibitory feature arrays. The number of feature arrays
[Figure 10.4 labels: Input, Output, Left eye, Right eye; excitatory and inhibitory feature arrays; Layer 0 (48 × 36), Layer 1 (24 × 18), Layer 2 (12 × 9), Layer 3 (1 × 1)]
Fig. 10.4. Sketch of the network used for learning face localization. It is an instance of the
Neural Abstraction Pyramid architecture. The network consists of four layers, shown from
left to right. Each layer contains excitatory and inhibitory feature arrays. Excitatory projections
are drawn with filled circles, open circles indicate inhibitory projections, and projections
labeled with shaded circles can have any sign.
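
The layer resolutions described above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration, not the implementation used for the network: the names Layer and pyramid_layers and their parameters are hypothetical. It assumes that each layer below the top halves the width and height of the previous one, and that the top layer is a single 1 × 1 hypercolumn, as stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """Shape of one pyramid layer (hypothetical helper type)."""
    index: int
    width: int
    height: int


def pyramid_layers(base_width=48, base_height=36, halvings=2):
    """Return the layer shapes of the pyramid described in the text.

    Layer 0 has the base resolution, each of the next `halvings` layers
    has half the width and height of the one before it, and the final
    layer is a single 1 x 1 hypercolumn.
    """
    layers = [Layer(0, base_width, base_height)]
    for i in range(1, halvings + 1):
        prev = layers[-1]
        # Assumption: resolutions must halve exactly, without rounding.
        if prev.width % 2 or prev.height % 2:
            raise ValueError("resolution must be divisible by 2 at every step")
        layers.append(Layer(i, prev.width // 2, prev.height // 2))
    # The top layer does not follow the halving rule: it is one hypercolumn.
    layers.append(Layer(halvings + 1, 1, 1))
    return layers


for layer in pyramid_layers():
    print(f"Layer {layer.index}: {layer.width} x {layer.height}")
```

With the default arguments this lists the four layers named in Figure 10.4. Note that the step from Layer 2 (12 × 9) to Layer 3 (1 × 1) is not a halving, which is why the sketch treats the top layer as a special case.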