We present a new mathematical model of the visual cortex, which takes into account and integrates geometric and probabilistic aspects.
From a geometric point of view, the cortex has been recently described as a noncommutative Lie group, equipped with a sub-Riemannian metric (Hoffman, 1989; Ben Shahar and Zucker, 2003; Bressloff et al., 2001; Citti and Sarti, 2006; Franken et al., 2007; Petitot and Tondut, 1999). The associated Lie algebra is generated by two vector fields that have integral curves in perfect agreement with the shape of association fields of Field, Hayes, and Hess (1993), neurally implemented by the horizontal connectivity of the visual cortex.
The simple cells are able to extract position and orientation of boundaries (i.e., position and momentum variables); therefore, in Sarti, Citti, and Petitot (2008), the structure of the cortex has been identified with a symplectic space, which in turn can be interpreted as the phase space of the retinal plane. The elements of this phase space are operators and, in particular, the generators of the group. Then the cortical manifold is formally equivalent to the phase space in quantum field theory.
A similar point of view was already implicitly taken by Daugman (1985), when he proposed to use the minima of the classical Heisenberg uncertainty principle to model receptive profiles. Here we prove a similar uncertainty principle, but we deduce it directly from the noncommutativity of the geometrical structure. The minima of the principle are functions able to detect position and orientation with the smallest uncertainty; they are called coherent states.
The natural mapping from a world space to its phase space is performed by the Bargmann transform obtained via convolution with the coherent states. It will then be shown that the action of the simple cells, which is exactly a convolution with the receptive filters, performs such a transform. The norm of the output of the simple cells is generally interpreted as an energy function, always positive, output of the complex cells. Hence, the norm of the Bargmann transform, suitably normalized, will be considered as a probability measure. Consequently, a natural operator accounting for the probability distribution is associated with the image, namely the density operator (Carmichael, 2002).
This approach is in agreement with the probabilistic model proposed by Mumford (1994) and further exploited in August and Zucker (2003); Duits and Franken (2007); Franken, Duits, and ter Haar Romeny (2007); Williams and Jacobs (1997a); and Zucker (2000). They formulated the assumption that the signal in the cortex can be described as a Markov process. This consideration leads, in turn, to a Fokker-Planck equation in the cortical phase space. Its solution expresses the probability that a point with a specific direction belongs to a contour, and it is implemented by the horizontal connectivity in the three-dimensional (3D) cortical space. The output of the Bargmann transform containing information about image boundaries is propagated by the Fokker-Planck equation, resulting in boundary completion and the filling in of the figure.
Projecting the propagated solution from the 3D cortical space to the retinal plane, we obtain the density operator giving a tensorial representation associated to the perceived image.
In the next section, the neurogeometry of the cortex is presented, following Citti and Sarti (2006). In the third section, we will reinterpret the neurogeometric structure from a probabilistic point of view, by replacing the vector fields that generate the Lie algebra with the corresponding operators (i.e., performing the “second quantization”).
The Neurogeometrical Structure of the Cortex
The Set of Receptive Profiles and the Lie Group of Rotation and Translation
The retinal plane M will be identified with the two-dimensional (2D) plane R2 with coordinates (x,y). When a visual stimulus I(x,y): M ⊂ R2 → R+ activates the retinal layer, the cells centered at every point (x,y) of M process the retinal stimulus in parallel with their receptive profile (RP). The RP of a single cell can be modeled as a Gabor filter (see Figure 7.1):
All the other observed profiles can be obtained by translation and rotation of ψ0(ξ,η) (see Lee, 1996). We will denote by A_{x,y,θ} the action of the group of rotation and translation on R2, which associates to every vector (ξ,η) a new vector (ξ̃,η̃) according to the following law:
Figure 7.1 The real (left) and imaginary (right) parts of a Gabor filter, modeling simple cell receptive profiles.
The action of the group on the set of profiles then becomes
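The construction of the bank of profiles can be sketched numerically. The snippet below is a minimal illustration, not the authors' implementation: it assumes a mother profile of the form exp(-(ξ² + η²)) exp(2iη) (a common normalization in the neurogeometry literature, assumed here) and assumes that the profile at (x, y, θ) is the mother profile evaluated at the point obtained by undoing the translation and the rotation. The function names `mother_profile` and `profile` are hypothetical.

```python
import numpy as np

def mother_profile(xi, eta):
    """Assumed mother Gabor: Gaussian envelope times a complex wave along eta."""
    return np.exp(-(xi**2 + eta**2)) * np.exp(2j * eta)

def profile(xi, eta, x, y, theta):
    """Profile centred at (x, y) with orientation theta.

    Assumption: the group acts by a rotation of angle theta followed by a
    translation by (x, y), so the profile is the mother profile evaluated
    at the inverse image of (xi, eta).
    """
    dx, dy = xi - x, eta - y
    xi_t = np.cos(theta) * dx + np.sin(theta) * dy    # undo the rotation
    eta_t = -np.sin(theta) * dx + np.cos(theta) * dy
    return mother_profile(xi_t, eta_t)

# Sample the real and imaginary parts of one rotated profile on a grid.
s = np.linspace(-3.0, 3.0, 61)
XI, ETA = np.meshgrid(s, s, indexing="xy")
bank_member = profile(XI, ETA, 0.0, 0.0, np.pi / 4)
real_part, imag_part = bank_member.real, bank_member.imag
```

With this convention the wave of the profile oscillates along the direction (-sin θ, cos θ), so the profile is tuned to contours whose tangent is (cos θ, sin θ).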
The Action of Simple Cells and the Lie Algebra
The overall output O of the parallel filtering is given by the integral of the signal I(ξ,η) times the bank of filters:
The orientation selectivity of the output O is very weak. Several models have been proposed to explain the emergence of strong orientation selectivity in the primary visual cortex. Although the basic mechanism producing strong orientation selectivity is controversial (“push-pull” models [Miller, Kayser, and Priebe, 2001; Priebe et al., 1998], “emergent” models [Nelson, Sur, and Somers, 1995], and “recurrent” models [Shelley et al., 2000], to cite only a few), it is evident that the intracortical circuitry is able to filter out all the spurious directions and to keep only the direction of maximum response of the simple cells.
For (x,y) fixed, we will denote by θ̄ the point of maximal response:
We will then say that the point (x,y) is lifted to the point (x,y,θ̄) (see Figure 7.2). If all the points of the image are lifted in the same way, the level lines of the 2D image I are lifted to new curves in the 3D cortical space (x,y,θ).
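The filtering and lifting steps can be sketched as follows. This is a minimal illustration under stated assumptions: the mother profile is taken to be exp(-(ξ² + η²)) exp(2iη), the integral is approximated by a Riemann sum on a small window, and the intracortical selection is replaced by a plain argmax of the modulus of the output over a discrete set of orientations. The function names and the test image `grating` are hypothetical.

```python
import numpy as np

def simple_cell_output(image_fn, x, y, thetas, half_width=4.0, n=81):
    """Approximate O(x, y, theta) by a Riemann sum of image times profile."""
    s = np.linspace(-half_width, half_width, n)
    step = s[1] - s[0]
    dx, dy = np.meshgrid(s, s, indexing="xy")
    samples = image_fn(x + dx, y + dy)
    out = np.empty(len(thetas), dtype=complex)
    for i, th in enumerate(thetas):
        # Coordinate across the preferred orientation (rotation undone).
        eta_t = -np.sin(th) * dx + np.cos(th) * dy
        psi = np.exp(-(dx**2 + dy**2)) * np.exp(2j * eta_t)
        out[i] = np.sum(samples * psi) * step**2
    return out

def lift(image_fn, x, y, thetas):
    """Return the orientation of maximal response at the point (x, y)."""
    response = np.abs(simple_cell_output(image_fn, x, y, thetas))
    return thetas[np.argmax(response)]

# Hypothetical test image: a grating whose level lines have orientation alpha.
alpha = 5 * np.pi / 16

def grating(xi, eta):
    return 1.0 + np.cos(2.0 * (-np.sin(alpha) * xi + np.cos(alpha) * eta))

thetas = np.arange(16) * np.pi / 16
theta_bar = lift(grating, 0.0, 0.0, thetas)   # selected orientation at the origin
```

In this example the selected orientation coincides with the orientation of the level lines of the grating, and the response varies only gradually with θ around the maximum, which is consistent with the weak selectivity of the raw output noted above.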
Figure 7.2 A two-dimensional curve (in blue) and its three-dimensional cortical lifting in the roto-translation group (in red). The tangent vector to the blue curve is (cos(θ), sin(θ)), so that the tangent vector to its lifted curve lies in the plane generated by X1 and X2.
The vector (cos(θ̄), sin(θ̄)) is tangent to the level lines of I at the point (x,y), so that the selected value θ̄ is the orientation of the boundaries of I. It has also been proven in Citti and Sarti (2006) that the 3D lifted curves are tangent to the plane generated by the vector fields
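In the coordinates (x, y, θ) of the cortical space, these two fields are usually written as follows. The expression is recalled from the standard roto-translation setting and should be read as an assumption about sign and ordering conventions:

```latex
X_1 = \cos\theta\,\partial_x + \sin\theta\,\partial_y,
\qquad
X_2 = \partial_\theta .
```

With this choice, X1 points along the direction (cos θ, sin θ) in the retinal plane, while X2 changes only the orientation variable.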
The Noncommutative Structure
We explicitly note that the vector fields X1 and X2 are left invariant with respect to the group law of rotations and translations, so that they are the generators of the associated Lie algebra. If we compute the commutator, we obtain the vector X3:
Because it is different from 0, the Lie algebra is not commutative. Moreover, X3 is linearly independent of X1 and X2.
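As a worked check, assume the generators have the usual roto-translation form X1 = cos θ ∂x + sin θ ∂y and X2 = ∂θ (an assumption on conventions; the opposite ordering of the bracket only flips the sign). Applying the commutator to a smooth test function f gives:

```latex
\begin{aligned}
[X_1, X_2] f &= X_1(X_2 f) - X_2(X_1 f) \\
&= (\cos\theta\,\partial_x + \sin\theta\,\partial_y)\,\partial_\theta f
   - \partial_\theta\bigl(\cos\theta\,\partial_x f + \sin\theta\,\partial_y f\bigr) \\
&= \sin\theta\,\partial_x f - \cos\theta\,\partial_y f ,
\end{aligned}
\qquad\text{so}\qquad
X_3 = \sin\theta\,\partial_x - \cos\theta\,\partial_y .
```

The mixed second derivatives cancel, and only the terms in which ∂θ acts on the coefficients survive. The coefficient matrix of (X1, X2, X3) in the basis (∂x, ∂y, ∂θ) has determinant cos²θ + sin²θ = 1, so the three fields span the tangent space at every point.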
This noncommutative property can be observed starting from the integral curves of the vector fields Xi. Starting from a point (0,0,0) and moving first in the direction of the vector field X1 and then in the direction of the vector field X2, we reach a point (x, y, θ), different from the point (x1, y1, θ1) we could reach by moving along the vector field X2 first and then along the vector field X1 (see Figure 7.3).
Figure 7.3 The composition of two integral curves of the roto-translation group is noncommutative, depending on the order of application of the vector fields.
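The dependence on the order of composition can be checked directly. The sketch below assumes the usual roto-translation generators X1 = cos θ ∂x + sin θ ∂y and X2 = ∂θ (an assumption on conventions), for which both flows have closed forms; the function names are hypothetical.

```python
import numpy as np

def flow_X1(p, t):
    """Move for time t along X1: translate in the direction (cos th, sin th)."""
    x, y, th = p
    return (x + t * np.cos(th), y + t * np.sin(th), th)

def flow_X2(p, s):
    """Move for time s along X2: change only the orientation."""
    x, y, th = p
    return (x, y, th + s)

origin = (0.0, 0.0, 0.0)
t, s = 1.0, np.pi / 2

p_12 = flow_X2(flow_X1(origin, t), s)   # X1 first, then X2
p_21 = flow_X1(flow_X2(origin, s), t)   # X2 first, then X1
```

Moving along X1 first translates along the x axis and then turns in place, whereas turning first sends the subsequent translation along the y axis. Both paths end with the same orientation but at different positions.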
Association Fields and Integral Curves of the Structure
The natural curves of the structure are the integral curves of the vector fields X1 and X2, starting at a fixed point (x0, y0, θ0):
and obtained by varying the parameter k in R (Figure 7.4). These curves can be used to model the local association field as described in Field, Hayes, and Hess (1993) and Gove, Grossberg, and Mingolla (1995).
Figure 7.4 The association fields of Field, Hayes, and Hess (Field, D.J., Hayes, A., and Hess, R.F., Vision Res., 33, 173-193, 1993) (left) and the integral curves of the vector fields X1 and X2 with constant coefficients, see Equation 7.7 (right).
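A fan of such curves can be generated in closed form. The sketch below assumes that the curves solve γ' = X1(γ) + k X2(γ) with constant k, where X1 = cos θ ∂x + sin θ ∂y and X2 = ∂θ (an assumption on conventions), so that x' = cos θ, y' = sin θ, and θ' = k. The function name `integral_curve` is hypothetical.

```python
import numpy as np

def integral_curve(x0, y0, theta0, k, t):
    """Closed-form solution of x' = cos(theta), y' = sin(theta), theta' = k."""
    t = np.asarray(t, dtype=float)
    theta = theta0 + k * t
    if k == 0:
        # Straight line in the direction (cos theta0, sin theta0).
        x = x0 + t * np.cos(theta0)
        y = y0 + t * np.sin(theta0)
    else:
        # Arc whose planar projection is a circle of radius 1/|k|.
        x = x0 + (np.sin(theta) - np.sin(theta0)) / k
        y = y0 - (np.cos(theta) - np.cos(theta0)) / k
    return x, y, theta

# Fan of curves leaving the origin with orientation 0, one per value of k.
t = np.linspace(0.0, 1.0, 50)
fan = {k: integral_curve(0.0, 0.0, 0.0, k, t) for k in (-2, -1, 0, 1, 2)}
```

The planar projection of each curve is a straight line (k = 0) or an arc of a circle of radius 1/|k|, which is the sense in which k plays the role of a curvature.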
We can define the distance between two points as the length of the shortest path connecting them. In the Euclidean case, the minimum is taken over all possible curves, while here we will minimize only over the set of integral curves of the vector fields X1 and X2 (Nagel, Stein, and Wainger, 1985). Using the standard definition, we call length of any curve γ
It can be proven that the parameter k expresses the curvature of the projection of the curve γ on the plane (x, y) (see Citti and Sarti, 2006). Hence, the geodesics of the group structure, which minimize this quantity, are elastica, as introduced by Mumford (1994) for perceptual completion. Shown in Figure 7.5 is the completion of a Kanizsa triangle with curved boundaries by means of the geodesics in the group.
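As a worked illustration, assume that X1 and X2 are declared orthonormal (the usual sub-Riemannian convention, recalled here as an assumption) and that the curve has tangent γ' = X1 + k X2 on an interval [0, T]. The length then reduces to:

```latex
\lambda(\gamma) = \int_0^T \|\gamma'(t)\|\,dt
               = \int_0^T \sqrt{1 + k(t)^2}\,dt .
```

For constant k this gives λ(γ) = T √(1 + k²), so among arcs of equal parameter length the straight one (k = 0) is the shortest, and any curvature of the planar projection is penalized.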