We can see from Figure 8.1 that the identification of occluding contours is important for two reasons. First, they separate features of different objects, which have to be kept separate in the processing. Second, they carry information about the shape of the occluding object. Although an occluding contour is also the boundary to a background region, it should not be processed with that region; the shape of the background region is an accidental product of the situation of occlusion (Nakayama, Shimojo, and Silverman, 1989). Thus, the identification of occluding contours involves two tasks: finding the borders and assigning them to the correct side, the foreground side.
For detection, a variety of cues are available. A color discontinuity is probably the most reliable indicator; because foreground and background are usually unrelated, there is a high chance that they differ substantially in color. This is exploited by simple and complex cells that effectively extract contrast borders (see Figure 8.2 top). Another possibility is detecting discontinuity of depth, which is illustrated in Figure 8.4 (cells 2 and 3). Relative motion and dynamic occlusion are important cues, too.
The second task, assigning the contours to the foreground, requires a somewhat different repertoire of mechanisms. Discontinuity of color, which is powerful for detection of contours, provides no clues as to which side is foreground and which is background. The border between a gray and a brown region could be either the contour of a gray object on one side or the contour of a brown object on the other side. Discontinuity of depth, on the other hand, is a perfect cue because it tells which side is in front. Cells that respond to edges in random-dot stereograms are usually (75%) selective for the polarity of the edge (depth ordering of surfaces). An example is cell 3 in Figure 8.4, which responded to the far-near edge in the random-dot stereogram (left-hand edge of square) but not to the near-far edge (right-hand edge of square).
FIGURE 8.5 Assignment of “border ownership” in perception. (a) Physically, the square is just a region of higher luminance surrounded by a region of lower luminance, but perceptually, it is an object, and the border between the regions is “owned” by the object. (b) A figure in which border ownership is ambiguous. (c) On the left, four squares are perceived as four different objects, whereas similar squares on the right are perceived as two crossbars, one of which appears transparent. Note the different assignment of border ownership (arrows).
Gestalt psychologists made the surprising observation that the visual system assigns borders even in the absence of unequivocal cues such as disparity in a random-dot stereogram. In Figure 8.5a, the white region is perceived as an object and the black-white boundary as its contour. Apparently, the system has rules to decide which regions are likely to be objects and which are ground, and applies these “automatically.” This compulsion to assign figure and ground is nicely demonstrated by Rubin’s vase figure (Rubin, 1921). My version is shown in Figure 8.5b. Here, different rules seem to be in conflict, and the assignment flips back and forth between the two alternatives, both of which lead to perceptions of familiar shapes.
A change in the assignment of border ownership not only affects the depth ordering of surfaces; it can also imply a restructuring of the perceived objects. Figure 8.5c shows, on the left, four squares with rounded corners. On the right, the same squares are shown without the rounding. It can be seen that perception reorganizes the regions: two bars are now perceived instead of the four squares, and the center region, which was ground before, now appears as a transparent overlay.
The perceptual interpretations demonstrated in Figure 8.5 obviously involve the image context. For example, the edges marked by arrows in Figure 8.5c are locally identical in both configurations, but the assignment differs according to the context. Altering some details elsewhere in the display leads to a new interpretation.
The concepts of border ownership and figure-ground have been used in parallel in the literature to describe those perceptual phenomena. The power of the border ownership concept in interpreting and modeling the perceptual observations was recognized (Nakayama, Shimojo, and Silverman, 1989; Sajda and Finkel, 1995), but how the brain codes border ownership, or figure-ground, has long been a mystery. Sajda and Finkel speculated that synchronous oscillations might be the vehicle of the neural coding. However, recent single-cell recordings have suggested a different answer that is surprisingly simple (Zhou, Friedman, and von der Heydt, 2000). The key finding is that individual neurons have a (fixed) border ownership preference, and that each piece of contour is represented by two groups of neurons with opposite border ownership preferences (Figure 8.6). Thus, the differential activity of the two groups signals the direction of border ownership.
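The opponent coding scheme described above can be made concrete with a minimal sketch. It assumes that the activity of each group is summarized by its mean firing rate and that the sign of the difference indicates the owning side. The function names, the threshold parameter, and the firing rates in the usage note are hypothetical and chosen only for illustration; they are not taken from the recordings.

```python
def border_ownership_signal(left_pref_rates, right_pref_rates):
    """Differential activity of two neuron groups at one piece of contour.

    left_pref_rates  -- firing rates (spikes/s) of neurons preferring
                        ownership by the left side (hypothetical input)
    right_pref_rates -- firing rates of neurons preferring the right side
    Returns mean(left) - mean(right); positive values favor "left".
    """
    mean_left = sum(left_pref_rates) / len(left_pref_rates)
    mean_right = sum(right_pref_rates) / len(right_pref_rates)
    return mean_left - mean_right


def owner_side(signal, threshold=0.0):
    """Read out the owning side from the sign of the differential signal.

    The threshold is an assumption of this sketch: signals whose magnitude
    does not exceed it are reported as ambiguous.
    """
    if signal > threshold:
        return "left"
    if signal < -threshold:
        return "right"
    return "ambiguous"
```

For example, with made-up rates of 20 and 30 spikes/s in the left-preferring group and 10 and 10 spikes/s in the right-preferring group, `border_ownership_signal` returns 15.0 and `owner_side` reports "left". The readout depends only on the difference between the groups, not on the overall response level.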
Support for this view is provided, among other observations, by a comparison of the neuronal responses to color-defined figures and random-dot stereograms (Figure 8.7). The raster plots show the responses of a V2 neuron under eight different conditions, as illustrated by the stimulus cartoons, which should be interpreted as perspective drawings of the stimulus configurations. The four conditions at the top (a,b,c,d) are the tests with color-defined figures: figure locations left and right of the receptive field are illustrated in left and right columns; two rows, a-b and c-d, show tests with the two contrast polarities.
FIGURE 8.6 Border ownership selectivity in neurons of the visual cortex. (a) Example neuron of area V2. Conventions are as in Figure 8.3. Note the response difference between conditions illustrated left and right. The left and right stimuli are locally identical, as shown in (b). (c) Time course of averaged responses of V2 neurons; thick line, preferred side of figure; thin line, nonpreferred side. Each neuron was also tested with displays of reversed contrast (not illustrated), and responses for both contrast polarities were averaged. (d) The differential response between neurons with opposite side preference is thought to signal border ownership.
The bottom part shows the stereoscopic test. Again, left and right columns show left and right figure locations, but in this test, the “figure” was either a square floating in front of a background (e,h), or a window through which a surface in the back was visible (f,g). It can be seen that the neuron preferred figure-left for the color-defined displays. For stereograms, it responded in the conditions where the front surface was to the left of the receptive field (e,f) but was silent when the front surface was to the right (g,h). Thus, with stereograms, which define depth order of surfaces unequivocally, the neuron responded according to the stereoscopic edge in its receptive field and was selective for left border ownership. Under these conditions, the location of the square shape was irrelevant: the edge could be the right-hand edge of a square figure (e), or the left-hand edge of a square window (f). But with color-defined displays, the neuron responded according to the location of the shape: figure-left produced stronger responses than figure-right. Thus, in the absence of the disparity cue, the neuron “assumes” that the square is an object and the surrounding region is the background. This is exactly how we perceive the white square in Figure 8.5a.
FIGURE 8.7 Responses of an example neuron to contours of luminance-defined figures (a,b,c,d) and to disparity-defined figures (e,f,g,h). The depth configurations are illustrated schematically, by depicting the square figures (e,h) and windows (f,g) in perspective. Oval indicates projection of receptive field.
The reorganization of surfaces demonstrated in Figure 8.5c can also be observed in the neuronal responses of V2, as illustrated in Figure 8.8 (Qiu and von der Heydt, 2007). We compared displays of a single square (Figure 8.8a) with the transparent display (Figure 8.8b) and a display of four squares with rounded corners, which does not produce perception of transparency (Figure 8.8c). Note that in the top row of displays, the edge under the receptive field (ellipse) is owned left in (a) and (c), but right in (b). The ownership is opposite in the displays below. The curves show the time course of the averaged border ownership signals of V2 neurons. It can be seen that the signal for (b) was reversed compared to (a) and (c). This shows that the neurons assign border ownership according to the transparent interpretation. Particularly striking is the difference between the responses to (b) and (c), which differ only in the presence or absence of “X-junctions.” The signal reversal in (b) occurred with a slight delay (dotted vertical lines). These results show that V2 codes configuration (b) as two crossed bars rather than representing the four squares that make up the display. It reorganizes the visual information in terms of plausible objects.
FIGURE 8.8 Border ownership signals parallel perceptual reorganization in transparency displays. In (a) and (c), top, the border marked by a dashed oval is owned left, but in display (b), top, it is owned right. In the displays below, ownership is reversed. The local edge is identical in all six displays. The curves show the corresponding border ownership signals (difference between responses to displays at top and below); average of 127 cells, error bands indicate standard error of the mean (SEM).
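The population curves described in the caption of Figure 8.8 (difference between responses to the top and bottom displays, averaged across cells, with SEM bands) can be sketched as follows. This is an illustrative computation only, not the authors' analysis code: the function names, the data layout (one list of time-binned responses per cell), and the absence of any per-cell normalization are assumptions.

```python
import math


def mean_and_sem(values):
    """Return (mean, standard error of the mean) of a list of numbers.

    Uses the sample standard deviation (n - 1 in the denominator).
    With fewer than two values the SEM is undefined and returned as NaN.
    """
    n = len(values)
    mean = sum(values) / n
    if n < 2:
        return mean, float("nan")
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


def population_signal(top_responses, bottom_responses):
    """Average border ownership signal over cells, per time bin.

    top_responses[i][t]    -- response of cell i in time bin t to the top display
    bottom_responses[i][t] -- response of the same cell to the display below
    Returns one (mean, SEM) pair per time bin, where each cell contributes
    the difference top - bottom (sign convention assumed for this sketch).
    """
    n_bins = len(top_responses[0])
    curve = []
    for t in range(n_bins):
        diffs = [top[t] - bottom[t]
                 for top, bottom in zip(top_responses, bottom_responses)]
        curve.append(mean_and_sem(diffs))
    return curve
```

Because the local edge is identical in the paired displays, any nonzero mean in such a curve reflects the influence of the image context rather than the stimulus in the receptive field.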