Advances in Automated Restoration of Archived Video (Digital Imaging) Part 2

Further Refinement

The reader would be forgiven for thinking that the Bayesian approach outlined above captures the essentials of the missing data problem in video sequences and is completely successful at removing degradation. Sadly, its great weakness lies in the underlying image sequence prediction model used for the data likelihood. In real image sequences it is quite common for motion blur, self-occlusion, and non-rigid motion (e.g., cloth, transparent/translucent objects, and specular reflections) to conspire to break that model. So while the method above works well in many cases, when it fails it fails in unusual ways: single-frame specular highlights may be removed as if they were dirt, and blur may increase at the boundaries of motion-blurred regions. Furthermore, in a real application a user may wish to remove dirt but leave the noise intact. In that case the decision-making process that rejects dirt can leave "blobs" of discontinuous material that are not in the right place. Note, however, that when a spatiotemporal linear prediction (3DAR) model is used as the image sequence model, these effects are reduced in the sense that erroneously interpolated areas tend to be smoother.

This problem was called Pathological Motion (PM) in the Aurora3 project of 1995-1999. P. van Roosmalen [4], Raphael Bornard [29], and Andre Rares [30-32] were the first to address the issue. In each case the idea that worked was to turn off dirt detection wherever PM was detected. The most reliable PM detector used the density of discontinuity detections to flag regions showing unusually high self-occluding activity, and hence most likely to be false alarms. PM detection itself was a post-process applied after detecting temporal discontinuities of any type. To put it another way, we do not expect blotched pixels to appear in the same location in successive frames; if that is observed, something is likely wrong with the detection process, and it is better to turn off the restoration than to risk damaging good pictures. A simpler, practical formulation was developed by Kent et al. [33].
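The repeated-discontinuity heuristic can be sketched in a few lines. This is an illustrative reading of the idea, not code from the cited work; the function name, window size, and density threshold are my own choices.

```python
import numpy as np

def pm_from_repeated_discontinuities(disc_prev, disc_curr, density_thresh=0.5, win=5):
    """Flag pathological motion (PM) where temporal discontinuities repeat.

    disc_prev, disc_curr: boolean maps of temporal discontinuities detected
    in two consecutive frames. True blotches (film dirt) almost never occupy
    the same site in successive frames, so repeats suggest PM rather than dirt.
    """
    repeated = disc_prev & disc_curr
    # Local density of repeats inside a win x win neighbourhood.
    pad = win // 2
    padded = np.pad(repeated.astype(float), pad, mode="constant")
    H, W = repeated.shape
    density = np.zeros((H, W))
    for dy in range(win):
        for dx in range(win):
            density += padded[dy:dy + H, dx:dx + W]
    density /= win * win
    # Where the repeat density is high, disable dirt removal (PM region).
    return density > density_thresh
```

In a pipeline, pixels flagged by this map would simply be excluded from the blotch mask before interpolation.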


Top row: showing four frames from four different sequences exhibiting pathological motion. Second row from top: PM/blotch detection in red/green respectively. Third row: restoration employing standard detection without PM detection. Incorrectly reconstructed areas are highlighted in cyan. Last row: restoration with PM detection embedded in the blotch removal process. Note the better preservation of the PM regions while still rejecting blotches.

FIGURE 11.7


The Kent approach is based on a simple global motion estimator used to detect foreground regions, and it takes the conservative step of reducing the probability of detecting a blotch in those regions. The algorithm trades some detection performance for a much lower computational load. The idea of using PM detection to turn off dirt removal has been very successful in general and was generalized in work by Corrigan et al. [34]. Corrigan uses two features for PM detection: (1) discontinuities detected repeatedly in the same position in consecutive frames, and (2) unusually divergent motion fields. These constraints are built into the discontinuity estimation step itself, so the process does not treat PM detection as a post-process (as in the work of Bornard, Rares, and van Roosmalen) but integrates it with the blotch detection optimization in a unified framework. Figure 11.7 shows examples of four sequences in which PM occurs. These examples represent typical problems: clothing, periodically rotating objects, fast-moving and warping objects (birds), and motion blur (in the chopper blades). The third row of that figure shows what happens when a standard blotch removal process [2] is used to detect the blotches and remove them with a 3D median filter. The highlighted regions exhibit very poor reconstruction, since the motion information is simply wrong in PM areas. The second row shows the result of Corrigan's process, highlighting in red the areas detected as PM and in green the areas detected as blotches. The last row shows how much the picture reconstruction improves because of the reduction of false alarms in PM areas.
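Corrigan's second cue, an unusually divergent motion field, amounts to thresholding the divergence of the estimated flow. A minimal sketch (the threshold value and function names are assumptions for illustration):

```python
import numpy as np

def motion_divergence(u, v):
    """Divergence of a dense motion field, where u is the horizontal and
    v the vertical flow component. Natural motion fields are mostly
    smooth, so a large |divergence| is a cue for pathological motion."""
    du_dx = np.gradient(u, axis=1)
    dv_dy = np.gradient(v, axis=0)
    return du_dx + dv_dy

def pm_from_divergence(u, v, thresh=1.0):
    """Flag pixels whose local flow divergence exceeds a threshold."""
    return np.abs(motion_divergence(u, v)) > thresh
```

A uniform translation has zero divergence everywhere, while a field that expands or collapses around a point (typical of warping cloth or rotor blades) scores highly.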

Vlachos et al. [15] have attempted to create more robust systems by performing spatial discontinuity detection first, followed by a temporal validation step. This reduces the computation involved and works well for small, high-contrast missing areas. In [16], two uncorrelated blotch detectors are fused to improve performance, based on the assumption that conflict between the two detectors indicates a false alarm. The detectors employ different features: one is temporal while the other is spatial. In contrast with the work in [15], the temporal detection is applied first, leaving validation of the detection results to the spatial detector. The SROD detector [4] is employed in the first step, looking for outliers of the intensity distribution at each analyzed pixel, where the distribution is extracted from adjacent frames in the image sequence. A modified morphological spatial filter is used in the validation step, in order to make it suitable also for semi-transparent objects. Fusion is achieved with machine learning algorithms.
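A rank-order outlier test in the spirit of SROD can be sketched as follows. This simplified version skips motion compensation (it assumes the frames are already aligned) and takes fixed 3x3 reference neighbourhoods, so it is an illustration of the principle rather than the detector of [4]; names and the threshold are mine.

```python
import numpy as np

def srod_like_detect(prev_f, curr, next_f, T=25):
    """Flag a pixel as a blotch when it falls well outside the intensity
    range spanned by reference samples from the adjacent frames.

    Reference samples per pixel: the co-located 3x3 neighbourhoods of the
    previous and next frames (18 samples in all).
    """
    H, W = curr.shape
    refs = []
    for frame in (prev_f, next_f):
        p = np.pad(frame.astype(float), 1, mode="edge")
        for dy in range(3):
            for dx in range(3):
                refs.append(p[dy:dy + H, dx:dx + W])
    refs = np.stack(refs)                 # shape (18, H, W)
    lo, hi = refs.min(axis=0), refs.max(axis=0)
    c = curr.astype(float)
    return (c < lo - T) | (c > hi + T)
```

A pixel that matches neither the past nor the future by more than T grey levels is the classic temporal signature of dirt.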

Despite these improvements, blotches which are poorly contrasted still evade detection. This has led recent work to consider variants of the corruption model and this is discussed next.

Semi-Transparent Defects

Many missing data defects do not affect the image in a catastrophic way. In fact many blotches, and certainly line scratches, still retain part of the original information in the degraded area [13,35-37]. These are semi-transparent defects. Examples are shown in Figure 11.8.

The mixture model of Equation 11.3 is therefore not appropriate for these defects. Semi-transparent blotches can appear as irregularly shaped regions (see Figure 11.8) with variable color and intensity. They are commonly caused by contact with moisture, so that underlying image details remain in the affected area while the color and/or average intensity is changed. Note that even with opaque blotches the edge of the blotch region shows a measure of transparency. Despite the huge amount of work done on the restoration of opaque blotches (missing data defects), very little has been done specifically for semi-transparent blotches. As pointed out in [16], semi-transparency, large size, and low visibility are the main reasons why standard methods fail on this kind of degradation. Pure in-painting methods [38] that synthesize information may not be suitable for this task [39,40]. Even for semi-transparent blotches, temporal information alone cannot completely characterize the defect. For this reason, some authors have paid particular attention to the spatial characterization of the defect. The spatial result can then be validated using inter-frame information to avoid further ambiguities; see [15,16] for a thorough discussion.

Examples of semi-transparent defects. The slightly visible semi-transparent blotch is circled in the right image.

FIGURE 11.8


In 2008 Bruni et al. [41] were the first to approach the detection of semi-transparent defects by exploiting observations about the human visual system (HVS) response to temporal discontinuities. They consider that the sudden reaction of the HVS to the presence of a semi-transparent blotch corresponds to the projection of the degraded image G into a new space where semi-transparent blotches become the most visible objects in the scene. The projection operator P depends on the physical event that generated the blotch. For instance, for blotches caused by contact with moisture, a suitable projection space is the saturation component of the HSV color space, since moisture causes a mixing of colors. Moreover, human eyes perceive semi-transparent blotches as uniform areas even though they are not; in other words, blotches are at their most visible relative to the rest of the scene at particular resolutions. A low-pass filter simulates this effect: it removes redundant frequencies that are not perceived by the HVS, providing the visual homogeneity of the degraded region (see Figure 11.9). The optimum level of resolution r is a trade-off between the enhancement of the degraded region and the preservation of its geometrical shape and size. Finally, the recognition and selection of the most visible regions in the projection space Pr[G] is performed through suitable distortion measures that account for both global and local visibility of objects in the whole scene, as human eyes do. Conventional contrast definitions are generally pixel-wise measures that treat the analyzed region as an opaque object over a uniform background. Unfortunately, this is not the case for semi-transparent blotches, since they preserve and inherit background features. The proposed region-based distortion measure is the following

tmp271b278_thumb[2]

where T is a threshold value, Ω is the image domain, Ω_T is the visible region whose intensity exceeds T, |Ω_T| is its size, and (x, y) is a point in Ω_T. D1 measures the change in the perception of a given object with respect to the fixed background, before and after its transformation through a threshold-based operator, i.e.,

tmp271b279_thumb[2]

where M is the mean of the degraded image and tmp271b280_thumb[2] is the result of the clipping transformation on tmp271b281_thumb[2]. In other words, D1 evaluates how an object of intensity Pr[G] changes if it is substituted with the threshold value T. D2 measures the change in contrast of the same object over different backgrounds, i.e.,

tmp271b284_thumb[2]

where Mt is the mean of the non-clipped region of Pr [G].

The optimal threshold T is the one such that tmp271b285_thumb[2] is maximum, i.e., the point at which a good separation occurs between the foreground and the background of the image. It is the maximum contrast able to separate different objects of the image without introducing artifacts; from that point on, the clipping operator selects pixels belonging to both the degradation and the background.
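The threshold scan can be sketched as follows. Since the exact D1 and D2 measures are only summarized above, this sketch substitutes an Otsu-style between-class variance as the separability score being maximized; the function name and the candidate grid are likewise illustrative.

```python
import numpy as np

def optimal_threshold(proj, candidates=None):
    """Scan candidate thresholds over a projection Pr[G] and keep the one
    maximising a foreground/background separability score.

    Stand-in criterion: Otsu-style between-class variance; the paper's own
    perception-based D1/D2 product would replace it.
    """
    x = proj.ravel().astype(float)
    if candidates is None:
        candidates = np.linspace(x.min(), x.max(), 64)[1:-1]
    best_T, best_score = candidates[0], -np.inf
    for T in candidates:
        fg, bg = x[x > T], x[x <= T]
        if fg.size == 0 or bg.size == 0:
            continue
        w1, w2 = fg.size / x.size, bg.size / x.size
        score = w1 * w2 * (fg.mean() - bg.mean()) ** 2
        if score > best_score:
            best_T, best_score = T, score
    return best_T
```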

The advantages of this kind of approach are its capability of: (1) detecting all blotches in the scene, even those that are only slightly visible or masked by the underlying image content; (2) fine-tuning all the thresholds and/or parameters involved, since they are computed adaptively from both local and global perception-based measurements; and (3) being independent of the shape and size of the defect. These features make it a good candidate for a spatial detector.
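The projection and smoothing steps described above can also be sketched concisely, assuming a moisture blotch (so the projection is HSV saturation) and using a simple box filter as a stand-in for the HVS low-pass at resolution r; both helper names are mine.

```python
import numpy as np

def saturation(rgb):
    """HSV saturation of an RGB image in [0, 1]: the projection space in
    which moisture blotches stand out."""
    mx = rgb.max(axis=-1)
    mn = rgb.min(axis=-1)
    return np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-9), 0.0)

def box_lowpass(img, r=2):
    """Crude low-pass (box) filter of radius r, standing in for the
    HVS-motivated smoothing that homogenises the degraded region."""
    pad = np.pad(img, r, mode="edge")
    H, W = img.shape
    out = np.zeros_like(img, dtype=float)
    k = 2 * r + 1
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + H, dx:dx + W]
    return out / (k * k)
```

The smoothed saturation map is then what the threshold search operates on.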

Reconstruction

Handling semi-transparency in reconstruction for video sequences requires explicit modeling of the corruption. In 2009 Mohammed et al. [42] approached this problem from the standpoint of matting. Their work is related to work in 2007 by Crawford et al. [39] and the essential equation is in fact a non-binary version of Equation 11.3. Hence we write instead

Gn(x) = (1 − α(x)) In(x) + α(x) Λ

In this equation, α takes the place of b but is non-binary, varying between 0 and 1, while the corruption variable Λ replaces c(x) and is constant. Setting Λ to 255 (for an 8-bit image) then allows a blotch to be modeled as a continuous mixture between the underlying original image and a bright white corruption. In the matting problem, observed objects are modeled as a mixture of hidden foreground and background layers, and the challenge is to extract the best alpha matte as well as the foreground and background layers. The proposed model is identical to the matting problem except that one of the layers, Λ, is known. Hence the semi-transparency restoration problem is posed as the solution of the above equation for α and In at each corrupted pixel site, i.e., the extraction of an α matte and a background layer In. In their work, Mohammed et al. employ the previous and next frames to model the color distribution of the background layer In. Smoothness of the background layer is an important prior and is imposed using the gradients of the surrounding frames in that region as well as an MRF (Markov Random Field); a Graph Cuts algorithm solves for the variables. This is a very interesting model and in theory can be incorporated into the joint Bayesian framework introduced previously in this topic. It appears to be the most suitable model for these artifacts and will no doubt feature heavily in future work.
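The algebra of the mixture model is simple enough to show directly. The sketch below inverts the model for the clean image given a matte, and conversely gives the closed-form matte when a background estimate is available (e.g., a temporal median of motion-compensated neighbours); it illustrates the model only, not the MRF/Graph-Cuts machinery, and the function names are mine.

```python
import numpy as np

def recover_from_blotch(G, alpha, Lam=255.0):
    """Invert G = (1 - alpha) * I + alpha * Lam for the clean image I.

    alpha in [0, 1) is the estimated matte; Lam is the (known) corruption
    intensity, 255 for a bright white blotch in an 8-bit image.
    """
    a = np.clip(alpha, 0.0, 0.999)       # avoid division by zero at alpha = 1
    I = (G.astype(float) - a * Lam) / (1.0 - a)
    return np.clip(I, 0.0, 255.0)

def alpha_from_background(G, I_bg, Lam=255.0):
    """Closed-form matte when a background estimate I_bg is available."""
    num = G.astype(float) - I_bg
    den = Lam - I_bg
    return np.clip(num / np.where(np.abs(den) < 1e-6, 1e-6, den), 0.0, 1.0)
```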

From top to bottom, left to right: Original semi-transparent blotch; its saturation component; the smoothed saturation component at the optimal resolution; global distortion as in Equation (11.7); detection mask.

FIGURE 11.9


Line scratches manifest in much archived footage. They also occur due to accidents in film developing. The color of the scratch depends on which side of the film layer it occurs on. It is often the case that not all of the image information inside the scratch is lost. They are a challenge to remove because they persist in the same location from frame to frame.

FIGURE 11.10


Line Scratches

Another very common form of degradation in film is the line scratch. These are caused during developing, or by material stuck in the film gate abrading or smearing the film over many consecutive frames. Figure 11.10 shows some examples. Early work by Kokaram [7,43] on the detection and removal of this defect concentrated on spatial detection and reconstruction. Detection was limited to straight scratches and relied on the horizontally impulsive nature of the artifact (it is generally very narrow and well contrasted) combined with its longitudinal correlation. The degradation model was additive, as follows.

Gn(x, y) = In(x, y) + Σ_{p=1}^{P} Lp(x − xp)

where Gn is the observed degraded image, In is the original clean image, and there are P lines distributed around location centres xp, each with an intensity cross-section Lp(·) that varies horizontally only. The observed intensity profile of a line scratch is fairly regular: brightest near the center of the scratch and darker toward its extremities. Kokaram proposed a damped sinusoid to model this profile. However, considering that the line scratch can probably be explained as a diffraction effect through the narrow vertical slit created by the abrasion of the film emulsion, a more appropriate model is a sinc function. The width of the observed scratch depends on the width of the slit, while its brightness changes according to the depth of the scratch in the film material. Since the scratch does not penetrate the film material, original image information persists, and so the result is a semi-transparent artifact.
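Under this diffraction interpretation, the cross-section Lp can be sketched as a sinc-type (here sinc-squared) profile added to each row; the widths, amplitudes, and function names below are illustrative choices, not parameters from the cited work.

```python
import numpy as np

def scratch_profile(width, x0, slit_width, depth):
    """sinc^2 cross-section of a line scratch, per the diffraction analogy.

    x0 is the scratch centre (pixels); slit_width controls how fast the
    profile decays; depth scales the brightness change.
    """
    x = np.arange(width, dtype=float)
    return depth * np.sinc((x - x0) / slit_width) ** 2

def add_scratch(frame, x0, slit_width=2.0, depth=80.0):
    """Additive degradation Gn = In + Lp(x - xp), applied to every row."""
    L = scratch_profile(frame.shape[1], x0, slit_width, depth)
    return np.clip(frame.astype(float) + L[None, :], 0.0, 255.0)
```

Synthesising scratches this way is a convenient method of generating test data for a detector.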

In 1996 Kokaram built the line detector by using the Hough Transform to isolate candidates that are narrow, bright vertical lines. These candidates were then validated against the damped sinusoid model to reject false alarms. Four years later, Bretschneider et al. [44] employed the vertical detail component of the wavelet transform of the image to detect the line. A sinc-like shape was assumed for the horizontal line projection tmp271b291_thumb[2] in Equation 11.11, and only the vertical details of the degraded and clean images G, I are used in the degradation model. Given that line visibility is an important aspect of detection, Bruni et al. [36,45] in 2004 introduced aspects of human perception into the degradation model as follows

tmp271b294_thumb[2]

Horizontal cross-section of an image containing line scratches (indicated by arrows).

FIGURE 11.11


The normalized parameter γp balances the semi-transparency of the defect by taking into account its visibility with respect to the surrounding image content, according to Weber's law [46]. In particular, γp approaches zero for strong scratches, while it converges to 1 whenever the scratch is masked by its context in the scene. In that way even slightly visible defects are represented, and they are detected only if their contrast exceeds the just-noticeable threshold [36,46]; see for example the rightmost scratches in Figure 11.11. The resulting algorithm is computationally very fast, making its use in film analysis practical. Moreover, visibility laws make it possible both to avoid false alarms (or at least limit their number) and to tune the perception-based threshold employed. The same arguments remain valid for color films, as shown in [47], where the color of the observed scratch is related to the depth of the slit in the film material. Blue scratches were treated in 2005 [48]: in that work, intense blue scratches are detected as maxima of the horizontal projection of a detection mask created by thresholding hue, saturation, and value amplitude ranges.

Beginning in 1999, Besserer, Joyeux et al. [49-51] took a different approach to detection, exploiting the temporal coherence of the line scratch over several frames. Besserer [49] presented an excellent tracking algorithm (using a Kalman tracker) that connects horizontally impulsive line elements persisting across several frames at a time. Figure 11.12 shows the representation used. Their work yields very high quality detection masks, and it seems sensible that a combination of the ideas of Bruni et al. and Besserer et al. is the way forward.

Removing line scratches, however, is notoriously difficult. While convincing spatial interpolation can be achieved in a single frame, over several frames any error in reconstruction is clearly seen, since it is correlated with the same position in many frames. Example-based texture synthesis, famously introduced by Efros et al. [52], can achieve very good spatial reconstructions, but temporally the result is poor if it is simply repeated on multiple frames. Most of the proposed approaches assume the absence of the original information in the degraded region; see for instance [7,44,51,53-57]. They therefore propagate neighboring clean information into the degraded area. The neighboring information can be found in the same frame [7,44,53,54] or in the preceding and following frames, exploiting temporal coherence, as done in [51,55,56]. The propagation can be performed using in-painting methods, as in [53,54], or interpolation schemes. In [7], an autoregressive filter is used to predict the original image value within the degraded area. A cubic interpolation is used in [58], which also takes into account the texture near the degraded area (see [57] for a similar approach), while in [44] different interpolation schemes are used for the low and high frequency components. Finally, in [55] each restored pixel is obtained by a linear regression using the block in the image that best matches the neighborhood of the degraded pixel. Figure 11.12 shows the problem of poor temporal consistency in a region of local motion; the autoregressive interpolator of Kokaram et al. [43] was used here.

Left: The image sequence representation useful for detecting line scratches. Center: Frame with scratch. Right: Reconstruction of the line scratch in the region of local motion. Errors in reconstruction are much more visible in a sequence. Work on temporally coherent interpolation is important.

FIGURE 11.12

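The interpolation schemes cited above range from autoregressive prediction to cubic fits. As a minimal stand-in, a row-wise linear interpolation across the detected scratch columns illustrates the spatial-propagation idea (the linear choice and the function name are mine, not from the cited work):

```python
import numpy as np

def interpolate_scratch(frame, mask):
    """Row-by-row linear interpolation across masked (scratch) pixels.

    frame: 2D greyscale image; mask: boolean array, True on scratch pixels.
    Each corrupted pixel is replaced by the linear interpolation of the
    nearest clean pixels in its row.
    """
    out = frame.astype(float).copy()
    cols = np.arange(frame.shape[1])
    for y in range(frame.shape[0]):
        bad = mask[y]
        if bad.any() and (~bad).any():
            out[y, bad] = np.interp(cols[bad], cols[~bad], out[y, ~bad])
    return out
```

As the text stresses, such a purely spatial fill looks fine in a still frame but flickers over a sequence unless temporal coherence is enforced.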

A smaller class of restoration approaches assumes the presence of the original information in the degraded area. An additive-multiplicative model is employed in [59]: the image content in the degraded area is modified using a linear model so that it matches the mean and variance of the surrounding original information. In [48] blue scratches are removed by comparing their contribution in the blue and green color channels with that in the red channel, since a blue scratch is assumed to be negligible in the red channel. Visibility laws are used in [45,47] to guide the restoration process as well. The idea is very simple: the contribution of a line scratch must be attenuated until it is no longer visible in the scene, meaning that the contrast between the degraded area and the surrounding region must be small enough for the whole area to be perceived as uniform. A Wiener-filter-based shrinkage is then applied to the degraded region, where the defect is modeled as noise, while the original image is derived by inverting the model (Equation 11.12). This approach works only because the horizontal shape of the scratch is well defined, i.e., the sinc² shape in Equation 11.12.
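The attenuate-until-invisible idea can be sketched as a simple scan, assuming an additive scratch contribution L has already been estimated. The step schedule and the 2% just-noticeable contrast (a commonly quoted Weber fraction) are my assumptions; the cited method uses Wiener shrinkage with the sinc² model rather than this scan.

```python
import numpy as np

def attenuate_scratch(G, L, surround_mean, jnd=0.02, steps=21):
    """Shrink the scratch contribution until it is no longer visible.

    Restores I = G - w * L, stopping at the smallest attenuation w whose
    Weber contrast against the surround drops below the just-noticeable
    level jnd. G, L are arrays over the degraded region.
    """
    for w in np.linspace(0.0, 1.0, steps):
        I = G - w * L
        contrast = abs(float(I.mean()) - surround_mean) / max(surround_mean, 1e-9)
        if contrast <= jnd:
            break
    return np.clip(I, 0, 255), w
```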

Any successful approach to line removal, however, must enforce temporal coherence of the interpolated region. An extension of the ideas of Efros to spatiotemporal synthesis for line scratch removal was introduced by Bornard et al. [29] and was quite successful. It relied on using motion around the line scratch to fetch useful information from that region in the next and previous frames; however, only global motion was used, and hence there were difficulties with objects showing local motion. The problem is how to reconstruct the underlying missing data, including reconstructing the motion, convincingly over several frames. Recent work by Irani et al. [60], Kokaram et al. [61], and Sapiro et al. [62] has considered the problem of object removal in image sequences. While Kokaram takes a purely motion-based approach, Irani and Sapiro both use more sophisticated methods relying on substituting 3D cubes of data from other parts of the sequence or on 3D inpainting. These ideas could be applied to the removal of line scratches. However, the defect is more severe than the object-removal cases considered thus far, in that motion around the region may not be periodic and may vary strongly over short distances. Line scratch removal in industry therefore still requires manual review of the manipulated pictures.
