Processing Sampled 3D Data: Reconstruction and Visualization Technologies (Digital Imaging) Part 2

Merging

Reconstruction/merging from point samples or range maps has been one of the more active fields on scanning-related research in the last few years. A very important feature of a reconstruction code is to perform a weighted integration of the range maps (or point set) and not just joining them. Since we usually have a high degree of overlap and sampled data are noisy, a weighted integration can significantly improve the accuracy of the final result by reducing the impact of most of the noisy samples. Another important feature of a reconstruction code is the capability to fill up small holes (i.e., region not sampled by the scanner, see subsection 3.2.3). Finally, since reconstruction algorithms require a very large memory footprint on a big dataset, they have to be designed to work locally on subsections of the data, loading only the data subset involved in the generation of a single portion of the final results (out-of-core reconstruction).

Range maps are often taken according to a regular order: an example of circular radial acquisition performed around a statue’s head (left); an example of a raster-scan scanning order adopted for the acquisition of a bas-relief (right).


FIGURE 3.6

Range maps are often taken according to a regular order: an example of circular radial acquisition performed around a statue’s head (left); an example of a raster-scan scanning order adopted for the acquisition of a bas-relief (right).

Fit Results to Application Requirements

Simplification/multiresolution is another key ingredient in scanned data processing. When we choose a reconstruction kernel region approximately equal to the inter-sampling distance used in scanning, the reconstructed models may become huge in size (i.e., many millions of triangles). Most applications require significant complexity reduction in order to manage these models interactively. Two problems arise when we try to simplify such models: we need solutions working on external memory to cope with these big models [27]; and simplification has to be accurate [28,29] if we want to produce high-quality models and accurate interactive visualization (possibly, based on multiresolution data representation schemes). These issues are presented in Section 3.5.

Color Mapping

Color is usually the final stage of the pipeline. Here we have to support the reconstruction of textured meshes from a sampling of the object’s surface reflection properties. The easiest and most common approach is the acquisition and mapping of the apparent color (reflected color, illumination-dependent) using a digital photo camera. But we can also immerse the artifact into a controlled lighting setup, to produce a more sophisticated acquisition of the reflection properties of the object’s surface (see for example the methodology for the acquisition of bi-directional reflectance distribution function – BRDF proposed in [30]). This latter approach is unfortunately often impractical for those applications, such as acquisition of CH artifacts, which require one to perform the acquisition in real-life conditions (i.e., when we cannot move the object to be scanned in an acquisition lab). The issues involved in color acquisition and management are discussed in detail in Section 3.3.

Implementing Range Maps Alignment as an Automatic Process

We have already introduced that range maps alignment is usually the most time-consuming phase in the 3D scanning pipeline. But the goal is obviously to make scanning as much as possible an automatic process. Therefore, completely automatic scanning systems have been proposed. Some of those systems are based on the use of complex and costly positioning/tracking machinery (e.g., see [31] or the several commercial scanners based on optical or magnetic tracking of the scanning head location); others adopt passive silhouette-based approaches which do not extend well to the acquisition of medium- or large-scale artifacts.


An example of four matching point pairs selected on two range maps by an automatic matching algorithm.

FIGURE 3.7

An example of four matching point pairs selected on two range maps by an automatic matching algorithm.

An alternative approach is to design new algorithms for processing the data produced by standard scanning systems, which are able to transform the classical scanning pipeline into a mostly unattended process. In particular, the range maps alignment phase is the only task where a considerable human intervention is required. Several papers proposed methods for automatic alignment, usually based on some form of shape analysis and characterization [32].

The general alignment problem can be made more easy to manage by considering some assumptions which usually hold in practical use. While designing a new automatic registration solution [33], we started from a few initial conditions directly gathered by our experience in 3D scanning. First, the detection of the pairs of overlapping range maps can be reduced to a simpler task, once we notice that 3D acquisition is usually done by following simple scanning pose paths. Users usually acquire range maps in sequences, following either a vertical, horizontal, raster-scan, or circular translation of the scanning system (see Figure 3.6). The different types of sequences share a common property: they contain an ordered set of n range maps, such that range map Ri holds a significant overlapping with at least Ri-I and Ri+i. Vertical, horizontal, or raster-scan stripes are often produced when acquiring objects like bas-reliefs, walls, or nearly planar items. Circular stripes are indeed more useful when acquiring objects like statues, columns, or objects with an axial symmetry.

If we can assume that the acquisition has been performed using one of these stripe-based patterns, then we may reduce the search for overlapping and coarse registration to each pair of consecutive range maps (Ri, Ri+1). An automatic registration module can process each couple (Ri, Ri+1), to produce in output the roto-translation matrix Mi that aligns Ri+1 to Ri. Matrix Mi can be computed by applying some basic geometric processing to the two range maps: feature points can be detected by evaluating a shape descriptor on the two meshes (or point set); potential corresponding feature points pairs can be detected by adopting RANSAC-like approaches: from these possible point pairs, we can select the one which produces, after ICP alignment, the matrix Mi which performs the best alignment (see [33] as an example of this type of solution). Many other approaches for the design of shape feature characterization and matching are also possible [34,35]. As an alternative to geometry-based solutions, it is also possible to work with an image-based approach [36]: correspondent point pairs can be also retrieved by looking to image features in the RGB channel associated to the XYZ samples (under the assumption that the scanner produces self-registered shape and color samples).

The subset of registration arcs is not complete nor sufficient by itself when we restrict the search to consecutive pairs in linear sequences, since we usually have many other potential overlaps between range maps. On the other hand, information on those consecutive pairs is sufficient for the application of an intelligent ICP-based solution that allows one to complete the graph. Designing a smart alignment tool able to complete the needed arcs (interconnecting Ri with all the overlapping range maps, not just Ri-1 and Ri+1) is easy. We can represent the space of the set of range maps with a spatial indexing data structure, a regular grid which encodes for each 3D cell the set of range maps passing through that sub-volume. This data structure allows an easy automatic detection of the potential overlaps between generic pairs of range maps. We can then iterate automatic ICP-based alignment on all those overlapping range map pairs which have not been already processed according to the linear ordering. The alignment tool can therefore introduce all needed arcs (in a completely unattended manner), by selecting and processing only those arcs which satisfy a minimum-overlap factor.

The coarse alignment obtained over a set of range maps representing a bas-relief (top) and the final model (bottom) obtained after automatic completion of all overlapping pairs; alignment has been performed in non-attended mode using the solution presented in [33].

FIGURE 3.8

The coarse alignment obtained over a set of range maps representing a bas-relief (top) and the final model (bottom) obtained after automatic completion of all overlapping pairs; alignment has been performed in non-attended mode using the solution presented in [33].

The automatic registration approach sketched above [33] has been tested on many complex scanning campaigns (where each range map is usually affected by noise, artifacts, and holes). An example concerning a bas-relief is shown in Figure 3.8, whose approximate length is 2.5 meters. In this case two raster-scan (snake-like) stripes were acquired and processed, for a total of 117 meshes (about 45.5M vertices). The overall automatic alignment requires usually much less than the raw scanning time and it is therefore sufficiently fast to run in background during the acquisition, processing all the scans in sequence as soon as they are produced.

Solutions similar to the ones presented in this subsection are unfortunately still not provided by commercial software; inclusion of automatic alignment would speed up the processing of scanned data, reducing the manpower required and the overall costs.

Enhancing Incomplete Surface Sampling

According to 3D scanning experience, obtaining a complete sampling of a complex artifact surface is often impossible [31]. Various are the reasons why we usually end up with an incomplete scan: presence of self-obstructing surfaces; presence of small cavities or folds; sections of the surface which are not cooperative with respect to the optical scanning technology adopted (highly reflective materials like polished metals, transparent components such as glass or gems, surfaces which do not reflect the emitted light probe, etc). In all those cases, we have to decide if the model has to be completed or if we have to keep it incomplete. In the CH domain we are usually asked to produce models which should contain only sampled geometry, i.e., the use of software solutions which fill up the gaps is not allowed. On the other hand, incomplete digital models perform very poor in visualization, since the holes usually attract the observer’s attention much more than the other clean parts. Clean sampled surfaces are therefore needed for visualization, obtained by closing all the gaps with plausible surface patches. A very nice extension to available data formats would be an attribute that could allow us to differentiate between sampled and interpolated geometry (a sort of confidence value assigned to each geometric element), making it possible to make visually evident those two different data components. This would be an interesting addition to provenance data encoding.

Gaps filling can be obtained by two orthogonal approaches: volumetric and surface oriented. In the first case the holes can be filled:

•    At reconstruction time, for example by enhancing volumetric reconstruction approaches based on a discrete distance field with a diffusion process which extends the distance field in regions not covered by scanned samples [37] or by adopting methods based on the Poisson reconstruction formulation [38];

•    After the reconstruction, by putting the model in a volumetric grid and devising solutions able to produce a watertight surface [39, 40].

Unfortunately, these approaches make it very hard to differentiate sampled and interpolated geometry. Moreover, methods based on volumetric diffusion are very complicated to use because steering the diffusion process to obtain a plausible completion surface is not easy; a time-consuming trial and error process is usually needed to find the parameters best fitting a given dataset. Poisson-based reconstruction is more frequently used, due to ease of use and free availability (both as source code distributed by authors and as a feature available in open source tools, e.g., MeshLab [41]).

On the other hand, geometric processing solutions can be devised to detect and fill unsampled regions. Accordingly, surface-oriented approaches try to detect and close holes preserving triangle shape and curvature [42, 43]. The problem is not simple because we need geometrically robust solutions, able to close any gap with a surface patch which should share curvature continuity with respect to the surface regions adjacent to the open border. Some issues are the necessity to deal with self-intersections and isles, the algorithm speed and robustness, and the difficulty in creating fully automatic but reliable methods. Moreover, in some cases a basic concept of curvature continuity is not enough, since a missing region can contain surface features (such as a given texture or a carved detail) that we may want to reproduce in a way conforming with adjacent regions. So called surface inpainting methods have been proposed to support intelligent cut and paste of surface detail from one completely sampled region to a partially sampled region [44, 45].

Color Sampling and Processing

There are many application domains which require not just shape sampling, but also accurate acquisition and management of color data. Sophisticated approaches for sampling the surface reflection characteristics have been proposed (e.g., generic BRFD sampling [30] or technologies for the acquisition of the reflectance of human faces [46, 47]). In some cases, those approaches are unfortunately too complicated to be massively applied in all those fields where we do not have the pleasure to work in controlled lab conditions, as it is the case of CH digitizations (usually performed in crowded museums and under a not proper illumination setup). Let’s describe first the easier approach to gathering color data and then how those data can be encoded in digital models. At the end of the section, we will review the other more sophisticated approaches to gathering samples of the surface reflection properties.

Basic Acquisition of Color Data

The basic approach, acquiring just the so-called apparent color and mapping those samples to the 3D model, is still widely used in most of the practical cases. A series of pictures can be taken with a digital camera, trying to avoid shadows and highlights by taking them under a favorable lighting setup; these photographs are then stitched onto the surface of the object. However, even in this simpler case, the processing needed to build a plausible texture is not straightforward [48]. Naive mapping of apparent color on the mesh can produce severe discontinuities that are due to the varying illumination conditions of the surface sampled by the photos (perceived color depends on specific illumination and direction of the viewer in the instant the photo is shot). Many different approaches have been proposed to reduce the aliasing and to produce seamless color mapping; we cite here only some representative papers: using the range intensities produced by some active optical scanning devices to correct the color information [49]; detecting and removing cast shadows [50], which are usually a major problem in color acquisition in outdoor scenes; devising methods for computing the inverse illumination (i.e., recovering approximate surface reflectance from a sampling of the real surface under known illumination conditions [51,52].)

Recovering Camera Parameters

A basic problem in managing color information is how to register the images with the geometric data in a time-efficient way. Once intrinsic (focal length and distortion of the camera lenses) and extrinsic parameters (view specifications) have been computed for each image by registering it onto the geometric model, many approaches exist to map the color info on the 3D model, based on mesh parameterization or color-per-vertex encoding. The bottleneck in the color pipeline is the image-to-geometry registration phase, a complicated time-consuming phase which requires substantial intervention of a human operator since the current approach is based on the selection of several corresponding point pairs which link each 2D image to the 3D mesh [53].

We designed a tool to support image registration, TexAlign [54], which solves the image-to-geometry registration problem by constructing a graph of correspondences, where: the 3D model and all the images are represented as nodes; a link is created for any correspondence defined between two nodes (implementing either an image-to-geometry or an image-to-image correspondence). This graph of correspondences is used: (a) to keep track of the work done by the user; (b) to infer automatically new correspondences from the instantiated ones; and (c) to find the shortest path, in terms of the number of correspondences that must be provided by the user, to complete the registration of all the images. The goal is to assist the user in the management of large sets of images. This system has been used to map several complex photographic samplings (e.g., in the framework of the Michelangelo’s David restoration we mapped on the digital 3D model 61 images showing the pre-restoration status and 68 images depicting the post-restoration condition [55], see Figure 3.9).

Analogously to the range map alignment problem, this image registration phase should be as much as possible solved automatically. We need fully automatic and robust approaches able to solve the general problem (i.e., a large and complex object, where each image covers only a subset of its overall extent). Finding automatically the correspondences between a set of 2D images and a 3D mesh is not an easy task, also because the geometry features are usually less dense than the image features we can retrieve in the photographs. A possible research direction could be to move from the usual search for image-to-geometry correspondences to a context where we use both image-to-geometry and image-to-image correspondences. As shown in [36] and other recent works, finding correspondences in images is simpler than detecting image-to-geometry correspondences. Since a large number of image-to-image correspondence pairs can be detected in an automatic manner, we can deploy that information to speed up the overall image registration process, or to solve those cases where a single image covers a region where the surface has insufficient shape features to allow an accurate selection of image-to-geometry correspondences. We have recently proposed an approach based on Mutual Information [57], a statistical measure of non-linear correlation between two data sources often used in medical images registration.

The David model is shown with color mapping; on the left is the pre-restoration status (61 images mapped), while the post-restoration status is shown on the right (another set of 68 images). The two colored David models are rendered in real time with the Virtual Inspector system [56].

FIGURE 3.9

The David model is shown with color mapping; on the left is the pre-restoration status (61 images mapped), while the post-restoration status is shown on the right (another set of 68 images). The two colored David models are rendered in real time with the Virtual Inspector system [56].

The image presents a result of the automatic image-to-geometry registration: given the image on the right, the proper projective transformation is computed by finding the best matching between the input image and renderings of the mesh with vertices colored according to a combined normal vector and accessibility shading factor [57].

FIGURE 3.10

The image presents a result of the automatic image-to-geometry registration: given the image on the right, the proper projective transformation is computed by finding the best matching between the input image and renderings of the mesh with vertices colored according to a combined normal vector and accessibility shading factor [57].

The main idea is to use mutual information as a similarity measure between each image to be registered and some renderings of the geometric 3D model, in order to drive the registration in an iterative optimization framework. The problem can be divided into two sub-problems: finding an approximate solution, the rough alignment where we find a view of the geometric model that sufficiently matches the photographic image we want to align. For example, a rough alignment can be found with a simple greedy approach, by generating a large number of suitable initial views and checking which one best matches the input image. Then, this view is refined to find the optimal one, the fine alignment, by an iterative process that slightly modifies the view over the 3D model, produces a rendering, and checks the similarity of the rendering with the input image. We demonstrate that some illumination-related geometric properties, such as surface normals, ambient occlusion, and reflection directions can be efficiently used for this purpose, improving the convergence speed of the search. After a comprehensive analysis of such properties we proposed a way to combine these sources of information in order to improve the performance of our automatic registration algorithm [57]. The proposed approach can robustly cover a wide range of real cases and can be easily extended.

Next post:

Previous post: