ARC3D: A Public Web Service That Turns Photos into 3D Models

Introduction

Increasingly, cultural heritage professionals make use of 3D capture technologies, both for study and for documentation. There is a wide range of such technologies to choose from, and no single one seems optimal for all objects or scenes. Thus, taking cost and resources into account as well, there is often a need to use different tools for different aspects of a scanning campaign: the overall scene layout may be captured with one, and the detailed shapes of smaller pieces with another, for instance. An excellent example is the thorough work by Remondino and colleagues [1], who combined the complementary strengths of LIDAR scanning, interactive photogrammetric reconstruction, and automatic, uncalibrated structure-from-motion. Here we present a tool for the latter, the first one that was available and which is still free for non-commercial use: ARC3D. As noted, however, one should not expect this single tool to supply all the 3D models one desires. Nonetheless, we believe it can be of great use for museums, excavations, conservation, etc. As shown by Strecha et al. [2], under conditions appropriate for such methods, they are quite competitive with LIDAR scanning (i.e., time-of-flight laser scanning), yet require a far smaller investment of time and money.

Before we describe the ARC3D tool in more detail, it may be necessary to explain what uncalibrated structure-from-motion actually is. It is a technique that extracts a 3D model from photographs of an object. Its appeal is that the user only needs to load the images into the computer, which then does the rest. In contrast to traditional photogrammetric techniques, the process is fully automated: one does not have to know the camera settings, nor the viewpoints from which the different images were taken. With the proliferation of digital photography, taking lots of pictures has become common practice anyway, and creating 3D models may therefore require very little extra effort from the user's point of view. That said, it helps a lot if some basic guidelines are followed when taking the photos with 3D reconstruction in mind; Section 4.4 returns to these guidelines. It is also important to note that automatically generated 3D data are hardly ever immediately satisfactory for practical use. Often, some additional filtering or simplification is needed. The goal of such post-processing can be to enhance the quality of the models, but also to reduce their size. To support such post-processing, ARC3D is easy to use in combination with MeshLab, a state-of-the-art toolkit for 3D shape processing. MeshLab was created by the CNR-ISTI group in Pisa and is still regularly updated. It is freely available via SourceForge [3].
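
To make this more concrete, here is a minimal sketch (in Python, using the OpenCV library; this is illustrative and not ARC3D's actual code) of the first stage of such a pipeline: finding corresponding feature points between two overlapping photographs, from which the relative geometry of the two views can be estimated without any knowledge of the camera settings. The file names are placeholders.

import cv2

# Load two overlapping photographs (placeholder file names).
img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

# Detect and describe local features in both images.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match the descriptors and keep only unambiguous matches (ratio test).
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

# From these correspondences the relative two-view geometry can be
# estimated via the fundamental matrix, without knowing the camera
# settings; this is the "uncalibrated" part of structure-from-motion.
pts1 = cv2.KeyPoint_convert(kp1, [m.queryIdx for m in good])
pts2 = cv2.KeyPoint_convert(kp2, [m.trainIdx for m in good])
F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)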


The structure of the chapter is as follows. Section 4.2 gives an overview of the ARC3D + MeshLab tools. Section 4.3 describes the principles behind the ARC3D design. Section 4.4 gives guidelines on how images should be taken in order to maximize the chance of successful 3D data extraction. Section 4.5 takes a practical case – one of the Mogao caves in China – as a way to illustrate the steps in the process. Section 4.6 gives some further examples of 3D models produced with ARC3D. Section 4.7 concludes the chapter.

System Overview

ARC3D (ARC stands for Automated Reconstruction Conduit) is a web service to which users can upload images. The web service returns 3D data extracted from these images; a digital photo camera and an Internet connection suffice to generate 3D. The service is meant to put 3D capture at the fingertips of virtually all cultural heritage professionals. Apart from taking care that good-quality pictures are taken, the user does not need any particular skills in computer programming or geometry.

Basically, there are three steps to distinguish in the overall process of turning the images into the final 3D model:

•    Images of the object or scene to be reconstructed are first checked by the user for quality, and then uploaded to the web service [4], which computes dense depth maps, one for each image. Such a map encodes, as the intensity at each pixel, the distance from the camera to the object point visible at that pixel. This is the basic ARC3D web service.

•    A dedicated plug-in puts the dense depth maps into a format that can be read and handled by MeshLab. The plug-in actually does more than just that: using the confidence values that ARC3D outputs for the points in the different depth maps, filtering operations are selected and run automatically, yielding higher-quality depth maps as a starting point for further operations in MeshLab (a sketch of such confidence-based filtering follows this list).

•    Then the user can apply several tools in MeshLab, for a variety of purposes, including the interactive registration of the depth maps into complete surfaces, as well as mesh refinement and simplification.
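
As an illustration of the second step, the sketch below (Python/NumPy; a hypothetical stand-in for the actual plug-in, with illustrative names and threshold) shows the essence of confidence-based depth map filtering: depth values whose confidence falls below a chosen threshold are discarded before any meshing takes place.

import numpy as np

def filter_depth_map(depth, confidence, min_confidence=0.8):
    """Discard unreliable depth values.

    depth          -- 2D array with the per-pixel distances of a depth map
    confidence     -- 2D array of per-pixel confidence values, same shape
    min_confidence -- illustrative threshold; the real plug-in selects
                      its filtering automatically
    Returns a copy of the depth map with low-confidence pixels set to NaN.
    """
    filtered = depth.astype(float).copy()
    filtered[confidence < min_confidence] = np.nan  # mark as "no data"
    return filtered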

The remainder of this chapter focuses on the first step (i.e., ARC3D and how it extracts 3D from the images). This web service runs on computers at the University of Leuven. As we want to keep the explanation accessible for various groups of users, we avoid mathematical detail. The math behind ARC3D can be found in [5], and readers who want even more background material can have a look at [6] and [7].

System Components

Figure 4.1 shows a schematic overview of the client-server setup of the 3D web service. The client part (user) is found at the top of the figure. The server side is at the bottom. On his PC, the user runs two programs, the upload tool (A) and the modelviewer tool (B). Details about these tools follow in Sections 4.2.2 and 4.2.3. Both of them can be downloaded from the ARC3D website [4].

In the upload tool, digital images can be imported. Upon registering with the ARC3D web service, the user can transfer these images (indicated C) over the Internet to the Leuven server. There, a set of parallel processes is launched fully automatically to compute dense 3D information from the uploaded images. These processes run on a cluster of Linux PCs (D). When the server has finished processing, an email is sent to the user to inform him that the results can be downloaded from the server through FTP. They consist of a dense depth map for every image and the calibration parameters of the different cameras (E).

The modelviewer tool shows the results to the user. Every depth map in the image set can be rendered as a 3D surface mesh, where unwanted areas like background or low-quality parts of the scene can be masked out. This filtering is aided by the confidence maps, one accompanying each depth map. By interactively setting a threshold on the minimum confidence each depth value in the map must have, much of the noise can be removed with this simple interaction. The meshes can be saved in a variety of formats.

Upload Tool

The first tool a user of the 3D web service encounters is the upload tool. This is the program that uploads images to the server via the Internet. Figure 4.2 shows a screenshot of the GUI of this tool. First, the user selects the images to upload. Thumbnails of these images are shown on the left, and a large version of the image is shown in the main panel when a thumbnail is clicked.

FIGURE 4.1

The ARC3D client-server setup. Images (C) are uploaded to the server with the upload tool installed on the user's PC (A). Connected to the server, a PC cluster (D) extracts the camera parameters and 3D information. The results (E) can be downloaded via FTP and visualized on the user's PC with the modelviewer tool (B).

FIGURE 4.2

Upload tool. Thumbnails of the images to be uploaded are shown, as well as some statistics. When satisfied, the user uploads the images to the reconstruction server.

FIGURE 4.3

Modelviewer tool. The layout of the tool deliberately resembles that of the upload tool, with thumbnails on the left and a large image in the main panel. The tool allows the user to create textured wireframe representations of the depth maps. (a) The user selects the depth image to be visualized. (b) Two 3D representations of the depth maps.

FIGURE 4.4

Modelviewer tool. The operator has automatically masked the blue sky with a simple click inside the region.

Images can be removed from the list, or extra images can be added. On the bottom left, statistics on the selected image, such as the number of pixels, the size on disk, and the format, are shown. When the user is satisfied with the selected images, he can upload them to the reconstruction server. To do so, the user first needs to authenticate with his user name and password. The upload then starts, and a progress dialog shows its speed. To limit the upload and processing time, the user can decide to downscale the images by a certain percentage.
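
For readers who prefer to downscale their photos before importing them, the following sketch (Python with the Pillow library; the file names and percentage are illustrative, and this is not the upload tool's own code) shows what such percentage-based downscaling amounts to.

from PIL import Image

def downscale(in_path, out_path, percent=50):
    """Resize an image to the given percentage of its original size."""
    img = Image.open(in_path)
    new_size = (img.width * percent // 100, img.height * percent // 100)
    img.resize(new_size, Image.LANCZOS).save(out_path)

# Example: halve the resolution of a photo before upload.
downscale("photo_001.jpg", "photo_001_small.jpg", percent=50)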

Modelviewer Tool

Upon completion of the 3D reconstruction (i.e., of the per-image dense depth maps), the user is notified by email that the results are ready. They are stored in a zip file on the ESAT FTP server, in a directory with a random name whose parent is not listable; this keeps the user's reconstructions safe from prying eyes. The zip file contains the original images, the calibration of the cameras, and a dense depth map and quality map for every image.

The results can be inspected by means of the modelviewer tool, a screenshot of which is shown in Figure 4.3(a). The layout of the modelviewer tool purposely resembles that of the upload tool. A thumbnail is shown for every image in the set; if clicked, a large version of the image is shown in the main window. A triangulated 3D model can be created for the current image, using its depth map and camera: every pixel of the depth map can be put in 3D, and a triangulation in the image creates a 3D mesh textured with the current image. Every depth map has a corresponding confidence map, indicating the quality or certainty of the 3D coordinates of every pixel. The user can select a quality threshold: high thresholds yield models with fewer but more certain points, while lower thresholds yield more complete models that include areas of lower reconstruction quality.

If required, certain areas of the image can be masked or grayed out by drawing with a black brush in the image; masked areas will not be part of the 3D model. Homogeneous regions like the sky yield bad reconstruction results and can be discarded automatically by the user: a simple click inside such a region starts a mask-growing algorithm that covers the homogeneous region, as can be seen in Figure 4.4, where the sky above Arenberg Castle in Leuven is automatically covered by a mask. The resulting 3D model is displayed in an interactive 3D widget; two viewpoints of the castle example are shown in Figure 4.3(b). The model can be exported in different 3D file formats like VRML2, Open Inventor, OBJ, or OpenSG's native file format.
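
To illustrate how "every pixel of the depth map can be put in 3D", the sketch below (Python/NumPy) back-projects a depth map through a standard pinhole camera model. The intrinsic parameters fx, fy, cx, cy stand in for the camera calibration that ARC3D returns; the names and the pinhole form are illustrative, not ARC3D's actual file format.

import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Turn an H x W depth map into an H x W x 3 array of 3D points.

    fx, fy -- focal lengths in pixels; cx, cy -- principal point.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.dstack([x, y, depth])  # camera-centered coordinates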
