is already available in several mobile phones. It is likely that RFID will gain more
prominence in the near future and will supersede visual markers in some domains.
A range of electromagnetic solutions (e.g. Ubisense⁷) can capture the
physical location of objects. To date, however, they require large markers that cannot
be used when working with thin sheets of paper.
Some research projects have developed custom solutions for identifying objects.
As with RFID, integrated circuits storing a unique identifier are attached
to physical objects. Communication with a reading device takes place via a wired
connection (e.g. [130, 49]) or a wireless connection (e.g. [114]).
Content-based Identification and Tracking
Content-based approaches do not require markers. Instead, documents are identified
and tracked solely by their visual appearance. Either a camera observes the scene or
documents are scanned, and image processing techniques are used to analyze the
images.
The most commonly used approach relies on SIFT features [83]. Before a document
page can be identified, it first has to be registered: the system captures specific optical
features of the page image and stores them as a fingerprint of the document page.
When a document page appears in the camera image, its features are captured and
compared against the fingerprints in the database to identify the page. This technique
performs quite well if the camera provides good resolution and the number of
different pages is not very high. For instance, Kim et al. [60] reached a recognition
rate of more than 90 % if the document width in the camera image was at least
300 pixels. Liu et al. [79] presented an improved set of features inspired by
the SIFT feature set. They achieved recognition rates of more than 99 %, even when
pages had to be identified from a very large set of pages. However, their evaluation
excluded lighting effects and camera noise, so performance in real setups is likely
to be lower.
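The register-then-identify workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any cited system: descriptors are short toy tuples rather than real 128-dimensional SIFT vectors, the class name FingerprintDatabase and the parameters ratio and min_votes are hypothetical, and matching uses a brute-force nearest-neighbour search with a ratio test against the closest descriptor of any other page.

```python
import math
from collections import Counter


class FingerprintDatabase:
    """Toy fingerprint store (hypothetical; descriptors are plain tuples)."""

    def __init__(self, ratio=0.8):
        self.entries = []   # list of (page_id, descriptor)
        self.ratio = ratio  # assumed threshold for the ratio test

    def register(self, page_id, descriptors):
        """Store the descriptors of a page as its fingerprint."""
        self.entries.extend((page_id, tuple(d)) for d in descriptors)

    def identify(self, descriptors, min_votes=2):
        """Return the page with the most accepted matches, or None."""
        votes = Counter()
        for query in descriptors:
            scored = sorted(
                (math.dist(query, d), page_id) for page_id, d in self.entries
            )
            if not scored:
                return None  # nothing has been registered yet
            best_dist, best_page = scored[0]
            # Closest descriptor that belongs to a different page.
            other = next((dist for dist, p in scored if p != best_page), None)
            # Accept only matches that are clearly better than the rival page.
            if other is None or best_dist < self.ratio * other:
                votes[best_page] += 1
        if not votes:
            return None
        page_id, count = votes.most_common(1)[0]
        return page_id if count >= min_votes else None


db = FingerprintDatabase()
db.register("page-1", [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.5, 0.5, 0.0)])
db.register("page-2", [(0.0, 1.0, 0.0), (1.0, 1.0, 1.0), (0.0, 0.5, 0.5)])

# Slightly perturbed descriptors, standing in for a camera view of page 1.
noisy_view = [(0.05, 0.0, 0.95), (0.9, 0.05, 0.0), (0.5, 0.45, 0.05)]
print(db.identify(noisy_view))
```

The vote threshold reflects the trade-off mentioned in the text: with many registered pages or noisy images, more descriptors find an ambiguous nearest neighbour, so fewer votes pass the ratio test and identification becomes less reliable.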
Another approach uses optical character recognition of document text [24]. The
recognized text is used for querying a database of documents and retrieving the
digital version. As an alternative to text recognition, the pattern of white space between
words has proven to be quite distinctive for each text. Brick Wall Coding [26] detects
the white gaps between words and uses them as features. It has lower recognition
performance than SIFT or FIT features, but it can detect a document page even when
the page is only partially visible to the camera.
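To illustrate why gap patterns can identify a page that is only partially visible, the following Python sketch works on plain text instead of images. It is not the actual encoding used by Brick Wall Coding [26]; as a simplifying assumption, word lengths stand in for the positions of the white gaps, and the function names gap_features and identify_partial are hypothetical.

```python
def gap_features(lines, n=3):
    """Return the set of n-grams of word lengths found within each line.

    Word lengths are a text-only stand-in (assumption) for the spacing
    of white gaps that an image-based method would measure.
    """
    features = set()
    for line in lines:
        widths = tuple(len(word) for word in line.split())
        for i in range(len(widths) - n + 1):
            features.add(widths[i:i + n])
    return features


def identify_partial(visible_lines, database, n=3):
    """Return the page sharing the most gap features with the visible part."""
    query = gap_features(visible_lines, n)
    scores = {page: len(query & feats) for page, feats in database.items()}
    best = max(scores, key=scores.get, default=None)
    return best if best is not None and scores[best] > 0 else None


database = {
    "page-a": gap_features([
        "Content-based approaches do not require markers",
        "documents are identified by their visual appearance",
    ]),
    "page-b": gap_features([
        "The recognized text is used for querying a database",
        "of documents and retrieving the digital version",
    ]),
}

# Only a fragment of one line is visible to the (imaginary) camera.
print(identify_partial(["identified by their visual appearance"], database))
```

Because each feature covers only a few neighbouring words, a fragment of a page still yields features that can be looked up, which is the property the text attributes to gap-based identification.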
A drawback of content-based approaches is that the visual appearance must contain
enough information for unambiguous identification. For instance, it is not possible
to uniquely identify one of several copies of the same document page. Moreover,
content-based identification is challenging if the user modifies documents, for
instance by making annotations.
⁷ http://www.ubisense.net