Databases Reference
hearing-disabled people. The superimposed dialogue is useful for finding the parts
of news programs that interest us, because the video content corresponds to the
superimposed dialogue. As an example of pattern recognition, other research
identifies people using image and voice recognition. Thus,
the description of videos is used for structuring a large amount of video data. It
allows users to retrieve collections of video frames of interest by
searching the description data instead of the video data themselves. Text data
corresponding to the sound data of video data are usually created by hand. Image
and voice recognition are expected to become feasible for automatically generating
the text data corresponding to the video data, but the quality of the recognition is
not yet good enough.
Spatial sensors such as GPS (Global Positioning System) receivers and gyroscopes
will become affordable. When spatial sensors are used with video cameras, the
positions and directions of the cameras can be generated automatically as the spatial
description data of the video data. Such spatial description data are useful for
retrieving video data 1), since they are generally cheap and reliable.
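To make the idea of retrieval through spatial description data concrete, the following is a minimal sketch, not the framework proposed in this paper. It assumes a hypothetical record type `FrameDescription` that holds a frame's capture time, planar camera position, and heading, and it selects frames whose camera is close to a target point and facing it. The field names, the planar coordinate system, and the `max_dist` and `half_fov_deg` thresholds are all illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class FrameDescription:
    """Hypothetical description record for one video frame (an assumption)."""
    frame_id: int
    time_s: float       # capture time in seconds
    x: float            # camera position in local planar coordinates (metres)
    y: float
    heading_deg: float  # camera direction, degrees counter-clockwise from +x


def sees_target(frame, tx, ty, max_dist=50.0, half_fov_deg=30.0):
    """Return True if the target is near the camera and inside its view cone."""
    dx, dy = tx - frame.x, ty - frame.y
    dist = math.hypot(dx, dy)
    if dist > max_dist:
        return False
    if dist == 0.0:
        return True  # camera stands on the target
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed angle between bearing and heading, in [-180, 180).
    diff = (bearing - frame.heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= half_fov_deg


def retrieve_frames(frames, tx, ty, t_start=None, t_end=None, **kwargs):
    """Select frame ids using only description data, never the pixels."""
    hits = []
    for f in frames:
        if t_start is not None and f.time_s < t_start:
            continue
        if t_end is not None and f.time_s > t_end:
            continue
        if sees_target(f, tx, ty, **kwargs):
            hits.append(f.frame_id)
    return hits
```

The query touches only the small description records, which is why such data are cheap to search compared with the video data themselves.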
This paper proposes a new framework for video data retrieval using spatial
description data and 3D visualization. In addition to the spatial data, time data
are automatically generated as description data of video data. The time data are
also useful for video retrieval and structuring. This paper introduces a new concept,
time walk-through, for retrieving video data along the time dimension, based on a
“time” extension of the concept of LoD (Levels of Detail). The new concept, temporal
LoD, enables users to travel through time in a virtual space.
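One possible reading of a temporal LoD, by analogy with spatial LoD where distant objects are drawn more coarsely, is to show fewer frames per unit of time at coarser levels. The sketch below illustrates only that analogy and is not the definition used in this paper. The function name `temporal_lod`, the `base_interval_s` parameter, and the rule that the time bucket doubles with each level are assumptions made for illustration.

```python
def temporal_lod(timestamps, level, base_interval_s=1.0):
    """Pick one representative timestamp per time bucket (illustrative only).

    Level 0 keeps at most one frame per base_interval_s. Each higher level
    doubles the bucket width, giving a coarser view of the timeline.
    """
    if level < 0:
        raise ValueError("level must be non-negative")
    width = base_interval_s * (2 ** level)
    picked = []
    last_bucket = None
    for t in sorted(timestamps):
        bucket = int(t // width)
        if bucket != last_bucket:
            # First frame that falls into a new bucket represents it.
            picked.append(t)
            last_bucket = bucket
    return picked
```

Under this assumption, a viewer moving quickly along the time axis would request a high level and see only a few representative frames, then lower the level to inspect a period in detail.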
Figure 1: A sequence of time-series video frames and camera movements in the real world