Fig. 13.3 Overview of our traffic video analysis approach for traffic congestion level estimation

(i.e., by monitoring traffic jams). The problem addressed by the VA component
is to detect a traffic jam within the viewing range of a single camera. We perform a
global (i.e., holistic, or macroscopic) analysis that uses features extracted
from the videos. The VA component periodically receives a video snapshot 5 s in length.
Visual features are extracted from it to generate motion descriptors, and the snapshot
is then classified into one of three traffic categories (light, medium, or heavy) using
those descriptors. An overview of our holistic approach to the analysis of
traffic videos is given in Fig. 13.3. The classification result for the current snapshot
is passed to the routing component. In the following, we briefly review methods
that have been proposed for analyzing traffic surveillance videos, then introduce our
method, and finally present and discuss evaluation results on the UCSD highway
traffic dataset [5].
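The snapshot-to-category flow described above can be illustrated with a minimal sketch. It is not the method introduced in this chapter: the two-component descriptor (mean absolute inter-frame difference and fraction of changed pixels), the nearest-centroid classifier, the function names, and the `change_threshold` parameter are all assumptions chosen for illustration. Frames are assumed to be grayscale images given as nested lists of integers.

```python
from math import dist
from statistics import fmean


def motion_descriptor(frames, change_threshold=10):
    """Hypothetical global descriptor for one snapshot.

    Returns (mean absolute inter-frame difference, fraction of pixel
    differences above change_threshold), computed over all consecutive
    frame pairs. Needs at least two frames.
    """
    diffs = [
        abs(b - a)
        for prev, curr in zip(frames, frames[1:])
        for row_prev, row_curr in zip(prev, curr)
        for a, b in zip(row_prev, row_curr)
    ]
    mean_diff = fmean(diffs)
    changed = sum(d > change_threshold for d in diffs) / len(diffs)
    return (mean_diff, changed)


def fit_centroids(labelled_descriptors):
    """Average the example descriptors of each label.

    labelled_descriptors maps a label such as "light" to a list of
    descriptors obtained from labelled training snapshots.
    """
    return {
        label: tuple(fmean(component) for component in zip(*descs))
        for label, descs in labelled_descriptors.items()
    }


def classify(descriptor, centroids):
    """Return the label whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(descriptor, centroids[label]))
```

In use, `fit_centroids` would be called once on descriptors from labelled snapshots, and `classify(motion_descriptor(frames), centroids)` would then be applied to each incoming snapshot. Because the two descriptor components live on different scales, a practical version would normalize them before computing distances.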
13.3.2.1 Related Work
Computer vision has been successfully applied to the analysis of traffic scenes with
the aim of determining traffic congestion, counting vehicles, identifying license
plates, and detecting incidents, among others. The survey by Buch et al. [4]
provides a comprehensive review of solutions to these problems. Our aim is
to use computer vision for congestion level estimation in traffic videos. Reviewing
the relevant literature, we observed that methods for congestion estimation from
traffic video fall into two main categories: those
based on object detection and tracking, and those based on a global (i.e., holistic, or
macroscopic) analysis that uses features extracted from the videos.
In the methods of the first category, the focus is on detecting and tracking
individual objects (e.g., a single vehicle) in order to infer higher-level traffic information
(e.g., vehicle counts or congestion level). Such a chain of operations (detection,
tracking, and inference) appears to be a natural way of processing, as this is how a
human operator would interpret a scene in a video.
Lien and Tsai [14] have proposed an approach falling into this first category.
Their method uses vehicle detection and tracking to count vehicles and detect traffic
jams. Detection is performed in a multistep fashion: frame differencing is
used to extract moving regions in the video, followed by morphological processing,
and completed by dual (short-term and long-term) background modeling. Finally, the
orientation histogram of the motion vectors of the extracted moving objects is analyzed to