11.5 Peer-to-Peer Distributed Algorithms
Client-server architectures do not scale for a wide range of vision problems. The community has now developed a number of distributed algorithms for important vision tasks, but we still have much to learn.
Long-term tracking using cameras with nonoverlapping views has received a great deal of attention. Kim and Wolf [13] developed a distributed algorithm for Markov chain Monte Carlo tracking. Cameras estimate local paths based on local communications; the local paths are then transmitted through the network and concatenated to build longer tracks of the target.
Esterle et al. [7] use autonomous, self-interested agents to learn the vision graph during operation. The algorithm does not require multi-camera calibration, since it does not rely on a priori camera topology information. Cameras bid for the right to track an object; the utility of a tracked object to a camera depends on the target's visibility to the camera and the camera's confidence in its tracking estimate. Sales of a target from one camera to another are used to build the structure of the vision graph. As the system makes more observations, the communications between cameras can become more targeted based on their understanding of the vision graph structure.
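A hedged sketch of the auction step follows: cameras submit bids, the seller hands the target to the highest bidder, and repeated sales accumulate into an estimate of the vision graph. The multiplicative utility and the bookkeeping are simplifying assumptions, not Esterle et al.'s exact formulation.

class Camera:
    def __init__(self, name):
        self.name = name
        self.sales = {}  # neighbor name -> sale count; approximates vision-graph edges

    def utility(self, visibility, confidence):
        # Assumed form: value grows with how well the target is seen and
        # how confident the local tracker is.
        return visibility * confidence

def auction(seller, bids):
    """Sell the target to the highest-utility bidder and record the trade."""
    buyer = max(bids, key=bids.get)
    seller.sales[buyer.name] = seller.sales.get(buyer.name, 0) + 1
    return buyer

cam1, cam2, cam3 = Camera("c1"), Camera("c2"), Camera("c3")
bids = {cam2: cam2.utility(0.9, 0.8), cam3: cam3.utility(0.4, 0.9)}
winner = auction(cam1, bids)
print(winner.name, cam1.sales)  # c2 {'c2': 1}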
Wan and Li [27] formulated an online algorithm for associating observations with targets; at each new observation, nodes exchange information about their observations and the network with their neighbors, then update their inferences.
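The following toy fragment illustrates the flavor of such an update, under assumptions of our own: each node averages its observation with values received from neighbors, then associates the fused estimate with the nearest known target.

def on_observation(local_obs, neighbor_obs, targets):
    """Fuse local and neighbor observations, then update the nearest target."""
    pooled = [local_obs] + neighbor_obs
    estimate = sum(pooled) / len(pooled)
    target = min(targets, key=lambda t: abs(t["pos"] - estimate))
    target["pos"] = estimate  # refine the local inference
    return target

targets = [{"id": 1, "pos": 0.0}, {"id": 2, "pos": 5.0}]
print(on_observation(4.6, [4.8, 5.1], targets)["id"])  # -> 2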
Sek Chai [26] describes visual sensor networks (VSNs), distributed smart camera systems built from smartphone processors. Nodes distribute metadata for search purposes while video data remains on the local nodes. Metadata is organized into a video catalog accessible using key search indices, including time, location, and object description. Nodes decide when to activate their cameras in order to manage their energy consumption; a node may be turned on at a predefined time or by an external event such as sound or vibration detection. Their implementation on a Qualcomm Snapdragon MSM8960 used 1.6 W for capture, processing, and storage of compressed video, with motion tracking and analytics generation requiring an additional 0.4 W.
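A minimal sketch of such a catalog appears below, with field names and matching rules of our own invention; only the metadata is shared across the network, while video stays on the capturing node, as the text describes.

catalog = []  # metadata visible to the whole network

def publish(node_id, time, location, objects):
    catalog.append({"node": node_id, "time": time,
                    "location": location, "objects": set(objects)})

def search(time_range=None, location=None, obj=None):
    """Return the nodes holding video that matches every supplied index."""
    hits = catalog
    if time_range is not None:
        hits = [m for m in hits if time_range[0] <= m["time"] <= time_range[1]]
    if location is not None:
        hits = [m for m in hits if m["location"] == location]
    if obj is not None:
        hits = [m for m in hits if obj in m["objects"]]
    return [m["node"] for m in hits]

publish("cam7", 1200, "gate-a", ["person", "red car"])
print(search(time_range=(1000, 1500), obj="red car"))  # -> ['cam7']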
An early example of in-network search was provided by Yan et al. [32], who developed a distributed image search system for a sensor network based on iMote2 sensor nodes. Their system uses SIFT to generate feature vectors that are clustered into visterms (quantized representations of image features). They optimized both their vocabulary tree and inverted index for flash memory. They used buffered lookup to reduce the performance penalty of storing the vocabulary tree in flash, which has much longer access times than RAM: they read the tree in subtree-sized increments that fit into RAM, then buffer a collection of SIFT vectors for lookup within each loaded subtree.
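Here is a sketch of buffered lookup under stated assumptions: integers stand in for SIFT vectors, a modulo stands in for descending the in-RAM top of the tree, and load_subtree stands in for a slow flash read. The point is the batching, which loads each subtree once per batch instead of once per query.

def load_subtree(subtree_id):
    # Stand-in for reading and deserializing one RAM-sized subtree from flash.
    return {"id": subtree_id,
            "visterm": lambda v: hash((subtree_id, v)) % 1000}

def buffered_lookup(queries, buffer_size=64):
    """Map feature vectors to visterms, amortizing flash reads over a batch."""
    results = []
    for start in range(0, len(queries), buffer_size):
        batch = queries[start:start + buffer_size]
        by_subtree = {}
        for v in batch:  # group the batch by destination subtree
            by_subtree.setdefault(v % 4, []).append(v)
        for sid, vecs in by_subtree.items():
            sub = load_subtree(sid)  # one flash read serves the whole group
            results += [sub["visterm"](v) for v in vecs]
    return results

print(buffered_lookup(list(range(10)), buffer_size=8))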
Optimizations for the inverted index were designed to compensate for the write characteristics of flash: writes are slower than reads, and an entire block must be erased and rewritten to change any value in the block [19]. They stored the inverted index in a log structure, which records changes by appending them to a log; as a consequence, a data access may require multiple file reads. They minimize this read overhead by writing only visterms with many associated image IDs to flash; Zipf's Law predicts that a small fraction of the visterms accounts for the bulk of the index entries.
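A hedged sketch of this policy: postings buffer in RAM, and only visterms that accumulate many image IDs are appended to the flash log, so in-place block rewrites are avoided. The threshold and layout are assumptions, not the published design.

FLUSH_THRESHOLD = 4  # assumed cutoff for "many associated image IDs"

ram_index = {}   # visterm -> image IDs still buffered in RAM
flash_log = []   # append-only log; each entry is one flushed posting list

def add_posting(visterm, image_id):
    ram_index.setdefault(visterm, []).append(image_id)
    if len(ram_index[visterm]) >= FLUSH_THRESHOLD:
        # Appending sidesteps the erase-and-rewrite cost of in-place updates.
        flash_log.append((visterm, ram_index.pop(visterm)))

def lookup(visterm):
    """Gather a visterm's image IDs from the scattered log entries (the
    source of the multiple reads per access) plus the RAM buffer."""
    ids = [i for term, chunk in flash_log if term == visterm for i in chunk]
    return ids + ram_index.get(visterm, [])

for img in range(6):
    add_posting("v42", img)
print(lookup("v42"))  # -> [0, 1, 2, 3, 4, 5]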