12.5.6 Summary
In this section, we have examined in detail a concrete case study of emerging
methods for data-driven ASM system design targeted at embedded computer vision.
In particular, we have discussed a contextual bandit framework (CBF) for learning
contention and congestion conditions in object or face recognition via wireless
mobile streaming and cloud-based processing. Analytic results show that the CBF
framework converges to the value of the oracle solution (i.e., the solution that
assumes full knowledge of congestion and contention conditions). Simulations within
a cloud-based face recognition system demonstrate that the CBF approach outperforms
Q-learning, as it adjusts quickly to contention and congestion conditions. For
more details on the CBF approach, we refer the reader to [2].
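The contextual bandit idea underlying the CBF can be illustrated with a minimal per-context UCB1 learner. This is a generic sketch, not the actual CBF algorithm of [2]: the context keys (e.g., discretized congestion levels), the action set, and the reward model are placeholder assumptions introduced for illustration.

```python
import math

class ContextualUCB:
    """Illustrative per-context UCB1 bandit: for each observed context
    (e.g., a discretized congestion/contention level), it learns which
    action (e.g., a transmission/processing configuration) yields the
    highest average reward."""

    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.counts = {}   # context -> per-action pull counts
        self.values = {}   # context -> per-action mean rewards

    def select(self, context):
        counts = self.counts.setdefault(context, [0] * self.n_actions)
        values = self.values.setdefault(context, [0.0] * self.n_actions)
        # Try every action once in a new context before applying UCB.
        for a in range(self.n_actions):
            if counts[a] == 0:
                return a
        total = sum(counts)
        # UCB1 rule: mean reward plus an exploration bonus that shrinks
        # as an action is tried more often in this context.
        ucb = [values[a] + math.sqrt(2 * math.log(total) / counts[a])
               for a in range(self.n_actions)]
        return max(range(self.n_actions), key=lambda a: ucb[a])

    def update(self, context, action, reward):
        self.counts[context][action] += 1
        n = self.counts[context][action]
        # Incremental update of the running mean reward.
        self.values[context][action] += (reward - self.values[context][action]) / n
```

Keeping independent statistics per context is what lets such a learner track different congestion regimes separately, at the cost of memory that grows with the number of distinct contexts.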
12.6 Future Directions in Stream Mining Systems
for Computer Vision
Most existing solutions for designing and configuring computer vision and
stream-mining systems based on extracted visual data offload their processing to
the cloud and assume that the underlying characteristics (e.g., visual
characteristics) are either known, or that simple-yet-accurate models of these
characteristics can be built. In practice, however, this knowledge is not
available, and models of such computer vision applications or the associated
processing mechanisms are very difficult to build and calibrate for specific
environments, since these characteristics vary dynamically over time. Hence,
despite applying optimization, these solutions tend to yield highly sub-optimal
performance, since the models they use for the experienced dynamics are
inaccurate. Reinforcement learning (i.e., learning how to act based on past
experience) therefore becomes a vital component in all such systems. Some of the
best-performing online reinforcement learning algorithms are Q-learning and
structural-based reinforcement learning. In these, the goal is to learn the
state-value function, which provides a measure of the expected long-term
performance (utility) when the system acts optimally in a dynamic environment.
It has been proven that online learning algorithms converge to optimal solutions
when all possible system states are visited infinitely often [36].
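To make the tabular learning concrete, the following sketch runs Q-learning on a toy problem. The four-state chain environment, the rewards, and all parameter values are invented for illustration and are unrelated to the system of Sect. 12.5; the loop only demonstrates the per-state value table whose size and update frequency drive the memory and convergence issues discussed below.

```python
import random

def q_learning(transition, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Minimal tabular Q-learning: learns Q[s][a], the expected discounted
    return from taking action a in state s and acting greedily afterwards.
    Note the table has one entry per state-action pair, which is the
    source of the memory overhead for large state spaces."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy exploration over the current Q estimates.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s2, r, done = transition(s, a)
            # One-step temporal-difference update toward r + gamma * max_a' Q[s2][a'].
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

# Toy 4-state chain: action 1 moves right, action 0 moves left;
# reaching state 3 gives reward 1 and ends the episode.
def chain_step(s, a):
    s2 = min(s + 1, 3) if a == 1 else max(s - 1, 0)
    done = (s2 == 3)
    return s2, (1.0 if done else 0.0), done
```

Because every state-action pair must be visited repeatedly before its entry is accurate, convergence slows sharply as the table grows, which motivates the structural-based alternatives mentioned next.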
However, these methods have to learn the state-value function at every possible
state. As a result, they incur large memory overheads for storing the state-value
function, and they are typically slow to adapt to new or dynamically changing
environments (i.e., they exhibit a slow convergence rate), especially when the
state space is large, as in the wireless transmission and recognition problem
considered in Sect. 12.5. These memory and speed-of-learning deficiencies are
alleviated in structural-based learning solutions. Despite this, a key limitation
remains: all these schemes provide only asymptotic bounds for the learning
performance; no speed-of-learning guarantees are provided. Nevertheless, in most
computer vision