Figure 12.20 Beginning of the process of gradient descent
Consider line (1). It shows the initial value of w = [0, 1]. Recall that we use u and v for
the components of w, so u = 0 and v = 1. We also see the initial value of b = −2. We must
use these values of u, v, and b to evaluate the conditions in Fig. 12.19. The first of the
conditions in Fig. 12.19 is u + 4v + b ≥ +1. The left side is 0 + 4 + (−2) = 2, so the condition is
satisfied. However, the second condition, 2u + 2v + b ≥ +1, fails. The left side is 0 + 2 + (−2)
= 0. The fact that the sum is 0 means the second point (2, 2) is exactly on the separating
hyperplane, and not outside the margin. The third condition is satisfied, since 0 + 4 + (−2)
= 2 ≥ +1. The last three conditions are also satisfied, and in fact are satisfied exactly. For
instance, the fourth condition is u + v + b ≤ −1. The left side is 0 + 1 + (−2) = −1. Thus, the
pattern oxoooo (o for a satisfied condition, x for a violated one) represents the outcome of
these six conditions, as we see in the first line of Fig. 12.20.
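The check of the six conditions can be sketched in a few lines of Python. This is only an illustration: the first four points are read off the conditions quoted above, while the fifth and sixth points, (2, 1) and (3, 1), are assumed from Fig. 12.19, and the function name bad_pattern is hypothetical.

```python
# Training points and labels. The first four are read off the conditions
# in the text; (2, 1) and (3, 1) are assumed from Fig. 12.19.
points = [(1, 4), (2, 2), (3, 4), (1, 1), (2, 1), (3, 1)]
labels = [+1, +1, +1, -1, -1, -1]

def bad_pattern(u, v, b):
    """Return 'o' for each satisfied condition and 'x' for each violated one."""
    marks = []
    for (x1, x2), y in zip(points, labels):
        # y * (u*x1 + v*x2 + b) >= 1 covers both the ">= +1" form
        # (positive points) and the "<= -1" form (negative points).
        satisfied = y * (u * x1 + v * x2 + b) >= 1
        marks.append("o" if satisfied else "x")
    return "".join(marks)

# Initial values from line (1): w = [0, 1], b = -2.
pattern = bad_pattern(0, 1, -2)
print(pattern)
```

Multiplying by the label lets one inequality stand in for both kinds of condition, which keeps the check to a single comparison per point.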
We use these conditions to compute the partial derivatives. For ∂f/∂u, we use u in place
of w_j in Equation 12.6. This expression thus becomes
u + C(0 + (−2) + 0 + 0 + 0 + 0) = 0 + (1/10)(−2) = −0.2.
The sum multiplying C can be explained this way. For each of the six conditions of Fig.
12.19, take 0 if the condition is satisfied, and take the value in the column labeled “for u”
if it is not satisfied. Similarly, for v in place of w_j we get
∂f/∂v = 1 + (1/10)(0 + (−2) + 0 + 0 + 0 + 0) = 0.8. Finally, for
b we get ∂f/∂b = −2 + (1/10)(0 + (−1) + 0 + 0 + 0 + 0) = −2.1.
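The derivative computation for line (1) can be sketched as follows. The sketch assumes C = 1/10 (the value consistent with the result −2.1), assumes the six points of Fig. 12.19, and follows the example's convention of treating b like one more component of w, so that b itself appears in ∂f/∂b. The function name gradient is hypothetical.

```python
# Assumed data: points and labels as in Fig. 12.19, and C = 1/10.
points = [(1, 4), (2, 2), (3, 4), (1, 1), (2, 1), (3, 1)]
labels = [+1, +1, +1, -1, -1, -1]
C = 0.1

def gradient(u, v, b):
    """Partial derivatives of f with respect to u, v, and b."""
    # Each derivative starts with the variable itself (the w_j term).
    du, dv, db = u, v, b
    for (x1, x2), y in zip(points, labels):
        # Only violated conditions contribute to the sum multiplying C.
        if y * (u * x1 + v * x2 + b) < 1:
            du += C * (-y * x1)   # entry in the column "for u"
            dv += C * (-y * x2)   # entry in the column "for v"
            db += C * (-y)        # entry in the column "for b"
    return du, dv, db

# Line (1): u = 0, v = 1, b = -2.
du, dv, db = gradient(0.0, 1.0, -2.0)
print(round(du, 3), round(dv, 3), round(db, 3))
```

Only the second point is violated at this stage, so each sum has a single nonzero term.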
We can now compute the new w and b that appear on line (2) of Fig. 12.20. Since we
chose η = 1/5, the new value of u is 0 − (1/5)(−0.2) = 0.04,
the new value of v is 1 − (1/5)(0.8) = 0.84,
and the new value of b is −2 − (1/5)(−2.1) = −1.58.
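As a quick check of the update step, the following sketch subtracts η times each derivative from the current value. The derivative values −0.2, 0.8, and −2.1 are those of line (1); the variable names are illustrative only.

```python
eta = 0.2                      # learning rate, eta = 1/5

# Values and derivatives from line (1).
u, v, b = 0.0, 1.0, -2.0
du, dv, db = -0.2, 0.8, -2.1

# Gradient descent step: move each variable against its derivative.
u_new = u - eta * du
v_new = v - eta * dv
b_new = b - eta * db

print(round(u_new, 2), round(v_new, 2), round(b_new, 2))
```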
To compute the derivatives shown in line (2) of Fig. 12.20 we must first check the
conditions of Fig. 12.19. While the outcomes of the first three conditions have not changed,
the last three are no longer satisfied. For example, the fourth condition is u + v + b ≤ −1,
but 0.04 + 0.84 + (−1.58) = −0.7, which is greater than −1. Thus, the pattern of bad points
becomes oxoxxx. We now have more nonzero terms in the expressions for the derivatives.
For example, ∂f/∂u = 0.04 + (1/10)(0 + (−2) + 0 + 1 + 2 + 3) = 0.44.
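The cycle of checking the conditions, computing the derivatives, and updating can be repeated in a short loop. The sketch below is one way to do so, under the same assumptions as the example: the six points of Fig. 12.19, C = 1/10, η = 1/5, and b treated like a component of w. The helper names violated and gradient are hypothetical.

```python
# Assumed setup: points and labels as in Fig. 12.19, C = 1/10, eta = 1/5.
points = [(1, 4), (2, 2), (3, 4), (1, 1), (2, 1), (3, 1)]
labels = [+1, +1, +1, -1, -1, -1]
C, eta = 0.1, 0.2

def violated(u, v, b):
    """One boolean per condition: True if the point is 'bad'."""
    return [y * (u * x1 + v * x2 + b) < 1
            for (x1, x2), y in zip(points, labels)]

def gradient(u, v, b, bad):
    """Partial derivatives, summing only over the violated conditions."""
    du, dv, db = u, v, b
    for (x1, x2), y, is_bad in zip(points, labels, bad):
        if is_bad:
            du -= C * y * x1
            dv -= C * y * x2
            db -= C * y
    return du, dv, db

u, v, b = 0.0, 1.0, -2.0       # starting values from line (1)
rows = []
for line in range(1, 5):
    bad = violated(u, v, b)
    pattern = "".join("x" if is_bad else "o" for is_bad in bad)
    du, dv, db = gradient(u, v, b, bad)
    rows.append((line, round(u, 3), round(v, 3), round(b, 3), pattern))
    print(line, round(u, 3), round(v, 3), round(b, 3), pattern,
          round(du, 3), round(dv, 3), round(db, 3))
    # Gradient descent update that produces the next line.
    u, v, b = u - eta * du, v - eta * dv, b - eta * db
```

Each printed row gives the line number, the current u, v, and b, the pattern of bad points, and the three derivatives, so the rows can be compared with the corresponding lines of Fig. 12.20.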
The values of w and b in line (3) are computed from the derivatives of line (2) in the
same way as the values in line (2) were computed from the derivatives of line (1). The new
values do not change the pattern of bad points; it is still oxoxxx. However, when we repeat
the process for line (4), we find that all six conditions are unsatisfied. For instance, the
first condition, u + 4v + b ≥ +1, is not satisfied, because
−0.118 + 4 × 0.502 + (−1.083) = 0.807, which is less than 1. In effect, the