by edges that connect labeled vertices to unlabeled vertices. The graph edges are usually undirected.
An edge between two vertices $x_i$, $x_j$ represents the similarity of the two instances. Let $w_{ij}$ be the
edge weight. The idea is that if $w_{ij}$ is large, then the two labels $y_i$, $y_j$ are expected to be the same.
Therefore, the graph edge weights are of great importance. People often specify the edge weights
with one of the following heuristics (a code sketch implementing all three follows the list):
• Fully connected graph, where every pair of vertices $x_i$, $x_j$ is connected by an edge. The edge
weight decreases as the Euclidean distance $\|x_i - x_j\|$ increases. One popular weight function is

$$ w_{ij} = \exp\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right) \qquad (5.1) $$
where $\sigma$ is known as the bandwidth parameter and controls how quickly the weight decreases.
This weight has the same form as a Gaussian function. It is also called a Gaussian kernel or
a Radial Basis Function (RBF) kernel. The weight is 1 when $x_i = x_j$, and approaches 0 as
$\|x_i - x_j\|$ approaches infinity.
• kNN graph. Each vertex defines its k nearest neighbor vertices in Euclidean distance. Note that
if $x_i$ is among $x_j$'s kNN, the reverse is not necessarily true: $x_j$ may not be among $x_i$'s kNN.
We connect $x_i$, $x_j$ if one of them is among the other's kNN. This means that a vertex may
have more than k edges. If $x_i$, $x_j$ are connected, the edge weight $w_{ij}$ is either the constant 1,
in which case the graph is said to be unweighted, or a function of the distance as in (5.1). If
$x_i$, $x_j$ are not connected, $w_{ij} = 0$. A kNN graph automatically adapts to the density of instances
in feature space: in a dense region, the kNN neighborhood radius will be small; in a sparse
region, the radius will be large. Empirically, kNN graphs with small k tend to perform well.
• $\epsilon$NN graph. We connect $x_i$, $x_j$ if $\|x_i - x_j\| \le \epsilon$. The edges can either be unweighted or
weighted. If $x_i$, $x_j$ are not connected, $w_{ij} = 0$. $\epsilon$NN graphs are easier to construct than kNN
graphs.
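As a concrete illustration of these three heuristics, here is a minimal NumPy/SciPy sketch. The function names and the parameter defaults (sigma, k, eps) are our own illustrative choices, not prescriptions from the text.

import numpy as np
from scipy.spatial.distance import cdist

def fully_connected(X, sigma=1.0):
    # Dense weight matrix with Gaussian (RBF) weights as in (5.1).
    D = cdist(X, X)                          # pairwise Euclidean distances
    return np.exp(-D**2 / (2 * sigma**2))

def knn_graph(X, k=3, sigma=None):
    # Symmetric kNN graph: connect x_i, x_j if either is among the
    # other's k nearest neighbors (so a vertex may have more than k edges).
    n, D = len(X), cdist(X, X)
    W = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(D[i])[1:k + 1]:  # skip index 0: x_i itself
            w = 1.0 if sigma is None else np.exp(-D[i, j]**2 / (2 * sigma**2))
            W[i, j] = W[j, i] = w            # union of the two neighborhoods
    return W

def enn_graph(X, eps=1.0):
    # epsilon-NN graph: connect x_i, x_j iff ||x_i - x_j|| <= eps.
    W = (cdist(X, X) <= eps).astype(float)
    np.fill_diagonal(W, 0.0)                 # no self-edges
    return W

For example, knn_graph(np.random.rand(8, 2), k=2) yields a symmetric 8-by-8 weight matrix with a zero diagonal; note the sketch assumes distinct points, since duplicates would confuse the self-exclusion step.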
These are very generic methods. Of course, better graphs can be constructed if one has knowledge
of the problem domain, and can define better distance functions, connectivity, and edge weights.
Figure 5.1 shows an example graph, where the edges are sparse. Let $x_1$, $x_2$ be the two labeled
instances (vertices). Recall that the edges represent the “same label” assumption. For an unlabeled
instance $x_3$, its label $y_3$ is assumed to be similar to its neighbors in the graph, which in turn are
similar to the neighbors' neighbors. Through this sequence of unlabeled data stepping stones, $y_3$ is
assumed to be more similar to $y_1$ than to $y_2$. This is significant because $x_3$ is in fact closer to $x_2$ in
Euclidean distance; without the graph, one would assume $y_3$ is more similar to $y_2$.
Formally, this intuition corresponds to estimating a label function $f$ on the graph so that
it satisfies two things: (1) the prediction $f(x)$ is close to the given label $y$ on labeled vertices; (2)
$f$ should be smooth on the whole graph. This can be expressed in a regularization framework,
where the former is encoded by the loss function, and the latter is encoded by a special graph-based
regularizer.
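As a concrete instance of this framework (the squared loss and the trade-off parameter $\lambda > 0$ here are illustrative choices; the specific loss and regularizer vary from algorithm to algorithm), with $l$ labeled instances one common objective is

$$ \min_f \; \sum_{i=1}^{l} \left( f(x_i) - y_i \right)^2 \;+\; \lambda \sum_{i,j} w_{ij} \left( f(x_i) - f(x_j) \right)^2 . $$

The first term keeps $f$ close to the given labels, while the second term, the graph-based regularizer, is small only when $f$ varies slowly across heavily weighted edges.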