by edges that connect labeled vertices to unlabeled vertices. The graph edges are usually undirected.
An edge between two vertices $x_i$, $x_j$ represents the similarity of the two instances. Let $w_{ij}$ be the
edge weight. The idea is that if $w_{ij}$ is large, then the two labels $y_i$, $y_j$ are expected to be the same.
Therefore, the graph edge weights are of great importance. People often specify the edge weights
with one of the following heuristics (a code sketch implementing all three follows the list):
• Fully connected graph, where every pair of vertices $x_i$, $x_j$ is connected by an edge. The edge
weight decreases as the Euclidean distance $\|x_i - x_j\|$ increases. One popular weight function is

$$ w_{ij} = \exp\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right) \qquad (5.1) $$
where $\sigma$ is known as the bandwidth parameter and controls how quickly the weight decreases.
This weight has the same form as a Gaussian function. It is also called a Gaussian kernel or
a Radial Basis Function (RBF) kernel. The weight is 1 when $x_i = x_j$, and approaches 0 as
$\|x_i - x_j\|$ approaches infinity.
• kNN graph. Each vertex defines its k nearest neighbor vertices in Euclidean distance. Note that
if $x_i$ is among $x_j$'s kNN, the reverse is not necessarily true: $x_j$ may not be among $x_i$'s kNN.
We connect $x_i$, $x_j$ if one of them is among the other's kNN. This means that a vertex may
have more than k edges. If $x_i$, $x_j$ are connected, the edge weight $w_{ij}$ is either the constant 1,
in which case the graph is said to be unweighted, or a function of the distance as in (5.1). If
$x_i$, $x_j$ are not connected, $w_{ij} = 0$. A kNN graph automatically adapts to the density of instances
in feature space: in a dense region, the kNN neighborhood radius will be small; in a sparse
region, the radius will be large. Empirically, kNN graphs with small k tend to perform well.
• $\epsilon$NN graph. We connect $x_i$, $x_j$ if $\|x_i - x_j\| \le \epsilon$. The edges can either be unweighted or
weighted. If $x_i$, $x_j$ are not connected, $w_{ij} = 0$. $\epsilon$NN graphs are easier to construct than kNN
graphs.
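As a concrete illustration of these three heuristics, here is a minimal NumPy/SciPy sketch. The function names and the parameter defaults (sigma, k, eps) are our own illustrative choices, not prescriptions from the text.

import numpy as np
from scipy.spatial.distance import cdist

def fully_connected(X, sigma=1.0):
    # Dense weight matrix with Gaussian (RBF) weights as in (5.1).
    D = cdist(X, X)                          # pairwise Euclidean distances
    return np.exp(-D**2 / (2 * sigma**2))

def knn_graph(X, k=3, sigma=None):
    # Symmetric kNN graph: connect x_i, x_j if either is among the
    # other's k nearest neighbors (so a vertex may have more than k edges).
    n, D = len(X), cdist(X, X)
    W = np.zeros((n, n))
    for i in range(n):
        for j in np.argsort(D[i])[1:k + 1]:  # skip index 0: x_i itself
            w = 1.0 if sigma is None else np.exp(-D[i, j]**2 / (2 * sigma**2))
            W[i, j] = W[j, i] = w            # union of the two neighborhoods
    return W

def enn_graph(X, eps=1.0):
    # epsilon-NN graph: connect x_i, x_j iff ||x_i - x_j|| <= eps.
    W = (cdist(X, X) <= eps).astype(float)
    np.fill_diagonal(W, 0.0)                 # no self-edges
    return W

For example, knn_graph(np.random.rand(8, 2), k=2) yields a symmetric 8-by-8 weight matrix with a zero diagonal; note the sketch assumes distinct points, since duplicates would confuse the self-exclusion step.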
These are very generic methods. Of course, better graphs can be constructed if one has knowledge
of the problem domain, and can define better distance functions, connectivity, and edge weights.
Figure 5.1 shows an example graph, where the edges are sparse. Let $x_1$, $x_2$ be the two labeled
instances (vertices). Recall that the edges represent the “same label” assumption. For an unlabeled
instance $x_3$, its label $y_3$ is assumed to be similar to its neighbors in the graph, which in turn are
similar to the neighbors' neighbors. Through this sequence of unlabeled data stepping stones, $y_3$ is
assumed to be more similar to $y_1$ than to $y_2$. This is significant because $x_3$ is in fact closer to $x_2$ in
Euclidean distance; without the graph, one would assume $y_3$ is more similar to $y_2$.
Formally, this intuition corresponds to estimating a label function $f$ on the graph so that
it satisfies two things: (1) the prediction $f(x)$ is close to the given label $y$ on labeled vertices; (2)
$f$ should be smooth on the whole graph. This can be expressed in a regularization framework,
where the former is encoded by the loss function, and the latter is encoded by a special graph-based
regularizer.
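As a concrete instance of this framework (the squared loss and the trade-off parameter $\lambda > 0$ here are illustrative choices; the specific loss and regularizer vary from algorithm to algorithm), with $l$ labeled instances one common objective is

$$ \min_f \; \sum_{i=1}^{l} \left( f(x_i) - y_i \right)^2 \;+\; \lambda \sum_{i,j} w_{ij} \left( f(x_i) - f(x_j) \right)^2 . $$

The first term keeps $f$ close to the given labels, while the second term, the graph-based regularizer, is small only when $f$ varies slowly across heavily weighted edges.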