Database Reference
In-Depth Information
EXERCISE 10.2.3 Using the betweenness values from Exercise 10.2.2 , determine reason-
able candidates for the communities in Fig. 10.9 by removing all edges with a betweenness
above some threshold.
Speeding Up the Betweenness Calculation
If we apply the method of Section 10.2.4 to a graph of n nodes and e edges, it takes O ( ne ) running time to compute
the betweenness of each edge. That is, BFS from a single node takes O ( e ) time, as do the two labeling steps. We must
start from each node, so there are n of the computations described in Section 10.2.4 .
If the graph is large - and even a million nodes is large when the algorithm takes O ( ne ) time - we cannot afford to
execute it as suggested. However, if we pick a subset of the nodes at random and use these as the roots of breadth-first
searches, we can get an approximation to the betweenness of each edge that will serve in most applications.
10.3 Direct Discovery of Communities
In the previous section we searched for communities by partitioning all the individuals in a
social network. While this approach is relatively efficient, it does have several limitations.
It is not possible to place an individual in two different communities, and everyone is as-
signed to a community. In this section, we shall see a technique for discovering communit-
ies directly by looking for subsets of the nodes that have a relatively large number of edges
among them. Interestingly, the technique for doing this search on a large graph involves
finding large frequent itemsets, as was discussed in Chapter 6 .
10.3.1
Finding Cliques
Our first thought about how we could find sets of nodes with many edges between them
is to start by finding a large clique (a set of nodes with edges between any two of them).
However, that task is not easy. Not only is finding maximal cliques NP-complete, but it is
among the hardest of the NP-complete problems in the sense that even approximating the
maximal clique is hard. Further, it is possible to have a set of nodes with almost all edges
between them, and yet have only relatively small cliques.
EXAMPLE 10.10 Suppose our graph has nodes numbered 1 , 2 , . . . , n and there is an edge
between two nodes i and j unless i and j have the same remainder when divided by k . Then
the fraction of possible edges that are actually present is approximately ( k −1) /k . There are
many cliques of size k , of which {1 , 2 , . . . , k } is but one example.
Yet there are no cliques larger than k . To see why, observe that any set of k +1 nodes
has two that leave the same remainder when divided by k . This point is an application of
Search WWH ::




Custom Search