Database Reference
In-Depth Information
the adjacency matrix representation of a graph, and therefore we are going to assume
in the following that m = n , unless otherwise noticed.
There are three types of operations in the previous formula:
1. combine2 : multiply m i,j and v j .
2. combineAll : sum n multiplication results for node i .
3. assign : overwrite the previous value of v i with the new result to make
v i .
We introduce an abstraction of the basic matrix-vector multiplication, called
generalized iterative matrix-vector multiplication. The corresponding programming
primitive is the GIM-V primitive on which PEGASUS is based. The “Iterative” in
GIM-V denotes that we apply the × G operation until a convergence criterion is met.
Specifically, let us define the operator × G as follows:
v ′ = M × G v
where
v i = assign ( v i , combineAll i ({ x j | j = 1.. n , and x j = combine2 ( m i,j , v j )})).
The functions combine2 (), combineAll (), and assign () have the follow-
ing interpretation, generalizing the product, sum, and assignment of the traditional
matrix-vector multiplication:
1. combine2 ( m i,j , v j ) : combine m i,j and v j .
2. combineAll i ( x 1 ,…, x n ) : combine all the results from combine2 () for
node i .
3. assign ( v i , v new ) : decide how to update v i with v new .
In the following sections we show how different choices of combine2 (),
combineAll i (), and assign () allow us to solve several important graph mining tasks.
Before that, we want to highlight the strong connection of GIM-V with SQL. When
combineAll i () and assign () can be implemented by user defined functions, the
operator × G can be expressed concisely in terms of SQL. This viewpoint is important
when we implement GIM-V in large-scale parallel processing platforms, including
Hadoop, if they can be customized to support several SQL primitives including JOIN
and GROUP BY. Suppose we have an edge table E ( sid, did, val ) and a vector
table V(id, val) , corresponding to a matrix and a vector, respectively. Then, × G
corresponds to the SQL statement in Table 8.1. We assume that we have (built-in or
user-defined) functions, combineAll i () and combine2 (), and we also assume that
the resulting table/vector will be fed into the assign () function (omitted, for clarity).
TABLE 8.1
GIM-V in Terms of SQL
SELECT E.sid, combineAll E.sid ( combine2 (E.val,V.val))
FROM E, V
WHERE E.did=V.id
GROUP BY E.sid
 
Search WWH ::




Custom Search