Cross-Network Social Multimedia Computing - User-centric Social Multimedia Computing

5.4.2 Topic Association

5.4.2.1 Transition Probability-Based Topic Association

Given the derived heterogeneous topic spaces, topic association aims to discover the correlation between them, i.e., an association matrix $A$. Recall the basic idea: if many overlapped users who take an interest in the $i$-th YouTube topic also follow the $j$-th Twitter topic, the association $a_{ij}$ between the two topics tends to be strong. By examining the collaborative involvement of cross-network topics among overlapped users, we view topic association as a probabilistic transition problem and calculate the association matrix $A$ by aggregating over all the overlapped users:

$$a_{ij} = p(z_j^{Twi} \mid z_i^{You}) = \sum_{u \in U} p(z_j^{Twi} \mid u) \cdot p(u \mid z_i^{You})$$

where the prior $p(z_i^{You} \mid u)$ indicates the $i$-th YouTube topic distribution for user $u$. By calculating all cross-network topic pairs and subsequent normalization, we can obtain the topic association matrix $A = \{a_{ij}\}$.
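The aggregation over overlapped users can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `transition_association` and the array names `theta_you` and `theta_twi` are hypothetical, $p(u \mid z_i^{You})$ is obtained from the prior $p(z_i^{You} \mid u)$ by Bayes' rule with a uniform prior over users, and the "subsequent normalization" is taken to be row-wise.

```python
import numpy as np

def transition_association(theta_you, theta_twi):
    """Sketch of a_ij = sum_u p(z_j^Twi | u) * p(u | z_i^You).

    theta_you: (N, K_you) array, row u holds p(z^You | u).
    theta_twi: (N, K_twi) array, row u holds p(z^Twi | u).
    Returns a (K_you, K_twi) matrix whose rows sum to 1.
    """
    # Assumption: uniform prior over users, so p(u | z_i^You) is each
    # YouTube-topic column of theta_you normalized over users.
    col_sums = theta_you.sum(axis=0, keepdims=True)
    p_u_given_you = theta_you / np.maximum(col_sums, 1e-12)

    # Aggregate over users: entry (i, j) sums p(u | z_i) * p(z_j | u) over u.
    A = p_u_given_you.T @ theta_twi

    # Assumption: the normalization step rescales each row to sum to 1.
    return A / np.maximum(A.sum(axis=1, keepdims=True), 1e-12)

# Toy data: 3 overlapped users, 2 YouTube topics, 2 Twitter topics.
theta_you = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
theta_twi = np.array([[0.8, 0.2], [0.6, 0.4], [0.1, 0.9]])
A = transition_association(theta_you, theta_twi)
```

In the toy data the first two users are fully interested in the first YouTube topic, so the first row of `A` is the average of their Twitter topic distributions.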

5.4.2.2 Regression-Based Topic Association

The above probability-based method calculates directly over all overlapped users, so noisy user topic distributions will degrade the derived association matrix. An alternative way to obtain the association matrix is to formulate it as an optimization problem. Specifically, we interpret the topic association as a linear regression between the two user distribution matrices $U^{You}$ and $U^{Twi}$.

Formally, the regression objective function is:

$$\min_{A} \; \| U^{Twi} - A U^{You} \|^{2} + \lambda_1 \| A \|_{q} \tag{5.4}$$

where the first term represents the regression error, the second term is the regularization penalty used to avoid overfitting, and $\lambda_1 \in [0, 1]$ is the weighting parameter. When $q = 1$, Eq. (5.4) is a lasso problem and can be effectively solved by LARS [10]. When $q = 2$, Eq. (5.4) is a ridge regression problem with the analytical solution:

$$A = U^{Twi} {U^{You}}^{T} \left( U^{You} {U^{You}}^{T} + \lambda_1 I \right)^{-1} \tag{5.5}$$

where $I$ is the identity matrix. We denote the regression-based association strategies with $q = 1$ and $q = 2$ as Regression_l1 and Regression_l2, respectively.
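Both regression strategies can be sketched in NumPy. This is an illustrative sketch under several assumptions: the columns of $U^{You}$ and $U^{Twi}$ index the overlapped users; the error term is the squared Frobenius norm; for $q = 2$ the penalty is the squared Frobenius norm of $A$, which is the penalty for which Eq. (5.5) is the stationary point; and the $q = 1$ case is solved with proximal gradient descent (ISTA), a simpler solver swapped in for the LARS algorithm cited in the text. The function names and the random input data are hypothetical.

```python
import numpy as np

def regression_l2(U_you, U_twi, lam=0.1):
    """Closed form of Eq. (5.5); U_you is (K_you, N), U_twi is (K_twi, N)."""
    gram = U_you @ U_you.T + lam * np.eye(U_you.shape[0])
    # Solve A @ gram = U_twi @ U_you.T without forming the inverse.
    # gram is symmetric, so we solve gram @ A.T = U_you @ U_twi.T instead.
    return np.linalg.solve(gram, U_you @ U_twi.T).T

def regression_l1(U_you, U_twi, lam=0.1, n_iter=500):
    """ISTA sketch for min_A ||U_twi - A U_you||_F^2 + lam * sum(|A|)."""
    A = np.zeros((U_twi.shape[0], U_you.shape[0]))
    # Lipschitz constant of the gradient of the squared-error term.
    lipschitz = 2.0 * np.linalg.norm(U_you, 2) ** 2
    for _ in range(n_iter):
        grad = 2.0 * (A @ U_you - U_twi) @ U_you.T
        Z = A - grad / lipschitz
        # Soft-thresholding is the proximal operator of the l1 penalty.
        A = np.sign(Z) * np.maximum(np.abs(Z) - lam / lipschitz, 0.0)
    return A

# Synthetic topic distributions for 50 users (illustration only).
rng = np.random.default_rng(0)
U_you = rng.dirichlet(np.ones(4), size=50).T  # (4 YouTube topics, 50 users)
U_twi = rng.dirichlet(np.ones(3), size=50).T  # (3 Twitter topics, 50 users)

A_l2 = regression_l2(U_you, U_twi, lam=0.1)
A_l1 = regression_l1(U_you, U_twi, lam=0.1)
```

Because $A$ maps YouTube distributions to Twitter distributions in Eq. (5.4), the returned matrices have shape (Twitter topics, YouTube topics), i.e., the entry for YouTube topic $i$ and Twitter topic $j$ sits at position `[j, i]`.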
