Community Detection in Collaborative Tagging Systems - Community-Built Databases: Research and Development

structure. The proposed framework is applied to three real-world tag datasets and the results are presented in Sect. 5.4. Finally, Sect. 5.5 concludes the chapter.

5.2 Background

This section presents the background material that is necessary for the subsequent discussion. Some mathematical notation is provided in Sect. 5.2.1, and several recent works that are pertinent to the subject of the chapter are discussed in Sect. 5.2.2.

5.2.1 Notation

For folksonomies, we employ the definition presented in [2]. However, we include neither the subtag/supertag relation nor the personomy construct that appear in the original definition.

Definition 1. A folksonomy is a tuple $F = (U, R, T, Y)$, where $U$, $R$, $T$ are finite sets comprising respectively the users, resources and tags of the Collaborative Tagging System under study, and $Y$ is a ternary relation between them, i.e., $Y \subseteq U \times R \times T$, whose elements are called tag assignments (TAS).
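As a concrete illustration of Definition 1, the following Python sketch stores a folksonomy as three finite sets plus a set of (user, resource, tag) triples, and checks that Y is a subset of U × R × T. The class name `Folksonomy`, its field names, the helper `folksonomy_from_assignments` and the sample data are all hypothetical and are not taken from [2].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Folksonomy:
    """Illustrative container for F = (U, R, T, Y); all names are assumptions."""
    users: frozenset
    resources: frozenset
    tags: frozenset
    assignments: frozenset  # Y: set of (user, resource, tag) triples

    def __post_init__(self):
        # Enforce Y ⊆ U × R × T: every tag assignment must use known elements.
        for u, r, t in self.assignments:
            if u not in self.users or r not in self.resources or t not in self.tags:
                raise ValueError(f"invalid tag assignment: {(u, r, t)}")


def folksonomy_from_assignments(assignments):
    """Build a folksonomy whose U, R, T are exactly the elements seen in Y."""
    y = frozenset(assignments)
    return Folksonomy(
        users=frozenset(u for u, _, _ in y),
        resources=frozenset(r for _, r, _ in y),
        tags=frozenset(t for _, _, t in y),
        assignments=y,
    )


# Made-up sample data for illustration only.
example = folksonomy_from_assignments([
    ("alice", "photo1", "sunset"),
    ("alice", "photo1", "beach"),
    ("bob", "photo2", "beach"),
])
```

Deriving U, R and T from the observed triples is only one possible design; a system that tracks users or tags with no assignments yet would pass the three sets explicitly.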

Since folksonomies are commonly represented in the form of networks, we will adopt the common graph notation, according to which $G = (V, E)$ is a graph consisting of the set $V$ of nodes and the set $E$ of edges. A natural way to model a folksonomy is by use of a hypergraph, where $V = U \cup R \cup T$ and $E = \{\{u, r, t\} \mid (u, r, t) \in Y\}$. However, the hypergraph model is very rarely used in practice due to its complexity, as well as due to the lack of efficient techniques for analyzing its structure. Instead, the tripartite graph model, in which each hyper-edge $\{u, r, t\} \in E$ is reduced to three simple edges $\{(u, r) \in U \times R,\ (u, t) \in U \times T,\ (r, t) \in R \times T\}$, is used as an approximate representation for folksonomies. Further simplifications of the model, for example, to bipartite graphs and to one-mode networks [1], are even more frequently used for tackling specific analysis problems.
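The reduction from hyper-edges to simple edges can be sketched in Python as follows. The function name `tripartite_edges` and the sample tag assignments are hypothetical; the sketch only illustrates how each hyper-edge {u, r, t} contributes one user-resource, one user-tag and one resource-tag edge.

```python
def tripartite_edges(assignments):
    """Reduce each hyper-edge {u, r, t} to the simple edges (u, r), (u, t), (r, t).

    Returns three edge sets, one per pair of node types. Duplicates collapse
    because several hyper-edges may share the same simple edge.
    """
    user_resource, user_tag, resource_tag = set(), set(), set()
    for u, r, t in assignments:
        user_resource.add((u, r))
        user_tag.add((u, t))
        resource_tag.add((r, t))
    return user_resource, user_tag, resource_tag


# Made-up tag assignments (u, r, t) for illustration only.
Y = {
    ("alice", "photo1", "sunset"),
    ("alice", "photo1", "beach"),
    ("bob", "photo1", "beach"),
}

ur, ut, rt = tripartite_edges(Y)
```

With this sample, three hyper-edges yield two user-resource edges, three user-tag edges and two resource-tag edges. In general, the simple edges no longer record which user attached which tag to which resource, which is consistent with the tripartite graph being only an approximate representation.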

For instance, a very common folksonomy-derived graph is the tag co-occurrence graph, $G_T = \{V_T, E_T\}$, where nodes represent tags, $V_T \subseteq T$, and edges depict co-occurrences between pairs of tags, $E_T = \{(t_i, t_j) \mid t_i, t_j \in T\}$. Tag co-occurrence is usually defined in the context of resources, i.e., when two tags are used together to annotate the same resource, they are considered co-occurring. The number of times that two tags co-occur in the context of some resource can be used as a weight of their relation on the graph, $c(t_i, t_j) = c_{ij} = |\{r \in R \mid (r, t_i), (r, t_j) \in R \times T\}|$.⁴ There

4 In the following, we refer to this kind of co-occurrence as resource-based tag co-occurrence.
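Resource-based tag co-occurrence weights can be computed along the following lines. This is a minimal Python sketch under the assumption that the weight of a tag pair counts the distinct resources annotated with both tags; the function name `cooccurrence_weights` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_weights(assignments):
    """Count, for each unordered tag pair, the resources carrying both tags."""
    # Collect the set of tags attached to each resource, ignoring the user.
    tags_of = defaultdict(set)
    for _user, resource, tag in assignments:
        tags_of[resource].add(tag)

    weights = Counter()
    for tags in tags_of.values():
        # Sort so each unordered pair is always keyed as (smaller, larger).
        for t_i, t_j in combinations(sorted(tags), 2):
            weights[(t_i, t_j)] += 1
    return weights


# Made-up tag assignments (u, r, t) for illustration only.
Y = {
    ("alice", "photo1", "sunset"),
    ("alice", "photo1", "beach"),
    ("bob", "photo2", "beach"),
    ("bob", "photo2", "sunset"),
    ("bob", "photo2", "sea"),
}

c = cooccurrence_weights(Y)
```

With these data, the pair (beach, sunset) receives weight 2 because both tags annotate photo1 and photo2, and the keys of `c` form the edge set of the tag co-occurrence graph. Tags attached to the same resource by different users are still counted as co-occurring here, which is one possible reading of the resource-based definition.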
