Table 5.4 Examples of unrelated tags that were assigned by CNM to the same community
| Dataset | Examples of unrelated tags placed in the same community |
|---|---|
| BIBSONOMY-200K | Hannover, nutritional, ebusiness, bishop, vivaldi, sunsets, skyscapes, recycle, antiracist, patentbibliometrics |
| | Information retrieval, magnetic, robotics, kolmogorov, wordnet, darmstadt, socialinformatics, changemanagement, thermodynamics, metaphysics |
| | webdesign, windows, torrent, puzzle, vmware, geotagging, mov, techcrunch, cpplib, baseballplayers |
| FLICKR-1M | Spanien, common chimpanzee, star wars, renault, restaurant, prostitution, olympicstadium, large windows, infrared, president of the usa |
| | Barcelona, watermelon, photon awards, birthday, mediterranean, palm tree, fine arts, volkswagen, building, logistics |
| | Roma, double bass, crowd surfing, environment, lomography, flickr babes, sombrero, basketball, bruce springsteen, design for children |
| DELICIOUS-7M | Geekiness, telepathy, scifihorror, britneyspears, theflintstones, sportculture, onlinepokergames, environmentalhealth, uspatent, argentina |
| | Education, capetown, flashwebsites, businessanalyst, alcoholicsanonymous, newjournalism, adventuretravel, countrycallingcodes, musicnetwork, scienceastrophysics |
| | Food, island, bike, jersey, federal, climate, ghosts, athletics, enviroment, imperialism |

Examples from the three largest communities of each dataset are presented.
Close examination of the tags contained in them reveals their strong semantic and
contextual association. In the case of CNM, these communities are contained in the
aforementioned gigantic communities together with numerous unrelated tags, so
their utility is limited.
Although the tag communities detected by HCD contain tags that are closely
related to each other, there are cases in which they appear to be fragmented: tags
that refer to the same topic are split across multiple communities. Such an
example is presented in Table 5.6. In this case, the CNM algorithm managed to
assemble all tags related to recipes and ingredients into a single community,[10]
while HCD had them dispersed in four different groups.
Subsequently, we also computed the conductance values for the communities
derived by CNM and HCD for every dataset. Figure 5.6 presents the conductance
distributions characterizing the community structure produced by the two methods
under comparison. It appears that CNM produces communities of lower conductance
than HCD, which in graph terms means that the CNM communities are better
separated from the rest of the network than their HCD counterparts.
However, this seemingly superior performance of CNM in terms of conductance
comes at the cost of creating highly unbalanced tag communities that do not
correspond well to the topics connoted by the tags.
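To make the notion of conductance concrete, the following minimal sketch computes it for a single community of an undirected, unweighted graph. It assumes the common definition, in which the number of edges leaving the community is divided by the smaller of the two volumes (sums of node degrees) on either side of the cut; the exact variant used for Figure 5.6 is not restated here and may differ. The function name, the adjacency-dictionary representation, and the toy graph are illustrative assumptions, not part of the original experiments.

```python
def conductance(adjacency, community):
    """Conductance of `community` in an undirected, unweighted graph.

    adjacency: dict mapping each node to the set of its neighbours
               (assumed symmetric, i.e. every edge is listed in both directions).
    community: iterable of nodes forming the community S.
    """
    inside = set(community)
    cut = 0      # edges with exactly one endpoint in S
    vol_in = 0   # sum of degrees of the nodes in S
    vol_out = 0  # sum of degrees of the nodes outside S
    for node, neighbours in adjacency.items():
        degree = len(neighbours)
        if node in inside:
            vol_in += degree
            cut += sum(1 for n in neighbours if n not in inside)
        else:
            vol_out += degree
    denom = min(vol_in, vol_out)
    # Convention assumed here: an empty or all-encompassing community gets 0.0.
    return cut / denom if denom else 0.0


# Toy graph (hypothetical): two triangles, a-b-c and d-e-f, joined by edge c-d.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}

well_separated = conductance(toy_graph, {"a", "b", "c"})  # one cut edge, volume 7
poorly_separated = conductance(toy_graph, {"a", "b"})     # two cut edges, volume 4
```

In this toy graph the full triangle has a lower conductance than the two-node subset, matching the reading above that lower values indicate better separation from the rest of the network. A low value alone says nothing about how balanced or topically coherent the community is.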
[10] However, the tags of the CNM community make it clear that irrelevant tags
were also placed in the same community.