efforts in the area of Web 2.0 data mining as well as by carrying out an
analysis of a large dataset collected from Digg. The next section provides
an account of existing studies on the analysis of content consumption
behavior within Web 2.0 applications, organized around four research tracks.
The third section formalizes a framework for the analysis of content-related
phenomena in SBS. The application of this framework is illustrated in the
fourth section, with Digg used as a case study. The final two sections of
the chapter present an outlook on future trends in this area and conclude
the chapter, respectively.

Background

Considerable research interest has recently developed in the analysis and
modeling of the content consumption behavior of Web 2.0 application users.
Much of this research focuses on mining web log data in which user
transactions are recorded. In our study, we have identified four major
research tracks addressing the study of phenomena that arise through online
content consumption by masses of users:

1. Statistical analysis: The monitoring of the activities of large user
   masses enables the application of powerful statistical analyses in order
   to study the distributional properties of observed variables and to make
   inferences about the recorded data.

2. Temporal data mining: The analysis of content consumption patterns over
   time is crucial for an in-depth understanding of the dynamics emerging in
   the phenomena that take place within social bookmarking applications.
   Discovering trends and periodic patterns, as well as producing summaries
   of multiple data streams, is the focus of this perspective.

3. Content semantics: The lexical and semantic analysis of the content that
   is consumed within SBS provides insights into the interests of the masses
   and has broad implications for the design of effective Information
   Retrieval (IR) systems.

4. Social network effects: The influence of the users' online social
   environment on their content consumption behavior is increasingly
   important for describing diffusion processes and viral phenomena arising
   in SBS user communities.

In order to gain a high-level understanding of the phenomena emerging in
complex systems such as SBS, researchers commonly employ statistical
analysis techniques; more specifically, they inspect and analyze the
distributional properties of the variables observed in the system under
study. Previous studies of social, biological and computer systems have
confirmed the emergence of highly skewed distributions in a series of
phenomena, frequently taking the form of a power law. Power-law
distributions, commonly referred to as Zipf's laws or Pareto distributions,
provide a statistical model for describing the “rich-get-richer” phenomena
frequently appearing in complex systems. Two noteworthy survey studies on
power laws are provided by Newman (2005) and Mitzenmacher (2004). In an
attempt to explain the emergence of such distributions in complex systems, a
series of generative models have recently been proposed; among those, one of
the most prominent is the preferential attachment model by Barabási and
Albert (1999). Later in the chapter, we will confirm that the interest
attracted by online resources in Digg, as well as the voting patterns of SBS
users, follow highly skewed patterns that can often be well approximated by
a power law.

Furthermore, analysis of the temporal aspects of phenomena similar to the
ones appearing within SBS-like environments provides further insights into
the evolution of variables such as the intensity of user activity, or the
number of votes that an online resource collects. For instance, the