type of explanation is easily amenable to preferential attachment models, also
known as 'rich get richer' explanations, which are well known to produce power-law
distributions. Intuitively, the earliest studies of tagging observed that users
imitate other pre-existing tags (Golder and Huberman 2006). Golder and Huberman
proposed that the simplest model that results in a “power-law” would be the classical
Polya urn model (2006). Imagine that there is an urn containing balls, each being
one of some finite number of colors. At every time-step, a ball is chosen at random.
Once a ball is chosen, it is put back in the urn along with another ball of the same
color, which formalizes the process of feedback given by tag suggestions. As put by
Golder and Huberman, “replacement of a ball with another ball of the same color
can be seen as a kind of imitation”, where each ball color corresponds to a
natural language tag, since “the interface through which users add bookmarks
shows users the tags most commonly used by others who bookmarked that URL
already; users can easily select those tags for use in their own bookmarks, thus
imitating the choices of previous users” (2006). Yet, this model is too limited to
describe tagging, as it features only reinforcement of existing tags, not the addition
of new tags.
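
The urn process described above can be sketched in a few lines of Python. This is only a minimal illustration, not code from Golder and Huberman: the function name polya_urn, the starting colors, and the seed parameter are assumptions made for the example. Each color stands for a tag, and each draw returns the ball together with one extra ball of the same color.

```python
import random
from collections import Counter


def polya_urn(initial_colors, steps, seed=None):
    """Simulate a classical Polya urn (hypothetical helper for illustration).

    initial_colors: the balls in the urn at the start, one entry per ball.
    steps: number of draws to perform.
    """
    rng = random.Random(seed)
    urn = list(initial_colors)
    for _ in range(steps):
        ball = rng.choice(urn)  # draw one ball uniformly at random
        urn.append(ball)        # put it back along with another of the same color
    return Counter(urn)


# Three starting "tags"; the counts after 1000 draws depend on the seed.
counts = polya_urn(["red", "green", "blue"], steps=1000, seed=0)
```

Because balls are only ever added, the set of colors in the returned counts is always exactly the starting set, which is the limitation noted above: existing tags are reinforced, but no new tag can appear.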
5.3.1.2 Imitation and the Yule-Simon Model
The first model that formalized the notion of new tags was proposed by Cattuto
(2006). For new tags to be added, a single parameter p must be added
to the model, which represents the probability of a new tag being added, with the
complementary probability 1 − p that an already-existing tag is reinforced by random uniform
choice over all already-existing tags. This results in a Yule-Simon model, a model
first employed by Yule (1925) to model biological genera and later by Simon to model
the construction of a text as a stream of words (Simon 1955). This model has been
shown to be equivalent to the famous Barabasi and Albert algorithm for growing
networks (Bornholdt and Ebel 2001). Yet the standard Yule-Simon process does not
model vocabulary growth in tagging systems very well, as Cattuto noticed, since it
produces exponents “lower than the exponents we observe in actual data” (Cattuto
2006).
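
A standard Yule-Simon process can be sketched as follows. This is a minimal illustration under stated assumptions, not Cattuto's implementation: the function name yule_simon and the integer tag labels are hypothetical, and the "random uniform choice" is read here as a uniform choice over all past tag occurrences in the stream, so that a tag is reinforced in proportion to how often it has already been used.

```python
import random
from collections import Counter


def yule_simon(steps, p, seed=None):
    """Generate a tag stream from a standard Yule-Simon process (illustrative sketch).

    steps: number of tags to append after the initial tag.
    p: probability that a step introduces a brand-new tag.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    rng = random.Random(seed)
    stream = [0]   # the stream starts with a single tag, labeled 0
    next_tag = 1   # label to give the next new tag
    for _ in range(steps):
        if rng.random() < p:
            # With probability p, add a new tag.
            stream.append(next_tag)
            next_tag += 1
        else:
            # With probability 1 - p, copy a uniformly chosen past occurrence,
            # which reinforces tags in proportion to their current frequency.
            stream.append(rng.choice(stream))
    return stream


stream = yule_simon(steps=10_000, p=0.05, seed=1)
frequencies = Counter(stream)  # tag label -> number of uses
```

Setting p to 0 yields a stream in which no new tag ever appears, so only the starting vocabulary is reinforced; any p above 0 lets the vocabulary grow over time.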
Cattuto hypothesizes that this is because the Yule-Simon model assumes users are
choosing to reinforce tags (with probability 1 − p) uniformly from a distribution of all tags that have
been used previously, so Cattuto concludes that “it seems more realistic to assume
that users tend to apply recently added tags more frequently than old ones” (Cattuto
2006). This behavior could be caused by the exposure of a user to a feedback
mechanism, such as the del.icio.us tag suggestion system. Such a suggestion system exposes
the user only to a subset of previously existing tags, such as those most recently
added. Since the tag suggestion mechanism encourages more recently added
tags to be reinforced with a higher probability, Cattuto added a memory kernel
with a power-law exponent to the standard Yule-Simon model. This means that the
probability of a previously existing tag being reinforced is itself weighted according to a
power law, so that a tag that has been applied x steps in the past is chosen