[Figure: Relative frequency of tags per position and the derived power law (log-log scale). x-axis: position of a tag in the distribution (log 2 scale), ticks from 0 to 5; y-axis: ticks from −1 to −8.]
Fig. 5.3 Average relative frequency of tag usage, for the set of 500 “Popular” sites from above.
The y-axis gives the logarithm of the relative frequency (probability). Because the plot uses a
double logarithmic (log-log) scale, the y-axis values are negative, since relative frequencies are
less than one.
LMS error rate in the power law regression of 3.8% over the total number of tags in
the distribution, which is low enough to allow us to conclude that tag distributions
do follow power laws.
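As a rough illustration of the kind of regression referred to here, the sketch below fits a straight line to a ranked tag-frequency list in log-log space by ordinary least squares. The function name `fit_power_law`, the base-2 logarithms (chosen only to match the axis of Fig. 5.3), and the synthetic input are assumptions made for illustration. The error it returns is the mean squared residual in log space, which is not necessarily the measure behind the 3.8% figure quoted above.

```python
import math

def fit_power_law(rel_freq):
    """Fit log2(freq) = intercept + slope * log2(rank) by least squares.

    rel_freq: relative frequencies ordered by rank (rank 1 first).
    Returns (slope, intercept, mean squared residual in log space).
    """
    xs = [math.log2(rank) for rank in range(1, len(rel_freq) + 1)]
    ys = [math.log2(f) for f in rel_freq]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    mse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / n
    return slope, intercept, mse

# Synthetic (not measured) data: an exact power law with exponent -1.5
# over 25 tag positions, normalised so the frequencies sum to one.
raw = [rank ** -1.5 for rank in range(1, 26)]
total = sum(raw)
freq = [value / total for value in raw]

slope, intercept, mse = fit_power_law(freq)
```

On real tag counts the residual would not vanish as it does for this synthetic input; a small residual relative to the spread of the data is what would support a power-law reading.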
We note, however, that the del.icio.us data deviate from a perfect power law: the
slope changes after the top seven or eight positions in the distribution. This effect
is also relatively consistent across the sites in the data set. It may be due to the
cognitive constraints of the users themselves, or it may be an artifact of the way the
del.icio.us interface is constructed, since that number of tags is offered to users as
suggestions to guide their search process. Nevertheless, given that the LMS regression
error is rather low, we argue that the effect is not strong enough to change the overall
conclusion that tag distributions follow power laws.
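One simple way to look for such a change of slope is to fit the head and the tail of the ranked distribution separately and compare the two log-log slopes. The sketch below does this under stated assumptions: the helper names `loglog_slope` and `head_tail_slopes` are hypothetical, the split after position 7 follows the "seven or eight" mentioned above, and the data are synthetic. The direction of the slope change in the synthetic data is arbitrary and is not taken from the del.icio.us measurements.

```python
import math

def loglog_slope(rel_freq, first_rank=1):
    """Least-squares slope of log2(frequency) against log2(rank)."""
    xs = [math.log2(first_rank + i) for i in range(len(rel_freq))]
    ys = [math.log2(f) for f in rel_freq]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def head_tail_slopes(rel_freq, split=7):
    """Fit positions 1..split and split+1..end separately."""
    head = loglog_slope(rel_freq[:split], first_rank=1)
    tail = loglog_slope(rel_freq[split:], first_rank=split + 1)
    return head, tail

# Synthetic (not measured) data: slope -0.5 up to position 7 and
# slope -2.0 afterwards, continuous at position 7.
raw = [r ** -0.5 if r <= 7 else (7 ** 1.5) * r ** -2.0 for r in range(1, 26)]
total = sum(raw)
freq = [value / total for value in raw]

head_slope, tail_slope = head_tail_slopes(freq, split=7)
```

A large gap between the two slopes, appearing at about the same position across many sites, would be consistent with the deviation described above; a single-line fit with a low overall error can still hide it.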
5.2.4 The Dynamics of Tag Distributions
While earlier we provided a method for detecting a power-law distribution in the
tags of a site or collection of sites, now we move to another aspect of the problem,
namely how the shape of these distributions develops in time from the tagging
 