to be found in a variety of phenomena. This imbalance is ultimately interpretable as the
implicit unfairness found in complex webs.
Let us set the words of a text in order of frequency of appearance, and rank them, assigning the rank $r = 1$ to the most frequently used word, the rank $r = 2$ to the word with the highest frequency after the first, and so on. The function $W(r)$ denotes the number of times the word of rank $r$ appears. Zipf found
$$W(r) = \frac{K}{r^{\eta}}, \tag{2.58}$$
where $\eta \approx 1$ and $K$ is simply determined from (2.54).
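The ranking procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `rank_frequency` and `fit_zipf` are hypothetical, words are split naively on whitespace, and $K$ and $\eta$ are estimated by an ordinary least-squares fit of $\log W$ against $\log r$, which is a simple stand-in and not the determination of $K$ from (2.54) used in the text.

```python
from collections import Counter
import math


def rank_frequency(text):
    """Return a list of (rank, word, count), rank 1 being the most frequent word."""
    words = text.lower().split()  # naive whitespace tokenisation (assumption)
    counts = Counter(words)
    # most_common() orders the words by decreasing number of appearances
    return [(r, w, c) for r, (w, c) in enumerate(counts.most_common(), start=1)]


def fit_zipf(ranked):
    """Least-squares fit of log W = log K - eta * log r; returns (K, eta)."""
    xs = [math.log(r) for r, _, _ in ranked]
    ys = [math.log(c) for _, _, c in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the log-log line is -eta
    K = math.exp(mean_y - slope * mean_x)  # intercept of the log-log line is log K
    return K, -slope
```

Applied to a real text, `rank_frequency` gives the empirical $W(r)$, and `fit_zipf` gives a rough estimate of how close the exponent is to 1. A log-log least-squares fit is known to be a crude estimator for power laws, so the result should be read as a qualitative check only.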
Let us imagine that the number of times a word appears, $W$, can be interpreted as the wealth of that word. This makes it possible to define the probability $\Pi(W)$, namely the probability that a wealth larger than $W$ exists. According to Pareto the distribution of wealth (actually Pareto's data were on the distribution of income, but we shall not dwell on the distinction between income and wealth here) is
$$\Pi(W) = \frac{A}{W^{k}}. \tag{2.59}$$
The distribution density $\psi(W)$ is given by the derivative of the probability $\Pi(W)$ with respect to the web variable $W$,
$$\psi(W) = -\frac{d}{dW}\,\Pi(W), \tag{2.60}$$
which yields
$$\psi(W) = \frac{B}{W^{a}}, \tag{2.61}$$
with the normalization constant given by
$$B = kA \tag{2.62}$$
and the new power-law index related to the old by
$$a = k + 1. \tag{2.63}$$
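As a worked check of (2.61)-(2.63), the differentiation can be written out explicitly. This is a sketch that writes $\Pi(W)$ for the probability in (2.59) and assumes $A$ and $k$ are constants independent of $W$:
$$\psi(W) = -\frac{d}{dW}\left(A\,W^{-k}\right) = kA\,W^{-k-1} = \frac{kA}{W^{k+1}},$$
so that comparison with $B/W^{a}$ gives $B = kA$ and $a = k + 1$.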
Now let us take (2.58) into account in the distribution of wealth. Imagine that we randomly select, with probability density $p(r)$, a word of rank $r$ from the collection of words with distribution number given by (2.58). A relation between the wealth variable and a continuous version of the rank variable is established using the equality between the probability of realizing wealth in the interval $(W, W + dW)$ and having the rank in the interval $(r, r + dr)$,
$$\psi(W)\,dW = p(r)\,dr. \tag{2.64}$$
We are exploring the asymptotic condition $r \gg 1$, and this makes it possible for us to move from the discrete to the continuous representation. The equality (2.64) generates a relation between the two distribution densities in terms of the Jacobian between the two variates,