TABLE 2.9: BEP metric distances.

        AJ      CNN     DN      IHT
AJ      -       0.6165  0.6709  0.6852
CNN     0.6165  -       0.5682  0.5735
DN      0.6709  0.5682  -       0.4663
IHT     0.6852  0.5735  0.4663  -

FIGURE 2.3: This plot shows the relative distance between news outlets,
using the BEP metric described in the text.
2.6.2 Distance Based on Choice of Topics
The second approach calculated the distance from the intersection of topics
covered by each of the news outlets. To discover these intersections, we
described each news outlet by a vector of binary features, where each feature
corresponds to one news article from our collection. A feature in the vector
of a news outlet has value 1 if the corresponding article originates from that
outlet or is a mate of an article from that outlet; otherwise its value is 0.
We then used cosine similarity to compare the vectors.
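
The following is a minimal sketch of this representation, not the authors' implementation. It assumes three hypothetical outlets A, B, and C, one article from each (a1, b1, c1), and a hypothetical set of detected mate pairs in which a1 and b1 are mates and b1 and c1 are mates. The helper names outlet_vector and cosine_similarity are illustrative only.

```python
import math


def outlet_vector(outlet, articles, source, mates):
    """Build the binary feature vector of one outlet.

    A feature is 1 if the article originates from the outlet or is a
    mate of an article that originates from the outlet, and 0 otherwise.
    """
    own = {a for a in articles if source[a] == outlet}
    vector = []
    for a in articles:
        # Mate pairs are unordered, so check both orientations.
        is_mate = any((a, o) in mates or (o, a) in mates for o in own)
        vector.append(1 if a in own or is_mate else 0)
    return vector


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_u * norm_v)


# Hypothetical toy data (assumed for illustration, not from the text's corpus).
articles = ["a1", "b1", "c1"]
source = {"a1": "A", "b1": "B", "c1": "C"}
mates = {("a1", "b1"), ("b1", "c1")}  # a1-c1 was NOT detected as a mate pair

vectors = {o: outlet_vector(o, articles, source, mates) for o in ("A", "B", "C")}
sim_ab = cosine_similarity(vectors["A"], vectors["B"])
sim_bc = cosine_similarity(vectors["B"], vectors["C"])
sim_ac = cosine_similarity(vectors["A"], vectors["C"])

print(vectors)
print(round(sim_ab, 3), round(sim_bc, 3), round(sim_ac, 3))
```

With this toy input, A and C still share the feature for b1, so their similarity is nonzero (0.5), but it is lower than the similarity for A and B or for B and C (roughly 0.82 each), because the a1-c1 mate relation was missed.
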
The effect of this representation is that we compare two news outlets based
on their choice of which events to cover. If news outlets A and B both covered
the same event, then there is a news article a1 from A and an article b1 from
B that both cover that event. If our matching algorithm discovered that these
two articles are mates, then both A and B will have a value of 1 for the
features corresponding to a1 and b1, and will therefore be more similar. If
there is a news outlet C which also covered the event with article c1, and our
algorithm only managed to discover that c1 is a mate of b1, then this approach
will still match news outlets A and C on this event, since they both have mate
articles to the article b1. However, the matching score will be lower than
between A and B or between B and C. Results of
 