6.1 Experimentation protocol
We fused TV program descriptions acquired from two sources, the DVB metadata stream and an
on-line TV magazine, for sixteen TV channels over a period of 24 hours.
We used the TV program descriptions provided by the INAthèque as reference data to evaluate
our fusion. The INA records the descriptions of all the programs broadcast on French TV
and radio. We therefore know whether a fused program corresponds to the program that was
actually broadcast.
For our experiment, every 5 minutes we query the two sources of information for the next
program scheduled on one channel. The two TV program descriptions that they return
are then fused using one of the fusion strategies.
After fusion, we compare the fused TV program descriptions to the INA reference data. If
the titles, subtitles, channels etc. are compatible, the fused program description is considered
to be correct, that is, it matches the program that was actually broadcast. The results that
we obtained are detailed in the following sections.
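The comparison step above can be sketched as follows. This is only an illustration, not the authors' implementation: the field names in `FIELDS`, the `strict_equal` test, and the choice to skip fields missing on either side are all assumptions, since the text only lists "titles, subtitles, channels etc." without defining compatibility for the evaluation.

```python
from typing import Callable, Dict, Optional

# Hypothetical field names; the text only mentions "titles, subtitles, channels etc."
FIELDS = ("title", "subtitle", "channel")


def strict_equal(a: str, b: str) -> bool:
    """Placeholder compatibility test: equality ignoring case and outer spaces."""
    return a.strip().lower() == b.strip().lower()


def is_correctly_found(
    fused: Dict[str, Optional[str]],
    reference: Dict[str, Optional[str]],
    compatible: Callable[[str, str], bool] = strict_equal,
) -> bool:
    """Return True if every field present on both sides is compatible."""
    for field in FIELDS:
        fused_value = fused.get(field)
        reference_value = reference.get(field)
        # Assumption: a field missing on either side does not contradict the reference.
        if fused_value is None or reference_value is None:
            continue
        if not compatible(fused_value, reference_value):
            return False
    return True
```

A stricter evaluation could instead treat a missing field as a mismatch; the text does not say which convention was used.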
6.2 Fusion strategies
We measured the quality of the fusion obtained with different strategies. To this
end, we ran our experiments using the fusion platform first with no
strategy and then with three different ones. The first experiment (no fusion strategy) is
equivalent to using the initial maximal join operator for information fusion.
The strategies that encode domain knowledge are the following:
Strategy 1 extends date compatibility. Two dates are compatible if the difference between
them is less than five minutes. If two dates are compatible but different, the fused date
should be the earlier one if it is a “begin date” and the later one otherwise.
Strategy 2 extends date and title compatibility. Date compatibility is the same as for
strategy 1. Two titles are compatible if one of them is contained in the other. If two
titles are compatible but different, the fused title should be the longer one.
Strategy 3 extends date and title compatibility. Date compatibility is the same as for
strategy 1. Two titles are compatible if the length of their common substrings exceeds a
threshold. If two titles are compatible but different, the fused title should be the longer
one.
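The compatibility and fusion rules of the three strategies can be sketched as follows. This is a minimal illustration under stated assumptions, not the fusion platform's code: all function names are hypothetical, the threshold value of strategy 3 is not given in the text (5 characters is an arbitrary choice), and "length of the common substrings" is interpreted here as the length of the longest common substring.

```python
from datetime import datetime, timedelta
from difflib import SequenceMatcher
from typing import Callable, Optional

TOLERANCE = timedelta(minutes=5)  # from the text: "less than five minutes"


def dates_compatible(a: datetime, b: datetime) -> bool:
    """Strategies 1 to 3: dates are compatible if they differ by less than 5 minutes."""
    return abs(a - b) < TOLERANCE


def fuse_dates(a: datetime, b: datetime, is_begin: bool) -> Optional[datetime]:
    """Earlier date for a begin date, later one otherwise; None if incompatible."""
    if not dates_compatible(a, b):
        return None
    return min(a, b) if is_begin else max(a, b)


def titles_compatible_containment(a: str, b: str) -> bool:
    """Strategy 2: one title is contained in the other."""
    return a in b or b in a


def longest_common_substring_length(a: str, b: str) -> int:
    """Length of the longest contiguous block shared by both strings."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def titles_compatible_threshold(a: str, b: str, threshold: int = 5) -> bool:
    """Strategy 3 (assumed reading): longest common substring exceeds a threshold."""
    return longest_common_substring_length(a, b) > threshold


def fuse_titles(a: str, b: str, compatible: Callable[[str, str], bool]) -> Optional[str]:
    """Keep the longer of two compatible titles; None if they are incompatible."""
    if a != b and not compatible(a, b):
        return None
    return a if len(a) >= len(b) else b
```

For example, "Journal de 20h" and "Le Journal de 20 heures" are not compatible under the containment test, but they share the 13-character substring "Journal de 20", so the threshold test accepts them and the fused title is the longer one.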
6.3 On the usefulness of fusion strategies
As a first analysis, we compared the percentage of programs that were correctly found
after fusion, with respect to the reference data, and looked at the variations resulting from
the use of the different strategies. Figure 16 shows the results that we obtained on a
representative selection of TV channels. As expected, the fusion of observations using the
maximal join operation alone is not sufficient: only the descriptions with strictly identical
values are fused. Applying the three fusion strategies described above, we can see that the
more the compatibility constraints between two values are relaxed, the better the results are.
This amounts to injecting more and more domain knowledge into the fusion process.
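The measure used here, the percentage of correctly found programs per channel, can be computed as in the sketch below. The function name and the input format, a list of (channel, is_correct) pairs, are assumptions made for illustration; no figures from the experiment are reproduced.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def percent_correct(outcomes: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Map each channel to the percentage of fused programs judged correct."""
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for channel, is_correct in outcomes:
        totals[channel] += 1
        correct[channel] += int(is_correct)
    return {channel: 100.0 * correct[channel] / totals[channel] for channel in totals}
```

Running it once per strategy on the same set of queries gives the per-channel values that a comparison such as the one in Figure 16 plots side by side.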
The experiments that we carried out showed that the quality of the fusion
process is not uniform: it varies with several parameters. Among the parameters on which
the fusion results depend are the time of day and the specificity of the channel.
For less popular channels (BFM...) and at periods of low audience (early morning), we
observed many errors in the programs given by the TV magazine.