        double lCategoryFreq =
            (double)(((Integer)(lCategories.get(lCategory))).intValue())
            / (double)(lTotalSize);
        if (lOtherCategories.containsKey(lCategory)) {
            lOtherCategoryFreq =
                (double)(((Integer)(lOtherCategories.get(lCategory))).intValue())
                / (double)lOtherTotalSize;
        }
        report("Category: " + lCategory + ": " + lCategoryFreq
            + " to be compared to: " + lOtherCategoryFreq);
        // Accumulate the absolute frequency difference for this category.
        lDistance += java.lang.Math.abs(lCategoryFreq - lOtherCategoryFreq);
    }
We also scan through the second map so that categories present there
but absent from the first map are not neglected.
    Set lOtherCategoryNames = lOtherCategories.keySet();
    for (lIter = lOtherCategoryNames.iterator(); lIter.hasNext(); ) {
        String lOtherCategory = (String)lIter.next();
        // This category is absent from the first map, so its frequency there is 0.
        double lCategoryFreq = 0.0;
        if (!lCategories.containsKey(lOtherCategory)) {
            double lOtherCategoryFreq =
                (double)(((Integer)(lOtherCategories.get(lOtherCategory))).intValue())
                / (double)(lOtherTotalSize);
            report("Category: " + lOtherCategory + ": " + lCategoryFreq
                + " to be compared to: " + lOtherCategoryFreq);
            lDistance += java.lang.Math.abs(lCategoryFreq - lOtherCategoryFreq);
        }
    }
    return lDistance;
}
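The two passes in the listing can also be folded into a single loop over the union of both key sets. The following is a minimal sketch of that alternative, not the book's implementation: it assumes Java 8 or later (generics and `Map.getOrDefault`), and the class name `CategoryDistanceSketch`, the method `distance`, and its parameters are hypothetical names introduced for illustration. A category missing from one map is treated as having frequency 0, which matches the behaviour described above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CategoryDistanceSketch {

    // Sums |freqA - freqB| over every category seen in either map.
    // A category missing from one map counts as frequency 0 in that map.
    static double distance(Map<String, Integer> countsA, int totalA,
                           Map<String, Integer> countsB, int totalB) {
        if (totalA <= 0 || totalB <= 0) {
            throw new IllegalArgumentException("totals must be positive");
        }
        // Union of the category names of both populations.
        Set<String> allCategories = new HashSet<>(countsA.keySet());
        allCategories.addAll(countsB.keySet());

        double distance = 0.0;
        for (String category : allCategories) {
            double freqA = countsA.getOrDefault(category, 0) / (double) totalA;
            double freqB = countsB.getOrDefault(category, 0) / (double) totalB;
            distance += Math.abs(freqA - freqB);
        }
        return distance;
    }

    public static void main(String[] args) {
        // Illustrative counts only; they do not come from the book's data.
        Map<String, Integer> a = new HashMap<>();
        a.put("yes", 3);
        a.put("no", 1);
        Map<String, Integer> b = new HashMap<>();
        b.put("yes", 1);
        b.put("maybe", 1);
        // yes: |0.75 - 0.5| = 0.25, no: |0.25 - 0| = 0.25, maybe: |0 - 0.5| = 0.5
        System.out.println(distance(a, 4, b, 2)); // prints 1.0
    }
}
```

Iterating over the union visits each category exactly once, so no category is counted twice and none is skipped, at the cost of building one temporary set.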
The returned distance can be used to sort the attributes, showing,
for example, those that differ most between the two populations.
Another implementation could define an object that holds the
compared profile for later graphic display as histograms, as shown
in Figure 12-5, which uses the values from the “PurchaseA”
statistics example.
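As a rough illustration of sorting attributes by the returned distance, the sketch below ranks attribute names from most to least different. It assumes the distances have already been computed and collected into a map. The class name `AttributeRankingSketch`, the method `rankByDistance`, and the attribute names and values other than “PurchaseA” are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AttributeRankingSketch {

    // Returns attribute names ordered from the largest distance to the smallest.
    static List<String> rankByDistance(Map<String, Double> distanceByAttribute) {
        List<Map.Entry<String, Double>> entries =
                new ArrayList<>(distanceByAttribute.entrySet());
        // Sort by the distance value, largest first.
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());

        List<String> ranked = new ArrayList<>();
        for (Map.Entry<String, Double> entry : entries) {
            ranked.add(entry.getKey());
        }
        return ranked;
    }

    public static void main(String[] args) {
        // Hypothetical distances, for illustration only.
        Map<String, Double> distances = new LinkedHashMap<>();
        distances.put("Region", 0.10);
        distances.put("PurchaseA", 0.85);
        distances.put("Gender", 0.40);
        System.out.println(rankByDistance(distances)); // prints [PurchaseA, Gender, Region]
    }
}
```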
This code can be extended by profiling continuous attributes using
binned ranges of possible values, or by simply reporting the
difference between the minimums, maximums, averages and standard