Practical Problem Solving - Java Data Mining: Strategy, Standard, and Practice


35.  lResultSetCount =
36.      lStatement.executeQuery(lSQLCountQuery);
37.  int lOtherTotalSize = 0;
38.  Map lOtherCategories = new HashMap();
39.  while (lResultSetCount.next()) {
40.      int lCategorySize = lResultSetCount.getInt(1);
41.      String lCategory = lResultSetCount.getString(2);
42.      lOtherCategories.put(lCategory, new Integer(lCategorySize));
43.      lOtherTotalSize += lCategorySize;
44.  }

To compute and compare frequencies in the two cases, we store the results in hash maps and accumulate the total number of customers in each case. We then scan through the two maps, compare the two populations category by category, and report the result to the user. Returning to the situation in which we apply this to “PurchaseA,” we could have received the following results for cluster 1:

Count   PurchaseA
 2345   1
45603   2
21342   3

And the following results for the population not in cluster 1:

Count   PurchaseA
17546   1
31846   2
 4275   3
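As a rough illustration of what a profile in terms of relative frequencies means for these two result sets, the sketch below divides each category count by the total of its population. The class name ProfileFrequencies and the method relativeFrequencies are hypothetical names chosen for this example, not part of the book's listing. Category labels are held as strings, as in the listing's getString(2) call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ProfileFrequencies {

    // Hypothetical helper: converts category counts into relative frequencies
    // (count divided by the total of all counts in the map).
    static Map<String, Double> relativeFrequencies(Map<String, Integer> counts) {
        long total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        Map<String, Double> freqs = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // An empty population yields frequency 0.0 rather than a division by zero.
            double freq = (total == 0) ? 0.0 : (double) entry.getValue() / total;
            freqs.put(entry.getKey(), freq);
        }
        return freqs;
    }

    public static void main(String[] args) {
        // Sample counts for PurchaseA taken from the two result sets in the text.
        Map<String, Integer> cluster1 = new LinkedHashMap<>();
        cluster1.put("1", 2345);
        cluster1.put("2", 45603);
        cluster1.put("3", 21342);

        Map<String, Integer> others = new LinkedHashMap<>();
        others.put("1", 17546);
        others.put("2", 31846);
        others.put("3", 4275);

        Map<String, Double> clusterFreqs = relativeFrequencies(cluster1);
        Map<String, Double> otherFreqs = relativeFrequencies(others);
        for (String category : clusterFreqs.keySet()) {
            System.out.printf("PurchaseA=%s  cluster 1: %.4f  others: %.4f%n",
                    category, clusterFreqs.get(category), otherFreqs.get(category));
        }
    }
}
```

Comparing the two columns of output shows, category by category, where cluster 1 differs from the rest of the population.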

These two result sets are saved in two Java maps so that the profile can be computed in terms of relative frequencies, together with the distance between the two distributions. The distance is based on the average between the frequencies of the categories for each population, as shown at line 58.

45.  Set lCategoryNames = lCategories.keySet();
46.  Iterator lIter;
47.  for (lIter = lCategoryNames.iterator();
48.       lIter.hasNext(); ) {
49.      String lCategory = (String)lIter.next();
50.      double lOtherCategoryFreq = 0.0;
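The listing breaks off inside the loop, before the distance computation at line 58 is visible. The sketch below is therefore only one plausible reading, not the book's code: it assumes the distance is the mean absolute difference between the relative frequencies of each category in the two populations, and that a category missing from one map has frequency 0.0 (in the spirit of the 0.0 default at line 50). The names ProfileDistance and distance are hypothetical, and the sketch iterates over the union of both key sets rather than over the cluster's categories only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ProfileDistance {

    // ASSUMPTION: mean absolute difference of relative frequencies.
    // The book's actual formula (line 58) is not shown in this excerpt.
    static double distance(Map<String, Integer> clusterCounts, int clusterTotal,
                           Map<String, Integer> otherCounts, int otherTotal) {
        if (clusterTotal <= 0 || otherTotal <= 0) {
            throw new IllegalArgumentException("population totals must be positive");
        }
        // Union of categories seen in either population.
        Set<String> categories = new TreeSet<>(clusterCounts.keySet());
        categories.addAll(otherCounts.keySet());
        if (categories.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (String category : categories) {
            // A category absent from a map contributes a frequency of 0.0.
            double clusterFreq = clusterCounts.getOrDefault(category, 0) / (double) clusterTotal;
            double otherFreq = otherCounts.getOrDefault(category, 0) / (double) otherTotal;
            sum += Math.abs(clusterFreq - otherFreq);
        }
        return sum / categories.size();
    }

    public static void main(String[] args) {
        // Sample PurchaseA counts from the text; totals are the sums of the counts.
        Map<String, Integer> cluster1 = new HashMap<>();
        cluster1.put("1", 2345);
        cluster1.put("2", 45603);
        cluster1.put("3", 21342);
        Map<String, Integer> others = new HashMap<>();
        others.put("1", 17546);
        others.put("2", 31846);
        others.put("3", 4275);

        double d = distance(cluster1, 2345 + 45603 + 21342,
                            others, 17546 + 31846 + 4275);
        System.out.println("Distance for PurchaseA: " + d);
    }
}
```

Under this assumed definition the result is 0.0 for identical profiles and grows as the distributions diverge, so attributes can be ranked by how strongly they distinguish the cluster from the rest of the population.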
