Database Reference
In-Depth Information
for 1
N. The constraint (3) expresses the upper bound restriction on the
available storage space viz. S . The contribution made by the frequent item set X j to
the profit may be expressed as p j f j z j where z j =1 if and only if X j is covered by the set
ˆ . The constraint (4) ensures a minimum revenue contribution from the set of all
covered frequent item sets. The constraints of (5) express the 0-1 restrictions of the
decision variables y i and z j . The objective function in (6) maximizes the total profit
contribution expressed as the total net revenue.
i
4 Experimental Results
To verify our ILP formulation of the OIPP, we implemented and experimentally
tested our model with real life market transaction data obtained from a Belgian retail
store [16]. The dataset (retail.txt) stores five months of transaction data collected over
four separate periods.
Retail data characteristics:
Total number of transactions 88,163
Item ID range 1- 16470
Number of items (N) 3,151
Total number of customers 5,133
Average basket size 13
Data collection period 5 months total (in four separate periods)
For further details of the data refer to [16]. Since not all characteristics of the data
are publicly available (presumed to be confidential), we supplemented them with
values for such fields as storage space required per item (s i ), revenue from selling
item package X j (p j ) and cost attributed to item i (c i ).
4.1 Data Preparation Stages
As discussed in section 3, before building the ILP model of the market data we need
to know the data characteristics. Therefore, the data is preprocessed using the
following steps to prepare it for input to the mathematical programming software:
1. Each transaction record is organized as an ascending sequence of item Ids;
2. A count of the number of items (n j ) in each transaction is inserted as the first
field of the record;
3. The records in the database are then sorted in ascending order according to the
count of items and then by the item Ids as minor keys;
4. Finally, the distinct frequent transactions are listed with their frequencies (f j ).
In the present context, a transaction is frequent if its frequency is greater than
or equal to 2. Note that the total number of distinct frequent transactions (item
sets) is denoted by k.
5. This final dataset is fed to a program (createLP) which builds the ILP model
corresponding to the current problem.
This model is then submitted to a mathematical programming application to be
solved as an ILP with binary integer variables (y i 's and z j 's).
Search WWH ::




Custom Search