Fig. 8.1 Co-occurrences of features in the data set. The figure represents features both as rows and
columns, with rows and columns following the same order. Each point of the resulting grid shows
the number of co-occurring values for each combination of features. The color intensity reflects the
number according to the scheme on the right-hand side. We observe a large fraction of the space
with relatively light coloring. This indicates high levels of sparsity.
describing a transaction. Each such transaction represents an item being sent to a user.
Subsequently, users decide to either keep or return the item.
Sparsity represents a vital factor for recommender systems. Figure 8.1 illustrates
the data set's sparsity levels. We compute the co-occurrences of feature values.
In doing so, we disregard the individual values and only distinguish present values
from missing values. Hence, we obtain a 263 × 263 matrix whose entries correspond
to the number of co-occurring non-missing values. In other words, the more often
two features both exhibit non-missing values in the same transaction, the higher the count. We
represent the counts with the color scheme shown on the right-hand side of the figure.
We observe that a relatively large fraction of the space shows little to no counts. Thus, we
consider our data to be highly sparse. Note that the darker regions reflect two types
of phenomena. First, some features are available for all transactions. These features
include identifiers for customer and article, references to the date and time, as well
as the user's decision to keep or return the item. Second, we observe that articles
of the same kind commonly exhibit values for a subset of features. For instance,
some features refer to shoes in particular. These attributes will lack values for all
articles other than shoes. Shoes, in contrast, will likely carry values for those features,
even though the values themselves may differ.
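The co-occurrence counts behind Fig. 8.1 can be illustrated with a minimal sketch. It assumes that each transaction is a list of feature values and that a missing value is encoded as None. The function names cooccurrence_counts and fraction_at_most, as well as the toy data, are hypothetical and chosen for illustration only. They do not stem from the original study.

```python
def cooccurrence_counts(rows):
    """For each pair of features, count the transactions in which
    both features carry a non-missing value (None marks 'missing')."""
    n_features = len(rows[0])
    counts = [[0] * n_features for _ in range(n_features)]
    for row in rows:
        # Only presence matters here, not the individual values.
        present = [j for j, value in enumerate(row) if value is not None]
        for i in present:
            for j in present:
                counts[i][j] += 1
    return counts


def fraction_at_most(counts, threshold):
    """Share of matrix cells whose count does not exceed the threshold."""
    cells = [c for row in counts for c in row]
    return sum(c <= threshold for c in cells) / len(cells)


# Toy data (assumption): 4 transactions, 3 features.
toy = [
    [1.0, None, 3.0],
    [2.0, 5.0, None],
    [4.0, None, None],
    [7.0, 8.0, 9.0],
]
counts = cooccurrence_counts(toy)
```

In this toy example the first feature is present in every transaction, so its diagonal entry equals the number of transactions. This mirrors the dark regions caused by features that are always available. Applied to the full data set, the same procedure would yield the 263 × 263 matrix, and fraction_at_most would give a rough measure of how much of the grid remains lightly colored.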
We categorize our features into five groups:
transaction-related features
type-related features
descriptive features
customer-related features
 