following code. These groupings are the basis for the new ordinal factor, spender,
with levels {small, medium, big}.
# build an empty character vector of the same length as sales
sales_group <- vector(mode="character",
length=length(sales$sales_total))
# group the customers according to the sales amount
sales_group[sales$sales_total<100] <- "small"
sales_group[sales$sales_total>=100 & sales$sales_total<500] <- "medium"
sales_group[sales$sales_total>=500] <- "big"
# create and add the ordered factor to the sales data frame
spender <- factor(sales_group, levels=c("small", "medium", "big"),
                  ordered = TRUE)
sales <- cbind(sales,spender)
str(sales$spender)
Ord.factor w/ 3 levels "small"<"medium"<..: 3 2 1 2 3 1 1 1 2 1 …
head(sales$spender)
big medium small medium big small
Levels: small < medium < big
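The same grouping can also be produced in one step with base R's cut() function. The following is a minimal sketch of that alternative, not the approach used in the code above; the sales_total values are made up for illustration.

```r
# hypothetical sales totals, for illustration only
sales_total <- c(650, 250, 40, 100, 500, 99.99)

# right = FALSE closes each interval on the left:
# [-Inf,100), [100,500), [500,Inf)
spender <- cut(sales_total,
               breaks = c(-Inf, 100, 500, Inf),
               labels = c("small", "medium", "big"),
               right = FALSE,
               ordered_result = TRUE)

# because the factor is ordered, its levels can be compared
at_least_medium <- spender >= "medium"
```

With right = FALSE, a total of exactly 100 falls into "medium" and a total of exactly 500 falls into "big", which matches the boundaries used in the vector-indexing version.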
The cbind() function combines variables column-wise, whereas the rbind()
function combines datasets row-wise. Factors matter in several R statistical
modeling functions, such as analysis of variance, aov(), presented later in
this chapter, and in contingency tables, discussed next.
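As a brief illustration of the difference between the two functions, the sketch below stacks two small data frames with rbind() and then attaches a new column with cbind(). The data frames q1 and q2 and the region vector are hypothetical and are not part of the sales dataset.

```r
# two small hypothetical data frames with the same columns
q1 <- data.frame(cust_id = c(1, 2), sales_total = c(120, 45))
q2 <- data.frame(cust_id = c(3, 4), sales_total = c(800, 310))

# rbind() stacks rows; the column names must match
both <- rbind(q1, q2)

# cbind() adds columns; the new vector needs one entry per row
region <- c("east", "west", "east", "north")
both <- cbind(both, region)

dim(both)   # number of rows, then number of columns
```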
Contingency Tables
In R, table refers to a class of objects used to store the observed counts for
each combination of factor levels in a given dataset. Such a table is commonly
referred to as a contingency table and is the basis for performing a statistical
test of the independence of the factors used to build the table. The following
R code builds a contingency table based on the sales$gender and sales$spender
factors.
# build a contingency table based on the gender and spender factors
sales_table <- table(sales$gender,sales$spender)
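To see what such a table looks like without the sales data at hand, the sketch below builds one from two short, made-up factors. The gender and spender values here are hypothetical; table() and margin.table() are base R functions.

```r
# hypothetical data, for illustration only
gender  <- factor(c("F", "M", "F", "M", "F", "M"))
spender <- factor(c("small", "small", "big", "medium", "small", "big"),
                  levels = c("small", "medium", "big"), ordered = TRUE)

# rows are the levels of the first factor, columns the levels of the second
toy_table <- table(gender, spender)
toy_table

# totals along each dimension
by_gender  <- margin.table(toy_table, 1)   # counts per gender
by_spender <- margin.table(toy_table, 2)   # counts per spender level

class(toy_table)
```

Each cell holds the number of observations that share one gender level and one spender level, so the cells sum to the number of observations.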