    weight group
      4.17  ctrl
      5.58  ctrl
      4.81  trt1
      6.31  trt2
      5.12  trt2
In this example, we'll recode the continuous variable weight into a categorical variable, wtclass, using the cut() function:
    pg$wtclass <- cut(pg$weight, breaks = c(0, 5, 6, Inf))
    pg
    weight group wtclass
      4.17  ctrl   (0,5]
      5.58  ctrl   (5,6]
      4.81  trt1   (0,5]
      4.17  trt1   (0,5]
      6.31  trt2 (6,Inf]
      5.12  trt2   (5,6]
Discussion
For three categories we specify four bounds, which can include Inf and -Inf. If a data value falls outside of the specified bounds, it's categorized as NA. The result of cut() is a factor, and you can see from the example that the factor levels are named after the bounds.
To change the names of the levels, set the labels:
    pg$wtclass <- cut(pg$weight, breaks = c(0, 5, 6, Inf),
                      labels = c("small", "medium", "large"))
    pg
    weight group wtclass
      4.17  ctrl   small
      5.58  ctrl  medium
      4.81  trt1   small
      4.17  trt1   small
      6.31  trt2   large
      5.12  trt2  medium
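The points made in the Discussion about bounds, NA, and the factor result can be checked with a small sketch. The vector w below is hypothetical (it is not part of the pg data) and was chosen only so that one value falls below the lowest bound.

```r
# Hypothetical weights, chosen only to illustrate the behaviour;
# -1 lies below the lowest bound of 0
w <- c(-1, 4.17, 5.58, 6.31)

# Three categories need four bounds
wc <- cut(w, breaks = c(0, 5, 6, Inf))

is.factor(wc)   # the result of cut() is a factor
levels(wc)      # the levels are named after the bounds
is.na(wc)       # only the out-of-range value becomes NA

# Using -Inf as the lowest bound leaves no finite value out of range
wc_all <- cut(w, breaks = c(-Inf, 5, 6, Inf))
anyNA(wc_all)
```

Whether to widen the bounds with -Inf or keep the NA depends on the data: an NA can be a useful signal that a value was not expected.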
As indicated by the factor levels, the bounds are by default open on the left and closed on the right. In other words, each interval excludes its lower bound but includes its upper bound.
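A quick way to see the default interval behaviour is to classify values that sit exactly on the bounds. This is a minimal sketch; the vector edges is a hypothetical input, not part of the pg data.

```r
# Hypothetical values placed exactly on the bounds used in the example
edges <- c(0, 5, 6)

# Default behaviour: each interval is open on the left, closed on the right
edge_class <- cut(edges, breaks = c(0, 5, 6, Inf))

# 0 is excluded from (0,5] and so becomes NA;
# 5 falls in (0,5] and 6 falls in (5,6]
as.character(edge_class)
```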
For the smallest category, you can have it include both the lower and upper values by setting