The working directory contains a CSV file (sample1.csv). The file has a header
row, followed by 14 rows of training data and one unlabeled test row. The
attributes are Age, Income, JobSatisfaction, and Desire. The output variable is
Enrolls, and its value is either Yes or No. The full content of the CSV file is
shown next.
Age,Income,JobSatisfaction,Desire,Enrolls
<=30,High,No,Fair,No
<=30,High,No,Excellent,No
31 to 40,High,No,Fair,Yes
>40,Medium,No,Fair,Yes
>40,Low,Yes,Fair,Yes
>40,Low,Yes,Excellent,No
31 to 40,Low,Yes,Excellent,Yes
<=30,Medium,No,Fair,No
<=30,Low,Yes,Fair,Yes
>40,Medium,Yes,Fair,Yes
<=30,Medium,Yes,Excellent,Yes
31 to 40,Medium,No,Excellent,Yes
31 to 40,High,Yes,Fair,Yes
>40,Medium,No,Excellent,No
<=30,Medium,Yes,Fair,
The last record of the CSV file is used later as an illustrative test case.
It therefore has no value for the output variable Enrolls, which is to be
predicted by the naïve Bayes classifier built from the training set.
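If sample1.csv is not already at hand, it can be recreated from the listing above. The following is a minimal sketch, assuming the file should be written to the current working directory; the variable name csv_lines is introduced here for illustration only.

```r
# Recreate sample1.csv in the current working directory.
# The rows are copied from the listing above; the last row is the test
# case, so its Enrolls field is left empty.
csv_lines <- c(
  "Age,Income,JobSatisfaction,Desire,Enrolls",
  "<=30,High,No,Fair,No",
  "<=30,High,No,Excellent,No",
  "31 to 40,High,No,Fair,Yes",
  ">40,Medium,No,Fair,Yes",
  ">40,Low,Yes,Fair,Yes",
  ">40,Low,Yes,Excellent,No",
  "31 to 40,Low,Yes,Excellent,Yes",
  "<=30,Medium,No,Fair,No",
  "<=30,Low,Yes,Fair,Yes",
  ">40,Medium,Yes,Fair,Yes",
  "<=30,Medium,Yes,Excellent,Yes",
  "31 to 40,Medium,No,Excellent,Yes",
  "31 to 40,High,Yes,Fair,Yes",
  ">40,Medium,No,Excellent,No",
  "<=30,Medium,Yes,Fair,"
)
writeLines(csv_lines, "sample1.csv")

# Confirm the file exists and has the expected number of lines:
# 1 header + 14 training rows + 1 test row.
file.exists("sample1.csv")
n_lines <- length(readLines("sample1.csv"))
n_lines
```

Running getwd() beforehand shows which directory the file will be written to.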
Execute the following R code to read data from the CSV file.
# read the data into a table from the file
sample <- read.table("sample1.csv", header = TRUE, sep = ",")
# define the data frames for the NB classifier
traindata <- as.data.frame(sample[1:14, ])
testdata <- as.data.frame(sample[15, ])
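A quick sanity check can confirm that the split came out as intended. The sketch below is not part of the original walkthrough: it inlines the CSV content through the text argument of read.table() (an assumption made so that it runs without the file on disk) and then repeats the same split. The name csv_text is hypothetical.

```r
# Assumption: the CSV content is inlined so the check is self-contained.
csv_text <- "Age,Income,JobSatisfaction,Desire,Enrolls
<=30,High,No,Fair,No
<=30,High,No,Excellent,No
31 to 40,High,No,Fair,Yes
>40,Medium,No,Fair,Yes
>40,Low,Yes,Fair,Yes
>40,Low,Yes,Excellent,No
31 to 40,Low,Yes,Excellent,Yes
<=30,Medium,No,Fair,No
<=30,Low,Yes,Fair,Yes
>40,Medium,Yes,Fair,Yes
<=30,Medium,Yes,Excellent,Yes
31 to 40,Medium,No,Excellent,Yes
31 to 40,High,Yes,Fair,Yes
>40,Medium,No,Excellent,No
<=30,Medium,Yes,Fair,"

# Same read and split as in the text, with text = instead of a file name.
sample <- read.table(text = csv_text, header = TRUE, sep = ",")
traindata <- as.data.frame(sample[1:14, ])
testdata <- as.data.frame(sample[15, ])

# The training set should have 14 rows and 5 columns, the test set 1 row.
dim(traindata)
dim(testdata)

# Class balance in the training data: 9 Yes and 5 No in the listing.
n_yes <- sum(traindata$Enrolls == "Yes")
n_no <- sum(traindata$Enrolls == "No")
c(Yes = n_yes, No = n_no)

# The test record carries no label; its Enrolls field is blank.
as.character(testdata$Enrolls)
```

The comparisons use == on the Enrolls column rather than table(), so the counts are the same whether R stores the column as character strings or as a factor.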
Two data frame objects, traindata and testdata, are created for the naïve
Bayes classifier. Enter traindata and testdata at the R prompt to display them;
R prints both data frames on the screen.