2) Note that we don't need to remove the Replace Missing Values operator, because it is not
removing any observations in our data set. It only changes the values in the
Online_Gaming attribute, which won't affect our next operator. Use the search feature in
the Operators tab to find an operator called Replace. Drag this operator into your stream.
If your splines were disconnected when you deleted the sampling and filtering
operators, as is the case in Figure 3-30, you will see that they are automatically
reconnected when you add the Replace operator to the stream.
3) In the parameters pane, change the attribute filter type to single, then indicate Twitter as
the attribute to be modified. In truth, in this data set there is only one instance of the value
99 across all attributes and observations, so this change to a single attribute is not actually
necessary in this example, but it is good to be thoughtful and intentional with every step in
a data mining process. Most data sets will be far larger and more complex than the Chapter
3 data set we are currently working with. In the 'replace what' field, type the value 99, since
this is the value we're looking to replace. Finally, in the 'replace by' field, we must decide
what we want to have in the place of the 99. If we leave this field blank, then the
observation will show a missing value (?) when we run the model and switch to Data View in
the results perspective. We could also choose the mode, 'N', and given that 80% of the
survey respondents indicated that they did not use Twitter, this would seem a safe course
of action. You may choose the value you would like to use. For the book's example, we
will enter 'N' and then run our model. You can see in Figure 3-31 that we now have nine
values of 'N', and two of 'Y' for our Twitter attribute.
Figure 3-31. Replacement of inconsistent value with a consistent one.
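For readers who want to see the same replacement outside RapidMiner, here is a minimal Python sketch. It is not part of the RapidMiner workflow; the helper name replace_value and the order of the values in the list are assumptions made for illustration. Only the counts (eight 'N', two 'Y', and one stray 99 before the change) follow the text above.

```python
from collections import Counter

# Illustrative Twitter column. The order of the values is made up; the counts
# (eight 'N', two 'Y', one inconsistent '99') follow the chapter text.
twitter = ["N", "Y", "N", "N", "99", "N", "N", "Y", "N", "N", "N"]


def replace_value(values, replace_what, replace_by=None):
    """Return a copy of values with every replace_what swapped for replace_by.

    Leaving replace_by as None roughly mimics a blank 'replace by' field:
    the observation stays in the data but its value becomes missing.
    """
    return [replace_by if v == replace_what else v for v in values]


# Find the mode among the valid responses only (ignore the stray 99).
valid = [v for v in twitter if v != "99"]
mode_value, mode_count = Counter(valid).most_common(1)[0]
share = mode_count / len(valid)

# Option 1: replace 99 with the mode, as the chapter's example does.
cleaned = replace_value(twitter, "99", mode_value)
counts = Counter(cleaned)

# Option 2: leave the replacement blank, producing a missing value.
blank = replace_value(twitter, "99")

print("Mode:", mode_value, "share of valid responses:", share)
print("Counts after replacement:", dict(counts))
print("Missing values if left blank:", blank.count(None))
```

Restricting the replacement to a single column, as the list here does, mirrors the choice of the single attribute filter type: the value 99 is only touched where we have decided it is an error.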