• It divides up the data on each branch point without losing any of the
data (the number of total records in a given parent node is equal to
the sum of the records contained in its two children).
• Most importantly, the outputs are simple rules that are extremely
easy for business users to understand. You may also build some intu-
ition about your customer base. For example, "Are customers with
different family sizes truly different?"
It turns out that we collect very similar records at each leaf. So, we can use the
median or mean of the records at a leaf as the predicted value for all new records
that satisfy the same conditions. Such trees are called regression trees.
Decision trees are robust to errors, both errors in classifications of the training ex-
amples and errors in the attribute values that describe these examples. Decision tree
methods can be used even when some training examples have unknown values (for
example, if the age is known for only some of the training examples). Every starting
or terminating point in a decision tree is called a node and the connections between
nodes are branches.
There are three types of nodes and two types of branches in decision trees:
• Decision node : Represented by a square, a decision node marks a
point in the tree where a decision needs to be taken.
• Event node : An event node represents a point where the choice of
option ends. Event nodes are represented by a circle.
• Terminal node : These nodes represent the final outcome for every
flow and this is where the tree ends.
• Decision branch : These branches start from a decision node and
connect it to an event, decision, or terminal node.
• Event branch : These branches connect an event node to another
event, decision, or terminal node.
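The three node types and the way branches connect them can be sketched as a small data structure. The classes, node names, probabilities, and payoffs below are invented for illustration (this is not a library API); evaluation follows the usual decision-tree convention of taking expected value at event nodes and the best option at decision nodes.

```python
class TerminalNode:  # final outcome for a flow; the tree ends here
    def __init__(self, value):
        self.value = value

class EventNode:  # drawn as a circle: the choice of option ends here
    def __init__(self, branches):
        # event branches: (probability, child) pairs to the next node
        self.branches = branches

class DecisionNode:  # drawn as a square: a decision needs to be taken
    def __init__(self, branches):
        # decision branches: (label, child) pairs, one per option
        self.branches = branches

def evaluate(node):
    """Expected value at event nodes, best option at decision nodes."""
    if isinstance(node, TerminalNode):
        return node.value
    if isinstance(node, EventNode):
        return sum(p * evaluate(child) for p, child in node.branches)
    return max(evaluate(child) for _, child in node.branches)

# hypothetical tree: decide whether to launch; launching leads to an
# event node whose branches carry the chance outcomes
tree = DecisionNode([
    ("launch", EventNode([(0.6, TerminalNode(100)), (0.4, TerminalNode(-50))])),
    ("hold",   TerminalNode(0)),
])
print(evaluate(tree))  # 0.6*100 + 0.4*(-50) = 40.0
```

Note how each branch type appears: the decision branches leave the square node, and the event branches leave the circle, each ending at an event, decision, or terminal node.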