Decision trees are among the most commonly used base techniques in classification. To describe the idea of decision trees, a simple data set might be considered:
In this dataset, x_n indicates the n-th observation; each column refers to a particular factor, while the last column, "Call for technical assistance", is the class variable with values Yes or No.
To build a decision tree for the given problem of calling the technical assistance, one might consider constructing a tree where each path from the root to a leaf represents a separate example x_n with the complete set of factors and their values corresponding to that example. This solution would provide the necessary outcome: all examples would be classified correctly. However, there are two significant problems: such a tree merely memorises the training examples and therefore cannot classify new, previously unseen observations, and its size grows with the number of examples rather than with the structure of the problem.
Referring to Occam's razor principle [1], the most desirable model is the most compact one, i.e., one using only the factors necessary to make a valid decision. This means that one should select the most relevant factor first, then the next most relevant one, and so on until the decision can be made without doubt.
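The greedy selection of "the most relevant factor" is commonly done by computing the information gain of each factor, i.e., how much splitting on it reduces the entropy of the class variable. The sketch below illustrates this on a hypothetical two-factor toy data set (the factor names and values are invented for illustration; they are not the data set from the text):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, factor_index):
    """Reduction in class entropy obtained by splitting on the given factor."""
    total = len(labels)
    # Group the class labels by the value of the chosen factor.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[factor_index], []).append(label)
    # Weighted entropy of the subsets after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: each row holds two factors
# ("device restarted", "error light"); the label is the class
# "Call for technical assistance" with values Yes / No.
rows = [("no", "on"), ("no", "off"), ("yes", "on"), ("yes", "off")]
labels = ["Yes", "No", "Yes", "No"]

# The most relevant factor is the one with the highest information gain;
# it becomes the root of the tree, and the procedure recurses on each subset.
best = max(range(len(rows[0])), key=lambda i: information_gain(rows, labels, i))
print("best factor:", best)
```

Here factor 1 ("error light") separates the classes perfectly, so its information gain equals the full class entropy and it would be placed at the root; factor 0 carries no information about the class and is never needed, in line with the compactness requirement above.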