3) Pruning to Reduce Overfitting
This is Part 3 of our article on how to reduce overfitting. Let’s begin: by default, a decision tree model is allowed to grow to its full depth. Pruning is the process of reducing the size of the tree by turning some branch nodes into leaf nodes (the opposite of splitting) and removing the leaf nodes under the original branch. By tuning the hyperparameters of the decision tree model, you can prune the tree and prevent it from overfitting.
There are two types of pruning:
- Pre-pruning
- Post-pruning
a. Pre-pruning
The pre-pruning technique refers to stopping the growth of the decision tree early, before the tree is fully grown.
Pre-pruning a decision tree involves using a ‘termination condition’ to decide when it is desirable to terminate some of the branches prematurely as the tree is generated.
There are various approaches to pre-pruning:
Minimum number of objects pruning: in this method, a minimum number of instances per node is specified as a threshold value. Whenever a split would yield a child leaf representing fewer instances than the threshold, the parent node and its children are compressed into a single node (a code sketch follows this list).
Chi-square pruning: here the decision is based only on the distribution of classes induced by the single split at the node, and not on the decisions made by growing a full subtree below it, as in the case of post-pruning. A split is kept only if a chi-square test finds a significant difference between the class distributions of its children (a sketch also follows below).
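To make the first approach concrete, here is a minimal sketch using scikit-learn, whose min_samples_leaf hyperparameter implements this threshold: any candidate split that would create a leaf with fewer samples is simply never made. The Iris dataset and the threshold of 10 are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Unpruned tree: keeps splitting until every leaf is pure.
full_tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Pre-pruned tree: any split that would create a leaf with fewer than
# 10 training samples is rejected, so those branches are never grown.
pruned_tree = DecisionTreeClassifier(
    min_samples_leaf=10, random_state=42
).fit(X_train, y_train)

print("full tree depth:  ", full_tree.get_depth())
print("pruned tree depth:", pruned_tree.get_depth())
print("full tree test accuracy:  ", full_tree.score(X_test, y_test))
print("pruned tree test accuracy:", pruned_tree.score(X_test, y_test))
```

Related hyperparameters such as min_samples_split and max_depth impose the same kind of early stopping from different angles.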
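The chi-square test itself can be sketched with scipy.stats.chi2_contingency. The split_is_significant helper below is a hypothetical illustration, not part of any library; a tree-growing loop would call it for every candidate split and keep the node as a leaf whenever the test fails.

```python
import numpy as np
from scipy.stats import chi2_contingency

def split_is_significant(y_left, y_right, classes, alpha=0.05):
    """Hypothetical pre-pruning check: accept a candidate split only if
    the class distributions of the two (non-empty) children differ
    significantly according to a chi-square test."""
    # Build a 2 x n_classes contingency table of child class counts.
    table = np.array([
        [np.sum(y_left == c) for c in classes],
        [np.sum(y_right == c) for c in classes],
    ])
    # Drop classes absent from both children so expected counts stay nonzero.
    table = table[:, table.sum(axis=0) > 0]
    if table.shape[1] < 2:
        return False  # only one class reaches this node: nothing to separate
    _, p_value, _, _ = chi2_contingency(table)
    return p_value < alpha

# A split that cleanly separates the classes is significant...
print(split_is_significant(np.array([0] * 20), np.array([1] * 20), classes=[0, 1]))
# ...while one that barely changes the class mix is not.
print(split_is_significant(np.array([0] * 10 + [1] * 9),
                           np.array([0] * 9 + [1] * 10), classes=[0, 1]))
```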
b. Post-pruning
Post-pruning first allows the tree to grow to its full depth and perfectly classify the training set, and only then prunes it: we generate the complete decision tree and then remove its non-significant branches, adjusting the tree with the aim of improving accuracy on unseen instances.
There are two principal methods of doing this. One widely used method begins by converting the tree to an equivalent set of rules. Another common approach retains the decision tree but replaces some of its subtrees with leaf nodes, thus converting the complete tree into a smaller pruned one that classifies unseen instances at least as accurately.
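Scikit-learn provides a concrete instance of the second (subtree-replacement) family: minimal cost-complexity pruning, controlled by the ccp_alpha parameter. Here is a minimal sketch using the built-in breast cancer dataset; note that for brevity it picks alpha on the test set, whereas in practice you would choose it by cross-validation on the training data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Grow the complete tree first, then ask for the sequence of effective
# alphas at which successive subtrees would be collapsed into leaves.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
path = full_tree.cost_complexity_pruning_path(X_train, y_train)

# Refit once per alpha and keep the pruned tree that generalises best.
best_alpha, best_score = 0.0, 0.0
for alpha in path.ccp_alphas:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha)
    score = tree.fit(X_train, y_train).score(X_test, y_test)
    if score > best_score:
        best_alpha, best_score = alpha, score

print(f"best ccp_alpha = {best_alpha:.5f}, test accuracy = {best_score:.3f}")
```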
To continue learning related concepts, check out this article on Pruning:
https://towardsdatascience.com/build-better-decision-trees-with-pruning-8f467e73b107