Similar Posts
Gradient Boosting
In gradient boosting decision trees, we combine many weak learners to come up with one strong learner. The weak learners here are the individual decision trees. All the trees are connected in series and each tree tries to minimize the error of the previous tree. Due to this sequential connection, boosting algorithms are usually slow…
Model-based methods
So far, we have seen how we can use deletion methods and imputation methods to handle missing values in a dataset. These univariate methods used for missing value imputation are simplistic ways of estimating the value and may not always provide an accurate picture. For example, let us say we have variables related to the…

Min-Max Normalization
Min-max normalization is one of the most popular ways to normalize data. For every feature, the minimum value of that feature gets transformed into a 0, the maximum value gets transformed into a 1, and every other value gets transformed into a value between 0 and 1. It is calculated by the following formula: v’…
Everything you need to know about Model Fitting in Machine Learning
What is Model Fitting? Model fitting is a measure of how well a machine learning model generalizes to similar data to that on which it was trained. The generalization of a model to new data is ultimately what allows us to use machine learning algorithms every day to make predictions and classify data. The definition of a…
BOOSTING
The term ‘Boosting’ refers to a family of algorithms that converts weak learners to strong learners. Let’s understand this definition in detail by solving a problem: Let’s suppose that, given a data set of images containing images of cats and dogs, you were asked to build a model that can classify these images into two…
Confusion Matrix for Model Selection
Before we jump into calculating Accuracy, Precision, and Recall for our classification model, we first need to understand what a Confusion matrix is. In machine learning, Classification is used to split data into categories. But after cleaning and preprocessing the data and training our model, how do we know if our classification model performs well?…