Similar Posts
Basic Filter Methods
Constant features show a single value across all observations in the dataset. They provide no information that helps an ML model predict the target. Quasi-constant features are those in which one value occupies the majority of the records. Duplicated features are self-explanatory: the same feature appears more than once in the dataset.
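As a rough illustration, the three checks could be sketched in plain Python as below. The column-dictionary layout, the function names, the toy data, and the 0.8 threshold are all assumptions made for this example; they are not taken from the original post.

```python
from collections import Counter

def constant_features(columns):
    """Names of columns that hold a single distinct value."""
    return [name for name, values in columns.items() if len(set(values)) <= 1]

def quasi_constant_features(columns, threshold=0.95):
    """Names of columns whose most frequent value covers at least
    `threshold` of the rows (the threshold is a free choice)."""
    flagged = []
    for name, values in columns.items():
        if not values:
            continue  # nothing to measure in an empty column
        top_count = Counter(values).most_common(1)[0][1]
        if top_count / len(values) >= threshold:
            flagged.append(name)
    return flagged

def duplicated_features(columns):
    """Names of columns that repeat an earlier column value-for-value."""
    seen = set()
    duplicates = []
    for name, values in columns.items():
        key = tuple(values)
        if key in seen:
            duplicates.append(name)
        else:
            seen.add(key)
    return duplicates

# Hypothetical toy dataset: column name -> list of values.
data = {
    "age": [23, 35, 41, 29, 52],
    "country": ["US", "US", "US", "US", "US"],
    "flag": [0, 0, 0, 0, 1],
    "age_copy": [23, 35, 41, 29, 52],
}

print(constant_features(data))              # ['country']
print(quasi_constant_features(data, 0.8))   # ['country', 'flag']
print(duplicated_features(data))            # ['age_copy']
```

Note that under this definition a constant feature is also reported as quasi-constant, since its single value covers 100% of the rows.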
Confusion Matrix for Model Selection
Before we jump into calculating accuracy, precision, and recall for our classification model, we first need to understand what a confusion matrix is. In machine learning, classification is used to split data into categories. But after cleaning and preprocessing the data and training our model, how do we know if our classification model performs well?…
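To make the idea concrete, here is a minimal sketch of a binary confusion matrix and the three metrics derived from it. The labels are made-up toy values and the helper name `confusion_counts` is hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

# Hypothetical ground-truth labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)   # of the predicted positives, how many were right
recall = tp / (tp + fn)      # of the actual positives, how many were found

print(tp, fp, fn, tn)                # 3 1 1 3
print(accuracy, precision, recall)   # 0.75 0.75 0.75
```

This sketch does not guard against division by zero, which occurs when the model predicts no positives or the data contains none.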
2. Data Integration
Data Integration is a data preprocessing technique that combines data from multiple sources such as databases (relational and non-relational), data cubes, files, etc., and provides users a unified view of these data. It gives a complete picture of key performance indicators (KPIs), customer journeys, market opportunities, etc. The data sources can be homogeneous or heterogeneous….
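As a toy illustration of a "unified view", the sketch below merges records from two in-memory sources on a shared key. The source names, the field names, and the `customer_id` key are assumptions for the example; real integration also has to handle schema conflicts and inconsistent values, which this sketch ignores.

```python
def integrate(sources, key="customer_id"):
    """Merge rows from several sources into one record per key value.
    Later sources overwrite earlier ones when field names collide."""
    unified = {}
    for rows in sources:
        for row in rows:
            unified.setdefault(row[key], {}).update(row)
    return list(unified.values())

# Two hypothetical sources describing overlapping customers.
crm_rows = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Lin"},
]
billing_rows = [
    {"customer_id": 1, "total_spend": 120.0},
    {"customer_id": 3, "total_spend": 40.0},
]

unified_view = integrate([crm_rows, billing_rows])
for record in unified_view:
    print(record)
```

Customers that appear in only one source keep just the fields that source provides, much like an outer join.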

Gradient Descent (now with a little bit of scary maths)
Buckle up, Buckaroo, because Gradient Descent is gonna be a long one (and a tricky one too). The whole article will be a lot more “mathy” than most, as it tries to cover the concepts behind a Machine Learning algorithm called Linear Regression. If you don’t know what Linear Regression is, go through this article once. It would help…
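For a taste of what that looks like in practice, here is a minimal sketch of gradient descent fitting a one-variable linear regression by minimising mean squared error. The learning rate, epoch count, and toy data are assumptions chosen so the example converges; they are not taken from the article.

```python
def gradient_descent(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by repeatedly stepping against the MSE gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Partial derivatives of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1, so the fit should approach w = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

w, b = gradient_descent(xs, ys)
print(round(w, 4), round(b, 4))
```

A learning rate that is too large for the scale of the data makes the updates diverge, so the value used here only suits this small example.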
C. Recursive Feature Elimination
Recursive Feature Elimination is a greedy optimization algorithm that aims to find the best-performing feature subset. It repeatedly builds models and sets aside the best- or worst-performing feature at each iteration. It constructs the next model with the remaining features until all the features are exhausted. It then ranks the features based on the order…
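The loop described in this excerpt could be sketched as follows. This is a simplified stand-in, not a library implementation: the model is a no-intercept linear fit trained by gradient descent, importance is taken to be the absolute weight, and features are ranked in reverse order of elimination. All three choices, along with the toy data, are assumptions for the example.

```python
def fit_linear_weights(X, y, lr=0.05, epochs=500):
    """Fit y ~ sum(w_j * x_j) by gradient descent and return the weights."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for row, target in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, row)) - target
            for j in range(d):
                grad[j] += 2 * err * row[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def recursive_feature_elimination(X, y, names):
    """Drop the weakest feature each round; return names ranked best-first."""
    remaining = list(range(len(names)))
    eliminated = []
    while remaining:
        subset = [[row[j] for j in remaining] for row in X]
        weights = fit_linear_weights(subset, y)
        # The feature with the smallest absolute weight is set aside.
        weakest = min(range(len(remaining)), key=lambda k: abs(weights[k]))
        eliminated.append(remaining.pop(weakest))
    # The last feature to be eliminated is treated as the most important.
    return [names[j] for j in reversed(eliminated)]

# Toy data with columns (noise, strong, weak); the target ignores "noise".
names = ["noise", "strong", "weak"]
X = [
    [1, 1, 1], [-1, 1, 1], [1, 1, -1], [-1, 1, -1],
    [1, -1, 1], [-1, -1, 1], [1, -1, -1], [-1, -1, -1],
]
y = [3 * strong + 0.5 * weak for _, strong, weak in X]

print(recursive_feature_elimination(X, y, names))  # ['strong', 'weak', 'noise']
```

Comparing raw weights only makes sense when features share a scale, as they do in this toy data; otherwise they would need standardising first.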
Forward selection
Forward selection uses search as a technique for selecting the best features. It is an iterative method in which we start with no features in the model. The step forward feature selection procedure begins by evaluating all feature subsets that consist of only one input variable. It selects the “best” feature and afterwards, adds…
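A minimal sketch of that search, under some loud assumptions: the model is a no-intercept linear fit trained by gradient descent, "best" means lowest training mean squared error, and the search stops after a fixed number of features. In practice a validation or cross-validated score would normally replace the training error; the helper names and toy data are hypothetical.

```python
def mse_of_linear_fit(X, y, cols, lr=0.05, epochs=500):
    """Training MSE of a linear model that uses only the columns in `cols`."""
    n = len(X)
    w = [0.0] * len(cols)
    for _ in range(epochs):
        grad = [0.0] * len(cols)
        for row, target in zip(X, y):
            err = sum(wk * row[j] for wk, j in zip(w, cols)) - target
            for k, j in enumerate(cols):
                grad[k] += 2 * err * row[j] / n
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    residuals = (
        sum(wk * row[j] for wk, j in zip(w, cols)) - target
        for row, target in zip(X, y)
    )
    return sum(r * r for r in residuals) / n

def forward_selection(X, y, n_select):
    """Start with no features; each round add the one that lowers MSE most."""
    selected = []
    candidates = list(range(len(X[0])))
    while candidates and len(selected) < n_select:
        best = min(
            candidates,
            key=lambda j: mse_of_linear_fit(X, y, selected + [j]),
        )
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy data with columns (noise, strong, weak); the target ignores "noise".
X = [
    [1, 1, 1], [-1, 1, 1], [1, 1, -1], [-1, 1, -1],
    [1, -1, 1], [-1, -1, 1], [1, -1, -1], [-1, -1, -1],
]
y = [3 * strong + 0.5 * weak for _, strong, weak in X]

print(forward_selection(X, y, 2))  # [1, 2] -> "strong" first, then "weak"
```

Each round refits one model per remaining candidate, so the cost grows quickly with the number of features.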