These days there is a common belief that more features automatically give a model better discriminative power, but this does not hold in every situation. In practice, if you plot model performance against the number of features, performance rises at first, peaks, and then starts to fall as more and more features are added for a fixed amount of training data.

Distance-based methods (such as k-means or k-nearest neighbours) are especially sensitive to this, because every extra irrelevant feature adds noise to the distance calculation. To handle this problem we should avoid irrelevant features when building the feature set, since they only add noise to what the model has to learn, and the damage is worst when the training dataset is limited. In the data science world, this is called the curse of dimensionality; the sketch below illustrates the effect.
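Here is a minimal sketch of that idea (my own toy setup, not taken from any particular dataset): a small synthetic classification problem is padded with purely random noise features, and the cross-validated accuracy of a k-nearest-neighbours classifier is measured as the noise grows.

```python
# Toy experiment: how do irrelevant features affect a distance-based model?
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.RandomState(0)

# Small dataset: 200 samples, 5 genuinely informative features.
X, y = make_classification(n_samples=200, n_features=5, n_informative=5,
                           n_redundant=0, random_state=0)

for n_noise in [0, 5, 20, 50, 100]:
    noise = rng.normal(size=(X.shape[0], n_noise))   # irrelevant columns
    X_noisy = np.hstack([X, noise])
    score = cross_val_score(KNeighborsClassifier(), X_noisy, y, cv=5).mean()
    print(f"{n_noise:3d} noise features -> mean CV accuracy {score:.3f}")
```

With only 200 training samples, accuracy typically drops noticeably once the noise features outnumber the informative ones, which is exactly the curve described above.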
In simple words, the curse of dimensionality means that as the number of training features grows, the model needs far more data and computational resources to learn anything reliable.
The standard remedy for this problem is feature reduction. There are two ways to do feature reduction:
- Feature Selection
- Feature Extraction
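A rough sketch of the difference (my own illustration, using the Iris dataset purely as an example): selection keeps a subset of the original columns, while extraction builds new columns out of all of them.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Feature selection: keep the 2 original features with the highest F-score.
X_selected = SelectKBest(f_classif, k=2).fit_transform(X, y)

# Feature extraction: project onto 2 new components that mix all 4 features.
X_extracted = PCA(n_components=2).fit_transform(X)

print(X_selected.shape, X_extracted.shape)   # (150, 2) (150, 2)
```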
In this module, we are going to explore the feature selection approach to feature reduction. Feature selection means choosing a subset of the original features and discarding the rest, rather than transforming them into new ones.
If n is the size of the feature set, there are 2^n possible subsets, so trying them all is rarely feasible. Before starting this procedure, it also helps to remove highly correlated features from the dataset, since redundant features add little information (see the sketch below).
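A simple sketch of that pre-step (the column names and the 0.9 threshold are arbitrary choices for illustration): compute the absolute correlation matrix and drop one feature from every highly correlated pair.

```python
# With n features there are 2**n candidate subsets, so we prune redundancy first.
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(100, 4), columns=["f1", "f2", "f3", "f4"])
df["f4"] = df["f1"] * 0.95 + np.random.rand(100) * 0.05   # make f4 nearly a copy of f1

corr = df.corr().abs()
# Keep only the upper triangle so each pair is checked once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]

print("dropping:", to_drop)          # likely ['f4']
df_reduced = df.drop(columns=to_drop)
```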
Feature Selection:
There are two standard ways to run this search:
- Forward Selection: start with an empty feature set and, at each step, add the single feature that improves the model the most, stopping when further additions no longer help (see the sketch after this list).
- Backward Selection: start with the full feature set and, at each step, remove the feature whose removal hurts the model the least.
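Here is a hedged sketch of both directions using scikit-learn's SequentialFeatureSelector (available in scikit-learn 0.24 and later); the Iris data, the logistic-regression estimator, and the target of 2 features are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Forward: start empty, greedily add the feature that raises the CV score most.
forward = SequentialFeatureSelector(model, n_features_to_select=2,
                                    direction="forward").fit(X, y)

# Backward: start with all features, greedily drop the least useful one.
backward = SequentialFeatureSelector(model, n_features_to_select=2,
                                     direction="backward").fit(X, y)

print("forward keeps :", forward.get_support())
print("backward keeps:", backward.get_support())
```

Both directions are greedy, so they avoid the 2^n exhaustive search but are not guaranteed to find the single best subset.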
So far we have been evaluating whole combinations of features and optimizing the model over them, which can get expensive. A cheaper alternative is to score each feature on its own and keep only the features that clear a cutoff; those features then form the set used to build the model (see the sketch after the list below).
There are several ways to score individual features, including:
- Pearson correlation coefficient
- F-score
- Chi-square
- Signal-to-noise ratio
- Mutual information
These methods will help you identify the features that are genuinely useful.
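As a sketch of this filter-style scoring (again on Iris, purely as an example), scikit-learn exposes several of these scores directly; Pearson correlation is computed here with NumPy as a rough extra signal.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2, f_classif, mutual_info_classif

X, y = load_iris(return_X_y=True)

f_scores, _ = f_classif(X, y)                            # ANOVA F-score per feature
chi_scores, _ = chi2(X, y)                               # chi-square (needs non-negative X)
mi_scores = mutual_info_classif(X, y, random_state=0)    # mutual information
pearson = [np.corrcoef(X[:, i], y)[0, 1] for i in range(X.shape[1])]  # rough signal only

print("F-scores   :", np.round(f_scores, 2))
print("chi-square :", np.round(chi_scores, 2))
print("mutual info:", np.round(mi_scores, 2))
print("pearson    :", np.round(pearson, 2))

# Keep the 2 best features according to the F-score cutoff.
X_best = SelectKBest(f_classif, k=2).fit_transform(X, y)
print("reduced shape:", X_best.shape)
```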
I hope all this helps you choose a good feature set and build a better model.