Feature Selection Techniques in R


Working in the machine learning field is not only about building different classification or clustering models; it is just as much about feeding the right set of features into those models.

This process of selecting the right set of features mainly takes place after data collection.

Once we have enough data, we can't simply feed the entire data set into the model and expect great results; we need to pre-process the data first.

In fact, data preprocessing is the most challenging and crucial part of the machine learning process.

Below are the key tasks in the data preprocessing stage.

  • Feature transformation
  • Feature selection

Feature transformation converts existing features into other forms, for example applying the logarithmic function to bring a skewed feature onto a logarithmic scale.
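As a minimal sketch of such a transformation (using the built-in `mtcars` data set purely for illustration; the new column name `log_hp` is our own choice):

```r
# Logarithmic feature transformation on the built-in mtcars data set.
data(mtcars)

# log1p() computes log(1 + x), which is safe even when a value is zero.
mtcars$log_hp <- log1p(mtcars$hp)

# Compare the original and transformed feature side by side.
head(mtcars[, c("hp", "log_hp")])
```

The transformed column can then be used in place of (or alongside) the original feature when training a model.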

Feature selection picks the best subset of the existing features. In this article, we are going to learn basic techniques for picking the best features for modeling.
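To give a flavor of what such a technique looks like, here is a sketch of correlation-based feature selection with the `caret` package (one of the approaches listed in the table of contents below); `mtcars` and the 0.75 cutoff are illustrative choices, not prescriptions:

```r
# Correlation-based feature selection with caret
# (install.packages("caret") first if it is not available).
library(caret)

data(mtcars)
predictors <- mtcars[, -1]          # all columns except the target (mpg)

# Pairwise correlation matrix of the candidate features.
cor_matrix <- cor(predictors)

# Indices of features whose pairwise correlation exceeds 0.75.
high_cor <- findCorrelation(cor_matrix, cutoff = 0.75)

# Keep only the features that are not highly correlated with another.
selected <- predictors[, -high_cor]
names(selected)
```

Dropping one feature from each highly correlated pair reduces redundancy before the remaining features are fed into a model.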

Before we dive further, let's have a look at the table of contents.

Table of contents:

  • Why modeling is not the final step
  • The role of correlation
  • Calculating feature importance with regression methods
  • Using caret package to calculate feature importance
  • Random forest for calculating feature importance
  • Conclusion


Written by Data Science Blog
