SELECTION OF OPTIMAL FEATURES IN STATISTICAL MODELLING
Date: 2021
Authors: Gachoki, P. K.; Njoroge, G. G.; Muraya, M. M.
Abstract
In statistical modelling, selection of optimal features entails choosing the relevant predictor variables to be used in the development of statistical models. Most modelling studies have focused on the construction of statistical models while skipping, or failing to document, the selection of the best features, even though this step is an integral part of statistical modelling. This omission might lead to the use of duplicated features, less relevant features, or features with low variance, in addition to random features, which could result in poorly performing prediction models. This study seeks to discuss how feature selection can be done as a prerequisite for statistical modelling. Some of the methods used in the selection of the best features include: forward selection, backward elimination, recursive elimination, entropy selection, variance threshold elimination, chi-square statistics, tree-based selection, feature importance and correlation matrix with heat maps. This study is vital to researchers building statistical models, since the use of optimal features in statistical modelling would lead to high-performing statistical models.
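
As an illustration of how a few of the listed methods can be applied in practice, the sketch below uses scikit-learn on a generic built-in dataset. This is an assumed tool choice and example dataset, not the study's own data or code; the variance threshold of 0.01, the ten features kept by the chi-square test, and the random forest settings are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of three feature-selection methods
# named in the abstract: variance threshold elimination, chi-square statistics,
# and tree-based feature importance. Dataset and parameters are assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import VarianceThreshold, SelectKBest, chi2
from sklearn.ensemble import RandomForestClassifier

# Example tabular data with non-negative features (a requirement of chi2).
X, y = load_breast_cancer(return_X_y=True)

# 1. Variance threshold elimination: drop near-constant features.
vt = VarianceThreshold(threshold=0.01)
X_vt = vt.fit_transform(X)
print("Features kept by variance threshold:", X_vt.shape[1])

# 2. Chi-square statistics: keep the k features most associated with y.
skb = SelectKBest(score_func=chi2, k=10)
X_chi2 = skb.fit_transform(X, y)
print("Features kept by chi-square test:", X_chi2.shape[1])

# 3. Tree-based selection / feature importance: rank features by a forest.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
ranked = sorted(enumerate(rf.feature_importances_), key=lambda t: -t[1])
print("Top 5 feature indices by importance:", [i for i, _ in ranked[:5]])
```

Each method returns a reduced feature set (or a ranking) that can then be passed to the subsequent model-building step, so the selection stage is documented explicitly rather than skipped.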