Introduction
Feature selection is the process of choosing a subset of the features produced by a feature extraction method. One goal of feature selection is to avoid selecting more or fewer features than necessary. If too few features are selected, there is a good chance that the information content of the feature set is low. On the other hand, if too many (irrelevant) features are selected, the noise present in the data may obscure the information present. There is therefore a trade-off in selecting features [PIR04].
It should be noted that a given feature might provide more information when present with other features than when considered by itself. [PIR04] provides some references on the importance of selecting features as a set, rather than combining the individually best features to form the supposedly best set. These studies have shown that the best individual features do not necessarily constitute the best set of features.
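The point that the best individual features need not form the best set can be illustrated with a small constructed example. The sketch below is not taken from [PIR04]; the toy data, the helper name subset_accuracy, and the scoring rule (training accuracy of a majority-vote lookup table) are all assumptions chosen for illustration. The label is the XOR of features 0 and 1, so each is uninformative alone but the pair is perfectly informative, while feature 2 is a noisy copy of the label.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: each row is (f0, f1, f2).
# The label is f0 XOR f1; f2 matches the label in 6 of 8 rows.
ROWS = [
    (0, 0, 0), (0, 0, 0),
    (0, 1, 1), (0, 1, 1),
    (1, 0, 1), (1, 0, 0),
    (1, 1, 0), (1, 1, 1),
]
LABELS = [0, 0, 1, 1, 1, 1, 0, 0]


def subset_accuracy(rows, labels, subset):
    """Training accuracy of a majority-vote lookup table built on `subset`."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        groups[key][label] += 1
    # Within each group of identical feature values, predict the majority label.
    correct = sum(max(counts.values()) for counts in groups.values())
    return correct / len(labels)


# Score every feature on its own and rank from best to worst.
individual = {i: subset_accuracy(ROWS, LABELS, (i,)) for i in range(3)}
ranked = sorted(individual, key=individual.get, reverse=True)
top2_individual = tuple(sorted(ranked[:2]))

# Score every pair of features as a set and pick the best pair.
best_pair = max(
    combinations(range(3), 2),
    key=lambda s: subset_accuracy(ROWS, LABELS, s),
)

print("individual scores:", individual)
print("two best individual features:", top2_individual)
print("best pair evaluated as a set:", best_pair)
```

In this toy data the two individually best features include the noisy feature 2, yet the best pair is features 0 and 1, which rank last when scored alone.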
In most real-world applications, the best set of features is not known, and neither is the number of features in it. There is currently no general way to determine this number, which depends in part on the objective of interest.
Methods
Feature selection is generally done in one of three ways: filtering, wrapper methods, or hybrid methods.
Filtering
In this approach, features are first ranked by some metric (which we can refer to as fitness). Features with lower fitness are then discarded (filtered out), and the remaining features are used for the application. Filters are much faster than wrapper methods and can handle large datasets [DAS97].
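The rank-then-discard procedure can be sketched in a few lines. This is a minimal illustration, not a specific published algorithm: it assumes that fitness is the absolute Pearson correlation between a feature and the target, and the function names (pearson, filter_features), the parameter keep, and the sample data are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def filter_features(columns, target, keep):
    """Rank features by fitness and return the `keep` fittest names plus all scores."""
    # Assumed fitness metric: absolute correlation with the target.
    fitness = {name: abs(pearson(col, target)) for name, col in columns.items()}
    ranked = sorted(fitness, key=fitness.get, reverse=True)
    return ranked[:keep], fitness


# Hypothetical data: four candidate features and a numeric target.
columns = {
    "a": [1, 2, 3, 4, 5, 6],      # closely tracks the target
    "b": [2, 1, 4, 3, 6, 5],      # tracks the target with some noise
    "c": [3, 3, 3, 3, 3, 3],      # constant, carries no information
    "d": [1, -1, 1, -1, 1, -1],   # alternating, weakly related
}
target = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]

selected, fitness = filter_features(columns, target, keep=2)
print(selected)  # ['a', 'b']
```

Because each feature is scored independently of the others and no model is trained, the cost grows linearly with the number of features, which is consistent with the speed advantage noted above. The same independence means a filter of this kind cannot detect features that are only useful in combination.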
Various algorithms have been implemented for filtering irrelevant features. They include:
References
[DAS97] M. Dash and H. Liu, "Feature Selection for Classification", Intelligent Data Analysis, Vol. 1, pp. 1-27, 1997.
[PIR04] S. Piramuthu, "Evaluating feature selection methods for learning in data mining applications", European Journal of Operational Research, Vol. 156, No. 2, pp. 483-494, 2004.





