The Computer Journal Advance Access published online on May 23, 2007
The Computer Journal, doi:10.1093/comjnl/bxm012
User-Oriented Feature Selection for Machine Learning
1 The Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China
2 Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada S4S 0A2
* Corresponding author. E-mail address: hongli.liang{at}ia.ac.cn
Received 14 June 2006; revised 6 February 2007
The effectiveness of any machine learning algorithm depends, to a large extent, on the selection of a good subset of features or attributes. Most existing methods use syntactic or statistical information about the data and rely on a heuristic criterion to select features. In this paper, we investigate an alternative, less-studied approach called user-oriented feature selection, which exploits domain-specific semantic information. Given any two features, a user is able to state which one is more important on semantic grounds. Such user requirements are formally described by a preference relation on the set of features. Algorithms are proposed to construct a subset of features that is most consistent with the user requirements. Their properties and computational complexity are analysed. User-oriented feature selection offers a new view of machine learning, and its potential needs to be further investigated and explored.
Key Words: User-oriented feature selection; NP-hard; reduct
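
To make the idea in the abstract concrete, the following is a minimal sketch of preference-guided selection, not the algorithms proposed in the paper. It assumes the user preference is given as a total ranking of features (the abstract only requires a preference relation), that the data are a small decision table, and that a subset is acceptable when rows agreeing on the selected features also agree on the class label. The names `is_consistent` and `select_by_preference` are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

# One row of a decision table: (feature name -> value, class label).
Row = Tuple[Dict[str, Hashable], Hashable]


def is_consistent(rows: Sequence[Row], features: Sequence[str]) -> bool:
    """Return True if rows that agree on `features` also agree on the label."""
    seen: Dict[tuple, Hashable] = {}
    for values, label in rows:
        key = tuple(values[f] for f in features)
        if seen.setdefault(key, label) != label:
            return False
    return True


def select_by_preference(rows: Sequence[Row], ranked: Sequence[str]) -> List[str]:
    """Pick a feature subset guided by a user ranking (most preferred first).

    Assumption: `ranked` is a total order supplied by the user; this is an
    illustrative stand-in for the preference relation in the abstract.
    """
    if not is_consistent(rows, ranked):
        raise ValueError("the full feature set cannot separate the classes")

    chosen: List[str] = []
    # Addition phase: add preferred features until the labels are separated.
    for f in ranked:
        if is_consistent(rows, chosen):
            break
        chosen.append(f)

    # Deletion phase: drop redundant features, least preferred first.
    for f in reversed(list(chosen)):
        trial = [g for g in chosen if g != f]
        if is_consistent(rows, trial):
            chosen = trial
    return chosen


# Toy table: the label is determined by "b" alone, and equally by "c" alone.
table: List[Row] = [
    ({"a": 0, "b": 0, "c": 0}, "no"),
    ({"a": 0, "b": 1, "c": 1}, "yes"),
    ({"a": 1, "b": 0, "c": 0}, "no"),
    ({"a": 1, "b": 1, "c": 1}, "yes"),
]
prefers_b = select_by_preference(table, ["a", "b", "c"])
prefers_c = select_by_preference(table, ["c", "a", "b"])
```

With this toy table, two users with different rankings obtain different but equally sufficient subsets, which is the sense in which the selection is user-oriented. The greedy procedure yields a subset with no redundant feature under the stated consistency test; it makes no claim to find the subset most consistent with the preference in the paper's sense.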