Attribute Reduction with Test Cost Constraint
In many machine learning applications, data are not free, and there is a test cost for each data item. For economic reasons, some existing works try to minimize the test cost while preserving a particular property of a given decision system. In this paper, we point out that the test cost one can afford is limited in some applications. Hence, one has to sacrifice the respective properties to keep the test cost within a budget. To formalize this issue, we define the test cost constraint attribute reduction problem, where the optimization objective is to minimize the conditional information entropy. This problem is an essential generalization of both the test-cost-sensitive attribute reduction problem and the 0-1 knapsack problem, and is therefore more challenging. We propose a heuristic algorithm based on information gain and test costs to deal with the new problem. The algorithm is tested on four UCI (University of California, Irvine) datasets with various test cost settings. Experimental results indicate appropriate settings for λ, the only user-specified parameter.
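The problem stated in the abstract (choose attributes whose total test cost stays within a budget while minimizing the conditional entropy of the decision) can be illustrated with a small greedy sketch. This is not the authors' algorithm: the abstract does not give the heuristic function, so the weighting `gain * cost ** lam` with a non-positive `lam` is an assumption made here for illustration, and the function names, toy data, costs, and budget are all hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, labels, attrs):
    """H(D | attrs): entropy of the decision within groups that agree on attrs."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[tuple(row[a] for a in attrs)][label] += 1
    n = len(rows)
    h = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        for c in counts.values():
            p = c / size
            h -= (size / n) * p * log2(p)
    return h


def greedy_budgeted_reduct(rows, labels, costs, budget, lam=-1.0):
    """Greedily add affordable attributes; costs are assumed strictly positive."""
    selected, spent = [], 0
    current = conditional_entropy(rows, labels, selected)
    while True:
        best, best_score, best_h = None, 0.0, current
        for a, cost in enumerate(costs):
            if a in selected or spent + cost > budget:
                continue  # already chosen, or would exceed the budget
            h = conditional_entropy(rows, labels, selected + [a])
            # Assumed weighting: information gain scaled by cost ** lam (lam <= 0).
            score = (current - h) * (cost ** lam)
            if score > best_score:
                best, best_score, best_h = a, score, h
        if best is None:  # nothing affordable improves the entropy
            break
        selected.append(best)
        spent += costs[best]
        current = best_h
    return selected, spent, current


# Hypothetical toy decision system: attribute 0 determines the decision
# but is the most expensive test.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1), (1, 0, 0), (0, 1, 1)]
labels = ["n", "n", "y", "y", "y", "n"]
costs = [5, 2, 1]

tight = greedy_budgeted_reduct(rows, labels, costs, budget=3)
loose = greedy_budgeted_reduct(rows, labels, costs, budget=5)
print(tight, loose)
```

With the tight budget the fully informative attribute is unaffordable, so the sketch settles for a subset that leaves some conditional entropy, which is the trade-off the abstract describes. Being greedy, it can miss the optimal subset; setting `lam=0` ignores costs when ranking, while more negative values favor cheap tests.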
Authors: Fan Min, William Zhu
Affiliation: Lab of Granular Computing, Zhangzhou Normal University, Fujian 363000, China
Year, Volume (Issue): 2011, 09(2)
Classification number: TP18
Machine-assigned classification number: TN6 TP1
Online publication date: December 16, 2011
Funding: the National Natural Science Foundation of China