Sparse Nonlinear Feature Selection Algorithm via Local Structure Learning

Jiaye Li, Guoqiu Wen, Jiangzhang Gan, Leyuan Zhang, Shanwen Zhang

Abstract


In this paper, we propose a new unsupervised feature selection algorithm that accounts for both the nonlinear relationships and the similarity relationships within the data. To this end, we combine the kernel method with local structure learning to capture the nonlinear relationships and the local similarity between features. Specifically, we use a kernel function to map each feature of the data into the kernel space. In this high-dimensional kernel space, each feature is assigned a weight, and features with zero weight are regarded as unimportant (e.g., redundant features). Furthermore, we model the similarity between features through local structure learning and propose an effective optimization method to solve the resulting problem. The experimental results show that the proposed algorithm achieves better performance than the comparison algorithms.
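The per-feature kernel mapping and weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian (RBF) kernel, the `gamma` parameter, the weighted-sum combination, and the function names `feature_kernels`, `combine_kernels`, and `selected_features` are all assumptions made for illustration. The abstract does not specify the kernel, the objective, or how the weights are learned, so the weights here are simply supplied by hand.

```python
import numpy as np


def feature_kernels(X, gamma=1.0):
    """Build one RBF kernel matrix per feature (assumed kernel choice).

    X has shape (n_samples, n_features); the result has shape
    (n_features, n_samples, n_samples).
    """
    X = np.asarray(X, dtype=float)
    cols = X.T  # shape (d, n): one row per feature
    # diffs[j, a, b] = X[a, j] - X[b, j]
    diffs = cols[:, :, None] - cols[:, None, :]
    return np.exp(-gamma * diffs ** 2)


def combine_kernels(kernels, weights):
    """Weighted sum of the per-feature kernels (assumed combination rule)."""
    weights = np.asarray(weights, dtype=float)
    return np.tensordot(weights, kernels, axes=1)


def selected_features(weights, tol=1e-8):
    """Indices of features whose weight is not (numerically) zero."""
    return np.flatnonzero(np.abs(np.asarray(weights, dtype=float)) > tol)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 3))        # 5 samples, 3 features
    K = feature_kernels(X, gamma=0.5)  # one 5x5 kernel per feature
    w = np.array([0.5, 0.0, 2.0])      # hand-picked sparse weights
    K_combined = combine_kernels(K, w)
    print(K_combined.shape)
    print(selected_features(w))
```

In this sketch, a feature with zero weight contributes nothing to the combined kernel, which mirrors the abstract's statement that zero-weight features are treated as unimportant. Learning the weights and the local structure between features would require the paper's objective function, which is not given here.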


Keywords


Feature Selection; Kernel Function; Local Structure Learning; High-Dimensional.


DOI: 10.28991/esj-2019-01175


Copyright (c) 2019 Jiaye Li, Guoqiu Wen, Jiangzhang Gan, Leyuan Zhang, Shanwen Zhang